LING 581: Advanced Computational Linguistics

Prerequisites

LING 538: Computational Linguistics. Offered in Fall semesters.

Required

Both LING 538 and 581 are required for students enrolled in the HLT Master's Program.

Classroom: Place and Time

One class per week. Spring 2007: 3-5:30pm. Social Sciences 224.

Course Objectives and Description

This course is designed to give students more in-depth experience with natural language software packages than is possible in an introductory survey course such as LING 538.

The semester will be divided into units of 3 or 4 weeks, each built around a project on a specific natural language software package.

Students will be expected to become familiar enough with these packages to install them, run them, and carry out project work with them on their own machines.

Possible projects include, but are not limited to:

  1. Treebanks: Penn Treebank, lookup software.
  2. Ontologies and Semantic Networks: WordNet etc.
  3. Sentence Parsing using grammar and linguistic theory: PAPPI etc.
  4. Machine Translation.
  5. Finite State Automata.
  6. Corpora search.

Grading

Students will submit a separate project report for each project tackled. The final grade will be determined by the cumulative sum of the project scores.

Reading and Computational Resources

Required reading will consist of project documentation, papers, and dissertations, which will be made available on the WWW.

Computer laboratory facilities will be made available for project work, but students should also be able to work on their own machines.


Lecture Schedule

January 10th

Initial meeting. Penn Treebank 3 distribution.

Lecture Notes: lecture1.pdf

January 17th

Perl programming lecture. Task 1. Penn Treebank trees into treebankviewer Prolog trees.

Lecture Notes: lecture2.pdf
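
As a rough illustration of what Task 1 involves, the sketch below reads a Penn Treebank bracketed tree and prints it as a Prolog term. It is written in Python purely for illustration (the course lecture covers Perl), and the output term shape, label(child, ...) with every label and word written as a quoted atom, is an assumption: the format actually expected by the treebankviewer is not specified here and may differ. The function names tokenize, parse, quote, to_prolog and ptb_to_prolog are hypothetical.

```python
import re

def tokenize(text):
    # Split into "(", ")" and runs of non-space, non-parenthesis characters.
    return re.findall(r"\(|\)|[^\s()]+", text)

def parse(tokens, pos=0):
    # Parse one bracketed node starting at tokens[pos]; return (node, next_pos).
    # A node is (label, children); a leaf word is a plain string.
    if tokens[pos] != "(":
        raise ValueError("expected '(' at token %d" % pos)
    pos += 1
    label = ""
    if tokens[pos] not in ("(", ")"):
        label = tokens[pos]
        pos += 1
    children = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = parse(tokens, pos)
        else:
            child, pos = tokens[pos], pos + 1
        children.append(child)
    return (label, children), pos + 1

def quote(atom):
    # Write a quoted Prolog atom: double backslashes and single quotes.
    return "'" + atom.replace("\\", "\\\\").replace("'", "''") + "'"

def to_prolog(tree):
    # Assumed output shape: 'Label'(Child1,Child2,...); leaves are quoted atoms.
    if isinstance(tree, str):
        return quote(tree)
    label, children = tree
    return quote(label) + "(" + ",".join(to_prolog(c) for c in children) + ")"

def ptb_to_prolog(text):
    tree, _ = parse(tokenize(text))
    # Treebank files often wrap each tree in an extra unlabeled pair of parentheses.
    if tree[0] == "" and len(tree[1]) == 1:
        tree = tree[1][0]
    return to_prolog(tree) + "."

if __name__ == "__main__":
    print(ptb_to_prolog("( (S (NP-SBJ (DT the) (NN dog)) (VP (VBZ barks))) )"))
```

Quoting every label and word keeps labels such as NP-SBJ and tokens such as 's readable as single Prolog atoms without special-casing.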

January 24th

tgrep2. Task 2. Get tgrep2 working with the treebankviewer.

Lecture Notes: lecture3.pdf (updated: 1/25/07)

January 31st

Verb alternations. Task 3. The PTB and EVCA.

Lecture Notes: lecture4.pdf (updated: 1/31/07)
EVCA index file: evca93.index.

February 7th

Task 3. Case study: the verb join.

Lecture Notes: lecture5.pdf

February 14th

Task 4. Collins parser.

Lecture Notes: lecture6.pdf

February 21st

Task 5. MXPOST and Collins parser. EVCA alternations.

Lecture Notes: lecture7.pdf (updated: 5:30pm)

March 7th

Lecture Notes: lecture8.pdf

March 14th

Spring Break: no lecture

March 21st

Lecture Notes: lecture9.pdf Introduction to WordNet 3.0 and the Prolog database files.
5papers.pdf WordNet: 5 papers.
code.pl Prolog code for metric |S| (revised: 3/23)
wn_s.pl.zip s/6 (.zip, corrected version)

(A .txt suffix is used to work around browser download permissions. Rename the .txt file to .pl for convenience.)

March 28th

Lecture Notes: lecture10.pdf

Worked example for zealous to impassioned vs. ravenous.
code2.pl Prolog code for metric |Q|+|S|

April 4th

Lecture Notes: lecture11.pdf

Parsing WordNet gloss example sentences using the Collins parser. Task 8. Worked example for vanish.

April 11th

Lecture Notes: lecture12.pdf

Question Answering (QA). Task 9.

April 25th

Lecture Notes: lecture13.pdf (modified: 4/26)

More on Question Answering. Extended WordNet (XWN).

May 2nd

Lecture Notes: lecture14.pdf

What next? Task 9 Presentations. Class evaluations.