LING 581: Advanced Computational Linguistics

Basic info:

Prerequisites

LING 538: Computational Linguistics. Offered in Fall semesters (taught by Dr. Sandiway Fong).

Required

Both LING 538 and 581 are required for students enrolled in the HLT Master's Program.

Course Objectives and Description

This is a seminar-format course designed to give students more in-depth experience with natural language software packages than is possible in an introductory survey course such as LING 538.

The semester will be structured into 3- or 4-week units, each of which will deal with a project on a specific natural language software package.

Students are expected to become familiar enough with each package to install it, run it, and carry out project work with it on their own machines.

Possible projects include (but may not be limited to):

  1. Treebanks: Penn Treebank, lookup software
  2. Ontologies and semantic networks: WordNet, etc.
  3. Tagging
  4. Sentence parsing
  5. Corpus search
  6. Machine translation

Course Requirements:

Students will submit a separate project report for each project tackled. For some larger tasks, each student is also expected to give a presentation on the task. The presentation and the report together determine the grade of each project. The final grade will be a function of the cumulative sum of the scores for each project.

Class participation: 10%
Project reports: 50%
Project presentations: 40%

Reading and Computational Resources

Required reading will consist of project documentation, papers and dissertations, to be made available on D2L: http://d2l.arizona.edu (a student ID is required for login).

The HLT server (hlt.sbs.arizona.edu) will be available for you to access the data required for some tasks. An account will be created for each student enrolled in the class. Your user name will be the same as your NetID. Passwords will be distributed in the lab; please change your password after your first login. Students should nevertheless be able to work on their own machines, since some software installations require administrator access to the computer.

Tentative Schedule

This schedule is subject to change.

Jan. 16th: Initial Meeting, Penn Treebank 3 distribution
Jan. 23rd: Intro to Perl programming. Transforming PTB trees
Jan. 30th: Installing and querying trees with TGREP2
Feb. 6th: Alternation of verbs in PTB
Feb. 13th: Tagging, Testing the MXPOST tagger
Feb. 20th: Broad-coverage parsing, Collins parser
Feb. 27th: Other parsers (Charniak, etc.)
Mar. 5th: Comparing different broad-coverage parsers
Mar. 12th: WordNet
Mar. 19th: Spring Break -- No class
Mar. 26th: Word similarity metrics
April 2nd: Applying parsing to extract verb frames
April 9th: Question Answering
April 16th: Question Answering
April 23rd: Other applications of parsing
April 30th: Parallel Corpora, Statistical Machine Translation
May 7th: TBA
   
*Note: we may have a guest lecture. Once the date has been finalized, the schedule will be adjusted accordingly.