This is an introductory course in computational linguistics at an
advanced level.

Reference Textbook: We will use selected chapters from Speech and Language Processing, 2nd edition, by D. Jurafsky and J.H. Martin, Prentice-Hall, 2008.

Email List: Hosted at listserv.arizona.edu. The name of the list is LING538@LISTSERV.ARIZONA.EDU.

Software: We will use Perl and SWI-Prolog (freely available) in the computer laboratory classes. Students will implement finite state automata, transducers, parsers and translation programs based on grammar rules in a series of computer laboratory exercises. For numerical calculations, we will use Microsoft Excel for worked examples and homework questions.

Instructor: Sandiway Fong sandiway@email.arizona.edu
Office: Douglass 311

| Date | Lecture Notes (PDF) | Lecture Notes (PowerPoint) | Number of Slides | Topic |
|---|---|---|---|---|
| 8/26 | lecture1.pdf | lecture1.pptx | 24 | Administrivia and Introduction. Homework 1. Updated: 8/26 9pm |
| 8/28 | lecture2.pdf | lecture2.pptx | 9 | Quiz. Introduction to Perl notes. Homework 2. Updated: 8/29 5pm |
| 9/2 | lecture3.pdf | lecture3.pptx | 6 | Introduction to Perl contd. Homework 3. Updated: 3pm 9/2 |
| 9/4 | lecture4.pdf | lecture4.pptx | 5 | Introduction to Perl contd. Updated: 3pm 9/4 |
| 9/9 | lecture5.pdf | lecture5.pptx | 18 | Homework 3 review. Regular Expressions. Ungraded homework exercise. Updated: 11:39pm 9/9 |
| 9/11 | lecture6.pdf | lecture6.pptx | 11 | Homework exercise review. Regular Expressions contd. Homework 6. |
| 9/16 | lecture7.pdf | lecture7.pptx | 21 | Finite State Automata (FSA). Implementation in Perl. Epsilon transitions. Non-deterministic FSA. Set-of-states construction. Ungraded homework. Updated: 2:10pm 9/16 |
| 9/18 | lecture8.pdf | lecture8.pptx | 24 | Homework 6 review. Ungraded homework from last time (review). FSA and REs. Updated: 2:30pm 9/16. Updated: 2:40pm 9/30 |
| 9/23 | lecture9.pdf | lecture9.pptx | 11 | Guest lecture: Dr. Ray Tillman, US Air Force Research Labs, Mesa AZ. Lecture slides. Link to podcast here. |
| 9/25 | lecture10.pdf | lecture10.pptx | 15 | Introduction to the Chomsky Hierarchy. Regular Grammars. SWI-Prolog. Ungraded homework exercise. Updated: 4:14pm 9/25. Updated: 2pm 9/30 |
| 9/30 | lecture11.pdf | lecture11.pptx | 20 | Regular Grammars and Prolog contd. Converting FSA to REs. Homework 11. Updated: 2pm 9/30 |
| 10/2 | lecture12.pdf | lecture12.pptx | 18 | Regular Grammars and left recursion. Beyond regular languages. The Pumping Lemma for regular languages. |
| 10/7 | lecture13.pdf | lecture13.pptx | 28 | Important dates: midterm exam and guest lectures #2 and #3. Homework 11 review. Morphology and stemming. Google and stemming. Updated: 2:30pm 10/7. Updated: 8:30pm 10/14 |
| 10/9 | lifescience.pdf iplant.pdf suebrown.pdf |  | 85 | Guest lecture: Nirav Merchant and Prof. Sue Brown, iPlant Project, University of Arizona. Podcast: here. Title: Enhancing the discovery life cycle: Application of information mining and retrieval in life sciences. Abstract: ... Text mining plays a very important role in connecting and transforming text into more "computable" forms that facilitate integration of data. We will discuss some of the challenges and impediments, along with advances and standards such as BioCreative and the semantic web. To effectively utilize the emerging tools and future data sets, we need a fundamental shift in paradigm at multiple levels of the discovery process. Based at the University of Arizona, the iPlant Collaborative (iPC) is a distributed, cyberinfrastructure-centered, international community of plant and computing researchers to enable new conceptual advances through adoption of computational thinking to address compelling grand challenges in the plant sciences and associated, cutting-edge research challenges in the computing sciences. We will discuss recent activities and the roadmap for iPC, along with opportunities for students to interact with various groups involved in iPlant. |
| 10/14 | lecture15.pdf | lecture15.pptx | 29 | Stemming: the Porter Stemmer. Finite State Transducers. |
| 10/16 |   |   |   | No lecture. |
| 10/21 | lecture16.pdf | lecture16.pptx | 20 | Spelling errors. Edit Distance Computation. Updated: 2:30pm 10/21 |
| 10/23 | Slides not available |  |  | Guest lecture: David Pinkus, Google (Tempe AZ). Title: Natural Language Processing and the next 9 years of search. Abstract: Google's mission is to organize the world's information and make it universally accessible and useful. Doing this requires not just continually crawling and indexing the world wide web (among other sources), but also translating, on demand, that information into potentially any one of the currently 100 languages supported by Google. This talk will explore some of what Google can do with its large corpus of information, and specifically some successes in language translation. |
| 10/28 | lecture18.pdf | lecture18.pptx | 12 | Midterm exam. |
| 10/30 | lecture19.pdf | lecture19.pptx | 31 | Introduction to probability. N-grams. N-gram software. |
| 11/4 | lecture20.pdf | lecture20.pptx | 35 | N-grams contd. Smoothing. Back-off interpolation. Excel: addone.xls, wb.xls |
| 11/6 | lecture21.pdf | lecture21.pptx | 33 | Midterm review session. Homework 21. Corpus for Homework 21: WSJ9_041.txt. Updated: 4:30pm 11/6. Updated: 4:15am 11/8 (typo in HW: Bristol-Myers is the correct spelling) |
| 11/11 |   |   |   | Veterans Day: No lecture. |
| 11/13 | lecture22.pdf | lecture22.pptx | 24 | Homework review. Updated: 2:30pm 11/13 |
| 11/18 | lecture23.pdf | lecture23.pptx | 38 | Part of speech tagging. Updated: 8pm 11/18 |
| 11/20 | lecture24.pdf | lecture24.pptx | 26 | Context-free grammars. The uses for extra arguments in DCGs: parse tree computation and agreement. |
| 11/25 | lecture25.pdf | lecture25.pptx | 34 | Context-free grammars contd. Using lookahead to deal with recursion. Treebanks. The Penn Treebank. Tgrep2. Homework handed out: due next Tuesday. Updated: 3:20pm 11/25 |
| 11/27 |   |   |   | Thanksgiving: no lecture. |
| 12/2 | lecture26.pdf | lecture26.pptx | 14 | Presentation assignments. Treebanks. The Penn Treebank. Demos: Tgrep2, tregex. Final homework: out today, due next Tuesday. Optional project. Updated: 3pm 12/2 |
| 12/4 | lecture27.pdf | lecture27.pptx | 45 | Context-free grammars parsing: left corner, LR parsing. |
| 12/9 | lecture28.pdf | lecture28.pptx | 4 | Class presentations. |