Computational Linguistics Syllabus

Course Outline

Linguistics 581

Day   Reading   Assignment   Lecture   Background   Code

Thu Jan 21 Chapter 1 and Section 2.1 of Chapter 2. Jurafsky and Martin (J&M) History of Computational Linguistics. Regular Expressions and Unix demo. Assignment 1. Some history. Textbook intro. Polish fried fish. What is Computational Linguistics?  
Tue Jan 26 Chapter 2, 2.1-2.4 Finite-State Automata   Textbook Ch. 2 slides Non-deterministic automata, epsilon transitions    
Thu Jan 28 Jury Duty
Tue Feb 02 Reading: Sections 3.1-3.3 of Chapter 3. Introducing words and word parts. Sections 3.4-3.9 of Chapter 3. Assignment 2 Installing Python. Textbook Ch. 3 slides Introduction to transducers    
Thu Feb 04          
Tue Feb 09 Chapter 3. XFST intro. XFST assignment (Due Feb. 18) [Compling lab (if needed): SHW 243]      
Thu Feb 11          
Tue Feb 16 Chapter 4: J&M. 4.1-4.3. Word counting, frequency dictionaries, simple ngram models, the training corpus.   Ngrams Brief probability intro Introduction to NLTK. Peter Norvig on ngrams, word segmentation, spelling correction, statistical machine translation.    
Thu Feb 18          
Tue Feb 23 Sections 4.1-4.5 of Chapter 4. Practicalities. Section 4.5.3 of Chapter 4. Sections 4.4, 4.5.1, and 4.5.2 of Chapter 4. Smoothing, Add-1 smoothing, Kneser-Ney smoothing. Unknown words. NLTK assignment, using Pylab, importing NLTK corpora. Lecture. Smoothing Lecture. Kneser-Ney Lecture.    
Thu Feb 25          
Tue Mar 01 5.1-5.4. Word-class and part of speech tagging. Rule-based taggers, 6.1, 6.2, 6.4 (decoding with HMMs). The Pollard assignment and smoothing assignment are due next week. Lecture. Tagging slides    
Thu Mar 03   Code example covered in class      
Tue Mar 08 Chapter 5 of the NLTK book. Taggers used on data. Tagging Assignment. Tagging slides    
Thu Mar 10          
Tue Mar 15 Chapter 5 of the NLTK book. Taggers used on data.   Tagging slides    
Thu Mar 17   Computing Viterbi by hand, Code for computing Brown tag counts (discussed in class), Pollard smoothing solution (in-class code)      
Tue Mar 22 6.1-6.5 HMM Taggers/HMM models   Tokenization basics, Practical issues: tokenization, Viterbi in Python, Why Viterbi works, example Chapter 5 of the NLTK book. Taggers used on data, Manning on the state of the art of POS tagging, Manning 2011 paper on state of the art in POS tagging.    
Thu Mar 24   Answer to Viterbi tagging problem (use for midterm prep)      
Tue Mar 29 Holiday
Thu Mar 31 Holiday
Tue Apr 05 Chapter 12. Context-Free Grammars of English, Treebanks Assignment: Compling midterm 2016. Jurafsky/Martin parsing, top-down lecture, Lecture: Top-down parsing. Parsing as search.   td_parser-0.1: an implementation of a recursive-descent top-down recognizer.
Thu Apr 07          
Tue Apr 12 Chapter 13.1 Parsing. 13.4.1.2 CKY algorithm (bottom-up parsing with a chart), Earley algorithm (top-down parsing with a chart)        
Thu Apr 14   Grammar assignment Parsing assignment (CKY).      
Tue Apr 19 Chapter 14. Probabilistic Context-Free Grammars   Lecture, Formal properties of PCFGs (for the mathematically inclined).    
Thu Apr 21   Assignment: Prob parsing assignment, CKY assignment solution, CKY implementation in these notes (not assigned).      
Tue Apr 26 Chapter 25. Machine Translation Prob parsing assignment solution. Machine translation notes on IBM alignment model and EM.    
Thu Apr 28   Assignment: Machine translation assignment      
Tue May 03          
Thu May 05 Chapter 25 Machine translation continued. Final, MT assignment solution, MT evaluation slides. Machine translation notes on IBM alignment model and EM, Last class day