
Department of Linguistics and Oriental Languages

Contents

Goals

Required Text

Course outline

Prerequisites

Grading

Place and Time

Course home

Contact Info


Sites

Statistical MT Home

Baseline System

Textbook site

Statistical MT links

Python links wiki

Stat Textbook

Statistical NLP Site

General NLP Site

Unix Tools

Computational Linguistics

Advanced Statistical Methods Syllabus

Linguistics 696


Goals

For parsing, we will begin with a review of context-free parsing, focusing on CKY, the bottom-up algorithm that has the most direct application to PCFGs (Probabilistic Context-Free Grammars). We will continue with some enhancements to PCFGs that loosen their context-freeness assumptions. These enhanced grammars, which have come to be called Generalized PCFGs, do not actually change the computational properties of PCFGs but greatly enhance their predictive power. We will look at issues of parser evaluation, search, lexicalizing PCFGs, and discriminative parsing.
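As a rough illustration of the bottom-up idea behind CKY applied to a PCFG, here is a minimal sketch. The toy grammar, its probabilities, and the function name `cky_best_parse_prob` are all hypothetical and chosen only for this example; they are not taken from the course materials or the textbook. The sketch assumes a grammar in Chomsky normal form and computes only the probability of the best parse, not the tree itself.

```python
from collections import defaultdict

# Toy PCFG in Chomsky normal form (hypothetical, for illustration only).
# Binary rules: (parent, left_child, right_child) -> probability
BINARY_RULES = {
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
    ("NP", "Det", "N"): 0.6,
}
# Lexical rules: (preterminal, word) -> probability
LEXICAL_RULES = {
    ("NP", "she"): 0.4,
    ("V", "saw"): 1.0,
    ("Det", "the"): 1.0,
    ("N", "dog"): 1.0,
}


def cky_best_parse_prob(words, start="S"):
    """Return the probability of the best parse of `words`, or 0.0 if none."""
    n = len(words)
    # chart[(i, j)][A] = best probability of category A spanning words[i:j]
    chart = defaultdict(dict)

    # Width-1 spans: apply lexical rules.
    for i, word in enumerate(words):
        cell = chart[(i, i + 1)]
        for (tag, w), p in LEXICAL_RULES.items():
            if w == word:
                cell[tag] = max(p, cell.get(tag, 0.0))

    # Wider spans, bottom up: combine two adjacent smaller spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point between the two children
                left, right = chart[(i, k)], chart[(k, j)]
                for (parent, b, c), p in BINARY_RULES.items():
                    if b in left and c in right:
                        score = p * left[b] * right[c]
                        if score > chart[(i, j)].get(parent, 0.0):
                            chart[(i, j)][parent] = score

    return chart[(0, n)].get(start, 0.0)


if __name__ == "__main__":
    print(cky_best_parse_prob("she saw the dog".split()))
```

Under this toy grammar, the sentence "she saw the dog" has a single parse, whose probability is the product of the rule probabilities used (about 0.24). A word sequence the grammar cannot derive gets 0.0.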

For MT, we will review some classic MT systems, move on to the Noisy Channel model that has been so influential in statistical MT (SMT), and then cover the basic components of a modern system: word-alignment training, phrase-alignment training, target language modeling, and decoding. We will look at the contributions made by introducing classes and hierarchical syntactic information. Then we will read some papers and do some simple experiments with sense disambiguation.
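For orientation, the Noisy Channel model is usually stated as a Bayes-rule decomposition. The symbols here are notational assumptions, not taken from this syllabus: f stands for the source-language sentence, e for a candidate target-language translation, and \hat{e} for the translation the system outputs.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f)
        \;=\; \arg\max_{e} \frac{P(f \mid e)\, P(e)}{P(f)}
        \;=\; \arg\max_{e} P(f \mid e)\, P(e)
```

The denominator P(f) can be dropped because it does not depend on e. In this reading, P(f | e) is the translation model estimated from aligned text, P(e) is the target language model, and the search for the maximizing e is the decoding step.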

Course Outline

Here.

Required Text

The text for the class will be Jurafsky and Martin, Speech and Language Processing, with some material from the 2nd Edition, focusing on Chapters 19 and 24, the word sense and MT chapters. There are also additional readings available online (see course outline).

Prerequisites and Grading

Prerequisite: Some computer science or some linguistics; preferably Ling 581.

Grading will be based on exercises/projects, a take-home midterm, and a final.

    Assignments 30%
    Presentation 20%
    Final Project 50%


Place and Time

Wed 4:00-6:40
AH 3150
Adams Humanities

Website

http://www-rohan.sdsu.edu/~gawron/ling696

Contact Info

Mailing address:
Jean Mark Gawron
Department of Linguistics and Oriental Languages
San Diego State University
5500 Campanile Drive
San Diego, CA 92182-7727
Telephone: (619) 594-0252
Office Hours: Tu Th 10:00-12:30, BAM 321


