CS 682 Speech Processing
Class meets: MW, 19:00-20:15, GMCS-254, Syllabus (PDF)
Note that in addition to normal office hours, you can see me on Wednesday mornings at 9:00 provided you send an e-mail the day before (make sure you do so early enough that I have a chance to read it).
Text: Spoken Language Processing, Huang, Acero & Hon, Prentice Hall 2002.
Final Exam
The final exam for the course will be held in our normal classroom on Saturday, May 17th from 10:00 - 12:00. Bring a blue book (available at the SDSU book store).
Course materials
Tentative schedule for the semester.
Slides
- Motivation and architecture
- Sound, speech, and perception
- Digital systems, convolution demo
- Classifiers Part I, Part II
- Cepstral features
- Language modeling
- Hidden Markov models (HMMs) Part I, Part II
- Decisions for acoustic modeling
- Search
Additional texts on speech recognition or audio
If you have difficulties with a presentation in Huang, Acero, and Hon, the simplest approach is to ask me to explain it during office hours. You may also want to consult another text. Here are a few relevant texts:
- Digital Speech Processing, Synthesis, and Recognition, Second edition, Furui, 2000.
- Fundamentals of Speech Recognition, Rabiner and Juang, 1993.
- Speech Communications: Human and Machine, Second edition, O'Shaughnessy, 2000.
- Statistical Methods for Speech Recognition, Jelinek, 1998.
The following texts are not on speech recognition, but offer accessible presentations of relevant material:
- The Science of Musical Sounds, Sundberg, 1991.
- Fundamentals of Hearing, Third edition, Yost, 1994.
- Signals and Systems Made Ridiculously Simple, Karu, 1994.
- A Course in Phonetics, Ladefoged, Heinle & Heinle, 2001. Ladefoged has a nice point-and-click chart for listening to the IPA vowels and consonants.
Frequently Asked Questions
- rohan accounts - How to obtain one? How to transfer files to or from?
- How can I submit soft-copies of code for assignments?
- How can I access a GUI program on rohan using XWin-32 from GMCS 425 or the library?
- How can I listen to or record speech (Wavesurfer)?
- Instructions for mapping Windows networked drives can be found in the course documents section of this course's Blackboard site.
- IPA/TIMIT/CMU phone mappings
- CMU Pronunciation Dictionary
- How can I set the PATH for Windows or UNIX?
- Guide to citations (IEEE style).
