The speech processing lab has several active projects with collaborators both internal and external to San Diego State University.
Speaker recognition
Speaker recognition is a form of biometrics in which the goal is to construct algorithms that determine a person's identity from their voice. Projects we have worked on include recognition in the presence of noise, speaker segmentation, extensions to Gaussian selection that set thresholds independently of the number of dimensions, and high-level speaker recognition. While speaker recognition remains an area of interest, much of our new work focuses on bioacoustics.
Current students: None at present.
Bioacoustics
We work with the Marine Physical Lab's Whale Acoustics group led by Dr. John Hildebrand at The Scripps Institution of Oceanography. Current projects include the identification of odontocete (dolphin) species and the recognition of specific call types in baleen whales. In addition, we work with Hans-Werner Braun of HPWREN, Dr. Dan Moriarty from The University of San Diego, and Kim Miller of The California Wolf Center on a system to categorize the calls of a captive wolf population.
Automated analysis of sounds produced by animals helps biologists better understand the ecosystem by providing information about population dynamics (abundance, seasonality, etc.), and may eventually yield information about how the animals communicate with one another.
Current students: Shyam Kumar Madhusudhana
Control systems for hearing aids
In collaboration with the Assistive Devices Lab led by Dr. Richard Hurtig in The Dept. of Speech Pathology and Audiology at The University of Iowa, we design dynamic control systems for hearing aids. Our current system parameterizes a frequency compression algorithm that reduces the frequency bandwidth of a signal while maintaining formant ratios. Formants are the resonant frequencies of the vocal tract, and the ratios between formants are known to be important for perception. By remapping the frequency domain, we move the formants into a range that is audible to a severely hearing-impaired listener.
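The idea of compressing bandwidth while keeping frequency ratios intact can be illustrated with a minimal sketch. This is not the lab's algorithm: it assumes the simplest possible remapping, in which every frequency is multiplied by one constant factor, and it processes a whole signal with a single FFT rather than frame by frame as a hearing aid would. The function names `compress_frequencies` and `dominant_frequencies` are hypothetical and exist only for this illustration.

```python
import numpy as np


def compress_frequencies(signal, factor):
    """Scale every frequency component of `signal` by `factor` (0 < factor <= 1).

    Illustrative assumption: a single proportional remapping f -> factor * f,
    which preserves the ratio between any two frequencies.
    """
    spectrum = np.fft.rfft(signal)
    bins = np.arange(spectrum.size)
    # Output bin k takes its value from input bin k / factor.
    source = bins / factor
    # Interpolate real and imaginary parts separately; bins that would map
    # beyond the original spectrum are set to zero.
    real = np.interp(source, bins, spectrum.real, right=0.0)
    imag = np.interp(source, bins, spectrum.imag, right=0.0)
    return np.fft.irfft(real + 1j * imag, n=signal.size)


def dominant_frequencies(signal, rate, count):
    """Return the `count` strongest frequencies in Hz, in ascending order."""
    magnitude = np.abs(np.fft.rfft(signal))
    strongest = np.argsort(magnitude)[-count:]
    return np.sort(strongest * rate / signal.size)


# Two tones standing in for a pair of formants with a 1.5 ratio.
rate = 16000
t = np.arange(rate) / rate
tones = np.sin(2 * np.pi * 2000 * t) + np.sin(2 * np.pi * 3000 * t)

compressed = compress_frequencies(tones, 0.5)
low, high = dominant_frequencies(compressed, rate, 2)
print(low, high, high / low)
```

With a factor of 0.5, the two tones move to half their original frequencies, so the occupied bandwidth shrinks while the ratio between the tones stays the same. A practical system would also need to work on short overlapping frames and handle phase continuity, which this sketch ignores.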
Current students: Jim Du
A note to students interested in joining the lab
It is extremely rare for us to accept a student who has not successfully completed CS 682, Speech Processing, which provides a graduate-level introduction to speech and speaker recognition. Typically, we accept one or two new students per year, so successful completion of the course is not a guarantee that there will be an open position. Students who are interested in taking CS 682, which is offered each spring, should make sure they meet the minimum prerequisites for the course: CS 310 or COMPE 260 (Data Structures, higher level CS/COMPE typically expected), Math 254 (Linear Algebra), and Stat 551A (Statistics I). Although not required, additional background in statistics is highly desirable.
Lab Alumni
- Rhonda Hoenigman, “Support vector machine classification for applications of auditory scene analysis,” M.S. computer science, Summer 2007.
- Deborah Curless, “Automated sensor acquisition and classification,” M.S. computer science, Spring 2007.
- Sonia Arteaga, B.S. computer engineering, Spring 2006.
- Jinyi Wang, “A maximum entropy approach for high level speaker recognition,” M.S. computer science, Summer 2006.
- Tong Huang, “Audio Scene Analysis for Hearing Aids,” M.S. computer science, Summer 2005.
- Jing Liu, “Instantaneous labeling of gender,” M.S. computer science, Spring 2004.
- Yanliang Chang, “Application of the Bayesian information criterion to speaker change point detection,” M.S. computer science, Fall 2003.
- Min-Wei Chan, “A Study of integration approximation methods for the integral decode,” M.S. computer science, Spring 2003.

