Marie A. Roch
Associate Professor of Computer Science


 

Speech Processing Lab Overview

 


Summer 2005 picnic - lab members, alumni, & families

The speech processing lab has several active projects with collaborators both internal and external to San Diego State University.

Speaker recognition

Speaker recognition is a form of biometrics in which the goal is to construct algorithms that determine a person's identity from their voice. Projects we have worked on include recognition in the presence of noise, speaker segmentation, extensions to Gaussian selection that set thresholds independently of the number of dimensions, and high-level speaker recognition. While speaker recognition remains an area of interest, much of our new work focuses on bioacoustics.

Current students: None at present.

Bioacoustics

Pacific white-sided dolphin

We work with the Marine Physical Lab's Whale Acoustics group led by Dr. John Hildebrand at The Scripps Institution of Oceanography. Current projects include the identification of odontocete (dolphin) species and the recognition of specific call types in baleen whales. In addition, we work with Hans-Werner Braun of HPWREN, Dr. Dan Moriarty from The University of San Diego, and Kim Miller of The California Wolf Center on a system to categorize the calls of a captive wolf population.

Automated analysis of the sounds that animals produce helps biologists better understand the ecosystem by providing information about population dynamics (abundance, seasonality, etc.) and may eventually yield information about how the animals communicate with one another.

Current students: Shyam Kumar Madhusudhana

Control systems for hearing aids

In collaboration with the Assistive Devices Lab led by Dr. Richard Hurtig in The Dept. of Speech Pathology and Audiology at The University of Iowa, we design dynamic control systems for hearing aids. Our current system parameterizes a frequency compression algorithm that reduces the frequency bandwidth of a signal while maintaining formant ratios. Formants are the frequencies that are reinforced by resonances of the vocal tract, and the ratios between formants are known to be important for perception. By remapping the frequency domain, we move the formants into a range that is audible to a severely hearing-impaired listener.
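
As a rough illustration of why a compression scheme can reduce bandwidth and still keep formant ratios intact, the sketch below scales every frequency by the same factor. It is not the lab's algorithm, which is not described on this page. The function names, the compression factor of 0.5, and the formant values are all assumptions chosen for illustration.

```python
def compress_frequencies(freqs_hz, factor):
    """Scale every frequency by the same factor (0 < factor <= 1).

    Proportional scaling is only one possible remapping; it is used here
    because it makes the ratio-preserving property easy to see.
    """
    if not 0 < factor <= 1:
        raise ValueError("factor must be in (0, 1]")
    return [f * factor for f in freqs_hz]


def ratios_to_first(freqs_hz):
    """Return each frequency divided by the first (lowest) one."""
    return [f / freqs_hz[0] for f in freqs_hz]


# Hypothetical formant frequencies in Hz (illustrative values, not measurements).
formants = [500.0, 1500.0, 2500.0]

# Assumed compression factor: halve the bandwidth.
compressed = compress_frequencies(formants, 0.5)

print(compressed)                   # [250.0, 750.0, 1250.0]
print(ratios_to_first(formants))    # [1.0, 3.0, 5.0]
print(ratios_to_first(compressed))  # [1.0, 3.0, 5.0]
```

Because each formant is multiplied by the same factor, the factor cancels in every ratio. The formants therefore move to lower frequencies while their relative spacing is unchanged. A constant shift (subtracting a fixed number of Hz from each formant) would not have this property.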

Current students: Jim Du

A note to students interested in joining the lab

We very rarely accept a student who has not successfully completed CS 682, Speech Processing, which provides a graduate-level introduction to speech and speaker recognition. Typically, we accept one or two new students per year, so successful completion of the course does not guarantee an open position. Students interested in taking CS 682, which is offered each spring, should make sure they meet the minimum prerequisites for the course: CS 310 or COMPE 260 (Data Structures; higher-level CS/COMPE coursework is typically expected), Math 254 (Linear Algebra), and Stat 551A (Statistics I). Although not required, additional background in statistics is highly desirable.

Lab Alumni