Jim Fuerstnau
A Method of Speech Enhancement Based on Across-Frequency Envelope Correlation.
Monday, May 14, 2001
Abstract
Understanding human speech in the presence of background noise is challenging for most people, and even more so for the hearing impaired. This problem is magnified when the interfering sounds are similar to the target speech, for example in a multi-talker (cocktail-party) environment. This dissertation proposes a signal-processing scheme that uses across-frequency envelope correlation to address this problem by enhancing speech in background noise. Based on this premise, four separate algorithms are examined. Two of the algorithms apply a threshold to the correlation coefficients estimated in each time frame from the envelopes of the outputs of a 56-channel auditory filter bank to determine the gain in each channel. The other two algorithms use the elements of the eigenvectors of the correlation coefficient matrix corresponding to the largest eigenvalues to determine the 56 channel gains. To test the algorithms, speech from five different speakers, each masked by three different types of background noise at five different input signal-to-noise ratios, was processed, and four different metrics of speech enhancement were obtained. The main metric was signal-to-noise ratio (SNR) improvement. Subjective ratings by a human listener were also used to estimate improvements in intelligibility and quality of the speech. In addition, an objective quality indicator was computed by comparing the original undegraded speech with both processed and unprocessed speech in noise. Results measured as SNR improvement were positive in several cases, depending on the type of noise, the signal-to-noise ratio, and the algorithm used. The first two algorithms showed considerable improvement in white noise, with lesser improvements for multi-talker babble noise. The other two algorithms showed less improvement in all three types of noise. The objective quality indicator correlated with the observed signal-to-noise ratio improvements.
Subjective verification of perceived speech intelligibility and quality using a single human listener produced mixed results. Compared to previous investigations of single-input speech-enhancement schemes, the best of the present algorithms appears to be at least competitive and, in several cases, superior. Enhancing speech degraded by interfering sounds on the basis of across-frequency envelope correlation is indeed promising and may offer an effective method to improve hearing-aid technology.
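The two families of algorithms summarized above can be illustrated with a minimal sketch. The abstract does not state how correlation coefficients or eigenvector elements are mapped to channel gains, so everything below is an assumption made for illustration: the function names, the use of each channel's mean correlation with the other channels, the binary gain, the default threshold of 0.5, and the use of only the single largest eigenvalue are all hypothetical choices, not the dissertation's actual method. The input is assumed to be an array of envelopes for one time frame, one row per filter-bank channel.

```python
import numpy as np


def envelope_correlation(envelopes):
    """Correlation-coefficient matrix of channel envelopes for one frame.

    envelopes: array of shape (n_channels, n_samples).
    Note: a channel with a constant envelope yields NaN entries.
    """
    return np.corrcoef(envelopes)


def threshold_gains(corr, threshold=0.5):
    """Hypothetical threshold rule: pass a channel (gain 1) when its mean
    correlation with all other channels exceeds the threshold, else gain 0."""
    n = corr.shape[0]
    # Exclude the diagonal (self-correlation of 1) from the mean.
    mean_corr = (corr.sum(axis=1) - np.diag(corr)) / (n - 1)
    return (mean_corr > threshold).astype(float)


def eigenvector_gains(corr):
    """Hypothetical eigenvector rule: use the magnitudes of the elements of
    the eigenvector belonging to the largest eigenvalue, scaled so that the
    largest gain is 1."""
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(corr)
    principal = np.abs(eigvecs[:, -1])
    return principal / principal.max()
```

In such a sketch, channels whose envelopes share a common modulation (as speech-dominated channels are assumed to do) receive high gains under both rules, while channels with unrelated envelopes are attenuated. A real system would apply these gains frame by frame to the filter-bank outputs before resynthesis.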
Committee Members: Prof. Søren Buus (Co-advisor), Prof. Dana Brooks (Co-advisor), Prof. Hanoch Lev-Ari