Guruprasad Saikumar

MMI Training for Automatic Segmentation of Conversational Telephone Speech

Date: Thursday, August 04, 2005

Abstract:
The last several years have seen significant improvements in speech recognition accuracy through discriminative training methods such as Maximum Mutual Information (MMI). In this thesis, we apply MMI-based discriminative training to automatic segmentation of conversational telephone speech. We discuss the details of implementing MMI-based training and provide experimental results showing the effects of model complexity and the number of training iterations. We compare the performance of segmentation trained with MMI to both Maximum Likelihood (ML) based segmentation and manual segmentation. The results show that MMI consistently outperforms ML in terms of word error rate. Moreover, its performance is close to or equal to that achieved with human-annotated segmentation.

Committee:
Dr. John Makhoul (Advisor)
Prof. Dana Brooks
Prof. Jennifer Dy
Mr. Daben Liu (BBN Technologies)