Amit Srivastava

Story Segmentation in Audio Indexing

Tuesday, August 25, 1999
2:00 PM
406 Egan

Abstract

More and more audio and video data are produced every day by increasingly available multimedia systems. Several advanced video management systems have emerged to meet the need for efficient data storage and management. In contrast, systems for managing pure audio streams are only beginning to emerge. The Rough'n'Ready system, developed at BBN Corporation in a collaborative effort with the CDSP Center at Northeastern University, is an integrated audio indexing and management system. Management here is defined as content-based information indexing, browsing, and retrieval. The system takes audio streams as input and automatically outputs "rough" transcriptions that are "ready" for intelligent management. State-of-the-art techniques such as speech recognition, speaker-change detection, speaker identification, named-entity extraction, and story segmentation are used to produce a fully indexed document database. The story segmentation system, developed during the course of this thesis, segments the raw transcripts obtained from speech recognition into story-like units that are relatively homogeneous in topic content. This effectively imposes a document model onto speech, which advanced text-based approaches to information extraction and retrieval can then use.

Thesis Committee:
Prof. John Makhoul (advisor)
Prof. Bahram Shafai
Mr. Francis Kubala (BBN Corporation)