NLP on spoken documents without ASR

Mark Dredze, Aren Jansen, Glen Coppersmith, Ken Church

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

There is considerable interest in interdisciplinary combinations of automatic speech recognition (ASR), machine learning, natural language processing, text classification and information retrieval. Many of these components, especially ASR, are often based on considerable linguistic resources. We would like to be able to process spoken documents with few (if any) resources. Moreover, connecting black boxes in series tends to multiply errors, especially when the key terms are out-of-vocabulary (OOV). The proposed alternative applies text processing directly to the speech without a dependency on ASR. The method finds long (∼1 sec) repetitions in speech and clusters them into pseudo-terms (roughly phrases). Document clustering and classification work surprisingly well on pseudo-terms; performance on a Switchboard task approaches a baseline using gold standard manual transcriptions.
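The pipeline the abstract outlines (repeated segments, then pseudo-terms, then document-level processing) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how repetitions are found or clustered, so the sketch assumes a list of already-matched segment pairs as hypothetical input, uses connected components as a stand-in clustering step, and compares documents by cosine similarity over pseudo-term counts. All function names and the `(doc_id, segment_index)` segment format are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def cluster_pseudo_terms(matches):
    """Group matched segments into pseudo-terms.

    `matches` is a hypothetical list of pairs of segments judged to be
    repetitions of each other; each segment is (doc_id, segment_index).
    Connected components (union-find) stand in for the clustering step.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in matches:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Give each component a stable pseudo-term label.
    root_to_label = {}
    labels = {}
    for segment in sorted(parent):
        root = find(segment)
        labels[segment] = root_to_label.setdefault(root, f"pt{len(root_to_label)}")
    return labels


def bag_of_pseudo_terms(labels):
    """Count pseudo-term occurrences per document."""
    bags = defaultdict(Counter)
    for (doc_id, _), term in labels.items():
        bags[doc_id][term] += 1
    return dict(bags)


def cosine(a, b):
    """Cosine similarity between two pseudo-term count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Toy input: four documents, four matched segment pairs (made-up data).
matches = [
    (("docA", 0), ("docB", 0)),
    (("docB", 0), ("docC", 1)),
    (("docA", 1), ("docB", 1)),
    (("docC", 0), ("docD", 0)),
]
labels = cluster_pseudo_terms(matches)
bags = bag_of_pseudo_terms(labels)
print(sorted(bags["docA"].items()))
print(cosine(bags["docA"], bags["docD"]))
```

In this toy input, docA and docB share both of their pseudo-terms and so look alike, while docA and docD share none. The resulting count vectors could be passed to any standard text clustering or classification method in place of word counts.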

Original language: English (US)
Title of host publication: EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
Pages: 460-470
Number of pages: 11
State: Published - 2010
Externally published: Yes
Event: Conference on Empirical Methods in Natural Language Processing, EMNLP 2010 - Cambridge, MA, United States
Duration: Oct 9 2010 to Oct 11 2010

Publication series

Name: EMNLP 2010 - Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference

Conference

Conference: Conference on Empirical Methods in Natural Language Processing, EMNLP 2010
Country/Territory: United States
City: Cambridge, MA
Period: 10/9/10 to 10/11/10

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
