Adapting n-gram maximum entropy language models with conditional entropy regularization

Ariya Rastrow, Mark Dredze, Sanjeev Khudanpur

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Accurate estimates of language model parameters are critical for building quality text generation systems, such as automatic speech recognition. However, text training data for a domain of interest is often unavailable. Instead, we use semi-supervised model adaptation; parameters are estimated using both unlabeled in-domain data (raw speech audio) and labeled out-of-domain data (text). In this work, we present a new semi-supervised language model adaptation procedure for maximum entropy models with n-gram features. We augment the conventional maximum likelihood training criterion on out-of-domain text data with an additional term to minimize conditional entropy on in-domain audio. Additionally, we demonstrate how to compute conditional entropy efficiently on speech lattices using first- and second-order expectation semirings. We show improvements in word error rate over other adaptation techniques when adapting a maximum entropy language model from broadcast news to MIT lectures.
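
To make the approach concrete, the training criterion described above can be sketched as the following regularized objective. The notation, including the trade-off weight λ, is assumed for illustration rather than taken from the paper:

```latex
% Sketch of the adaptation objective described in the abstract
% (notation assumed, not the paper's own).
% D_text: labeled out-of-domain text; D_audio: unlabeled in-domain audio.
\max_{\theta} \; \mathcal{J}(\theta)
  = \sum_{(w,h) \in \mathcal{D}_{\text{text}}} \log p_{\theta}(w \mid h)
  \;-\; \lambda \sum_{a \in \mathcal{D}_{\text{audio}}} H_{\theta}(W \mid a),
\qquad
H_{\theta}(W \mid a) = -\sum_{W} p_{\theta}(W \mid a)\,\log p_{\theta}(W \mid a).
```

The first term is the conventional maximum likelihood criterion over n-gram events (w, h) in the out-of-domain text; the second penalizes the recognizer's posterior uncertainty over transcriptions W of each in-domain utterance a. The entropy term is where the expectation-semiring machinery enters: pairing each lattice arc's posterior probability p with p log p lets a single forward pass return both the total path mass and the expected log path weight. The Python sketch below illustrates the first-order case on an acyclic lattice; all names are hypothetical, and the second-order semiring the paper uses to obtain gradients is omitted:

```python
import math
from collections import defaultdict

def e_plus(a, b):
    """First-order expectation semiring addition: component-wise sum."""
    return (a[0] + b[0], a[1] + b[1])

def e_times(a, b):
    """First-order expectation semiring multiplication:
    (p1, r1) * (p2, r2) = (p1*p2, p1*r2 + p2*r1)."""
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])

def lattice_entropy(arcs, start, final, topo_order):
    """Entropy of the path distribution of an acyclic weighted lattice.

    arcs: state -> list of (next_state, prob) arcs, with prob > 0 and
    possibly unnormalized; paths are normalized by the total mass Z.
    """
    zero = (0.0, 0.0)                       # additive identity
    alpha = defaultdict(lambda: zero)
    alpha[start] = (1.0, 0.0)               # multiplicative identity
    for s in topo_order:                    # single forward pass
        for t, p in arcs.get(s, []):
            w = (p, p * math.log(p))        # expectation-semiring arc weight
            alpha[t] = e_plus(alpha[t], e_times(alpha[s], w))
    Z, r = alpha[final]                     # Z = sum_d w(d); r = sum_d w(d) log w(d)
    return math.log(Z) - r / Z              # H = log Z - E[log w]

if __name__ == "__main__":
    # Toy lattice with two parallel paths of posterior 0.3 and 0.7.
    arcs = {0: [(1, 0.3), (1, 0.7)], 1: [(2, 1.0)]}
    h = lattice_entropy(arcs, start=0, final=2, topo_order=[0, 1, 2])
    print(h)  # ~0.6109 = -0.3*log(0.3) - 0.7*log(0.7)
```

On the toy two-path lattice, the result matches the closed-form entropy of a Bernoulli(0.3) choice, which serves as a quick sanity check on the semiring operations.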

Original language: English (US)
Title of host publication: 2011 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2011, Proceedings
Pages: 220-225
Number of pages: 6
State: Published - 2011
Externally published: Yes
Event: 2011 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2011 - Waikoloa, HI, United States
Duration: Dec 11, 2011 – Dec 15, 2011

Publication series

Name: 2011 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2011, Proceedings

Conference

Conference: 2011 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2011
Country/Territory: United States
City: Waikoloa, HI
Period: 12/11/11 – 12/15/11

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
