Efficient structured language modeling for speech recognition

Ariya Rastrow, Mark Dredze, Sanjeev Khudanpur

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The structured language model (SLM) of [1] was one of the first to successfully integrate syntactic structure into language models. We extend the SLM framework in two new directions. First, we propose a new syntactic hierarchical interpolation that improves over previous approaches. Second, we develop a general information-theoretic algorithm for pruning the underlying Jelinek-Mercer interpolated LM used in [1]; this substantially reduces the size of the LM and enables us to train on large data. When combined with hill-climbing [2], the SLM is accurate, space-efficient, and fast for rescoring large speech lattices. Experimental results on broadcast news demonstrate that the SLM outperforms a large 4-gram LM.

Original language: English (US)
Title of host publication: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Pages: 1658-1661
Number of pages: 4
State: Published - 2012
Externally published: Yes
Event: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012 - Portland, OR, United States
Duration: Sep 9 2012 → Sep 13 2012

Publication series

Name: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Volume: 2

Conference

Conference: 13th Annual Conference of the International Speech Communication Association 2012, INTERSPEECH 2012
Country/Territory: United States
City: Portland, OR
Period: 9/9/12 → 9/13/12

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Communication
