Presentation 2005/6/16
Language Model Adaptation for ASR Using Machine-Translated Data
JENSSON Arnar THOR, Edward W.D. WHITTAKER, Koji IWANO, Sadaoki FURUI
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Text corpus size is an important issue when building a language model (LM), particularly for languages where little data is available. This paper introduces an LM adaptation technique that improves an LM built from a small amount of task-dependent text with the help of a machine-translated (MT) text corpus. Perplexity experiments were performed using data machine translated from English to French on a sentence-by-sentence basis and using dictionary lookup on a word-by-word basis. Perplexity and word error rate experiments were then performed using data machine translated from English to Icelandic on a word-by-word basis. For the latter, the baseline word error rate was 41.4%; LM interpolation reduced it significantly, to 37.6%.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Language Model Adaptation / Automatic Speech Recognition / Machine Translation / Sparse Text Corpus / Resource Deficient Languages
Paper # SP2005-23
Date of Issue
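
The abstract names LM interpolation as the adaptation step but gives no formula. The sketch below assumes simple linear interpolation, P(w) = lam * P_task(w) + (1 - lam) * P_mt(w), applied to add-alpha smoothed unigram models. That choice, the function names, the weight grid, and the toy sentences are illustrative assumptions and are not taken from the paper.

```python
import math
from collections import Counter


def unigram_lm(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def interpolate(p_task, p_mt, lam):
    """Linear interpolation: lam * P_task(w) + (1 - lam) * P_mt(w)."""
    return {w: lam * p_task[w] + (1.0 - lam) * p_mt[w] for w in p_task}


def perplexity(lm, tokens):
    """Perplexity of a unigram model on a token sequence."""
    log_prob = sum(math.log(lm[w]) for w in tokens)
    return math.exp(-log_prob / len(tokens))


# Toy data, invented for illustration only (not from the paper).
task_text = "open the file close the file".split()        # small task-dependent corpus
mt_text = ("open the door close the window "
           "save the file print the file").split()        # stands in for MT text
held_out = "save the file print the file".split()

vocab = set(task_text) | set(mt_text) | set(held_out)
p_task = unigram_lm(task_text, vocab)
p_mt = unigram_lm(mt_text, vocab)

# Baseline: the task-only model (equivalent to lam = 1.0).
ppl_task = perplexity(p_task, held_out)

# Grid search over the interpolation weight (hypothetical grid, step 0.1).
best_lam, best_ppl = min(
    ((k / 10, perplexity(interpolate(p_task, p_mt, k / 10), held_out))
     for k in range(11)),
    key=lambda pair: pair[1],
)

print(f"task-only perplexity: {ppl_task:.2f}")
print(f"best lam = {best_lam:.1f}, interpolated perplexity: {best_ppl:.2f}")
```

In practice the weight would be tuned on a development set kept separate from the evaluation data; the toy example reuses one held-out set only for brevity. For scale, the word error rates quoted in the abstract (41.4% to 37.6%) amount to a 3.8-point absolute reduction, about 9.2% relative.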

Conference Information
Committee SP
Conference Date 2005/6/16 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Speech (SP)
Language ENG
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Language Model Adaptation for ASR Using Machine-Translated Data
Sub Title (in English)
Keyword(1) Language Model Adaptation
Keyword(2) Automatic Speech Recognition
Keyword(3) Machine Translation
Keyword(4) Sparse Text Corpus
Keyword(5) Resource Deficient Languages
1st Author's Name JENSSON Arnar THOR
1st Author's Affiliation Department of Computer Science, Tokyo Institute of Technology
2nd Author's Name Edward W.D. WHITTAKER
2nd Author's Affiliation Department of Computer Science, Tokyo Institute of Technology
3rd Author's Name Koji IWANO
3rd Author's Affiliation Department of Computer Science, Tokyo Institute of Technology
4th Author's Name Sadaoki FURUI
4th Author's Affiliation Department of Computer Science, Tokyo Institute of Technology
Date 2005/6/16
Paper # SP2005-23
Volume (vol) vol.105
Number (no) 132
Page pp.-
#Pages 5
Date of Issue