Presentation 2005/12/15
Speaking Style Transformation of Language Model Based on Statistical Machine Translation Framework
Yuya AKITA, Tatsuya KAWAHARA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) One of the most significant problems in language modeling of spontaneous speech, such as meetings and lectures, is that only a limited amount of matched training data, i.e. faithful transcripts for the relevant task domain, is available. In this paper, we propose a novel transformation approach that estimates language model statistics of spontaneous speech from a document-style text database, which is often available on a large scale. The proposed statistical transformation model is designed to model characteristic linguistic phenomena in spontaneous speech and to estimate their occurrence probabilities. These contextual patterns and probabilities are derived from a small parallel aligned corpus of faithful transcripts and their document-style texts. To realize wide coverage and reliable estimation, a model based on part-of-speech (POS) is also prepared to provide a back-off scheme from the word-based model. The approach has been successfully applied to estimating the language model for National Congress meetings from the archives of their minutes, and a significant reduction in test-set perplexity is achieved.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Spontaneous speech / Speech recognition / Language model / Statistical machine translation / Speaking style
Paper # NLC2005-75,SP2005-108
Date of Issue

Conference Information
Committee NLC
Conference Date 2005/12/15 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Speaking Style Transformation of Language Model Based on Statistical Machine Translation Framework
Sub Title (in English)
Keyword(1) Spontaneous speech
Keyword(2) Speech recognition
Keyword(3) Language model
Keyword(4) Statistical machine translation
Keyword(5) Speaking style
1st Author's Name Yuya AKITA
1st Author's Affiliation Academic Center for Computing and Media Studies, Kyoto University
2nd Author's Name Tatsuya KAWAHARA
2nd Author's Affiliation Academic Center for Computing and Media Studies, Kyoto University
Date 2005/12/15
Paper # NLC2005-75,SP2005-108
Volume (vol) vol.105
Number (no) 494
Page pp.-
#Pages 6