Presentation 2004/12/14
Evaluation of Class N-gram Language Model Based on Semantic Attributes
Haruki IKEYA, Takashi FUKUDA, Hirobumi YAMADA, Kouichi KATSURADA, Tsuneo NITTA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In general, a class N-gram model (1) can be trained on small corpora and (2) has the potential to handle unknown words; however, it yields only a small improvement in performance. This paper describes an attempt to apply to an ASR system a class N-gram language model (LM) in which each class corresponds to a semantic attribute in the Japanese lexicon "Goi-Taikei". The proposed language model is built by (1) associating nouns in the training corpora with their semantic attributes and (2) merging words that do not appear in Goi-Taikei into a class. Because it relies on well-structured semantic attributes, the proposed method can be trained quickly compared with auto-clustering methods based on perplexity. Experiments were conducted to evaluate the proposed language model by comparing its performance with that of other LMs. Furthermore, an auto-clustering method that uses the semantic attributes in "Goi-Taikei" as initial clusters was investigated. The proposed method achieved a significant improvement in spoken dialog recognition experiments and could eliminate the computation needed for clustering in a class N-gram.
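The class-based factorization the abstract relies on can be illustrated with a minimal sketch. It uses the standard class bigram decomposition P(w_i | w_{i-1}) = P(c_i | c_{i-1}) * P(w_i | c_i) with unsmoothed maximum-likelihood counts. The tiny lexicon, the class names, the shared "OTHER" class, and the toy corpus are all hypothetical stand-ins, not data from Goi-Taikei or from the paper, and the paper's actual training procedure may differ.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for Goi-Taikei semantic attributes.
SEMANTIC_CLASS = {"tokyo": "PLACE", "osaka": "PLACE", "ticket": "ARTIFACT"}
OTHER_CLASS = "OTHER"  # assumed shared class for words missing from the lexicon


def word_class(word):
    """Map a word to its semantic class, or to the shared class if absent."""
    return SEMANTIC_CLASS.get(word, OTHER_CLASS)


def train(sentences):
    """Collect class-bigram counts and word/class unigram counts."""
    class_bigrams, class_context = Counter(), Counter()
    word_counts, class_counts = Counter(), Counter()
    for sent in sentences:
        classes = ["<s>"] + [word_class(w) for w in sent]
        for w in sent:
            word_counts[w] += 1
            class_counts[word_class(w)] += 1
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_context[prev] += 1
    return class_bigrams, class_context, word_counts, class_counts


def bigram_prob(model, prev_word, word):
    """P(word | prev_word) = P(class | prev_class) * P(word | class), no smoothing."""
    class_bigrams, class_context, word_counts, class_counts = model
    c_prev = "<s>" if prev_word == "<s>" else word_class(prev_word)
    c = word_class(word)
    if class_context[c_prev] == 0 or class_counts[c] == 0:
        return 0.0
    p_class = class_bigrams[(c_prev, c)] / class_context[c_prev]
    p_word = word_counts[word] / class_counts[c]
    return p_class * p_word


corpus = [["go", "to", "tokyo"], ["go", "to", "osaka"], ["buy", "ticket"]]
model = train(corpus)
print(bigram_prob(model, "to", "tokyo"))
print(bigram_prob(model, "buy", "osaka"))  # word pair never seen in the corpus
```

In this toy setup the unseen pair "buy osaka" still receives a nonzero probability because both words fall into classes whose transition was observed, which is the small-corpus benefit the abstract attributes to class N-gram models.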
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Spoken Dialogue / Language Model / Class N-gram / Semantic Attributes
Paper # NLC2004-61,SP2004-101
Date of Issue

Conference Information
Committee NLC
Conference Date 2004/12/14 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Evaluation of Class N-gram Language Model Based on Semantic Attributes
Sub Title (in English)
Keyword(1) Spoken Dialogue
Keyword(2) Language Model
Keyword(3) Class N-gram
Keyword(4) Semantic Attributes
1st Author's Name Haruki IKEYA
1st Author's Affiliation Graduate School of Engineering, Toyohashi University of Technology
2nd Author's Name Takashi FUKUDA
2nd Author's Affiliation Graduate School of Engineering, Toyohashi University of Technology
3rd Author's Name Hirobumi YAMADA
3rd Author's Affiliation Graduate School of Engineering, Toyohashi University of Technology
4th Author's Name Kouichi KATSURADA
4th Author's Affiliation Graduate School of Engineering, Toyohashi University of Technology
5th Author's Name Tsuneo NITTA
5th Author's Affiliation Graduate School of Engineering, Toyohashi University of Technology
Date 2004/12/14
Paper # NLC2004-61,SP2004-101
Volume (vol) vol.104
Number (no) 539
Page pp.-
#Pages 6
Date of Issue