Presentation | 2004/12/14 Evaluation of Class N-gram Language Model Based on Semantic Attributes Haruki IKEYA, Takashi FUKUDA, Hirobumi YAMADA, Kouichi KATSURADA, Tsuneo NITTA |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In general, a class N-gram model (1) can be trained on a small corpus and (2) has the potential to handle unknown words; however, it yields less improvement in performance. This paper describes an attempt to apply to an ASR system a class N-gram language model (LM) in which the classes correspond to semantic attributes in the Japanese lexicon "Goi-Taikei". The proposed language model is built by (1) associating nouns in the training corpora with their semantic attributes and (2) merging words that do not appear in the Goi-Taikei into a single class. Because it relies on well-structured semantic attributes, the proposed method can be trained quickly compared with auto-clustering methods based on perplexity. Experiments were conducted to evaluate the proposed language model by comparing performance across LMs. Furthermore, an auto-clustering method that uses the semantic attributes in "Goi-Taikei" as initial clusters was investigated. We showed that the proposed method achieved a significant improvement in spoken dialog recognition experiments and could eliminate the computation needed for clustering in a class N-gram model. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Spoken Dialogue / Language Model / Class N-gram / Semantic Attributes |
Paper # | NLC2004-61,SP2004-101 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2004/12/14 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Evaluation of Class N-gram Language Model Based on Semantic Attributes |
Sub Title (in English) | |
Keyword(1) | Spoken Dialogue |
Keyword(2) | Language Model |
Keyword(3) | Class N-gram |
Keyword(4) | Semantic Attributes |
1st Author's Name | Haruki IKEYA |
1st Author's Affiliation | Graduate School of Engineering, Toyohashi University of Technology
2nd Author's Name | Takashi FUKUDA |
2nd Author's Affiliation | Graduate School of Engineering, Toyohashi University of Technology |
3rd Author's Name | Hirobumi YAMADA |
3rd Author's Affiliation | Graduate School of Engineering, Toyohashi University of Technology |
4th Author's Name | Kouichi KATSURADA |
4th Author's Affiliation | Graduate School of Engineering, Toyohashi University of Technology |
5th Author's Name | Tsuneo NITTA |
5th Author's Affiliation | Graduate School of Engineering, Toyohashi University of Technology |
Date | 2004/12/14 |
Paper # | NLC2004-61,SP2004-101 |
Volume (vol) | vol.104 |
Number (no) | 539 |
Page | pp.-
#Pages | 6 |
Date of Issue |
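
The class N-gram idea summarized in the abstract can be illustrated with a minimal sketch of a class bigram, where the probability of a word is factored as P(class | previous class) × P(word | class). This is not the authors' implementation: the word-to-class map `WORD_TO_CLASS`, the class names, the `<other>` class for words missing from the map, and the toy corpus are all hypothetical stand-ins for a lookup in a semantic lexicon such as Goi-Taikei, and no smoothing is applied.

```python
from collections import Counter

# Hypothetical word-to-semantic-attribute map (stand-in for a lexicon lookup).
WORD_TO_CLASS = {"tokyo": "PLACE", "osaka": "PLACE", "ticket": "OBJECT"}
OTHER = "<other>"  # assumed single class for words missing from the map
START = "<s>"      # sentence-start marker used as the first context


def to_class(word):
    """Map a word to its class, falling back to the shared OTHER class."""
    return WORD_TO_CLASS.get(word, OTHER)


def train(sentences):
    """Collect the counts needed for an unsmoothed class bigram model."""
    class_bigrams = Counter()  # (previous class, class) counts
    class_context = Counter()  # how often a class appears as context
    word_counts = Counter()    # (word, class) counts
    class_counts = Counter()   # class occurrence counts
    for sentence in sentences:
        classes = [START] + [to_class(w) for w in sentence]
        for word, cls in zip(sentence, classes[1:]):
            word_counts[(word, cls)] += 1
            class_counts[cls] += 1
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            class_context[prev] += 1
    return class_bigrams, class_context, word_counts, class_counts


def prob(word, prev_word, model):
    """P(word | prev_word) = P(class | prev class) * P(word | class)."""
    class_bigrams, class_context, word_counts, class_counts = model
    cls = to_class(word)
    prev_cls = START if prev_word is None else to_class(prev_word)
    if class_context[prev_cls] == 0 or class_counts[cls] == 0:
        return 0.0  # unseen context or class; a real model would smooth
    p_class = class_bigrams[(prev_cls, cls)] / class_context[prev_cls]
    p_word = word_counts[(word, cls)] / class_counts[cls]
    return p_class * p_word


corpus = [["to", "tokyo"], ["to", "osaka"], ["a", "ticket"]]
model = train(corpus)
# "a tokyo" never occurs as a word pair, yet it receives probability mass
# because the class pair (<other>, PLACE) was observed.
print(prob("tokyo", "a", model))
```

Because counts are shared within a class, the word pair "a tokyo" gets a nonzero probability even though it never appears in the toy corpus, which is the kind of data-sparsity benefit the abstract attributes to class N-gram models trained on small corpora.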