Presentation | 1998/12/10 MULTI CLASS COMPOSITE N-GRAM LANGUAGE MODEL BASED ON CONNECTION DIRECTION Hirofumi Yamamoto, Yoshinori Sagisaka |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | A new word-clustering technique is proposed to efficiently build statistically salient class 2-grams from language corpora. By splitting word neighboring characteristics into word-preceding and word-following directions, multiple (two-dimensional) word classes are assigned to each word. On each side, word classes are merged into larger clusters independently, according to the preceding or following word distributions. This word clustering provides more efficient and statistically reliable word clusters. Further, we extend it to a Multi-Class Composite N-gram whose units are Multi-Class 2-grams and joined words. The Multi-Class Composite N-gram showed better performance in both perplexity and recognition rate, with a logical parameter size one thousandth smaller than that of conventional word 2-grams. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Class N-gram / Variable Order N-gram / Automatic Clustering / Joined Word |
Paper # | NLC98-38,SP98-102 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 1998/12/10 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | MULTI CLASS COMPOSITE N-GRAM LANGUAGE MODEL BASED ON CONNECTION DIRECTION |
Sub Title (in English) | |
Keyword(1) | Class N-gram |
Keyword(2) | Variable Order N-gram |
Keyword(3) | Automatic Clustering |
Keyword(4) | Joined Word |
1st Author's Name | Hirofumi Yamamoto |
1st Author's Affiliation | ATR Interpreting Telecommunications Res.Labs. |
2nd Author's Name | Yoshinori Sagisaka |
2nd Author's Affiliation | ATR Interpreting Telecommunications Res.Labs. |
Date | 1998/12/10 |
Paper # | NLC98-38,SP98-102 |
Volume (vol) | vol.98 |
Number (no) | 460 |
Page | pp.- |
#Pages | 6 |
Date of Issue |
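
The abstract describes assigning each word two classes, one per connection direction. As a rough illustration only, the sketch below estimates a 2-gram probability from such a pair of class maps, assuming the factorization P(cur | prev) = P(to_class(cur) | from_class(prev)) x P(cur | to_class(cur)). The factorization, the names `to_class` and `from_class`, the function `train_multiclass_bigram`, and the toy data are all assumptions made for this example and are not taken from the record. The clustering step itself and joined words are not shown, and no smoothing is applied.

```python
from collections import Counter
from typing import Callable, Dict, List


def train_multiclass_bigram(
    sentences: List[List[str]],
    to_class: Dict[str, str],    # hypothetical: class of a word when it is the predicted word
    from_class: Dict[str, str],  # hypothetical: class of a word when it is the history word
) -> Callable[[str, str], float]:
    """Return prob(prev, cur) using unsmoothed relative-frequency estimates."""
    class_pair = Counter()  # (from-class of prev, to-class of cur) counts
    from_total = Counter()  # how often each from-class appears as a history
    word_count = Counter()  # word unigram counts
    to_total = Counter()    # total word occurrences per to-class

    for sent in sentences:
        for w in sent:
            word_count[w] += 1
            to_total[to_class[w]] += 1
        for prev, cur in zip(sent, sent[1:]):
            class_pair[(from_class[prev], to_class[cur])] += 1
            from_total[from_class[prev]] += 1

    def prob(prev: str, cur: str) -> float:
        cf, ct = from_class[prev], to_class[cur]
        if from_total[cf] == 0 or to_total[ct] == 0:
            return 0.0  # unseen history class or target class: no estimate without smoothing
        class_transition = class_pair[(cf, ct)] / from_total[cf]
        word_given_class = word_count[cur] / to_total[ct]
        return class_transition * word_given_class

    return prob


# Toy corpus and hand-made class maps (illustrative only).
sentences = [["a", "x"], ["b", "x"], ["a", "y"]]
to_class = {"a": "T0", "b": "T0", "x": "T1", "y": "T1"}
from_class = {"a": "F0", "b": "F0", "x": "F1", "y": "F1"}

prob = train_multiclass_bigram(sentences, to_class, from_class)
print(prob("b", "y"))  # the pair ("b", "y") never occurs in the toy corpus
```

In this toy setup the unseen pair ("b", "y") still receives a nonzero probability, because "b" shares its history-side class with "a" and "y" shares its target-side class with "x". That sharing is the kind of statistical reliability gain the abstract attributes to class 2-grams; using separate class maps for the two directions lets a word be grouped differently as a history than as a prediction target.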