Presentation 1998/12/10
A Study on Sentence-Level Mixture N-gram based on Sentence Clustering
Tohru Shimizu, Teruo Ohno, Shingo Kuroiwa, Norio Higuchi
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper proposes a new method for building statistical N-gram language models that combines sentence-level mixture N-grams with selective use of similar-task data. In this method, the component N-gram parameters are estimated from both target-topic data and similar-task data, and the sentence-level mixture N-gram model is then adapted using only target-topic data. This approach has the advantage that more data can be used for training while useless clusters, those far from the target-topic data, are removed. Experimental results show that the method reduces cross-entropy compared with a standard trigram model.
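The abstract does not give the model's equations, so the following is only an illustrative sketch, not the authors' implementation. It assumes the commonly used sentence-level mixture form, P(s) = sum_k lambda_k * prod_i P_k(w_i | w_{i-1}), with one component model per sentence cluster, and it assumes the adaptation step is an EM re-estimation of the mixture weights lambda_k on target-topic sentences only. Add-alpha smoothed bigrams stand in for the trigram components, and every name (`train_bigram`, `adapt_weights`, the toy sentences) is hypothetical.

```python
import math
from collections import Counter

def train_bigram(sentences, vocab_size, alpha=0.5):
    """Add-alpha smoothed bigram model; a stand-in for one component N-gram."""
    pair, hist = Counter(), Counter()
    for s in sentences:
        prev = "<s>"
        for w in s:
            pair[(prev, w)] += 1
            hist[prev] += 1
            prev = w
    def prob(prev, w):
        return (pair[(prev, w)] + alpha) / (hist[prev] + alpha * vocab_size)
    return prob

def sentence_prob(model, sentence):
    """Probability of a whole sentence under a single component model."""
    p, prev = 1.0, "<s>"
    for w in sentence:
        p *= model(prev, w)
        prev = w
    return p

def mixture_prob(models, weights, sentence):
    """Sentence-level mixture: components are mixed per sentence, not per word."""
    return sum(lam * sentence_prob(m, sentence) for m, lam in zip(models, weights))

def adapt_weights(models, weights, target, iters=20):
    """EM re-estimation of the mixture weights using target-topic data only."""
    for _ in range(iters):
        acc = [0.0] * len(models)
        for s in target:
            joint = [lam * sentence_prob(m, s) for m, lam in zip(models, weights)]
            z = sum(joint)
            for k, j in enumerate(joint):
                acc[k] += j / z  # posterior of cluster k given sentence s
        weights = [a / len(target) for a in acc]
    return weights

def cross_entropy(models, weights, data):
    """Per-word cross-entropy in bits."""
    n_words = sum(len(s) for s in data)
    return -sum(math.log2(mixture_prob(models, weights, s)) for s in data) / n_words

# Toy clusters (invented for illustration): one near the target topic, one far from it.
cluster_near = [["book", "a", "room"], ["book", "a", "flight"]]
cluster_far = [["buy", "a", "stock"], ["sell", "a", "stock"]]
target = [["book", "a", "room"], ["book", "a", "flight"], ["book", "a", "room"]]

vocab = {w for s in cluster_near + cluster_far + target for w in s}
models = [train_bigram(c, len(vocab)) for c in (cluster_near, cluster_far)]

uniform = [0.5, 0.5]
adapted = adapt_weights(models, uniform, target)
ce_uniform = cross_entropy(models, uniform, target)
ce_adapted = cross_entropy(models, adapted, target)
print(adapted, ce_uniform, ce_adapted)
```

In this sketch, a cluster far from the target data ends up with a small weight after adaptation; dropping components whose weight falls below some threshold would be one possible way to realize the removal of useless clusters, though the abstract does not say how the authors do it.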
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Clustering / Statistical language model / Mixture N-gram / Conversational speech
Paper # NLC98-37,SP98-101
Date of Issue

Conference Information
Committee NLC
Conference Date 1998/12/10 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A Study on Sentence-Level Mixture N-gram based on Sentence Clustering
Sub Title (in English)
Keyword(1) Clustering
Keyword(2) Statistical language model
Keyword(3) Mixture N-gram
Keyword(4) Conversational speech
1st Author's Name Tohru Shimizu
1st Author's Affiliation KDD R&D Laboratories Inc.
2nd Author's Name Teruo Ohno
2nd Author's Affiliation KDD R&D Laboratories Inc.
3rd Author's Name Shingo Kuroiwa
3rd Author's Affiliation KDD R&D Laboratories Inc.
4th Author's Name Norio Higuchi
4th Author's Affiliation KDD R&D Laboratories Inc.
Date 1998/12/10
Paper # NLC98-37,SP98-101
Volume (vol) vol.98
Number (no) 460
Page pp.-
#Pages 8
Date of Issue