Presentation 1997/12/12
An Improvement of n-gram Model by a Change of the Prediction Unit
Shinsuke Mori, Osamu Yamaji, Makoto Nagao
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we define a string-based n-gram model and a phrase-based n-gram model as extensions of the character n-gram model and the word-based n-gram model, respectively, and we propose a method to improve the predictive power of an n-gram model. The objective function in the model search is the average cross entropy, which has been shown to be effective for word clustering. Like deleted interpolation, this criterion is based on the idea of separating the corpus used for evaluation from the corpus used for model estimation. In experiments on a Japanese corpus, we obtained the following entropies: the string-based n-gram model scored 4.3791, lower than the character n-gram model's 5.4105, and the phrase-based n-gram model scored 4.4555, lower than the word-based n-gram model's 4.6053.
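The criterion named in the abstract can be illustrated with a minimal sketch: split the corpus into parts, estimate a model on all parts but one, measure cross entropy on the held-out part, and average over the parts. This is an illustration under stated assumptions, not the authors' implementation. The bigram order, the add-alpha smoothing, and the names bigram_cross_entropy and average_cross_entropy are all hypothetical choices made here; the abstract does not specify them. Each sequence may be a string of characters or a list of larger prediction units.

```python
import math
from collections import Counter


def bigram_cross_entropy(train, test, vocab_size, alpha=1.0):
    """Cross entropy, in bits per predicted unit, of an add-alpha bigram model.

    The model is estimated from `train` and evaluated on `test`.
    Add-alpha smoothing is an assumption of this sketch.
    """
    pair_counts = Counter()
    context_counts = Counter()
    for seq in train:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1

    total_bits = 0.0
    n_predictions = 0
    for seq in test:
        for prev, cur in zip(seq, seq[1:]):
            p = (pair_counts[(prev, cur)] + alpha) / (
                context_counts[prev] + alpha * vocab_size
            )
            total_bits -= math.log2(p)
            n_predictions += 1

    if n_predictions == 0:
        raise ValueError("test data contains no bigrams to predict")
    return total_bits / n_predictions


def average_cross_entropy(parts, alpha=1.0):
    """Average held-out cross entropy over all parts of a partitioned corpus.

    For each part, the model is estimated on the remaining parts only,
    so evaluation data never overlaps with estimation data.
    """
    vocab = {unit for part in parts for seq in part for unit in seq}
    scores = []
    for i, held_out in enumerate(parts):
        train = [seq for j, part in enumerate(parts) if j != i for seq in part]
        scores.append(bigram_cross_entropy(train, held_out, len(vocab), alpha))
    return sum(scores) / len(scores)
```

In a model search of the kind the abstract describes, a candidate change of prediction unit would be kept only if it lowers this average. Scores measured over different units are per-unit values, so they would need to be put on a common basis (for example, bits per character) before being compared.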
Keyword(in Japanese) (See Japanese page)
Keyword(in English) n-gram model / stochastic language model / string / phrase / EDR corpus
Paper # NLC97-48
Date of Issue

Conference Information
Committee NLC
Conference Date 1997/12/12 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) An Improvement of n-gram Model by a Change of the Prediction Unit
Sub Title (in English)
Keyword(1) n-gram model
Keyword(2) stochastic language model
Keyword(3) string
Keyword(4) phrase
Keyword(5) EDR corpus
1st Author's Name Shinsuke Mori
1st Author's Affiliation Department of Electrical Engineering, Kyoto University
2nd Author's Name Osamu Yamaji
2nd Author's Affiliation Department of Electrical Engineering, Kyoto University
3rd Author's Name Makoto Nagao
3rd Author's Affiliation Department of Electrical Engineering, Kyoto University
Date 1997/12/12
Paper # NLC97-48
Volume (vol) vol.97
Number (no) 440
Page pp.-
#Pages 8
Date of Issue