Language Modelling by String Pattern N-gram (文字列パターンのN-gramによる文節モデルの検討)

Presentation	1995/12/15, Language Modelling by String Pattern N-gram, Akinori Ito, Masaki Kohda
Abstract(in English)	Language models based on Markov models (N-grams) are widely used in sentence and dialog speech recognition. To apply these models to Japanese speech recognition, one has to decide what the unit of the N-gram should be. Because Japanese sentences are not divided into words, morphological analysis is required before word-by-word processing. However, it is difficult to obtain an accurate analysis automatically for transcriptions of spontaneous speech. In this paper, we propose several language models that can be constructed fully automatically. We examined three types of models: an N-gram over string patterns, an N-gram over automatically analyzed morphemes, and a class N-gram over string patterns. These models were compared by perplexity. In the experiments, the string pattern class N-gram performed better than the morpheme N-gram.
Keyword(in English)	Language Model / N-gram / word similarity
Paper #	NLC95-61,SP95-96
Date of Issue
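
The abstract compares N-gram models built over different units by their perplexity. As a rough illustration of that kind of comparison, the following is a minimal sketch of a bigram model with add-one smoothing and a perplexity function. It is not the authors' method: the abstract does not define how string patterns or classes are extracted, so the sketch simply takes each sentence as a sequence of units (single characters here), and the function names `train_bigram` and `perplexity` and the toy sentences are assumptions made for this example.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"

def train_bigram(sentences):
    """Count bigrams over unit sequences (a unit may be a character or a token)."""
    bigram_counts = Counter()
    context_counts = Counter()
    vocab = {EOS}
    for units in sentences:
        seq = [BOS] + list(units) + [EOS]
        vocab.update(units)
        for prev, cur in zip(seq, seq[1:]):
            bigram_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return bigram_counts, context_counts, vocab

def perplexity(model, sentences):
    """Per-unit perplexity under add-one smoothing (an assumption, chosen for brevity)."""
    bigram_counts, context_counts, vocab = model
    # One extra slot stands in for all units never seen in training.
    v = len(vocab) + 1
    log_prob = 0.0
    n = 0
    for units in sentences:
        seq = [BOS] + list(units) + [EOS]
        for prev, cur in zip(seq, seq[1:]):
            p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + v)
            log_prob += math.log2(p)
            n += 1
    return 2 ** (-log_prob / n)

# Toy data (hypothetical): unsegmented Japanese strings, one character per unit.
train = ["きょうはいいてんきです", "きょうはあめです"]
model = train_bigram(train)
print(perplexity(model, ["きょうはいいてんきです"]))
print(perplexity(model, ["あしたははれです"]))
```

Passing lists of tokens instead of strings gives the same computation over larger units. Note that per-unit perplexities over different unit inventories are not directly comparable unless they are normalized to a common unit, such as per character.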

Paper Information
Registration To	Natural Language Understanding and Models of Communication (NLC)
Language	JPN
Title (in Japanese)	文字列パターンのN-gramによる文節モデルの検討
Title (in English)	Language Modelling by String Pattern N-gram
Sub Title (in English)
Keyword(1)	Language Model
Keyword(2)	N-gram
Keyword(3)	word similarity
1st Author's Name	Akinori Ito
1st Author's Affiliation	Faculty of Engineering, Yamagata University
2nd Author's Name	Masaki Kohda
2nd Author's Affiliation	Faculty of Engineering, Yamagata University
Date	1995/12/15
Paper #	NLC95-61,SP95-96
Volume (vol)	vol.95
Number (no)	429
Page	pp.-
#Pages	6
Date of Issue