Presentation 1995/12/15
Language Modelling by String Pattern N-gram
Akinori Ito, Masaki Kohda
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Markov-model-based language models (N-grams) are widely used in sentence and dialog speech recognition. To apply these models to Japanese speech recognition, one has to decide what the unit of the N-gram should be. Because Japanese sentences are not divided into words, morphological analysis is required before word-by-word processing. However, it is difficult to obtain a precise analysis automatically for transcriptions of spontaneous speech. In this paper, we propose several language models that can be constructed fully automatically. We examined three types of models: an N-gram over string patterns, an N-gram over automatically analyzed morphemes, and a string pattern class N-gram. The models were compared by perplexity. In the experiments, the string pattern class N-gram performed better than the morpheme N-gram.
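As a rough illustration of the evaluation measure named in the abstract, the following is a minimal sketch of a bigram model scored by perplexity. It is not the authors' method: the add-one smoothing, the `<s>` and `</s>` boundary markers, and the function names `train_bigram` and `perplexity` are all assumptions made for this example. The units here are single characters; any other segmentation (string patterns, morphemes) could be passed in as lists of units instead.

```python
import math
from collections import Counter


def train_bigram(sequences):
    """Collect bigram and context counts from an iterable of unit sequences."""
    bigram_counts = Counter()
    context_counts = Counter()
    vocab = set()
    for seq in sequences:
        # Boundary markers are an assumption of this sketch.
        padded = ["<s>"] + list(seq) + ["</s>"]
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            bigram_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    return bigram_counts, context_counts, vocab


def perplexity(sequences, model):
    """Per-unit perplexity under an add-one-smoothed bigram model."""
    bigram_counts, context_counts, vocab = model
    v = len(vocab) + 1  # one extra slot reserved for unseen units
    log_prob = 0.0
    n = 0
    for seq in sequences:
        padded = ["<s>"] + list(seq) + ["</s>"]
        for prev, cur in zip(padded, padded[1:]):
            p = (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + v)
            log_prob += math.log2(p)
            n += 1
    return 2 ** (-log_prob / n)


if __name__ == "__main__":
    model = train_bigram(["ab", "ab"])
    print(perplexity(["ab"], model))  # sequence seen in training
    print(perplexity(["ba"], model))  # unseen order, higher perplexity
```

Perplexity computed this way is per unit, so models built on different units are only comparable after normalizing to a common unit such as the character.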
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Language Model / N-gram / word similarity
Paper # NLC95-61,SP95-96
Date of Issue

Conference Information
Committee NLC
Conference Date 1995/12/15 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Vice Chair

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Language Modelling by String Pattern N-gram
Sub Title (in English)
Keyword(1) Language Model
Keyword(2) N-gram
Keyword(3) word similarity
1st Author's Name Akinori Ito
1st Author's Affiliation Faculty of Engineering, Yamagata University
2nd Author's Name Masaki Kohda
2nd Author's Affiliation Faculty of Engineering, Yamagata University
Date 1995/12/15
Paper # NLC95-61,SP95-96
Volume (vol) vol.95
Number (no) 429
Page pp.-
#Pages 6
Date of Issue