Context adaptation using variational Bayesian learning for ngram models based on probabilistic LSA

Takuya MISHINA; Mikio YAMAMOTO

Presentation	2002/12/13 Context adaptation using variational Bayesian learning for ngram models based on probabilistic LSA Takuya MISHINA, Mikio YAMAMOTO
Abstract(in Japanese)	(See Japanese page)
Abstract(in English)	This paper describes a context adaptation method that uses variational Bayesian learning for a statistical language model based on PLSA (Probabilistic Latent Semantic Analysis), which models global context. Gildea and Hofmann (1999) originally proposed a training and adaptation method for PLSA based on the EM algorithm. However, EM adaptation tends to overfit to a context, because the context available for dynamic adaptation is much smaller than the data used for training. To avoid overfitting, we use a variational Bayesian learning method for adaptation, which is expected to be more tolerant of the overfitting problem. We compare the two methods by test-set perplexity of unigram and trigram models. The experiments show that, compared with EM adaptation, Bayesian adaptation achieves stable, high performance in perplexity for small contexts made up of medium-frequency words. For contexts made up of high- and medium-frequency words, the unigram perplexity of EM adaptation is comparable to or lower than that of Bayesian adaptation, but Bayesian adaptation gives better perplexity for trigram models.
Keyword(in Japanese)	(See Japanese page)
Keyword(in English)	Probabilistic LSA / Statistical language model / Variational Bayesian learning / EM algorithm
Paper #	NLC2002-73
Date of Issue
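
The overfitting concern raised in the abstract can be illustrated with a minimal sketch of EM-style context adaptation for a PLSA unigram model. This is not the authors' implementation: the toy topic table `P_W_GIVEN_Z`, the function names, and the iteration count are all assumptions made for illustration, and only the EM side is shown (the variational Bayesian variant is not reproduced here). The sketch keeps P(w|z) fixed and re-estimates the topic weights P(z|context) from the context words alone.

```python
import math

# Hypothetical toy model: two topics with fixed word distributions P(w|z).
P_W_GIVEN_Z = [
    {"stock": 0.5, "market": 0.3, "game": 0.1, "team": 0.1},  # topic 0
    {"stock": 0.1, "market": 0.1, "game": 0.4, "team": 0.4},  # topic 1
]


def em_adapt(context, p_w_given_z, n_iter=50):
    """Estimate theta[z] = P(z | context) by EM, keeping P(w|z) fixed."""
    n_topics = len(p_w_given_z)
    theta = [1.0 / n_topics] * n_topics  # start from uniform topic weights
    for _ in range(n_iter):
        counts = [0.0] * n_topics
        for w in context:
            # E-step: posterior responsibility of each topic for word w
            joint = [theta[z] * p_w_given_z[z][w] for z in range(n_topics)]
            total = sum(joint)
            for z in range(n_topics):
                counts[z] += joint[z] / total
        # M-step: normalized expected counts become the new topic weights
        theta = [c / len(context) for c in counts]
    return theta


def unigram_prob(w, theta, p_w_given_z):
    """P(w | context) = sum_z P(w|z) * P(z | context)."""
    return sum(t * p[w] for t, p in zip(theta, p_w_given_z))


def perplexity(words, theta, p_w_given_z):
    """Perplexity of a word sequence under the adapted unigram model."""
    log_sum = sum(math.log(unigram_prob(w, theta, p_w_given_z)) for w in words)
    return math.exp(-log_sum / len(words))


# A one-word context pushes nearly all weight onto a single topic.
adapted = em_adapt(["stock"], P_W_GIVEN_Z)
uniform = [0.5, 0.5]
held_out = ["game", "team"]
print(adapted)
print(perplexity(held_out, adapted, P_W_GIVEN_Z))
print(perplexity(held_out, uniform, P_W_GIVEN_Z))
```

In this toy setting, the weights adapted from a single context word concentrate almost entirely on one topic, and the perplexity on held-out words from the other topic is higher than with the unadapted uniform weights. That is the kind of behavior a prior over the topic weights, as in Bayesian adaptation, is meant to dampen.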

Conference Information
Committee	NLC
Conference Date	2002/12/13 (1 day)
Place (in Japanese)	(See Japanese page)
Place (in English)
Topics (in Japanese)	(See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To	Natural Language Understanding and Models of Communication (NLC)
Language	JPN
Title (in Japanese)	(See Japanese page)
Sub Title (in Japanese)	(See Japanese page)
Title (in English)	Context adaptation using variational Bayesian learning for ngram models based on probabilistic LSA
Sub Title (in English)
Keyword(1)	Probabilistic LSA
Keyword(2)	Statistical language model
Keyword(3)	Variational Bayesian learning
Keyword(4)	EM algorithm
1st Author's Name	Takuya MISHINA
1st Author's Affiliation	Master's Program in Science and Engineering, University of Tsukuba
2nd Author's Name	Mikio YAMAMOTO
2nd Author's Affiliation	Institute of Information Sciences and Electronics, University of Tsukuba
Date	2002/12/13
Paper #	NLC2002-73
Volume (vol)	vol.102
Number (no)	528
Page	pp.-
#Pages	6
Date of Issue