Presentation 2002/12/13
Context adaptation using variational Bayesian learning for ngram models based on probabilistic LSA
Takuya MISHINA, Mikio YAMAMOTO
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper describes a context adaptation method using variational Bayesian learning for a statistical language model based on PLSA (Probabilistic Latent Semantic Analysis), which models global context. Gildea and Hofmann (1999) proposed the original training and adaptation method for PLSA, which is based on the EM algorithm. However, EM adaptation tends to overfit to a context, because the context available for dynamic adaptation is much smaller than the data used for training. To avoid overfitting, we use a variational Bayesian learning method for the adaptation, which is expected to be more tolerant of the overfitting problem. We compare the two methods by test-set perplexity of unigram and trigram models. The experiments show that, for small contexts made up of medium-frequency words, Bayesian adaptation gives stable and better perplexity than EM adaptation. For contexts made up of high- and medium-frequency words, the unigram perplexity of EM adaptation is comparable to or lower than that of Bayesian adaptation, but Bayesian adaptation gives better trigram perplexity.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Probabilistic LSA / Statistical language model / Variational Bayesian learning / EM algorithm
Paper # NLC2002-73
Date of Issue

Conference Information
Committee NLC
Conference Date 2002/12/13 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Context adaptation using variational Bayesian learning for ngram models based on probabilistic LSA
Sub Title (in English)
Keyword(1) Probabilistic LSA
Keyword(2) Statistical language model
Keyword(3) Variational Bayesian learning
Keyword(4) EM algorithm
1st Author's Name Takuya MISHINA
1st Author's Affiliation Master's Program in Science and Engineering, University of Tsukuba
2nd Author's Name Mikio YAMAMOTO
2nd Author's Affiliation Institute of Information Sciences and Electronics, University of Tsukuba
Date 2002/12/13
Paper # NLC2002-73
Volume (vol) vol.102
Number (no) 528
Page pp.-
#Pages 6
Date of Issue