Presentation | 2016-07-06 A New Probabilistic Topic Model Based on Variable Bin Width Histogram Hideaki Kim, Tomoharu Iwata, Hiroshi Sawada |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Probabilistic topic models, as represented by latent Dirichlet allocation (LDA), have been widely used for analyzing not only categorical data but also continuous data such as times of word appearance and price information. In topic models for continuous data, however, the component distributions need to be simple exponential families, such as normal distributions, for parameter estimation to be efficient, which limits the representational power of the model. In this paper, we overcome this limitation by incorporating the nonparametric histogram density estimator into the topic model, yielding a new probabilistic topic model. The parameters, including the bin widths, are estimated by efficient collapsed Gibbs sampling. We derive the estimation algorithms for both the regular and the variable bin width scenarios. We apply the proposed method to synthetic data and confirm that it performs well. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | LDA / topic model / histogram / bin width selection |
Paper # | IBISML2016-6 |
Date of Issue | 2016-06-28 (IBISML) |
Conference Information | |
Committee | NC / IPSJ-BIO / IBISML / IPSJ-MPS |
---|---|
Conference Date | 2016/7/4 (3 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Okinawa Institute of Science and Technology |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Machine Learning Approach to Biodata Mining, and General |
Chair | Shigeo Sato(Tohoku Univ.) / / Kenji Fukumizu(ISM) |
Vice Chair | Masafumi Hagiwara(Keio Univ.) / / Masashi Sugiyama(Univ. of Tokyo) / Hisashi Kashima(Kyoto Univ.) |
Secretary | Masafumi Hagiwara(Kyoto Sangyo Univ.) / (Tokyo Inst. of Tech.) / Masashi Sugiyama / Hisashi Kashima(Univ. of Tokyo) / (Nagoya Inst. of Tech.) |
Assistant | Hisanao Akima(Tohoku Univ.) / Yoshihisa Shinozawa(Keio Univ.) / / Toshihiro Kamishima(AIST) / Tomoharu Iwata(NTT) |
Paper Information | |
Registration To | Technical Committee on Neurocomputing / Special Interest Group on Bioinformatics and Genomics / Technical Committee on Information-Based Induction Sciences and Machine Learning / Special Interest Group on Mathematical Modeling and Problem Solving |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A New Probabilistic Topic Model Based on Variable Bin Width Histogram |
Sub Title (in English) | |
Keyword(1) | LDA |
Keyword(2) | topic model |
Keyword(3) | histogram |
Keyword(4) | bin width selection |
1st Author's Name | Hideaki Kim |
1st Author's Affiliation | Nippon Telegraph and Telephone Corporation(NTT) |
2nd Author's Name | Tomoharu Iwata |
2nd Author's Affiliation | Nippon Telegraph and Telephone Corporation(NTT) |
3rd Author's Name | Hiroshi Sawada |
3rd Author's Affiliation | Nippon Telegraph and Telephone Corporation(NTT) |
Date | 2016-07-06 |
Paper # | IBISML2016-6 |
Volume (vol) | vol.116 |
Number (no) | IBISML-121 |
Page | pp.217-223 (IBISML) |
#Pages | 7 |
Date of Issue | 2016-06-28 (IBISML) |
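
The abstract refers to a histogram density estimator with regular or variable bin widths. As a rough illustration of that building block only, and not of the paper's topic model or its collapsed Gibbs sampler, the following minimal sketch computes the standard histogram density estimate on bins of arbitrary width. The function name `histogram_density`, the bin edges, and the sample values are hypothetical and chosen purely for illustration.

```python
from bisect import bisect_right


def histogram_density(samples, edges):
    """Piecewise-constant density estimate on bins of arbitrary width.

    `edges` must be strictly increasing; bin k covers [edges[k], edges[k+1]).
    Samples outside [edges[0], edges[-1]] are ignored.
    """
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for x in samples:
        if edges[0] <= x <= edges[-1]:
            # A sample equal to the right-most edge is assigned to the last bin.
            k = min(bisect_right(edges, x) - 1, n_bins - 1)
            counts[k] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside the bin range")
    # Height of bin k = count_k / (N * width_k), so the estimate integrates to 1.
    return [c / (total * (edges[k + 1] - edges[k])) for k, c in enumerate(counts)]


# Hypothetical data: a narrow bin where samples are dense, wider bins elsewhere.
samples = [0.1, 0.2, 0.3, 0.4, 1.5, 3.0]
edges = [0.0, 0.5, 2.0, 4.0]
heights = histogram_density(samples, edges)
area = sum(h * (edges[k + 1] - edges[k]) for k, h in enumerate(heights))
```

Dividing each count by its own bin width is what lets narrow bins resolve dense regions while wide bins cover sparse ones, and it keeps the total area equal to one. With regular bin widths the same formula reduces to the familiar equal-width histogram.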