Presentation | 2012-11-07 Nested-Hierarchical Dirichlet Process Mixtures for Simultaneous Document-Topic Clustering Shoji TOMINAGA, Masamichi SHIMOSAKA, Rui FUKUI, Tomomasa SATO |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, we propose a nonparametric Bayesian framework for natural language processing (NLP). Our framework is based on two variants of the Dirichlet process (DP), the hierarchical DP and the nested DP, and simultaneously optimizes the document clusters and the topics while estimating the number of each. We also provide closed-form posterior estimation methods for the framework using variational inference and a blocked Gibbs sampler, so our method offers a performance tradeoff according to the data size. Experimental results on real corpus data show that our framework offers a new perspective for the field of NLP and achieves higher prediction scores than existing nonparametric generative models. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | document clustering / topic analyses / nonparametric Bayes / nested-hierarchical Dirichlet processes |
Paper # | IBISML2012-56 |
Date of Issue |
Conference Information | |
Committee | IBISML |
---|---|
Conference Date | 2012/10/31 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Information-Based Induction Sciences and Machine Learning (IBISML) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Nested-Hierarchical Dirichlet Process Mixtures for Simultaneous Document-Topic Clustering |
Sub Title (in English) | |
Keyword(1) | document clustering |
Keyword(2) | topic analyses |
Keyword(3) | nonparametric Bayes |
Keyword(4) | nested-hierarchical Dirichlet processes |
1st Author's Name | Shoji TOMINAGA |
1st Author's Affiliation | Graduate School of Information Science and Engineering, the University of Tokyo |
2nd Author's Name | Masamichi SHIMOSAKA |
2nd Author's Affiliation | Graduate School of Information Science and Engineering, the University of Tokyo |
3rd Author's Name | Rui FUKUI |
3rd Author's Affiliation | Graduate School of Information Science and Engineering, the University of Tokyo |
4th Author's Name | Tomomasa SATO |
4th Author's Affiliation | Graduate School of Information Science and Engineering, the University of Tokyo |
Date | 2012-11-07 |
Paper # | IBISML2012-56 |
Volume (vol) | vol.112 |
Number (no) | 279 |
Page | pp.- |
#Pages | 8 |