Presentation 2012-11-07
Nested-Hierarchical Dirichlet Process Mixtures for Simultaneous Document-Topic Clustering
Shoji TOMINAGA, Masamichi SHIMOSAKA, Rui FUKUI, Tomomasa SATO,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we propose a nonparametric Bayesian framework for natural language processing (NLP). The framework is based on two extensions of the Dirichlet process (DP), the hierarchical DP and the nested DP, and simultaneously optimizes document clusters and topics while estimating the number of each. We also provide closed-form posterior estimation methods for the framework, using variational inference and a blocked Gibbs sampler, so that the inference method can be chosen to trade off performance according to the data size. Experimental results on real corpus data show that the framework offers a new perspective for NLP and achieves higher prediction scores than existing nonparametric generative models.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) document clustering / topic analyses / nonparametric Bayes / nested-hierarchical Dirichlet processes
Paper # IBISML2012-56
Date of Issue

Conference Information
Committee IBISML
Conference Date 2012/10/31 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Information-Based Induction Sciences and Machine Learning (IBISML)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Nested-Hierarchical Dirichlet Process Mixtures for Simultaneous Document-Topic Clustering
Sub Title (in English)
Keyword(1) document clustering
Keyword(2) topic analyses
Keyword(3) nonparametric Bayes
Keyword(4) nested-hierarchical Dirichlet processes
1st Author's Name Shoji TOMINAGA
1st Author's Affiliation Graduate School of Information Science and Engineering, the University of Tokyo
2nd Author's Name Masamichi SHIMOSAKA
2nd Author's Affiliation Graduate School of Information Science and Engineering, the University of Tokyo
3rd Author's Name Rui FUKUI
3rd Author's Affiliation Graduate School of Information Science and Engineering, the University of Tokyo
4th Author's Name Tomomasa SATO
4th Author's Affiliation Graduate School of Information Science and Engineering, the University of Tokyo
Date 2012-11-07
Paper # IBISML2012-56
Volume (vol) vol.112
Number (no) 279
Page pp.-
#Pages 8