Presentation 2008-06-19
Temporal Clustering of Internet News Articles with Excluding Single Articles
Tomohiro NAKAMURA, Takayoshi HIRANO, Yu HIRATE, Hayato YAMANA,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Clustering Internet news articles makes it possible to detect various kinds of useful information, for example related articles and the latest topic words. This area has been widely researched since the TDT project. Conventional clustering methods have difficulty detecting a single article as a cluster of its own, even though many single articles exist. In this paper, we propose a method for clustering news articles that excludes single articles in advance by using proper noun information, topographic information, and other characteristics that distinguish single articles from non-single articles. In the evaluation, we use half a year of Japanese news articles. Compared with the Single-Link Method, which on its own has difficulty judging whether an article is single, our proposed method improves precision by 10.2% and reduces the computation time to approximately one third.
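The two-stage idea in the abstract (filter out single articles first, then cluster the rest) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the features or thresholds, so the filter below is a hypothetical stand-in that treats an article as single when none of its proper nouns appears in any other article, and the clustering step is a plain threshold-based single-link grouping over Jaccard similarity. The names `exclude_singles`, `single_link`, and `jaccard`, the sample data, and the threshold are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of terms."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def exclude_singles(proper_nouns):
    """Split article ids into (candidates, singles).

    Hypothetical filter: an article is 'single' when none of its
    proper nouns occurs in any other article.
    """
    # Number of articles each proper noun appears in.
    doc_freq = Counter(n for nouns in proper_nouns.values() for n in nouns)
    candidates, singles = [], []
    for article_id, nouns in proper_nouns.items():
        shared = any(doc_freq[n] >= 2 for n in nouns)
        (candidates if shared else singles).append(article_id)
    return candidates, singles


def single_link(proper_nouns, ids, threshold):
    """Single-link grouping: two articles end up in the same cluster
    if a chain of pairs with similarity >= threshold connects them."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(ids, 2):
        if jaccard(proper_nouns[a], proper_nouns[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for i in ids:
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(c) for c in clusters.values())


# Made-up sample data: article id -> proper nouns found in the article.
articles = {
    "a1": {"Tokyo", "Waseda"},
    "a2": {"Tokyo", "Olympics"},
    "a3": {"Osaka"},
    "a4": {"Waseda", "Tokyo"},
}

candidates, singles = exclude_singles(articles)
clusters = single_link(articles, candidates, threshold=0.3)
```

In this sketch the filter is a single pass over a document-frequency table, while the clustering step compares every pair of remaining articles, so removing single articles up front shrinks the quadratic part of the work. Whether this matches the source of the speed-up reported in the paper is an assumption.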
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Document Clustering / News Articles
Paper # DE2008-11,PRMU2008-29
Date of Issue

Conference Information
Committee DE
Conference Date 2008/6/12 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Data Engineering (DE)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Temporal Clustering of Internet News Articles with Excluding Single Articles
Sub Title (in English)
Keyword(1) Document Clustering
Keyword(2) News Articles
1st Author's Name Tomohiro NAKAMURA
1st Author's Affiliation Graduate School of Fundamental Science and Engineering, Waseda University
2nd Author's Name Takayoshi HIRANO
2nd Author's Affiliation Graduate School of Fundamental Science and Engineering, Waseda University
3rd Author's Name Yu HIRATE
3rd Author's Affiliation Media Network Center, Waseda University
4th Author's Name Hayato YAMANA
4th Author's Affiliation Science and Engineering, Waseda University:National Institute of Informatics
Date 2008-06-19
Paper # DE2008-11,PRMU2008-29
Volume (vol) vol.108
Number (no) 93
Page pp.-
#Pages 6
Date of Issue