Presentation | 2003/10/30 Evaluation of the Document Clustering Method Based on Commonality Analysis of Multiple Documents (Natural Language Understanding and Models of Communication) Takahiko KAWATANI |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper describes an evaluation of a non-hierarchical clustering method, proposed by the author, that is based on multi-document commonality analysis. In the method, a document extracted as a seed grows into a cluster by iteratively merging documents on the same topic. Its distinguishing features lie in how document-cluster similarity is obtained: it uses a new similarity measure that reflects term co-occurrence information, and it uses specific terms and term pairs extracted from the current cluster. In experiments using 7546 documents extracted from 38 events in the TDT2 corpus, 36 events were extracted as clusters with 94.41% clustering accuracy. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | |
Paper # | NLC2003-31 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2003/10/30 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Evaluation of the Document Clustering Method Based on Commonality Analysis of Multiple Documents (Natural Language Understanding and Models of Communication) |
Sub Title (in English) | |
Keyword(1) | |
1st Author's Name | Takahiko KAWATANI |
1st Author's Affiliation | Hewlett-Packard Labs Japan, Hewlett-Packard Japan |
Date | 2003/10/30 |
Paper # | NLC2003-31 |
Volume (vol) | vol.103 |
Number (no) | 407 |
Page | pp.- |
#Pages | 8 |
Date of Issue |