Presentation 2001/7/11
Incremental Document Clustering Based on Forgetting Factors
Yoshiharu Ishikawa, Hiroyuki Kitagawa
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Clustering plays an important role in various on-line applications, such as extracting useful information from news feed services and selecting relevant documents from incoming scientific articles in digital libraries. In on-line environments, users are generally more interested in newer documents than in older ones and have no interest in obsolete documents. Based on this observation, we propose an on-line document clustering method that incorporates the notion of a forgetting factor into the calculation of document similarities. The idea is that every document gradually loses its weight (or memory) as time passes, according to this factor. Since our method generates clusters using a document similarity measure based on the forgetting factor, newer documents have a greater effect on the resulting cluster structure than older ones. In this paper, we present the fundamental idea of our clustering method and describe its details, such as the similarity measure and an efficient incremental method for maintaining statistics.
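The abstract does not give the formulas, so the following is only a minimal sketch of how a forgetting factor could work, under stated assumptions: the weight decays exponentially with document age, the similarity is a cosine similarity scaled by both documents' weights, and the running total weight is updated incrementally instead of being recomputed. The function names, the decay form, and the use of cosine similarity are all hypothetical and not taken from the paper.

```python
import math


def document_weight(acquired_at, now, forgetting_factor):
    """Assumed decay model: weight = forgetting_factor ** age, with 0 < factor < 1."""
    if not 0.0 < forgetting_factor < 1.0:
        raise ValueError("forgetting_factor must lie strictly between 0 and 1")
    return forgetting_factor ** (now - acquired_at)


def cosine(vec_a, vec_b):
    """Plain cosine similarity between two sparse term-frequency dicts."""
    dot = sum(value * vec_b.get(term, 0.0) for term, value in vec_a.items())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def weighted_similarity(vec_a, t_a, vec_b, t_b, now, forgetting_factor):
    """Hypothetical similarity: cosine scaled by both documents' current weights."""
    w_a = document_weight(t_a, now, forgetting_factor)
    w_b = document_weight(t_b, now, forgetting_factor)
    return w_a * w_b * cosine(vec_a, vec_b)


def update_total_weight(total, elapsed, forgetting_factor, new_documents=1):
    """Incremental update: decay the old total once, then add the new arrivals.

    Each new document enters with weight 1, so no per-document rescan is needed.
    """
    return total * forgetting_factor ** elapsed + new_documents
```

Under these assumptions, a pair of identical documents acquired one time unit ago with a factor of 0.5 scores 0.25 instead of 1.0, which illustrates how older documents would contribute less to the cluster structure.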
Keyword(in Japanese) (See Japanese page)
Keyword(in English) clustering / on-line document processing / incremental processing / forgetting
Paper # DE2001-55
Date of Issue

Conference Information
Committee DE
Conference Date 2001/7/11 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Data Engineering (DE)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Incremental Document Clustering Based on Forgetting Factors
Sub Title (in English)
Keyword(1) clustering
Keyword(2) on-line document processing
Keyword(3) incremental processing
Keyword(4) forgetting
1st Author's Name Yoshiharu Ishikawa
1st Author's Affiliation Institute of Information Sciences and Electronics, University of Tsukuba
2nd Author's Name Hiroyuki Kitagawa
2nd Author's Affiliation Institute of Information Sciences and Electronics, University of Tsukuba
Date 2001/7/11
Paper # DE2001-55
Volume (vol) vol.101
Number (no) 192
Page pp.-
#Pages 8