Presentation 2003/7/11
Probing Text Databases and Clustering to Extract New Topic Documents
Takanori MOURI, Hiroyuki KITAGAWA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Many information sources provide their database contents through query interfaces; Hidden Web sites are typical examples. The contents of these databases usually change dynamically as new documents on emerging topics are appended. In applications such as topic detection and trend analysis, we want to discover newly emerging content in the databases. However, it is very difficult for ordinary users to detect such content through the query interface alone, without support from the database administrators. We previously proposed a method to discover such content automatically. That method uses a classifier to generate biased query probes, which are issued to a given text database through its keyword-based query interface. In this paper, we improve the method by using hierarchical clustering instead of a classifier, and we evaluate its effectiveness in preliminary experiments.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Topic Detection / Text Database / Knowledge Discovery / Clustering
Paper # DE2003-95
Date of Issue

Conference Information
Committee DE
Conference Date 2003/7/11 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Data Engineering (DE)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Probing Text Databases and Clustering to Extract New Topic Documents
Sub Title (in English)
Keyword(1) Topic Detection
Keyword(2) Text Database
Keyword(3) Knowledge Discovery
Keyword(4) Clustering
1st Author's Name Takanori MOURI
1st Author's Affiliation Graduate School of Systems and Information Engineering, University of Tsukuba
2nd Author's Name Hiroyuki KITAGAWA
2nd Author's Affiliation Institute of Information Sciences and Electronics, University of Tsukuba
Date 2003/7/11
Paper # DE2003-95
Volume (vol) vol.103
Number (no) 192
Page pp.-
#Pages 6
Date of Issue