Presentation 2006-03-07
Refining Training Data Based on Document Similarity for Semi-automatic Building Domain-Specific Web Search Engines
Reiko MIYAGAWA, Koji IWANUMA, Hidetomo HIDETOMO
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The keyword spice method proposed by Kokubo et al. is an approach for building domain-specific web search engines that achieves high precision and recall. However, the approach requires manually classifying 2,000 Web pages into positive and negative examples, which serve as the training data for learning a keyword spice. Because the classification is done by hand, it consumes a great deal of time. To solve this problem, we propose a new refinement technique that creates training data semi-automatically. Our approach requires only a few positive examples, which are used to classify Web pages by a similarity measure. The experimental results show that a keyword spice learned from the semi-automatically generated training data achieves comparatively high precision and recall, close to those of the original approach.
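As a rough illustration of the idea in the abstract, labeling unlabeled Web pages by their similarity to a few positive examples could be sketched as follows. The abstract does not say which similarity measure or cutoff the authors use, so the term-frequency cosine similarity, the `threshold` value, and all function names here are assumptions made only for this sketch.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector for one page (lowercased word tokens)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_pages(seed_positives, pages, threshold=0.3):
    """Label a page positive if it is similar enough to any seed page.

    threshold is a hypothetical cutoff chosen for illustration,
    not a value taken from the paper.
    """
    seeds = [term_vector(s) for s in seed_positives]
    labels = []
    for page in pages:
        vec = term_vector(page)
        best = max((cosine(vec, s) for s in seeds), default=0.0)
        labels.append(best >= threshold)
    return labels
```

The resulting positive and negative labels would then stand in for the manually classified training pages when learning a keyword spice. In practice the cutoff would need tuning per domain.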
Keyword(in Japanese) (See Japanese page)
Keyword(in English) refinement of training examples / special purpose search engine / keyword spices
Paper # AI2005-52
Date of Issue

Conference Information
Committee AI
Conference Date 2006/2/28 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Artificial Intelligence and Knowledge-Based Processing (AI)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Refining Training Data Based on Document Similarity for Semi-automatic Building Domain-Specific Web Search Engines
Sub Title (in English)
Keyword(1) refinement of training examples
Keyword(2) special purpose search engine
Keyword(3) keyword spices
1st Author's Name Reiko MIYAGAWA
1st Author's Affiliation Department of Computer Science and Media Engineering, Faculty of Engineering, University of Yamanashi
2nd Author's Name Koji IWANUMA
2nd Author's Affiliation Graduate School of Medical and Engineering Science Department of Research, University of Yamanashi
3rd Author's Name Hidetomo HIDETOMO
3rd Author's Affiliation Graduate School of Medical and Engineering Science Department of Research, University of Yamanashi
Date 2006-03-07
Paper # AI2005-52
Volume (vol) vol.105
Number (no) 640
Page pp.-
#Pages 6
Date of Issue