Presentation | 2006-03-07 Refining Training Data Based on Document Similarity for Semi-automatic Building Domain-Specific Web Search Engines Reiko MIYAGAWA, Koji IWANUMA, Hidetomo HIDETOMO |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Keyword spice, proposed by Kokubo et al., is an approach for building a domain-specific web search engine that achieves high precision and recall. However, the approach requires manually classifying 2,000 Web pages into positive and negative examples, which serve as training data for learning a keyword spice. Since the classification is done by humans, it consumes a great deal of time. To solve this problem, we propose a new refinement technique that creates training data semi-automatically. Our approach requires only a few positive examples, which are used for classifying Web pages by a similarity measure. The experimental results show that a keyword spice learned from semi-automatically generated training data achieves comparatively high precision and recall, close to those of the original approach. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | refinement of training examples / special purpose search engine / keyword spices |
Paper # | AI2005-52 |
Date of Issue |
Conference Information | |
Committee | AI |
---|---|
Conference Date | 2006/2/28 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Artificial Intelligence and Knowledge-Based Processing (AI) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Refining Training Data Based on Document Similarity for Semi-automatic Building Domain-Specific Web Search Engines |
Sub Title (in English) | |
Keyword(1) | refinement of training examples |
Keyword(2) | special purpose search engine |
Keyword(3) | keyword spices |
1st Author's Name | Reiko MIYAGAWA |
1st Author's Affiliation | Department of Computer Science and Media Engineering, Faculty of Engineering, University of Yamanashi |
2nd Author's Name | Koji IWANUMA |
2nd Author's Affiliation | Graduate School of Medical and Engineering Science Department of Research, University of Yamanashi |
3rd Author's Name | Hidetomo HIDETOMO |
3rd Author's Affiliation | Graduate School of Medical and Engineering Science Department of Research, University of Yamanashi |
Date | 2006-03-07 |
Paper # | AI2005-52 |
Volume (vol) | vol.105 |
Number (no) | 640 |
Page | pp.- |
#Pages | 6 |
Date of Issue |