Presentation 2000/7/12
An evaluation of English Stemming Data in Full-Text Retrieval
Sakiko HONMA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Various studies have focused on the effect of stemming on IR tasks. Experiments using test collections such as TREC have shown that the overall improvement from stemming is not significant, because its effects on individual queries are so inconsistent that the damage to some queries may cancel out the benefits to others. When stemming indexing terms, we should avoid the risk of ill effects from overstemming. To understand the extent to which we should stem indexing terms, we conducted a set of experiments using the TREC-7 and TREC-8 ad hoc tasks. Targets for stemming are set in the following four steps: (1) conflation of inflectionally related forms; (2) conflation of derivationally related forms, excluding their minimal stem; (3) conflation of derivationally related forms, including their minimal stem; (4) conflation of spelling variants. The results show that most of the ill effects are caused by conflating derivational variants including their minimal stems (step 3). They also show that the other steps damage only a few queries and produce fairly consistent improvements.
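The four conflation steps named in the abstract can be illustrated with a minimal Python sketch. The lookup tables, the example words, and the function name `conflate` are all hypothetical and chosen only for illustration; they are not the stemming data evaluated in the paper, and the reading of "minimal stem" as the shortest base word (here "organ" for "organize") is an assumption.

```python
# Toy lookup tables, one per conflation step described in the abstract.
# All entries are illustrative assumptions, not the paper's stemming data.
INFLECTIONAL = {  # step 1: inflectionally related forms
    "organizations": "organization",
    "organized": "organize",
    "organs": "organ",
}
DERIVATIONAL = {  # step 2: derivational forms, excluding the minimal stem
    "organization": "organize",
    "organizer": "organize",
}
MINIMAL_STEM = {  # step 3: derivational forms, including the minimal stem
    "organize": "organ",
}
SPELLING = {  # step 4: spelling variants
    "organisation": "organization",
    "organise": "organize",
}


def conflate(term, steps):
    """Map a term to its index form using only the enabled steps (1 to 4)."""
    term = term.lower()
    # Spelling variants are normalized first so later tables see one spelling.
    if 4 in steps:
        term = SPELLING.get(term, term)
    if 1 in steps:
        term = INFLECTIONAL.get(term, term)
    if 2 in steps:
        term = DERIVATIONAL.get(term, term)
    if 3 in steps:
        term = MINIMAL_STEM.get(term, term)
    return term


if __name__ == "__main__":
    for steps in ({1}, {1, 2}, {1, 2, 3}):
        forms = {w: conflate(w, steps) for w in ("organizations", "organs")}
        print(sorted(steps), forms)
```

With steps 1 and 2 enabled, "organizations" and "organs" stay in separate index classes in this toy data; enabling step 3 merges them under "organ", which is the kind of overstemming the abstract associates with step 3.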
Keyword(in Japanese) (See Japanese page)
Keyword(in English)
Paper # NLC2000-17
Date of Issue

Conference Information
Committee NLC
Conference Date 2000/7/12 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) An evaluation of English Stemming Data in Full-Text Retrieval
Sub Title (in English)
Keyword(1)
1st Author's Name Sakiko HONMA
1st Author's Affiliation Software Research Center, RICOH Co., Ltd.
Date 2000/7/12
Paper # NLC2000-17
Volume (vol) vol.100
Number (no) 201
Page pp.-
#Pages 8
Date of Issue