Presentation | 2007/7/17 Attribute-value extraction from description of exhibits for facetted search in net auction system Jun NISHIMURA, Rintaro MIYAZAKI, Naoto MAEDA, Tatsunori MORI, Shore O, Yusuke ISHIKAWA, Hiroyuki KOBAYASHI, Yuya TANAKA, Fuyuko KIDO |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | To achieve flexible facetted search over descriptions of exhibits in a net auction system, we studied the automated extraction of attributes and their values from those descriptions using a machine learning technique. First, we examined which attributes should be indexed for facetted search, focusing on attributes that different annotators can annotate consistently and that are needed for search. We also studied how to deal with the diversity of attributes and values in descriptions of exhibits. When surface expressions are used directly as features, the learned model may overfit the training corpus, which degrades information extraction performance. Therefore, we introduced the category information of a thesaurus, which does not depend directly on surface expressions, as a feature and examined its effectiveness. As the extraction method, we adopted a standard character-based chunking method of the kind commonly used for named entity extraction. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | net auction / attribute / information extraction / chunking |
Paper # | NLC2007-27 |
Date of Issue |
Conference Information | |
---|---|
Committee | NLC |
Conference Date | 2007/7/17 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
---|---|
Registration To | Natural Language Understanding and Models of Communication (NLC) |
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Attribute-value extraction from description of exhibits for facetted search in net auction system |
Sub Title (in English) | |
Keyword(1) | net auction |
Keyword(2) | attribute |
Keyword(3) | information extraction |
Keyword(4) | chunking |
1st Author's Name | Jun NISHIMURA |
1st Author's Affiliation | Yokohama National University |
2nd Author's Name | Rintaro MIYAZAKI |
2nd Author's Affiliation | Yokohama National University |
3rd Author's Name | Naoto MAEDA |
3rd Author's Affiliation | Yokohama National University |
4th Author's Name | Tatsunori MORI |
4th Author's Affiliation | Yokohama National University |
5th Author's Name | Shore O |
5th Author's Affiliation | Yahoo Japan Corporation |
6th Author's Name | Yusuke ISHIKAWA |
6th Author's Affiliation | Yahoo Japan Corporation |
7th Author's Name | Hiroyuki KOBAYASHI |
7th Author's Affiliation | Yahoo Japan Corporation |
8th Author's Name | Yuya TANAKA |
8th Author's Affiliation | Yahoo Japan Corporation |
9th Author's Name | Fuyuko KIDO |
9th Author's Affiliation | Yahoo Japan Corporation |
Date | 2007/7/17 |
Paper # | NLC2007-27 |
Volume (vol) | vol.107 |
Number (no) | 158 |
Page | pp.- |
#Pages | 6 |
Date of Issue |