Presentation | 2006-07-14 A Proposal of Information Extraction System with Data Cleaning Facility Yoshiharu ISHIKAWA, Sayumi KUROKAWA, Jianwei ZHANG, Hiroyuki KITAGAWA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Information extraction, which acquires useful information from large text sources such as the Web, is an important research topic in data engineering. For extraction results to be useful, the errors and noise they contain must be reduced. In this paper, we propose an approach to a high-accuracy information extraction system that integrates data cleaning into information extraction and uses interactive feedback from users. The approach is based on the bootstrap record extraction method and performs data cleaning as part of the record extraction process. User feedback is reflected in the evaluation of both the extracted records and the extraction patterns. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | information extraction / record extraction / data cleaning / bootstrapping |
Paper # | DE2006-102 |
Date of Issue |
Conference Information | |
Committee | DE |
---|---|
Conference Date | 2006/7/7 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Data Engineering (DE) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Proposal of Information Extraction System with Data Cleaning Facility |
Sub Title (in English) | |
Keyword(1) | information extraction |
Keyword(2) | record extraction |
Keyword(3) | data cleaning |
Keyword(4) | bootstrapping |
1st Author's Name | Yoshiharu ISHIKAWA |
1st Author's Affiliation | Information Technology Center, Nagoya University |
2nd Author's Name | Sayumi KUROKAWA |
2nd Author's Affiliation | Department of Computer Science, Graduate School of Systems and Information Engineering |
3rd Author's Name | Jianwei ZHANG |
3rd Author's Affiliation | Department of Computer Science, Graduate School of Systems and Information Engineering |
4th Author's Name | Hiroyuki KITAGAWA |
4th Author's Affiliation | Department of Computer Science, Graduate School of Systems and Information Engineering; Center for Computational Sciences, University of Tsukuba |
Date | 2006-07-14 |
Paper # | DE2006-102 |
Volume (vol) | vol.106 |
Number (no) | 150 |
Page | pp.- |
#Pages | 6 |
Date of Issue |