Presentation | 2018-02-17 A Method of Extracting Related Terms in a Specialty Area / Satoshi Sunaga, Tsunenari Saitoh, Hiroshi Miyao, Yamato Harada |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In many kinds of information retrieval, a related term dictionary is useful because it enables associative retrieval and fuzzy search. However, constructing and updating such a dictionary manually is costly. We are therefore working on automatically extracting related words from document files using word co-occurrence. Co-occurrence-based extraction of related words has two problems: unrelated or irrelevant words (incorrect related words) are extracted, and some correct related words are not extracted. For the former problem, we are working on identifying the features of incorrect related words and excluding them. The latter problem, however, requires a strategy that increases the number of co-occurring words extracted, so that correct related words are included before incorrect related words are excluded. In this paper, as a way to increase the number of co-occurring words, we propose a method that extracts related word candidates by co-occurrence using synonyms. Experiments show the effectiveness of the proposed method. In addition, as a discussion, we explain that a similar relation appears between the range of correct related words extracted and the range of meaning of the synonyms. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Full-text Search / Information Extraction / Related Terms / Co-occurrence |
Paper # | NLC2017-50 |
Date of Issue | 2018-02-09 (NLC) |
Conference Information | |
Committee | NLC / IPSJ-IFAT |
---|---|
Conference Date | 2018/2/16 (2 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | T.O.G. |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | The Twelfth Text Analytics Symposium |
Chair | Hiroshi Kanayama(IBM) |
Vice Chair | Takeshi Sakaki(Hottolink) / Kazutaka Shimada(Kyushu Inst. of Tech.) |
Secretary | Takeshi Sakaki(Ryukoku Univ.) / Kazutaka Shimada(NTT) |
Assistant | Mitsuo Yoshida(Toyohashi Univ. of Tech.) / Takeshi Kobayakawa(NICT) |
Paper Information | |
Registration To | Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Information Fundamentals and Access Technologies |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Method of Extracting Related Terms in a Specialty Area |
Sub Title (in English) | |
Keyword(1) | Full-text Search |
Keyword(2) | Information Extraction |
Keyword(3) | Related Terms |
Keyword(4) | Co-occurrence |
1st Author's Name | Satoshi Sunaga |
1st Author's Affiliation | NTT Corporation(NTT) |
2nd Author's Name | Tsunenari Saitoh |
2nd Author's Affiliation | NTT Corporation(NTT) |
3rd Author's Name | Hiroshi Miyao |
3rd Author's Affiliation | NTT Corporation(NTT) |
4th Author's Name | Yamato Harada |
4th Author's Affiliation | NTT Corporation(NTT) |
Date | 2018-02-17 |
Paper # | NLC2017-50 |
Volume (vol) | vol.117 |
Number (no) | NLC-439 |
Page | pp.51-56 (NLC) |
#Pages | 6 |
Date of Issue | 2018-02-09 (NLC) |