Presentation 2018-02-17
A Method of Extracting Related Terms in a Specialty Area
Satoshi Sunaga, Tsunenari Saitoh, Hiroshi Miyao, Yamato Harada
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In many kinds of information retrieval, a related-term dictionary is useful because it supports associative retrieval and fuzzy search. However, constructing and updating such a dictionary by hand is costly. We are therefore working on automatically extracting related words from document files using word co-occurrence. Co-occurrence-based extraction has two problems: unrelated words (incorrect related words) are extracted, and some correct related words are not extracted. For the former, we are working on identifying the features of incorrect related words and excluding them on that basis. The latter, however, requires a strategy that increases the number of co-occurring words extracted, so that correct related words are included before incorrect ones are excluded. In this paper, as a way to increase the number of co-occurring words, we propose a method that extracts related-word candidates by co-occurrence from synonyms. An experiment shows the effectiveness of the proposed method. In addition, as a discussion, we explain that a similar relationship appears between the range of correct related words extracted and the range of meaning of the synonyms.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Full-text Search / Information Extraction / Related Terms / Co-occurrence
Paper # NLC2017-50
Date of Issue 2018-02-09 (NLC)
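
The idea in the abstract, widening the candidate pool by also collecting co-occurring words for a term's synonyms, can be illustrated with a minimal sketch. The abstract does not state the co-occurrence window, the scoring, or where the synonyms come from, so everything here is an assumption: co-occurrence means "same sentence", the score is a raw count, the synonym list is supplied by hand, and the function names and toy corpus are hypothetical. It is not the authors' implementation.

```python
from collections import Counter


def cooccurring_words(term, sentences):
    """Count words that appear in the same sentence as `term`.

    Assumption: a sentence is a list of already tokenized words.
    """
    counts = Counter()
    for sentence in sentences:
        words = set(sentence)
        if term in words:
            counts.update(words - {term})
    return counts


def related_candidates(term, synonyms, sentences):
    """Sum the co-occurrence counts of `term` and each of its synonyms."""
    seeds = {term, *synonyms}
    merged = Counter()
    for seed in seeds:
        merged.update(cooccurring_words(seed, sentences))
    # The seed words themselves are not candidates.
    for seed in seeds:
        merged.pop(seed, None)
    return merged


# Toy corpus, invented purely for illustration.
sentences = [
    ["router", "forwards", "packets"],
    ["gateway", "forwards", "traffic"],
    ["gateway", "filters", "traffic"],
]

base = cooccurring_words("router", sentences)
expanded = related_candidates("router", ["gateway"], sentences)
print(sorted(base))      # candidates from the term alone
print(sorted(expanded))  # candidates after adding the synonym
```

In this toy data the synonym contributes candidates that never co-occur with the original term. Removing incorrect candidates, the other problem the abstract mentions, would be a separate step applied afterwards.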

Conference Information
Committee NLC / IPSJ-IFAT
Conference Date 2018/2/16(2days)
Place (in Japanese) (See Japanese page)
Place (in English) T.O.G.
Topics (in Japanese) (See Japanese page)
Topics (in English) The Twelfth Text Analytics Symposium
Chair Hiroshi Kanayama(IBM)
Vice Chair Takeshi Sakaki(Hottolink) / Kazutaka Shimada(Kyushu Inst. of Tech.)
Secretary Takeshi Sakaki(Ryukoku Univ.) / Kazutaka Shimada(NTT)
Assistant Mitsuo Yoshida(Toyohashi Univ. of Tech.) / Takeshi Kobayakawa(NICT)

Paper Information
Registration To Technical Committee on Natural Language Understanding and Models of Communication / Special Interest Group on Information Fundamentals and Access Technologies
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A Method of Extracting Related Terms in a Specialty Area
Sub Title (in English)
Keyword(1) Full-text Search
Keyword(2) Information Extraction
Keyword(3) Related Terms
Keyword(4) Co-occurrence
1st Author's Name Satoshi Sunaga
1st Author's Affiliation NTT Corporation(NTT)
2nd Author's Name Tsunenari Saitoh
2nd Author's Affiliation NTT Corporation(NTT)
3rd Author's Name Hiroshi Miyao
3rd Author's Affiliation NTT Corporation(NTT)
4th Author's Name Yamato Harada
4th Author's Affiliation NTT Corporation(NTT)
Date 2018-02-17
Paper # NLC2017-50
Volume (vol) vol.117
Number (no) NLC-439
Page pp.51-56(NLC)
#Pages 6
Date of Issue 2018-02-09 (NLC)