Presentation 2022-09-13
A study on keyword extraction based on phrase-level context information acquisition
Yumeto Inaoka, Mitsuo Yoshida
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We study a method for extracting, as keywords, the words and phrases in documents that belong to a desired class (e.g., magazine names). The task resembles Named Entity Recognition (NER). In keyword extraction, however, the targets are not limited to named entities, and keywords are collected without assigning named entity labels to spans in the document text. Furthermore, the input is not a set of labeled documents but a set of keywords that serve as examples of words and phrases belonging to the desired class. In this paper, we study a keyword extraction method based on phrase-level context information acquisition. We found that the method can achieve high accuracy without training on large datasets. On the other hand, we also identified problems, such as extracted strings that cannot be used as keywords.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Keyword extraction / Named entity recognition / n-gram
Paper # NLC2022-5
Date of Issue 2022-09-06 (NLC)

Conference Information
Committee NLC
Conference Date 2022/9/13 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English) Keio Univ. Yagami Campus.
Topics (in Japanese) (See Japanese page)
Topics (in English) The 19th Text Analytics Symposium
Chair Mitsuo Yoshida(Univ. of Tsukuba)
Vice Chair Hiroki Sakaji(Univ. of Tokyo) / Takeshi Kobayakawa(NHK)
Secretary Hiroki Sakaji(NTT) / Takeshi Kobayakawa(Hiroshima Univ. of Economics)
Assistant Kanjin Takahashi(Sansan) / Yasuhiro Ogawa(Nagoya Univ.)

Paper Information
Registration To Technical Committee on Natural Language Understanding and Models of Communication
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A study on keyword extraction based on phrase-level context information acquisition
Sub Title (in English)
Keyword(1) Keyword extraction
Keyword(2) Named entity recognition
Keyword(3) n-gram
1st Author's Name Yumeto Inaoka
1st Author's Affiliation Faber Company Inc.(Faber Company)
2nd Author's Name Mitsuo Yoshida
2nd Author's Affiliation University of Tsukuba(Univ. of Tsukuba)
Date 2022-09-13
Paper # NLC2022-5
Volume (vol) vol.122
Number (no) NLC-180
Page pp.5-8 (NLC)
#Pages 4
Date of Issue 2022-09-06 (NLC)