Presentation | 1995/5/26 A Keyword Extraction Method by Using Statistical Text Information Hidekazu Nakawatase |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper describes a new method for automatically extracting free keywords from Japanese text. Keyword extraction normally requires morphological analysis to recognize the words in a text. Morphological analysis, however, has difficulty recognizing unknown words and resolving ambiguous compound words, so dictionaries and complex heuristics are needed to handle these cases. The proposed method is based on N-grams and does not require morphological analysis. It consists of two steps: evaluating major strings using N-gram statistics, and excluding nonsense strings. As a result, the method is very simple and easily applicable to large-scale texts. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | keyword extraction / N-gram / natural language analysis / morphological analysis / statistics |
Paper # | |
Date of Issue |
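
The two steps named in the abstract (evaluating strings by N-gram statistics, then excluding nonsense strings) can be illustrated with a minimal sketch. The abstract does not give the paper's actual scoring function or exclusion rule, so everything below is an assumption made for illustration: the function names `ngram_counts` and `extract_keywords`, the parameters `n_min`, `n_max`, `min_freq`, and `top_k`, the frequency-times-length score, and the rule that drops a string when a longer candidate containing it occurs at least as often.

```python
from collections import Counter


def ngram_counts(text, n_min=2, n_max=6):
    """Count every character n-gram of length n_min..n_max in text."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def extract_keywords(text, n_min=2, n_max=6, min_freq=2, top_k=5):
    """Return up to top_k candidate keywords without morphological analysis."""
    counts = ngram_counts(text, n_min, n_max)

    # Step 1 (assumed criterion): treat strings that repeat as "major" strings.
    candidates = {s: c for s, c in counts.items() if c >= min_freq}

    # Step 2 (assumed criterion): exclude a string as a fragment when a longer
    # candidate that contains it occurs at least as often.
    kept = {}
    for s, c in candidates.items():
        dominated = any(
            len(t) > len(s) and s in t and candidates[t] >= c
            for t in candidates
        )
        if not dominated:
            kept[s] = c

    # Rank by frequency times length (assumed score), ties broken by string.
    ranked = sorted(kept.items(), key=lambda kv: (-kv[1] * len(kv[0]), kv[0]))
    return [s for s, _ in ranked[:top_k]]


if __name__ == "__main__":
    sample = "形態素解析は難しい。形態素解析を使わない方法"
    print(extract_keywords(sample))
```

Because the sketch works on raw character sequences, it needs no dictionary or word segmenter, which is the property the abstract emphasizes. The quadratic fragment check in step 2 is chosen for readability and would need a more efficient structure for large-scale texts.
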
Conference Information | |
Committee | DE |
---|---|
Conference Date | 1995/5/26 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Data Engineering (DE) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Keyword Extraction Method by Using Statistical Text Information |
Sub Title (in English) | |
Keyword(1) | keyword extraction |
Keyword(2) | N-gram |
Keyword(3) | natural language analysis |
Keyword(4) | morphological analysis |
Keyword(5) | statistics |
1st Author's Name | Hidekazu Nakawatase |
1st Author's Affiliation | NTT Information Systems Laboratories |
Date | 1995/5/26 |
Paper # | |
Volume (vol) | vol.95 |
Number (no) | 81 |
Page | pp.- |
#Pages | 8 |
Date of Issue |