Presentation | 2001/5/4 Automatic Segmentation of Compound Word in Japanese using Contextual Information Dongli Han, Koichi Kato, Teiji Furugori |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Analyzing compound words is one of the crucial problems in constructing practical natural language processing systems. In this paper, we propose a method for segmenting compound words in text, each consisting of a long sequence of Kanji characters, by using statistics on word co-occurrences. We conducted an experiment that used co-occurrence information from within the compound word and from the context in which it appeared. The results show a success rate of over 90% in dividing the compound words into their unit words. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Compound word / Segmentation / Co-occurrence / Mutual information / Contextual information |
Paper # | NLC2001-5 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2001/5/4 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Automatic Segmentation of Compound Word in Japanese using Contextual Information |
Sub Title (in English) | |
Keyword(1) | Compound word |
Keyword(2) | Segmentation |
Keyword(3) | Co-occurrence |
Keyword(4) | Mutual information |
Keyword(5) | Contextual information |
1st Author's Name | Dongli Han |
1st Author's Affiliation | Department of Computer Science, The University of Electro-Communications |
2nd Author's Name | Koichi Kato |
2nd Author's Affiliation | Department of Computer Science, The University of Electro-Communications |
3rd Author's Name | Teiji Furugori |
3rd Author's Affiliation | Department of Computer Science, The University of Electro-Communications |
Date | 2001/5/4 |
Paper # | NLC2001-5 |
Volume (vol) | vol.101 |
Number (no) | 40 |
Page | pp.- |
#Pages | 6 |
Date of Issue |
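
The English abstract describes segmenting a Kanji compound using co-occurrence statistics taken from inside the compound and from its surrounding context, with mutual information listed as a keyword. The following is a minimal sketch of that general idea, not the authors' actual method. The scoring rule (a plain sum of pointwise mutual information values), the function names, and the toy counts are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pmi(x, y, pair_counts, word_counts, total):
    """Pointwise mutual information log2(P(x,y) / (P(x) P(y))); 0.0 if the pair is unseen."""
    joint = pair_counts.get(frozenset((x, y)), 0)
    if joint == 0 or word_counts.get(x, 0) == 0 or word_counts.get(y, 0) == 0:
        return 0.0
    return math.log2(joint * total / (word_counts[x] * word_counts[y]))


def candidate_segmentations(compound, lexicon):
    """Yield every split of `compound` whose pieces all appear in `lexicon`.

    Brute-force enumeration of cut points: exponential in length, fine for a sketch.
    """
    n = len(compound)
    for k in range(n):  # k = number of cut points
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            pieces = [compound[a:b] for a, b in zip(bounds, bounds[1:])]
            if all(p in lexicon for p in pieces):
                yield pieces


def score(pieces, context, pair_counts, word_counts, total):
    """Assumed scoring: PMI between adjacent pieces plus PMI between pieces and context words."""
    internal = sum(pmi(a, b, pair_counts, word_counts, total)
                   for a, b in zip(pieces, pieces[1:]))
    external = sum(pmi(p, c, pair_counts, word_counts, total)
                   for p in pieces for c in context)
    return internal + external


def segment(compound, context, pair_counts, word_counts, total):
    """Return the highest-scoring segmentation, or the unsplit compound if none is possible."""
    candidates = list(candidate_segmentations(compound, word_counts))
    if not candidates:
        return [compound]
    return max(candidates,
               key=lambda ps: score(ps, context, pair_counts, word_counts, total))


# Toy, made-up counts (hypothetical corpus of 1000 co-occurrence windows).
word_counts = {"自然": 50, "言語": 40, "処理": 60, "自然言語": 5, "研究": 30}
pair_counts = {
    frozenset(("自然", "言語")): 20,
    frozenset(("言語", "処理")): 25,
    frozenset(("言語", "研究")): 10,
    frozenset(("自然言語", "処理")): 1,
}
result = segment("自然言語処理", ["研究"], pair_counts, word_counts, 1000)
print(result)
```

In this toy setup the context word 研究 co-occurs with 言語, which adds evidence for the three-way split over the alternative that keeps 自然言語 as one unit. A real system would need corpus-derived counts and probably a length-normalized score, since a plain sum can favor segmentations with more pieces.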