Presentation | 2002/7/9 Comparative Experiments of Chinese Analyzers between Support Vector Machines and Minimum Connective Costs Method Tatsumi YOSHIDA, Kiyonori OHTAKE, Kazuhide YAMAMOTO |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | We report the performance of Chinese morphological analyzers built from analysis tools and language resources that are currently available to the public. We use YamCha, a tool based on Support Vector Machines, and MOZ, which is based on the minimum connective costs method. We employ the Penn Chinese Treebank (100 thousand words), known as the most common Chinese language resource. Combining these tools with this resource, we measure the performance of Chinese morphological analysis, i.e., word segmentation and part-of-speech tagging. We found that the accuracy of YamCha reaches around 88%, which is over 4% higher than that of MOZ, although YamCha is computationally very expensive. We also employ the tagged Renmin Ribao corpus (1.1 million words), which is larger than the Penn Chinese Treebank. On this corpus, the accuracies of morphological analysis by YamCha and MOZ reach around 92% and 89%, respectively. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Chinese morphological analysis / SVM / YamCha / MOZ |
Paper # | NLC2002-32 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2002/7/9 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Comparative Experiments of Chinese Analyzers between Support Vector Machines and Minimum Connective Costs Method |
Sub Title (in English) | |
Keyword(1) | Chinese morphological analysis |
Keyword(2) | SVM |
Keyword(3) | YamCha |
Keyword(4) | MOZ |
1st Author's Name | Tatsumi YOSHIDA |
1st Author's Affiliation | Dept. of Knowledge-based Information Engineering, Toyohashi University of Technology |
2nd Author's Name | Kiyonori OHTAKE |
2nd Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
3rd Author's Name | Kazuhide YAMAMOTO |
3rd Author's Affiliation | ATR Spoken Language Translation Research Laboratories |
Date | 2002/7/9 |
Paper # | NLC2002-32 |
Volume (vol) | vol.102 |
Number (no) | 200 |
Page | pp.- |
#Pages | 6 |
Date of Issue |