Presentation 2002/7/9
Comparative Experiments of Chinese Analyzers between Support Vector Machines and Minimum Connective Costs Method
Tatsumi YOSHIDA, Kiyonori OHTAKE, Kazuhide YAMAMOTO
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We report the performance of Chinese morphological analyzers built from analysis tools and language resources that are currently available to the public. We use YamCha, a tool based on Support Vector Machines, and MOZ, which is based on the minimum connective costs method. We employ the Penn Chinese Treebank (100 thousand words), known as the most common Chinese language resource. Combining these tools with this resource, we measure the performance of Chinese morphological analysis, i.e., word segmentation and part-of-speech tagging. We found that the accuracy of YamCha reaches around 88%, which is over 4% higher than that of MOZ, although YamCha is computationally very expensive. We also employ the tagged corpus of Renmin Ribao (1.1 million words), which is larger than the Penn Chinese Treebank. On this corpus, the accuracies of morphological analysis by YamCha and MOZ reach around 92% and 89%, respectively.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Chinese morphological analysis / SVM / YamCha / MOZ
Paper # NLC2002-32
Date of Issue

Conference Information
Committee NLC
Conference Date 2002/7/9 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Comparative Experiments of Chinese Analyzers between Support Vector Machines and Minimum Connective Costs Method
Sub Title (in English)
Keyword(1) Chinese morphological analysis
Keyword(2) SVM
Keyword(3) YamCha
Keyword(4) MOZ
1st Author's Name Tatsumi YOSHIDA
1st Author's Affiliation Dept. of Knowledge-based Information Engineering, Toyohashi University of Technology
2nd Author's Name Kiyonori OHTAKE
2nd Author's Affiliation ATR Spoken Language Translation Research Laboratories
3rd Author's Name Kazuhide YAMAMOTO
3rd Author's Affiliation ATR Spoken Language Translation Research Laboratories
Date 2002/7/9
Paper # NLC2002-32
Volume (vol) vol.102
Number (no) 200
Page pp.-
#Pages 6
Date of Issue