Presentation | 2012-12-19 Recognizing Variations of Japanese "Good Morning" Phrases in Twitter. Yoshinari Fujinuma, Hikaru Yokono, Pascual Martinez-Gomez, Akiko Aizawa |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Recently, the rapid growth of Consumer Generated Media (CGM) such as Twitter has introduced many expressive variations and informal representations into textual resources. Although word segmentation is the first step in most Japanese language applications, current word segmentation tools are not yet sufficiently adapted to such informal text. In this paper, we focus on one of the most frequent phrases in Japanese morning tweets, "おはようございます", and construct a CRF-based extractor of its variations. Using 500 manually annotated samples, we obtain an F1 score of over 0.91 for both the head span ("おはよう") and the entire span (including the attachment part such as "ございます"). We also show that the extracted variations contain normalization patterns that are not defined in JUMAN 7.0. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Informal text / Rule extraction / Twitter / CRF |
Paper # | NLC2012-39 |
Date of Issue |
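
The abstract reports F1 scores for a head span and an entire span but does not state the labeling scheme or the matching criterion. The following is a minimal sketch of how a span-level F1 could be computed from character-level tags, assuming BIO-style labels and exact-match scoring. The label names `HEAD` and `ATT` and the helper functions `bio_to_spans` and `span_f1` are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# A span is (start index, end index exclusive, label).
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Convert a character-level BIO tag sequence into a set of labeled spans.

    Assumption: tags look like "B-HEAD", "I-HEAD", "B-ATT", "I-ATT", "O".
    A stray "I-" tag with no matching open span is treated as opening a span.
    """
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            start, label = i, tag[2:]
    return spans


def span_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Exact-match span F1: a predicted span counts only if its
    boundaries and label are identical to a gold span."""
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Nine characters, e.g. a variation such as "おはよーございます":
# four characters for the head, five for the attachment part.
gold_tags = ["B-HEAD", "I-HEAD", "I-HEAD", "I-HEAD",
             "B-ATT", "I-ATT", "I-ATT", "I-ATT", "I-ATT"]
# Hypothetical prediction that cuts the attachment one character short.
pred_tags = ["B-HEAD", "I-HEAD", "I-HEAD", "I-HEAD",
             "B-ATT", "I-ATT", "I-ATT", "I-ATT", "O"]

gold_spans = bio_to_spans(gold_tags)
pred_spans = bio_to_spans(pred_tags)
score = span_f1(gold_spans, pred_spans)
```

In this made-up example the head span matches exactly and the attachment span does not, so one of two spans is correct on each side. A more lenient criterion (for instance, partial overlap) would score the same prediction higher, so the choice of criterion matters when comparing reported numbers.
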
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2012/12/12 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Recognizing Variations of Japanese "Good Morning" Phrases in Twitter |
Sub Title (in English) | |
Keyword(1) | Informal text |
Keyword(2) | Rule extraction |
Keyword(3) | Twitter |
Keyword(4) | CRF |
1st Author's Name | Yoshinari Fujinuma |
1st Author's Affiliation | The University of Tokyo |
2nd Author's Name | Hikaru Yokono |
2nd Author's Affiliation | National Institute of Informatics |
3rd Author's Name | Pascual Martinez-Gomez |
3rd Author's Affiliation | The University of Tokyo/National Institute of Informatics |
4th Author's Name | Akiko Aizawa |
4th Author's Affiliation | The University of Tokyo/National Institute of Informatics |
Date | 2012-12-19 |
Paper # | NLC2012-39 |
Volume (vol) | vol.112 |
Number (no) | 367 |
Page | pp.- |
#Pages | 6 |
Date of Issue |