Presentation | 2008/7/10 Construction of Japanese Idiom Corpus and its Application to Japanese Idiom Identification Chikara HASHIMOTO, Daisuke KAWAHARA |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Some phrases can be interpreted either idiomatically (figuratively) or literally depending on context, and the precise identification of idioms is indispensable for full-fledged NLP. To this end, we have been constructing a Japanese idiom corpus that we hope will provide a solution. This paper reports on the current status of the corpus and the results of a Japanese idiom identification experiment using the corpus. The corpus targets 146 ambiguous idioms and consists of 113,460 sentences, each of which is annotated with a literal/idiom label. All sentences were collected from the Web. For Japanese idiom identification, we adopted a word sense disambiguation method and targeted the 93 idioms for which more than 50 sentences were available for both the literal and the idiomatic usage. As a result, our system showed performance that appears to be as good as or better than that reported earlier for English idiom identification. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Japanese idiom / corpus / idiom identification / language resources |
Paper # | NLC2008-1 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2008/7/10 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Construction of Japanese Idiom Corpus and its Application to Japanese Idiom Identification |
Sub Title (in English) | |
Keyword(1) | Japanese idiom |
Keyword(2) | corpus |
Keyword(3) | idiom identification |
Keyword(4) | language resources |
1st Author's Name | Chikara HASHIMOTO |
1st Author's Affiliation | Graduate School of Science and Engineering, Yamagata University |
2nd Author's Name | Daisuke KAWAHARA |
2nd Author's Affiliation | National Institute of Information and Communications Technology |
Date | 2008/7/10 |
Paper # | NLC2008-1 |
Volume (vol) | vol.108 |
Number (no) | 141 |
Page | pp.- |
#Pages | 6 |
Date of Issue |