Presentation 2001/10/10
Word Translation Based on Machine Learning Models Using Translation Memory and Corpora
Kiyotaka UCHIMOTO, Satoshi SEKINE, Masaki MURATA, Hiroshi ISAHARA
Abstract(in Japanese) (See Japanese page)
Abstract(in English) The second contest on word sense disambiguation, SENSEVAL-2, was held in spring 2001. It consists of several tasks in various languages. In this paper, we describe the system we used for one of these tasks: the Japanese translation task. In this task, the senses of a word are defined in terms of the word's translations. Given an input sentence and a target word in that sentence, our system first estimates the similarity between the input sentence and sets of parallel examples, called translation memory. It then selects an appropriate translation of the target word by using the example set with the highest similarity. The similarity is calculated using dynamic programming and a machine learning model that uses the following features: string similarity, the words to the left and right of the target word in the input sentence, content words in the input sentence and their translations, and co-occurrence of content words in bilingual and monolingual corpora in English and Japanese. Our system achieves an accuracy of 63.4%, finishing the contest in third place among nine systems developed by seven groups.
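The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' system. It assumes that the dynamic-programming part is a longest-common-subsequence (LCS) string similarity, which the abstract does not specify, and it omits the machine learning model and all other features. The names lcs_length, string_similarity, select_translation and TOY_MEMORY are hypothetical, and the toy memory uses English placeholder sentences rather than real task data.

```python
from typing import Dict, List, Optional, Tuple


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Assumed measure: LCS length divided by the longer string's length (0 to 1)."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


def select_translation(
    sentence: str, memory: Dict[str, List[str]]
) -> Tuple[Optional[str], float]:
    """Return the translation whose example set is most similar to the sentence."""
    best_translation: Optional[str] = None
    best_score = -1.0
    for translation, examples in memory.items():
        # An example set is scored by its single most similar example.
        score = max((string_similarity(sentence, ex) for ex in examples), default=0.0)
        if score > best_score:
            best_translation, best_score = translation, score
    return best_translation, best_score


# Hypothetical toy translation memory: translation label -> example sentences.
TOY_MEMORY = {
    "translation_A": ["he opened an account at the bank"],
    "translation_B": ["they walked along the bank of the river"],
}

if __name__ == "__main__":
    print(select_translation("she opened an account at the bank", TOY_MEMORY))
```

In the system the abstract describes, the string similarity is only one of several features fed to a machine learning model; here it stands alone as the ranking score to keep the sketch short.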
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Translation memory / Corpus / Similarity / Machine learning / Word translation
Paper # NLC 2001-41
Date of Issue

Conference Information
Committee NLC
Conference Date 2001/10/10 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Word Translation Based on Machine Learning Models Using Translation Memory and Corpora
Sub Title (in English)
Keyword(1) Translation memory
Keyword(2) Corpus
Keyword(3) Similarity
Keyword(4) Machine learning
Keyword(5) Word translation
1st Author's Name Kiyotaka UCHIMOTO
1st Author's Affiliation Communications Research Laboratory
2nd Author's Name Satoshi SEKINE
2nd Author's Affiliation New York University
3rd Author's Name Masaki MURATA
3rd Author's Affiliation Communications Research Laboratory
4th Author's Name Hiroshi ISAHARA
4th Author's Affiliation Communications Research Laboratory
Date 2001/10/10
Paper # NLC 2001-41
Volume (vol) vol.101
Number (no) 351
Page pp.-
#Pages 8
Date of Issue