Presentation | 1995/7/20 Unknown Word Extraction from Corpora Using n-gram Statistics Shinsuke Mori, Makoto Nagao |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Dictionaries are indispensable for NLP as a source of information on the grammatical functions and meanings of words, and much effort is being made to expand their vocabulary. Given the continuous increase in new words and technical terms, building a dictionary takes vast effort, and unknown words are inevitable at every step of analysis, which causes a serious problem. To solve this problem, we propose a method that uses n-gram statistics to extract words from a corpus and simultaneously estimate the parts of speech (POSs) to which they belong, based on the assumption that the distributions of strings preceding or following words of the same POS are similar. Experiments have shown that this method is effective for inferring the POS of unknown words and for building a dictionary. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Unknown Word / Part-of-speech / Dictionary / Corpus / n-gram statistics |
Paper # | |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 1995/7/20 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Unknown Word Extraction from Corpora Using n-gram Statistics |
Sub Title (in English) | |
Keyword(1) | Unknown Word |
Keyword(2) | Part-of-speech |
Keyword(3) | Dictionary |
Keyword(4) | Corpus |
Keyword(5) | n-gram statistics |
1st Author's Name | Shinsuke Mori |
1st Author's Affiliation | Department of Electrical Engineering, Kyoto University |
2nd Author's Name | Makoto Nagao |
2nd Author's Affiliation | Department of Electrical Engineering, Kyoto University |
Date | 1995/7/20 |
Paper # | |
Volume (vol) | vol.95 |
Number (no) | 168 |
Page | pp.- |
#Pages | 6 |
Date of Issue |
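
The abstract's central assumption, that strings preceding or following words of the same POS have similar distributions, can be illustrated with a small sketch. This is not the authors' implementation: the function names (`context_distribution`, `cosine`, `guess_pos`), the use of single adjacent characters as context, the cosine similarity measure, and the toy unsegmented corpus are all assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def context_distribution(corpus, word):
    """Count the characters immediately left and right of each occurrence of word."""
    counts = Counter()
    start = corpus.find(word)
    while start != -1:
        end = start + len(word)
        if start > 0:
            counts[("L", corpus[start - 1])] += 1
        if end < len(corpus):
            counts[("R", corpus[end])] += 1
        start = corpus.find(word, start + 1)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def guess_pos(corpus, candidate, known_words_by_pos):
    """Return the POS whose pooled context profile is most similar to the candidate's."""
    cand = context_distribution(corpus, candidate)
    scores = {}
    for pos, words in known_words_by_pos.items():
        profile = Counter()
        for w in words:
            profile.update(context_distribution(corpus, w))
        scores[pos] = cosine(cand, profile)
    return max(scores, key=scores.get), scores


# Toy unsegmented corpus (hypothetical data, for illustration only).
corpus = "thecatruns.thedogruns.thefoxruns.thecatsits.thedogsits.thefoxsits."
known = {"noun": ["cat", "dog"], "verb": ["runs", "sits"]}

best, scores = guess_pos(corpus, "fox", known)
print(best)  # noun
```

In this toy corpus the candidate string "fox" occurs in the same left and right contexts as the known nouns, so it is assigned to that class. A realistic system would need longer n-gram contexts, a much larger corpus, and a way to decide which substrings are word candidates in the first place.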