Presentation | 2002/7/9 Selection of a Basic Vocabulary Based on Word Familiarity Ratings Tomoko KANASUGI, Kaname KASAHARA, Nozomu INAGO, Shigeaki AMANO |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | As the first step in constructing the "Commonsense Concept Database," a dictionary of word concepts intended as a foundation for meaning-oriented language processing technologies, we selected basic words that are presumed to be commonly used by Japanese adults. The basic words were selected from a Japanese dictionary containing about 95,000 entries. In a previous study, the number of basic words known by a twelve-year-old Japanese child was estimated to be 25,000. Based on another recent psychological study that estimated the vocabulary size of Japanese speakers, we were able to estimate that 25,000 Japanese basic words are known by 94% of Japanese adults. We therefore set the number of basic words for the Commonsense Concept Database at 25,000. As the selection measure, we used word familiarity ratings. We conducted further psychological experiments to rate the familiarity of words in the dictionary that were not listed in the previously published word familiarity database. Finally, we selected all words with a familiarity rating above five (on a seven-point scale), which yielded around 27,000 words out of the dictionary's 95,000 entries. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Word Familiarity / basic words / ontology / taxonomy |
Paper # | NLC2002-27 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2002/7/9 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Selection of a Basic Vocabulary Based on Word Familiarity Ratings |
Sub Title (in English) | |
Keyword(1) | Word Familiarity |
Keyword(2) | basic words |
Keyword(3) | ontology |
Keyword(4) | taxonomy |
1st Author's Name | Tomoko KANASUGI |
1st Author's Affiliation | NTT Advanced Technology Corporation |
2nd Author's Name | Kaname KASAHARA |
2nd Author's Affiliation | NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation |
3rd Author's Name | Nozomu INAGO |
3rd Author's Affiliation | NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation |
4th Author's Name | Shigeaki AMANO |
4th Author's Affiliation | NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation |
Date | 2002/7/9 |
Paper # | NLC2002-27 |
Volume (vol) | vol.102 |
Number (no) | 200 |
Page | pp.- |
#Pages | 6 |
Date of Issue |