Presentation 2002/7/9
Selection of a Basic Vocabulary Based on Word Familiarity Ratings
Tomoko KANASUGI, Kaname KASAHARA, Nozomu INAGO, Shigeaki AMANO
Abstract(in Japanese) (See Japanese page)
Abstract(in English) As the first step in constructing a dictionary of word concepts, the "Commonsense Concept Database," which will serve as a base for meaning-related language processing technologies, we selected basic words that are presumed to be commonly used by Japanese adults. We selected the basic words from a Japanese dictionary containing about 95,000 word entries. In a previous study, the vocabulary known by a twelve-year-old Japanese child was estimated at 25,000 words. Based on another recent psychological study that estimated the vocabulary size of Japanese speakers, we estimated that 25,000 basic Japanese words are known by 94% of Japanese adults. Therefore, we set the target number of basic words for the Commonsense Concept Database at 25,000. As the selection criterion, we used word familiarity ratings. We conducted further psychological experiments to rate the familiarity of dictionary words that were not listed in the previously published word familiarity database. Finally, we selected all words with a familiarity rating above five (on a seven-point scale), which yielded around 27,000 words out of the dictionary's 95,000 entries.
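The final selection step described in the abstract, keeping words whose familiarity rating is above five on a seven-point scale, can be sketched as a simple threshold filter. This is only an illustrative sketch, not the authors' procedure: the function name `select_basic_words`, the sample words, and their ratings are hypothetical, and reading "above five" as strictly greater than 5.0 is an assumption.

```python
from typing import Dict, List

# Hypothetical familiarity ratings on a 1-7 scale.
# These values are illustrative only and are NOT data from the paper.
SAMPLE_RATINGS: Dict[str, float] = {
    "water": 6.6,
    "school": 6.4,
    "walk": 5.1,
    "ledger": 4.2,
    "astrolabe": 2.3,
}


def select_basic_words(ratings: Dict[str, float], threshold: float = 5.0) -> List[str]:
    """Return words rated strictly above the threshold, most familiar first.

    Assumption: "above five" is interpreted as rating > 5.0.
    """
    selected = [word for word, rating in ratings.items() if rating > threshold]
    return sorted(selected, key=lambda word: ratings[word], reverse=True)


basic_words = select_basic_words(SAMPLE_RATINGS)
print(basic_words)
```

Applied to a full dictionary, the same filter would be run over all rated entries; the threshold could then be adjusted to bring the resulting vocabulary closer to a target size.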
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Word Familiarity / basic words / ontology / taxonomy
Paper # NLC2002-27
Date of Issue

Conference Information
Committee NLC
Conference Date 2002/7/9 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Natural Language Understanding and Models of Communication (NLC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Selection of a Basic Vocabulary Based on Word Familiarity Ratings
Sub Title (in English)
Keyword(1) Word Familiarity
Keyword(2) basic words
Keyword(3) ontology
Keyword(4) taxonomy
1st Author's Name Tomoko KANASUGI
1st Author's Affiliation NTT Advanced Technology Corporation
2nd Author's Name Kaname KASAHARA
2nd Author's Affiliation NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
3rd Author's Name Nozomu INAGO
3rd Author's Affiliation NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
4th Author's Name Shigeaki AMANO
4th Author's Affiliation NTT Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
Date 2002/7/9
Paper # NLC2002-27
Volume (vol) vol.102
Number (no) 200
Page pp.-
#Pages 6
Date of Issue