Presentation 2010-05-28
Prevalence Survey of Unknown Words in Japanese Web Text
Shun HATTORI, Hiroyuki KAMEDA,
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Mining the Web to extract various kinds of knowledge from this growing source has become one of the hottest research topics. However, when Natural Language Processing (NLP) such as morphological analysis or semantic analysis is applied to Web text, the existence of "Unknown Words" that are not registered in an NLP system's dictionary (lexical database) is a serious impediment. In this paper, we survey the prevalence of unknown words in various domains of Japanese Web text, e.g., its dependency on the type of Web media, the topic, and the upload date.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Unknown Words / Web Documents / Unregistered Words / New Words / Web Mining / Unknown Word Processing / Natural Language Processing (NLP)
Paper # TL2010-2
Date of Issue

Conference Information
Committee TL
Conference Date 2010/5/21 (1 day)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Thought and Language (TL)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Prevalence Survey of Unknown Words in Japanese Web Text
Sub Title (in English)
Keyword(1) Unknown Words
Keyword(2) Web Documents
Keyword(3) Unregistered Words
Keyword(4) New Words
Keyword(5) Web Mining
Keyword(6) Unknown Word Processing
Keyword(7) Natural Language Processing (NLP)
1st Author's Name Shun HATTORI
1st Author's Affiliation School of Computer Science, Tokyo University of Technology
2nd Author's Name Hiroyuki KAMEDA
2nd Author's Affiliation School of Computer Science, Tokyo University of Technology
Date 2010-05-28
Paper # TL2010-2
Volume (vol) vol.110
Number (no) 63
Page pp.-
#Pages 6
Date of Issue