Presentation 2015-06-05
Categorization of Tweets Mentioning Books Based on Text Clustering
Shuntaro Yada, Kyo Kageura
Abstract(in Japanese) (See Japanese page)
Abstract(in English) We classified tweets that mention books according to how they mention them, using the K-means clustering method. This attempt derives from our need to gather training data for a classifier that identifies tweets mentioning books. First, we built a book-title dictionary for a Japanese morphological analyser and gathered tweets containing book titles from Twitter Streaming. Second, we clustered the tweets using morphemes as features. Finally, the tweets in each cluster were analysed and classified manually. Based on this analysis, we discuss which methods and devices are available for distinguishing book titles as named entities in general tweets.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Twitter / Clustering / K-means / Named Entity Recognition
Paper # TL2015-11,NLC2015-11
Date of Issue 2015-05-28 (TL, NLC)
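
The clustering step described in the abstract (tweets represented by their morphemes, then grouped with K-means) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the tweets are hypothetical and already segmented into tokens, "TITLE" is a placeholder for a dictionary-matched book title, raw counts are used as feature weights, k is set to 2, and the farthest-first initialisation is chosen only to keep the example deterministic.

```python
from collections import Counter


def vectorize(token_lists):
    """Build bag-of-morpheme count vectors over a shared, sorted vocabulary."""
    vocab = sorted({tok for toks in token_lists for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    vectors = []
    for toks in token_lists:
        vec = [0.0] * len(vocab)
        for tok, count in Counter(toks).items():
            vec[index[tok]] = float(count)
        vectors.append(vec)
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iters=100):
    """Plain K-means; returns one cluster label per input vector."""
    # Deterministic farthest-first initialisation (an assumption made for
    # reproducibility): start from the first vector, then repeatedly add the
    # vector farthest from the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iters):
        # Assignment step: each vector joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clustering converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical, pre-segmented tweets; "TITLE" stands in for a matched book title.
tweets = [
    ["TITLE", "read", "finished", "great"],
    ["TITLE", "read", "finished", "moving"],
    ["TITLE", "read", "finished"],
    ["TITLE", "bought", "bookstore", "today"],
    ["TITLE", "bought", "bookstore"],
    ["TITLE", "bought", "bookstore", "new"],
]
vocab, vectors = vectorize(tweets)
labels = kmeans(vectors, k=2)
```

With these toy inputs the "finished reading" tweets and the "bought at a bookstore" tweets fall into separate clusters, which is the kind of grouping by mention pattern that would then be inspected manually. Real tweets would first need segmentation by a morphological analyser, and the number of clusters would have to be chosen empirically.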

Conference Information
Committee NLC / TL
Conference Date 2015/6/4(2days)
Place (in Japanese) (See Japanese page)
Place (in English) The University of Tokushima
Topics (in Japanese) (See Japanese page)
Topics (in English) Applications of natural language processing and linguistic analysis, and general topics of NLP
Chair Koichi Takeuchi(Okayama Univ.) / Tadahisa Kondo(Kogakuin Univ.)
Vice Chair Hiroshi Kanayama(IBM) / Makoto Ichise(NTT DoCoMo) / Chiaki Kubomura(Yamano College of Aesthetics) / Masami Suzuki(KDDI R&D Labs.)
Secretary Hiroshi Kanayama(Univ. of Tokyo/Hottolink) / Makoto Ichise(Ryukoku Univ.) / Chiaki Kubomura(Univ. of Tsukuba) / Masami Suzuki(Kyorin Univ.)
Assistant Kazutaka Shimada(Kyushu Inst. of Tech.) / Ryuichiro Higashinaka(NTT) / Eiji Tomida(Ehime Univ.) / Yasushi Tsubota(Kyoto Univ.)

Paper Information
Registration To Technical Committee on Natural Language Understanding and Models of Communication / Technical Committee on Thought and Language
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Categorization of Tweets Mentioning Books Based on Text Clustering
Sub Title (in English)
Keyword(1) Twitter
Keyword(2) Clustering
Keyword(3) K-means
Keyword(4) Named Entity Recognition
1st Author's Name Shuntaro Yada
1st Author's Affiliation The University of Tokyo(UTokyo)
2nd Author's Name Kyo Kageura
2nd Author's Affiliation The University of Tokyo(UTokyo)
Date 2015-06-05
Paper # TL2015-11,NLC2015-11
Volume (vol) vol.115
Number (no) TL-69,NLC-70
Page pp.61-66(TL), pp.61-66(NLC)
#Pages 6
Date of Issue 2015-05-28 (TL, NLC)