Presentation | 2016-06-04 Identification of Tweets that Mention Books Shuntaro Yada, Kyo Kageura |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | We report the performance of a classifier that identifies Tweets that Mention Books (TMB) among tweets that contain the same strings as book titles in Japanese. The classifier we developed performed reasonably well in terms of F1-measure (about 0.7) with the combination of Maximum Entropy Modelling and a Bag-of-Words based feature set. In this paper, in order to improve our classifier, we analyse the effects on classification performance of (1) training data augmentation using a simple search-based method with book/reading-related keywords, and (2) feature dimension reduction via Latent Semantic Analysis (LSA). In addition, on a trial basis, we compare our classifier to a Multilayer Perceptron with sigmoid activation in terms of feature dimension reduction. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Twitter / Named Entity Recognition / Classification / Logistic Regression / Maximum Entropy Modelling / Multilayer Perceptron |
Paper # | TL2016-7,NLC2016-7 |
Date of Issue | 2016-05-28 (TL, NLC) |
Conference Information | |
Committee | NLC / TL |
---|---|
Conference Date | 2016/6/4 (2 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Otaru University of Commerce |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Application of natural language processing and linguistic analysis, and general topics in NLP |
Chair | Hiroshi Kanayama(IBM) / Masami Suzuki(KDDI R&D Labs.) |
Vice Chair | Makoto Ichise(NTT DoCoMo) / Takeshi Sakaki(Univ. of Tokyo/Hottolink) / Chiaki Kubomura(Yamano College of Aesthetics) |
Secretary | Makoto Ichise(Ryukoku Univ.) / Takeshi Sakaki(Kyushu Inst. of Tech.) / Chiaki Kubomura(Ehime Univ.) |
Assistant | Ryuichiro Higashinaka(NTT) / Mitsuo Yoshida(Toyohashi Univ. of Tech.) / Yasushi Tsubota(Kyoto Univ.) / Nobuyuki Jincho(Waseda Univ.) |
Paper Information | |
Registration To | Technical Committee on Natural Language Understanding and Models of Communication / Technical Committee on Thought and Language |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Identification of Tweets that Mention Books |
Sub Title (in English) | Effects of Features, Data Size, and ML Algorithms |
Keyword(1) | Twitter |
Keyword(2) | Named Entity Recognition |
Keyword(3) | Classification |
Keyword(4) | Logistic Regression |
Keyword(5) | Maximum Entropy Modelling |
Keyword(6) | Multilayer Perceptron |
1st Author's Name | Shuntaro Yada |
1st Author's Affiliation | The University of Tokyo(UTokyo) |
2nd Author's Name | Kyo Kageura |
2nd Author's Affiliation | The University of Tokyo(UTokyo) |
Date | 2016-06-04 |
Paper # | TL2016-7,NLC2016-7 |
Volume (vol) | vol.116 |
Number (no) | TL-77,NLC-78 |
Page | pp.29-34(TL), pp.29-34(NLC) |
#Pages | 6 |
Date of Issue | 2016-05-28 (TL, NLC) |