Presentation | 2015-02-05 Extracting Similar Documents by Eigenvector Algorithm Shoko KATO, Kazumi SAITO, Kazuhiko KAZAMA |
---|---|
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | In this paper, we extract similar documents from a large collection of text documents by calculating eigenvectors of document-term similarity matrices. Specifically, we propose a Weighted-SR (WSR) method based on the Spectral-Relaxation (SR) method, which is one of the core extraction methods for complex networks. We also consider LSA-WSR and MDS-WSR methods, which are based on LSA and MDS. In experiments using a text document dataset from Yahoo! News, we demonstrate that these methods extract documents that consist of mixed topics and split a single topic into several core portions. We also show that increasing η, an arbitrary parameter, decreases the number of extracted documents and narrows them down to similar documents. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Documents Extraction / Core Analysis / Eigenvector / Topic Extraction |
Paper # | NLC2014-46 |
Date of Issue |
Conference Information | |
---|---|
Committee | NLC |
Conference Date | 2015/1/29 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
---|---|
Registration To | Natural Language Understanding and Models of Communication (NLC) |
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Extracting Similar Documents by Eigenvector Algorithm |
Sub Title (in English) | |
Keyword(1) | Documents Extraction |
Keyword(2) | Core Analysis |
Keyword(3) | Eigenvector |
Keyword(4) | Topic Extraction |
1st Author's Name | Shoko KATO |
1st Author's Affiliation | Graduate School of Management and Information of Innovation, University of Shizuoka |
2nd Author's Name | Kazumi SAITO |
2nd Author's Affiliation | Graduate School of Management and Information of Innovation, University of Shizuoka |
3rd Author's Name | Kazuhiko KAZAMA |
3rd Author's Affiliation | Faculty of Systems Engineering, Wakayama University |
Date | 2015-02-05 |
Paper # | NLC2014-46 |
Volume (vol) | vol.114 |
Number (no) | 444 |
Page | pp.- |
#Pages | 6 |
Date of Issue |