Presentation | 2001/7/9 Dimensionality Reduction of Vector Space Model for Information Retrieval using Simple Principal Component Analysis Shingo Kuroiwa, Satoru Tsuge, Hironori Tani, Tai Xiaoying, Masami Shishibori, Kenji Kita |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | The Vector Space Model (VSM) is a popular information retrieval model that represents a document collection by a term-by-document matrix. Since term-by-document matrices are usually high-dimensional and sparse, they are susceptible to noise, and it is difficult to capture the underlying semantic structure from them. Additionally, the computing resources necessary for the storage and processing of such data are enormous. Dimensionality reduction is a way to overcome these problems. Principal Component Analysis (PCA) and Singular Value Decomposition (SVD) are popular techniques for dimensionality reduction based on matrix decomposition. However, such methods consume a large amount of computational resources. In the work described here, we use Simple Principal Component Analysis (SPCA), a fast data-oriented method, for dimensionality reduction of the vector space model. Experiments based on the MEDLINE collection showed that SPCA achieved significant improvement compared to the conventional vector space model. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Simple PCA / Information retrieval / LSI / VSM / Dimensionality reduction |
Paper # | NLC2001-17 |
Date of Issue |
Conference Information | |
Committee | NLC |
---|---|
Conference Date | 2001/7/9 (1 day) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Natural Language Understanding and Models of Communication (NLC) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Dimensionality Reduction of Vector Space Model for Information Retrieval using Simple Principal Component Analysis |
Sub Title (in English) | |
Keyword(1) | Simple PCA |
Keyword(2) | Information retrieval |
Keyword(3) | LSI |
Keyword(4) | VSM |
Keyword(5) | Dimensionality reduction |
1st Author's Name | Shingo Kuroiwa |
1st Author's Affiliation | The University of Tokushima |
2nd Author's Name | Satoru Tsuge |
2nd Author's Affiliation | The University of Tokushima |
3rd Author's Name | Hironori Tani |
3rd Author's Affiliation | The University of Tokushima |
4th Author's Name | Tai Xiaoying |
4th Author's Affiliation | The University of Tokushima |
5th Author's Name | Masami Shishibori |
5th Author's Affiliation | The University of Tokushima |
6th Author's Name | Kenji Kita |
6th Author's Affiliation | The University of Tokushima |
Date | 2001/7/9 |
Paper # | NLC2001-17 |
Volume (vol) | vol.101 |
Number (no) | 189 |
Page | pp.- |
#Pages | 6 |
Date of Issue |