Presentation | 2000/7/19 A Note on the Compressed Suffix Arrays Kunihiko Sadakane, |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | The suffix array, a kind of full-text indices, is more space-efficient than other full-text indices;however it is still larger than word indices such as the inverted files. Though the recently proposed compressed suffix array solved the problem in part, its index size is always larger than the text because it uses the text itself for keyword queries. In this paper we extend the text search algorithm using the compressed suffix array in order to work without the text itself. We also propose an algorithm to recover the whole or in part of the text. By using the algorithms both text comression and fast queries can be achieved. Experimental results on a large collection of English and Japanese texts and DNA sequences show that the compressed suffix array is more effective than using the inverted files or sequential searches if the number of occurrences of keywords is not too large. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | full-text index / suffix array / compression |
Paper # | DE2000-29 |
Date of Issue |
Conference Information | |
Committee | DE |
---|---|
Conference Date | 2000/7/19(1days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | |
Chair | |
Vice Chair | |
Secretary | |
Assistant |
Paper Information | |
Registration To | Data Engineering (DE) |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Note on the Compressed Suffix Arrays |
Sub Title (in English) | |
Keyword(1) | full-text index |
Keyword(2) | suffix array |
Keyword(3) | compression |
1st Author's Name | Kunihiko Sadakane |
1st Author's Affiliation | Department of System Information Science Graduate School of Information Sciences, Tohoku University() |
Date | 2000/7/19 |
Paper # | DE2000-29 |
Volume (vol) | vol.100 |
Number (no) | 226 |
Page | pp.pp.- |
#Pages | 8 |
Date of Issue |