Presentation 2007-04-26
Suffix Array Based Computation of Substring Equivalence Classes
Kazuyuki NARISAWA, Shunsuke INENAGA, Hideo BANNAI, Masayuki TAKEDA,
PDF Download Page PDF download Page Link
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper considers enumerating substring equivalence classes introduced by Blumer et al. They used the equivalence classes to define an index structure called compact directed acyclic word graphs (CDAWGs). In text analysis, considering these equivalence classes is useful since they group together redundant substring with essentially identical occurrences. In this paper, we present how to enumerate those equivalence classes using suffix arrays. Our algorithm uses rank and lcp arrays for traversing the correspoding suffix trees, but does not need any other additional data structure. The algorithm runs in linear time in the length of the input string. We show experimental results comparing the running times and space consumptions of our algorithm, suffix tree and CDAWG based approaches.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) equivalence class / suffix tree / suffix array
Paper # COMP2007-9
Date of Issue

Conference Information
Committee COMP
Conference Date 2007/4/19(1days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Theoretical Foundations of Computing (COMP)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Suffix Array Based Computation of Substring Equivalence Classes
Sub Title (in English)
Keyword(1) equivalence class
Keyword(2) suffix tree
Keyword(3) suffix array
1st Author's Name Kazuyuki NARISAWA
1st Author's Affiliation Department of Informatics, Graduate School of Information Science and Electrical Engineering, Kyushu University()
2nd Author's Name Shunsuke INENAGA
2nd Author's Affiliation Department of Computer Science and Communication Engineering, Kyushu University
3rd Author's Name Hideo BANNAI
3rd Author's Affiliation Department of Informatics, Faculty of Information Science and Electrical Engineering, Kyushu University
4th Author's Name Masayuki TAKEDA
4th Author's Affiliation Department of Informatics, Faculty of Information Science and Electrical Engineering, Kyushu University:SORST, Japan Science and Technology Agency(JST)
Date 2007-04-26
Paper # COMP2007-9
Volume (vol) vol.107
Number (no) 24
Page pp.pp.-
#Pages 8
Date of Issue