Presentation 2010-07-22
IT2010-12 Data Compression based on the Optimal Context Determined by the Minimal Expected Codeword Length in a Suffix Tree
Keita TOMOKUNI, Hirosuke YAMAMOTO,
PDF Download Page PDF download Page Link
Abstract(in Japanese) (See Japanese page)
Abstract(in English) In this paper, we consider a data compression method, which encodes each byte of data by using a suffix tree. In the same way as PPM and a dynamic anti-dictionary method based on a suffix tree, each byte is encoded by an arithmetic code using a suffix tree, which is constructed dynamically from the past data that has already been encoded. Two methods, threshold method and minimum-expected-codeword-length method, are proposed to select a context (or a suffix node) for encoding in a suffix tree. The performance of the proposed methods are compared with other methods for the Calgary Corpus, and it is shown that the minimum-expected-codeword-length method can achieve better compression rate than PPM in the case of text files with large size.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) suffix tree / arithmetic coding / data compression / PPM
Paper # IT2010-12
Date of Issue

Conference Information
Committee IT
Conference Date 2010/7/15(1days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Information Theory (IT)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) IT2010-12 Data Compression based on the Optimal Context Determined by the Minimal Expected Codeword Length in a Suffix Tree
Sub Title (in English)
Keyword(1) suffix tree
Keyword(2) arithmetic coding
Keyword(3) data compression
Keyword(4) PPM
1st Author's Name Keita TOMOKUNI
1st Author's Affiliation Graduate School of Frontier Sciences, The University of Tokyo()
2nd Author's Name Hirosuke YAMAMOTO
2nd Author's Affiliation Graduate School of Frontier Sciences, The University of Tokyo
Date 2010-07-22
Paper # IT2010-12
Volume (vol) vol.110
Number (no) 137
Page pp.pp.-
#Pages 6
Date of Issue