Presentation 2011-03-04
An improved PPM* Algorithm based on the Minimal Expected Codeword Length and Switching Escape Probability Estimation
Keita TOMOKUNI, Hirosuke YAMAMOTO,
PDF Download Page PDF download Page Link
Abstract(in Japanese) (See Japanese page)
Abstract(in English) For the PPM data compression algorithm, several methods have been proposed to estimate the ESC probability, where ESC represents the occurrence of a new symbol at a node in the suffix tree used in the PPM. However, the estimated ESC probability is often deviated from the actual frequencies of ESC, and this deviation worsens the compression rate of the PPM. In this paper, we propose a new method to estimate the ESC probability. We show by Calgary, Canterbury, and Large Corpuses that the estimation method of PPMX is a good estimator when a node in the suffix tree has many child nodes, but the estimated ESC probability is much lower than the actual frequency when a node has a few child nodes. So, in our method, the ESC probability is estimated by switching the original and our modified estimation methods of PPMX based on the number of child nodes at each node. By combining this ESC probability estimation method and a variant of PPM, in which a coding context is determined based on the minimum expected codeword length, we show by the evaluation of compression rate for the corpuses that the proposed PPM can beat the PPMZ, which was the best variant in the family of PPM.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) PPM / escape probability / data compression / suffix tree / arithmetic coding
Paper # IT2010-104,ISEC2010-108,WBS2010-83
Date of Issue

Conference Information
Committee ISEC
Conference Date 2011/2/24(1days)
Place (in Japanese) (See Japanese page)
Place (in English)
Topics (in Japanese) (See Japanese page)
Topics (in English)
Chair
Vice Chair
Secretary
Assistant

Paper Information
Registration To Information Security (ISEC)
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) An improved PPM* Algorithm based on the Minimal Expected Codeword Length and Switching Escape Probability Estimation
Sub Title (in English)
Keyword(1) PPM
Keyword(2) escape probability
Keyword(3) data compression
Keyword(4) suffix tree
Keyword(5) arithmetic coding
1st Author's Name Keita TOMOKUNI
1st Author's Affiliation Graduate School of Frontier Sciences, The University of Tokyo()
2nd Author's Name Hirosuke YAMAMOTO
2nd Author's Affiliation Graduate School of Frontier Sciences, The University of Tokyo
Date 2011-03-04
Paper # IT2010-104,ISEC2010-108,WBS2010-83
Volume (vol) vol.110
Number (no) 443
Page pp.pp.-
#Pages 8
Date of Issue