Presentation | 2016-05-20 A Grammar-Based Data Compression Algorithm Using Chomsky Normal Form of Context Free Grammar Mitsuharu Arimura, |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper proposes a new class of grammar-based lossless source code. For the simplicity, a Semi-Chomsky Normal Form (semi-CNF) of context free grammar is introduced. A semi-CNF is a modified form of the CNF proposed in this paper. Grammar-based code is a class of universal data compression algorithm using a context-free grammar. The proposed algorithm translates a given sequence to the irreducible grammar generating only a given sequence in two step. In the first step, a semi-CNF of the set of production rules is constructed using repeated substitution from a pair of symbols or variables to a new variable. In the second step, semi-CNF is translated to an irreducible or smaller grammar by eliminating production rules which are used only once in the other production rules. LZ78, Multilevel Pattern Matching (MPM) and Byte Pair Encoding (BPE) algorithms can be treated as examples of this class of codes. These algorithms use only the first step of this two-step procedure. Therefore, the proposed method can improve the compression performance of these algorithms by the unified procedure. This method has an advantage that, transformation from a given sequence to the grammar is quite simple and efficient, by using the two step algorithm through semi-CNF. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Lossless Data Compression / Context Free Grammar / Chomsky Normal Form |
Paper # | IT2016-12,EMM2016-12 |
Date of Issue | 2016-05-12 (IT, EMM) |
Conference Information | |
Committee | IT / EMM |
---|---|
Conference Date | 2016/5/19(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Otaru Economic Center |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Information Security, Information Theory, Information Hiding, etc. |
Chair | Yasutada Oohama(Univ. of Electro-Comm.) / Akinori Ito(Tohoku Univ.) |
Vice Chair | Tadashi Wadayama(Nagoya Inst. of Tech.) / Masashi Unoki(JAIST) / Masaki Kawamura(Yamaguchi Univ.) |
Secretary | Tadashi Wadayama(Univ. of Electro-Comm.) / Masashi Unoki(Wakayama Univ.) / Masaki Kawamura(Univ. of Electro-Comm.) |
Assistant | Takuya Kusaka(Okayama Univ.) / Motoi Iwata(Osaka Pref. Univ.) / Kazuhiro Kohno(Kansai Univ.) |
Paper Information | |
Registration To | Technical Committee on Information Theory / Technical Committee on Enriched MultiMedia |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A Grammar-Based Data Compression Algorithm Using Chomsky Normal Form of Context Free Grammar |
Sub Title (in English) | |
Keyword(1) | Lossless Data Compression |
Keyword(2) | Context Free Grammar |
Keyword(3) | Chomsky Normal Form |
1st Author's Name | Mitsuharu Arimura |
1st Author's Affiliation | Shonan Institute of Technology(Shonan Inst. Tech.) |
Date | 2016-05-20 |
Paper # | IT2016-12,EMM2016-12 |
Volume (vol) | vol.116 |
Number (no) | IT-33,EMM-34 |
Page | pp.pp.69-74(IT), pp.69-74(EMM), |
#Pages | 6 |
Date of Issue | 2016-05-12 (IT, EMM) |