Best Paper Award
Equivalences among some information measures for individual sequences and their applications for fixed-length coding problems [IEICE TRANS. FUNDAMENTALS, VOL.E107–A, NO.3, MARCH 2024]


A wide variety of problems in information theory, such as the source coding problem and the channel coding problem, can be formulated by introducing information measures such as entropy.
In this paper, three new information measures are proposed from two different perspectives, "overlapping" and "non-overlapping": the non-overlapping max-entropy, the overlapping smooth max-entropy, and the non-overlapping smooth max-entropy. All three are defined from the frequency of occurrence of sub-sequences in an individual sequence drawn from the source: the overlapping smooth max-entropy counts overlapping occurrences of sub-sequences, while the other two measures count non-overlapping occurrences.
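The distinction between the two counting conventions can be illustrated with a small sketch. The code below is only illustrative of the counting step, not the paper's formal definitions: it counts length-k blocks of a string either with a sliding window (overlapping) or parsed back-to-back (non-overlapping), and forms the per-symbol entropy of the resulting empirical block distribution.

```python
from collections import Counter
from math import log2

def overlapping_counts(x, k):
    """Count length-k blocks at every starting position (sliding window)."""
    return Counter(x[i:i + k] for i in range(len(x) - k + 1))

def nonoverlapping_counts(x, k):
    """Count length-k blocks parsed back-to-back from the start."""
    return Counter(x[i:i + k] for i in range(0, len(x) - k + 1, k))

def empirical_entropy_rate(counts, k):
    """Per-symbol entropy (bits) of the empirical block distribution."""
    n = sum(counts.values())
    return -sum(c / n * log2(c / n) for c in counts.values()) / k

x = "abababababab"
print(overlapping_counts(x, 2))     # Counter({'ab': 6, 'ba': 5})
print(nonoverlapping_counts(x, 2))  # Counter({'ab': 6})
```

On this periodic string the two conventions already disagree: the sliding window sees both "ab" and "ba", while the back-to-back parsing sees only "ab", so the non-overlapping empirical entropy is zero.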
Conventionally, the fixed-length source coding problem for individual sequences has been formulated using information measures called topological entropy and Ziv entropy, depending on the problem setting. This paper shows that the topological entropy coincides with the non-overlapping max-entropy, and that the Ziv entropy coincides with both the overlapping smooth max-entropy and the non-overlapping smooth max-entropy. It also shows that when an individual sequence is drawn from an ergodic source, the overlapping and non-overlapping smooth max-entropies coincide with the entropy rate of the source.
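As a rough illustration of the topological-entropy side of these equivalences, the following sketch computes a finite-length estimate: (1/k) times the logarithm of the number of distinct length-k blocks appearing in the sequence. This is a standard finite-k approximation, not the paper's exact definition; the choice of block length k is an assumption of the example.

```python
from math import log2

def topological_entropy_estimate(x, k):
    """Finite-k estimate: (1/k) * log2(# distinct length-k blocks in x)."""
    distinct = {x[i:i + k] for i in range(len(x) - k + 1)}
    return log2(len(distinct)) / k

# A period-2 sequence has exactly two distinct 2-blocks ("ab" and "ba"),
# giving an estimate of log2(2)/2 = 0.5 bits per symbol at k = 2.
print(topological_entropy_estimate("abababab", 2))
```

For a highly structured sequence the number of distinct blocks grows slowly with k, so the estimate decays; for a typical sequence from a full-support source it approaches the logarithm of the alphabet size.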
Furthermore, this paper proposes fixed-length universal coding schemes based on the proposed information measures for two settings: one in which the decoding error probability is exactly zero and one in which it converges asymptotically to zero. These schemes are shown to asymptotically achieve the optimal coding rate. As a result, the fixed-length source coding problem for individual sequences is reformulated. In addition, the proposed schemes are practical because they significantly reduce the coding complexity compared with conventional schemes.
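The basic idea behind a zero-error fixed-length code of this kind can be sketched as follows. This toy example is not the paper's construction: it simply parses the sequence into non-overlapping length-k blocks, enumerates the distinct blocks in a dictionary (assumed transmitted separately), and encodes each block by its index using a fixed width of ceil(log2(#distinct)) bits, so the per-symbol rate is that width divided by k, i.e., roughly the non-overlapping max-entropy estimate.

```python
from math import ceil, log2

def fixed_length_encode(x, k):
    """Toy zero-error fixed-length code for non-overlapping length-k blocks.

    Returns the dictionary (block -> index) and the encoded bit string.
    The dictionary is assumed to be conveyed to the decoder separately.
    """
    blocks = [x[i:i + k] for i in range(0, len(x) - k + 1, k)]
    book = {b: i for i, b in enumerate(sorted(set(blocks)))}
    width = max(1, ceil(log2(len(book))))  # bits per block index
    code = "".join(format(book[b], f"0{width}b") for b in blocks)
    return book, code

book, code = fixed_length_encode("aabbaabb", 2)
print(book)  # {'aa': 0, 'bb': 1}
print(code)  # '0101' -> rate 1 bit per 2 symbols
```

Decoding is trivial: split the bit string into width-sized chunks and invert the dictionary. The rate improves as the number of distinct observed blocks falls, which is exactly what the proposed information measures quantify.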
The information measures and coding schemes proposed in this paper have the potential to be applied to a variety of problems in information theory beyond the fixed-length coding problem considered here, and thus make a significant contribution to information theory and related fields.