Constant Composition Distribution Matching
Patrick Schulte and Georg Böcherer
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 62, NO. 1, JANUARY 2016
Abstract— Distribution matching transforms independent and Bernoulli(1/2) distributed input bits into a sequence of output symbols with a desired distribution. Fixed-to-fixed length, invertible, and low complexity encoders and decoders based on constant composition and arithmetic coding are presented. The encoder achieves the maximum rate, namely, the entropy of the desired distribution, asymptotically in the blocklength. Furthermore, the normalized divergence of the encoder output and the desired distribution goes to zero in the blocklength.

Index Terms— Distribution matching, fixed length, arithmetic coding, asymptotically optimal algorithm.

Manuscript received March 18, 2015; revised August 10, 2015; accepted

I. INTRODUCTION

A DISTRIBUTION MATCHER transforms independent Bernoulli(1/2) distributed input bits into output symbols with a desired distribution. We measure the distance between the matcher output distribution and the desired distribution by normalized informational divergence [1, p. 7]. Informational divergence is also known as Kullback-Leibler divergence or relative entropy [2, Sec. 2.3]. A dematcher performs the inverse operation and recovers the input bits from the output symbols. A distribution matcher is a building block of the bootstrap scheme [3] that achieves the capacity of arbitrary discrete memoryless channels [4]. Distribution matchers are used in [5, Sec. VI] for rate adaptation and in [6] to achieve the capacity of the additive white Gaussian noise channel.

Prefix-free distribution matching was proposed in [7, Sec. IV.A]. In [8] and [9], Huffman codes are used for matching. Optimal variable-to-fixed and fixed-to-variable length distribution matchers are proposed in [10] and [11], respectively. The codebooks of the matchers in [8]–[11] must be generated offline and stored. This is infeasible for large codeword lengths, which are necessary to achieve the maximum rate. This problem is solved in [12] and [13] by using arithmetic coding to calculate the codebook online. The matchers proposed in [12] and [13] are asymptotically optimal. All approaches [8]–[13] are variable length, which can lead to varying transmission rate, large buffer sizes, error propagation, and synchronization problems [8, Sec. I]. Fixed-to-fixed (f2f) length codes do not have these issues.

The author of [14, Sec. 4.8] suggests concatenating short codes, and Mondelli et al. [4] employ a forward error correction decoder to build an f2f length matcher. The dematchers of [4] and [14] cannot always recover the input sequence with zero error. Hence systematic errors are introduced that cannot be corrected by the error correction code or by retransmission. The thesis [15] proposes an invertible f2f length distribution matcher called the adaptive arithmetic distribution matcher (aadm). The algorithm is computationally complex.

In this work we propose practical, invertible, f2f length distribution matchers. They are asymptotically optimal and are based on constant composition codes indexed by arithmetic coding. The paper is organized as follows. In Section II we formally define distribution matching. We analyze constant composition codes in Section III. In Section IV we show how a constant composition distribution matcher (ccdm) and dematcher can be implemented efficiently by arithmetic coding.

II. PROBLEM STATEMENT

The entropy of a discrete random variable A with alphabet $\mathcal{A}$ and distribution $P_A$ is

$H(A) = -\sum_{a \in \operatorname{supp}(P_A)} P_A(a) \log_2 P_A(a)$ (1)

where $\operatorname{supp}(P_A) \subseteq \mathcal{A}$ is the support of $P_A$. The informational divergence of two distributions on $\mathcal{A}$ is

$D(P_{\hat{A}} \| P_A) = \sum_{a \in \operatorname{supp}(P_{\hat{A}})} P_{\hat{A}}(a) \log_2 \frac{P_{\hat{A}}(a)}{P_A(a)}.$ (2)

The normalized informational divergence for length n random vectors $\hat{A}^n = \hat{A}_1 \dots \hat{A}_n$ and $A^n$ is defined as

$\frac{D(P_{\hat{A}^n} \| P_{A^n})}{n}.$ (3)

For random vectors with independent and identically distributed (iid) entries, we write

$P_{A^n}(a^n) = \prod_{i=1}^{n} P_A(a_i).$ (4)
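The quantities in (1)–(3) are straightforward to evaluate numerically. A minimal sketch (assuming Python with only the standard library; the function names are ours):

```python
from math import log2

def entropy(p):
    """H(A) as in (1): the sum runs over the support of P_A."""
    return -sum(pa * log2(pa) for pa in p.values() if pa > 0)

def divergence(p_hat, p):
    """Informational divergence D(P_hat || P_A) as in (2)."""
    return sum(q * log2(q / p[a]) for a, q in p_hat.items() if q > 0)

# A binary target distribution P_A and a (mismatched) output distribution.
p_a = {'0': 0.75, '1': 0.25}
p_hat = {'0': 0.5, '1': 0.5}
print(entropy(p_a))            # ≈ 0.8113 bits
print(divergence(p_hat, p_a))  # ≈ 0.2075 bits
```

The divergence vanishes exactly when the two distributions coincide on the support of the first argument.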
Definition 1: A matching rate R = m/n is achievable for a distribution $P_A$ if for any $\alpha > 0$ and sufficiently large n there is an invertible mapping $f\colon \{0,1\}^m \to \mathcal{A}^n$ for which

$\frac{D(P_{f(B^m)} \| P_{A^n})}{n} \le \alpha.$ (5)

The following proposition in [16] relates the rate R and (5).

Proposition 1 (Converse, [16, Proposition 8]): There exists a positive-valued function $\beta$ with

$\beta(\alpha) \xrightarrow{\alpha \to 0} 0$ (6)

such that (5) implies

$\frac{m}{n} \le \frac{H(A)}{H(B)} + \beta(\alpha).$ (7)

Proposition 1 bounds the maximum rate that can be achieved under condition (5). Since $H(B) = 1$ we have $R = m/n \le H(A) + \beta(\alpha)$.

$f_{\mathrm{ccdm}}\colon \{0,1\}^m \to \mathcal{T}^n_{P_{\bar A}}.$ (12)

The actual mapping $f_{\mathrm{ccdm}}$ can be implemented efficiently by arithmetic coding, as we will show in Section IV. The constant composition codebook is now given by the image of $f_{\mathrm{ccdm}}$, i.e.,

$\mathcal{C}_{\mathrm{ccdm}} = f_{\mathrm{ccdm}}(\{0,1\}^m).$ (13)

Since $f_{\mathrm{ccdm}}$ is invertible, the codebook size is $|\mathcal{C}_{\mathrm{ccdm}}| = 2^m$.

B. Analysis

We show that $f_{\mathrm{ccdm}}$ asymptotically achieves all rates satisfying (8). We can bound m by

$m = \lfloor \log_2 |\mathcal{T}^n_{P_{\bar A}}| \rfloor \ge \log_2 |\mathcal{T}^n_{P_{\bar A}}| - 1.$ (14)

Recall that the matcher output distribution is $P_{\tilde{A}^n}$. We have $P_{\tilde{A}^n}(a^n) = 2^{-m}$ for every $a^n \in \mathcal{C}_{\mathrm{ccdm}}$ and $P_{\tilde{A}^n}(a^n) = 0$ otherwise.
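The bound (14) ties the number of input bits m to the size of the type class, which is a multinomial coefficient. A small numerical check (assuming Python; the names are ours):

```python
from math import factorial, log2, floor

def type_class_size(counts):
    """|T^n_P| = n! / (n_0! ... n_{k-1}!) for a composition given by symbol counts."""
    n = sum(counts)
    size = factorial(n)
    for na in counts:
        size //= factorial(na)
    return size

# Composition with n = 8 symbols: five '0's and three '1's.
size = type_class_size([5, 3])
m = floor(log2(size))            # number of input bits per block
assert 2**m <= size < 2**(m + 1)
print(size, m)                   # 56 5
```

Here 2^5 = 32 of the 56 type-class sequences are indexed by the 5 input bits, consistent with (14).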
Bernoulli(1/2) process. We describe the output model by a random vector

$\bar{A}^n = \bar{A}_1 \bar{A}_2 \dots \bar{A}_n$ (24)

with marginals $P_{\bar{A}_i} = P_{\bar{A}}$ and the uniform distribution

$P_{\bar{A}^n}(a^n) = \frac{1}{|\mathcal{T}^n_{P_{\bar A}}|} \quad \forall a^n \in \mathcal{T}^n_{P_{\bar A}}.$ (25)

for the output sequence with the lowest interval border. As a result, the codebook $\mathcal{C}_{\mathrm{ccdm}}$ of Example 2 is {'0011', '0110', '1001', '1100'}. In general $\mathcal{C}_{\mathrm{ccdm}}$ has cardinality $2^m$ with $2^m \le |\mathcal{T}^n_{P_{\bar A}}| < 2^{m+1}$ according to (14). It is not possible to index the whole set $\mathcal{T}^n_{P_{\bar A}}$ unless $2^m = |\mathcal{T}^n_{P_{\bar A}}|$.¹ The analysis of the code (Section III-B) is valid for all codebooks $\mathcal{C}_{\mathrm{ccdm}} \subseteq \mathcal{T}^n_{P_{\bar A}}$. The actual subset is implicitly defined by the arithmetic encoder.

We now discuss the online algorithm that processes the input sequentially. Initially, the input interval spans from 0 to 1. As the input model is Bernoulli(1/2), we split the interval into two equally sized intervals and continue with the upper interval in case the first input bit is '1'; otherwise we continue with the lower interval. After the next input bit arrives we repeat the last step. After m input bits we reach an interval of size $2^{-m}$. After every refinement of the input interval the algorithm checks for a sure prefix of the output sequence; e.g., in Fig. 4 we see that if the input starts with 1, the output must start with 1. Every time we extend the sure prefix by a new symbol, we must calculate the probability of the next symbol given the sure prefix. That means we determine the output intervals within the sure interval of the prefix. The model for calculating the conditional probabilities is based on drawing without replacement. There is a bag with n symbols of k discriminable kinds. $n_a$ denotes how many symbols of kind a are initially in the bag and $\tilde{n}_a$ is the current number. The probability to draw a symbol of type a is $\tilde{n}_a/\tilde{n}$, where $\tilde{n}$ is the current number of symbols in the bag. If we pick a symbol a, both $\tilde{n}$ and $\tilde{n}_a$ are decremented by 1.

Example 3: Fig. 5 shows a refinement of the output intervals. Initially there are 2 '0's and 2 '1's in the bag.

Proposition 2: After n refinements of the output interval, the model used for the refinement step stated above creates equally spaced (equally probable) intervals that are labeled with all sequences in $\mathcal{T}^n_{P_{\bar A}}$.

Proof: All symbols in the bag are chosen at some point. Consequently, only sequences in $\mathcal{T}^n_{P_{\bar A}}$ may appear. The probability associated with a chosen string is a product of fractions $\tilde{n}_a/\tilde{n}$, where $\tilde{n}$ takes on all values from the initial value n down to 1 because every symbol is drawn at some point. Thus for each string we obtain for its probability an expression that is independent of the realization itself:

$P_{\bar{A}^n}(a^n) = \frac{n_{a=0}! \cdots n_{a=k-1}!}{n!} = \frac{1}{|\mathcal{T}^n_{P_{\bar A}}|} \quad \forall a^n \in \mathcal{T}^n_{P_{\bar A}}.$ (26)

Numerical problems for representing the input interval and the output interval occur after a certain number of input bits. For this reason we introduce a rescaling each time a new output symbol is known. We explain this next.

A. Scaling Input and Output Intervals

After we identify a sure prefix, we are no longer interested in code sequences that do not have that prefix. We scale the input and output interval such that the output interval is [0, 1). Fig. 6 illustrates the mapping of intervals (in1, out1) to (in2, out2). The refinement for the second symbol works as described in Example 3. If the second input bit is 0, we know that 10 must be a prefix of the output. The resulting scaling is shown in Fig. 6 as (in2, out2) to (in3, out3). A more detailed explanation of scaling for arithmetic coding can be found, for instance, in [20, Ch. 4]. We provide an implementation of ccdm online [21].

¹Please note that in this case no distribution matcher is needed. However, this indexing problem is of interest in its own right.
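The selection rule described above (the output sequence with the lowest interval border inside the input interval) can be sketched without interval rescaling: since all sequences in the type class are equiprobable by Proposition 2, their interval borders are uniformly spaced at multiples of $1/|\mathcal{T}^n_{P_{\bar A}}|$, so selecting a sequence reduces to combinatorial ranking and unranking with multinomial coefficients. The sketch below (assuming Python; function names are ours) trades the paper's sequential scaling for exact integer bookkeeping, favoring clarity over efficiency:

```python
from math import factorial
from collections import Counter

def multinomial(counts):
    """Number of distinct arrangements of a multiset, i.e. the type class size."""
    n = sum(counts.values())
    size = factorial(n)
    for c in counts.values():
        size //= factorial(c)
    return size

def unrank(index, counts):
    """Return the index-th sequence (lexicographic order) of the type class."""
    bag = Counter(counts)
    out = []
    for _ in range(sum(bag.values())):
        for a in sorted(bag):
            if bag[a] == 0:
                continue
            bag[a] -= 1                 # tentatively draw symbol a
            block = multinomial(bag)    # completions starting with a
            if index < block:
                out.append(a)
                break
            bag[a] += 1                 # undo the draw, skip this block
            index -= block
    return ''.join(out)

def rank(seq, counts):
    """Lexicographic index of seq within its type class (inverse of unrank)."""
    bag = Counter(counts)
    index = 0
    for s in seq:
        for a in sorted(bag):
            if bag[a] == 0:
                continue
            bag[a] -= 1
            if a == s:
                break
            index += multinomial(bag)   # skip all sequences starting with a
            bag[a] += 1
    return index

def ccdm_encode(bits, counts):
    """Map a bit string to the sequence with the lowest interval border
    inside the input interval [v/2^m, (v+1)/2^m)."""
    m, size = len(bits), multinomial(counts)
    v = int(bits, 2)
    i = -((-v * size) // (1 << m))      # exact ceil(v * size / 2^m)
    return unrank(i, counts)

def ccdm_decode(seq, counts, m):
    """Recover the bits: the unique multiple of 2^-m at or below border i/size."""
    size = multinomial(counts)
    v = rank(seq, counts) * (1 << m) // size
    return format(v, '0{}b'.format(m))
```

For the composition of Example 2 (two '0's and two '1's, so n = 4 and m = 2), this reproduces the codebook {'0011', '0110', '1001', '1100'} stated above, and decoding inverts encoding.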