
Chapter 8

Reed-Solomon codes

In the previous chapter we discussed the properties of finite fields, and showed that there exists
an essentially unique finite field Fq with q = p^m elements for any prime p and integer m ≥ 1.
We now consider (n, k, d) linear codes over a finite field Fq . The most useful such codes are the
Reed-Solomon (RS) codes (1960), whose length is limited to n ≤ q, but which have arbitrary
dimension k, 1 ≤ k ≤ n, and which have a minimum distance of d = n − k + 1. This meets a
certain fundamental bound called the Singleton bound. RS codes are easily decoded, and have
a wide range of uses.
We also briefly mention binary Bose-Chaudhuri-Hocquenghem (BCH) codes (1959), which are
binary derivatives of RS codes. The parameters (n, k, d) of BCH codes are sometimes slightly
better than those of Reed-Muller (RM) codes.

8.1 Linear codes over finite fields


An (n, k) linear code C over a finite field Fq is a k-dimensional subspace of the vector space (Fq)^n
of all n-tuples over Fq . For q = 2, this reduces to our previous definition of binary linear codes.
A subset C ⊆ Fnq is a subspace if it is closed under componentwise addition (has the group
property) as well as under multiplication by scalars (elements of Fq ). A linear code C thus
necessarily contains the all-zero n-tuple 0.
A set of k linearly independent generators (i.e., a basis) (g1 , . . . , gk ) for C may be found by a
greedy algorithm similar to that used for binary linear block codes in Chapter 6. Then C is the
set of all q^k q-ary linear combinations of the generators; i.e.,
    C = { Σ_{j=1}^{k} a_j g_j : a_j ∈ Fq, 1 ≤ j ≤ k }.

Thus C has q^k distinct codewords.


Example 1. The following nine 4-tuples over F3 form a (4, 2, 3) linear code over the ternary
field F3 = {0, 1, 2}, with generators g1 = (1110) and g2 = (0121):

C = {0000, 1110, 2220, 0121, 1201, 2011, 0212, 1022, 2102}.
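
As a quick check (an illustrative sketch added here, not part of the original notes), the following Python fragment regenerates this code from its two generators using componentwise arithmetic mod 3 and confirms that the minimum nonzero weight is 3:

    # Sketch: enumerate the (4, 2, 3) ternary code of Example 1 from its generators.
    from itertools import product

    q = 3
    g1 = (1, 1, 1, 0)
    g2 = (0, 1, 2, 1)

    # All q^2 linear combinations a1*g1 + a2*g2, computed componentwise mod 3.
    code = {tuple((a1 * x + a2 * y) % q for x, y in zip(g1, g2))
            for a1, a2 in product(range(q), repeat=2)}

    print(len(code))                                            # 9 distinct codewords
    print(min(sum(1 for s in c if s) for c in code if any(c)))  # minimum nonzero weight: 3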


By the group (permutation) property, C = C − c for any codeword c ∈ C, so the set of Hamming distances between any codeword c ∈ C and all other codewords is independent of c.
The minimum Hamming distance d between codewords in a linear code C is thus equal to the
minimum Hamming weight of any nonzero codeword (the minimum distance between 0 and any
other codeword). An (n, k) linear code over Fq with minimum Hamming distance d is called an
(n, k, d) linear code.
More generally, the group property shows that the number of codewords at Hamming distance
w from any codeword c ∈ C is the number Nw of codewords of Hamming weight w in C.
Much of classical algebraic coding theory has been devoted to optimizing the parameters
(n, k, d); i.e., maximizing the size q^k of the code for a given length n, minimum distance d and
field Fq , or maximizing d for a given n, k and q. The practical motivation for this research has
been to maximize the guaranteed error-correction power of the code. Because Hamming distance
is a metric satisfying the triangle inequality, a code with Hamming distance d is guaranteed to
correct t symbol errors whenever 2t < d, or in fact to correct t errors and s erasures whenever
2t + s < d. This elementary metric is not the whole story with regard to performance on an
AWGN channel, nor does it take into account decoding complexity; however, it is a good first
measure of code power.

8.2 The Singleton bound and MDS codes


The Singleton bound is the most fundamental bound on the parameters (n, k, d) over any field:
k + d ≤ n + 1. A code that meets this bound with equality is called “maximum distance
separable” (MDS).
The only binary codes that are MDS are the trivial (n, n, 1) universe codes, the (n, n − 1, 2)
single-parity-check (SPC) codes, and the (n, 1, n) repetition codes. However, over nonbinary
fields, there exist much less trivial MDS codes, notably the Reed-Solomon codes and their close
relatives.
Let C ⊆ An be any set of |C| n-tuples over any symbol alphabet A of size q = |A|, and let the
minimum Hamming distance between codewords in C be d. Then C should be able to correct
any set of s = d − 1 erasures; i.e., after puncturing s = d − 1 of the coordinates, the codewords
must still all differ in the remaining n − s = n − d + 1 coordinates. Therefore there can be at most q^{n−d+1} codewords in C:

Theorem 8.1 (Singleton bound) A code C of length n with minimum Hamming distance d
over an alphabet of size q = |A| can have at most |C| ≤ q^{n−d+1} codewords. Equality holds if and only if the codewords run through all q^k possible k-tuples in every set of k = n − d + 1 coordinates.

Note that the Singleton bound holds also for nonlinear codes. Any code that meets the
Singleton bound with equality is called an MDS code.
Given a (possibly nonlinear) code C with q^k codewords over an alphabet of size q, a set of k coordinates is called an information set of C if the codewords run through all q^k possible k-tuples
in that set of coordinates; i.e., if there is a unique codeword associated with every possible set
of symbol values in that set of coordinates. In other words, we may freely specify the symbols
in these coordinates, and then the remainder of the codeword is uniquely specified. The case of
equality may therefore be restated as follows:

Corollary 8.2 (Information sets of MDS codes) C is an MDS code if and only if every
subset of k = n − d + 1 coordinates is an information set of C.

If C is a linear (n, k, d) code over a finite field Fq, then its size is |C| = q^k. Therefore:

Corollary 8.3 (Singleton bound for linear codes) A linear (n, k, d) code C over any finite
field Fq has k ≤ n−d+1. Equality holds if and only if every set of k coordinates is an information
set of C.

Equivalently, d + k ≤ n + 1, or d ≤ n − k + 1.
The conditions governing MDS codes are so stringent that the weight distribution of a linear
(n, k, d) MDS code over Fq is completely determined by the parameters n, k, d and q. We will
show how to derive Nd and Nd+1 ; the number of codewords Nw of higher weights w may be
derived similarly.

Theorem 8.4 (Nd for MDS codes) A linear (n, k, d) MDS code over Fq has q − 1 codewords
that have weight d in every subset of d coordinates. The total number of codewords of weight d
is thus

    Nd = (n choose d) (q − 1).

Proof. Given a subset of d = n − k + 1 coordinates, consider the information set consisting of the complementary n − d = k − 1 coordinates plus any one of the given coordinates. Fix the symbol values in the k − 1 complementary coordinates to 0, and let the symbol value in the remaining information coordinate run through all q symbols in Fq. This must yield q different codewords with all zeroes in the n − d complementary coordinates. One of these codewords must be the all-zero codeword 0. The remaining q − 1 codewords have weight at most d; but since the minimum nonzero weight is d, all must have weight d. The formula for Nd then follows from the fact that there are (n choose d) different subsets of d coordinates.
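
For example (a small numeric illustration added here, not in the original text), an MDS code with parameters (7, 3, 5) over F8 (such a code will be constructed as a punctured RS code in the next section) has Nd = (7 choose 5)(8 − 1) = 21 · 7 = 147 codewords of weight 5, which a one-line computation confirms:

    from math import comb
    print(comb(7, 5) * (8 - 1))    # 147 codewords of minimum weight d = 5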
Exercise 1. Show that the number of codewords of weight d + 1 in an (n, k, d) linear MDS
code over Fq is

    N_{d+1} = (n choose d+1) ( (q^2 − 1) − (d+1 choose d)(q − 1) ),
where the first term in parentheses represents the number of codewords with weight ≥ d in any
subset of d+1 coordinates, and the second term represents the number of codewords with weight
equal to d. Show that this implies that there exists no (n, k, d) linear MDS code over Fq with
d ≥ q, except when d = n (e.g., the (n, 1, n) repetition code).
Exercise 2. Compute the number of codewords of weights 2 and 3 in an (n, n − 1, 2) SPC
code over F2 . Compute the number of codewords of weights 2 and 3 in an (n, n − 1, 2) linear
code over F3 . Compute the number of codewords of weights 3 and 4 in a (4, 2, 3) linear code
over F3 .

8.3 Reed-Solomon (RS) Codes


For any code parameters n, k, d = n − k + 1 (1 ≤ k ≤ n) and finite field Fq , there exists a linear
(n, k, d) RS code over Fq , as long as n ≤ q + 1. Thus RS codes form a large class of MDS codes.
Binary RS codes have length at most n ≤ q + 1 = 3, and therefore are not very interesting.
When we speak of RS codes, we are usually thinking of nonbinary RS codes over Fq , where q
may be relatively large.

8.3.1 RS codes as valuation codes

The most natural definition of RS codes is for length n = q. The definition is in terms of a certain valuation map from the set (Fq)^k of all k-tuples over Fq to the set (Fq)^n of n-tuples over Fq.
Let the k information symbols be denoted by (f0, f1, . . . , f_{k−1}), where fj ∈ Fq, 0 ≤ j ≤ k − 1. Let f(z) = f0 + f1 z + · · · + f_{k−1} z^{k−1} ∈ Fq[z] be the corresponding polynomial in the indeterminate z. (We use z for the indeterminate here to avoid confusion with the indeterminate x used in the polynomials representing the elements of Fq.) Thus f(z) is one of the q^k polynomials over Fq of degree less than k.
Let β1, β2, . . . , βq be the q different elements of Fq arranged in some arbitrary order. The most convenient arrangement is β1 = 0, β2 = 1, β3 = α, . . . , βj = α^{j−2}, . . . , βq = α^{q−2} = α^{−1}, where α is a primitive element of Fq.
The information polynomial f (z) is then mapped into the n-tuple (f (β1 ), f (β2 ), . . . , f (βq ))
over Fq , whose components f (βi ) are equal to the evaluations of the polynomial f (z) at each
field element βi ∈ Fq :
    f(β_i) = Σ_{j=0}^{k−1} f_j β_i^j ∈ Fq,    1 ≤ i ≤ q,

where we use the convention 0^0 = 1. Note that it is the polynomial f(z) that varies from codeword to codeword; the “symbol locators” 0, 1, β3 = α, . . . , βq = α^{−1} are fixed. The code generators may thus be taken as the polynomials gj(z) = z^j, 0 ≤ j ≤ k − 1, or as the n-tuples

g0 = (1, 1, 1, . . . , 1)
g1 = (0, 1, α, . . . , α^{−1})
g2 = (0, 1, α^2, . . . , α^{−2})
...
g_{k−1} = (0, 1, α^{k−1}, . . . , α^{−(k−1)})

Theorem 8.5 (RS codes) The q^k q-tuples generated by the mapping f(z) ↦ {f(βi), βi ∈ Fq} as the polynomial f(z) ranges over all q^k polynomials over Fq of degree less than k form a linear (n = q, k, d = n − k + 1) MDS code over Fq.

Proof. The code is linear because the sum of the codewords corresponding to two polynomials
f1 (z) and f2 (z) is the codeword corresponding to the polynomial f1 (z)+f2 (z), and the multiple of
the codeword corresponding to f (z) by β ∈ Fq is the codeword corresponding to the polynomial
βf (z).

A codeword has a zero symbol in the coordinate corresponding to βi if and only if f(βi) = 0; i.e., if and only if βi is a root of the equation f(z) = 0. By the fundamental theorem of algebra (Theorem 7.12), if f(z) ≠ 0, then since deg f(z) ≤ k − 1, this equation can have at most k − 1 roots in Fq. Therefore a nonzero codeword can have at most k − 1 symbols equal to zero, so its weight is at least n − k + 1. Since the code is linear, this implies that its minimum distance satisfies d ≥ n − k + 1. But by the Singleton bound, d ≤ n − k + 1; thus d = n − k + 1.
Example 2. Consider the quaternary field F4 as constructed in Chapter 7, using the irreducible polynomial g(x) = x^2 + x + 1 over F2. Let β1 = 0, β2 = 1, β3 = x, β4 = x^2 = x + 1. (This
is the same ordering of the symbols as in the tables of Chapter 7.) The RS code with n = q = 4
and k = 2 is then given by the mapping

f (z) = f0 + f1 z −→ (f (β1 ), f (β2 ), f (β3 ), f (β4 )).

More explicitly, the mapping is

(f0 , f1 ) −→ ((f0 + f1 β1 ), (f0 + f1 β2 ), (f0 + f1 β3 ), (f0 + f1 β4 )).

Since f0 and f1 can each take on arbitrary values from F4 , the code has 16 codewords. The
two generators of the code correspond to the two polynomials g0 (z) = 1 and g1 (z) = z, or
equivalently the 4-tuples (1, 1, 1, 1) and (0, 1, x, x^2), respectively. The full coding map is

0, 0 −→ 0, 0, 0, 0            x, 0 −→ x, x, x, x
0, 1 −→ 0, 1, x, x^2          x, 1 −→ x, x^2, 0, 1
0, x −→ 0, x, x^2, 1          x, x −→ x, 0, 1, x^2
0, x^2 −→ 0, x^2, 1, x        x, x^2 −→ x, 1, x^2, 0
1, 0 −→ 1, 1, 1, 1            x^2, 0 −→ x^2, x^2, x^2, x^2
1, 1 −→ 1, 0, x^2, x          x^2, 1 −→ x^2, x, 1, 0
1, x −→ 1, x^2, x, 0          x^2, x −→ x^2, 1, 0, x
1, x^2 −→ 1, x, 0, x^2        x^2, x^2 −→ x^2, 0, x, 1

We see by inspection that the minimum nonzero weight of this (4, 2) linear code is d = 3.
Moreover, there are N3 = 12 codewords of weight 3 and N4 = 3 codewords of weight 4, in
accord with the weight formulas of the previous section.
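
These claims are easy to verify mechanically. In the sketch below (added for illustration; not part of the original notes), F4 is represented by the integers {0, 1, 2, 3}, with 2 standing for x and 3 for x^2 = x + 1; addition is bitwise XOR and multiplication is given by a small table. The script enumerates the 16 codewords and tallies their weights:

    # Sketch: enumerate the (4, 2, 3) RS code over F4 of Example 2.
    # F4 = {0, 1, 2, 3}, where 2 stands for x and 3 for x^2 = x + 1 (arithmetic mod x^2 + x + 1).
    MUL = [[0, 0, 0, 0],
           [0, 1, 2, 3],
           [0, 2, 3, 1],
           [0, 3, 1, 2]]                      # multiplication table for F4
    add = lambda a, b: a ^ b                  # addition in F4 is bitwise XOR

    betas = [0, 1, 2, 3]                      # the four symbol locators beta_1, ..., beta_4

    def encode(f0, f1):
        """Evaluate f(z) = f0 + f1*z at every element of F4."""
        return tuple(add(f0, MUL[f1][b]) for b in betas)

    code = [encode(f0, f1) for f0 in range(4) for f1 in range(4)]
    weights = [sum(1 for s in c if s) for c in code]
    print(min(w for w in weights if w))        # minimum nonzero weight: d = 3
    print(weights.count(3), weights.count(4))  # N3 = 12, N4 = 3, as stated above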
The RS codes are naturally nested— i.e., the (n = q, k, d) code is a subcode of the (n =
q, k + 1, d − 1) code— since the polynomials of degree less than k are a subset of the polynomials
of degree less than k + 1. The (n = q, n, 1) RS code is the universe code consisting of all q^q q-tuples over Fq, and the (n = q, 1, n) code is the repetition code over Fq, consisting of all q
codewords of the form (β, β, . . . , β) for β ∈ Fq .
An (n = q, k, d = n − k + 1) RS code may be punctured in any s ≤ n − k = d − 1 places to yield an (n − s = q − s, k, d − s = n − k + 1 − s) MDS code. The minimum distance can be reduced by at most s by such puncturing, so the punctured code has minimum distance d′ ≥ n − k + 1 − s, and it still has q^k distinct codewords; but then by the Singleton bound, d′ ≤ (n − s) − k + 1 = n − k + 1 − s. Thus by puncturing an RS code, we can create an (n, k, d = n − k + 1) MDS code for any n ≤ q and k, 1 ≤ k ≤ n.
For historical reasons (see following subsections), an RS code of length n = q is called an
“extended RS code.” Here is how to construct a “doubly extended RS code.”

Exercise 3 (doubly extended RS codes). Consider the following map from (Fq)^k to (Fq)^{q+1}. Let (f0, f1, . . . , f_{k−1}) be any k-tuple over Fq, and define the polynomial f(z) = f0 + f1 z + · · · + f_{k−1} z^{k−1} of degree less than k. Map (f0, f1, . . . , f_{k−1}) to the (q + 1)-tuple ({f(βj), βj ∈ Fq}, f_{k−1}); i.e., to the RS codeword corresponding to f(z), plus an additional component equal to f_{k−1}. Show that the q^k (q + 1)-tuples generated by this mapping as the polynomial f(z) ranges over all q^k polynomials over Fq of degree less than k form a linear (n = q + 1, k, d = n − k + 1) MDS code over Fq. [Hint: f(z) has degree less than k − 1 if and only if f_{k−1} = 0.]
Using this construction, define a (4, 2, 3) linear code over F3, and verify that all nonzero words have weight 3.

8.3.2 Punctured RS codes as transform codes

The (n = q, k, d = n − k + 1) RS code defined above has generators gj corresponding to the polynomials fj(z) = z^j, 0 ≤ j ≤ k − 1, evaluated at each of the elements of Fq.
In this section we will consider the punctured (n = q − 1, k, d = n − k + 1) RS code obtained
by puncturing the above code in the first symbol position, whose locator is β1 = 0. We will
see that such a code may be regarded as being generated by a transform similar to a Fourier
transform. We will also see that it can be characterized as the set of all polynomial multiples of
a certain generator polynomial g(z), and that it has a cyclic property.
A handy general lemma in any field F is as follows:

Lemma 8.6 Let F be a field with multiplicative identity 1, and let ω be any nth root of 1 in F other than 1; i.e., such that ω^n = 1 but ω ≠ 1. Then

    Σ_{j=0}^{n−1} ω^j = 0.

Proof. Since ω^n = 1 = ω^0, we have

    ω Σ_{j=0}^{n−1} ω^j = Σ_{j=1}^{n} ω^j = Σ_{j=0}^{n−1} ω^j;

i.e., the sum S = Σ_{j=0}^{n−1} ω^j satisfies ωS = S. If ω ≠ 1, then this implies S = 0.
If ω is an nth root of unity, then so is ω^i for any integer i, since (ω^i)^n = ω^{in} = (ω^n)^i = 1. Therefore we have the immediate corollary:

Corollary 8.7 Let F be a field with multiplicative identity 1, let ω be a primitive nth root of 1 in F (i.e., ω^n = 1 and ω^i ≠ 1 for 1 ≤ i < n), and let i be any integer. Then

    Σ_{j=0}^{n−1} ω^{ij} = n if i = 0 mod n, and 0 otherwise.

Proof. If i = 0 mod n, then ω^i = 1 and the sum is simply 1 + 1 + · · · + 1 (n times), an element of F which we denote by n. If i ≠ 0 mod n and ω is a primitive nth root of 1, then ω^i is an nth root of 1 other than 1, so Lemma 8.6 applies and the sum is 0.
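
For a concrete check (an added illustration, not in the original text), take F = F_7 and ω = 3, which is a primitive 6th root of unity since 3 has multiplicative order 6 mod 7:

    # Sketch: numerical check of Corollary 8.7 in F_7 with n = 6 and ω = 3.
    q, n, w = 7, 6, 3
    for i in range(2 * n):
        s = sum(pow(w, i * j, q) for j in range(n)) % q
        # The sum should be n mod q = 6 (i.e., -1 in F_7) when i ≡ 0 (mod n), and 0 otherwise.
        assert s == (n % q if i % n == 0 else 0)
    print("Corollary 8.7 checked for F_7, ω = 3")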

Note that if F is a finite field Fq and n = q − 1, then n = −1 in Fq , since p divides q and thus
q = 0 in Fq .
Using this corollary, we can then define a “finite Fourier transform” (FFT) and “inverse finite Fourier transform” (IFFT) over F as follows. Let f = (f0, f1, . . . , f_{n−1}) ∈ F^n be any n-tuple of elements of F. Then the FFT of f is defined as the n-tuple F = (F0, F1, . . . , F_{n−1}) ∈ F^n, where

    F_i = Σ_{j=0}^{n−1} f_j ω^{ij}.

The IFFT of F is defined as f = (f0, f1, . . . , f_{n−1}) ∈ F^n, where

    f_j = n^{−1} Σ_{i=0}^{n−1} F_i ω^{−ij}.

Here n^{−1} denotes the multiplicative inverse of n in Fq; if n = q − 1 = −1, then n^{−1} = n = −1.
Using Corollary 8.7, we then verify that the IFFT of the FFT of f is f:

    n^{−1} Σ_{i=0}^{n−1} ( Σ_{j′=0}^{n−1} f_{j′} ω^{ij′} ) ω^{−ij} = n^{−1} Σ_{j′=0}^{n−1} f_{j′} Σ_{i=0}^{n−1} ω^{i(j′−j)} = Σ_{j′=0}^{n−1} f_{j′} δ_{jj′} = f_j,

since Corollary 8.7 shows that Σ_{i=0}^{n−1} ω^{i(j′−j)} is equal to n when j = j′ (mod n) and 0 otherwise.
Similarly the FFT of the IFFT of F is F. In other words, f ↔ F is a transform pair over Fq
analogous to a finite Fourier transform pair over the complex field C.
Now a codeword F = (f(1), f(α), . . . , f(α^{−1})) of a punctured (n = q − 1, k, d = n − k + 1) RS code over Fq is obtained from an information k-tuple (f0, f1, . . . , f_{k−1}) as follows:

    F_i = Σ_{j=0}^{k−1} f_j α^{ij},    0 ≤ i ≤ n − 1,

where α is a primitive nth root of unity in Fq . Thus F may be regarded as the FFT of the
n-tuple f = (f0 , f1 , . . . , fk−1 , 0, . . . , 0) whose last n − k elements are zeroes, where the FFT is
defined using a primitive nth root of unity α ∈ Fq .
Using the IFFT and the fact that n^{−1} = −1, the information vector f is therefore easily recovered from a codeword F as follows:

    f_j = − Σ_{i=0}^{n−1} F_i α^{−ij}.

Note that the IFFT must equal 0 for the last n − k symbols f_j, k ≤ j ≤ n − 1:

    Σ_{i=0}^{n−1} F_i α^{−ij} = 0,    k ≤ j ≤ n − 1.
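
Putting the last few formulas together, here is a small round-trip check over the prime field F_7 (an added sketch with assumed parameters n = q − 1 = 6, k = 2, and primitive element α = 3; it is not part of the original notes). The FFT of a zero-padded information vector is a codeword of the punctured (6, 2, 5) RS code, and the IFFT recovers the information symbols, with the last n − k coefficients equal to zero:

    # Sketch: FFT/IFFT transform pair over F_7 (n = 6, α = 3 a primitive 6th root of unity).
    q, n, k, a = 7, 6, 2, 3
    n_inv = pow(n, -1, q)                      # 6^{-1} = 6 = -1 in F_7

    def fft(f):
        return [sum(fj * pow(a, i * j, q) for j, fj in enumerate(f)) % q for i in range(n)]

    def ifft(F):
        return [n_inv * sum(Fi * pow(a, -i * j, q) for i, Fi in enumerate(F)) % q
                for j in range(n)]

    f = [2, 5, 0, 0, 0, 0]                     # information symbols (f0, f1) padded with n - k zeroes
    F = fft(f)                                 # a codeword of the punctured (6, 2, 5) RS code over F_7
    assert ifft(F) == f                        # the IFFT recovers f; its last n - k entries are 0
    print(F)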

8.3.3 RS codes as sets of polynomials

Let us define a polynomial F(z) = F0 + F1 z + · · · + F_{n−1} z^{n−1} of degree n − 1 or less corresponding to a codeword F. This last equation then implies that F(α^{−j}) = 0 for k ≤ j ≤ n − 1. In other words, α^j ∈ Fq is a root of F(z) for 1 ≤ j ≤ n − k.
It follows that the polynomial F(z) has n − k factors of the form z − α^j, 1 ≤ j ≤ n − k. In other words, F(z) is divisible by the generator polynomial

    g(z) = ∏_{j=1}^{n−k} (z − α^j).

Evidently deg g(z) = n − k.


In fact, the RS code may now be characterized as the set of all polynomials F (z) of degree
less than n that are multiples of the degree-(n − k) generator polynomial g(z). We have already
observed that g(z) divides every codeword polynomial F (z). But since F (z) = g(z)h(z) has
degree less than n if and only if h(z) has degree less than k, there are precisely q^k distinct
multiples g(z)h(z) of g(z) such that deg F (z) < n. This accounts for all codewords. Thus the
RS code may be characterized as follows:

Theorem 8.8 (RS codes as sets of polynomials) The (n = q − 1, k, d = n − k + 1) punctured RS code over Fq corresponds to the set of q^k polynomials

    {F(z) = g(z)h(z), deg h(z) < k},

where g(z) = ∏_{j=1}^{n−k} (z − α^j).

Example 2 (cont.) For the (4, 2, 3) RS code over F4 of Example 2, if we puncture the
first symbol, then we get a (3, 2, 2) RS code with generators g0 = (1, 1, 1) and g1 = (1, α, α^2)
(where α is a primitive field element). This code corresponds to the set of 16 polynomials
{F (z) = g(z)h(z), deg h(z) < 2} of degree less than 3 that are multiples of the degree-1 generator
polynomial g(z) = z + α, as can be verified by inspection.
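
This inspection can also be mechanized. The sketch below (an added illustration, not in the original notes) represents F4 as {0, 1, 2, 3} with 2 standing for α and 3 for α^2; it forms each punctured codeword (F0, F1, F2) = (f(1), f(α), f(α^2)) and checks that the associated polynomial F(z) = F0 + F1 z + F2 z^2 satisfies F(α) = 0, i.e., is divisible by g(z) = z + α:

    # Sketch: check that every codeword polynomial of the punctured (3, 2, 2) RS code
    # over F4 is a multiple of g(z) = z + α.
    MUL = [[0, 0, 0, 0], [0, 1, 2, 3], [0, 2, 3, 1], [0, 3, 1, 2]]   # F4 multiplication
    add = lambda a, b: a ^ b                                          # F4 addition
    ALPHA = 2
    locators = [1, 2, 3]                       # 1, α, α^2 (the locator 0 has been punctured)

    def poly_eval(coeffs, z):
        """Evaluate a polynomial over F4 (low-order coefficient first) at z."""
        acc, zpow = 0, 1
        for c in coeffs:
            acc = add(acc, MUL[c][zpow])
            zpow = MUL[zpow][z]
        return acc

    for f0 in range(4):
        for f1 in range(4):
            F = [poly_eval([f0, f1], b) for b in locators]   # codeword (F0, F1, F2)
            assert poly_eval(F, ALPHA) == 0                  # so g(z) = z + α divides F(z)
    print("all 16 codeword polynomials are multiples of z + α")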

8.3.4 Cyclic property of punctured RS codes

Finally, we show that punctured RS codes are cyclic; i.e., the end-around cyclic rotation F′ = (F_{n−1}, F0, F1, . . . , F_{n−2}) of any codeword F = (F0, F1, . . . , F_{n−1}) is a codeword. In polynomial terms, this is equivalent to the statement that F′(z) = zF(z) − F_{n−1}(z^n − 1) is a codeword.
We have seen that the polynomial z^n − 1 factors completely as follows: z^n − 1 = ∏_{j=1}^{n} (z − α^j). Consequently g(z) = ∏_{j=1}^{n−k} (z − α^j) divides z^n − 1. Since g(z) also divides F(z), it follows that g(z) divides F′(z) = zF(z) − F_{n−1}(z^n − 1), so F′(z) is a codeword.


Example 2 (cont.) The punctured (3, 2, 2) RS code described above has the cyclic property,
as can be verified by inspection.
Historically, RS codes were introduced by Reed and Solomon (1960) as valuation codes. In
the 1960s and 1970s, RS and BCH codes were primarily studied as cyclic codes. The transform
approach was popularized by Blahut in the early 1980s. In the past decade, RS codes have again
come to be regarded as valuation codes, due in part to the work of Sudan et al. on decoding as
interpolation. This will be our next topic.

8.4 Introduction to RS decoding


An important property of RS codes is that in practice they can be decoded rather easily, using
finite-field arithmetic. For example, decoding of the NASA-standard (255, 223, 33) 16-error-correcting code over F256 is today routinely accomplished at rates of Gb/s.
A series of decoding algorithms bearing names such as Peterson, Berlekamp-Massey, Euclid,
and Welch-Berlekamp have been developed over the years for error-correction and erasure-and-
error correction. More recently, Sudan and others have devised list and soft-decision error-
correction algorithms that can decode significantly beyond the guaranteed error-correction ca-
pability of the code. All of these algorithms are polynomial-time, with the most efficient requiring
a number of finite-field operations of the order of n^2.
We will present briefly an error-correction algorithm based on viewing the decoding problem
as an interpolation problem. Recent decoding algorithms, including the Sudan-type algorithms,
involve extensions of the ideas in this algorithm.
We start by introducing a new notation for n-tuples over Fq for n ≤ q. As before, we index each
of the n ≤ q coordinates by a unique field element βj ∈ Fq , 1 ≤ j ≤ n. An n-tuple (y1 , y2 , . . . , yn )
can then be represented by a set of pairs {(yj, βj), 1 ≤ j ≤ n}. Each pair (yj, βj) ∈ (Fq)^2 is regarded as a point in the two-dimensional space (Fq)^2. An n-tuple is thus specified by a set of n points in (Fq)^2.
A codeword in an (n = q, k, d = n − k + 1) RS code is then an n-tuple specified by a set of
n = q points (yj , βj ) satisfying the equation yj = f (βj ) for a fixed polynomial f (z) ∈ Fq [z] of
degree less than k. In other words, the n points of the codeword are the n roots (y, z) ∈ (Fq)^2
of the algebraic equation y = f (z), or y − f (z) = 0.
Curves specified by such algebraic equations are called algebraic curves. The n points of the
curve are said to lie on the bivariate polynomial p(y, z) = y − f (z); or, the bivariate polynomial
p(y, z) = y − f(z) is said to pass through the point (α, β) ∈ (Fq)^2 if p(α, β) = 0.
The decoding problem can then be expressed as an interpolation problem: given n received
points, find the bivariate polynomial of the form p(y, z) = y −f (z) with deg f (z) < k that passes
through as many of the received points as possible.
To make further progress, we express the received n-tuple y as the sum (over Fq ) of a codeword
c and an unknown error n-tuple e. The closest codeword c is the one for which the corresponding
error n-tuple e = y − c has least Hamming weight. If the minimum distance of the code is d
and y is within distance t < d/2 of a codeword c, then the distance to any other codeword is at
least d − t > d/2, so the closest codeword is unique.
An error-locator polynomial associated with an error n-tuple e is any polynomial Λ(z) such that Λ(βj) = 0 whenever ej ≠ 0. If the error vector has weight t, then clearly the degree-t polynomial

    Λ0(z) = ∏_{j : ej ≠ 0} (z − βj),

whose t roots are the error locators {βj : ej ≠ 0}, is an error-locator polynomial. Moreover, every error-locator polynomial Λ(z) must be a polynomial multiple of Λ0(z).
If Λ(z) is an error-locator polynomial for e and f (z) is the polynomial that maps to the
codeword c, then the bivariate polynomial
q(y, z) = Λ(z)(y − f (z))

evaluates to 0 for every received point (yj , βj ); i.e., q(y, z) passes through all n received points.
The decoding interpolation problem can therefore be expressed as follows: given n received
points, find the bivariate polynomial of the form q(y, z) = Λ(z)(y − f (z)) with deg f (z) < k and
deg Λ(z) = t as small as possible that passes through all received points.
Define g(z) = Λ(z)f (z); then
q(y, z) = yΛ(z) − g(z).
If deg Λ(z) = t, then deg g(z) < k + t. Thus, given t, the coefficients of Λ(z) and g(z) are a set
of t + 1 + k + t = k + 2t + 1 unknowns.
Assuming that the number of errors is not greater than t0 , and substituting the n received
points (yj , βj ) in the equation yΛ(z) − g(z) = 0, we obtain a system of n homogeneous linear
equations in k + 2t0 + 1 unknowns. In general, these equations may have no solutions other
than the all-zero solution, or may have a linear vector space of solutions. If k + 2t0 + 1 > n,
then there must be a space of solutions of dimension at least k + 2t0 + 1 − n. In particular, if
2t0 + 1 = d = n − k + 1, then there is at least a one-dimensional space of solutions.
The set of all such solutions may be found by standard techniques for solving systems of linear
equations, which in general have complexity of the order of n^3. Because of the special structure of these equations, “fast” techniques with complexity of the order of n^2 may be used. There is
active current research on such fast solution techniques.
Any such solution specifies two candidate polynomials Λ(z) and g(z). If these are the correct
polynomials, then since g(z) = Λ(z)f (z), the polynomial f (z) may be found simply by dividing
g(z) by Λ(z). (Note that the roots of Λ(z) need not be found explicitly.) If the candidate
polynomials are incorrect, then g(z) may not be divisible by Λ(z), in which case this candidate
pair may be discarded. If g(z) is divisible by Λ(z) and the result f (z) = g(z)/Λ(z) is a polynomial
of degree less than k such that the codeword corresponding to f (z) is within distance t0 of the
received n-tuple, then a candidate for the closest codeword has been found.
For standard errors-only error-correction, the distance d is chosen to be odd and t0 is chosen
such that 2t0 + 1 = d. We then obtain n homogeneous linear equations in n + 1 unknowns,
ensuring that there exists at least a one-dimensional space of solutions. If the actual number of
errors is t ≤ t0 , then there is guaranteed to be one and only one valid candidate solution, so as
soon as any valid solution is found, it may be declared to be the closest codeword.
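
The whole procedure fits in a short script. The sketch below is our own illustration, with assumed example parameters and helper routines (null_vector, poly_div) of our own naming; it is not the original text's algorithm verbatim. It decodes an RS code over the prime field F_7 with n = q = 7, k = 3, d = 5 and t0 = 2: it sets up the n homogeneous equations y_j Λ(β_j) − g(β_j) = 0 in the k + 2t0 + 1 = n + 1 unknown coefficients, extracts a nonzero solution by Gaussian elimination over F_7, and recovers f(z) = g(z)/Λ(z) by polynomial division:

    # Sketch: interpolation-based errors-only RS decoding over F_7 (assumed parameters).
    p, n, k = 7, 7, 3
    t0 = (n - k) // 2                          # 2*t0 + 1 = d = n - k + 1 = 5
    betas = list(range(p))                     # symbol locators: all elements of F_7

    def ev(poly, z):                           # evaluate a polynomial over F_p (low order first)
        return sum(c * pow(z, i, p) for i, c in enumerate(poly)) % p

    def null_vector(rows):
        """Return one nonzero solution x of the homogeneous system rows·x = 0 over F_p."""
        A = [r[:] for r in rows]
        m, cols = len(A), len(A[0])
        piv, r = [], 0
        for c in range(cols):
            j = next((i for i in range(r, m) if A[i][c]), None)
            if j is None:
                continue
            A[r], A[j] = A[j], A[r]
            inv = pow(A[r][c], p - 2, p)
            A[r] = [x * inv % p for x in A[r]]
            for i in range(m):
                if i != r and A[i][c]:
                    A[i] = [(A[i][t] - A[i][c] * A[r][t]) % p for t in range(cols)]
            piv.append((r, c))
            r += 1
        pivot_cols = {c for _, c in piv}
        free = [c for c in range(cols) if c not in pivot_cols]
        x = [0] * cols
        x[free[0]] = 1                         # a free variable always exists here (cols = m + 1)
        for rr, cc in piv:
            x[cc] = -sum(A[rr][t] * x[t] for t in range(cols) if t != cc) % p
        return x

    def poly_div(num, den):
        """Divide num by den over F_p (coefficients low order first); return (quotient, remainder)."""
        num, den = num[:], den[:]
        while den[-1] == 0:
            den.pop()
        inv = pow(den[-1], p - 2, p)
        quo = [0] * max(len(num) - len(den) + 1, 0)
        for i in range(len(quo) - 1, -1, -1):
            c = num[i + len(den) - 1] * inv % p
            quo[i] = c
            for j, d in enumerate(den):
                num[i + j] = (num[i + j] - c * d) % p
        return quo, num

    # Encode a hypothetical information polynomial and corrupt two symbols.
    f = [2, 0, 5]                              # f(z) = 2 + 5z^2
    y = [ev(f, b) for b in betas]
    y[1] = (y[1] + 3) % p
    y[5] = (y[5] + 1) % p

    # One equation per received point: y_j*Lambda(beta_j) - g(beta_j) = 0, with unknowns
    # the t0 + 1 coefficients of Lambda followed by the k + t0 coefficients of g.
    rows = [[yj * pow(bj, i, p) % p for i in range(t0 + 1)] +
            [-pow(bj, i, p) % p for i in range(k + t0)]
            for yj, bj in zip(y, betas)]

    sol = null_vector(rows)
    lam, g = sol[:t0 + 1], sol[t0 + 1:]
    fhat, rem = poly_div(g, lam)
    assert all(r == 0 for r in rem)            # g(z) is divisible by Lambda(z)
    print((fhat + [0] * k)[:k])                # recovers [2, 0, 5]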
For erasure-and-error-correction, or for decoding of a punctured code, a certain number s of
the received symbols may be erased or punctured— i.e., unknown. The above decoding method
is easily adapted to this case. We then have a system of n − s homogeneous linear equations in
k + 2t0 + 1 unknowns. If d − s is odd and t0 is chosen so that 2t0 + s + 1 = d, then we obtain
n − s equations in k + d − s = n − s + 1 unknowns, ensuring at least a one-dimensional space of
solutions. Moreover, if the actual number of errors is t ≤ t0 , then there is guaranteed to be one
and only one valid candidate solution.
Recent work by Sudan et al. has explored decoding beyond the guaranteed error-correction
capability of the code. The main idea is to replace Λ(z) by Λ(y, z). Then the polynomial q(y, z)
can have more than one factor of the form y − f(z), and the list of all such factors is produced at the decoder output. The number of such factors (the list size in Sudan-type decoding) is obviously bounded by deg_y q(y, z) = deg_y Λ(y, z) + 1. It can be shown that under certain conditions, the
codeword c will be on the list, even if the distance between y and c exceeds d/2. In practice,
the probability that the list will contain more than one codeword is usually very small.

Finally, in many cases it may be possible to assign a reliability to each received symbol, where
the reliability is an estimate of the log likelihood of the symbol being correct. Or, the symbol
receiver may itself put out a list of more than one candidate symbol, each with a different
reliability. In such cases the decoder may develop a list of candidate codewords, from which
the one with greatest reliability (log likelihood) may then be chosen. Such reliability-based
decoding schemes can make up much of the performance difference between hard-decision and
maximum-likelihood decoding.

8.5 Applications of RS codes


RS codes are useful for a variety of applications:
(a) For channels that naturally transmit packets or binary m-tuples, RS codes over F_{2^m} are used for m-tuple (byte) error correction.
(b) Over memoryless channels such as the AWGN channel, powerful codes may be constructed
by concatenation of an “inner code” consisting of q = 2^m codewords or signal points together with
an “outer code” over Fq , where Fq is identified with the inner codewords. Such a concatenated
code may be efficiently decoded by maximum-likelihood detection of the inner code followed by
algebraic decoding of the RS outer code, preferably using reliability information from the inner
decoder in the RS decoding.
(c) RS codes are also useful for burst error correction; for this purpose they are often combined
with m-tuple (byte) interleavers.

8.5.1 Concatenation using RS Codes

A basic concatenated code involves two codes, called the inner code and outer code, as shown
in Figure 1. A concatenation scheme is thus an example of a layered architecture.

[Figure 1: block diagram of a concatenated code. Information bits enter the outer encoder; the outer-coded sequence enters the inner encoder, which drives the channel; at the receiver, the inner decoder processes the received sequence and passes its decisions to the outer decoder, which delivers the decoded bits.]

Figure 1. Concatenation of an inner code and an outer code.

The inner code is typically a binary code, either block or convolutional, used over a binary
channel, such as a discrete-time AWGN channel with a 2-PAM signal set. The inner decoder
is typically a maximum-likelihood decoder, such as a Viterbi decoder (see next chapter), which
uses soft (reliability-weighted) outputs from the channel.

The outer code is typically an RS code over a finite field F_{2^m} of characteristic 2. Each field
element may thus be represented by a binary m-tuple. The field F256 is commonly used, in
which case field elements are represented by 8-bit bytes. An RS codeword of length n symbols
may therefore be converted into an RS-coded sequence of nm bits, which is the input sequence
to the inner encoder.
The input sequence to the inner encoder is often permuted by an interleaver to ensure that
symbols in a given RS codeword are transmitted and decoded independently. A simple row-
column block interleaver works as follows, assuming that the inner code is a binary linear block
code of dimension k = mN . Each RS codeword, consisting of n m-bit symbols, forms a horizontal
row in a rectangular N × n array of m-bit symbols, as shown below. Each column of the array,
consisting of N symbols, is taken as the input symbol sequence of length k = mN bits for one
inner code block.

symbol 1 in RS cw 1    symbol 2 in RS cw 1    . . .    symbol n in RS cw 1
symbol 1 in RS cw 2    symbol 2 in RS cw 2    . . .    symbol n in RS cw 2
  . . .                  . . .                           . . .
symbol 1 in RS cw N    symbol 2 in RS cw N    . . .    symbol n in RS cw N

The input sequence is read out column by column, with the column length N chosen large
enough so that symbols in the same row are effectively independent. Such an interleaver also
protects against error bursts of length mN bits or less.
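
In code, such an interleaver is just a transposed read-out of an N × n array. The sketch below (an added illustration with made-up toy data; the function names are ours) writes N codewords of n symbols into rows and reads them out column by column:

    # Sketch: row-column block interleaver for N RS codewords of n symbols each.
    def interleave(codewords):
        """codewords: list of N rows of n symbols. Returns the column-by-column read-out."""
        N, n = len(codewords), len(codewords[0])
        return [codewords[r][c] for c in range(n) for r in range(N)]

    def deinterleave(stream, N, n):
        """Inverse operation: rebuild the N rows of n symbols from the column-wise stream."""
        return [[stream[c * N + r] for c in range(n)] for r in range(N)]

    # Toy example with N = 3 "codewords" of length n = 4.
    cws = [[10, 11, 12, 13], [20, 21, 22, 23], [30, 31, 32, 33]]
    stream = interleave(cws)
    assert deinterleave(stream, 3, 4) == cws
    print(stream)   # [10, 20, 30, 11, 21, 31, 12, 22, 32, 13, 23, 33]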
The output of the inner decoder is correspondingly viewed as a sequence of m-bit symbols.
Typically the inner decoder puts out a sequence of hard decisions on bits, which are converted
to hard decisions on symbols. As noted earlier, the inner decoder output may preferably also include reliability information indicating how confident it is in its hard decisions. However,
for simplicity we will omit this refinement. The performance of the inner code may then be
characterized by the probability Ps (E) that an m-bit output symbol is in error.
The outer RS decoder takes this sequence of symbol hard decisions and performs error-
correction, using an algebraic decoding algorithm. If the RS code has minimum distance d,
then the decoder can decode any pattern of t errors correctly provided that 2t < d; i.e., pro-
vided that t ≤ tmax = b(d − 1)/2c. Most RS error-correctors give up if there is no codeword
within Hamming distance tmax of the received sequence. With such a bounded-distance decoder,
therefore, the probability per codeword of not decoding correctly is

Pr(ndc) = Pr{tmax + 1 or more symbol errors}.

It is difficult to obtain good analytical estimates of the probability of not decoding correctly
when p = Ps (E) is relatively large, which is the operating region for concatenated codes (but see
Exercise 4 below). Empirically, one observes that Pr(ndc) ≈ 1 when p ≈ tmax /n. As p decreases,
there is a threshold region, typically p ≈ 10^{−2}–10^{−3}, where Pr(ndc) suddenly decreases to very low values, e.g., Pr(ndc) ≈ 10^{−12}.
The objective of the inner code is therefore to achieve a very moderate symbol probability
of error such as Ps(E) ≈ 10^{−2}–10^{−3} at as low an Eb/N0 as possible (on an AWGN channel). For this purpose maximum-likelihood decoding— specifically, the Viterbi algorithm (VA) of
Chapter 9— is a good choice. The code should have a trellis with as many states as the VA can
reasonably decode, and should be the best code with that number of states, which almost always
implies a convolutional code. A 64-state convolutional code was standardized in the 1970’s, and
around 1990 NASA built a 2^14-state Viterbi decoder.
The objective of the outer code is then to drive Pr(ndc) down to the target error rate with as
little redundancy as possible. RS codes and decoders are ideal for this purpose. A (255, 223, 33)
RS code over F256 was standardized in the 1970’s, and is still in common use, nowadays at rates
up to Gb/s. Concatenated codes like this can attain low error rates within about 2–3 dB of the
Shannon limit on a power-limited AWGN channel.
Exercise 4. Show that if symbol errors occur independently with probability p = Ps (E), then
the probability of the RS decoder not decoding correctly is
    Pr(ndc) = Σ_{t=tmax+1}^{n} (n choose t) p^t (1 − p)^{n−t}.

Using the Chernoff bound, prove the exponential upper bound

    Pr(ndc) ≤ e^{−n D(τ||p)},

where τ = (tmax + 1)/n and D(τ||p) is the divergence (relative entropy)

    D(τ||p) = τ ln(τ/p) + (1 − τ) ln((1 − τ)/(1 − p))
between two Bernoulli distributions with parameters τ and p, respectively.
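
To see the threshold behavior mentioned above, the following added computation (our own illustration, assuming independent symbol errors with p = 10^{−2} and the (255, 223, 33) code, so tmax = 16) evaluates both the exact sum and the divergence bound of Exercise 4:

    # Sketch: exact Pr(ndc) and the divergence bound for a (255, 223, 33) RS outer code.
    from math import comb, exp, log

    n, tmax, p = 255, 16, 1e-2                 # p is an assumed inner-decoder symbol error rate

    exact = sum(comb(n, t) * p**t * (1 - p)**(n - t) for t in range(tmax + 1, n + 1))

    tau = (tmax + 1) / n
    D = tau * log(tau / p) + (1 - tau) * log((1 - tau) / (1 - p))
    bound = exp(-n * D)

    print(f"exact Pr(ndc) = {exact:.3g}, Chernoff-type bound = {bound:.3g}")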

8.6 Binary BCH codes


In a finite field F_{2^m} of characteristic 2, there always exists a prime subfield F2 consisting of the two elements F2 = {0, 1}. Correspondingly, in an (n, k, d = n − k + 1) RS code over F_{2^m}, there
always exists a subset of codewords whose components are entirely in the subfield F2 . This
“subfield subcode” is called a binary Bose-Chaudhuri-Hocquenghem (BCH) code. (Historically,
binary BCH codes were developed independently of RS codes, but at about the same time,
namely 1959-60.)
Example 2 (cont.). The binary BCH code derived from the (4, 2, 3) RS code over F4 of
Example 2 is the binary code consisting of the two codewords 0000 and 1111; i.e., the (4, 1, 4)
repetition code over F2 .
A binary BCH code is obviously linear, since the sum of any two binary RS codewords is another binary RS codeword. Its parameters (n′, k′, d′) are related to the parameters (n, k, d) of the RS code from which it is derived as follows. Its length n′ is evidently the same as the length n of the RS code. Its minimum Hamming distance d′ must be at least as great as the minimum distance d of the RS code, because any two binary codewords are also RS codewords and thus must differ in at least d places. Usually d′ = d. Finally, by the Singleton bound, k′ ≤ n′ − d′ + 1 ≤ n − d + 1 = k. Usually k′ is considerably less than k, and the binary BCH code falls considerably short of being MDS.
Determining k′ for a binary BCH code derived from a given (n, k, d) RS code over F_{2^m} is an exercise in cyclotomics. Let us associate a polynomial F(z) ∈ F_{2^m}[z] with each codeword F = (F0, F1, . . . , F_{n−1}) of the RS code as in Section 8.3.3. We showed in Lemma 7.19 that F(z) is actually a binary polynomial with all coefficients in the binary subfield {0, 1} if and only if the set of roots of F(z) is a union of cyclotomic cosets; i.e., if β ∈ F_{2^m} is a root of F(z), then so are β^2, β^4, . . ..

It was further shown in Section 8.3.3 that the punctured (n = q − 1, k, d = n − k + 1) RS code over Fq may be characterized as the set of q^k polynomials F(z) of degree less than n that are multiples of the generator polynomial g(z) = ∏_{j=1}^{n−k} (z − α^j), whose roots are the first n − k powers {α, α^2, . . . , α^{n−k}} of a primitive element α ∈ Fq. Since n − k = d − 1, the subset of binary polynomials in this set can thus be characterized as follows:

Theorem 8.9 (BCH codes) Given a field Fq of characteristic 2, the (n = q − 1, k′, d) BCH code over F2 corresponds to the set of 2^{k′} binary polynomials

    {F(z) = g(z)h(z), deg h(z) < k′},

where g(z) is the product of the distinct polynomials in the set of cyclotomic polynomials of the elements {α, α^2, . . . , α^{d−1}} of Fq, and k′ = n − deg g(z).

Example 3. Let us find the parameters (k′, d) of the BCH codes of length n = 15. The cyclotomic polynomials of the elements of F16 are the binary irreducible polynomials whose degrees divide 4, namely (taking α to be a root of x^4 + x + 1):

    polynomial                       roots
    x                                0
    x + 1                            1
    x^2 + x + 1                      α^5, α^10
    x^4 + x + 1                      α, α^2, α^4, α^8
    x^4 + x^3 + x^2 + x + 1          α^3, α^6, α^12, α^9
    x^4 + x^3 + 1                    α^7, α^14, α^13, α^11

For the first BCH code, we take g(x) = x^4 + x + 1, which has α and α^2 as roots; therefore this code is the subfield subcode of the (15, 13, 3) RS code over F16. It has d = 3 and k′ = n − deg g(x) = 15 − 4 = 11; i.e., it is a (15, 11, 3) code.
For the second BCH code, we take g(x) = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1), which has {α, α^2, α^3, α^4} as roots; therefore this code is the subfield subcode of the (15, 11, 5) RS code over F16. It has d = 5 and k′ = n − deg g(x) = 7; i.e., it is a (15, 7, 5) code.
For the third BCH code, we take g(x) = (x^4 + x + 1)(x^4 + x^3 + x^2 + x + 1)(x^2 + x + 1), which has {α, α^2, α^3, α^4, α^5, α^6} as roots; therefore this code is the subfield subcode of the (15, 9, 7) RS code over F16. It has d = 7 and k′ = n − deg g(x) = 5; i.e., (n, k′, d) = (15, 5, 7).
Finally, we find a (15, 1, 15) BCH code with a degree-14 generator polynomial g(x) which has {α, α^2, . . . , α^{14}} as roots; i.e., the binary repetition code of length 15.
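
The cyclotomic bookkeeping in this example is easy to automate. The short sketch below (added for illustration, not part of the original notes) computes the 2-cyclotomic cosets modulo 15 and, for each designed distance d, the dimension k′ = n − deg g(x):

    # Sketch: compute k' = n - deg g(x) for the binary BCH codes of length n = 15.
    n = 15

    def coset(s):
        """2-cyclotomic coset of s modulo n: the exponents {s, 2s, 4s, ...} of conjugate roots."""
        out, x = set(), s
        while x not in out:
            out.add(x)
            x = (2 * x) % n
        return out

    for d in (3, 5, 7, 15):
        # g(x) must have alpha, alpha^2, ..., alpha^(d-1) as roots, plus all of their conjugates.
        roots = set().union(*(coset(j) for j in range(1, d)))
        print(f"d = {d:2d}:  deg g(x) = {len(roots):2d},  k' = {n - len(roots):2d}")
    # Prints k' = 11, 7, 5, 1, matching the (15,11,3), (15,7,5), (15,5,7), (15,1,15) codes above.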
Table I below gives the parameters (n = q, k′, d + 1) for all binary BCH codes of lengths n = 2^m with even minimum distances d + 1. By puncturing one symbol, one may derive (n = q − 1, k′, d) binary BCH codes with odd minimum distances as above; these are the binary BCH codes that are more usually tabulated. The former codes may be obtained from the latter by adding an overall parity check as in Chapter 6, Exercise 1.

    n = q    k′    d + 1
      2       1      2
      4       3      2
      4       1      4
      8       7      2
      8       4      4
      8       1      8
     16      15      2
     16      11      4
     16       7      6
     16       5      8
     16       1     16
     32      31      2
     32      26      4
     32      21      6
     32      16      8
     32      11     12
     32       6     16
     32       1     32
     64      63      2
     64      57      4
     64      51      6
     64      45      8
     64      39     10
     64      36     12
     64      30     14
     64      24     16
     64      18     22
     64      16     24
     64      10     28
     64       7     32
     64       1     64

Table I. Binary BCH codes of lengths n = 2m for m ≤ 6.

We see that binary BCH codes with n = q include codes equivalent to SPC codes, extended
Hamming codes, biorthogonal codes, and repetition codes.
In comparison to binary RM codes, there exist more binary BCH codes; in particular, codes with distances other than 2^{m−r}. Also, for n ≥ 64, we begin to see codes with slightly better k
for a given n and d; e.g., the (64, 45, 8) binary BCH code has three more information bits than
the corresponding RM code, and the (64, 24, 16) code has two more information bits.
In principle, a binary BCH code may be decoded by any decoding algorithm for the RS code
from which it is derived. Candidate decoded RS codewords that are not binary may be discarded.
In practice, certain simplifications are possible in the binary case.
For these reasons, particularly the availability of algebraic decoding algorithms, binary BCH
codes have historically received more attention than binary RM codes.
However, for trellis-based maximum-likelihood (Viterbi) decoding, which we will discuss in
Chapter 9, RM codes usually have a better performance-complexity tradeoff. For example, the
(64, 45, 8) binary BCH code has eight times the trellis complexity of the (64, 42, 8) RM code,
and the (64, 24, 16) binary BCH code has four times the trellis complexity of the (64, 22, 16) RM
code. The corresponding increase in complexity of trellis-based decoding algorithms will usually
not justify the slight increase in number of information bits.
