|| Shri Hari ||
Why compress? To reduce the amount of data required to store an image, and to reduce the data that must be transmitted.
IMAGE
COMPRESSION
Consider any Hindi movie.
Average length: 3 hrs.
Frames displayed sequentially at the rate of 30 fps.
Size of each frame in high definition: 1920 x 1080.
Each pixel requires 24 bits.
How many 8.5 GB dual-layer DVDs will be required?
Bytes required by a 3 hr movie: ~2 x 10^12.
Total size of the movie is about 2000 GB, so 236 DVDs are required.
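As a quick check of the arithmetic above, here is a short Python sketch (using decimal gigabytes, 1 GB = 10^9 bytes, as the slide does):

import math

# Raw (uncompressed) storage for a 3-hour HD movie, as in the example above.
frames = 3 * 3600 * 30                    # 3 hours at 30 frames per second
bytes_per_frame = 1920 * 1080 * 24 // 8   # 24 bits per pixel = 3 bytes per pixel
total_bytes = frames * bytes_per_frame    # ~2.02 x 10^12 bytes
total_gb = total_bytes / 1e9              # ~2015 GB (the slide rounds to 2000 GB)
dvds = math.ceil(total_gb / 8.5)          # 238 dual-layer DVDs (236 if you start from 2000 GB)
print(f"{total_bytes:.3e} bytes, {total_gb:.0f} GB, {dvds} DVDs")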
DATA COMPRESSION
Process of reducing the amount of data required to represent a given quantity of information.
Data is the means by which information is conveyed.
Various amounts of data can be used to convey the same amount of information.
Representations that contain repeated or irrelevant information are said to contain redundant data.
INFORMATION VS DATA
DATA = INFORMATION + REDUNDANT DATA
RELATIVE DATA REDUNDANCY
Same information, two representations: one requires b bits (the larger representation), the other b' bits.
Compression ratio: C = b / b'
Relative data redundancy: R = 1 - 1/C
Representation of an image as a 2-D intensity array suffers from three principal types of data redundancy:

I. Coding Redundancy
The 8-bit codes that are used to represent the intensities in most 2-D intensity arrays contain more bits than are needed to represent the intensities.
Occurs when the data used to represent an image is not utilized in an optimal manner.
II. Spatial & Temporal Redundancy
Because the pixels of most 2-D intensity arrays are correlated spatially, information is unnecessarily replicated in the representation of such pixels.
In a video sequence, temporally correlated pixels (those similar to or dependent on pixels in nearby frames) also duplicate information.
Spatial redundancy is also known as inter-pixel redundancy.
III. Irrelevant Information
Most 2-D intensity arrays contain information that is ignored by the human visual system and/or is extraneous to the intended use of the image. It is redundant in the sense that it is not used.
The human eye is less sensitive to higher spatial frequency components than to lower frequencies (this is exploited in transform coding).
I. CODING REDUNDANCY
Remember (when we did histograms):
pr(rk) = nk / NM,   k = 0, 1, ..., L-1
Let l(rk) represent the number of bits used to represent rk.
Then the average number of bits required to represent each pixel is:
Lavg = Σk l(rk) pr(rk),   k = 0, 1, ..., L-1
Consider the following data for the given image.

rk       pr(rk)   Code
0        0.4      00000000
128      0.1      10000000
200      0.3      11001000
255      0.2      11111111
others   0.0      --

What is the average number of bits required to represent each pixel?
Lavg = 8 bits (every codeword is 8 bits long).
Now consider the following data for the given image.

Variable-length coding: assign fewer bits to the more probable intensity values.

rk       pr(rk)   Code
0        0.4      0
128      0.1      110
200      0.3      10
255      0.2      111
others   0.0      --

What is the average number of bits required to represent each pixel?
Lavg = 0.4(1) + 0.1(3) + 0.3(2) + 0.2(3) = 1.9 bits
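A minimal Python sketch of the Lavg computation for both representations (probabilities from the tables above; the variable-length codewords are as reconstructed in the table):

# Average code length: Lavg = sum over k of l(rk) * pr(rk)
probs = {0: 0.4, 128: 0.1, 200: 0.3, 255: 0.2}

fixed_code = {0: "00000000", 128: "10000000", 200: "11001000", 255: "11111111"}
var_code   = {0: "0",        128: "110",      200: "10",       255: "111"}

def avg_length(code):
    return sum(probs[rk] * len(code[rk]) for rk in probs)

print(round(avg_length(fixed_code), 2))   # 8.0 bits/pixel
print(round(avg_length(var_code), 2))     # 1.9 bits/pixel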
Why did we not take the following code? The average number of bits required to represent each pixel would then be just 1.3.

rk       pr(rk)   Code
0        0.4      0
128      0.1      10
200      0.3      1
255      0.2      11
others   0.0      --

The symbol code must be easy to decode. It should be possible to identify the end of a codeword as soon as it arrives: no codeword should be a prefix of another codeword. Here "1" is a prefix of both "10" and "11", so this code is not uniquely decodable.
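To see the prefix problem concretely, here is a small sketch showing that the 1.3-bit code is not uniquely decodable:

# With the rejected code, different pixel sequences can produce the same bit string.
bad_code = {0: "0", 128: "10", 200: "1", 255: "11"}
encode = lambda pixels: "".join(bad_code[p] for p in pixels)

print(encode([255]))        # '11'
print(encode([200, 200]))   # '11'  -- same bits, so a decoder cannot tell them apart
print(encode([128]))        # '10'
print(encode([200, 0]))     # '10'  -- another collision: '1' is a prefix of '10' and '11'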
FIND THE COMPRESSION RATIO & THE CORRESPONDING RELATIVE REDUNDANCY IF THE SIZE OF THE IMAGE IS 256 X 256

C = (256 x 256 x 8) / (256 x 256 x 1.9) = 8 / 1.9 ≈ 4.2
R = 1 - 1/4.2 ≈ 0.76
So about 76% of the data in the original 8-bit 2-D intensity array is redundant.

Coding redundancy is almost always present when the intensities of an image are represented using a natural binary code. This is so because:
Images are mostly composed of objects that have a predictable shape & reflectance, and
are sampled so that the objects are much larger than the pixels.
Certain intensities are therefore more probable than others.
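A short sketch of the computation, using the 1.9 bits/pixel of the variable-length code:

# Compression ratio C = b / b', relative redundancy R = 1 - 1/C
b_original   = 256 * 256 * 8      # 8-bit natural binary code
b_compressed = 256 * 256 * 1.9    # variable-length code, 1.9 bits/pixel on average

C = b_original / b_compressed     # ~4.2
R = 1 - 1 / C                     # ~0.76, i.e. ~76% of the data is redundant
print(round(C, 2), round(R, 2))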
II. Spatial & Temporal Redundancy
Observe:
The image cannot be compressed by variable-length coding.
Pixels are independent in the vertical direction.
Pixels are identical along each line.
A significant amount of spatial redundancy can be eliminated by representing the image as a sequence of run-length pairs.
This exploits the fact that an image very often contains strongly correlated pixels, in other words, large regions whose pixel values are the same or almost the same.
This redundancy can be exploited in several ways, one of which is by predicting a pixel value based on the values of its neighboring pixels.
In order to do so, the original 2-D array of pixels is usually mapped into a different format, e.g., an array of differences between adjacent pixels. If the original image pixels can be reconstructed from the transformed data set, the mapping is said to be reversible.
Examples of compression techniques that exploit inter-pixel redundancy include Constant Area Coding (CAC) and (1-D or 2-D) Run-Length Encoding (RLE).
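As an illustration, here is a minimal sketch of one such reversible mapping: a made-up row of correlated pixels is replaced by differences from the previous pixel and then reconstructed exactly.

# Reversible mapping: keep the first pixel, then store differences between adjacent pixels.
row = [100, 100, 101, 101, 101, 99, 99, 99]   # strongly correlated pixel values

diffs = [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]
print(diffs)   # [100, 0, 1, 0, 0, -2, 0, 0]  -- mostly small values, cheap to code

# Inverse mapping: a running sum recovers the original pixels exactly (lossless).
restored, total = [], 0
for d in diffs:
    total += d
    restored.append(total)
assert restored == row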
What is the compression ratio for this 256 x 256 image (each line is one pixel thick) if run-length coding is used?
The size of each run-length pair is 8 bits (for intensity) + 8 bits (to represent the run length, which in this case is 256).
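Working it out gives a compression ratio of 128, since one 16-bit run-length pair describes each row; a short sketch:

# Run-length coding of a 256 x 256 image whose rows are each a single constant run.
rows, cols, bits_per_pixel = 256, 256, 8
original_bits = rows * cols * bits_per_pixel   # 524,288 bits

bits_per_pair = 8 + 8                          # 8 bits intensity + 8 bits run length
rle_bits = rows * bits_per_pair                # one pair per row = 4,096 bits

C = original_bits / rle_bits                   # 128
R = 1 - 1 / C                                  # ~0.992
print(C, round(R, 3))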
III. IRRELEVANT INFORMATION (AKA PSYCHOVISUAL REDUNDANCY)
Many experiments on the psychophysical aspects of human vision have proven that the human eye does not respond with equal sensitivity to all incoming visual information; some pieces of information are more important than others.
Knowledge of which particular types of information are more or less relevant to the final human user has led to image and video compression techniques that aim at eliminating or reducing any amount of data that is psychovisually redundant.
The end result of applying these techniques is a compressed image file whose size is smaller than that of the original, but whose resulting quality is still acceptable for the application at hand.
The loss of quality that ensues as a byproduct of such techniques is frequently called quantization, to indicate that a wider range of input values is mapped into a narrower range of output values through an irreversible process.
The image appears to be a homogeneous field of gray and can therefore be represented by its average intensity alone: a single 8-bit value.
HOW FEW BITS ARE NEEDED TO REPRESENT THE INFORMATION IN AN IMAGE?
Information theory provides a mathematical framework to answer this.
Fundamental premise:
Generation of information can be
modeled as a probabilistic process
that can be measured in a manner
that agrees with intuition.
A random event E with probability P(E) is said to contain
I(E) = log(1 / P(E)) = -log P(E)
units of information.
What is the information contained in the event E if P(E) = 1?
I(E) = -log(1) = 0: an event that always occurs conveys no information.
If I say the sun has risen today, how much information am I conveying?
ENTROPY: AVERAGE INFORMATION PER SOURCE OUTPUT
Image entropy is a quantity used to describe the amount of information that must be coded for by a compression algorithm.
Low-entropy images, such as those containing a lot of black sky, have very little contrast and large runs of pixels with the same or similar intensity values. An image that is perfectly flat has an entropy of zero. Consequently, such images can be compressed to a relatively small size.
On the other hand, high-entropy images, such as an image of heavily cratered areas on the moon, have a great deal of contrast from one pixel to the next and consequently cannot be compressed as much.
ENTROPY: AVERAGE INFORMATION PER SOURCE OUTPUT
H = -Σj P(aj) log P(aj)
The base of the log determines the unit used to measure information (base 2 gives bits).
In the case of an image, the histogram can be used to estimate the symbol (intensity) probabilities. The intensity source's entropy then becomes
H = -Σk pr(rk) log2 pr(rk),   k = 0, 1, ..., L-1
It is not possible to code the intensity values with fewer than H bits/pixel.
Calculate the entropy of the given image.

rk       pr(rk)
0        0.4
128      0.1
200      0.3
255      0.2
others   0.0

H = -[0.4 log2 0.4 + 0.1 log2 0.1 + 0.3 log2 0.3 + 0.2 log2 0.2]
  = [0.529 + 0.332 + 0.521 + 0.464]
  ≈ 1.846 bits/pixel
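A quick sketch confirming the entropy value:

import math

# Entropy H = -sum of p * log2(p) over the intensities with nonzero probability.
probs = [0.4, 0.1, 0.3, 0.2]
terms = [-p * math.log2(p) for p in probs]
print([round(t, 3) for t in terms])   # [0.529, 0.332, 0.521, 0.464]
print(round(sum(terms), 3))           # 1.846 bits/pixel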
BASIC COMPRESSION
METHODS
HUFFMAN CODING
The code construction process has a complexity of O(N log2 N).
Huffman codes satisfy the prefix condition, so they are uniquely decodable: no codeword is a prefix of another codeword.
Huffman Coding Algorithm
(1) Order the symbols according to their probabilities.
Alphabet set: S1, S2, ..., SN
Probabilities: P1, P2, ..., PN
The symbols are arranged so that P1 >= P2 >= ... >= PN.
(2) Apply a contraction process to the two symbols with the smallest probabilities. Replace the last two symbols SN and SN-1 by a new symbol HN-1 that has probability PN + PN-1.
The new set of symbols has N-1 members: S1, S2, ..., SN-2, HN-1.
(3) Repeat step 2 until the final set has only one member.
(4) The codeword for each symbol Si is obtained by traversing the binary tree from its root to the leaf node corresponding to Si.
HUFFMAN CODING

X    Probability   Codeword   Codeword length
     0.25          01         2
     0.25          10         2
     0.2           11         2
     0.15          000        3
     0.15          001        3

Source reductions (probability, assigned codeword):
Step 1: 0.3 (00), 0.25 (01), 0.25 (10), 0.2 (11)
Step 2: 0.45 (1), 0.3 (00), 0.25 (01)
Step 3: 0.55 (0), 0.45 (1)
EXAMPLE
Ax = { a, b, c, d, e }
Px = { 0.25, 0.25, 0.2, 0.15, 0.15 }

Huffman tree: the root (1.0) splits into a node of probability 0.55 (branch 0) and a node of probability 0.45 (branch 1); the 0.55 node splits into a (0.25) and a 0.3 node, which in turn splits into d and e.

Symbol   Probability   Codeword
a        0.25          00
b        0.25          10
c        0.2           11
d        0.15          010
e        0.15          011
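A compact Python sketch of the algorithm above, using a heap for the repeated contraction step. The exact codewords depend on how ties are broken, so they may differ from the table, but the codeword lengths and the 2.3 bits/symbol average are the same.

import heapq
from itertools import count

def huffman(probabilities):
    # Each heap entry is (probability, tiebreak, {symbol: partial codeword}).
    tiebreak = count()
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)      # two least probable groups
        p2, _, group2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

Px = {"a": 0.25, "b": 0.25, "c": 0.2, "d": 0.15, "e": 0.15}
codes = huffman(Px)
print(codes)                                     # a prefix code with lengths 2, 2, 2, 3, 3
print(sum(Px[s] * len(codes[s]) for s in Px))    # 2.3 bits/symbol on average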
DISADVANTAGES OF THE HUFFMAN CODE
Changing ensemble: if the ensemble changes, the frequencies and probabilities change and the optimal code changes; e.g. in text compression symbol frequencies vary with context.
The code (or the frequency table) must be saved or transmitted along with the data.
Does not consider blocks of symbols: in "strings_of_characters_", once "strings_of_ch" has been seen the next nine symbols "aracters_" are predictable, but bits are still used for each of them.
VARIATIONS
n-ary Huffman coding: uses the code alphabet {0, 1, ..., n-1} (not just {0, 1}).
Adaptive Huffman coding: calculates frequencies dynamically based on recent actual frequencies.
Huffman template algorithm: generalizes the probabilities to arbitrary weights and the combining method (addition) to any function; can solve other minimization problems, e.g. minimizing max [wi + length(ci)].
GOLOMB CODING
Golomb coding is a lossless data
compression method using a family of
data compression codes invented by
Solomon W. Golomb in the 1960s.
Golomb coding is highly suitable for
situations in which the occurrence of
small values in the input stream is
significantly more likely than large
values.
A Golomb code is a variable-length code, a bit like Huffman; however, rather than being based on the data, like Huffman, it is based on a simple model of the probability of the values (which are explicitly dealt with as natural numbers, rather than as abstract symbols): small values are more likely than big ones. The precise relation between size and probability is captured in a parameter, the divisor.
Algorithm for Golomb code generation
To Golomb-code a number:
1. Find the quotient and remainder of division by the divisor.
2. Write the quotient in unary notation.
3. Then write the remainder in truncated binary notation.
4. In practice, a stop bit is needed after the quotient: if the quotient is written as a sequence of ones, the stop bit is a zero.
5. The length of the remainder can be determined from the divisor.
1. Fix the parameter M to an integer value.
2. For N, the number to be encoded, find
   a) quotient: q = int[N / M]
   b) remainder: r = N modulo M
3. Generate the codeword.
   a) The code format is <Quotient Code><Remainder Code>, where
   b) Quotient Code (in unary coding):
      i. write a q-length string of 1 bits,
      ii. then write a 0 bit.
   c) Remainder Code (in truncated binary encoding): if M is a power of 2, code the remainder in plain binary, so log2(M) bits are needed (Rice code).
   d) If M is not a power of 2, set b = ceil(log2(M)):
      i. if r < 2^b - M, code r in plain binary using b-1 bits;
      ii. otherwise, code the number r + 2^b - M in plain binary using b bits.
Find the Golomb code for integers 0 to 9 when M = 4.

N   Quotient   Remainder   Code
0   0          0           000
1   0          1           001
2   0          2           010
3   0          3           011
4   1          0           1000
5   1          1           1001
6   1          2           1010
7   1          3           1011
8   2          0           11000
9   2          1           11001
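A small sketch of the procedure above (unary quotient, stop bit, truncated-binary remainder); for M = 4 it reproduces the table:

import math

def golomb(n, m):
    # Golomb code of a non-negative integer n with divisor m.
    q, r = divmod(n, m)
    code = "1" * q + "0"                       # quotient in unary, '0' as the stop bit
    b = math.ceil(math.log2(m))
    if (m & (m - 1)) == 0:                     # m is a power of 2: plain b-bit remainder (Rice code)
        code += format(r, f"0{b}b")
    elif r < (1 << b) - m:                     # truncated binary: short (b-1 bit) codewords
        code += format(r, f"0{b - 1}b")
    else:                                      # truncated binary: long (b bit) codewords
        code += format(r + (1 << b) - m, f"0{b}b")
    return code

print([golomb(n, 4) for n in range(10)])
# ['000', '001', '010', '011', '1000', '1001', '1010', '1011', '11000', '11001']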
ARITHMETIC CODING
A message is encoded as a real number in an interval from 0 to 1.
Arithmetic coding typically achieves a better compression ratio than Huffman coding, as it produces a single codeword for the whole message rather than several separate code words.
Arithmetic coding is a lossless coding technique.
The idea is to code the string as a binary fraction pointing to the sub-interval for a particular symbol sequence.
Arithmetic coding is very popular in modern image and video compression standards.
ALGORITHM FOR ARITHMETIC CODING
1. Start with an interval [0, 1), divided into subintervals for all possible symbols that appear within the message. Make the size of each subinterval proportional to the frequency with which the symbol appears in the message. E.g.:

Symbol   Probability   Interval
a        0.2           [0.0, 0.2)
b        0.3           [0.2, 0.5)
c        0.1           [0.5, 0.6)
d        0.4           [0.6, 1.0)
2. When encoding a symbol, "zoom" into the current interval, and divide it into subintervals as in step one, using the new range. Example: suppose we want to encode "add". We "zoom" into the interval corresponding to "a", and divide up that interval into smaller subintervals as before. We now use this new interval as the basis of the next symbol encoding step.

Symbol   New "a" Interval
a        [0.0, 0.04)
b        [0.04, 0.1)
c        [0.1, 0.12)
d        [0.12, 0.2)
3. Repeat the process until the maximum precision of the machine is reached, or all symbols are encoded. To encode the next character "d", we use the "a" interval created before, zoom into its subinterval "d", and use that for the next step. This produces:

Symbol   New "d" Interval
a        [0.12, 0.136)
b        [0.136, 0.16)
c        [0.16, 0.168)
d        [0.168, 0.2)
4. Transmit some number within the latest interval to send the codeword. The number of symbols encoded will be stated in the protocol of the image format, so any number within [0.168, 0.2) will be acceptable.
DISADVANTAGES
Although arithmetic coding usually gives a better result than the widespread Huffman code, it is applied less often, for several reasons.
One is that the whole codeword must be received to start decoding the symbols, and if there is a corrupt bit in the codeword, the entire message could become corrupt.
Another is that there is a limit to the precision of the number which can be encoded, thus limiting the number of symbols that can be encoded within a codeword.
There have also been many patents on arithmetic coding, so the use of some of the algorithms calls for royalty fees; IBM and AT&T are among the companies that have held such patents.
Encoding Strings
lowerbound = 0
upperbound = 1
while there are still symbols to encode
    currentrange = upperbound - lowerbound
    upperbound = lowerbound + (currentrange * upperboundofnewsymbol)
    lowerbound = lowerbound + (currentrange * lowerboundofnewsymbol)
end while
Any value between the computed lower and upper probability bounds now encodes the input string.
Let us encode "add".

Symbol   Probability   Interval
a        0.2           [0.0, 0.2)
b        0.3           [0.2, 0.5)
c        0.1           [0.5, 0.6)
d        0.4           [0.6, 1.0)

Start with lower and upper probability bounds of 0 and 1.

Encode 'a':
current range = 1 - 0 = 1
upper bound = 0 + (1 * 0.2) = 0.2
lower bound = 0 + (1 * 0.0) = 0.0

Encode 'd':
current range = 0.2 - 0 = 0.2
upper bound = 0 + (0.2 * 1.0) = 0.2
lower bound = 0 + (0.2 * 0.6) = 0.12

Encode 'd':
current range = 0.2 - 0.12 = 0.08
upper bound = 0.12 + (0.08 * 1.0) = 0.2
lower bound = 0.12 + (0.08 * 0.6) = 0.168
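The same calculation as a short Python sketch, using the symbol intervals from the table:

# Arithmetic encoding of "add".
intervals = {"a": (0.0, 0.2), "b": (0.2, 0.5), "c": (0.5, 0.6), "d": (0.6, 1.0)}

low, high = 0.0, 1.0
for symbol in "add":
    span = high - low
    high = low + span * intervals[symbol][1]   # new upper bound (uses the old low)
    low  = low + span * intervals[symbol][0]   # new lower bound
    print(symbol, round(low, 3), round(high, 3))
# a 0.0 0.2
# d 0.12 0.2
# d 0.168 0.2   -> any number in [0.168, 0.2) encodes "add"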
DECODING ALGORITHM
get encoded number
do
    find the symbol whose range straddles the encoded number
    output the symbol
    range = symbol high value - symbol low value
    subtract the symbol low value from the encoded number
    divide the encoded number by the range
until no more symbols

Decode 0.177.
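A sketch of the decoder applied to 0.177. The number of symbols (3 here) is assumed to be known, as step 4 of the encoder notes.

# Arithmetic decoding of 0.177 using the same symbol intervals.
intervals = {"a": (0.0, 0.2), "b": (0.2, 0.5), "c": (0.5, 0.6), "d": (0.6, 1.0)}

value, message = 0.177, ""
for _ in range(3):                    # number of symbols known from the protocol
    for symbol, (lo, hi) in intervals.items():
        if lo <= value < hi:          # the range that straddles the encoded number
            message += symbol
            value = (value - lo) / (hi - lo)
            break
print(message)                        # 'add'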
LZW CODING
Static coding schemes require some knowledge
about the data before encoding takes place.
Universal coding schemes, like LZW, do not require
advance knowledge and can build such knowledge
on-the-fly.
LZW is the foremost technique for general-purpose data compression due to its simplicity and versatility.
It is the basis of many PC utilities that claim to double the capacity of your hard drive.
LZW compression uses a code table, with 4096 as a common choice for the number of table entries.
Codes 0-255 in the code table are
always assigned to represent single
bytes from the input file.
When encoding begins the code table
contains only the first 256 entries, with the
remainder of the table being blanks.
Compression is achieved by using codes
256 through 4095 to represent sequences
of bytes.
As the encoding continues, LZW
identifies repeated sequences in the
data, and adds them to the code table.
Decoding is achieved by taking each code from the compressed file and translating it through the code table to find what character or characters it represents.
LZW COMPRESSION ALGORITHM
We will compress the string "ABABBABCABABBA".
Initially the dictionary contains only the single characters A, B and C (one code per character).
Encode the given image (did you notice the vertical edge?). It is a 4 x 4 block whose rows are all 39 39 126 126:

39   39   126   126
39   39   126   126
39   39   126   126
39   39   126   126

Encoding proceeds as follows:

Currently recognized   Pixel being   Encoded   Dict.      Dict. entry
sequence               processed     output    location
-                      39            -         -          -
39                     39            39        256        39-39
39                     126           39        257        39-126
126                    126           126       258        126-126
126                    39            126       259        126-39
39                     39            -         -          -
39-39                  126           256       260        39-39-126
126                    126           -         -          -
126-126                39            258       261        126-126-39
39                     39            -         -          -
39-39                  126           -         -          -
39-39-126              126           260       262        39-39-126-126
126                    39            -         -          -
126-39                 39            259       263        126-39-39
39                     126           -         -          -
39-126                 126           257       264        39-126-126
126                    -             126       -          -

Encoded output: 39 39 126 126 256 258 260 259 257 126.
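A minimal sketch of the compressor described above. For simplicity the dictionary is keyed on tuples of values rather than byte strings; it reproduces the encoded output of the image example and also works on the earlier string example (via character codes).

def lzw_compress(values):
    # Codes 0-255 represent single values; new sequences get codes from 256 upward.
    dictionary = {(v,): v for v in range(256)}
    next_code = 256
    recognized = ()                           # currently recognized sequence
    output = []
    for v in values:
        candidate = recognized + (v,)
        if candidate in dictionary:
            recognized = candidate            # keep growing the recognized sequence
        else:
            output.append(dictionary[recognized])
            dictionary[candidate] = next_code # add the new sequence to the table
            next_code += 1
            recognized = (v,)
    if recognized:
        output.append(dictionary[recognized])
    return output

image = [39, 39, 126, 126] * 4                # the 4 x 4 image, row by row
print(lzw_compress(image))                    # [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]

text = [ord(ch) for ch in "ABABBABCABABBA"]
print(lzw_compress(text))                     # [65, 66, 256, 257, 66, 67, 256, 258, 65]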
LZW
DECOMPRESSION
The LZW decompressor creates the same string
table during decompression.
It starts with the first 256 table entries initialized
to single characters.
The string table is updated for each character in
the input stream, except the first one.
Decoding is achieved by reading codes and translating them through the code table as it is being built.
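A matching sketch of the decompressor: it rebuilds the same table while decoding (including the special case of a code that is not yet in the table) and recovers the original pixel sequence.

def lzw_decompress(codes):
    # Invert lzw_compress: codes 0-255 are single values, the rest are rebuilt on the fly.
    dictionary = {v: (v,) for v in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                                  # code being defined right now: previous + its first value
            entry = previous + (previous[0],)
        output.extend(entry)
        dictionary[next_code] = previous + (entry[0],)   # the same entry the compressor added
        next_code += 1
        previous = entry
    return output

codes = [39, 39, 126, 126, 256, 258, 260, 259, 257, 126]
print(lzw_decompress(codes) == [39, 39, 126, 126] * 4)   # True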
Advantages of LZW over Huffman:
LZW requires no prior information about the input data stream.
LZW can compress the input stream in one single pass.
Another advantage of LZW is its simplicity, allowing fast execution.
LZW: Limitations
What happens when the dictionary gets too large (i.e., when
all the 4096 locations have been used)?
Here are some options usually implemented:
Simply forget about adding any more entries and use the
table as is.
Throw the dictionary away when it is no longer effective at
compression.
Clear entries 256-4095 and start building the dictionary again.
Some clever schemes rebuild the string table from the last N input characters.