Multimedia Computing
Ralf and Klarra
Chapter 5
Data Compression
Why Compression
• Multimedia data such as images, audio and video requires an enormous amount of storage.
• Point-to-point communication of such data requires high bandwidth.
• Real-time communication such as video conferencing becomes impossible without compression.
• To make communication feasible and solutions cost-effective, multimedia data is compressed.
By: Dinesh Maharjan
Why Compression
• Is a multi-dimensional signal
Why Compression
• CD-quality stereo audio with 16-bit depth and a 44.1 kHz sampling rate: the size of 10 min is
• Sampling rate (Hz) * depth (bits) * no. of channels * time (sec)
• = 44,100 * 16 * 2 * 600 bits = 846,720,000 bits, about 100.9 MB (with 1 MB = 2^20 bytes)
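The size formula above can be checked with a few lines of Python. This is a minimal sketch; the function name `pcm_size_bytes` is made up for illustration, and the megabyte is assumed to be 2^20 bytes.

```python
def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Uncompressed PCM size: rate * depth * channels * time, converted to bytes."""
    bits = sample_rate_hz * bit_depth * channels * seconds
    return bits // 8

# CD-quality stereo, 10 minutes
size = pcm_size_bytes(44_100, 16, 2, 10 * 60)
print(size)                    # 105840000 bytes
print(round(size / 2**20, 2))  # size in MB, assuming 1 MB = 2^20 bytes
```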
Compression techniques
• Video signal representation includes
• Visual representation
• Transmission
• Digitalization
Compression techniques
• Processing uncompressed video requires storage in the range of gigabytes.
• It also requires data rates as high as 140 Mbit/s.
• Compression techniques reduce these high data transfer rate requirements.
• The quality of the compressed and then decompressed data must be as good as possible.
Multimedia Applications
• Can be run in two different modes
• Dialogue mode
• Retrieval mode
• Dialogue mode is interaction between humans via multimedia information.
• Retrieval mode retrieves multimedia information from a multimedia database.
Dialogue Mode
• The end-to-end delay introduced by compression and decompression should not exceed 150 ms.
• A delay of 50 ms should be achieved in face-to-face dialogue applications.
• Further delay is introduced by communication protocol processing and data transfer from input devices.
Retrieval mode
Both mode requirements
Compression classification
General Compression
ASCII encoding
• Suppose we want to send the message “go go gophers”.
• The message is sent as a sequence of bits according to the ASCII value of each character.
ASCII encoding
• The ASCII encoding of “go go gophers” is
• 1100111 1101111 0100000 1100111 1101111 0100000 1100111 1101111 1110000 1101000 1100101 1110010 1110011
• ASCII uses 7 bits per character. Thus, 13 characters require 91 bits.
• ASCII encoding may be very big if
Fixed length encoding
• Since the message does not contain all the characters, we can encode it with our own code.
• We can encode 8 characters with 3 bits.
Fixed length encoding
• The 3-bit encoding requires 39 bits for the message and
• 80 bits for the code translation table (56 bits for the ASCII encoding of the 8 characters and 24 bits for our own codes).
• Altogether 119 bits are required, more than the 91 bits of plain ASCII.
• This shows that a fixed-length code is not efficient if the character frequencies are not high enough to offset the cost of the table.
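The bit counts for “go go gophers” can be recomputed with a short script. This is a sketch under stated assumptions: the translation table is assumed to store each distinct character as 7 ASCII bits plus its fixed-length code word, and the function names are made up for illustration.

```python
def ascii_bits(message):
    """7 bits per character, as in plain ASCII."""
    return 7 * len(message)

def fixed_length_bits(message):
    """Return (payload_bits, table_bits) for a custom fixed-length code."""
    alphabet = sorted(set(message))
    # Smallest code length that can number every distinct character.
    code_len = max(1, (len(alphabet) - 1).bit_length())
    payload = code_len * len(message)
    # Assumed table layout: 7 ASCII bits + one code word per distinct character.
    table = len(alphabet) * (7 + code_len)
    return payload, table

message = "go go gophers"
payload, table = fixed_length_bits(message)
print(ascii_bits(message))              # 91
print(payload, table, payload + table)  # 39 80 119
```

For a message this short the table dominates; the custom code only pays off once the payload saving exceeds the table cost.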
Run-length Encoding
• Content-dependent compression
• Lossless compression
• Replaces a sequence of the same bytes with their number of occurrences
• A special flag marks where the number of occurrences follows
• The special flag may be !
• Uncompressed data: ABCCCCCCCCDEFGGG
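A flag-based run-length encoder can be sketched as follows. The threshold `min_run=4` (shorter runs are copied unchanged) and the output layout "byte, flag, count" are assumptions made for this illustration; the slide only states that a flag such as ! introduces the count.

```python
def rle_encode(data, flag="!", min_run=4):
    """Replace runs of at least min_run equal characters by char + flag + count."""
    out = []
    i = 0
    while i < len(data):
        # Find the end of the run starting at position i.
        j = i
        while j < len(data) and data[j] == data[i]:
            j += 1
        run = j - i
        if run >= min_run:
            out.append(f"{data[i]}{flag}{run}")
        else:
            out.append(data[i] * run)  # short runs are cheaper left as they are
        i = j
    return "".join(out)

print(rle_encode("ABCCCCCCCCDEFGGG"))  # ABC!8DEFGGG
```

This sketch does not escape the flag character or digits that occur in the data itself, so a practical format would need a rule for those cases.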
Huffman Encoding
• Variable-length encoding
• Determines the optimal code using the minimum number of bits
• Frequently occurring characters are coded with fewer bits than seldom occurring characters
• A binary tree is constructed to determine the Huffman code
Huffman Encoding
• Leaf nodes represent characters
• Non-leaf nodes hold the combined frequency of the characters below them
• First, the characters are sorted in increasing order of their frequency
• Then the two smallest frequencies are added to form a cumulative frequency node.
• This is done repeatedly until no frequencies are left to be added
Huffman Encoding
• The left edge is given code 0 and the right edge gets 1.
• The code for each character is determined by concatenating the edge codes on the path from the root to its leaf node.
• Lossless encoding
• In a similar way, we can encode images and video
Huffman Encoding
• Every detail of a scene is a pixel
• Some details of a scene fall between scanning lines.
• Only 70% of vertical lines are represented
• It means some vertical resolution is lost.
• This ratio is known as the Kell factor
Huffman Encoding
• p(A)=0.16, p(B)=0.51, p(C)=0.09, p(D)=0.13, p(E)=0.11
• AAAAAABCCCCCCDDEEEEE
• GEEKSFORGEEKS
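The tree construction described on the earlier Huffman slides can be sketched with a priority queue. The function name `huffman_codes` and the tie-breaking counter are choices made for this example; with ties, different but equally short codes are possible, so only the code lengths should be relied on.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman code from a {symbol: frequency} mapping (symbols are strings)."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Merge the two least frequent nodes into one parent node.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")  # left edge gets 0
            walk(node[1], prefix + "1")  # right edge gets 1
        else:
            codes[node] = prefix         # leaf: a character

    walk(heap[0][2], "")
    return codes

probs = {"A": 0.16, "B": 0.51, "C": 0.09, "D": 0.13, "E": 0.11}
codes = huffman_codes(probs)
for sym in sorted(codes):
    print(sym, codes[sym])
avg = sum(probs[s] * len(codes[s]) for s in probs)
print(round(avg, 2))  # 1.98 bits per symbol on average
```

For these probabilities B receives a 1-bit code and the other four symbols 3-bit codes. The same function accepts character counts, for example `huffman_codes(collections.Counter("GEEKSFORGEEKS"))`.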
Periodic function
• Repeats itself after some fixed interval
• sin(x) and cos(x) are naturally periodic with a period of 2*pi
• $\int_{-a}^{a} f(x)\,dx = 2\int_{0}^{a} f(x)\,dx$ if f(x) is even
• The integral is 0 if f(x) is odd
• A function is even when we get the same function value for -x and +x.
• E.g. $f(x) = x^2$ is an even function
Odd function
• For an odd function, the function value changes sign when the sign of the variable is changed
• E.g. $f(x) = x^3$, $\sin x$
• Integration by parts
• Example: $\int x^2 \sin 2x\,dx$
• $\int x^2 \sin 2x\,dx = x^2 \int \sin 2x\,dx - \int \left(\frac{d(x^2)}{dx} \int \sin 2x\,dx\right) dx$, i.e. first times integral of second, minus the integral of (derivative of first times integral of second)
• The rule can be applied if the first function is algebraic
Fourier Series
• Let f(x) be a periodic function with period T
• The Fourier series of f(x) is the infinite series given by
• where
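For reference, one standard way of writing the Fourier series of a function with period T is shown below. The convention with the constant term $a_0/2$ and integration over $[-T/2, T/2]$ is an assumption; the notation used in the lecture may differ.

```latex
f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}
       \left[ a_n \cos\frac{2\pi n x}{T} + b_n \sin\frac{2\pi n x}{T} \right]

a_n = \frac{2}{T}\int_{-T/2}^{T/2} f(x)\cos\frac{2\pi n x}{T}\,dx, \qquad n = 0, 1, 2, \dots

b_n = \frac{2}{T}\int_{-T/2}^{T/2} f(x)\sin\frac{2\pi n x}{T}\,dx, \qquad n = 1, 2, 3, \dots
```

By the symmetry rules for even and odd functions, all $b_n$ vanish when f(x) is even and all $a_n$ vanish when f(x) is odd.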
Fourier Series
• Let f(x) be a periodic function with period T
• Fourier series
JPEG compression
• Introduced by the Joint Photographic Experts Group in 1992 for image compression
• Lossy image compression
• Uses the DCT method for compression
• Can compress color images as well as grayscale images
Steps in JPEG compression
• Picture preparation: pixel, block, MCU
• Picture processing: FDCT
• Quantization
• Entropy encoding: run-length, Huffman
Steps in JPEG compression
• Color is produced through three different signals
• The signals are proportional to the red, green and blue light intensities
• These signals are transmitted separately
Image Preparation
• The human eye can detect the motion of objects.
• A rapid succession of slightly different still images provides the sensation of motion
• This property is used in motion pictures and television
• For realistic motion two conditions are required
• - the rate of succession of images must be high
Image Preparation
• To perceive continuous motion, the frame rate must be higher than 15 fps
• Otherwise we can tell individual frames apart
• For smooth motion, the frame rate should be 24 to 30 fps
Image Preparation
• The image is divided into data units
• Lossless mode uses one pixel as one data unit
• Lossy mode uses blocks of 8x8 pixels
Image Preparation
• A computer display achieves a 70 Hz refresh rate with the use of a refresh buffer
• A full TV picture is divided into two half-pictures by line interleaving
• A refresh rate of 25 Hz for the full TV picture requires a scan rate of 50 Hz
Image Preparation
• Interleaved processing of data units:
• Data units are combined into MCUs (minimum coded units)
• If all components have the same resolution, then an MCU consists of exactly one data unit for each component
Image Preparation
• Video bandwidth = a*b*c/2
• A computer system with a = 1312, b = 800 pixels, out of which 124 x 786 are visible, and frame rate c = 100 Hz
• Video bandwidth = 1312 * 800 * 100 / 2
• = 52.48 MHz
Image Processing
• The image is transformed with the DCT algorithm
• DCT requires pixel values in the range of -128 to 127
• Thus, subtract 128 from each pixel value of the original image
Image Processing
• An RGB video signal requires separate R, G, B wires for each pixel.
• Other colors are generated as combinations of these colors.
• E.g., R + G + B = 1 provides neutral white color.
• Computer monitors use RGB signals.
Image Processing
• The human eye is more sensitive to brightness than to chrominance.
• Thus, separating brightness information from color information is important.
Image Processing
• DCT transforms the image into the frequency domain using eq. 1
Image Processing
• To compute the DCT in matrix form, the following equation is defined
Image Processing
• Every entry of the first row is 1/sqrt(8), as expected
• We get the following DCT matrix
Image Processing
• The DCT of M is given by
• DCT(M) = TMT’, which yields the following matrix
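The matrix form DCT(M) = TMT’ can be tried out in plain Python. This is a sketch that assumes the usual orthonormal DCT-II matrix (first row 1/sqrt(8), remaining rows sqrt(2/8) times a cosine) and applies the level shift by 128 first; the helper names and the flat test block are made up for illustration.

```python
import math

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix T (assumed form; first row is 1/sqrt(n))."""
    t = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == 0:
                t[i][j] = 1 / math.sqrt(n)
            else:
                t[i][j] = math.sqrt(2 / n) * math.cos((2 * j + 1) * i * math.pi / (2 * n))
    return t

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(block):
    """Level-shift the pixels by 128, then compute T * M * T'."""
    t = dct_matrix(len(block))
    shifted = [[p - 128 for p in row] for row in block]
    return matmul(matmul(t, shifted), transpose(t))

# A flat block has no detail, so only the DC coefficient is non-zero.
flat = [[140] * 8 for _ in range(8)]
coeffs = dct2(flat)
print(round(coeffs[0][0], 3))  # 96.0, i.e. 8 * (140 - 128)
```

Because T is orthonormal, the inverse transform is simply T’ applied from the left and T from the right.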
Quantization
• Quantization lets us choose a quality level ranging from 1 to 100
• These are the quantization levels
• 1 gives the poorest image quality and 100 gives the highest quality.
• For this, a quantization matrix is used
Quantization
Differ in digital image, resolution, quantization and frame rate
Quantization
• Quantization is achieved using the following equation
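A common form of the quantization step, consistent with the decoding rule D = C*Q used later in these slides, divides each DCT coefficient by the matching entry of the quantization matrix and rounds the result. The sketch below assumes that form; the 2x2 matrices are hypothetical stand-ins for the real 8x8 ones.

```python
def quantize(coeffs, q):
    """C[i][j] = round(coeffs[i][j] / Q[i][j]); this is where information is lost."""
    return [[round(c / qv) for c, qv in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, q)]

def dequantize(quantized, q):
    """D[i][j] = C[i][j] * Q[i][j] (element-wise, not a matrix product)."""
    return [[c * qv for c, qv in zip(crow, qrow)]
            for crow, qrow in zip(quantized, q)]

# Hypothetical values, chosen only for illustration.
coeffs = [[162.3, 40.6], [30.5, -8.1]]
q = [[16, 11], [12, 12]]
c = quantize(coeffs, q)
print(c)                 # [[10, 4], [3, -1]]
print(dequantize(c, q))  # [[160, 44], [36, -12]]
```

Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from the rounding rule of a particular codec.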
Quantization
• Result of quantization is
• The video controller can also access memory.
• For the presentation of different colors, the system uses a color look-up table.
• Digitization depends upon image resolution, quantization and frame rate.
Zig Zag ordering
• Most of the high-frequency coefficients will be quantized to zero.
• JPEG takes advantage of this fact
• It encodes the coefficients in zig-zag order
Zig Zag ordering
• Most of the coefficients of the quantized matrix will be zeros
• JPEG takes advantage of this fact
• It encodes the coefficients in zig-zag order
Zig Zag ordering
• 10, 4, 3, -7, 9, 2, 5, 1, -5, -3, -2, -5, 1, 2, 1, 0, …
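The zig-zag scan can be generated by walking the anti-diagonals of the block and alternating direction. This is a minimal sketch; the function names and the small 3x3 test block are assumptions made for illustration (JPEG itself uses 8x8 blocks).

```python
def zigzag_indices(n=8):
    """(row, col) pairs of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()      # even diagonals are walked bottom-left to top-right
        order.extend(diag)
    return order

def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    return [block[i][j] for i, j in zigzag_indices(len(block))]

print(zigzag_indices(8)[:6])  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
print(zigzag_scan([[1, 2, 6],
                   [3, 5, 7],
                   [4, 8, 9]]))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Low-frequency coefficients come first in this order, so the zeros produced by quantization tend to collect in one long run at the end.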
DPCM on DC component
• Let the DC coefficients of the first 5 image blocks be 150, 155, 149, 152, 144
• DPCM would produce 150, 5, -6, 3, -8
• DPCM is carried out on the entire image at once.
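The DPCM example above can be reproduced with a few lines. This sketch assumes the predictor for the first block is 0, so the first DC value is sent unchanged; the function names are made up for illustration.

```python
def dpcm_encode(values):
    """Send each value as the difference to its predecessor (first predictor is 0)."""
    out, prev = [], 0
    for v in values:
        out.append(v - prev)
        prev = v
    return out

def dpcm_decode(diffs):
    """Undo DPCM with a running sum."""
    out, prev = [], 0
    for d in diffs:
        prev += d
        out.append(prev)
    return out

dc = [150, 155, 149, 152, 144]
print(dpcm_encode(dc))               # [150, 5, -6, 3, -8]
print(dpcm_decode(dpcm_encode(dc)))  # [150, 155, 149, 152, 144]
```

The differences are small numbers, which the later entropy coding stage can represent with fewer bits than the raw DC values.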
RLE on AC component
DPCM and RLE
Huffman coding on DC
Huffman coding on AC
Entropy Encoding
• Animation brings life to video.
• Animation covers all changes that have a visual effect.
• It brings changes in time-varying position, shape, color, transparency, structure and texture of an object.
• Computer-based animation is performed with a computer having graphical tools.
Basic concepts
• Input Process:
- Drawings of the objects must be digitized
- This can be done through optical scanning, a data tablet or a drawing program
- Drawings must be filtered to clean up glitches
Decoding
• The de-quantized image (D) is obtained by multiplying the compressed image (C) element-wise with the quantization matrix (Q)
• D = C*Q
• The reconstructed image (R) is obtained with the help of the IDCT.
• R = T’DT
• where T is the DCT matrix
Coding and Decoding
Basic concepts
• Composition Stage
Variants of JPEG
• Sequential mode
• Each component of the image is encoded from left to right, top to bottom in a single scan.
• So far we have discussed this mode.
Variants of JPEG
• Progressive mode
• First generates low-quality versions of the image.
• Then progressively produces higher-quality versions.
• The user can stop the process once the quality is acceptable.
• DCT and quantization are similar to sequential JPEG.
Variants of JPEG
• Progressive mode
• Good for low-bandwidth communication channels
• Then, progressively pr
Variants of JPEG
• Hierarchical mode
• Encodes the image in a hierarchy of several different resolutions.
• The lowest resolution is a low-pass-filtered image
• Higher resolutions provide additional details.
• Progressively improves the quality, similar to progressive mode.
Variants of JPEG
• Lossless mode
• No loss in image quality.
• Uses only differential coding instead of transform coding.
• Rarely used as the compression ratio is very low.
Variants of JPEG
• In-between process:
- Animation needs the composition of frames with intermediate positions in between key frames.
• The easiest method is linear interpolation
• It takes the start and end positions.
• The intermediate positions computed with interpolation are unrealistic
Variants of JPEG (Lossless)
Variants of JPEG (Lossless)
• Uses a predictor for predicting the next pixel.
• The predictor can combine up to three neighboring pixels.
• The predictor can use any one of the seven schemes.
• The encoder encodes the difference between the predicted pixel value and the actual value using entropy encoding
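The seven prediction schemes of lossless JPEG combine the left neighbor A, the neighbor above B and the upper-left neighbor C. The sketch below lists them and computes the prediction residuals for a tiny image; treating neighbors outside the image as 0 is a simplification assumed here (the standard has special rules for the first row and column), and the function names are made up for illustration.

```python
def predict(a, b, c, scheme):
    """Lossless JPEG predictors: a = left, b = above, c = upper-left neighbor."""
    if scheme == 1:
        return a
    if scheme == 2:
        return b
    if scheme == 3:
        return c
    if scheme == 4:
        return a + b - c
    if scheme == 5:
        return a + (b - c) // 2
    if scheme == 6:
        return b + (a - c) // 2
    if scheme == 7:
        return (a + b) // 2
    raise ValueError("scheme must be 1..7")

def residuals(image, scheme):
    """Actual pixel minus predicted pixel; missing neighbors count as 0 (assumption)."""
    out = []
    for y, row in enumerate(image):
        out_row = []
        for x, pixel in enumerate(row):
            a = row[x - 1] if x > 0 else 0
            b = image[y - 1][x] if y > 0 else 0
            c = image[y - 1][x - 1] if x > 0 and y > 0 else 0
            out_row.append(pixel - predict(a, b, c, scheme))
        out.append(out_row)
    return out

image = [[100, 102],
         [101, 104]]
print(residuals(image, 4))  # [[100, 2], [1, 1]]
```

Only the residuals are entropy coded; because neighboring pixels are similar, most residuals are close to zero.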
Variants of JPEG
• Hierarchical
H.261
• Linear list notation: each event is described by specifying the starting and ending frame numbers and the event.
• 42, 53, B, ROTATE, “PALM”, 1, 30
• Means: between frames 42 and 53, rotate the object PALM about axis 1 by 30 degrees
MPEG
• General-purpose languages:
- This includes the languages created by embedding animation capability within general programming languages.
• E.g. ASAS has animation capability built on top of LISP.
• ASAS includes a wide range of geometric transformations that operate on objects.
MPEG
• MPEG-1: the original MPEG standard.
• Allows 320x240 at 30 frames per second.
• MPEG-2: a higher-resolution version of MPEG-1.
• Designed for television broadcast.
• MPEG-3: designed for HDTV.
• HDTV uses frames of higher resolution with a higher frame rate.
• MPEG-4: designed for low bit-rate
Video Encoding
• Color sub-sampling
Video Encoding
• The data stream contains further information like
• Aspect ratio, e.g. 1:1, 16:9, 4:3
• Refresh frequency (number of images per second), e.g. 23.976 Hz, 24 Hz, 25 Hz etc.
Steps in MPEG compression
• Resolution reduction: convert RGB to YUV
• Motion estimation: calculates a motion vector to find the next block.
• It helps to remove temporal redundancy
• Motion compensation and image subtraction
• DCT
• Quantization
MPEG
• JPEG exploits spatial redundancy.
• Spatial redundancy within a single image is very high
• because nearby pixels in the same image are the same or similar.
MPEG
• However, MPEG exploits temporal redundancy.
• A video has a number of frames that change very slowly.
• It means video has significant temporal redundancy.
• Thus, not every frame of a video needs to be coded.
• Instead, only the difference between the current frame and other frames can be coded,
• which results in significant compression
MPEG
• MPEG uses a predictive and compensating approach on difference frames to remove temporal redundancy
• and a transform method to remove spatial redundancy.
• MPEG codes the first frame with the JPEG method.
• This frame is taken as the reference frame, also known as the I-frame.
MPEG
• MPEG defines three types of frames
• I-frame
• P-frame
• B-frame
• A combination of these frames is called a group of pictures
Encoding of I-frame
• Each frame is first converted to the YUV domain with 4:2:0 chroma sampling.
• MPEG creates macroblocks from each frame
• This happens for the first frame also.
• The size of a macroblock is 16x16 for luminance and 8x8 for chrominance.
Encoding of I-frame
• Spatial redundancy is removed on each 8x8 block by converting it into the frequency domain with DCT.
• Quantization causes many frequency coefficients to become zero.
• Coefficients are organized in zigzag order to produce long runs of zeros.
• Then RLE and Huffman coding are applied to compress further.
Encoding of P-frame
• Coded with reference to the nearest previous I- or P-frame.
• This is called forward prediction.
• Like I-frames, P-frames can serve as a reference for B-frames.
• It provides more compression than an I-frame
Encoding of P-frame
• For encoding P-frames, motion-compensated prediction is used.
• It exploits temporal redundancy
• and accurately predicts the P-frame based on the reference frame.
• A search is conducted in the I-frame to find the macroblock which most closely matches the macroblock under consideration
• The difference between the two macroblocks is called the prediction error.
Encoding of P-frame
• This error is coded with DCT, which results in few high-frequency coefficients.
• Quantization allows the frame to be represented with few bits.
• The quantization matrix is different from that of the I-frame.
• The displacement of the best matching block is called the motion vector.
• Finally, RLE and Huffman coding is done.
Encoding of P-frame
• The best matching macroblock has the least mean squared error (MSE).
• Motion vectors are transmitted along with the error.
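The search for the best matching macroblock can be sketched as an exhaustive block-matching loop that minimizes the MSE. Everything here is an illustrative assumption: the function names, the small search range, and the 2x2 blocks standing in for 16x16 macroblocks. Real encoders typically use faster search strategies.

```python
def mse(block_a, block_b):
    """Mean squared error between two equally sized blocks."""
    n = len(block_a) * len(block_a[0])
    return sum((a - b) ** 2
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b)) / n

def crop(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]

def best_match(reference, target_block, top, left, search=2):
    """Try every displacement within +/- search pixels around (top, left).

    Returns (motion_vector, error), where motion_vector = (dy, dx).
    """
    size = len(target_block)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > len(reference) or x + size > len(reference[0]):
                continue  # candidate block would fall outside the reference frame
            err = mse(crop(reference, y, x, size), target_block)
            if best is None or err < best[0]:
                best = (err, (dy, dx))
    return best[1], best[0]

# Reference frame with unique pixel values; the target block at (2, 2) in the
# current frame is the content found at (3, 2) in the reference frame.
reference = [[r * 6 + c for c in range(6)] for r in range(6)]
target = crop(reference, 3, 2, 2)
print(best_match(reference, target, top=2, left=2))  # ((1, 0), 0.0)
```

The encoder then transmits the motion vector together with the (DCT-coded) difference between the target block and the matched block.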
Encoding of B-frame
• Bidirectional predictive frames use both past and future frames as reference frames.
• Thus, they have the highest compression
• but the computation time is the largest.
• B-frames are useful when there is no matching information in the previous reference frame.
Encoding of B-frame
• In the figure below, the target macroblock could not find a good match in the previous frame.
• Thus, it tries the next I- or P-frame.
• It means forward and backward prediction is performed.
Encoding of B-frame
• Each macroblock has two motion vectors.
• The error is calculated by subtracting the average of the two matching macroblocks.
MPEG pattern
• Determined at encoding time and specified in the header of the video.
• M indicates the interval between a P-frame and its preceding I- or P-frame.
• N indicates the interval between two consecutive I-frames.
• Since B-frames depend upon I- or P-frames on either side, the display order and the coding or transmission order are different
• M=3, N=9
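For M=3 and N=9 the frame pattern and the reordering for transmission can be sketched as follows. The function names and the frame labels (type plus display index) are made up for illustration; the reordering rule used is that every I- or P-frame is sent before the B-frames that precede it in display order, because those B-frames need it as a reference.

```python
def display_order(n, m, count):
    """Frame types in display order: I every n frames, P every m frames, else B."""
    frames = []
    for i in range(count):
        pos = i % n
        if pos == 0:
            kind = "I"
        elif pos % m == 0:
            kind = "P"
        else:
            kind = "B"
        frames.append(f"{kind}{i}")
    return frames

def coding_order(frames):
    """Move each anchor (I or P) in front of the B-frames that depend on it."""
    out, pending_b = [], []
    for frame in frames:
        if frame.startswith("B"):
            pending_b.append(frame)  # wait until the next anchor has been sent
        else:
            out.append(frame)
            out.extend(pending_b)
            pending_b = []
    out.extend(pending_b)            # trailing B-frames with no later anchor
    return out

shown = display_order(n=9, m=3, count=10)
print(shown)                # ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6', 'B7', 'B8', 'I9']
print(coding_order(shown))  # ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5', 'I9', 'B7', 'B8']
```

The decoder reverses this reordering, which is why it has to buffer reference frames before the B-frames between them can be displayed.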
Layers in Video stream
• Video sequence
Questions
• Why are the lower-right coefficients larger than the upper-left coefficients in the quantization matrix given below?
Questions
• Determine the Huffman code of each symbol.
• What is the entropy