VIDEO
COMPRESSION
Presented by
K. Vamsi Krishna
Introduction
o Compression converts an input data stream into another data stream of smaller
size.
o A video consists of a time-ordered sequence of frames, i.e., images.
o An obvious approach to video compression is predictive coding based on
previous frames: code only the difference from the previous frame.
o Better results come from searching the previous frame for the best-matching
parts of the image and subtracting those, rather than the co-located pixels.
Raw Data -> Encoder -> Compressed Data -> Network -> Compressed Data -> Decoder -> Uncompressed Data
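The predictive-coding idea above can be sketched as plain frame differencing. This is only an illustration, not H.261 itself; the 4 x 4 frames and the function names are made up for the example.

```python
def frame_difference(prev, curr):
    """Per-pixel prediction error: current frame minus previous frame."""
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, curr)]

def reconstruct(prev, diff):
    """Decoder side: previous frame plus the transmitted difference."""
    return [[p + d for p, d in zip(prow, drow)] for prow, drow in zip(prev, diff)]

prev_frame = [
    [10, 10, 10, 10],
    [10, 50, 50, 10],
    [10, 50, 50, 10],
    [10, 10, 10, 10],
]
# Same scene, but one pixel became brighter.
curr_frame = [row[:] for row in prev_frame]
curr_frame[1][1] = 60

diff = frame_difference(prev_frame, curr_frame)
nonzero = sum(1 for row in diff for v in row if v != 0)
print(nonzero)  # 1 -> only one value out of 16 needs real coding effort
assert reconstruct(prev_frame, diff) == curr_frame
```

When most of the scene is static, the difference frame is mostly zeros, which is what makes it cheap to code.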
Why compression & types:
o Reduces storage space.
o Reduces bandwidth.
o Lower communication cost.
o Enables new applications.
Types:
o Lossless compression (decompressed data is identical to the source).
o Lossy compression (decompressed data is not identical to the source).
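The two types can be contrasted with a toy example. The lossless half uses Python's standard zlib module; the lossy half is a deliberately crude quantizer (the step size of 8 is an arbitrary choice for illustration), not a real video codec.

```python
import zlib

data = bytes([12, 13, 12, 200, 201, 199, 12, 13] * 32)

# Lossless: decompression returns exactly the original bytes.
packed = zlib.compress(data)
restored = zlib.decompress(packed)
lossless_ok = restored == data

# Lossy (toy): keep only the quantization index, then rescale on decode.
STEP = 8
indices = [b // STEP for b in data]
approx = bytes(i * STEP for i in indices)
lossy_identical = approx == data
max_error = max(abs(a - b) for a, b in zip(approx, data))

print(lossless_ok)      # True
print(lossy_identical)  # False
print(max_error)        # 7, i.e. always smaller than STEP
```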
H.261
o H.261 is an early digital video compression standard. Its principle of
motion compensation-based compression is retained in all later video
compression standards.
o The standard was designed for videophone, video conferencing and other
audiovisual services over the Integrated Services Digital Network (ISDN).
o The video codec supports bit-rates of p × 64 kbps, where p ranges from 1 to
30 (hence it is also known as p × 64).
o Requires that the delay of the video encoder be less than 150 ms so that
the video can be used for real-time bidirectional video conferencing.
ITU Recommendations & H.261 Video Formats
H.261 belongs to the following set of ITU recommendations for visual telephony
systems:
o H.221: Frame structure for an audiovisual channel supporting 64 to 1,920 kbps.
o H.230: Frame control signals for audiovisual systems.
o H.242: Audiovisual communication protocols.
o H.261: Video encoder/decoder for audiovisual services at p × 64 kbps.
o H.320: Narrow-band audiovisual terminal equipment for p × 64 kbps
transmission.
Video Formats Supported by H.261
Video    Luminance (Brightness)   Chrominance (Color)   Bit-rate (Mbps, if 30 fps   H.261
Format   Image Resolution         Image Resolution      and uncompressed)           Support
QCIF     176 x 144                88 x 72               9.1                         Required
CIF      352 x 288                176 x 144             36.5                        Optional
o Luminance: a photometric measure of the luminous intensity per unit area of
light travelling in a given direction. In video, it carries the brightness of
the picture.
o Chrominance: the signal used in video systems to convey the color information
of the picture, separately from the accompanying luminance signal.
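The uncompressed bit-rates in the table above can be reproduced with a short calculation. This sketch assumes 8 bits per sample, one luminance plane plus two chrominance planes at half resolution in each direction, and 1 Mbps = 10^6 bits per second; the function name is illustrative.

```python
def raw_bitrate_mbps(luma_w, luma_h, fps=30, bits_per_sample=8):
    """Uncompressed bit-rate in Mbps for one Y plane plus Cb and Cr at half size."""
    luma_samples = luma_w * luma_h
    chroma_samples = 2 * (luma_w // 2) * (luma_h // 2)  # Cb + Cr
    return (luma_samples + chroma_samples) * bits_per_sample * fps / 1e6

for name, (w, h) in {"QCIF": (176, 144), "CIF": (352, 288)}.items():
    print(name, round(raw_bitrate_mbps(w, h), 1))
# QCIF 9.1
# CIF 36.5
```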
H.261 Video Bit stream
Hierarchy of four layers:
o Picture layer: The PSC (Picture Start Code) marks the boundary between
pictures. The TR (Temporal Reference) provides a time-stamp for the picture.
o GOB layer: H.261 pictures are divided into regions of 11 × 3 macroblocks,
each of which is called a Group of Blocks (GOB).
o Macroblock layer: Each Macroblock (MB) has its own Address indicating its
position within the GOB, Quantizer (MQuant), and six 8 × 8 image blocks (4
Y, 1 Cb, 1 Cr).
o Block layer: For each 8 × 8 block, the bit stream starts with DC value,
followed by pairs of length of zero-run (Run) and the subsequent non-zero
value (Level) for ACs, and finally the End of Block (EOB) code.
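The block-layer layout described above (DC value, then (Run, Level) pairs, then EOB) can be sketched symbolically. The actual variable-length codewords of H.261 are not reproduced here; the "EOB" marker and the function names are placeholders for illustration.

```python
EOB = "EOB"

def run_level_encode(coeffs):
    """Encode zigzag-ordered coefficients as DC, (run, level) pairs, then EOB."""
    symbols = [coeffs[0]]          # DC value comes first
    run = 0
    for value in coeffs[1:]:       # AC coefficients
        if value == 0:
            run += 1               # count zeros preceding the next non-zero value
        else:
            symbols.append((run, value))
            run = 0
    symbols.append(EOB)            # trailing zeros are implied by EOB
    return symbols

def run_level_decode(symbols, size=64):
    """Rebuild the full coefficient list from the symbol sequence."""
    coeffs = [symbols[0]]
    for sym in symbols[1:]:
        if sym == EOB:
            break
        run, level = sym
        coeffs.extend([0] * run + [level])
    return coeffs + [0] * (size - len(coeffs))

block = [45, 3, 0, 0, -2, 0, 1] + [0] * 57   # 64 coefficients in zigzag order
encoded = run_level_encode(block)
print(encoded)  # [45, (0, 3), (2, -2), (1, 1), 'EOB']
assert run_level_decode(encoded) == block
```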
Types of coded frames:
o I-frame – Intra-coded frame, coded independently of all other frames.
o P-frame – Predictively coded frame, coded based on a previously coded frame.
o B-frame – Bi-directionally predicted frame, coded based on both previous and
future coded frames.
H.261 Frame Sequence
Spatial & Temporal Redundancy
o Spatial redundancy: elements that are duplicated within a structure, such
as neighboring pixels with similar values in a still image.
o Temporal redundancy: pixels in two video frames that have the same values
in the same location.
Motion Compensation
o Each image is divided into macroblocks of size N × N.
o By default, N = 16 for luminance images. For chrominance images, N = 8.
o Motion compensation is performed at the macroblock level.
o The current image frame is referred to as the Target Frame.
o A match is sought between the macroblock in the Target Frame and the
most similar macroblock in previous and/or future frame(s), called the
Reference Frame(s).
o The displacement from the target macroblock to the matching reference
macroblock is called a motion vector (MV).
Motion Vector in Video Compression
o Motion Vector search is usually limited to a small immediate neighborhood of
both horizontal and vertical displacements in the range [−p, p].
o This makes a search window of size (2p +1) × (2p +1).
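A minimal sketch of an exhaustive search over the (2p + 1) × (2p + 1) window follows. The slides do not name a matching criterion, so the sum of absolute differences (SAD) is used here as an assumption; the function names and the small synthetic frames are illustrative only.

```python
def sad(target, reference, tx, ty, rx, ry, n):
    """Sum of absolute differences between two n x n blocks."""
    return sum(
        abs(target[ty + j][tx + i] - reference[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )

def full_search(target, reference, tx, ty, n=16, p=15):
    """Test every displacement (dx, dy) in [-p, p] x [-p, p]; return best MV and cost."""
    height, width = len(reference), len(reference[0])
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            rx, ry = tx + dx, ty + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block falls outside the reference frame
            cost = sad(target, reference, tx, ty, rx, ry, n)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost

W = H = 12
reference = [[(x * 7 + y * 13) % 251 for x in range(W)] for y in range(H)]
# Target frame: same content moved 2 pixels right and 1 pixel down (wrapping at edges).
target = [[reference[(y - 1) % H][(x - 2) % W] for x in range(W)] for y in range(H)]

mv, cost = full_search(target, reference, tx=4, ty=4, n=4, p=2)
print(mv, cost)  # (-2, -1) 0
```

The motion vector here points from the target block position to the best match in the reference frame, so content that moved right and down yields a negative displacement. The search visits up to (2p + 1)² candidates per macroblock, which is why practical encoders often use faster, non-exhaustive searches.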
Intra-frame (I-frame) Coding
o Macroblocks are of size 16 × 16 pixels for the Y frame, and 8 × 8 for the Cb
and Cr frames, since 4:2:0 chroma subsampling is employed. A macroblock
consists of four Y, one Cb, and one Cr 8 × 8 blocks.
o For each 8 × 8 block a DCT transform is applied; the DCT coefficients then go
through quantization, zigzag scan and entropy coding.
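The per-block pipeline (DCT, quantization, zigzag scan) can be sketched in pure Python. This is a simplified illustration: a single uniform quantizer step of 16 is assumed for every coefficient, which is not the exact H.261 quantizer, and entropy coding is omitted. All names are illustrative.

```python
import math

N = 8

def dct_2d(block):
    """Direct (slow) 8 x 8 two-dimensional DCT-II."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2 / N) * c(u) * c(v) * total
    return out

def quantize(coeffs, step):
    """Uniform quantization: divide by the step size and round."""
    return [[round(value / step) for value in row] for row in coeffs]

def zigzag_order(n=N):
    """Positions (row, col) in zigzag order: diagonals of constant row + col."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

flat_block = [[128] * N for _ in range(N)]          # uniform gray block
quantized = quantize(dct_2d(flat_block), step=16)
scanned = [quantized[r][c] for r, c in zigzag_order()]
print(scanned[:4])  # [64, 0, 0, 0] -> all energy sits in the DC coefficient
```

A flat block is the extreme case: after the zigzag scan, the single non-zero DC value is followed by 63 zeros, which run-length and entropy coding represent very compactly.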
Inter-frame (P-frame) Coding
o For each macroblock in the Target frame, a motion vector is assigned by a
motion vector search.
o After the prediction, a difference macroblock is derived to measure the
prediction error.
o Each of the 8 × 8 blocks of the difference macroblock goes through DCT,
quantization, zigzag scan and entropy coding.
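The difference macroblock can be sketched as the target block minus the motion-compensated block from the reference frame. The frames, block size and function names below are made up for illustration; the motion vector is given rather than searched for.

```python
def predict_block(reference, tx, ty, mv, n):
    """Fetch the n x n reference block displaced by the motion vector."""
    dx, dy = mv
    return [[reference[ty + dy + j][tx + dx + i] for i in range(n)] for j in range(n)]

def difference_block(target, reference, tx, ty, mv, n=16):
    """Prediction error: target block minus motion-compensated prediction."""
    pred = predict_block(reference, tx, ty, mv, n)
    return [[target[ty + j][tx + i] - pred[j][i] for i in range(n)] for j in range(n)]

def abs_sum(block):
    return sum(abs(v) for row in block for v in row)

SIZE = 8
reference = [[x * 7 + y * 13 for x in range(SIZE)] for y in range(SIZE)]
# Target: content moved 2 right and 1 down, and slightly brighter (+2).
target = [[reference[(y - 1) % SIZE][(x - 2) % SIZE] + 2 for x in range(SIZE)]
          for y in range(SIZE)]

with_mv = difference_block(target, reference, tx=2, ty=2, mv=(-2, -1), n=4)
without_mv = difference_block(target, reference, tx=2, ty=2, mv=(0, 0), n=4)
print(abs_sum(with_mv), abs_sum(without_mv))  # 32 400
```

With the correct motion vector only the small brightness change remains in the residual, so far fewer bits are needed after DCT and quantization.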
Video Encoder
Video Decoder
H.261 Enhancements
Goal:
o Real‐time, two‐way video communication
Key features:
o Low delay (150 ms)
o Low bit rates (p × 64 kb/s)
Technical details:
o Uses I- and P-frames (but no B-frames).
o Full-pixel motion estimation.
o Search range of ±15 pixels.
o Low-pass filter in the feedback loop.
H.263
o H.263 is an improved video coding standard for video conferencing and other
audiovisual services transmitted on Public Switched Telephone Networks
(PSTN).
o Aims at low bit-rate communication at less than 64 kbps.
o Uses predictive coding for inter-frames to reduce temporal redundancy and
transform coding for the remaining signal to reduce spatial redundancy (for
both intra-frames and inter-frame prediction).
Video Formats Supported by H.263
Video      Luminance (Brightness)   Chrominance (Color)   Bit-rate (Mbps, if 30 fps   Bit-rate (kbps,
Format     Image Resolution         Image Resolution      and uncompressed)           compressed)
Sub-QCIF   128 x 96                 64 x 48               4.4                         64
QCIF       176 x 144                88 x 72               9.1                         64
CIF        352 x 288                176 x 144             36.5                        256
4CIF       704 x 576                352 x 288             146.0                       512
16CIF      1408 x 1152              704 x 576             583.9                       1024
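The two bit-rate columns above imply a compression ratio for each format. The short calculation below only divides the table's own numbers (taking 1 Mbps = 1000 kbps); the variable names are illustrative.

```python
# (uncompressed Mbps, compressed kbps) taken from the table above
formats = {
    "Sub-QCIF": (4.4, 64),
    "QCIF": (9.1, 64),
    "CIF": (36.5, 256),
    "4CIF": (146.0, 512),
    "16CIF": (583.9, 1024),
}

ratios = {
    name: raw_mbps * 1000 / compressed_kbps
    for name, (raw_mbps, compressed_kbps) in formats.items()
}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.0f}:1")
```

For QCIF this gives roughly 142:1, which indicates how aggressive the lossy coding has to be at these bit-rates.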
H.263 Enhancements
Goal:
o Communication over conventional analog telephone lines (< 33.6 kb/s)
Enhancements to H.261:
o Reduced overhead information.
o Improved error resilience features.
o Algorithmic enhancements.
o Half-pixel motion estimation with larger motion search range.
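Half-pixel motion estimation needs sample values between integer pixel positions. A minimal sketch, assuming bilinear interpolation with round-to-nearest; the function name and the convention of passing coordinates in half-pixel units are illustrative, and the exact rounding rules of the standard are not reproduced here.

```python
def sample_half_pel(frame, x2, y2):
    """Sample the frame at (x2 / 2, y2 / 2); x2 and y2 are in half-pixel units."""
    x0, y0 = x2 // 2, y2 // 2
    x1 = x0 + (x2 % 2)   # right neighbor only when x2 is a half position
    y1 = y0 + (y2 % 2)   # lower neighbor only when y2 is a half position
    a, b = frame[y0][x0], frame[y0][x1]
    c, d = frame[y1][x0], frame[y1][x1]
    return (a + b + c + d + 2) // 4   # average of the (up to four) neighbors

frame = [
    [10, 20],
    [30, 40],
]
print(sample_half_pel(frame, 0, 0))  # 10 (integer position)
print(sample_half_pel(frame, 1, 0))  # 15 (halfway between 10 and 20)
print(sample_half_pel(frame, 0, 1))  # 20 (halfway between 10 and 30)
print(sample_half_pel(frame, 1, 1))  # 25 (center of all four pixels)
```

A search over half-pixel positions evaluates such interpolated samples around the best full-pixel match, which gives a finer prediction than the full-pixel search used in H.261.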
Thank You