
Main Techniques and Performance of Each Compression

Run-length encoding (RLE) is a lossless data compression technique that replaces consecutive repeating symbols with a symbol and a count. It can be applied one-dimensionally to strings of symbols or two-dimensionally to images. RLE is efficient when the data contains many repeated symbols. It works by storing repeated data as a single instance of the data along with a count of the repetitions.

Uploaded by

Rizki Azka

Compression Techniques

• Entropy coding: uses statistical redundancy; lossless
• Source coding: uses semantic context; usually lossy
• Hybrid coding: combines the above methods

T.Sharon-A.Frank
Main Techniques and Performance of Each Compression
• Data compression is often called coding because its aim is
to find a shorter way of representing data.
• Encoding and decoding are used to mean compression and
decompression respectively.
• Some of the main compression algorithms:
1. Run-length coding
2. Quantisation
3. Statistical coding
4. Dictionary-based coding
5. Transform-based coding
6. Motion prediction.
1. Run-length coding
• The idea of run-length coding is to replace consecutively
repeated symbols in a source with a code pair that
consists of either the repeating symbol and the number of
its occurrences, or a sequence of non-repeating symbols.

• Example
• String ABBBBBBBCC can be represented by Ar7Br2C,
where r7 and r2 mean 7 and 2 occurrences
respectively.

All the symbols are represented by an 8-bit ASCII codeword.
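
The r-notation in the example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes the source never contains the marker letter r or digits, since the notation would otherwise be ambiguous.

```python
import re
from itertools import groupby

def rle_encode(text):
    """Write each run as r<count><symbol>; single symbols are copied unchanged."""
    out = []
    for symbol, group in groupby(text):
        n = len(list(group))
        out.append(symbol if n == 1 else f"r{n}{symbol}")
    return "".join(out)

def rle_decode(code):
    """Invert rle_encode (assumes 'r' and digits never occur as data symbols)."""
    pairs = re.findall(r"(?:r(\d+))?(.)", code)
    return "".join(symbol * int(n) if n else symbol for n, symbol in pairs)

encoded = rle_encode("ABBBBBBBCC")
print(encoded)              # Ar7Br2C
print(rle_decode(encoded))  # ABBBBBBBCC
```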


2. Quantisation
• The basic idea of quantisation is to apply a certain
computation to a set of data in order to achieve an
approximation in a simpler form.
Example
Consider storing a set of integers (7, 223, 15, 28, 64, 37, 145). Let x
be an integer in the set. We have 7 ≤ x ≤ 223. Since 0 ≤ x ≤ 255 and
2^8 = 256, 8 binary bits are needed to represent each integer above.

However, if we use a number, say 16, as a common divisor, divide
each integer by it and round the result to the nearest integer,
the above set becomes (0, 14, 1, 2, 4, 2, 9) after applying the
computation round(x / 16). Now each integer can be stored in 4 bits,
since the maximum number 14 is less than 2^4 = 16.
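
The arithmetic above can be checked with a short Python sketch. The helper names are hypothetical, and rounding is written as (x + 8) // 16 so that ties would round up instead of following Python's round-half-to-even rule. Multiplying back by 16 shows why the method is lossy: only an approximation of each value is recovered.

```python
values = [7, 223, 15, 28, 64, 37, 145]
STEP = 16  # the common divisor used in the example

def quantise(x, step=STEP):
    # Divide by step and round to the nearest integer (ties round up).
    return (x + step // 2) // step

def dequantise(q, step=STEP):
    # Best available reconstruction: the multiple of step nearest to x.
    return q * step

codes = [quantise(x) for x in values]
restored = [dequantise(q) for q in codes]

print(codes)                    # [0, 14, 1, 2, 4, 2, 9]
print(max(codes).bit_length())  # 4 bits are enough for every code
print(restored)                 # [0, 224, 16, 32, 64, 32, 144]
```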
3. Statistical coding

The idea of statistical coding is to use statistical
information to replace a fixed-size code of symbols
by a, hopefully, shorter variable-sized code.
Example
We can code the more frequently occurring
symbols with fewer bits. The statistical information
can be obtained by simply counting the frequency
of each character in a file. Alternatively, we can
simply use the probability of each character.
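
The slides do not name a particular statistical coder. As one well-known instance, the sketch below counts character frequencies and builds a Huffman code, so that frequent symbols receive shorter codewords. The function name and the sample string are illustrative assumptions.

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Return {symbol: bit string}, built from character frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap items: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

text = "aaaabbc"
code = huffman_code(text)
total_bits = sum(len(code[ch]) for ch in text)
print(code)
print(total_bits, "bits instead of", 8 * len(text))
```

With this input the most frequent symbol gets a 1-bit codeword and the other two get 2 bits each, for 10 bits in total against 56 bits of 8-bit ASCII.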
4. Dictionary-based coding
• The dictionary approach consists of the following
main steps:
1. read the file
2. find the frequently occurring sequences of
symbols (FOSSs)
3. build up a dictionary of these FOSSs
4. associate each sequence with an index (usually
a fixed length code)
5. replace the FOSS occurrences with the indices.
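
The five steps can be sketched as follows, under the simplifying assumption that a "sequence of symbols" is a whole word and that "frequent" means appearing at least min_count times. Both choices, and all function names, are hypothetical.

```python
from collections import Counter

def build_dictionary(words, min_count=2):
    # Steps 2-4: find frequent words and give each one an index.
    counts = Counter(words)
    frequent = [w for w, c in counts.most_common() if c >= min_count]
    return {w: i for i, w in enumerate(frequent)}

def encode(words, dictionary):
    # Step 5: replace dictionary words with their index, keep the rest.
    return [dictionary.get(w, w) for w in words]

def decode(tokens, dictionary):
    reverse = {i: w for w, i in dictionary.items()}
    return [reverse[t] if isinstance(t, int) else t for t in tokens]

words = "the cat and the dog and the bird".split()  # step 1: read the input
dictionary = build_dictionary(words)
tokens = encode(words, dictionary)
print(dictionary)  # {'the': 0, 'and': 1}
print(tokens)      # [0, 'cat', 1, 0, 'dog', 1, 0, 'bird']
```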
Dictionary-Based Compression: Example
• Consider the Random House Dictionary of the English Language, Second edition,
Unabridged. Using this dictionary, the string:
A good example of how dictionary based compression works
can be coded as:
1/1 822/3 674/4 1343/60 928/75 550/32 173/46 421/2
• Coding:
– Uses the dictionary as a simple lookup table
– Each word is coded as x/y, where, x gives the page in the dictionary and y gives
the number of the word on that page.
– The dictionary has 2,200 pages with fewer than 256 entries per page. Therefore x
requires 12 bits and y requires 8 bits, i.e., 20 bits per word (2.5 bytes per word).
– Using ASCII coding, the above string requires 48 bytes (one per letter, not counting spaces), whereas our encoding
requires only 20 (= 2.5 × 8) bytes: a saving of more than 50%.
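
The bit counts can be verified by packing the eight x/y pairs listed above into bytes. This is a sketch only: the pack and unpack helpers are hypothetical, and the page and word numbers are taken from the slide as given.

```python
import math

codes = [(1, 1), (822, 3), (674, 4), (1343, 60),
         (928, 75), (550, 32), (173, 46), (421, 2)]

PAGE_BITS = math.ceil(math.log2(2200))  # 12 bits cover 2,200 pages
WORD_BITS = 8                           # fewer than 256 entries per page

def pack(pairs):
    value = 0
    for page, word in pairs:
        value = (value << (PAGE_BITS + WORD_BITS)) | (page << WORD_BITS) | word
    total_bits = len(pairs) * (PAGE_BITS + WORD_BITS)
    return value.to_bytes(math.ceil(total_bits / 8), "big")

def unpack(data, count):
    value = int.from_bytes(data, "big")
    pairs = []
    for _ in range(count):
        word = value & ((1 << WORD_BITS) - 1)
        page = (value >> WORD_BITS) & ((1 << PAGE_BITS) - 1)
        pairs.append((page, word))
        value >>= PAGE_BITS + WORD_BITS
    return pairs[::-1]  # pairs were read back to front

packed = pack(codes)
print(PAGE_BITS, len(packed))  # 12 20
```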

5. Transform-based coding
• The transform-based approach models data by
mathematical functions, usually periodic functions such
as cos(x), and applies mathematical rules primarily to diffuse
the data.
• The idea is to change a mathematical quantity such as a
sequence of numbers to another form with useful features.
• It is used mainly in lossy compression algorithms involving
the following activities:
• analysing the signal (sound, picture etc.)
• decomposing it into frequency components
• making use of the limitations of human perception.
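
As an illustration of decomposing a signal into cosine components, the sketch below uses an unnormalised discrete cosine transform (DCT-II) and its inverse. The slides do not name a specific transform, so the DCT, the sample signal and the threshold of 1.0 are all assumptions made for this example.

```python
import math

def dct(x):
    """Unnormalised DCT-II: one coefficient per cosine frequency k."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]

def idct(coeffs):
    """Inverse of dct() above."""
    n = len(coeffs)
    return [(coeffs[0] + 2 * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
                                 for k in range(1, n))) / n
            for i in range(n)]

signal = [10, 11, 12, 12, 11, 10, 10, 11]
coeffs = dct(signal)

# Lossy step: discard small coefficients, then rebuild an approximation.
kept = [c if abs(c) >= 1.0 else 0.0 for c in coeffs]
approx = idct(kept)

print([round(c, 2) for c in coeffs])
print([round(v, 2) for v in approx])
```

For a smooth signal most of the energy lands in a few low-frequency coefficients, so the discarded ones change the reconstruction only slightly.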
6. Motion prediction

• Motion prediction techniques are lossy
compression techniques for sound and moving images.
• Here we replace objects (say, an 8 × 8 block of
pixels) in frames with references to the same
object (at a slightly different position) in the
previous frame.
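
One way to find such a reference is exhaustive block matching: try every small offset and keep the one whose block in the previous frame differs least from the current block. The sketch below uses the sum of absolute differences (SAD) on tiny 6 × 6 frames with a 2 × 2 block. The function names, frame contents and search range are illustrative assumptions, not part of the slides.

```python
def sad(prev, cur, bx, by, dx, dy, n):
    """Sum of absolute differences between the current block and a shifted block."""
    return sum(abs(cur[by + r][bx + c] - prev[by + dy + r][bx + dx + c])
               for r in range(n) for c in range(n))

def best_vector(prev, cur, bx, by, n, search=2):
    """Return (cost, dx, dy) of the best match within +/- search pixels."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + n <= height and
                      0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue
            cost = sad(prev, cur, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

def frame_with_block(x, y, size=6, n=2):
    frame = [[0] * size for _ in range(size)]
    for r in range(n):
        for c in range(n):
            frame[y + r][x + c] = 9
    return frame

prev_frame = frame_with_block(1, 1)  # object at column 1
cur_frame = frame_with_block(2, 1)   # same object, one pixel to the right

print(best_vector(prev_frame, cur_frame, bx=2, by=1, n=2))  # (0, -1, 0)
```

An encoder would store the vector (-1, 0) in place of the block's pixels. When the best cost is not zero, the leftover difference is either coded separately or discarded, which is where the loss comes from.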
Compression problems
• Data compression is an algorithmic problem.
• Different types of data call for different compression algorithms.
• Two classes of compression problems are of interest (Davisson and
Gray 1976):
• Distortion-rate problem
Given a constraint on transmitted data rate or storage capacity, the problem is
to compress the source at, or below, this rate but at the highest fidelity
possible. Compression in areas of voice mail, digital cellular mobile radio and
video conferencing are examples of the distortion-rate problems.
• Rate-distortion problem
Given the requirement to achieve a certain pre-specified fidelity, the problem
is to meet the requirements with as few bits per second as possible.
Compression in areas of CD-quality audio and motion-picture-quality video are
examples of rate-distortion problems.
Algorithmic solutions
• To perform data compression, it is necessary to
analyze the characteristics of the data to be
compressed and identify patterns that allow a
compact representation.
• There are various data modeling and
representation techniques, which are at the core
of compression techniques.
• There is no one-size-fits-all solution to data
compression problems.
Compression and decompression
• A compression algorithm is only useful if a matching
decompression algorithm is also provided.
• The term compression algorithm therefore covers both the
compression algorithm and the decompression algorithm.
• In many cases, the efficiency of the decompression
algorithm is of more concern than that of the
compression algorithm. For example, movies, photos,
and audio data are often compressed once by the artist
and then decompressed many times by millions of
viewers.
• Compression efficiency is sometimes more important,
for example when programs record audio or video files
directly to computer storage.
Compression performance
• The criteria for measuring the performance of
compression algorithms depend on what our priority
concerns are.
• One criterion is the effect produced by compression (i.e. the difference between the
size of the input file before compression and the size of
the output after compression).
• The easiest way to measure the effect of compression is
to use:
• Compression ratio: the ratio of the size after
compression to the size before compression.
• Compression factor: the reciprocal of the
compression ratio, i.e. the size before
compression divided by the size after compression.
Example
A source image file (pixels 256 × 256)
with 65,536 bytes is compressed into a
file with 16,384 bytes. The compression
ratio is 1/4 and the compression factor
is 4. The saving percentage is: 75%
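
The three figures in the example can be reproduced directly. The function names are hypothetical, and the saving percentage is taken here to mean the size reduction as a share of the original size.

```python
def compression_ratio(before, after):
    return after / before

def compression_factor(before, after):
    return before / after

def saving_percentage(before, after):
    return 100 * (before - after) / before

before, after = 65_536, 16_384  # bytes, from the 256 × 256 image example

print(compression_ratio(before, after))   # 0.25
print(compression_factor(before, after))  # 4.0
print(saving_percentage(before, after))   # 75.0
```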
Data compression implies sending or storing a smaller
number of bits. Although many methods are used for this
purpose, in general these methods can be divided into two
broad categories: lossless and lossy methods.

Figure 15.1 Data compression methods


15-1 LOSSLESS COMPRESSION

In lossless data compression, the integrity of the data is
preserved. The original data and the data after
compression and decompression are exactly the same
because, in these methods, the compression and
decompression algorithms are exact inverses of each
other: no part of the data is lost in the process.
Redundant data is removed in compression and added
during decompression. Lossless compression methods
are normally used when we cannot afford to lose any
data.

Run-length encoding

Run-length encoding is probably the simplest method of
compression. It can be used to compress data made of any
combination of symbols. It does not need to know the
frequency of occurrence of symbols and can be very efficient
if data is represented as 0s and 1s.
The general idea behind this method is to replace
consecutive repeating occurrences of a symbol by one
occurrence of the symbol followed by the number of
occurrences.
The method can be even more efficient if the data uses
only two symbols (for example 0 and 1) in its bit pattern and
one symbol is more frequent than the other.
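
With only two symbols the symbol itself need not be stored, because runs must alternate. The sketch below shows one common variant: a list of run lengths that always starts with a run of 0s (of length zero if the data begins with 1). It is not necessarily the exact scheme of Figure 15.3, and the function names are hypothetical.

```python
from itertools import groupby

def encode_bits(bits):
    """Return alternating run lengths, starting with a run of '0'."""
    runs = [(symbol, len(list(group))) for symbol, group in groupby(bits)]
    lengths = [n for _, n in runs]
    if runs and runs[0][0] == "1":
        lengths.insert(0, 0)  # the data starts with 1s: record an empty 0-run
    return lengths

def decode_bits(lengths):
    # Even positions are runs of 0s, odd positions are runs of 1s.
    return "".join(("0" if i % 2 == 0 else "1") * n
                   for i, n in enumerate(lengths))

print(encode_bits("000000110000"))  # [6, 2, 4]
print(encode_bits("1100"))          # [0, 2, 2]
print(decode_bits([6, 2, 4]))       # 000000110000
```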

Figure 15.2 Run-length encoding example
Figure 15.3 Run-length encoding for two symbols
Run Length Encoding (RLE)

• Generalization of zero suppression.
• Runs (sequences) of data are stored as a single value
and count, rather than as the original run.
• Example:
– WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWW
WWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW
– Becomes: 12WB12W3B24WB14W
• To avoid confusion, use flags + appearance counter
• Example: ABCCCCCCCCDEFGGG
– Becomes: ABC!8DEFGGG ! is flag
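
Both notations above can be reproduced with a short sketch. The function names are hypothetical, and the threshold of four repetitions for the flag variant is an assumption inferred from the example, where GGG is left unencoded because a flagged run already costs three characters.

```python
from itertools import groupby

def rle_counts(text):
    """Count-then-symbol form; the count is omitted for single symbols."""
    out = []
    for symbol, group in groupby(text):
        n = len(list(group))
        out.append((str(n) if n > 1 else "") + symbol)
    return "".join(out)

def rle_flag(text, flag="!", min_run=4):
    """Symbol, flag, count for long runs; shorter runs are copied unchanged."""
    out = []
    for symbol, group in groupby(text):
        n = len(list(group))
        out.append(f"{symbol}{flag}{n}" if n >= min_run else symbol * n)
    return "".join(out)

pixels = "W" * 12 + "B" + "W" * 12 + "BBB" + "W" * 24 + "B" + "W" * 14
print(rle_counts(pixels))            # 12WB12W3B24WB14W
print(rle_flag("ABCCCCCCCCDEFGGG"))  # ABC!8DEFGGG
```

The flag form only stays unambiguous if the flag character never occurs in the data, and the count form only if the data contains no digits.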
One-dimensional RLE

10'0'
2'0' 6'1' 2'0'
3'0' 4'1' 3'0'
3'0' 4'1' 3'0'
3'0' 4'1' 3'0'
4'0' 2'1' 4'0'
4'0' 2'1' 4'0'
4'0' 2'1' 4'0'
1'0' 8'1' 1'0'
10'0'

Two-dimensional RLE

10'0'
2'0' 6'1' 2'0'
3'0' 4'1' 3'0'
AGAIN
AGAIN
4'0' 2'1' 4'0'
AGAIN
AGAIN
1'0' 8'1' 1'0'
10'0'
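
The listing above can be reproduced by encoding each row on its own and then replacing any row whose code equals that of the previous row with AGAIN. The 10 × 10 bitmap below is reconstructed from the run lengths on the slide, and the function names are hypothetical.

```python
from itertools import groupby

def encode_row(row):
    """One-dimensional RLE of a single row, in the slide's count'symbol' form."""
    return " ".join(f"{len(list(group))}'{symbol}'" for symbol, group in groupby(row))

def encode_image(rows):
    """Two-dimensional RLE: a row identical to the previous one becomes AGAIN."""
    out, previous = [], None
    for row in rows:
        line = encode_row(row)
        out.append("AGAIN" if line == previous else line)
        previous = line
    return out

image = (["0000000000", "0011111100"]
         + ["0001111000"] * 3
         + ["0000110000"] * 3
         + ["0111111110", "0000000000"])

for line in encode_image(image):
    print(line)
```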

Practice links
https://www.mathcelebrity.com/runlencode.php

https://www.csfieldguide.org.nz/en/interactives/run-length-encoding/
