
Huffman Coding for Data Compression

Huffman coding is a method of data compression that uses variable-length codes to represent characters. Frequently used characters are represented by shorter bit sequences than less common characters. The document describes how Huffman coding assigns bit codes by combining nodes from a frequency table into a binary tree, then traversing the tree to assign codes.

Uploaded by

vijay1vijay2147

Huffman Coding
• Huffman codes can be used to compress information
• The basic idea is that instead of storing each character in a file as an 8-bit ASCII value, we store the more frequently occurring characters using fewer bits and the less frequently occurring characters using more bits
Huffman Coding
• We then pick the two nodes with the smallest frequencies and combine them to form a new node whose frequency is the sum of the two
• The two selected nodes are removed from the set and replaced by the combined node
• This continues until only 1 node is left in the set
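
The combining loop described above can be sketched with a priority queue. This is a minimal sketch, not the slides' own implementation: the function name `build_huffman_tree` and the tuple layout for nodes are assumptions made for illustration, and the input is assumed to be non-empty. When several nodes tie on frequency, the heap may pick a different pair than the slides do, so the tree shape can differ even though the total encoded length is the same.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_tree(text):
    """Return the root of a Huffman tree built from a non-empty string.

    Hypothetical node layout (an assumption for this sketch):
    a leaf is (weight, symbol); an internal node is (weight, left, right).
    """
    tiebreak = count()  # unique counter so the heap never compares node tuples
    heap = [(w, next(tiebreak), (w, sym)) for sym, w in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Remove the two nodes with the smallest frequencies ...
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # ... and put the combined node back into the set.
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (w1 + w2, a, b)))
    return heap[0][2]


root = build_huffman_tree("duke blue devils")
print(root[0])  # 16, the total number of characters
```

The root's weight always equals the length of the input, which gives a quick sanity check on the construction.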
Huffman Coding
Example string: duke blue devils

Character frequencies:

d = 2
u = 2
k = 1
e = 3
b = 1
l = 2
v = 1
i = 1
s = 1
space (sp) = 2

Initial set of nodes, one per character:

e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 i,1 s,1
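
The frequency count for the example string can be reproduced in a few lines. This is a small sketch using Python's standard `collections.Counter`; the `sp` label for the space character follows the slides, and the variable names are chosen only for illustration.

```python
from collections import Counter

text = "duke blue devils"
freq = Counter(text)  # maps each character to its number of occurrences

# Print the counts, most frequent first, in the slides' "symbol,count" style.
for symbol, n in freq.most_common():
    label = "sp" if symbol == " " else symbol
    print(f"{label},{n}")
```

The ordering among characters with equal counts may differ from the slides; only the counts themselves matter for building the tree.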
Huffman Coding
i,1 and s,1 are combined into a new node of weight 2 (children shown in parentheses):

e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 2(i,1 s,1)
Huffman Coding
b,1 and v,1 are combined into a new node of weight 2 (children shown in parentheses):

e,3 d,2 u,2 l,2 sp,2 k,1 2(b,1 v,1) 2(i,1 s,1)


Huffman Coding
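k,1 and the node 2(b,1 v,1) are combined into a new node of weight 3 (children shown in parentheses):

e,3 d,2 u,2 l,2 sp,2 3(k,1 2(b,1 v,1)) 2(i,1 s,1)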
Huffman Coding
l,2 and sp,2 are combined into a new node of weight 4 (children shown in parentheses):

e,3 d,2 u,2 4(l,2 sp,2) 3(k,1 2(b,1 v,1)) 2(i,1 s,1)
Huffman Coding
d,2 and u,2 are combined into a new node of weight 4 (children shown in parentheses):

e,3 4(d,2 u,2) 4(l,2 sp,2) 3(k,1 2(b,1 v,1)) 2(i,1 s,1)
Huffman Coding
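The nodes of weight 2 (i, s) and weight 3 (k, b, v) are combined into a new node of weight 5 (children shown in parentheses):

e,3 4(d,2 u,2) 4(l,2 sp,2) 5(2(i,1 s,1) 3(k,1 2(b,1 v,1)))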
Huffman Coding
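e,3 and the node 4(d,2 u,2) are combined into a new node of weight 7 (children shown in parentheses):

7(e,3 4(d,2 u,2)) 4(l,2 sp,2) 5(2(i,1 s,1) 3(k,1 2(b,1 v,1)))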
Huffman Coding
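The node 4(l,2 sp,2) and the node of weight 5 are combined into a new node of weight 9 (children shown in parentheses):

7(e,3 4(d,2 u,2)) 9(4(l,2 sp,2) 5(2(i,1 s,1) 3(k,1 2(b,1 v,1))))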
Huffman Coding
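The last two nodes, 7 and 9, are combined into the root of weight 16. Only 1 node is left, so the tree is complete (children are indented under their parent, left child first):

16
  7
    e,3
    4
      d,2
      u,2
  9
    4
      l,2
      sp,2
    5
      2
        i,1
        s,1
      3
        k,1
        2
          b,1
          v,1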
Huffman Coding
• Now we assign codes to the tree by placing a 0 on every left branch and a 1 on every right branch
• A traversal of the tree from root to leaf gives the Huffman code for that particular leaf character
• Note that no code is the prefix of another code
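
The code assignment can be sketched as a recursive traversal. In this sketch the finished example tree is written by hand as nested (left, right) pairs with the weights dropped; that representation and the names `TREE`, `assign_codes`, and `is_prefix_free` are assumptions for illustration, not part of the slides.

```python
# The example tree as nested (left, right) pairs; leaves are symbol strings.
# Weights are omitted because only the shape determines the codes.
TREE = (
    ("e", ("d", "u")),
    (("l", "sp"), (("i", "s"), ("k", ("b", "v")))),
)


def assign_codes(node, prefix=""):
    """Walk from the root: append 0 for a left branch, 1 for a right branch."""
    if isinstance(node, str):  # reached a leaf, so the path so far is its code
        return {node: prefix}
    left, right = node
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes


def is_prefix_free(codes):
    """Check that no code is a prefix of another.

    After sorting, a code that is a prefix of any other code is also a
    prefix of its immediate successor, so adjacent pairs are enough.
    """
    words = sorted(codes.values())
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


codes = assign_codes(TREE)
print(codes["e"], codes["k"], codes["v"])  # 00 1110 11111
print(is_prefix_free(codes))               # True
```

The prefix property holds automatically because every character sits at a leaf: a path to one leaf can never continue on to another.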
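Huffman Coding
The finished tree with the code of each leaf (0 for a left branch, 1 for a right branch; children are indented under their parent, left child first):

16
  7
    e,3          00
    4
      d,2        010
      u,2        011
  9
    4
      l,2        100
      sp,2       101
    5
      2
        i,1      1100
        s,1      1101
      3
        k,1      1110
        2
          b,1    11110
          v,1    11111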
Huffman Coding
• These codes are then used to encode the string
• Thus, “duke blue devils” turns into:
010 011 1110 00 101 11110 100 011 00 101 010 00 11111 1100 100 1101

• When grouped into 8-bit bytes (x marks unused padding bits in the last byte):

01001111 10001011 11101000 11001010 10001111 11100100 1101xxxx

• Thus it takes 7 bytes of space (52 bits, padded to a whole number of bytes), compared to 16 characters * 1 byte/char = 16 bytes uncompressed
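
The encoding and the size comparison can be checked with a short sketch. The code table is copied from the example; the helper names `encode`, `pack`, and `decode` are hypothetical. Padding the last byte with zeros is an assumption made here, since the slides leave those bits unspecified.

```python
# Code table from the example (the space character stands for "sp").
CODES = {"e": "00", "d": "010", "u": "011", "l": "100", " ": "101",
         "i": "1100", "s": "1101", "k": "1110", "b": "11110", "v": "11111"}


def encode(text, codes):
    """Concatenate the code of every character into one bit string."""
    return "".join(codes[ch] for ch in text)


def pack(bits):
    """Group the bits into 8-bit bytes, zero-padding the last byte (assumed)."""
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def decode(bits, codes):
    """Read bits until they match a code; the prefix property makes this unambiguous."""
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return "".join(out)


text = "duke blue devils"
bits = encode(text, CODES)
packed = pack(bits)
print(len(bits), len(packed), len(text))  # 52 7 16
print(decode(bits, CODES) == text)        # True
```

Note that `decode` works on the unpadded bit string. A real file format would also need to record the bit count (or an end marker) and the code table itself, so the practical saving is smaller than 16 bytes versus 7.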
