ICT - Module 1 Lecture 1
Module 1
By Dr Akriti Nigam
Department of Computer Science & Engineering
BIT, Mesra
References
Thomas M. Cover and Joy A. Thomas, "Elements of Information Theory".
https://cs-114.org/wp-content/uploads/2015/01/Elements_of_Information_Theory_Elements.pdf
Stefan M. Moser, "Information Theory (Lecture Notes)".
Text Book
Ranjan Bose, "Information Theory, Coding and Cryptography", McGraw-Hill, 2nd Edition.
Course Assessment
Quiz 1: 10 marks
Quiz 2: 10 marks
Assessment: 5 marks
Claude Shannon: Father of Digital Communications
"Claude Shannon's creation in the 1940's of the subject of information theory is arguably one of the great intellectual achievements of the twentieth century."
(Bell Labs, Computing and Mathematical Sciences Research)
Link to the article:
https://people.math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf
Quotes about Shannon
Information Theory
Two Main Concepts: ENTROPY, CAPACITY.
Three Main Axes for Shannon Theory:
Measurement of Information.
Source Coding Theory.
Channel Coding Theory.
Information Sources
Information Content
SAME Information Source, DIFFERENT Information Content
Fundamental Theorems
C: channel capacity.
Probability-Based Measure of Information
As $p_k$ decreases, the uncertainty increases.
The occurrence of the event corresponds to some gain in information. BUT HOW MUCH?
$$I(x_k) = \log\!\left(\frac{1}{p_k}\right) = -\log(p_k)$$
The unit depends on the base of the logarithm: bits (base 2), nats (base $e$), Hartleys (base 10).
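A minimal Python sketch of this formula (the function name and example probabilities are illustrative, not from the lecture); changing the logarithm base switches the unit.

```python
import math

def self_information(p_k: float, base: float = 2.0) -> float:
    """I(x_k) = -log(p_k); base 2 gives bits, base e gives nats, base 10 gives Hartleys."""
    if not 0.0 < p_k <= 1.0:
        raise ValueError("p_k must lie in (0, 1]")
    return -math.log(p_k, base)

print(self_information(0.5))           # 1.0 bit: a fair coin flip
print(self_information(0.125))         # 3.0 bits: a rarer event carries more information
print(self_information(0.5, math.e))   # ~0.693 nats
print(self_information(0.5, 10.0))     # ~0.301 Hartleys
```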
Self Information
Example 1:
Example 2:
Self Information
Let X be a binary random variable with P(X = 1) = p; its entropy is often denoted H_b(p).
[Figure: H_b(p) plotted against p, for p between 0 and 1.]
The uncertainty (information) is greatest when p = 0.5.
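For reference, the standard binary entropy function that the figure plots (stated here as a reconstruction, base-2 logarithm assumed):
$$H_b(p) = -p\,\log_2(p) - (1 - p)\,\log_2(1 - p), \qquad 0 \le p \le 1,$$
with $H_b(0) = H_b(1) = 0$ and a maximum of $H_b(0.5) = 1$ bit.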
Entropy: Three properties
$$H(X) = \sum_{k=1}^{K} p_k \,\log\!\left(\frac{1}{p_k}\right) = -\sum_{k=1}^{K} p_k \,\log(p_k)$$
Theorem:
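A minimal Python sketch of the entropy formula above; the example distributions are illustrative choices, not from the lecture.

```python
import math

def entropy(pmf, base: float = 2.0) -> float:
    """H(X) = -sum_k p_k * log(p_k), with terms where p_k = 0 treated as 0."""
    return -sum(p * math.log(p, base) for p in pmf if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0 bits: uniform over 4 symbols
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits: skewed, hence less uncertain
print(entropy([1.0, 0.0, 0.0, 0.0]))       # 0.0 bits: a certain outcome carries no information
```

Note that the uniform distribution attains log2(4) = 2 bits, the largest value possible for four symbols.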
Joint Entropy and Conditional Entropy
Chain rule: H(X, Y) = H(X) + H(Y|X). The number of bits needed to describe X and Y is the sum of the number of bits needed to describe X and that needed to describe Y once X is known.
The conditional entropy measures how much entropy a random variable Y has remaining if we have already learned the value of a second random variable X. Equivalently, it is the expected number of bits needed to describe Y when X is known to both the encoder and the decoder. It is referred to as the entropy of Y conditional on X, and is written H(Y|X).
H(Y|X) ≤ H(Y): knowledge of X never increases entropy and, except when it is irrelevant (X and Y independent), it always lowers entropy.
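A minimal Python sketch of these statements, using an illustrative 2x2 joint pmf (the numbers are hypothetical): it computes H(Y|X) directly from p(x, y), and checks both the chain rule H(X, Y) = H(X) + H(Y|X) and the bound H(Y|X) ≤ H(Y).

```python
import math

# Illustrative joint pmf p(x, y) for X, Y taking values in {0, 1}.
p_xy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

# Marginals p(x) and p(y).
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

H_xy = -sum(p * math.log2(p) for p in p_xy.values() if p > 0)
H_x  = -sum(p * math.log2(p) for p in p_x.values() if p > 0)
H_y  = -sum(p * math.log2(p) for p in p_y.values() if p > 0)

# H(Y|X) = -sum_{x,y} p(x, y) * log p(y|x), where p(y|x) = p(x, y) / p(x).
H_y_given_x = -sum(p * math.log2(p / p_x[x]) for (x, y), p in p_xy.items() if p > 0)

print(H_y_given_x <= H_y)                     # True: conditioning never increases entropy
print(math.isclose(H_xy, H_x + H_y_given_x))  # True: chain rule H(X,Y) = H(X) + H(Y|X)
```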
Relative Entropy and Mutual Information
• The relative entropy (Kullback–Leibler distance) between two probability mass functions p(x) and q(x) is defined as
$$D(p\,\|\,q) = \sum_{x \in \mathcal{X}} p(x)\,\log\frac{p(x)}{q(x)}$$
• In this definition, we use the conventions that 0 log(0/0) = 0, 0 log(0/q) = 0, and p log(p/0) = ∞. Thus, if there is any symbol x ∈ X such that p(x) > 0 and q(x) = 0, then D(p||q) = ∞.
• Relative entropy is always nonnegative and is zero if and only if p = q.
• Consider two random variables X and Y with a joint probability mass function p(x, y) and marginal probability mass functions p(x) and p(y). The mutual information I(X; Y) is the relative entropy between the joint distribution and the product distribution p(x)p(y):
$$I(X;Y) = \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)} = D\big(p(x,y)\,\|\,p(x)\,p(y)\big)$$
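A minimal Python sketch of both quantities (the joint pmf is an illustrative, hypothetical example). The KL routine follows the conventions above, returning infinity when some p(x) > 0 but q(x) = 0; mutual information is then D(p(x, y) || p(x)p(y)).

```python
import math

def kl_divergence(p, q):
    """D(p||q) = sum_x p(x) * log2(p(x)/q(x)), with 0*log(0/q) = 0 and p*log(p/0) = inf."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0:
            continue                 # convention: 0 log(0/q) = 0
        if qx == 0:
            return float("inf")      # convention: p log(p/0) = infinity
        total += px * math.log2(px / qx)
    return total

# Illustrative joint pmf over (x, y) in {0, 1} x {0, 1}, flattened in (x, y) order.
p_joint   = [0.5, 0.25, 0.125, 0.125]
p_x       = [0.75, 0.25]                            # marginal of X
p_y       = [0.625, 0.375]                          # marginal of Y
p_product = [px * py for px in p_x for py in p_y]   # p(x) * p(y), same (x, y) order

print(kl_divergence(p_joint, p_product))    # I(X; Y) ~= 0.016 bits: X and Y are weakly dependent
print(kl_divergence(p_x, [0.5, 0.5]))       # D(p||q) > 0 since p != q
```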
Relationship Between Entropy and Mutual Information
• We can rewrite the definition of mutual information I(X; Y) as
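A sketch of that rewrite, following the standard derivation (as in Cover & Thomas), expanding with p(x, y) = p(y) p(x|y):
$$
\begin{aligned}
I(X;Y) &= \sum_{x,y} p(x,y)\,\log\frac{p(x,y)}{p(x)\,p(y)} \\
       &= \sum_{x,y} p(x,y)\,\log\frac{p(x\mid y)}{p(x)} \\
       &= -\sum_{x,y} p(x,y)\,\log p(x) \;+\; \sum_{x,y} p(x,y)\,\log p(x\mid y) \\
       &= H(X) - H(X\mid Y).
\end{aligned}
$$
Thus mutual information is the reduction in the uncertainty of X due to the knowledge of Y; by symmetry, I(X; Y) = H(Y) - H(Y|X) = H(X) + H(Y) - H(X, Y).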