Chebyshev's Inequality Explained

The document discusses Chebyshev's Inequality, providing a theorem and proof that relates the probability of a random variable deviating from its mean to its variance. It also covers the Law of Large Numbers, including both the Weak and Strong Laws, and their implications for independent random variables. Additionally, it presents examples and corollaries related to these statistical concepts.

Statistics

A Virtual Lecture Facilitated by

J. N. Onyeka-Ubaka (Ph.D)
jonyeka-ubaka@[Link] +2348059839937
Chebyshev's Inequality
Theorem 1: Let X be a random variable (discrete or
continuous) with mean μ and variance σ². Then for any
positive number k we have

P{|X − μ| ≥ k} ≤ σ²/k²,

or equivalently,

P{|X − μ| ≥ kσ} ≤ 1/k².
Proof
To show that P{|X − μ| ≥ k} ≤ σ²/k², first let X be a non-negative
random variable such that E(X) = μ < ∞.
Define another random variable Y as follows:

Y = 0 if X < k,  Y = k if X ≥ k.

This new variable Y is discrete, taking the two values 0 and k. The
probability distribution of Y can be written thus:

y        : 0         k
P(Y = y) : P(X < k)  P(X ≥ k)

Hence, E(Y) = 0·P(X < k) + k·P(X ≥ k),
which gives E(Y) = kP(X ≥ k).
Since X ≥ Y for all possible values, we have
E(X) ≥ E(Y) = kP(X ≥ k),
thus,
Proof Cont’d

P(X ≥ k) ≤ E(X)/k.    (1)

Inequality (1) is called the Markov inequality and can be generalized
thus: for any j > 0, k > 0,

P{|W| ≥ k} ≤ E|W|^j / k^j.    (2)

The proof of Chebyshev’s inequality is an immediate consequence of
(2): putting j = 2 and W = X − μ, we obtain

P{|X − μ| ≥ k} ≤ E(X − μ)²/k² = σ²/k².

This completes the proof.
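The Markov inequality (1) is easy to check numerically. The sketch below is a minimal illustration, assuming Python with only the standard library; the exponential distribution is an arbitrary choice of non-negative random variable, not part of the lecture:

```python
import random

random.seed(0)

# Empirical check of Markov's inequality P(X >= k) <= E(X)/k
# for a non-negative random variable X. Here X ~ Exponential
# with mean 2.0 (an arbitrary illustrative choice).
n = 100_000
mean = 2.0
samples = [random.expovariate(1 / mean) for _ in range(n)]

for k in (1.0, 2.0, 5.0):
    empirical = sum(x >= k for x in samples) / n
    bound = mean / k
    assert empirical <= bound
    print(f"k={k}: P(X >= k) ~ {empirical:.4f} <= E(X)/k = {bound:.4f}")
```

The bound is loose (for the exponential the exact tail is e^(−k/2)), which is typical of Markov-type inequalities: they trade sharpness for generality.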

Note that from (1), if we replace X by (X − μ)² and k by k², we have

P{(X − μ)² ≥ k²} ≤ E(X − μ)²/k² = σ²/k².

Proof Cont’d

Since (X − μ)² ≥ k² ⇔ |X − μ| ≥ k, we have

P{|X − μ| ≥ k} = P{(X − μ)² ≥ k²} ≤ σ²/k².

Thus,

P{|X − μ| ≥ k} ≤ σ²/k².
Example 1
Let X be a random variable having a Poisson distribution with mean λ
and variance σ² = λ. Use Chebyshev’s inequality to show that:
(i) P{|X − λ| ≥ 1} ≤ λ
(ii) P{X > 3λ/2} ≤ 4/λ
Solution
Using Chebyshev’s inequality with μ = λ and σ² = λ, we have
(i) P{|X − λ| ≥ k} ≤ λ/k².
Putting k = 1, we obtain P{|X − λ| ≥ 1} ≤ λ.
Solution to Example 1 Cont’d
(ii) To show P{X > 3λ/2} ≤ 4/λ, find the k for which the Chebyshev
bound equals 4/λ:

σ²/k² = 4/λ  ⇒  λ/k² = 4/λ  ⇒  λ² = 4k²  ⇒  k² = λ²/4  ⇒  k = λ/2.

Thus, P{|X − λ| ≥ λ/2} ≤ 4/λ.

Since |X − λ| ≥ λ/2 if and only if X − λ ≤ −λ/2 or X − λ ≥ λ/2,
Solution to Example 1 Cont’d
we see that

P{|X − λ| ≥ λ/2} = P{X − λ ≤ −λ/2} + P{X − λ ≥ λ/2}
                 = P{X ≤ λ/2} + P{X ≥ 3λ/2} ≤ 4/λ.

Since P{X ≤ λ/2} ≥ 0, we have

P{X > 3λ/2} ≤ P{X ≥ 3λ/2} ≤ 4/λ.
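The bound in part (ii) can be checked by simulation. The sketch below is a minimal illustration, assuming Python's standard library and Knuth's multiplication method for generating Poisson variates (both are illustrative choices, not part of the lecture); λ = 25 is arbitrary:

```python
import math
import random

random.seed(1)

def poisson(lam):
    # Knuth's multiplication method for a Poisson(lam) variate;
    # adequate for moderate lam, used here only for illustration.
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= L:
            return k
        k += 1

lam = 25
n = 50_000
samples = [poisson(lam) for _ in range(n)]

tail = sum(x > 1.5 * lam for x in samples) / n   # P{X > 3λ/2}
bound = 4 / lam                                  # Chebyshev bound 4/λ
assert tail <= bound
print(f"P(X > 3λ/2) ~ {tail:.4f} <= 4/λ = {bound:.4f}")
```

As with Markov's inequality, the Chebyshev bound is conservative: the true tail probability here is far below 4/λ.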
The Law of Large Numbers for
Bernoulli Trials
▪ Let X₁, X₂, . . . , Xₙ be n independent and identically distributed
Bernoulli random variables and let X = X₁ + X₂ + . . . + Xₙ be the
Binomial random variable (number of successes), with parameters
n and p. The mean and variance of X are np and npq, respectively.

▪ The mean np grows linearly with n, but the standard deviation √(npq)
grows only as √n. Using Chebyshev’s inequality,

P{|X − np| ≥ ε} ≤ npq/ε² ≤ n/(4ε²).
Theorem 2
Let X be the number of successes in n independent Bernoulli trials
with probability p of success. For any ε > 0,

P{|X/n − p| ≥ ε} ≤ pq/(nε²) ≤ 1/(4ε²n)

and

lim_{n→∞} P{|X/n − p| ≥ ε} = 0.

Proof
Applying Chebyshev’s inequality to X/n, we have

E(X/n) = p,  Var(X/n) = (1/n²)·npq = pq/n,

so

Var(X/n)/ε² = pq/(nε²) → 0 as n → ∞,

since pq = p(1 − p) ≤ 1/4 for all p, 0 ≤ p ≤ 1.
Hence,

P{|X/n − p| ≥ ε} → 0 as n → ∞.

This means that for large n, we can be almost certain that X/n
will be close to the probability p of success. This shows that the relative
frequency of success in independent Bernoulli trials converges (in a
probability sense) to the theoretical probability p of success at each
trial.
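The convergence in Theorem 2 can be observed directly. A minimal simulation sketch, assuming Python's standard library, with p = 0.3 and ε = 0.05 chosen arbitrarily:

```python
import random

random.seed(2)

p, eps = 0.3, 0.05
for n in (100, 1_000, 10_000):
    x = sum(random.random() < p for _ in range(n))   # successes in n trials
    freq = x / n                                     # relative frequency X/n
    bound = 1 / (4 * eps**2 * n)                     # bound on P{|X/n - p| >= eps}
    print(f"n={n:6d}  X/n={freq:.4f}  bound={bound:.4f}")
```

The bound 1/(4ε²n) shrinks like 1/n, so for large n the relative frequency X/n is almost certainly within ε of p.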
The Law of Large Numbers
Let X₁, X₂, . . . , Xₙ be n independent and identically distributed random variables with
E(Xᵢ²) < ∞ for all i, and let
Sₙ = X₁ + X₂ + . . . + Xₙ. We know
E(X₁) = E(X₂) = . . . = E(Xₙ),  E(Sₙ) = nE(Xᵢ)  and  Var(Sₙ) = nVar(Xᵢ), so

E(Sₙ/n) = (1/n)E(Sₙ) = (1/n)·nE(Xᵢ) = E(Xᵢ)

Var(Sₙ/n) = (1/n²)Var(Sₙ) = nVar(Xᵢ)/n² = Var(Xᵢ)/n.

This shows that the expectation of Sₙ/n is equal to the expectation of Xᵢ, and the standard
deviation of Sₙ/n is

√(Var(Xᵢ)/n) = Std(Xᵢ)/√n,

which tends to zero as n tends to infinity. Thus, the distribution of Sₙ/n becomes more and
more concentrated near E(Xᵢ).
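The 1/n decay of Var(Sₙ/n) can be verified empirically. A sketch assuming Python's standard library, with Xᵢ ~ Uniform(0, 1), so Var(Xᵢ) = 1/12, as an arbitrary illustrative choice:

```python
import random
import statistics

random.seed(3)

def sample_mean(n):
    # S_n / n for n iid Uniform(0, 1) draws
    return sum(random.random() for _ in range(n)) / n

for n in (10, 100, 1000):
    means = [sample_mean(n) for _ in range(2000)]
    empirical = statistics.pvariance(means)   # observed Var(S_n / n)
    theory = (1 / 12) / n                     # Var(X_i) / n
    print(f"n={n:5d}  empirical={empirical:.6f}  theory={theory:.6f}")
```

Each tenfold increase in n cuts the variance of Sₙ/n by a factor of ten, matching Var(Xᵢ)/n.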
Theorem 3
Theorem 3: A Weak Law of Large Numbers (WLLN) for
Independent Random Variables.
Let X₁, X₂, . . . , Xₙ be n independent and identically
distributed random variables with finite mean μ and variance σ².
Let Sₙ = X₁ + X₂ + . . . + Xₙ; then for any δ > 0,

lim_{n→∞} P{|Sₙ/n − μ| ≥ δ} = 0.

Proof
Applying Chebyshev’s inequality to Sₙ/n, we have

μ = E(Sₙ/n),  Var(Sₙ/n) = (1/n²)Var(Sₙ) = σ²/n,

so

P{|Sₙ/n − μ| ≥ δ} ≤ (σ²/n)/δ² = σ²/(nδ²) → 0 as n → ∞,

since σ² is a finite constant that does not depend on n. This shows that
under the above conditions, if n is a large number we can be quite sure
that Sₙ/n will be close to E(X).
The Strong Law of Large Numbers (SLLN)
The strong law of large numbers asserts that under certain
conditions, we can be quite sure that every one of

Sₙ/n, Sₙ₊₁/(n+1), Sₙ₊₂/(n+2), . . .

for large n, will be close to E(X).

Theorem 4: A SLLN for iid r.v.’s

Let X₁, X₂, . . . , Xₘ (m > n) be independent random variables having a
common distribution with mean μ and E(Xᵢ²) < ∞, and set
Sₙ = X₁ + X₂ + . . . + Xₙ. If δ > 0, then

P{|Sₖ/k − E(X)| ≥ δ for at least one k satisfying n ≤ k ≤ m} ≤ 2σ²/(δ²n),

where σ² = Var(Xᵢ).
Corollaries
Corollary I: Let f(x) be a continuous real function on [0, 1]. Then as n
→ ∞, uniformly with respect to p ∈ [0, 1],

E f(Sₙ/n) → f(p).

Corollary II: Weierstrass Approximation Theorem

Let f(x) be a continuous real function on the interval [0, 1] and let

Pₙ(p) = Σₖ₌₀ⁿ f(k/n) C(n, k) pᵏ(1 − p)ⁿ⁻ᵏ.

Then Pₙ(p) → f(p) as n → ∞ and the convergence is uniform in p.
The Pₙ(p) are called the Bernstein polynomials.
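Corollary II can be illustrated numerically. The sketch below assumes Python 3.8+ (for math.comb); the test function f(x) = |x − 1/2|, which is continuous but not differentiable at 1/2, and the evaluation grid are arbitrary illustrative choices:

```python
import math

def bernstein(f, n, p):
    # P_n(p) = sum_{k=0}^{n} f(k/n) * C(n, k) * p^k * (1-p)^(n-k)
    return sum(f(k / n) * math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n + 1))

f = lambda x: abs(x - 0.5)        # continuous on [0, 1], kink at 1/2
grid = [i / 50 for i in range(51)]

for n in (10, 100, 1000):
    err = max(abs(bernstein(f, n, p) - f(p)) for p in grid)
    print(f"n={n:5d}  max |P_n(p) - f(p)| = {err:.4f}")
```

The worst-case error over the grid shrinks as n grows, consistent with uniform convergence; for this non-smooth f the decay is on the order of 1/√n.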
Theorem 5
A WLLN for independent but not necessarily identical random
variables.
Let X₁, X₂, . . . , Xₙ be n independent random variables with
E(Xᵢ²) < ∞, let Sₙ = X₁ + X₂ + . . . + Xₙ and μₙ = E(Sₙ/n);
then for any ε > 0,

P{|Sₙ/n − μₙ| ≥ ε} ≤ (1/(ε²n²)){Var(X₁) + Var(X₂) + . . . + Var(Xₙ)}.

When Var(Xᵢ) ≤ M for all i, that is, when the variances are all bounded
by a constant which does not depend on n, then we have

P{|Sₙ/n − μₙ| ≥ ε} ≤ M/(ε²n) → 0 as n → ∞.
Assignment

Let X be a non-negative integer-valued random variable whose
probability generating function P_X(s) = E(s^X) is finite for all s.
Use Chebyshev’s inequality to verify the following inequalities:

(i) P(X ≤ k) ≤ P_X(s)/sᵏ, 0 < s ≤ 1, k any positive number;

(ii) P(X ≥ k) ≤ P_X(s)/sᵏ, s ≥ 1.
Questions and Answers

Thank you!
