Statistical Theory of Distributions & Inference

for
Statistics & Data Science Students
Demeke Lakew Workie (Associate Professor in Statistics)
BDU, College of Science, Statistics Program

Email: wadela1606@[Link]
Mobile: +251913185477

Theory of Distributions by Demeke L.


January, 2023
3/19/2024 (wadela1606@[Link]) BDU-Ethiopia
2 Common Univariate Distributions

2.1 Introduction
There are a number of specific distributions that are used over and over in practice and that possess the properties of a probability:
 mass function, for discrete random variables;
 density function, for continuous random variables.
There is a random experiment behind each of these distributions.
Since these random experiments model many real-life phenomena, these 'special' distributions are used frequently in different applications.
Recall that we have two types of distribution:

Discrete Distributions
 Discrete Uniform distribution
 Bernoulli distribution
 Binomial distribution
 Hypergeometric distribution
 Poisson distribution
 Negative Binomial (Pascal) distribution
 Geometric distribution

Continuous Distributions
 Uniform distribution
 Normal (Gaussian) distribution
 Exponential distribution
 Gamma distribution
 Beta distribution
 Chi-squared distribution
 Cauchy distribution
 Weibull distribution
 The t-distribution
 F-distribution

Thus, in this chapter the nature and applications of these distributions are discussed.

2.2 Discrete Distributions
1. Discrete Uniform Distribution: if a random variable X can take on N different values, each with equal probability, then X has a discrete uniform distribution, with pmf

p(x) = 1/N, for x = x1, x2, x3, …, xN.

If X is discrete uniform on {1, 2, …, N}, then

M_X(t) = Σ_{x=1}^{N} e^{tx} (1/N)

E(X) = Σ x p(x) = (1/N) Σ_{x=1}^{N} x = (N + 1)/2  &  E(X²) = Σ x² p(x) = (1/N) Σ_{x=1}^{N} x² = (N + 1)(2N + 1)/6

Thus,

V(X) = E(X²) − [E(X)]² = (N + 1)(2N + 1)/6 − [(N + 1)/2]² = (N² − 1)/12

Note that: M′_X(t = 0) = E(X) & M″_X(t = 0) = E(X²), so V(X) = M″_X(0) − [M′_X(0)]².
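As a quick check, the moment formulas above can be verified numerically by summing directly over the pmf; this is a minimal sketch, and the value N = 10 is an illustrative choice, not taken from the notes.

```python
# Numerically verify the discrete uniform moment formulas for x = 1..N.
N = 10
support = range(1, N + 1)
p = 1 / N                                 # equal probability for each value

E_X  = sum(x * p for x in support)        # should equal (N + 1) / 2
E_X2 = sum(x**2 * p for x in support)     # should equal (N + 1)(2N + 1) / 6
V_X  = E_X2 - E_X**2                      # should equal (N**2 - 1) / 12

assert abs(E_X - (N + 1) / 2) < 1e-12
assert abs(E_X2 - (N + 1) * (2 * N + 1) / 6) < 1e-12
assert abs(V_X - (N**2 - 1) / 12) < 1e-12
```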

2. Bernoulli Distribution: a Bernoulli random variable is a random variable that can take only two possible values, usually 0 = failure and 1 = success.
A random variable X is said to be a Bernoulli random variable with probability of success p if its pmf is

p(x) = p^x (1 − p)^{1−x}, for x = 0, 1.

If X is Bernoulli, then (writing q = 1 − p)

M_X(t) = Σ_{x=0}^{1} e^{tx} p^x (1 − p)^{1−x} = q + p e^t

E(X) = Σ_{x=0}^{1} x p^x (1 − p)^{1−x} = p  &  E(X²) = Σ_{x=0}^{1} x² p^x (1 − p)^{1−x} = p

Thus,

V(X) = E(X²) − [E(X)]² = p − p² = p(1 − p) = pq

Note that: M′_X(t = 0) = E(X) & M″_X(t = 0) = E(X²), so V(X) = M″_X(0) − [M′_X(0)]².

3. Binomial Distribution: a binomial experiment consists of n identical and independent Bernoulli trials; each trial results in one of two possible outcomes (success or failure), and the probability of success p is constant from trial to trial.
Therefore, the probability of a r.v. X having x successes out of the n trials is

p(x) = C(n, x) p^x (1 − p)^{n−x}, for x = 0, 1, 2, …, n, where n > 0 and 0 ≤ p ≤ 1,

and C(n, x) denotes the binomial coefficient.
Hence the binomial distribution reduces to the Bernoulli when n = 1.
If X is binomial, then (with q = 1 − p)

M_X(t) = Σ_{x=0}^{n} e^{tx} C(n, x) p^x (1 − p)^{n−x} = (q + p e^t)^n

E(X) = Σ_{x=0}^{n} x C(n, x) p^x (1 − p)^{n−x} = np  &  E(X²) = Σ_{x=0}^{n} x² C(n, x) p^x (1 − p)^{n−x} = n(n − 1)p² + np

Thus,

V(X) = E(X²) − [E(X)]² = n(n − 1)p² + np − (np)² = np(1 − p) = npq

Note that: M′_X(t = 0) = E(X) & M″_X(t = 0) = E(X²), so V(X) = M″_X(0) − [M′_X(0)]².
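The binomial moment formulas can likewise be checked by direct summation over the pmf; the parameter values n = 12, p = 0.3 below are illustrative choices, not from the notes.

```python
from math import comb

# Check E(X) = np and V(X) = npq for Binomial(n, p) by summing the pmf.
n, p = 12, 0.3
q = 1 - p
pmf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

E_X  = sum(x * pmf[x] for x in range(n + 1))
E_X2 = sum(x**2 * pmf[x] for x in range(n + 1))

assert abs(sum(pmf) - 1) < 1e-12                          # pmf sums to one
assert abs(E_X - n * p) < 1e-12                           # np = 3.6
assert abs(E_X2 - (n * (n - 1) * p**2 + n * p)) < 1e-12   # n(n-1)p² + np
assert abs((E_X2 - E_X**2) - n * p * q) < 1e-12           # npq = 2.52
```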

4. Poisson Distribution: a random variable X is defined to have a Poisson distribution if its pmf is

p(x) = e^{−λ} λ^x / x!, for x = 0, 1, 2, …

Here x is the number of times an event occurs in an interval, events occur independently, and the constant average rate at which they occur is λ.
If X is Poisson, then

M_X(t) = Σ_{x=0}^{∞} e^{tx} e^{−λ} λ^x / x! = e^{λ(e^t − 1)}

E(X) = Σ_{x=0}^{∞} x e^{−λ} λ^x / x! = λ  &  E(X²) = Σ_{x=0}^{∞} x² e^{−λ} λ^x / x! = λ(λ + 1)

Thus,

V(X) = E(X²) − [E(X)]² = λ(λ + 1) − λ² = λ

Note that: M′_X(t = 0) = E(X) & M″_X(t = 0) = E(X²), so V(X) = M″_X(0) − [M′_X(0)]².

Note: Poisson as an approximation to the binomial
The Poisson distribution can be viewed as the limit of the binomial distribution.
Suppose X ∼ Binomial(n, p) where n → ∞, p → 0 and λ = np stays fixed.
We show that the pmf of X can be approximated by the pmf of a Poisson(λ) random variable.

Let us state this as: let X ∼ Binomial(n, p = λ/n), where λ > 0 is fixed. Then for any x ∈ {0, 1, 2, …}, we have

lim_{n→∞} p(x) = e^{−λ} λ^x / x!
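This limit can be illustrated numerically by comparing the Binomial(n, λ/n) pmf with the Poisson(λ) pmf as n grows; the choices λ = 2 and the particular n values below are illustrative, not from the notes.

```python
from math import comb, exp, factorial

# Binomial(n, λ/n) pmf approaches the Poisson(λ) pmf as n grows.
lam = 2.0
poisson = lambda x: exp(-lam) * lam**x / factorial(x)

def binom(n, x):
    p = lam / n
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Maximum pmf discrepancy over x = 0..10 shrinks as n increases.
err = lambda n: max(abs(binom(n, x) - poisson(x)) for x in range(11))

assert err(1000) < err(100) < err(10)   # error decreases roughly like 1/n
assert err(100000) < 1e-4
```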

5. Hypergeometric Distribution:
 A lot consists of N items, of which M are defective and the remaining N − M are non-defective.
 A sample of n items is drawn randomly without replacement.
 The number of ways of selecting x defective items from the M defective items is C(M, x);
 the number of ways of selecting n − x non-defective items from the N − M non-defective items is C(N − M, n − x).
 The total number of ways of selecting n items with x defective and n − x non-defective items is C(M, x) C(N − M, n − x).
 Finally, the number of ways one can select n different items from a collection of N different items is C(N, n).
 The hypergeometric distribution is used when we are sampling without replacement from a finite population.
Thus, the random variable X is referred to as the hypergeometric random variable with parameters N, M and n, and the probability of observing x defective items in a sample of n items is

p(x) = C(M, x) C(N − M, n − x) / C(N, n), for x = 0, 1, 2, …, n.

Note that: one of the most common applications of the hypergeometric distribution is in industrial quality control, such as calculating probabilities for defective parts produced in a factory.
Thus, if X is a hypergeometric distribution, then

E(X) = Σ_{x=0}^{n} x C(M, x) C(N − M, n − x) / C(N, n) = nM/N  &  E[X(X − 1)] = Σ_{x=0}^{n} x(x − 1) p(x) = n(n − 1) M(M − 1) / [N(N − 1)],

so E(X²) = E[X(X − 1)] + E(X). Thus,

V(X) = E(X²) − [E(X)]² = n · (M/N) · ((N − M)/N) · ((N − n)/(N − 1))

Example: a certain electronic component ships in batches of 10. The quality-control procedure is to check 3 components in each batch and reject the batch if 1 or more are found to be defective. If a batch actually contains 2 defective components, what is the probability that it will be rejected?
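The example above can be solved directly from the hypergeometric pmf: the batch is accepted only if the sample contains zero defectives.

```python
from math import comb

# Batch-rejection example: N = 10 items, M = 2 defective, n = 3 inspected;
# the batch is rejected if 1 or more defectives appear in the sample.
N, M, n = 10, 2, 3

def hyper_pmf(x):
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

p_accept = hyper_pmf(0)          # all 3 inspected items are non-defective
p_reject = 1 - p_accept

assert abs(p_accept - comb(8, 3) / comb(10, 3)) < 1e-12   # 56/120 = 7/15
assert abs(p_reject - 8 / 15) < 1e-12                     # ≈ 0.5333
```

So the batch is rejected with probability 8/15 ≈ 0.53.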
Note that:
 The mean lies between 0 and n.
 The variance is affected by the sample size, the number of successes in the population and the population size (as n becomes large, approaching N, the variance becomes small).
 If we set M/N = p, then the E(X) of the hypergeometric distribution coincides with the E(X) of the binomial distribution, and the V(X) of the hypergeometric distribution is [(N − n)/(N − 1)] · npq.
 For the hypergeometric distribution, there is no simple closed-form expression for the MGF.

6. Geometric Distribution: a random variable X is said to be a geometric random variable with parameter p if its pmf is

p(x) = (1 − p)^x p, for x = 0, 1, 2, 3, …

 Here the 1st success occurs on the trial after the x-th failure, so we are waiting for a success.
 Thus P(X = x) = (1 − p)^x p: we need x failures, and the final trial must be a success.
 Hence the experiment consists of a sequence of independent Bernoulli trials, each of which results in either a success or a failure.
 That means the geometric distribution counts the number of failures before the first success.

If X is geometric, then (with q = 1 − p)

M_X(t) = Σ_{x=0}^{∞} e^{tx} p(1 − p)^x = p / (1 − q e^t), for q e^t < 1

E(X) = Σ_{x=0}^{∞} x p(1 − p)^x = q/p  &  E(X²) = Σ_{x=0}^{∞} x² p(1 − p)^x = q(1 + q)/p²

Thus,

V(X) = E(X²) − [E(X)]² = q(1 + q)/p² − (q/p)² = q/p²

Note that: M′_X(t = 0) = E(X) & M″_X(t = 0) = E(X²), so V(X) = M″_X(0) − [M′_X(0)]².
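The geometric moments can be checked by truncating the infinite series where the tail is negligible; p = 0.3 and the truncation point are illustrative choices, not from the notes.

```python
# Verify E(X) = q/p and V(X) = q/p² for the geometric pmf p(x) = q^x p.
p = 0.3
q = 1 - p
xs = range(0, 500)          # tail beyond x = 500 is on the order of q^500

E_X  = sum(x * q**x * p for x in xs)
E_X2 = sum(x**2 * q**x * p for x in xs)

assert abs(E_X - q / p) < 1e-9                    # q/p = 7/3
assert abs(E_X2 - q * (1 + q) / p**2) < 1e-9      # E(X²) = q(1+q)/p²
assert abs((E_X2 - E_X**2) - q / p**2) < 1e-9     # variance = q/p²
```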

7. Negative Binomial (Pascal) Distribution: the Pascal distribution is a generalized form of the geometric distribution, i.e., the geometric counts the number of failures before the first success while the Pascal counts the number of failures before the r-th success.
For example, with x = 3 failures and r = 2 successes we could have: FFFSS, FFSFS, FSFFS, SFFFS.
Thus, a random variable X is said to be a Pascal random variable with parameters r and p if its pmf is

p(x) = C(r + x − 1, x) p^r q^x = C(−r, x) p^r (−q)^x, for x = 0, 1, 2, 3, …; r = 1, 2, 3, … and 0 < p ≤ 1,

where r is the number of successes, x is the number of failures before the r-th success, and p is the probability of success.

If X is Pascal, then

M_X(t) = Σ_{x=0}^{∞} e^{tx} C(−r, x) p^r (−q)^x = [p / (1 − q e^t)]^r

E(X) = Σ_{x=0}^{∞} x C(−r, x) p^r (−q)^x = rq/p  &  E(X²) = Σ_{x=0}^{∞} x² C(−r, x) p^r (−q)^x = rq(1 + rq)/p²

Thus,

V(X) = E(X²) − [E(X)]² = rq(1 + rq)/p² − (rq/p)² = rq/p²

7. Negative Binomial (Pascal) Distribution (continued): the expected value and variance of the NB distribution can also be obtained from the geometric distribution, since a sum of r independent geometric random variables is a NB random variable.

Suppose Y1, …, Yr are independent with each Yi ∼ Geometric(p); then X = Y1 + … + Yr ∼ NB(r, p).
Thus,

E(X) = E(Y1) + … + E(Yr) = r E(Yi) = rq/p  and
V(X) = V(Y1) + … + V(Yr) = r V(Yi) = rq/p²

Finally, since the Yi are iid,

M_X(t) = M_{Y1+Y2+…+Yr}(t) = M_{Y1}(t) M_{Y2}(t) … M_{Yr}(t) = [p/(1 − q e^t)] · [p/(1 − q e^t)] ⋯ [p/(1 − q e^t)] = [p/(1 − q e^t)]^r

Note that: M′_X(t = 0) = E(X) & M″_X(t = 0) = E(X²), so V(X) = M″_X(0) − [M′_X(0)]².
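The sum-of-geometrics construction can be checked by simulation: sample r geometric variables (counting failures), sum them, and compare the empirical moments with rq/p and rq/p². The parameters, sample size and tolerances below are illustrative Monte Carlo choices, not from the notes.

```python
import random
from statistics import mean, variance

# Simulate X = Y1 + ... + Yr with Yi ~ Geometric(p), counting failures,
# and compare empirical moments with the NB(r, p) formulas.
random.seed(42)
r, p = 4, 0.4
q = 1 - p

def geometric(p):
    """Number of failures before the first success."""
    failures = 0
    while random.random() > p:   # each trial succeeds with probability p
        failures += 1
    return failures

samples = [sum(geometric(p) for _ in range(r)) for _ in range(200_000)]

assert abs(mean(samples) - r * q / p) < 0.05         # rq/p  = 6
assert abs(variance(samples) - r * q / p**2) < 0.3   # rq/p² = 15
```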
2.3 Continuous Distributions
In this section several parametric families of univariate probability density functions are presented.
1. Uniform Distribution: a continuous r.v. X is said to have a uniform distribution over the interval [a, b] if its pdf is

f(x) = 1/(b − a), a ≤ x ≤ b.

If X is uniform, then

M_X(t) = (e^{bt} − e^{at}) / [t(b − a)] for t ≠ 0 (with M_X(0) = 1), E(X) = (a + b)/2 & V(X) = (b − a)²/12

Note that:
If you differentiate the closed-form MGF and substitute t = 0 directly, both numerator and denominator vanish, giving the indeterminate form 0/0 rather than the moments. SHOW!

The moments are instead recovered by taking limits as t → 0, for example by expanding e^{bt} and e^{at} in Taylor series, which yields the familiar results E(X) = (a + b)/2 and V(X) = (b − a)²/12.
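The limiting behaviour near t = 0 can be seen numerically by finite differences of the closed-form MGF; the interval [2, 5] and the step h are illustrative choices, not from the notes.

```python
from math import exp

# Finite differences of the Uniform(a, b) MGF near t = 0 recover
# E(X) = (a+b)/2 and V(X) = (b-a)²/12, even though the closed form
# is a 0/0 expression at t = 0.
a, b = 2.0, 5.0

def M(t):
    # MGF; at t = 0 the defining integral gives M(0) = 1
    return 1.0 if t == 0 else (exp(b * t) - exp(a * t)) / (t * (b - a))

h = 1e-3
M1 = (M(h) - M(-h)) / (2 * h)            # central difference ≈ E(X)
M2 = (M(h) - 2 * M(0) + M(-h)) / h**2    # second difference ≈ E(X²)

assert abs(M1 - (a + b) / 2) < 1e-4                 # (a+b)/2 = 3.5
assert abs((M2 - M1**2) - (b - a)**2 / 12) < 1e-3   # (b-a)²/12 = 0.75
```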

2. Normal Distribution: a continuous r.v. X is said to have a normal distribution if its pdf is

f(x) = [1/(σ√(2π))] e^{−(1/2)((x − μ)/σ)²}, −∞ < x < ∞.

Note that:
 The CLT shows that the normal distribution can be used to approximate a large variety of distributions in large samples under some mild conditions.
 A large portion of statistical theory is built on the normal distribution.
 The normal distribution is often used to approximate other probability distributions, including discrete distributions.

If X is a normal distribution, then

E(X) = μ, V(X) = σ² & M_X(t) = e^{μt + σ²t²/2}

3. Standard Normal Distribution: if X ∼ N(μ, σ²), then Z = (X − μ)/σ has a N(0, 1) distribution, which is called the standard normal distribution.
Thus, the pdf of Z is given by

φ(z) = (1/√(2π)) e^{−z²/2}, −∞ < z < ∞.

Note that: to compute the probabilities associated with the normal distribution, we use the standard normal table.
We can show that M_Z(t) = E(e^{Zt}) = e^{t²/2}.
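In place of the standard normal table, probabilities can be computed by standardizing and using the identity Φ(z) = (1 + erf(z/√2))/2; the values μ = 50, σ = 10 and the interval below are illustrative choices, not from the notes.

```python
from math import erf, sqrt

# Normal probabilities via standardization and the error function.
mu, sigma = 50.0, 10.0

def Phi(z):
    """Standard normal cdf: Φ(z) = (1 + erf(z/√2)) / 2."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def P_between(lo, hi):
    """P(lo < X < hi) for X ~ N(mu, sigma²), by standardizing."""
    return Phi((hi - mu) / sigma) - Phi((lo - mu) / sigma)

assert abs(Phi(0.0) - 0.5) < 1e-12                 # symmetry about 0
assert abs(Phi(1.96) - 0.975) < 1e-3               # familiar table value
assert abs(P_between(40, 60) - (2 * Phi(1) - 1)) < 1e-12   # ±1σ interval
```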

4. Exponential Distribution: the exponential distribution is one of the most widely used continuous distributions; it is often used to model the time elapsed between events.

A continuous random variable X is said to have an exponential distribution with parameter λ > 0 if its pdf is

f(x) = λ e^{−λx}, x ≥ 0.

If X is exponential, then

E(X) = 1/λ, V(X) = 1/λ² & M_X(t) = λ/(λ − t), for t < λ.

Note that:
If X is exponential with parameter λ > 0, then X is a memoryless random variable, that is,

P(X > x + a | X > a) = P(X > x), for a, x ≥ 0,

since

P(X > x + a | X > a) = P(X > x + a, X > a)/P(X > a) = [1 − F(x + a)]/[1 − F(a)] = e^{−λ(x+a)} / e^{−λa} = e^{−λx} = P(X > x).
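The memoryless property can be confirmed directly from the survival function S(x) = P(X > x) = e^{−λx}; λ and the grid of a, x values below are illustrative choices.

```python
from math import exp

# Verify P(X > x + a | X > a) = P(X > x) for the exponential distribution.
lam = 0.5

def S(x):
    """Survival function P(X > x) for Exponential(lam)."""
    return exp(-lam * x)

for a in (0.0, 1.0, 3.7):
    for x in (0.5, 2.0, 10.0):
        conditional = S(x + a) / S(a)   # P(X > x + a | X > a)
        assert abs(conditional - S(x)) < 1e-12
```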

5. Gamma Distribution: a continuous random variable X is said to have a gamma distribution with parameters α > 0 and λ > 0, written X ∼ Gamma(α, λ), if its pdf is

f(x) = λ^α x^{α−1} e^{−λx} / Γ(α), x > 0,

where Γ(α) is the gamma function, the normalizing constant that ensures f(x) integrates to 1. In particular, if n ∈ {1, 2, 3, …}, then Γ(n) = (n − 1)!.
More generally, for any positive real number α, Γ(α) is defined as

Γ(α) = ∫₀^∞ y^{α−1} e^{−y} dy, α > 0.

Note that for α = 1 and α = ½ we can write

Γ(1) = ∫₀^∞ e^{−y} dy = 1 & Γ(½) = √π.

Also, using integration by parts it can be shown that Γ(α + 1) = αΓ(α), for α > 0.

If X is a gamma distribution, then E(X) = α/λ, V(X) = α/λ² & M_X(t) = [λ/(λ − t)]^α, for t < λ.
Note that:
The exponential distribution has been used as a model for the lifetimes of various things.
Recall that the Poisson distribution counts occurrences of an event in time, while the length of the time interval between successive events can be shown to have an exponential distribution, provided that the number of events in a fixed time interval has a Poisson distribution.
Also, if we again assume that the number of events in a fixed time interval is Poisson distributed, the length of time between time 0 and the instant when the r-th event occurs can be shown to have a gamma distribution.
So a gamma random variable can be thought of as a continuous waiting-time random variable, i.e., it is the time one has to wait for the r-th event.
If we let α = 1, we obtain an exponential distribution!
More generally, if you sum n independent Exponential(λ) random variables, then you will get a Gamma(n, λ) random variable.
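The sum-of-exponentials fact can be checked by simulation: summing n independent Exponential(λ) draws should give a variable with the Gamma(n, λ) moments E(X) = n/λ and V(X) = n/λ². The parameters, sample size and tolerances below are illustrative Monte Carlo choices, not from the notes.

```python
import random
from statistics import mean, variance

# Simulate sums of n independent Exponential(lam) variables and compare
# the empirical moments with the Gamma(n, lam) formulas.
random.seed(7)
n, lam = 3, 2.0

samples = [sum(random.expovariate(lam) for _ in range(n))
           for _ in range(200_000)]

assert abs(mean(samples) - n / lam) < 0.02         # n/λ  = 1.5
assert abs(variance(samples) - n / lam**2) < 0.02  # n/λ² = 0.75
```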
