Distributions

The document discusses various probability distributions including binomial, Poisson, normal, and chi-square distributions. It provides the probability mass functions or probability density functions that define each distribution. It also discusses how sampling distributions arise from taking samples from populations and computing statistics on the samples.

Uploaded by

Pratik Das

Statistical Inference

Class 2

Sugata Sen Roy


Department of Statistics,
University of Calcutta,
Calcutta, INDIA.

Probability Distributions : Binomial Distribution

- Suppose each trial has two possible outcomes: success and failure.
- Let there be $n$ independent trials.
- Let $\pi = P[\text{success in a given trial}]$, $0 \le \pi \le 1$.
- $\pi$ is constant for each trial.
- Let $X$ = number of successes in the $n$ trials.
- Then $X \sim \mathrm{Bin}(n, \pi)$, i.e. $X$ has the p.m.f.

$$f(x) = \binom{n}{x} \pi^{x} (1 - \pi)^{n-x}, \quad x = 0, 1, \ldots, n.$$

- $E(X) = n\pi$
- $Var(X) = n\pi(1 - \pi)$
- If $n = 1$ (a Bernoulli trial),

$$f(x) = \pi^{x} (1 - \pi)^{1-x}, \quad x = 0 \text{ or } 1.$$
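The binomial p.m.f. and its moments can be checked numerically. The sketch below is only an illustration; the values `n = 10` and `pi = 0.3` are assumptions chosen for the example, not taken from the slides.

```python
from math import comb

def binomial_pmf(x, n, pi):
    """P[X = x] for X ~ Bin(n, pi)."""
    return comb(n, x) * pi**x * (1 - pi) ** (n - x)

# Illustrative values (assumed, not from the slides).
n, pi = 10, 0.3
probs = [binomial_pmf(x, n, pi) for x in range(n + 1)]

total = sum(probs)                                    # should be 1
mean = sum(x * p for x, p in enumerate(probs))        # should be n * pi
variance = sum((x - mean) ** 2 * p for x, p in enumerate(probs))  # n * pi * (1 - pi)

print(total, mean, variance)
```

Up to floating-point rounding, the three printed numbers should agree with 1, $n\pi = 3$ and $n\pi(1-\pi) = 2.1$.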

Some similar Distributions

- Negative Binomial Distribution
  - Here $X$ is the number of trials needed to get $r$ (given) successes.

$$f(x) = \binom{x-1}{r-1} \pi^{r} (1 - \pi)^{x-r}, \quad x = r, r+1, \ldots$$

  - Geometric Distribution (special case when $r = 1$).
- Hypergeometric Distribution
  - Let $X$ be the number of S-cards when drawing $n$ cards, without replacement, from an urn having $N$ cards, $M$ of which are S-cards.

$$f(x) = \frac{\binom{M}{x} \binom{N-M}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, \ldots, \min(n, M).$$

- Inverse Hypergeometric Distribution
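As a rough check of the negative binomial and hypergeometric p.m.f.s, the sketch below sums each one over its support. All parameter values are assumptions made for illustration, and the negative binomial sum is truncated at 400 trials because its support is infinite.

```python
from math import comb

def neg_binomial_pmf(x, r, pi):
    """P[X = x], where X is the number of trials needed to get r successes."""
    return comb(x - 1, r - 1) * pi**r * (1 - pi) ** (x - r)

def hypergeometric_pmf(x, N, M, n):
    """P[X = x], where X is the number of S-cards among n cards drawn
    without replacement from N cards, M of which are S-cards."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Illustrative parameters (assumed, not from the slides).
r, pi = 3, 0.4
nb_total = sum(neg_binomial_pmf(x, r, pi) for x in range(r, 400))  # truncated sum

N, M, n = 20, 7, 5
hg_probs = [hypergeometric_pmf(x, N, M, n) for x in range(min(n, M) + 1)]
hg_total = sum(hg_probs)
hg_mean = sum(x * p for x, p in enumerate(hg_probs))  # equals n * M / N

# With r = 1 the negative binomial reduces to the geometric p.m.f.
geometric_at_4 = neg_binomial_pmf(4, 1, pi)  # pi * (1 - pi)**3

print(nb_total, hg_total, hg_mean, geometric_at_4)
```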


Probability Distributions : Poisson Distribution

- Let $X$ be the number of occurrences (count) of an event.
- Examples: $X$ can be the number of
  - particles emitted from a radioactive source in a given time
  - road accidents in a year on a given road
  - annual visits of a patient to a doctor
  - rivers in a country
- Then $X$ is $P(\lambda)$ if, for some $\lambda > 0$,

$$f(x) = \frac{e^{-\lambda} \lambda^{x}}{x!}, \quad x = 0, 1, 2, \ldots$$

- $E(X) = \lambda$
- $Var(X) = \lambda$
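The equality of mean and variance for the Poisson distribution can be seen numerically. This is a minimal sketch with an assumed rate `lam = 2.5`; the infinite support is truncated at 60, where the remaining probability is negligible for this rate.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P[X = x] for X ~ P(lam)."""
    return exp(-lam) * lam**x / factorial(x)

lam = 2.5  # illustrative rate (assumed, not from the slides)
pois_probs = [poisson_pmf(x, lam) for x in range(60)]  # truncated support

pois_total = sum(pois_probs)
pois_mean = sum(x * p for x, p in enumerate(pois_probs))
pois_var = sum((x - pois_mean) ** 2 * p for x, p in enumerate(pois_probs))

print(pois_total, pois_mean, pois_var)  # close to 1, lam, lam
```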

Probability Distributions : Normal Distribution

- Let $X$ be a continuous random variable (theoretically) taking values from $-\infty$ to $\infty$.
- Examples: height, weight, IQ, measurement error.
- $X$ is said to follow the normal distribution $N(\mu, \sigma^2)$ if the probability density function of $X$ is of the form

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$$

- $E(X) = \mu$, $-\infty < \mu < \infty$
- $Var(X) = \sigma^2$, $\sigma > 0$

Normal Distribution

Properties of the Normal Distribution

- Symmetric about $\mu$.
- Moderately peaked and thin tailed (mesokurtic).
- 99.73% of the values of $X$ lie within $[\mu - 3\sigma, \mu + 3\sigma]$.
- If $X \sim N(\mu, \sigma^2)$, then the random variable

$$\tau = \frac{X - \mu}{\sigma} \sim N(0, 1),$$

  i.e. a normal distribution with mean zero and variance 1.
- $\tau$ is known as the standard normal variable and its p.d.f. is

$$\phi(\tau) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{\tau^2}{2}}, \quad -\infty < \tau < \infty.$$
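The 99.73% figure can be reproduced by standardising and using the standard normal c.d.f., which the Python standard library exposes through `math.erf`. The values of `mu` and `sigma` below are assumptions for illustration; the result does not depend on them.

```python
from math import erf, sqrt

def std_normal_cdf(t):
    """P[tau <= t] for the standard normal variable tau ~ N(0, 1)."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P[a <= X <= b] for X ~ N(mu, sigma^2), computed by standardising."""
    return std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)

mu, sigma = 170.0, 8.0  # illustrative values (assumed, not from the slides)
within_3_sigma = normal_prob(mu - 3 * sigma, mu + 3 * sigma, mu, sigma)

print(round(100 * within_3_sigma, 2))  # 99.73
```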

Probability Distribution and Sampling Distribution

- The distributions that we just studied are referred to as probability (or theoretical) distributions.
- Probability distributions characterize the population.
- Next we take a sample from the population and compute a statistic.
- This statistic varies from sample to sample and hence is itself a random variable.
- The distribution of this statistic is referred to as its sampling distribution.

A Small Experiment

- Let a population be of size 5 with $X$-values ?, ?, ?, ?, ?.
- Our interest is in the unknown population mean $\mu = \frac{1}{5}$ (sum of the five numbers).
- Lack of time forces us to pick just two members randomly and study the sample mean $\bar{X} = \frac{X_1 + X_2}{2}$, hoping that it would be a good representation of $\mu$.
- I pick red and green and they are 2 and 8. Thus my $\bar{X} = (2 + 8)/2 = 5$.
- You pick blue and brown and they are 10 and 3. Thus your $\bar{X} = (10 + 3)/2 = 6.5$.
- Someone else picks green and purple and they are 8 and 12. Thus his/her $\bar{X} = (8 + 12)/2 = 10$.
- How many such samples are possible?

A Small Experiment (contd.)

- With Replacement (WR) Sampling, i.e. a unit chosen is replaced back in the population before the next choice. Here the same unit can be chosen several times.
  - This in our case will lead to $5 \times 5 = 25$ possible samples.
  - In general, for a sample of size $n$ from a population of size $N$, possible number of samples $= N^{n}$.
- Without Replacement (WOR) Sampling, i.e. a unit chosen is not replaced back in the population again. Here the sample has all distinct units.
  - This in our case will lead to $\binom{5}{2} = 10$ possible samples.
  - In general, for a sample of size $n$ from a population of size $N$, possible number of samples $= \binom{N}{n}$.
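The two counts can be confirmed by listing the samples. The sketch below assumes that the five population values are the ones revealed in the draws of the small experiment (red 2, green 8, blue 10, brown 3, purple 12); the slides do not list the population explicitly.

```python
from itertools import combinations, product
from math import comb

# Assumed population: the five values revealed in the example draws.
population = [2, 8, 10, 3, 12]
N, n = len(population), 2

# WR: ordered draws, the same unit may appear twice -> N**n samples.
wr_samples = list(product(population, repeat=n))

# WOR: unordered sets of distinct units -> C(N, n) samples.
wor_samples = list(combinations(population, n))

print(len(wr_samples), len(wor_samples))  # 25 10
```

Note that the WR count treats samples as ordered pairs, while the WOR count treats them as unordered sets, which matches the formulas $N^n$ and $\binom{N}{n}$.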

A Small Experiment (contd.)

- In either case we can get several possible samples and correspondingly several values of $\bar{X}$ (like 5, 6.5, 10, ... as in the above example).
- The distribution of these $\bar{X}$ values is referred to as the sampling distribution of $\bar{X}$.
- Observe that in practice we will get only one sample.
- But theoretically we can study this distribution and base our conclusions on it.
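The sampling distribution of $\bar{X}$ for the small experiment can be tabulated exactly under WOR sampling. As before, this sketch assumes the population consists of the five values revealed in the example draws (2, 8, 10, 3, 12), and it assumes each of the 10 samples is equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Assumed population: the five values revealed in the example draws.
population = [2, 8, 10, 3, 12]
mu = Fraction(sum(population), len(population))  # population mean

# One sample mean per WOR sample of size 2.
sample_means = [Fraction(a + b, 2) for a, b in combinations(population, 2)]

# Sampling distribution: each distinct value of the mean with its probability.
counts = Counter(sample_means)
sampling_dist = {m: Fraction(c, len(sample_means)) for m, c in sorted(counts.items())}

# Mean of the sampling distribution.
expected_xbar = sum(m * p for m, p in sampling_dist.items())

for m, p in sampling_dist.items():
    print(float(m), p)
print("mu =", mu, " E(Xbar) =", expected_xbar)
```

For this population the mean of the sampling distribution coincides with $\mu$, which is one sense in which $\bar{X}$ is a good representation of $\mu$.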

Sampling from Binomial Distribution

- Sampling distribution of the sum of successes in $n$ Bernoulli trials:
  - Suppose $X_1, X_2, \ldots, X_n$ is a sample from a Bernoulli experiment with probability of success $= \pi$ and probability of failure $= 1 - \pi$.
  - Let $S = \sum_{i=1}^{n} X_i$ = number of successes in the sample. Then $S \sim \mathrm{Bin}(n, \pi)$.
- In fact, if the $X_i$'s are independently and identically distributed (i.i.d.) as $\mathrm{Bin}(m, \pi)$,

$$S = \sum_{i=1}^{n} X_i \sim \mathrm{Bin}(nm, \pi).$$
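The result $S \sim \mathrm{Bin}(n, \pi)$ can be illustrated by building the p.m.f. of the sum through repeated convolution of the Bernoulli p.m.f. This is a sketch under assumed values `pi = 0.3` and `n = 6`; the helper `convolve` is written for this example only.

```python
from math import comb

def convolve(p, q):
    """p.m.f. of the sum of two independent non-negative integer variables,
    given their p.m.f.s as lists indexed by value."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

pi, n = 0.3, 6  # illustrative values (assumed, not from the slides)
bernoulli = [1 - pi, pi]  # P[X = 0], P[X = 1]

# p.m.f. of S = X_1 + ... + X_n, starting from a point mass at 0.
pmf_sum = [1.0]
for _ in range(n):
    pmf_sum = convolve(pmf_sum, bernoulli)

binomial = [comb(n, x) * pi**x * (1 - pi) ** (n - x) for x in range(n + 1)]
max_gap = max(abs(a - b) for a, b in zip(pmf_sum, binomial))

print(max_gap)  # difference is at the level of floating-point rounding
```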

Sampling from a Poisson Distribution

- Sampling distribution of the sum of Poisson variables:
  - Suppose $X_1, X_2, \ldots, X_n$ is a sample from a Poisson distribution, $P(\lambda)$.
  - Let $S = \sum_{i=1}^{n} X_i$ = total count in the sample.
  - Then $S \sim P(n\lambda)$.
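A short derivation for $n = 2$ shows where the result comes from, assuming $X_1$ and $X_2$ are independent $P(\lambda)$ variables; the general case then follows by adding one variable at a time.

```latex
\begin{aligned}
P[X_1 + X_2 = s]
  &= \sum_{k=0}^{s} P[X_1 = k]\, P[X_2 = s - k] \\
  &= \sum_{k=0}^{s} \frac{e^{-\lambda}\lambda^{k}}{k!} \cdot
     \frac{e^{-\lambda}\lambda^{s-k}}{(s-k)!} \\
  &= \frac{e^{-2\lambda}\lambda^{s}}{s!} \sum_{k=0}^{s} \binom{s}{k} \\
  &= \frac{e^{-2\lambda}(2\lambda)^{s}}{s!}, \qquad s = 0, 1, 2, \ldots
\end{aligned}
```

The last step uses $\sum_{k=0}^{s} \binom{s}{k} = 2^{s}$, so $X_1 + X_2 \sim P(2\lambda)$.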

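Sampling from a $N(\mu, \sigma^2)$ Distribution

- Let $X_1, X_2, \ldots, X_n$ be a sample from $N(\mu, \sigma^2)$ (i.i.d.).
- The $n$ observations are independently drawn.
- Then $S = \sum_{i=1}^{n} X_i \sim N(n\mu, n\sigma^2)$.
- Also, the sample mean is

$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i = \frac{S}{n} \sim N\left(\mu, \frac{\sigma^2}{n}\right).$$

- What about the sample variance

$$S^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \mu)^2 \quad \text{if } \mu \text{ is known}$$

  or

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2 \quad \text{if } \mu \text{ is unknown?}$$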
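The two versions of the sample variance differ in the centring value and in the divisor. The sketch below computes both for a small hypothetical sample; the data and the value `mu_known = 5.0` are invented for illustration. The library function `statistics.variance` uses the $n - 1$ divisor, so it should agree with $s^2$.

```python
from statistics import mean, variance

def variance_known_mu(xs, mu):
    """S^2: average squared deviation from the known population mean (divisor n)."""
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def variance_unknown_mu(xs):
    """s^2: squared deviations from the sample mean, divisor n - 1."""
    xbar = mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical data and hypothetical known mean, for illustration only.
sample = [4.9, 5.6, 4.4, 5.2, 5.4]
mu_known = 5.0

S2 = variance_known_mu(sample, mu_known)
s2 = variance_unknown_mu(sample)

print(S2, s2, variance(sample))
```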

Chi-square Distribution

- Define

$$\tau_i = \frac{X_i - \mu}{\sigma}, \quad i = 1, \ldots, n.$$

- The $\tau_i$'s are i.i.d. $\sim$ the standard normal distribution $N(0, 1)$.
- The sum of the squares of the standard normal variables

$$\chi^2 = \sum_{i=1}^{n} \tau_i^2 = \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2,$$

  is said to follow the Chi-square ($\chi^2_n$) distribution with $n$ degrees of freedom.
- p.d.f. of $\chi^2_n$:

$$f(\chi^2) = \frac{1}{2^{n/2}\, \Gamma(n/2)} \, e^{-\frac{\chi^2}{2}} (\chi^2)^{\frac{n}{2} - 1}, \quad 0 < \chi^2 < \infty.$$

- Positively skewed distribution.
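The definition as a sum of squared standard normal variables lends itself to a quick simulation. This is a sketch only: the degrees of freedom `n = 4`, the number of draws and the seed are arbitrary choices. Since each $\tau_i^2$ has expectation 1, the simulated mean should be close to $n$, and a mean above the median is consistent with positive skewness.

```python
import random
from statistics import mean, median

random.seed(0)  # fixed seed so the run is reproducible

n = 4          # degrees of freedom (illustrative choice)
draws = 20000  # number of simulated chi-square values

def chi_square_draw(n):
    """Sum of squares of n independent N(0, 1) variables."""
    return sum(random.gauss(0.0, 1.0) ** 2 for _ in range(n))

values = [chi_square_draw(n) for _ in range(draws)]
sim_mean = mean(values)      # expected to be near n
sim_median = median(values)  # below the mean for a positively skewed distribution

print(sim_mean, sim_median)
```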


t-Distribution

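- Let $\tau \sim N(0, 1)$ be independent of a $\chi^2_n$ variable with $n$ d.f.
- Then

$$t = \frac{\tau}{\sqrt{\chi^2_n / n}},$$

  is said to follow the t-distribution with $n$ degrees of freedom.
- p.d.f. of $t_n$:

$$f(t) = \frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{n\pi}\, \Gamma\left(\frac{n}{2}\right)} \left( 1 + \frac{t^2}{n} \right)^{-\frac{n+1}{2}}, \quad -\infty < t < \infty.$$

- Symmetric about 0.
- Leptokurtic.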
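The stated properties of the t density can be inspected numerically. The sketch below assumes `n = 5` degrees of freedom and integrates the p.d.f. with a simple trapezoidal rule on $[-50, 50]$; the grid and the comparison point `t = 4` are arbitrary choices.

```python
from math import exp, gamma, pi, sqrt

def t_pdf(t, n):
    """p.d.f. of the t-distribution with n degrees of freedom."""
    const = gamma((n + 1) / 2) / (sqrt(n * pi) * gamma(n / 2))
    return const * (1 + t * t / n) ** (-(n + 1) / 2)

def std_normal_pdf(x):
    """p.d.f. of N(0, 1), for comparison of the tails."""
    return exp(-x * x / 2) / sqrt(2 * pi)

n = 5        # degrees of freedom (illustrative choice)
step = 0.01
grid = [-50 + i * step for i in range(10001)]  # -50, -49.99, ..., 50
dens = [t_pdf(t, n) for t in grid]

# Trapezoidal rule: the density should integrate to (almost exactly) 1.
area = step * (sum(dens) - 0.5 * (dens[0] + dens[-1]))

print(area)
print(t_pdf(1.3, n), t_pdf(-1.3, n))      # equal: symmetric about 0
print(t_pdf(4.0, n), std_normal_pdf(4.0)) # t has the heavier tail
```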

F-Distribution

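- Let $\chi^2_1$ and $\chi^2_2$ be two independent chi-square variables with $n_1$ and $n_2$ degrees of freedom respectively.
- Then

$$F = \frac{\chi^2_1 / n_1}{\chi^2_2 / n_2}$$

  is said to follow the F-distribution with degrees of freedom $(n_1, n_2)$.
- p.d.f. of $F_{n_1, n_2}$:

$$f(F) = \frac{n_1 / n_2}{\beta\left(\frac{n_1}{2}, \frac{n_2}{2}\right)} \left( \frac{n_1}{n_2} F \right)^{\frac{n_1}{2} - 1} \left( 1 + \frac{n_1}{n_2} F \right)^{-\frac{n_1 + n_2}{2}}, \quad 0 < F < \infty.$$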
β( n21 , n22 ) n2 n2

Distribution of the Sample Mean and Variance

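- $X_1, X_2, \ldots, X_n$ is a sample from a $N(\mu, \sigma^2)$.
- $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$.
- Hence, if $\sigma$ is known,

$$\tau = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{\sqrt{n}\,(\bar{X} - \mu)}{\sigma} \sim N(0, 1).$$

- If $\mu$ is known,

$$\frac{nS^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \mu)^2}{\sigma^2} = \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 = \sum_{i=1}^{n} \tau_i^2 \sim \chi^2_n.$$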

Distribution of the Sample Mean and Variance

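- If $\mu$ is unknown,

$$\frac{(n-1)s^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2} \sim \chi^2_{n-1}.$$

- If $\sigma$ is unknown,

$$\frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{\sqrt{n}\,(\bar{X} - \mu)/\sigma}{\sqrt{\frac{(n-1)s^2}{\sigma^2} \Big/ (n-1)}} = \frac{\tau}{\sqrt{\chi^2_{n-1} / (n-1)}} \sim t_{n-1}.$$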
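The identity behind the $t_{n-1}$ result is that the unknown $\sigma$ cancels between numerator and denominator. The sketch below checks this on a hypothetical sample; the data, the value `mu0 = 5.0` and the trial value `sigma = 2.0` are all invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and hypothetical value of mu, for illustration only.
sample = [5.1, 4.7, 5.6, 5.3, 4.9, 5.4]
mu0 = 5.0

n = len(sample)
xbar = mean(sample)
s = stdev(sample)  # square root of s^2 (divisor n - 1)

# Direct form: (Xbar - mu) / (s / sqrt(n)).
t_stat = (xbar - mu0) / (s / sqrt(n))

# Same quantity written as tau / sqrt(chi2 / (n - 1)); sigma is arbitrary
# here because it cancels.
sigma = 2.0
tau = sqrt(n) * (xbar - mu0) / sigma
chi2 = (n - 1) * s**2 / sigma**2
t_alt = tau / sqrt(chi2 / (n - 1))

print(t_stat, t_alt)  # the two forms agree
```

Changing `sigma` to any other positive value leaves `t_alt` unchanged, which is why the statistic is usable when $\sigma$ is unknown.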

Statistical Inference

- Two distinct areas
  - Problem of Estimation
  - Problem of Hypothesis Testing
- The problem of estimation arises when we have no a priori idea of the population parameter(s), or the characteristic(s) of the population that we are interested in.
- We then use the sample observations to obtain a statistic which can be used as a substitute (or estimator) of this parameter.
- For example, we want to study the per capita income of Egyptians.
- Do we even have a remote idea about it? How do we find it?

Statistical Inference (contd.)

- The estimation problem itself can be of two types
  - Point Estimation: a single value is quoted as the substitute of the unknown parameter.
  - Interval Estimation: an interval is given such that we strongly believe that the parameter lies within it.
- In hypothesis testing, on the other hand, we have some tentative idea or hypothesis about the population parameter(s) under study.
- We then use the sample to find out whether this idea is correct or not.
- For example, suppose we believe that the Indian annual per capita income is Rs 10000/-.
- Is this correct or is it not? We have a hypothesis testing problem.

