Chapter 6
Probability Distribution
6.1 Definition of random variables and probability distributions
Random variable: a numerical-valued function defined on the sample space.
It assigns a real number to each element of the sample space. Generally,
random variables are denoted by capital letters and the values of random
variables are denoted by small letters.
Example 6.1: Consider an experiment of tossing a fair coin three times. Let
the random variable X be the number of heads in the three tosses. Find the
possible values of X.
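One way to work through Example 6.1 is to list the sample space and apply X to each outcome. The following is a minimal sketch; the names `sample_space`, `X`, and `values_of_X` are chosen here for illustration only.

```python
from itertools import product

# Sample space of three tosses: every sequence of H (head) and T (tail).
sample_space = ["".join(outcome) for outcome in product("HT", repeat=3)]

# The random variable X maps each outcome to its number of heads.
X = {outcome: outcome.count("H") for outcome in sample_space}

for outcome, x in X.items():
    print(outcome, "->", x)

# The distinct values that X can assume.
values_of_X = sorted(set(X.values()))
print(values_of_X)
```

The dictionary makes the "function on the sample space" idea concrete: each of the eight outcomes is a key, and the value attached to it is the real number that X assigns.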
Random variables are of two types:
1. Discrete random variable: a variable which can assume only a specific
number of values. Its values can be counted.
Examples:
• Number of heads when a coin is tossed n times.
• Number of children in a family.
• Number of car accidents per week.
• Number of defective items in a given company.
• Number of bacteria per two cubic centimeters of water.
2. Continuous random variable: a variable that can assume all values
between any two given values.
Examples:
• Height of students at certain college.
• Mark of a student.
• Life time of light bulbs.
• Length of time required to complete a given training.
Probability distribution: consists of the values a random variable can assume
and the corresponding probabilities of those values; that is, it is a function
that assigns a probability to each value of the random variable.
A probability distribution can be discrete or continuous.
A) Discrete probability distribution: a formula, a table, a graph or other
device used to specify all possible values of the discrete random variable (R.V) X
along with their respective probabilities.
Example 6.2: 1) Consider the experiment of tossing a coin three times. Let X be
the number of heads. Construct the probability distribution of X.
2) A balanced die is tossed twice. Construct the probability distribution if
A) X is the sum of the number of spots in the two trials.
B) X is the absolute difference of the number of spots in the two trials.
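The distributions asked for in Example 6.2 can be tabulated by counting equally likely outcomes. The sketch below assumes a fair coin and a balanced die; the helper `distribution` is a hypothetical name introduced here, not part of any library.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def distribution(outcomes, rv):
    """Return {value: probability} for rv over equally likely outcomes."""
    counts = Counter(rv(o) for o in outcomes)
    total = len(outcomes)
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}


coin_tosses = list(product("HT", repeat=3))        # 8 outcomes
dice_rolls = list(product(range(1, 7), repeat=2))  # 36 outcomes

heads = distribution(coin_tosses, lambda o: o.count("H"))
dice_sum = distribution(dice_rolls, lambda o: o[0] + o[1])
dice_diff = distribution(dice_rolls, lambda o: abs(o[0] - o[1]))

for name, dist in [("heads", heads), ("sum", dice_sum), ("abs diff", dice_diff)]:
    print(name, {x: str(p) for x, p in dist.items()})
```

Using `Fraction` keeps the probabilities exact, so it is easy to confirm that each table sums to one.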
Properties of discrete probability distribution
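1) $\sum_{i=1}^{n} P(X = x_i) = 1$
2) $P(X = x_i) \ge 0$, or $0 \le P(X = x_i) \le 1$
3) If X is a discrete random variable, then
$P(a < X < b) = \sum_{x=a+1}^{b-1} P(x)$
$P(a \le X < b) = \sum_{x=a}^{b-1} P(x)$
$P(a < X \le b) = \sum_{x=a+1}^{b} P(x)$
$P(a \le X \le b) = \sum_{x=a}^{b} P(x)$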
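The four interval cases in property 3 differ only in whether each endpoint is included in the sum. A minimal sketch follows, assuming an integer-valued X and using the number of heads in three coin tosses as the distribution; `prob_between` and its flags are hypothetical names.

```python
from fractions import Fraction

# Distribution of X = number of heads in three tosses of a fair coin.
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}


def prob_between(pmf, a, b, include_a, include_b):
    """Sum P(x) over the integers from a to b, honoring open or closed ends."""
    lo = a if include_a else a + 1
    hi = b if include_b else b - 1
    return sum((pmf.get(x, Fraction(0)) for x in range(lo, hi + 1)), Fraction(0))


print(prob_between(pmf, 0, 2, False, False))  # P(0 < X < 2)
print(prob_between(pmf, 0, 2, True, False))   # P(0 <= X < 2)
print(prob_between(pmf, 0, 2, False, True))   # P(0 < X <= 2)
print(prob_between(pmf, 0, 2, True, True))    # P(0 <= X <= 2)
```

For a discrete variable the four results generally differ, because each endpoint can carry positive probability.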
b) $P(a < X < b)$ = the area under the curve between the points a and b.
c) $P(X) \ge 0$
d) $P(X = a) = 0$
e) $P(a < X < b) = P(a \le X < b) = P(a < X \le b) = P(a \le X \le b)$
2. Let X be a continuous random variable assuming values in the interval (a, b)
such that $\int_a^b f(x)\,dx = 1$; then
$E(X) = \int_a^b x\,f(x)\,dx$
$E(X^2) = \int_x x^2 f(x)\,dx$, if X is continuous
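The integrals for the continuous case can be checked numerically. The sketch below uses a hypothetical density f(x) = 2x on (0, 1), chosen only for illustration, and a simple midpoint rule; it also uses the standard identity var(X) = E(X^2) - [E(X)]^2.

```python
def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over (a, b)."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def f(x):
    # Hypothetical density on (0, 1); not taken from the text.
    return 2 * x


a, b = 0.0, 1.0
total_area = integrate(f, a, b)                          # should be close to 1
mean = integrate(lambda x: x * f(x), a, b)               # E(X)
second_moment = integrate(lambda x: x**2 * f(x), a, b)   # E(X^2)
variance = second_moment - mean**2

print(total_area, mean, second_moment, variance)
```

For this density the exact answers are E(X) = 2/3, E(X^2) = 1/2 and var(X) = 1/18, which the numerical values should approach.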
Rule of Expectation
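1) Let X be a random variable and k be a real number; then
a) $E(kX) = kE(X)$ and $\operatorname{var}(kX) = k^2 \operatorname{var}(X)$
b) $E(X + k) = E(X) + k$ and $\operatorname{var}(X + k) = \operatorname{var}(X)$
2) Let X and Y be random variables on the same sample space; then
a) $E(X + Y) = E(X) + E(Y)$
b) $\operatorname{var}(X + Y) = \operatorname{var}(X) + \operatorname{var}(Y) + 2\operatorname{cov}(X, Y)$
where $\operatorname{cov}(X, Y)$, the covariance between X and Y, is $E(XY) - E(X)E(Y)$.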
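The rules of expectation can be verified exactly on a small sample space. This sketch assumes two fair dice, with X the first die and Y the sum of both dice, so that X and Y are correlated; all function names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two fair dice
p = Fraction(1, len(outcomes))                   # each outcome equally likely


def E(g):
    """Expected value of the random variable g over the sample space."""
    return sum(g(o) * p for o in outcomes)


def var(g):
    return E(lambda o: g(o) ** 2) - E(g) ** 2


def cov(g, h):
    return E(lambda o: g(o) * h(o)) - E(g) * E(h)


def X(o):
    return o[0]          # first die


def Y(o):
    return o[0] + o[1]   # sum of both dice


k = 3
print(E(lambda o: k * X(o)), k * E(X))
print(var(lambda o: k * X(o)), k**2 * var(X))
print(var(lambda o: X(o) + k), var(X))
print(var(lambda o: X(o) + Y(o)), var(X) + var(Y) + 2 * cov(X, Y))
```

Each printed pair should agree, and the last line shows why the covariance term cannot be dropped when X and Y are not independent.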
$P(X = x) = \begin{cases} \dfrac{\lambda^x e^{-\lambda}}{x!}, & x = 0, 1, 2, \ldots \\ 0, & \text{otherwise} \end{cases}$
where λ is the average number of occurrences of an event in a unit length of
interval or distance, and x is the number of occurrences in a Poisson process.
- The Poisson distribution depends only on the average number of occurrences
per unit of time or space.
- The Poisson distribution is used as a distribution of rare events, such as:
• Number of misprints per page.
• Natural disasters such as earthquakes.
• Accidents.
• Hereditary.
• Arrivals.
- The process that gives rise to such events is called Poisson process.
- If X is a Poisson random variable with parameter λ, then
E(X) = λ and var(X) = λ.
Note: The Poisson probability distribution provides a close approximation to the
binomial probability distribution when n is large and p is quite small or quite
large with λ=np.
$P(X = x) = \dfrac{(np)^x e^{-np}}{x!}$, $x = 0, 1, 2, \ldots$, where λ = np is the average number.
Usually we use this approximation if np ≤ 5. In other words, if n>20 and np<5 or
n(1-p) ≤5 then we may use the Poisson distribution as an approximation to the
binomial distribution.
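The quality of the Poisson approximation to the binomial can be inspected by computing both probabilities side by side. The values n = 100 and p = 0.02 below are hypothetical, picked so that n > 20 and np = 2; they do not come from the text.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Poisson probability of x occurrences with mean lam."""
    return lam**x * exp(-lam) / factorial(x)


n, p = 100, 0.02   # hypothetical values: n is large, p is small
lam = n * p        # lambda = np

for x in range(6):
    print(x, round(binomial_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))

largest_gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(n + 1))
print(largest_gap)
```

The two columns should be close for every x, and `largest_gap` gives a single number for how far apart the two distributions get.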
Example 6.5: 1. If 1.6 accidents can be expected at an intersection on any given
day, what is the probability that there will be 3 accidents on any given day?
2. If there are 200 typographical errors randomly distributed in a 500-page
manuscript, find the probability that a given page contains exactly 3 errors.
3. A sales firm receives, on the average, 3 calls per hour on its toll-free number.
For any given hour, find the probability that it will receive the following:
a) At most 3 calls
b) At least 3 calls
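The three parts of Example 6.5 can be evaluated directly from the Poisson formula. This is a minimal sketch; the function names are illustrative, and question 2 assumes the errors average 200/500 = 0.4 per page.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return lam**x * exp(-lam) / factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x), obtained by summing the individual probabilities."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


p1 = poisson_pmf(3, 1.6)         # question 1: exactly 3 accidents, lambda = 1.6
p2 = poisson_pmf(3, 200 / 500)   # question 2: exactly 3 errors, lambda = 0.4
p3a = poisson_cdf(3, 3)          # question 3a: at most 3 calls, lambda = 3
p3b = 1 - poisson_cdf(2, 3)      # question 3b: at least 3 calls = 1 - P(X <= 2)

for label, value in [("1", p1), ("2", p2), ("3a", p3a), ("3b", p3b)]:
    print(label, round(value, 4))
```

Part 3b uses the complement rule, since "at least 3" covers infinitely many values while "at most 2" covers only three.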
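$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$,
where $-\infty < x < \infty$, $-\infty < \mu < \infty$, $\sigma > 0$, and
$\mu = E(X)$ and $\sigma^2 = \operatorname{var}(X)$ are parameters of the normal distribution.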
1. Its maximum ordinate, attained at x = μ, is $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}$.
2. It is asymptotic to the x-axis, i.e., it extends indefinitely in either direction from the
mean.
3. It is a continuous distribution, i.e., there are no gaps or holes.
4. It is a family of curves, i.e., every unique pair of mean and standard deviation
defines a different normal distribution. Thus, the normal distribution is completely
described by two parameters: mean and standard deviation.
5. The total area under the curve is 1, i.e., $\int_{-\infty}^{\infty} f(x)\,dx = 1$, and the area of
the distribution on each side of the mean is 0.5.
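$Z = \dfrac{X - \mu}{\sigma}$, with density $f(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}$,
i.e. if $X \sim N(\mu, \sigma^2)$ then $Z \sim N(0, 1)$.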
Properties of the Standard Normal Distribution:
Same as a normal distribution, but also...
• Mean is zero
• Variance is one
• Standard Deviation is one
- Areas under the standard normal distribution curve have been tabulated in various
ways. The most common ones are the areas between Z=0 and a positive value of Z.
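- Given a normally distributed random variable X with mean μ and standard
deviation σ,
$P(a < X < b) = P\left(\dfrac{a - \mu}{\sigma} < \dfrac{X - \mu}{\sigma} < \dfrac{b - \mu}{\sigma}\right) = P\left(\dfrac{a - \mu}{\sigma} < Z < \dfrac{b - \mu}{\sigma}\right)$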
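The standardization step can be sketched in code with Python's standard-library `statistics.NormalDist` in place of a printed table. The mean 80 and standard deviation 4.8 are those of question 3 in Example 6.6; `prob_between` is a hypothetical helper name.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_between(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), computed through Z = (X - mu) / sigma."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return Z.cdf(z_b) - Z.cdf(z_a)


mu, sigma = 80, 4.8

less_than = Z.cdf((87.2 - mu) / sigma)          # P(X < 87.2)
greater_than = 1 - Z.cdf((76.4 - mu) / sigma)   # P(X > 76.4)
between = prob_between(81.2, 86.0, mu, sigma)   # P(81.2 < X < 86.0)

# The same interval computed without standardizing, as a cross-check.
X = NormalDist(mu, sigma)
direct = X.cdf(86.0) - X.cdf(81.2)

print(round(less_than, 4), round(greater_than, 4), round(between, 4), round(direct, 4))
```

The standardized and direct results agree, which is the point of the transformation: one table for Z serves every normal distribution.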
Example 6.6:
1. Find the area under the standard normal distribution which lies
a) Between Z=0 and Z=0.96
b) Between Z=-1.45 and Z=0
c) To the right of Z=-0.35
d) To the left of Z=0.35
e) Between Z=-0.67 and Z=0.75
f) Between Z=0.25 and Z=1.25
2. Find the value of Z if
a) The normal curve area between 0 and z(positive) is 0.4726
b) The area to the left of z is 0.9868
3. A random variable X has a normal distribution with mean 80 and standard
deviation 4.8. What is the probability that it will take a value:
a) Less than 87.2
b) Greater than 76.4
c) Between 81.2 and 86.0
4. A normal distribution has mean 62.4. Find its standard deviation if 20.05% of
the area under the normal curve lies to the right of 72.9.
5. A random variable has a normal distribution with σ = 5. Find its mean if the
probability that the random variable will assume a value less than 52.5 is
0.6915.
6. Of a large group of men, 5% are less than 60 inches in height and 40% are
between 60 & 65 inches. Assuming a normal distribution, find the mean and
standard deviation of heights.
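Questions 2, 4, 5 and 6 of Example 6.6 run the table lookup in reverse: an area is given and a z value, mean, or standard deviation is wanted. A minimal sketch using `statistics.NormalDist.inv_cdf` in place of a table; results may differ slightly from table-based answers because tables round z to two decimals.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

# Question 2: z values from areas.
z_2a = Z.inv_cdf(0.5 + 0.4726)   # area between 0 and z is 0.4726
z_2b = Z.inv_cdf(0.9868)         # area to the left of z is 0.9868

# Question 4: mean 62.4 and P(X > 72.9) = 0.2005, so (72.9 - 62.4) / sigma = z.
sigma_4 = (72.9 - 62.4) / Z.inv_cdf(1 - 0.2005)

# Question 5: sigma = 5 and P(X < 52.5) = 0.6915, so mu = 52.5 - z * sigma.
mu_5 = 52.5 - Z.inv_cdf(0.6915) * 5

# Question 6: P(X < 60) = 0.05 and P(X < 65) = 0.05 + 0.40 = 0.45.
# Two equations, 60 = mu + z_60 * sigma and 65 = mu + z_65 * sigma.
z_60, z_65 = Z.inv_cdf(0.05), Z.inv_cdf(0.45)
sigma_6 = (65 - 60) / (z_65 - z_60)
mu_6 = 60 - z_60 * sigma_6

print(round(z_2a, 2), round(z_2b, 2))
print(round(sigma_4, 2), round(mu_5, 2))
print(round(mu_6, 2), round(sigma_6, 2))
```

Question 6 is the only one with two unknowns, so it needs both percentiles; subtracting the two equations eliminates the mean and leaves the standard deviation.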
2. Student’s t Distribution
In statistics as long as sample size is large enough, most datasets can be
explained by Standard Normal Distribution. But when the sample size is small,
statisticians rely on the distribution of the t statistic (also known as the t score),
whose value is given by:
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
where $\bar{x}$ is the sample mean, μ is the population mean, s is the standard
deviation of the sample, and n is the sample size.
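The t statistic can be computed from a sample in a few lines. The sample values and the hypothesized population mean below are hypothetical, chosen only to illustrate the formula; `statistics.stdev` is the sample standard deviation (divisor n - 1).

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample, mu):
    """t = (sample mean - mu) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    return (mean(sample) - mu) / (stdev(sample) / sqrt(n))


sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
mu = 4                              # hypothetical population mean

t = t_statistic(sample, mu)
degrees_of_freedom = len(sample) - 1

print(round(t, 4), degrees_of_freedom)
```

The degrees of freedom reported alongside t are what select the particular t curve to compare the statistic against.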
The distribution of the t statistic is called the t distribution or the Student t
distribution. The particular form of the t distribution is determined by its
degrees of freedom (df). Degrees of freedom refers to the number of
independent observations in a set of data. When estimating a mean score or a
proportion from a single sample, the number of independent observations is
equal to the sample size minus one. The t distribution can be used with any
statistic having a bell-shaped distribution (i.e., approximately normal).
The t distribution has the following properties:
• The mean of the distribution is equal to 0.
• The variance is equal to v / (v - 2), where v is the degrees of freedom and v > 2.
• With infinite degrees of freedom, the t distribution is the same as the standard
normal distribution.
The t distribution is similar to the standard normal distribution in the
following ways:
• It is bell-shaped.
• It is symmetric about the mean.
• The mean, median, and mode are equal to zero and located at the
center of the distribution.
• The curve never touches the x axis.
The t distribution differs from the standard normal distribution in the
following ways:
• The variance is greater than one.
• The t distribution is actually a family of curves based on the concept of
degrees of freedom, which is related to sample size.
• As the sample size increases, the t distribution approaches the standard
normal distribution.
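The variance formula v / (v - 2) shows both differences at once: the variance exceeds one, and it shrinks toward one as the degrees of freedom grow. A small sketch, with `t_variance` as a hypothetical helper name and the formula applied only for v > 2:

```python
def t_variance(v):
    """Variance of the t distribution with v degrees of freedom (needs v > 2)."""
    if v <= 2:
        raise ValueError("the formula v / (v - 2) requires v > 2")
    return v / (v - 2)


for v in (3, 5, 10, 30, 100, 1000):
    print(v, round(t_variance(v), 4))
```

The printed values fall steadily toward 1, the variance of the standard normal distribution.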
Chi-Square Distribution
1) Find the chi-square critical value for a specific α when the hypothesis test is
one-tailed right. In this case, find the α value at the top of the χ² table and the
corresponding degrees of freedom in the left column. The critical value is
located where the column and the row meet.
Example: the critical chi-square value for 15 degrees of freedom when α=0.05
and the test is one-tailed right ($\chi^2_{0.05}(15)$) is 24.996.
2) Find the chi-square critical value for a specific α when the hypothesis test is
one-tailed left. In this case, the α value must be subtracted from one. Then the
left side of the table is used, because the χ² table gives the area to the right of the
critical value.
Example: The critical χ² value for 10 df when α=0.05 and the test is one-tailed
left is 3.940.
3) Find the chi-square critical values for a specific α when the hypothesis test is
two-tailed. When a two-tailed test is conducted, the area must be split. For
example, to find the critical chi-square values for 22 degrees of freedom when
α=0.05, we use the area to the right of the larger value, 0.025 (0.05/2), and the
area to the right of the smaller value, 0.975 (1-0.05/2). Hence, one must use the
table columns for 0.025 and 0.975; with 22 degrees of freedom the critical values are
36.781 and 10.982 respectively.
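The tabulated critical values can be checked by computing the area to the right of each one. The sketch below does not use a statistics library; it assumes the standard series expansion of the regularized lower incomplete gamma function, and `chi2_cdf` and `area_to_right` are hypothetical helper names.

```python
from math import exp, lgamma, log


def chi2_cdf(x, df):
    """P(chi-square with df degrees of freedom <= x).

    Uses the series for the regularized lower incomplete gamma
    function P(a, y) with a = df / 2 and y = x / 2.
    """
    if x <= 0:
        return 0.0
    a, y = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= y / (a + k)
        total += term
        k += 1
    return total * exp(-y + a * log(y) - lgamma(a))


def area_to_right(x, df):
    return 1.0 - chi2_cdf(x, df)


print(round(area_to_right(24.996, 15), 4))   # one-tailed right, alpha = 0.05
print(round(area_to_right(3.940, 10), 4))    # one-tailed left, 1 - alpha = 0.95
print(round(area_to_right(36.781, 22), 4))   # two-tailed, upper value
print(round(area_to_right(10.982, 22), 4))   # two-tailed, lower value
```

Each printed area should match the column heading under which the critical value appears in the table, up to the rounding of the table entries.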
Note that after the degrees of freedom reach 30, the chi-square table only gives
values for multiples of 10 (40, 50, 60, etc.). When the exact degrees of freedom one
is seeking are not listed in the table, the closest smaller value should be used.
The chi-square distribution has the following properties: