Random Variables and Probability Functions
5.1 INTRODUCTION
In the previous units, we have studied the assignment and computation of
probabilities of events in detail. In those units, we were interested in knowing
the occurrence of outcomes. In the present unit, we will be interested in the
numbers associated with such outcomes of random experiments. Such an
interest leads to the study of the concept of a random variable.
In this unit, we will introduce the concept of random variable, discrete and
continuous random variables in Sec. 5.2 and their probability functions in Secs.
5.3 and 5.4.
Objectives
A study of this unit would enable you to:
define a random variable, discrete and continuous random variables;
specify the probability mass function, i.e. probability distribution of
discrete random variable;
specify the probability density function, i.e. probability function of
continuous random variable; and
define the distribution function.
5.3 DISCRETE RANDOM VARIABLE AND
PROBABILITY MASS FUNCTION
Discrete Random Variable
A random variable is said to be discrete if it has either a finite or a countable
number of values. A countable number of values means values which can
be arranged in a sequence, i.e. values which are in one-to-one
correspondence with the set of natural numbers; in other words, on the basis of a few
successive known terms, we can detect a rule and hence can write the
subsequent terms. For example, suppose X is a random variable taking the
values 2, 5, 8, 11, …; then we can write the fifth, sixth, … values,
because the values are in one-to-one correspondence with the set of natural
numbers and have the general term 3n − 1, i.e. on taking n = 1, 2, 3, 4, 5, …
we get 2, 5, 8, 11, 14, …. So, X in this example is a discrete random
variable. The number of students present each day in a class during an
academic session is an example of discrete random variable as the number
cannot take a fractional value.
Probability Mass Function
Let X be a r.v. which takes the values x1, x2, … and let P[X = xi] = p(xi). This
function p(xi), i = 1, 2, …, defined for the values x1, x2, … assumed by X, is
called the probability mass function of X, satisfying p(xi) ≥ 0 and Σi p(xi) = 1.
The probability distribution of X may be displayed as:

X      x1      x2      x3     …
p(x)   p(x1)   p(x2)   p(x3)  …
Example 1: Examine whether each of the following represents a probability
distribution of a discrete random variable:

(i)
X      0    1
p(x)  1/2  3/4

(ii)
X      0     1     2
p(x)  3/4  −1/2  3/4

(iii)
X      0    1    2
p(x)  1/4  1/2  1/4

(iv)
X      0    1    2    3
p(x)  1/8  3/8  1/4  1/8
Solution:
(i) Here p(xi) ≥ 0, i = 1, 2; but
Σi p(xi) = p(x1) + p(x2) = p(0) + p(1) = 1/2 + 3/4 = 5/4 ≠ 1.
So, the given distribution is not a probability distribution, as Σi p(xi) is
greater than 1.
(ii) It is not a probability distribution, as p(x2) = p(1) = −1/2, i.e. negative.
(iii) Here p(xi) ≥ 0, i = 1, 2, 3,
and Σi p(xi) = p(x1) + p(x2) + p(x3) = p(0) + p(1) + p(2) = 1/4 + 1/2 + 1/4 = 1.
So, the given distribution is a probability distribution.
(iv) Here p(xi) ≥ 0, i = 1, 2, 3, 4; but
Σi p(xi) = p(x1) + p(x2) + p(x3) + p(x4)
= p(0) + p(1) + p(2) + p(3) = 1/8 + 3/8 + 1/4 + 1/8 = 7/8 ≠ 1.
The given distribution is not a probability distribution.
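As an illustrative aside, the two defining conditions of a probability mass function can be checked mechanically. The sketch below (not part of the original text) re-checks the four candidate distributions of Example 1 using exact fractions:

```python
from fractions import Fraction as F

def is_pmf(probs):
    """A candidate p(x) is a probability mass function iff every
    probability is non-negative and the probabilities sum to 1."""
    return all(p >= 0 for p in probs) and sum(probs) == 1

# The four candidate distributions of Example 1:
dist_i   = [F(1, 2), F(3, 4)]                    # sums to 5/4 -> not a pmf
dist_ii  = [F(3, 4), F(-1, 2), F(3, 4)]          # negative entry -> not a pmf
dist_iii = [F(1, 4), F(1, 2), F(1, 4)]           # valid pmf
dist_iv  = [F(1, 8), F(3, 8), F(1, 4), F(1, 8)]  # sums to 7/8 -> not a pmf

print([is_pmf(d) for d in (dist_i, dist_ii, dist_iii, dist_iv)])
# -> [False, False, True, False]
```

Working with `Fraction` rather than floats keeps the sum test exact, which matters because conditions like Σ p(xi) = 1 should not be judged up to rounding error.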
Example 2: For the following probability distribution of a discrete r.v. X, find
i) the constant c,
ii) P[X ≤ 3], and
iii) P[1 < X < 4].

X      0  1  2  3   4   5
p(x)   0  c  c  2c  3c  c

Solution:
i) Since Σi p(xi) = 1,
0 + c + c + 2c + 3c + c = 1 ⇒ 8c = 1 ⇒ c = 1/8.
ii) P[X ≤ 3] = P[X = 3] + P[X = 2] + P[X = 1] + P[X = 0]
= 2c + c + c + 0 = 4c = 4 × (1/8) = 1/2.
iii) P[1 < X < 4] = P[X = 2] + P[X = 3] = c + 2c = 3c = 3 × (1/8) = 3/8.
Example 3: Find the probability distribution of the number of heads when
three fair coins are tossed simultaneously.
Solution: Let X be the number of heads in the toss of three fair coins.
As the random variable, “the number of heads” in a toss of three coins may be
0 or 1 or 2 or 3 associated with the sample space
{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT},
X can take the values 0, 1, 2, 3, with
P[X = 0] = P[{TTT}] = 1/8,
P[X = 1] = P[{HTT, THT, TTH}] = 3/8,
P[X = 2] = P[{HHT, HTH, THH}] = 3/8, and
P[X = 3] = P[{HHH}] = 1/8.
The probability distribution of X, i.e. of the number of heads when three coins are
tossed simultaneously, is

X      0    1    2    3
p(x)  1/8  3/8  3/8  1/8
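The counting step above can also be reproduced by brute-force enumeration of the sample space. The following illustrative sketch (an aside, not from the original unit) lists all eight equally likely outcomes and tallies the number of heads:

```python
from itertools import product
from fractions import Fraction

# Enumerate the 8 equally likely outcomes of tossing three fair coins
# and count the heads in each, reproducing the distribution of Example 3.
outcomes = list(product("HT", repeat=3))
dist = {}
for outcome in outcomes:
    heads = outcome.count("H")
    dist[heads] = dist.get(heads, 0) + Fraction(1, len(outcomes))

print({k: dist[k] for k in sorted(dist)})
```

The same enumeration idea scales to any small finite experiment where outcomes are equally likely.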
Example 4: A random variable X takes the values −2, −1, 0, 1, 2 such that
P[X < 0] = P[X = 0] = P[X > 0], and P[X = −1] = P[X = −2], P[X = 1] = P[X = 2].
Obtain the probability distribution of X.
Solution: As P[X < 0] = P[X = 0] = P[X > 0],
P[X = −1] + P[X = −2] = P[X = 0] = P[X = 1] + P[X = 2]
⇒ p + p = P[X = 0] = p + p   [letting P[X = −1] = P[X = −2] = P[X = 1] = P[X = 2] = p]
⇒ P[X = 0] = 2p.
Now, as P[X < 0] + P[X = 0] + P[X > 0] = 1,
P[X = −1] + P[X = −2] + P[X = 0] + P[X = 1] + P[X = 2] = 1
⇒ p + p + 2p + p + p = 1
⇒ 6p = 1 ⇒ p = 1/6.
∴ P[X = 0] = 2p = 2 × (1/6) = 2/6, and
P[X = −1] = P[X = −2] = P[X = 1] = P[X = 2] = p = 1/6.
Hence, the probability distribution of X is given by

X      −2   −1    0    1    2
p(x)  1/6  1/6  2/6  1/6  1/6
E2) For the following probability distribution of a r.v. X, obtain the
probability distribution of Y = X² + 2X:

X      0     1     2    3
p(x)  1/10  3/10  1/2  1/10
Let us define and explain a continuous random variable and its probability Random Variables
function in the next section.
[Fig. 5.1: the curve y = f(x), with the region ABCD shaded over the small interval (x, x + δx)]
Now, if δx is very small, then the arc AB will act as a line segment and hence
the shaded region will be a rectangle whose area will be AD × DC, i.e. f(x) δx
[∵ AD = the value of y at x, i.e. f(x); DC = the length δx of the interval
(x, x + δx)].
Also, this area = the probability that X lies in the interval (x, x + δx)
= P[x ≤ X ≤ x + δx].
Hence,
P[x ≤ X ≤ x + δx] ≈ f(x) δx
⇒ f(x) ≈ P[x ≤ X ≤ x + δx] / δx, where δx is very small,
⇒ f(x) = lim (δx → 0) P[x ≤ X ≤ x + δx] / δx.
f(x), so defined, is called the probability density function of X.
The probability density function has properties analogous to those of the probability mass
function: f(x) ≥ 0, and the total probability over all possible values that
the random variable can take has to be 1. But here, as X is a continuous
random variable, the summation is carried out through 'integration' and
hence
∫R f(x) dx = 1,
where the integral is taken over the entire range R of values of X.
Remark 2
i) Summation and integration play the same role, but there is still a
difference between the two: the former is used in the case of discrete,
i.e. countable, values and the latter is used in the continuous case.
ii) An essential property of a continuous random variable is that there is
zero probability that it takes any specified numerical value; the
probability that it takes a value in a specified interval, however, is non-zero and is
calculable as a definite integral of the probability density function of the
random variable. Hence the probability that a continuous r.v. X
lies between two values a and b is given by
P[a < X < b] = ∫(a to b) f(x) dx.
Example 5: A continuous r.v. X has the p.d.f. f(x) = Ax³, 0 < x < 1. Determine
i) the constant A,
ii) P[0.2 < X < 0.5], and
iii) P[X > 3/4 | X > 1/2].
Solution:
i) As f(x) is a p.d.f.,
∫(0 to 1) f(x) dx = 1 ⇒ ∫(0 to 1) Ax³ dx = 1
⇒ A [x⁴/4] (0 to 1) = 1 ⇒ A (1/4 − 0) = 1 ⇒ A = 4.
ii) P[0.2 < X < 0.5] = ∫(0.2 to 0.5) f(x) dx = ∫(0.2 to 0.5) 4x³ dx
= [x⁴] (0.2 to 0.5) = (0.5)⁴ − (0.2)⁴ = 0.0625 − 0.0016 = 0.0609.
iii) Now, P[X > 3/4] = ∫(3/4 to 1) f(x) dx
[the lower limit is 3/4 and the upper limit is the upper end, 1, of the range of X]
= ∫(3/4 to 1) 4x³ dx = [x⁴] (3/4 to 1) = 1 − 81/256 = 175/256, and
P[X > 1/2] = ∫(1/2 to 1) f(x) dx = [x⁴] (1/2 to 1) = 1 − 1/16 = 15/16.
∴ the required probability = P[X > 3/4] / P[X > 1/2]
= (175/256) × (16/15) = 35/48.
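The integrals in Example 5 are simple enough to check numerically. Below is an illustrative sketch (an aside, not from the original unit) that approximates each probability with a midpoint-rule quadrature and compares it with the exact answers obtained above:

```python
def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: 4 * x**3   # the density found in part (i) of Example 5

total = integrate(f, 0.0, 1.0)               # total probability, ~1
p_mid = integrate(f, 0.2, 0.5)               # P[0.2 < X < 0.5], ~0.0609
p_cond = integrate(f, 0.75, 1.0) / integrate(f, 0.5, 1.0)  # ~35/48

print(total, p_mid, p_cond)
```

For a polynomial density the quadrature converges quickly, so the numerical values agree with 1, 0.0609 and 35/48 to many decimal places.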
Example 6: The p.d.f. of the different weights of a "1 litre pure ghee pack" of
a company is given by f(x) = 200(x − 1) over its range of weights (and 0
elsewhere). Find the probability that a pack weighs between 1.01 and 1.02 litres.
Solution: As f(x) is a p.d.f.,
P[1.01 < X < 1.02] = ∫(1.01 to 1.02) 200(x − 1) dx = 200 [x²/2 − x] (1.01 to 1.02)
= 200 [(1.0404/2 − 1.02) − (1.0201/2 − 1.01)]
= 200 [(0.5202 − 1.02) − (0.51005 − 1.01)] = 200 × 0.00015 = 0.03.
f(x) = A/x³,  1500 ≤ x ≤ 2500
     = 0,     elsewhere
Discrete Distribution Function
For a discrete random variable X taking the values x1, x2, x3, … with
probabilities p1, p2, p3, …, the distribution function is
F(xi) = P[X ≤ xi] = p1 + p2 + ... + pi,
i.e.

X     F(x)
x1    p1
x2    p1 + p2
x3    p1 + p2 + p3
x4    p1 + p2 + p3 + p4
.     .
.     .
.     .
The value of F(x) corresponding to the last value of the random variable X is
always 1, as it is the sum of all the probabilities. F(x) remains 1 beyond this
last value of X as well, since, being a probability, it can never exceed one.
For example, let X be a random variable having the following probability
distribution:

X      0    1    2
p(x)  1/4  1/2  1/4

Notice that p(x) will be zero for the other values of X. Then, the distribution function
of X is given by

X    F(x) = P[X ≤ x]
0    1/4
1    1/4 + 1/2 = 3/4
2    1/4 + 1/2 + 1/4 = 1
Here, for the last value, i.e. for X = 2, we have F(x) = 1.
Also, if we take a value beyond 2, say 4, then we get
F(4) = P[X ≤ 4]
= P[X = 4] + P[X = 3] + P[X ≤ 2]
= 0 + 0 + 1 = 1.
Example 7: A random variable X has the following probability function:

X      0    1     2    3    4     5      6     7
p(x)   0   1/10  1/5  1/5  3/10  1/100  1/50  17/100

Determine the distribution function of X.
Solution: Here,
F(0) = P[X ≤ 0] = P[X = 0] = 0,
F(1) = P[X ≤ 1] = P[X = 0] + P[X = 1] = 0 + 1/10 = 1/10,
F(2) = P[X ≤ 2] = P[X = 0] + P[X = 1] + P[X = 2] = 0 + 1/10 + 1/5 = 3/10,
and so on. Thus, the distribution function F(x) of X is given in the following
table:

X    F(x) = P[X ≤ x]
0    0
1    1/10
2    3/10
3    3/10 + 1/5 = 1/2
4    1/2 + 3/10 = 4/5
5    4/5 + 1/100 = 81/100
6    81/100 + 1/50 = 83/100
7    83/100 + 17/100 = 1
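Since F(x) is just a running total of the probabilities, it can be produced in one line with a cumulative sum. The illustrative sketch below (an aside, not from the original unit) rebuilds the table of Example 7 with exact fractions:

```python
from fractions import Fraction as F
from itertools import accumulate

# p(x) for x = 0, 1, ..., 7 as given in Example 7
p = [F(0), F(1, 10), F(1, 5), F(1, 5), F(3, 10), F(1, 100), F(1, 50), F(17, 100)]

# The distribution function F(x) is the running total of the probabilities.
cdf = list(accumulate(p))
for x, Fx in enumerate(cdf):
    print(x, Fx)
```

The last cumulative value must be exactly 1, which is a quick sanity check that the p(x) column itself is a valid probability distribution.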
Continuous Distribution Function
For a continuous random variable X having p.d.f. f(x), the distribution function
F(x) satisfies
f(x) = (d/dx) F(x), i.e. dF(x) = f(x) dx.
For example, if f(x) = 6x(1 − x), 0 < x < 1, then for 0 < x < 1,
F(x) = P[X ≤ x] = ∫(0 to x) f(x) dx = ∫(0 to x) 6x(1 − x) dx
= 6 ∫(0 to x) (x − x²) dx = 6 [x²/2 − x³/3] (0 to x) = 3x² − 2x³.
The c.d.f. of X is given by
F(x) = 0,             x ≤ 0
     = 3x² − 2x³,     0 < x < 1
     = 1,             x ≥ 1.
E5) A random variable X has the following probability distribution:

X      0  1   2   3   4   5    6    7    8
p(x)   k  3k  5k  7k  9k  11k  13k  15k  17k

i) Determine the constant k.
ii) Find the distribution function of X.
E6) The p.d.f. of a continuous random variable X is given by
f(x) = x/2,         0 ≤ x ≤ 1
     = 1/2,         1 ≤ x ≤ 2
     = (3 − x)/2,   2 ≤ x ≤ 3
     = 0,           elsewhere.
Obtain the distribution function of X.
5.6 SUMMARY
The following main points have been covered in this unit of the course:
1) A random variable is a function whose domain is a set of possible
outcomes and range is a sub-set of the set of reals and has the following
properties:
i) Each particular value of the random variable can be assigned some
probability.
ii) Sum of all the probabilities associated with all the different values of
the random variable is unity.
2) A random variable is said to be a discrete random variable if it has either
a finite number of values or a countable number of values, i.e. the values
can be arranged in a sequence.
3) If a random variable is such that its values cannot be arranged in a
sequence, it is called continuous random variable. So, a random
variable is said to be continuous if it can take all the possible real
(i.e. integer as well as fractional) values between two certain limits.
4) Let X be a discrete r.v. which takes on the values x1, x2, ... and let
P[X = xi] = p(xi). The function p(xi) is called the probability mass
function of X, satisfying p(xi) ≥ 0 and Σi p(xi) = 1. The set of pairs
(xi, p(xi)), i = 1, 2, …, specifies the probability distribution of the
discrete r.v. X.
5) Let X be a continuous random variable and f(x) be a continuous
function of x. Suppose (x, x + δx) is an interval of length δx. Then
f(x) defined by
f(x) = lim (δx → 0) P[x ≤ X ≤ x + δx] / δx
is called the probability density function of X.
The probability density function has properties analogous to those of the probability
mass function, i.e. f(x) ≥ 0 and ∫R f(x) dx = 1, where the integral is
taken over the entire range R of values of X.
6) A function F defined for all values of a random variable X by
F( x ) = P[X x ] is called the distribution function. It is also known as
the cumulative distribution function (c.d.f.) of X. The domain of the
distribution function is a set of real numbers and its range is [0, 1].
The distribution function of a discrete random variable X is said to be a
discrete distribution function and is given by the pairs
(x1, F(x1)), (x2, F(x2)), …. The distribution function of a continuous
random variable X having probability density function f(x) is said to
be a continuous distribution function and is given by
F(x) = P[X ≤ x] = ∫(−∞ to x) f(x) dx.
5.7 SOLUTIONS/ANSWERS
E1) Let X be the number of bad articles drawn.
X can take the values 0, 1, 2 with
P[X = 0] = P[no bad article]
= P[drawing 2 articles from the 5 good articles and zero articles
from the 2 bad articles]
= (5C2 × 2C0)/7C2 = (5 × 4)/(7 × 6) = 10/21,
P[X = 1] = P[one bad article and one good article]
= (2C1 × 5C1)/7C2 = (2 × 5)/21 = 10/21, and
P[X = 2] = P[two bad articles and no good article]
= (2C2 × 5C0)/7C2 = (1 × 1)/21 = 1/21.
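The three probabilities in E1) follow the hypergeometric counting pattern, which can be coded directly with binomial coefficients. The illustrative sketch below (an aside, not from the original unit) reproduces them exactly:

```python
from math import comb
from fractions import Fraction

def p_bad(k, good=5, bad=2, draws=2):
    """P[k bad articles among `draws` articles drawn without replacement
    from `good` good and `bad` bad articles] (hypergeometric counting)."""
    return Fraction(comb(bad, k) * comb(good, draws - k),
                    comb(good + bad, draws))

probs = [p_bad(k) for k in range(3)]
print(probs)  # [Fraction(10, 21), Fraction(10, 21), Fraction(1, 21)]
```

The probabilities necessarily sum to 1, since the numerators count a partition of all 7C2 equally likely draws.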
E2) As Y = X2 + 2X,
For X = 0, Y = 0 + 0 = 0;
For X = 1, Y = 12 + 2(1) = 3;
For X = 2, Y = 22 + 2(2) = 8; and
For X = 3, Y = 32 + 2(3) = 15.
Thus, the values of Y are 0, 3, 8, 15 corresponding to the values
0, 1, 2, 3 of X and hence
P[Y = 0] = P[X = 0] = 1/10,  P[Y = 3] = P[X = 1] = 3/10,
P[Y = 8] = P[X = 2] = 1/2, and P[Y = 15] = P[X = 3] = 1/10.
The probability distribution of Y is

Y      0     3     8    15
p(y)  1/10  3/10  1/2  1/10
= (3/7) × (3/7) × (3/7) = 27/343,
P[X = 1] = P[one red and two white]
= P[(R1 ∩ W2 ∩ W3) or (W1 ∩ R2 ∩ W3) or (W1 ∩ W2 ∩ R3)]
= P[R1 ∩ W2 ∩ W3] + P[W1 ∩ R2 ∩ W3] + P[W1 ∩ W2 ∩ R3]
= P(R1)P(W2)P(W3) + P(W1)P(R2)P(W3) + P(W1)P(W2)P(R3)
= (4/7)(3/7)(3/7) + (3/7)(4/7)(3/7) + (3/7)(3/7)(4/7)
= 3 × (4 × 3 × 3)/343 = 108/343,
P[X = 2] = P[two red and one white]
= P[(R1 ∩ R2 ∩ W3) or (R1 ∩ W2 ∩ R3) or (W1 ∩ R2 ∩ R3)]
= P(R1)P(R2)P(W3) + P(R1)P(W2)P(R3) + P(W1)P(R2)P(R3)
= (4/7)(4/7)(3/7) + (4/7)(3/7)(4/7) + (3/7)(4/7)(4/7)
= 3 × (4 × 4 × 3)/343 = 144/343, and
P[X = 3] = P[three red balls]
= P[R1 ∩ R2 ∩ R3] = P(R1)P(R2)P(R3) = (4/7)(4/7)(4/7) = 64/343.
The probability distribution of the number of red balls is

X      0       1        2        3
p(x)  27/343  108/343  144/343  64/343
E4) As f(x) is a p.d.f.,
∫(1500 to 2500) (A/x³) dx = 1 ⇒ A ∫(1500 to 2500) x⁻³ dx = 1
⇒ A [−1/(2x²)] (1500 to 2500) = 1
⇒ (A/2) [1/(1500)² − 1/(2500)²] = 1
⇒ (A/2) [1/2250000 − 1/6250000] = 1
⇒ (A/2) × (25 − 9)/56250000 = 1
[∵ 1/2250000 = 25/56250000 and 1/6250000 = 9/56250000]
⇒ 16A = 2 × 56250000
⇒ A = 5625 × 1250 = 7031250.
Now, P[1600 ≤ X ≤ 2000] = ∫(1600 to 2000) (A/x³) dx
= A [−1/(2x²)] (1600 to 2000)
= (A/2) [1/(1600)² − 1/(2000)²] = (A/2) [1/2560000 − 1/4000000]
= (A/2) × (25 − 16)/64000000
[∵ 1/2560000 = 25/64000000 and 1/4000000 = 16/64000000]
= (7031250/2) × (9/64000000) = 2025/4096.
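The constant A and the resulting probability in E4) can be double-checked numerically. The following illustrative sketch (an aside, not from the original unit) integrates the density by a midpoint rule:

```python
A = 7031250.0                      # the normalizing constant found in E4)
f = lambda x: A / x**3             # density on [1500, 2500]

def integrate(f, a, b, n=200_000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

total = integrate(f, 1500, 2500)   # should be ~1
p = integrate(f, 1600, 2000)       # should be ~2025/4096

print(total, p)
```

The numerical values agree with 1 and 2025/4096 ≈ 0.4944 to high accuracy, confirming the hand computation.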
E5) i) As the given distribution is a probability distribution,
the sum of all the probabilities = 1
⇒ k + 3k + 5k + 7k + 9k + 11k + 13k + 15k + 17k = 1
⇒ 81k = 1 ⇒ k = 1/81.
ii) The distribution function of X is given in the following table:

X    F(x) = P[X ≤ x]
0    k = 1/81
1    k + 3k = 4k = 4/81
2    4k + 5k = 9k = 9/81
3    9k + 7k = 16k = 16/81
4    16k + 9k = 25k = 25/81
5    25k + 11k = 36k = 36/81
6    36k + 13k = 49k = 49/81
7    49k + 15k = 64k = 64/81
8    64k + 17k = 81k = 1
E6) For x < 0, F(x) = P[X ≤ x] = 0.
For 0 ≤ x < 1,
F(x) = P[X ≤ x] = ∫(−∞ to x) f(x) dx = ∫(−∞ to 0) f(x) dx + ∫(0 to x) f(x) dx   [∵ 0 ≤ x < 1]
= 0 + ∫(0 to x) (x/2) dx   [∵ f(x) = x/2 for 0 ≤ x ≤ 1]
= (1/2) [x²/2] (0 to x) = x²/4.
For 1 ≤ x < 2,
F(x) = ∫(−∞ to x) f(x) dx
= ∫(−∞ to 0) f(x) dx + ∫(0 to 1) f(x) dx + ∫(1 to x) f(x) dx
= 0 + ∫(0 to 1) (x/2) dx + ∫(1 to x) (1/2) dx
= (1/2) [x²/2] (0 to 1) + (1/2) [x] (1 to x)
= 1/4 + (x − 1)/2 = (2x − 1)/4.
For 2 ≤ x < 3,
F(x) = ∫(−∞ to x) f(x) dx
= ∫(−∞ to 0) f(x) dx + ∫(0 to 1) f(x) dx + ∫(1 to 2) f(x) dx + ∫(2 to x) f(x) dx
= 0 + ∫(0 to 1) (x/2) dx + ∫(1 to 2) (1/2) dx + ∫(2 to x) [(3 − x)/2] dx
= [x²/4] (0 to 1) + (1/2) [x] (1 to 2) + (1/2) [3x − x²/2] (2 to x)
= 1/4 + 1/2 + (1/2) (3x − x²/2 − 6 + 2)
= 3/4 + (1/2) (3x − x²/2 − 4)
= −x²/4 + 3x/2 − 5/4.
For 3 ≤ x < ∞,
F(x) = ∫(−∞ to x) f(x) dx
= ∫(−∞ to 0) f(x) dx + ∫(0 to 1) f(x) dx + ∫(1 to 2) f(x) dx + ∫(2 to 3) f(x) dx + ∫(3 to x) f(x) dx
= 0 + ∫(0 to 1) (x/2) dx + ∫(1 to 2) (1/2) dx + ∫(2 to 3) [(3 − x)/2] dx + 0
= 1/4 + 1/2 + (1/2) [3x − x²/2] (2 to 3)
= 1/4 + 1/2 + (1/2) (9/2 − 4) = 1/4 + 1/2 + 1/4 = 1.
Hence, the distribution function is given by:
F(x) = 0,                      x < 0
     = x²/4,                   0 ≤ x < 1
     = (2x − 1)/4,             1 ≤ x < 2
     = −x²/4 + 3x/2 − 5/4,     2 ≤ x < 3
     = 1,                      3 ≤ x.
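A useful habit with piecewise distribution functions is to verify that the pieces join continuously and that F'(x) recovers the p.d.f. The illustrative sketch below (an aside, not from the original unit) runs these checks for E6):

```python
def pdf(x):
    """The piecewise p.d.f. of E6)."""
    if 0 <= x <= 1:
        return x / 2
    if 1 < x <= 2:
        return 0.5
    if 2 < x <= 3:
        return (3 - x) / 2
    return 0.0

def cdf(x):
    """The piecewise c.d.f. derived in E6)."""
    if x < 0:
        return 0.0
    if x < 1:
        return x * x / 4
    if x < 2:
        return (2 * x - 1) / 4
    if x < 3:
        return -x * x / 4 + 1.5 * x - 1.25
    return 1.0

# Continuity at the joins and the endpoint value:
print(cdf(1.0), cdf(2.0), cdf(3.0))  # 0.25 0.75 1.0
```

A numerical derivative of `cdf` at interior points of each piece matches `pdf`, which is exactly the relation f(x) = (d/dx) F(x) used in the text.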
UNIT 6 BIVARIATE DISCRETE RANDOM VARIABLES
Structure
6.1 Introduction
Objectives
6.1 INTRODUCTION
In Unit 5, you have studied one-dimensional random variables and their
probability mass functions, density functions and distribution functions. There
may also be situations where we have to study two-dimensional random
variables in connection with a random experiment. For example, we may be
interested in recording the number of boys and girls born in a hospital on a
particular day. Here, ‘the number of boys’ and ‘the number of girls’ are
random variables taking the values 0, 1, 2, … and both these random variables
are discrete also.
In this unit, we concentrate on the two-dimensional discrete random variables
defining them in Sec. 6.2. The joint, marginal and conditional probability mass
functions of two-dimensional random variable are described in Sec. 6.3. The
distribution function and the marginal distribution function are discussed in
Sec. 6.4.
Objectives
A study of this unit would enable you to:
define two-dimensional discrete random variable;
specify the joint probability mass function of two discrete random
variables;
obtain the marginal and conditional distributions for two-dimensional
discrete random variable;
define two-dimensional distribution function;
define the marginal distribution functions; and
solve various practical problems on bivariate discrete random variables.
6.2 BIVARIATE DISCRETE RANDOM VARIABLES
In Unit 5, the concept of single-dimensional random variable has been studied
in detail. Proceeding in analogy with the one-dimensional case, concept of
two-dimensional discrete random variables is discussed in the present unit.
A situation where two-dimensional discrete random variable needs to be
studied has already been given in Sec. 6.1 of this unit. To describe such
situations mathematically, the study of two random variables is introduced.
Definition: Let X and Y be two discrete random variables defined on the
sample space S of a random experiment; then the function (X, Y) defined on
the same sample space is called a two-dimensional discrete random variable. In
other words, (X, Y) is a two-dimensional random variable if the possible
values of (X, Y) are finite or countably infinite. Here, each pair of values of X and Y is
represented as a point (x, y) in the xy-plane.
As an illustration, let us consider the following example:
Let three balls b1, b2, b3 be placed randomly in three cells. The possible
outcomes of placing the three balls in three cells are shown in Table 6.1.
Table 6.1 : Possible Outcomes of Placing the Three Balls in Three Cells
Arrangement Placement of the Balls in
Number
Cell 1 Cell 2 Cell 3
1 b1 b2 b3
2 b1 b3 b2
3 b2 b1 b3
4 b2 b3 b1
5 b3 b1 b2
6 b3 b2 b1
7 b1,b2 b3 -
8 b1,b2 - b3
9 - b1,b2 b3
10 b1,b3 b2 -
11 b1,b3 - b2
12 - b1,b3 b2
13 b2,b3 b1 -
14 b2,b3 - b1
15 - b2,b3 b1
16 b1 b2,b3 -
17 b1 - b2,b3
18 - b1 b2,b3
19 b2 b3,b1 -
20 b2 - b3,b1
21 - b2 b3,b1
22 b3 b1,b2 -
23 b3 - b1,b2
24 - b3 b1,b2
25 b1,b2,b3 - -
26 - b1,b2,b3 -
27 - - b1,b2,b3
Now, let X denote the number of balls in Cell 1 and Y the number of cells
occupied. Notice that X and Y are discrete random variables, where X takes on
the values 0, 1, 2, 3 (the number of balls in Cell 1 may be 0, 1, 2 or 3) and
Y takes on the values 1, 2, 3 (the number of occupied cells may be 1, 2 or 3).
The possible values of the two-dimensional random variable (X, Y), therefore, are
all ordered pairs of the values x and y of X and Y, respectively, i.e. (0, 1),
(0, 2), (0, 3), (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3).
Now, to each possible value (xi, yj) of (X, Y), we can associate a number
p(xi, yj) representing P[X = xi, Y = yj], as discussed in the following section
of this unit.
p(1, 2) = P[X = 1, Y = 2] = P[one ball in Cell 1 and 2 cells occupied]
= P[arrangement numbers 16, 17, 19, 20, 22, 23] = 6/27,
p(1, 3) = P[X = 1, Y = 3] = P[one ball in Cell 1 and 3 cells occupied]
= P[arrangement numbers 1 to 6] = 6/27,
p(2, 1) = P[X = 2, Y = 1] = P[two balls in Cell 1 and 1 cell occupied]
= P[impossible event] = 0,
p(2, 2) = P[X = 2, Y = 2] = P[two balls in Cell 1 and 2 cells occupied]
= P[arrangement numbers 7, 8, 10, 11, 13, 14] = 6/27,
p(2, 3) = P[X = 2, Y = 3] = P[two balls in Cell 1 and 3 cells occupied]
= P[impossible event] = 0,
p(3, 1) = P[X = 3, Y = 1] = P[three balls in Cell 1 and 1 cell occupied]
= P[arrangement number 25] = 1/27,
p(3, 2) = P[X = 3, Y = 2] = P[three balls in Cell 1 and 2 cells occupied]
= P[impossible event] = 0, and
p(3, 3) = P[X = 3, Y = 3] = P[three balls in Cell 1 and 3 cells occupied]
= P[impossible event] = 0.
The values of (X, Y) together with the numbers associated as above constitute
what is known as the joint probability distribution of (X, Y), which can also be written
in tabular form as shown below:

X \ Y    1      2      3      Total
0        2/27   6/27   0      8/27
1        0      6/27   6/27   12/27
2        0      6/27   0      6/27
3        1/27   0      0      1/27
Total    3/27   18/27  6/27   1
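The 27 arrangements of Table 6.1 can be generated by a computer instead of by hand, which makes the joint table above easy to verify. Here is an illustrative sketch (an aside, not from the original unit):

```python
from itertools import product
from fractions import Fraction

# Place balls b1, b2, b3 independently into cells 1, 2, 3: 27 equally
# likely arrangements.  X = number of balls in Cell 1, Y = number of
# occupied cells.  Tally the joint distribution p(x, y).
joint = {}
for cells in product((1, 2, 3), repeat=3):
    x = cells.count(1)        # balls landing in Cell 1
    y = len(set(cells))       # distinct cells used
    joint[(x, y)] = joint.get((x, y), 0) + Fraction(1, 27)

print(joint.get((1, 2), 0), joint.get((0, 1), 0), joint.get((3, 1), 0))
```

Impossible combinations such as (X, Y) = (2, 1) simply never occur in the tally, matching the zeros in the table.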
We are now in a position to define the joint, marginal and conditional probability
mass functions.
The joint probability mass function p(xi, yj) = P[X = xi, Y = yj] satisfies
(i) p(xi, yj) ≥ 0, and
(ii) Σi Σj p(xi, yj) = 1.
The marginal probability mass function of X is
p(xi) = P[X = xi]
= P[(X = xi ∩ Y = y1) or (X = xi ∩ Y = y2) or ...]
= P[X = xi ∩ Y = y1] + P[X = xi ∩ Y = y2] + P[X = xi ∩ Y = y3] + ...
= Σj P[X = xi ∩ Y = yj],
and the marginal probability mass function of Y is
p(yj) = P[Y = yj]
= P[X = x1 ∩ Y = yj] + P[X = x2 ∩ Y = yj] + ...
= Σi P[X = xi ∩ Y = yj] = Σi p(xi, yj).
The conditional probability mass function of X, given Y = y, is defined as
p(x | y) = P[X = x | Y = y] = P[X = x ∩ Y = y] / P[Y = y], provided P[Y = y] ≠ 0
[∵ P(A | B) = P(A ∩ B)/P(B), P(B) ≠ 0].
Similarly, the conditional probability mass function of Y, given X = x, is
defined as
p(y | x) = P[Y = y | X = x] = P[Y = y ∩ X = x] / P[X = x].
P[X = 0 | Y = 2] = P[X = 0 ∩ Y = 2] / P[Y = 2] = (6/27)/(18/27) = 1/3,
P[X = 1 | Y = 2] = P[X = 1 ∩ Y = 2] / P[Y = 2] = (6/27)/(18/27) = 1/3,
P[X = 2 | Y = 2] = P[X = 2 ∩ Y = 2] / P[Y = 2] = (6/27)/(18/27) = 1/3, and
P[X = 3 | Y = 2] = P[X = 3 ∩ Y = 2] / P[Y = 2] = 0/(18/27) = 0.
[Note that values of numerator and denominator in the above expressions have
already been obtained while discussing the joint and marginal probability mass
functions in this section of the unit.]
Independence of Random Variables
Two discrete random variables X and Y are said to be independent if and only
if
P[X = xi ∩ Y = yj] = P[X = xi] P[Y = yj] for all i, j
[∵ two events A and B are independent if and only if P(A ∩ B) = P(A) P(B)].
Example 1: The following table represents the joint probability distribution of
the discrete random variable (X, Y):
Y 1 2
X
1 0.1 0.2
2 0.1 0.3
3 0.2 0.1
Find :
i) The marginal distributions.
ii) The conditional distribution of X given Y = 1.
iii) P[(X + Y) < 4].
Solution:
i) To find the marginal distributions, we have to find the marginal totals, i.e.
row totals and column totals as shown in the following table:
X \ Y           1     2     p(x) (totals)
1               0.1   0.2   0.3
2               0.1   0.3   0.4
3               0.2   0.1   0.3
p(y) (totals)   0.4   0.6   1
ii) P[X = 1 | Y = 1] = P[X = 1, Y = 1] / P[Y = 1] = 0.1/0.4 = 1/4,
P[X = 2 | Y = 1] = P[X = 2, Y = 1] / P[Y = 1] = 0.1/0.4 = 1/4, and
P[X = 3 | Y = 1] = P[X = 3, Y = 1] / P[Y = 1] = 0.2/0.4 = 1/2.
iii) The values of (X, Y) which satisfy X + Y < 4 are (1, 1), (1, 2) and (2, 1)
only.
∴ P[X + Y < 4] = P[X = 1, Y = 1] + P[X = 1, Y = 2] + P[X = 2, Y = 1]
= 0.1 + 0.2 + 0.1 = 0.4.
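All three parts of Example 1 reduce to sums over a small dictionary of joint probabilities. The illustrative sketch below (an aside, not from the original unit) computes the marginals, a conditional distribution, and P[X + Y < 4] exactly:

```python
from fractions import Fraction as F

# Joint distribution of Example 1, stored as {(x, y): p(x, y)}.
joint = {(1, 1): F(1, 10), (1, 2): F(2, 10),
         (2, 1): F(1, 10), (2, 2): F(3, 10),
         (3, 1): F(2, 10), (3, 2): F(1, 10)}

# Marginals: sum the joint probabilities over the other variable.
p_x = {x: sum(p for (i, j), p in joint.items() if i == x) for x in (1, 2, 3)}
p_y = {y: sum(p for (i, j), p in joint.items() if j == y) for y in (1, 2)}

# Conditional distribution of X given Y = 1.
cond = {x: joint[(x, 1)] / p_y[1] for x in (1, 2, 3)}

# Event probability: sum over the pairs satisfying the event.
p_sum_lt4 = sum(p for (x, y), p in joint.items() if x + y < 4)

print(p_x, p_y, cond, p_sum_lt4)
```

The same dictionary pattern works for any small bivariate discrete table, including the exercises E1)-E3) of this unit.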
X \ Y           0     1     p(x)
0               2/9   1/9   3/9
1               1/9   5/9   6/9
p(y)            3/9   6/9   1

P[X = 0] = 3/9, P[X = 1] = 6/9,
P[Y = 0] = 3/9, P[Y = 1] = 6/9.
Now P[X = 0] P[Y = 0] = (3/9)(3/9) = 1/9,
but P[X = 0, Y = 0] = 2/9.
∴ P[X = 0, Y = 0] ≠ P[X = 0] P[Y = 0],
E2) For the following joint probability distribution of (X, Y),
Y 1 2 3
X
1 1 / 20 1 / 10 1 / 10
2 1 / 20 1 / 10 1 / 10
3 1 / 10 1 / 10 1 / 20
4 1 / 10 1 / 10 1 / 20
For a bivariate discrete random variable, the marginal probabilities are obtained as
P[X = x] = P[X = x, Y = y1] + P[X = x, Y = y2] + ... = Σj P[X = x, Y = yj], and
P[Y = y] = P[X = x1, Y = y] + P[X = x2, Y = y] + ... = Σi P[X = xi, Y = y].
Example 3: Considering the joint probability distribution given in
Example 1, find
i) the joint distribution function F(2, 2), and
ii) the marginal distribution function FX(3).
Solution:
i) F(2, 2) = P[X ≤ 2, Y ≤ 2]
= P[X = 1, Y = 1] + P[X = 1, Y = 2] + P[X = 2, Y = 1] + P[X = 2, Y = 2]
= 0.1 + 0.2 + 0.1 + 0.3 = 0.7.
ii) FX(3) = P[X ≤ 3]
= P[X ≤ 3, Y = 1] + P[X ≤ 3, Y = 2]
= [P[X = 1, Y = 1] + P[X = 2, Y = 1] + P[X = 3, Y = 1]]
+ [P[X = 1, Y = 2] + P[X = 2, Y = 2] + P[X = 3, Y = 2]]
= (0.1 + 0.1 + 0.2) + (0.2 + 0.3 + 0.1) = 0.4 + 0.6 = 1.
F(0, 0) = P[X ≤ 0, Y ≤ 0] = P[X = 0, Y = 0] = 2/9,
F(0, 1) = P[X ≤ 0, Y ≤ 1] = P[X = 0, Y = 0] + P[X = 0, Y = 1]
= 2/9 + 1/9 = 3/9,
F(1, 0) = P[X ≤ 1, Y ≤ 0] = P[X = 1, Y = 0] + P[X = 0, Y = 0]
= 1/9 + 2/9 = 3/9,
F(1, 1) = P[X ≤ 1, Y ≤ 1] = P[X = 1, Y = 1] + P[X = 1, Y = 0]
+ P[X = 0, Y = 1] + P[X = 0, Y = 0]
= 5/9 + 1/9 + 1/9 + 2/9 = 1.
The above distribution function F(x, y) can be shown in tabular form as
follows:

F(x, y)   Y = 0   Y = 1
X = 0     2/9     3/9
X = 1     3/9     1

Also,
FX(0) = P[X ≤ 0] = P[X = 0]
= P[X = 0, Y = 0] + P[X = 0, Y = 1] = 2/9 + 1/9 = 3/9, and
FX(1) = P[X ≤ 1] = P[X = 1, Y = 0] + P[X = 1, Y = 1]
+ P[X = 0, Y = 0] + P[X = 0, Y = 1]
= 1/9 + 5/9 + 2/9 + 1/9 = 1.
Hence, the marginal distribution function of X is given as

X    F(x)
0    3/9
1    1

Similarly, the marginal distribution function of Y can be obtained. [Do it yourself]
Here is an exercise for you.
E3) Obtain the joint and marginal distribution functions for the joint
probability distribution given in E 1).
Now, before ending this unit, let us summarize what we have covered in it.
6.5 SUMMARY
In this unit we have covered the following main points:
1) If X and Y are two discrete random variables defined on the sample space
S of a random experiment, then the function (X, Y) defined on the same
sample space is called a two-dimensional discrete random variable. In
other words, (X, Y) is a two-dimensional random variable if the
possible values of (X, Y) are finite or countably infinite.
2) The joint probability mass function p(xi, yj) satisfies
(i) p(xi, yj) ≥ 0, and
(ii) Σi Σj p(xi, yj) = 1.
3) The conditional probability mass function of X, given Y = y, is
p(x | y) = P[X = x ∩ Y = y] / P[Y = y], and that of Y, given X = x, is
p(y | x) = P[Y = y ∩ X = x] / P[X = x].
6.6 SOLUTIONS/ANSWERS
E1) Let us compute the marginal totals. The complete table with
marginal totals is given as

X \ Y   1      2      3      p(x)
1       1/12   1/18   0      1/12 + 1/18 + 0 = 5/36
2       1/6    1/9    1/4    1/6 + 1/9 + 1/4 = 19/36
3       0      1/5    2/15   0 + 1/5 + 2/15 = 1/3
p(y)    1/4    11/30  23/60  1
Therefore,
i) Marginal distribution of X is

X    p(x)
1    5/36
2    19/36
3    1/3

ii) P[Y = 1 | X = 2] = P[Y = 1, X = 2] / P[X = 2] = (1/6) × (36/19) = 6/19,
P[Y = 2 | X = 2] = P[Y = 2, X = 2] / P[X = 2] = (1/9) × (36/19) = 4/19, and
P[Y = 3 | X = 2] = P[Y = 3, X = 2] / P[X = 2] = (1/4) × (36/19) = 9/19.
Thus, the conditional distribution of Y given X = 2 is

Y    p(y | x = 2)
1    6/19
2    4/19
3    9/19
iii) P[X + Y < 5]
= P[X = 1, Y = 1] + P[X = 1, Y = 2] + P[X = 1, Y = 3]
+ P[X = 2, Y = 1] + P[X = 2, Y = 2] + P[X = 3, Y = 1]
= 1/12 + 1/18 + 0 + 1/6 + 1/9 + 0 = 15/36.
E2) First compute the marginal totals; then you will be able to find
i) P[X = 4] = 1/4, and hence
P[Y = 2 | X = 4] = P[Y = 2, X = 4] / P[X = 4] = (1/10)/(1/4) = 2/5,
ii) P[Y = 2] = 2/5,
iii) P[X = 4, Y = 2] = 1/10, P[X = 4] = 1/4, P[Y = 2] = 2/5
∴ P[X = 4] P[Y = 2] = (1/4) × (2/5) = 1/10 = P[X = 4, Y = 2]
⇒ the events X = 4 and Y = 2 are independent.
E3) To obtain the joint distribution function F(x, y) = P[X ≤ x, Y ≤ y], we have
to obtain F(x, y) for each pair of values of X and Y by cumulating the
joint probabilities, e.g.
F(1, 1) = P[X ≤ 1, Y ≤ 1] = p(1, 1) = 1/12, and
F(2, 2) = p(1, 1) + p(1, 2) + p(2, 1) + p(2, 2) = 1/12 + 1/18 + 1/6 + 1/9 = 5/12.
The marginal distribution function of X is given as

X    F(x)
1    5/36
2    5/36 + 19/36 = 2/3
3    1
UNIT 7 BIVARIATE CONTINUOUS RANDOM VARIABLES
Structure
7.1 Introduction
Objectives
7.2 Bivariate Continuous Random Variables
7.3 Joint and Marginal Distribution and Density Functions
7.4 Conditional Distribution and Density Functions
7.5 Stochastic Independence of Two Continuous Random Variables
7.6 Problems on Two-Dimensional Continuous Random Variables
7.7 Summary
7.8 Solutions/Answers
7.1 INTRODUCTION
In Unit 6, we have defined the bivariate discrete random variable (X, Y), where
X and Y both are discrete random variables. It may also happen that one of the
random variables is discrete and the other is continuous. However, in most
applications we deal only with the cases where either both random variables are
discrete or both are continuous. The cases where both random variables are
discrete have already been discussed in Unit 6. Here, in this unit, we are going
to discuss the cases where both random variables are continuous.
In Unit 6, you have studied the joint, marginal and conditional probability
functions and distribution functions in context of bivariate discrete random
variables. Similar functions, but in context of bivariate continuous random
variables, are discussed in this unit.
Bivariate continuous random variable is defined in Sec. 7.2. Joint and marginal
density functions are described in Sec. 7.3. Sec. 7.4 deals with the conditional
distribution and density functions. Independence of two continuous random
variables is dealt with in Sec. 7.5. Some practical problems on two-dimensional
continuous random variables are taken up in Sec. 7.6.
Objectives
A study of this unit would enable you to:
define two-dimensional continuous random variable;
specify the joint and marginal probability density functions of two
continuous random variables;
obtain the conditional density and distribution functions for two-
dimensional continuous random variable;
check the independence of two continuous random variables; and
solve various practical problems on bivariate continuous random
variables.
7.2 BIVARIATE CONTINUOUS RANDOM VARIABLES
Definition: If X and Y are continuous random variables defined on the sample
space S of a random experiment, then (X, Y) defined on the same sample space
S is called bivariate continuous random variable if (X, Y) assigns a point in
xy-plane defined on the sample space S. Notice that it (unlike discrete random
variable) assumes values in some non-countable set. Some examples of
bivariate continuous random variable are:
1. A gun is aimed at a certain point (say origin of the coordinate system).
Because of the random factors, suppose the actual hit point is any point
(X, Y) in a circle of radius unity about the origin.
[Fig. 7.1: the actual hit point (X, Y) lies within the circle of radius 1 about the origin]
2. (X, Y) may assume all values in a rectangle, as illustrated below.
[Fig. 7.2: (X, Y) Assuming All Values in the Rectangle {(x, y) : a ≤ x ≤ b, c ≤ y ≤ d}]
3. In a statistical survey, let X denote the daily number of hours a child
watches television and Y the number of hours he/she spends on
studies. Here, (X, Y) is a two-dimensional continuous random variable.
7.3 JOINT AND MARGINAL DISTRIBUTION AND DENSITY FUNCTIONS
Two-Dimensional Continuous Distribution Function
The distribution function of a two-dimensional continuous random variable
(X, Y) is a real-valued function and is defined as
F(x, y) = P[X ≤ x, Y ≤ y] for all real x and y.
In terms of the joint density function f(x, y),
F(x, y) = ∫(−∞ to x) ∫(−∞ to y) f(x, y) dy dx.
Remark 2:
As in the one-dimensional case, f(x, y) does not itself represent the probability of
anything. However, for positive δx and δy sufficiently small, f(x, y) δx δy is
approximately equal to
P[x ≤ X ≤ x + δx, y ≤ Y ≤ y + δy].
In the one-dimensional case, you have studied that for positive δx sufficiently
small, f(x) δx is approximately equal to P[x ≤ X ≤ x + δx]. So, the two-
dimensional case is in analogy with the one-dimensional case.
Remark 3:
In analogy with the one-dimensional case [see Sec. 5.4 of Unit 5 of this course],
f(x, y) can be written as
f(x, y) = lim (δx → 0, δy → 0) P[x ≤ X ≤ x + δx, y ≤ Y ≤ y + δy] / (δx δy),
and is equal to
∂²F(x, y)/∂x∂y, i.e. the second-order partial derivative of F(x, y) with respect to x and y.
43
d
Random Variables and
Expectation
[See Sec. 5.5 of Unit 5 where f x
dx
F x ]
2
Note:
xy
F x, y means first differentiate F x, y partially w.r.t. y and
then the resulting function w.r.t. x. When we differentiate a function partially
w.r.t. one variable, then the other variable is treated as constant
For example, Let F x, y xy3 x 2 y
The marginal probability density function of X is given as
f(x) = ∫(−∞ to ∞) f(x, y) dy,
or it may also be obtained as
f(x) = (d/dx) F(x),
and the marginal probability density function of Y is given as
f(y) = ∫(−∞ to ∞) f(x, y) dx,
or
f(y) = (d/dy) F(y).
The conditional probability density function f(y | x) = f(x, y)/f(x) satisfies
i) f(y | x) is clearly ≥ 0, and
ii) ∫ f(y | x) dy = ∫ [f(x, y)/f(x)] dy = (1/f(x)) ∫ f(x, y) dy
[∵ ∫ f(x, y) dy is the marginal probability density function f(x) of X]
= f(x)/f(x) = 1.
Similarly, f(x | y) satisfies
i) f(x | y) ≥ 0, and
ii) ∫ f(x | y) dx = 1.
Two continuous random variables X and Y are independent if and only if
f(y | x) = f(y), or equivalently f(x | y) = f(x),
i.e. f(x, y) = f(x) f(y) [on cross-multiplying].
Example 1: The joint density function of a two-dimensional random variable
(X, Y) is given by
f(x, y) = k(2x + y), 0 < x < 1, 0 < y < 2
        = 0, elsewhere.
Find the value of k.
Solution: As f(x, y) is a joint density function,
∫(0 to 1) ∫(0 to 2) f(x, y) dy dx = 1   [∵ 0 < x < 1, 0 < y < 2]
⇒ ∫(0 to 1) ∫(0 to 2) k(2x + y) dy dx = 1
⇒ k ∫(0 to 1) [2xy + y²/2] (0 to 2) dx = 1
⇒ k ∫(0 to 1) [(4x + 2) − 0] dx = 1
⇒ k [2x² + 2x] (0 to 1) = 1
⇒ k (2 + 2 − 0) = 1 ⇒ 4k = 1 ⇒ k = 1/4.
Example 2: Let the joint density function of a two-dimensional random
variable (X, Y) be:
f(x, y) = x + y, for 0 < x < 1 and 0 < y < 1
        = 0, otherwise.
Obtain the marginal density function of X.
Solution: The marginal density function of X is
f(x) = ∫(0 to 1) f(x, y) dy   [∵ 0 < y < 1]
= ∫(0 to 1) (x + y) dy
= [xy + y²/2] (0 to 1)
= x (1 − 0) + 1/2 = x + 1/2, 0 < x < 1.
Example 3: The joint density function of a two-dimensional random variable
(X, Y) is given by
f(x, y) = 8xy, 0 < x < y < 1
        = 0, otherwise.
Find
i) P[X < 1/2 ∩ Y < 1/4],
ii) the marginal and conditional density functions, and
iii) whether X and Y are independent.
Solution:
i) P[X < 1/2 ∩ Y < 1/4] = ∫(0 to 1/2) ∫(0 to 1/4) f(x, y) dy dx
= ∫(0 to 1/2) ∫(0 to 1/4) 8xy dy dx
= ∫(0 to 1/2) 8x [y²/2] (0 to 1/4) dx = ∫(0 to 1/2) (x/4) dx
= [x²/8] (0 to 1/2) = (1/4) × (1/8) = 1/32.
ii) The marginal density function of X is
f(x) = ∫(x to 1) 8xy dy   [∵ x < y < 1]
= 8x [y²/2] (x to 1) = 4x(1 − x²) for 0 < x < 1.
The marginal density function of Y is
f(y) = ∫(0 to y) f(x, y) dx   [∵ 0 < x < y]
= ∫(0 to y) 8xy dx = 8y [x²/2] (0 to y) = 4y³ for 0 < y < 1.
The conditional density function of X given Y (0 < Y < 1) is
f(x | y) = f(x, y)/f(y) = 8xy/(4y³) = 2x/y², 0 < x < y,
and the conditional density function of Y given X (0 < X < 1) is
f(y | x) = f(x, y)/f(x) = 8xy/[4x(1 − x²)] = 2y/(1 − x²), x < y < 1.
iii) f(x, y) = 8xy,
but f(x) f(y) = 4x(1 − x²) × 4y³ = 16x(1 − x²)y³.
∴ f(x, y) ≠ f(x) f(y).
Hence, X and Y are not independent random variables.
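A quick sanity check on part (ii) is that each marginal density integrates to 1 over its range, and that the marginal of X really equals the y-integral of the joint density. The following illustrative sketch (an aside, not from the original unit) verifies this numerically:

```python
def integrate(f, a, b, n=100_000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

f_x = lambda x: 4 * x * (1 - x * x)   # marginal density of X from Example 3
f_y = lambda y: 4 * y**3              # marginal density of Y from Example 3

mx = integrate(f_x, 0, 1)             # should be ~1
my = integrate(f_y, 0, 1)             # should be ~1

# Spot-check f_x at x = 0.3 against the defining integral of 8xy over y in (x, 1).
fx_at = integrate(lambda y: 8 * 0.3 * y, 0.3, 1)
print(mx, my, fx_at, f_x(0.3))
```

Both marginals integrate to 1, and the spot check at x = 0.3 reproduces f(x) = 4x(1 − x²) to numerical accuracy.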
Now, you can try some exercises.
E1) Let X and Y be two random variables. Then, for
f(x, y) = kxy, for 0 ≤ x ≤ 4 and 1 ≤ y ≤ 5
        = 0, otherwise
to be a joint density function, what must be the value of k?
E2) If the joint p.d.f. of a two-dimensional random variable (X, Y) is given by
f(x, y) = 2, for 0 < x < 1 and 0 < y < x
        = 0, otherwise,
then
i) find the marginal density functions of X and Y;
ii) find the conditional density functions; and
iii) check for independence of X and Y.
E3) If (X, Y) is a two-dimensional random variable having the joint density
function
f(x, y) = (1/8)(6 − x − y), 0 ≤ x ≤ 2, 2 ≤ y ≤ 4
        = 0, elsewhere,
find P[X < 1 ∩ Y < 3] and P[X < 1 | Y < 3].
Now before ending this unit, let’s summarize what we have covered in it.
7.7 SUMMARY
In this unit, we have covered the following main points:
1) If X and Y are continuous random variables defined on the sample space S of
a random experiment, then (X, Y) defined on the same sample space S is
called bivariate continuous random variable if (X, Y) assigns a point in
xy -plane defined on the sample space S.
2) The distribution function of a two-dimensional continuous random variable
(X, Y) is a real-valued function and is defined as
F(x, y) = P[X ≤ x, Y ≤ y] for all real x and y, and
F(x, y) = ∫(−∞ to x) ∫(−∞ to y) f(x, y) dy dx.
3) The joint probability density function f(x, y) satisfies
i) f(x, y) ≥ 0, and
ii) ∫(−∞ to ∞) ∫(−∞ to ∞) f(x, y) dy dx = 1.
4) The marginal distribution function of the continuous random variable X is
defined as
F(x) = P[X ≤ x] = ∫(−∞ to x) [∫(−∞ to ∞) f(x, y) dy] dx,
and that of the continuous random variable Y is defined as
F(y) = P[Y ≤ y] = ∫(−∞ to y) [∫(−∞ to ∞) f(x, y) dx] dy.
5) The marginal probability density function of X is given as
f(x) = ∫(−∞ to ∞) f(x, y) dy = (d/dx) F(x),
51
7.8 SOLUTIONS/ANSWERS
E1) As f(x, y) is the joint probability density function,
    ∫∫ f(x, y) dy dx = 1
    ⇒ ∫_0^4 ∫_1^5 kxy dy dx = 1 ⇒ k ∫_0^4 ∫_1^5 xy dy dx = 1
    ⇒ k ∫_0^4 x [y²/2]_1^5 dx = 1 ⇒ k ∫_0^4 12x dx = 1
    ⇒ 12k [x²/2]_0^4 = 1 ⇒ 96k = 1
    ⇒ k = 1/96
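As a quick numerical cross-check of E1 (an illustrative sketch, not part of the original solution), the total probability k·∫∫ xy dy dx over the rectangle must equal 1, which forces k = 1/96:

```python
def double_integral(g, ax, bx, ay, by, n=400):
    # midpoint rule on an n-by-n grid
    hx, hy = (bx - ax) / n, (by - ay) / n
    total = 0.0
    for i in range(n):
        x = ax + (i + 0.5) * hx
        for j in range(n):
            y = ay + (j + 0.5) * hy
            total += g(x, y)
    return total * hx * hy

mass = double_integral(lambda x, y: x * y, 0, 4, 1, 5)   # close to 96
k = 1 / mass
```

The midpoint rule is exact here because xy is linear in each variable, so mass comes out as 96 up to floating-point error.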
E2) i) Marginal density function of Y is given by
    f(y) = ∫ f(x, y) dx = ∫_y^1 2 dx
    [As x is involved in both the given ranges, i.e. 0 < x < 1 and 0 < y < x,
    we combine both these intervals and hence have 0 < y < x < 1;
    x takes the values from y to 1.]
         = 2[x]_y^1 = 2 − 2y
         = 2(1 − y),  0 < y < 1
    Marginal density function of X is given by
    f(x) = ∫ f(x, y) dy
         = ∫_0^x 2 dy          [since 0 < y < x < 1]
         = 2[y]_0^x
         = 2x,  0 < x < 1.
ii) Conditional density function of Y given X (0 < X < 1) is
    f(y | x) = f(x, y)/f(x) = 2/(2x) = 1/x;  0 < y < x
    Similarly, conditional density function of X given Y (0 < Y < 1) is
    f(x | y) = f(x, y)/f(y) = 2/[2(1 − y)] = 1/(1 − y);  y < x < 1
iii) f(x)·f(y) = 2x · 2(1 − y) = 4x(1 − y)
    As f(x, y) ≠ f(x)·f(y),
    X and Y are not independent random variables.
E3) i) P(X < 1 ∩ Y < 3) = ∫_0^1 ∫_2^3 f(x, y) dy dx
    = ∫_0^1 ∫_2^3 (1/8)(6 − x − y) dy dx
    = (1/8) ∫_0^1 [6y − xy − y²/2]_2^3 dx
    = (1/8) ∫_0^1 [(18 − 3x − 9/2) − (12 − 2x − 2)] dx
    = (1/8) ∫_0^1 (18 − 3x − 9/2 − 10 + 2x) dx
    = (1/8) ∫_0^1 (7/2 − x) dx
    = (1/8)[7x/2 − x²/2]_0^1 = (1/8)(7/2 − 1/2) = 3/8
ii) P(X < 1 | Y < 3) = P(X < 1 ∩ Y < 3) / P(Y < 3),
where P(Y < 3) = ∫_0^2 ∫_2^3 (1/8)(6 − x − y) dy dx
    = (1/8) ∫_0^2 [6y − xy − y²/2]_2^3 dx
    = (1/8) ∫_0^2 (18 − 3x − 9/2 − 12 + 2x + 2) dx
    = (1/8) ∫_0^2 (7/2 − x) dx
    = (1/8)[7x/2 − x²/2]_0^2
    = (1/8)(7 − 2) = 5/8
Hence,
P(X < 1 | Y < 3) = (3/8)/(5/8)          [value of numerator is already calculated in part (i)]
                 = 3/5
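The two probabilities of E3 can also be checked numerically. The sketch below (illustrative, not from the original text) integrates the given density over the relevant rectangles with the midpoint rule:

```python
def integrate2d(g, ax, bx, ay, by, n=400):
    # midpoint rule on an n-by-n grid
    hx, hy = (bx - ax) / n, (by - ay) / n
    return sum(
        g(ax + (i + 0.5) * hx, ay + (j + 0.5) * hy)
        for i in range(n) for j in range(n)
    ) * hx * hy

def f(x, y):
    # joint density of E3 on 0 < x < 2, 2 < y < 4
    return (6 - x - y) / 8

p_num = integrate2d(f, 0, 1, 2, 3)    # P[X < 1, Y < 3]  -> 3/8
p_den = integrate2d(f, 0, 2, 2, 3)    # P[Y < 3]         -> 5/8
p_cond = p_num / p_den                # P[X < 1 | Y < 3] -> 3/5
```

Because the integrand is linear in x and y, the midpoint rule reproduces 3/8, 5/8 and 3/5 essentially exactly.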
UNIT 8 MATHEMATICAL EXPECTATION
Structure
8.1 Introduction
Objectives
8.1 INTRODUCTION
In Units 1 to 4 of this course, you have studied probabilities of different events
in various situations. The concept of a univariate random variable was
introduced in Unit 5, and that of a bivariate random variable in Units 6 and
7. Before studying the present unit, we advise you to go through those
units.
You have studied the methods of finding mean, variance and other measures in
the context of frequency distributions in MST-002 (Descriptive Statistics). Here,
in this unit, we discuss mean, variance and other measures in the context of
probability distributions of random variables. The mean or average value of a
random variable, taken over all its possible values, is called the expected value
or the expectation of the random variable. In the present unit, we discuss the
expectations of random variables and their properties.
In Secs. 8.2, 8.3 and 8.4, we deal with expectation and its properties. Addition
and multiplication laws of expectation have been discussed in Sec. 8.5.
Objectives
After studying this unit, you would be able to:
find the expected values of random variables;
establish the properties of expectation;
obtain various measures for probability distributions; and
apply laws of addition and multiplication of expectation at appropriate
situations.
55
Random Variables and n
Expectation
f x
i 1
i i
Mean = n
.
fi
i 1
f x
i 1
i i
f1x1 f 2 x 2 ... f n x n
Mean = n
= n
fi
i 1
f i 1
i
x1f1 x 2 f2 xn fn
= n
n
... n
fi
i 1
fi
i 1
f
i 1
i
f f f
= x1 n i x2 n 2 ... x n n n
f f f
i i i
i 1 i 1 i1
f1 f2 fn
Notice that n
, n
,..., n
are, in fact, the relative frequencies or the
f f
i 1
i
i 1
i f
i 1
i
Mean of a frequency distribution of X is (Σ_{i=1}^n x_i f_i)/(Σ_{i=1}^n f_i);
similarly, the mean of a probability distribution of a r.v. X is
(Σ_{i=1}^n x_i p_i)/(Σ_{i=1}^n p_i).

Now, as we know that Σ_{i=1}^n p_i = 1 for a probability distribution, the
mean of the probability distribution becomes Σ_{i=1}^n x_i p_i.

Expected value of a random variable X is E(X) = Σ_{i=1}^n x_i p_i.

The above formula for finding the expected value of a random variable X
is used only if X is a discrete random variable which takes the values
x_1, x_2, ..., x_n with probability mass function
p(x_i) = P[X = x_i], i = 1, 2, ..., n.
For example, if X is the number of heads in two tosses of a fair coin, taking
the values 0, 1, 2 with probabilities 1/4, 1/2, 1/4, then
E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3
     = 0 × (1/4) + 1 × (1/2) + 2 × (1/4) = 0 + 1/2 + 1/2 = 1.
So, we get the same answer, i.e. 1, using the formula also.
So, expectation of a random variable is nothing but the average (mean)
taken over all the possible values of the random variable or it is the value
which we get on an average when a random experiment is performed
repeatedly.
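The definition E(X) = Σ x_i p_i translates directly into code. The following Python sketch (illustrative, not part of the original text) computes it for a hypothetical toy distribution — the number of heads in two tosses of a fair coin:

```python
def expectation(values, probs):
    # E(X) = sum of x_i * p_i over all possible values
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return sum(x * p for x, p in zip(values, probs))

# number of heads in two tosses of a fair coin: 0, 1, 2
e = expectation([0, 1, 2], [0.25, 0.5, 0.25])   # = 1.0
```

The same helper applies to any finite discrete distribution, e.g. the examples that follow in this section.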
Remark 1: Sometimes the summations and integrals considered in the
above definitions may not be convergent, and hence expectations in such
cases do not exist. But we will deal only with those summations (series) and
integrals which are convergent, as checking the convergence of a series or
integral is beyond the scope of this course. You need not bother about
whether the series or integral is convergent, i.e. whether the expectation
exists, as we are dealing only with expectations which exist.
Example 1: If it rains, a rain coat dealer can earn Rs 500 per day. If it is a dry
day, he can lose Rs 100 per day. What is his expectation, if the probability of
rain is 0.4?
Solution: Let X be the amount earned on a day by the dealer. Therefore, X can
take the values Rs 500 and Rs −100 (a loss of Rs 100 is equivalent to a negative
earning of Rs 100).
Probability distribution of X is given as
              Rainy day    Dry day
X in Rs :       500         −100
p(x)    :       0.4          0.6
Hence, E(X) = 500 × 0.4 + (−100) × 0.6 = 200 − 60 = 140.
Thus, his expectation is Rs 140, i.e. on an average he earns Rs 140 per day.
Example 2: A person tosses two fair coins. He wins Rs 5 if two heads appear,
Rs 2 if one head appears and Rs 1 if no head appears. Find the expected
amount won by him.
Solution: Here
P[2 heads] = 1/4, P[one head] = 2/4, P[no head] = 1/4.
Let X be the amount in rupees won by him.
X can take the values 5, 2 and 1 with
P[X = 5] = P[2 heads] = 1/4,
P[X = 2] = P[1 head] = 2/4, and
P[X = 1] = P[no head] = 1/4.
Probability distribution of X is
X :      5     2     1
p(x) :  1/4   2/4   1/4
Expected value of X is given as
E(X) = Σ_{i=1}^3 x_i p_i = x_1 p_1 + x_2 p_2 + x_3 p_3
     = 5 × (1/4) + 2 × (2/4) + 1 × (1/4) = (5 + 4 + 1)/4 = 10/4 = 2.5.
Thus, the expected value of the amount won by him is Rs 2.5.
Example 3: Find the expectation of the number on an unbiased die when thrown.
Solution: Let X be a random variable representing the number on a die when thrown.
X can take the values 1, 2, 3, 4, 5, 6 with
P[X = 1] = P[X = 2] = P[X = 3] = P[X = 4] = P[X = 5] = P[X = 6] = 1/6.
Thus, the probability distribution of X is given by
X :      1    2    3    4    5    6
p(x) :  1/6  1/6  1/6  1/6  1/6  1/6
Hence, the expectation of the number on the die when thrown is
E(X) = Σ_{i=1}^6 x_i p_i = 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6)
     = 21/6 = 7/2.
Example 4: Two cards are drawn successively with replacement from a
well shuffled pack of 52 cards. Find the expected value for the number of
aces.
Solution: Let A1, A2 be the events of getting ace in first and second draws,
respectively. Let X be the number of aces drawn. Thus, X can take the
values 0, 1, 2 with
P[X = 0] = P[no ace] = P(A̅₁ ∩ A̅₂)
         = (48/52)(48/52) = (12/13)(12/13) = 144/169,
P[X = 1] = P[one ace and one other card]
         = P[(A₁ ∩ A̅₂) ∪ (A̅₁ ∩ A₂)]
         = P(A₁)P(A̅₂) + P(A̅₁)P(A₂)          [draws are independent]
         = (4/52)(48/52) + (48/52)(4/52) = 12/169 + 12/169 = 24/169, and
P[X = 2] = P[both aces] = P(A₁ ∩ A₂) = P(A₁)P(A₂)
         = (4/52)(4/52) = 1/169.
Hence, the expected number of aces is
E(X) = 0 × (144/169) + 1 × (24/169) + 2 × (1/169) = 26/169 = 2/13.
Now, you can try the following exercises.
E1) You toss a fair coin. If the outcome is head, you win Rs 100; if the
outcome is tail, you win nothing. What is the expected amount won
by you?
E2) A fair coin is tossed until a tail appears. What is the expectation of the
number of tosses?
E3) The distribution of a continuous random variable X is defined by
    f(x) = x³,        0 ≤ x ≤ 1
         = (2 − x)³,  1 < x ≤ 2
         = 0,         elsewhere
Obtain the expected value of X.
Discrete Case:
Let X be a discrete random variable taking values x_i with probabilities p_i. Then,
1. E(k) = Σ_i k p_i = k Σ_i p_i = k(1) = k          [since Σ_i p_i = 1]
2. E(kX) = Σ_i k x_i p_i = k Σ_i x_i p_i = k E(X)
3. E(aX + b) = Σ_i (a x_i + b) p_i          [By def.]
   = Σ_i (a x_i p_i + b p_i) = a Σ_i x_i p_i + b Σ_i p_i
   = a E(X) + b(1) = a E(X) + b
Continuous Case:
Let X be a continuous random variable having f(x) as its probability density
function. Then,
1. E(k) = ∫ k f(x) dx          [By def.]
   = k ∫ f(x) dx = k(1) = k
2. E(kX) = ∫ k x f(x) dx = k ∫ x f(x) dx = k E(X)
3. E(aX + b) = ∫ (ax + b) f(x) dx = ∫ ax f(x) dx + ∫ b f(x) dx
   = a ∫ x f(x) dx + b ∫ f(x) dx = a E(X) + b(1) = a E(X) + b
Find i) E(X)
ii) E(2X + 3)
iii) E(X2)
iv) E(4X – 5)
Solution
i) E(X) = Σ_{i=1}^5 x_i p_i = x_1 p_1 + x_2 p_2 + x_3 p_3 + x_4 p_4 + x_5 p_5
Let us now express the moments and other measures for a random variable
in terms of expectations in the following section.
So, the r-th order moment about any point ‘A’ of a random variable X
having probability mass function P[X = x_i] = p(x_i) = p_i is defined as

μ_r′(A) = (Σ_{i=1}^n p_i (x_i − A)^r) / (Σ_{i=1}^n p_i) = Σ_{i=1}^n p_i (x_i − A)^r          [since Σ p_i = 1]

Variance
Variance of a random variable X is the second order central moment and is
defined as
μ₂ = V(X) = E[X − E(X)]²
Also, we know that
V(X) = μ₂′ − (μ₁′)²
V(aX + b) = E[(aX + b) − E(aX + b)]² = E[aX − aE(X)]²
= a² E[X − E(X)]² = a² V(X)          [Using property 2 of Section 8.3]
Cor. (i) V(aX) = a² V(X)
(ii) V(b) = 0
(iii) V(X + b) = V(X)
Covariance
Covariance of two random variables X and Y is defined as
Cov(X, Y) = E[(X − E(X))(Y − E(Y))] = Σ_i Σ_j p_ij (x_i − E(X))(y_j − E(Y)),
where p_ij = P[X = x_i, Y = y_j].
Proof: V(X + Y) = E[(X + Y) − E(X + Y)]²
= E[(X + Y) − E(X) − E(Y)]²
= E[(X − E(X)) + (Y − E(Y))]²
= E[(X − E(X))² + (Y − E(Y))² + 2(X − E(X))(Y − E(Y))]
= E[X − E(X)]² + E[Y − E(Y)]² + 2E[(X − E(X))(Y − E(Y))]
= V(X) + V(Y) + 2Cov(X, Y)
If X and Y are independent, Cov(X, Y) = 0 and hence
V(X + Y) = V(X) + V(Y).
More generally, for independent X and Y,
V(aX + bY) = a² V(X) + b² V(Y).
Mean Deviation
Mean deviation about the mean for a frequency distribution is
(Σ_{i=1}^n f_i |x_i − x̄|) / (Σ_{i=1}^n f_i),
and hence for a probability distribution of a r.v. X it is
(Σ_{i=1}^n p_i |x_i − Mean|) / (Σ_{i=1}^n p_i) = Σ_{i=1}^n p_i |x_i − Mean|          [since Σ p_i = 1]
Thus, mean deviation about mean
= Σ p_i |x_i − Mean| = E|X − Mean|         for a discrete r.v.
= ∫ |x − Mean| f(x) dx = E|X − Mean|       for a continuous r.v.
Note: Other measures defined for frequency distributions in MST-002 can also
be defined for probability distributions, and hence can be expressed in terms
of expectations in the same manner as the moments, variance and covariance
have been defined in this section of the unit.
Example 7: Considering the probability distribution given in Example 6, obtain
i) V(X)
ii) V(2X + 3).
Solution:
2
(i) V X E X 2 E X
Solution: V(3X + 4Y) = (3)2 V(X) + (4)2 V(Y) [By Remark 3 of Section 8.4]
= 9(2) + 16(3) = 18 + 48 = 66
Addition Theorem of Expectation
Theorem 8.2: If X and Y are random variables, then E X Y E X E Y
Proof:
Discrete case:
Let (X, Y) be a discrete two-dimensional random variable which takes up
the values (xi, yj) with the joint probability mass function
pij = P X x i Y y j .
Let p_i = p(x_i) = P[X = x_i] = Σ_j p_ij and p_j = p(y_j) = P[Y = y_j] = Σ_i p_ij
be the marginal probability mass functions of X and Y respectively. Then
E(X) = Σ_i x_i p_i,  E(Y) = Σ_j y_j p_j  and  E(X + Y) = Σ_i Σ_j (x_i + y_j) p_ij
Now E(X + Y) = Σ_i Σ_j (x_i + y_j) p_ij
= Σ_i Σ_j x_i p_ij + Σ_i Σ_j y_j p_ij
= Σ_i x_i (Σ_j p_ij) + Σ_j y_j (Σ_i p_ij)
[since in the first term of the right hand side, x_i is free from j and hence can
be taken outside the summation over j; and in the second term of the right
hand side, y_j is free from i and hence can be taken outside the summation
over i.]
⇒ E(X + Y) = Σ_i x_i p_i + Σ_j y_j p_j = E(X) + E(Y)
Continuous Case:
Let (X, Y) be a bivariate continuous random variable with probability
density function f x, y . Let f x and f y be the marginal
probability density functions of random variables X and Y respectively.
Here
E(X) = ∫ x f(x) dx,  E(Y) = ∫ y f(y) dy,
and E(X + Y) = ∫∫ (x + y) f(x, y) dy dx.
Now, E(X + Y) = ∫∫ (x + y) f(x, y) dy dx
= ∫∫ x f(x, y) dy dx + ∫∫ y f(x, y) dy dx
= ∫ x [∫ f(x, y) dy] dx + ∫ y [∫ f(x, y) dx] dy
[since in the first term of R.H.S., x is free from the integral w.r.t. y and
hence can be taken outside this integral. Similarly, in the second term of
R.H.S., y is free from the integral w.r.t. x and hence can be taken outside
this integral.]
= ∫ x f(x) dx + ∫ y f(y) dy     [Refer to the definition of marginal density function given in Unit 7 of this course]
= E(X) + E(Y)
Remark 3: The result can be similarly extended for more than two random
variables.
Multiplication Theorem of Expectation
Theorem 8.3: If X and Y are independent random variables, then
E(XY) = E(X) E(Y)
Proof:
Discrete Case:
Let (X, Y) be a two-dimensional discrete random variable which takes up the
values x i , y j with the joint probability mass function
p_ij = P[X = x_i ∩ Y = y_j]. Let p_i and p_j′ be the marginal probability mass
functions of X and Y respectively. Then
E(X) = Σ_i x_i p_i,  E(Y) = Σ_j y_j p_j′, and
E(XY) = Σ_i Σ_j x_i y_j p_ij
As X and Y are independent,
p_ij = P[X = x_i ∩ Y = y_j] = P[X = x_i] P[Y = y_j]
       [if events A and B are independent, then P(A ∩ B) = P(A) P(B)]
     = p_i p_j′
Hence, E(XY) = Σ_i Σ_j x_i y_j p_i p_j′
= (Σ_i x_i p_i)(Σ_j y_j p_j′)
= E(X) E(Y)
Continuous Case:
Let (X, Y) be a bivariate continuous random variable with joint probability
density function f(x, y), and let f(x) and f(y) be the marginal density
functions of X and Y respectively. Then
E(X) = ∫ x f(x) dx,  E(Y) = ∫ y f(y) dy,
and E(XY) = ∫∫ xy f(x, y) dy dx.
Now E(XY) = ∫∫ xy f(x, y) dy dx
= ∫∫ xy f(x) f(y) dy dx          [X and Y are independent ⇒ f(x, y) = f(x) f(y); see Unit 7 of this course]
= ∫ x f(x) [∫ y f(y) dy] dx
= [∫ x f(x) dx] [∫ y f(y) dy]
= E(X) E(Y)
Remark 4: The result can be similarly extended for more than two
random variables.
Example 8: Two unbiased dice are thrown. Find the expected value of
the sum of number of points on them.
Solution: Let X be the number obtained on the first die and Y be the
number obtained on the second die, then
E(X) = 7/2 and E(Y) = 7/2          [See Example 3 given in Section 8.2]
The required expected value = E(X + Y)
= E(X) + E(Y)          [Using addition theorem of expectation]
= 7/2 + 7/2 = 7
Remark 5: This example can also be done considering one random
variable only as follows:
Let X be the random variable denoting “the sum of numbers of points on
the dice”, then the probability distribution in this case is
X :     2    3    4    5    6    7    8    9    10   11   12
p(x) : 1/36 2/36 3/36 4/36 5/36 6/36 5/36 4/36 3/36 2/36 1/36
and hence E(X) = 2(1/36) + 3(2/36) + ... + 12(1/36) = 252/36 = 7
Example 9: Two cards are drawn one by one with replacement from 8
cards numbered from 1 to 8. Find the expectation of the product of the
numbers on the drawn cards.
Solution: Let X be the number on the first card and Y be the number on
the second card. Then the probability distribution of X (and likewise of Y) is
X :     1    2    3    4    5    6    7    8
p(x) : 1/8  1/8  1/8  1/8  1/8  1/8  1/8  1/8
E(X) = E(Y) = 1(1/8) + 2(1/8) + ... + 8(1/8)
     = (1/8)(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8) = 36/8 = 9/2
Thus, the required expected value is
E(XY) = E(X) E(Y)          [Using multiplication theorem of expectation]
      = (9/2)(9/2) = 81/4.
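Because the two draws are with replacement, the 64 ordered pairs of card numbers are equally likely, so the multiplication theorem can be checked by brute-force enumeration. The sketch below is illustrative and not part of the original text:

```python
# all 64 equally likely ordered pairs (with replacement) of cards 1..8
pairs = [(x, y) for x in range(1, 9) for y in range(1, 9)]

# average of the product over all pairs = E(XY)
e_xy = sum(x * y for x, y in pairs) / len(pairs)   # 81/4 = 20.25
```

The enumerated average agrees with E(X)E(Y) = (9/2)(9/2) = 81/4, as the multiplication theorem predicts for independent draws.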
Expectation of a Linear Combination of Random Variables
Theorem 8.4: Let X_1, X_2, ..., X_n be any n random variables and let
a_1, a_2, ..., a_n be any n constants; then
E(a_1X_1 + a_2X_2 + ... + a_nX_n) = a_1E(X_1) + a_2E(X_2) + ... + a_nE(X_n).
[Note: Here a_1X_1 + a_2X_2 + ... + a_nX_n is a linear combination of X_1, X_2, ..., X_n]
Now before ending this unit, let’s summarize what we have covered in it.
8.6 SUMMARY
The following main points have been covered in this unit:
1) Expected value of a random variable X is defined as
E(X) = Σ_{i=1}^n x_i p_i, if X is a discrete random variable
     = ∫ x f(x) dx, if X is a continuous random variable.
2) If X_1, X_2, ..., X_n are any n random variables and a_1, a_2, ..., a_n are any n
constants, then
E(a_1X_1 + a_2X_2 + ... + a_nX_n) = a_1E(X_1) + a_2E(X_2) + ... + a_nE(X_n).
3) The r-th order moment about a point ‘A’ of a random variable X is
μ_r′(A) = Σ_i p_i (x_i − A)^r, if X is a discrete r.v.
        = ∫ (x − A)^r f(x) dx, if X is a continuous r.v.
        = E(X − A)^r
4) Variance of a random variable X is given as
V(X) = E[X − E(X)]² = E(X²) − [E(X)]²
5) Covariance of X and Y is given as
Cov(X, Y) = E[(X − E(X))(Y − E(Y))]
8.7 SOLUTIONS/ANSWERS
E1) Let X be the amount (in rupees) won by you.
X can take the values 100 and 0 with P[X = 100] = P[Head] = 1/2, and
P[X = 0] = P[Tail] = 1/2.
Probability distribution of X is
X :     100    0
p(x) :  1/2   1/2
and hence the expected amount won by you is
E(X) = 100 × (1/2) + 0 × (1/2) = 50.
2 2
E2) Let X be the number of tosses till a tail turns up.
X can take values 1, 2, 3, 4, ... with
P[X = 1] = P[Tail in the first toss] = 1/2,
P[X = 2] = P[Head in the first and tail in the second toss] = (1/2)(1/2) = (1/2)²,
P[X = 3] = P[HHT] = (1/2)(1/2)(1/2) = (1/2)³, and so on.
Probability distribution of X is
X :     1      2       3       4       5 ...
p(x) : 1/2  (1/2)²  (1/2)³  (1/2)⁴  (1/2)⁵ ...
and hence
E(X) = 1(1/2) + 2(1/2)² + 3(1/2)³ + 4(1/2)⁴ + ...          … (1)
Multiplying both sides by 1/2, we get
(1/2) E(X) = 1(1/2)² + 2(1/2)³ + 3(1/2)⁴ + ...          … (2)
[Shifting the position one step towards the right so that we get the
terms having the same power at the same positions as in (1)]
Now, subtracting (2) from (1), we have
E(X) − (1/2) E(X) = 1/2 + (1/2)² + (1/2)³ + (1/2)⁴ + ...
(1/2) E(X) = (1/2)[1 + 1/2 + (1/2)² + (1/2)³ + ...]
E(X) = 1 + 1/2 + (1/2)² + (1/2)³ + ...
(which is an infinite G.P. with first term a = 1 and common ratio r = 1/2)
     = 1/(1 − 1/2)          [S_∞ = a/(1 − r); see Unit 3 of course MST-001]
     = 2.
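The G.P. trick above can be double-checked by summing the series numerically. This Python sketch (illustrative, not part of the original solution) shows the partial sums of Σ k(1/2)^k approaching 2:

```python
def partial_sum(n):
    # sum of k * (1/2)**k for k = 1, 2, ..., n
    return sum(k * 0.5 ** k for k in range(1, n + 1))

approx = partial_sum(60)   # extremely close to 2
```

Since the tail of the series after 60 terms is smaller than machine precision, partial_sum(60) is indistinguishable from the exact value 2.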
E3) E(X) = ∫ x f(x) dx
= ∫_(−∞)^0 x f(x) dx + ∫_0^1 x f(x) dx + ∫_1^2 x f(x) dx + ∫_2^∞ x f(x) dx
= 0 + ∫_0^1 x·x³ dx + ∫_1^2 x(2 − x)³ dx + 0
= ∫_0^1 x⁴ dx + ∫_1^2 x(8 − 12x + 6x² − x³) dx
= ∫_0^1 x⁴ dx + ∫_1^2 (8x − 12x² + 6x³ − x⁴) dx
= [x⁵/5]_0^1 + [8x²/2 − 12x³/3 + 6x⁴/4 − x⁵/5]_1^2
= 1/5 + [(16 − 32 + 24 − 32/5) − (4 − 4 + 3/2 − 1/5)]
= 1/5 + [8/5 − 13/10]
= 1/5 + 3/10 = 2/10 + 3/10 = 1/2.
E4) As X is a random variable with mean μ,
E(X) = μ          ... (1)
Now, E(Z) = E[(X − μ)/σ]
= (1/σ) E(X − μ)          [Using Property 2 of Sec. 8.3]
= (1/σ)[E(X) − μ]          [Using Property 3 of Sec. 8.3]
= (1/σ)(μ − μ)          [Using (1)]
= 0
Note: Mean of standard random variable is zero.
E5) Variance of the standard random variable Z = (X − μ)/σ is given as
V(Z) = V[(X − μ)/σ] = V[X/σ − μ/σ]
= (1/σ²) V(X)          [Using the result of Theorem 8.1 of Sec. 8.5 of this unit]
= (1/σ²) σ² = 1          [it is given that the standard deviation of X is σ and hence its variance is σ²]
Note: The mean of a standard random variate is ‘0’ [see E4)] and its
variance is 1.
E6) Given that E(Y) = 0 ⇒ E(aX − b) = 0 ⇒ a E(X) − b = 0
⇒ a(10) − b = 0
⇒ 10a − b = 0          ... (1)
Also, as V(Y) = 1,
V(aX − b) = 1
⇒ a² V(X) = 1 ⇒ a²(25) = 1 ⇒ a² = 1/25
⇒ a = 1/5          [a is positive]
From (1), we have
10(1/5) − b = 0 ⇒ 2 − b = 0 ⇒ b = 2
Hence, a = 1/5, b = 2.
5
E7) Let X be the number on the first card and Y be the number on the
second card. Then the probability distribution of X (and likewise of Y) is:
X :     1    2    3    4    5    6    7    8    9    10
p(x) : 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10
E(X) = E(Y) = 1(1/10) + 2(1/10) + ... + 10(1/10)
= (1/10)(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10) = 55/10 = 5.5
and hence the required expected value is
E(X + Y) = E(X) + E(Y) = 5.5 + 5.5 = 11
E8) Let X be the number obtained on the first die and Y be the number
obtained on the second die.
Then E(X) = E(Y) = 7/2.          [See Example 3 given in Section 8.2]
Hence, the required expected value is
E(XY) = E(X) E(Y)          [Using multiplication theorem of expectation]
= (7/2)(7/2) = 49/4.
UNIT 9 BINOMIAL DISTRIBUTION
Structure
9.1 Introduction
Objectives
9.1 INTRODUCTION
In Unit 5 of the course, you have studied random variables, their probability
functions and distribution functions. In Unit 8 of the course, you have come to
know how the expectations and moments of random variables are
obtained. In those units, the definitions and properties of general discrete and
continuous probability distributions have been discussed.
The present block is devoted to the study of some special discrete distributions
and in this list, Bernoulli and Binomial distributions are also included which
are being discussed in the present unit of the course.
Sec. 9.2 of this unit defines Bernoulli distribution and its properties. Binomial
distribution and its applications are covered in Secs. 9.3 and 9.4 of the unit.
Objectives
Study of the present unit will enable you to:
define the Bernoulli distribution and to establish its properties;
define the binomial distribution and establish its properties;
identify the situations where these distributions are applied;
know as to how binomial distribution is fitted to the given data; and
solve various practical problems related to these distributions.
There are experiments where the outcomes can be divided into two categories
with reference to presence or absence of a particular attribute or characteristic.
A convenient method of representing the two is to designate either of them as
success and the other as failure. For example, head coming up in the toss of a
fair coin may be treated as a success and tail as failure, or vice-versa.
Accordingly, probabilities can be assigned to the success and failure.
Discrete Probability
Distributions
Suppose a piece of a product is tested, which may be defective (a failure) or
non-defective (a success). Let p be the probability that it is found non-defective
and q = 1 − p be the probability that it is defective. Let X be a random variable
such that it takes value 1 when success occurs and 0 if failure occurs.
Therefore,
P[X = 1] = p, and
P[X = 0] = q = 1 − p.
The above experiment is a Bernoulli trial, the r.v. X defined in the above
experiment is a Bernoulli variate and the probability distribution of X as
specified above is called the Bernoulli distribution in honour of J. Bernoulli
(1654-1705).
Definition
A discrete random variable X is said to follow Bernoulli distribution with
parameter p if it takes the values 1 and 0 with
X :     0        1
p(x) : 1 − p     p
Third order central moment (μ₃) = μ₃′ − 3μ₂′μ₁′ + 2(μ₁′)³
= p − 3p·p + 2p³ = p − 3p² + 2p³
= p(1 − 3p + 2p²) = p(1 − p)(1 − 2p)
Fourth order central moment (μ₄) = μ₄′ − 4μ₃′μ₁′ + 6μ₂′(μ₁′)² − 3(μ₁′)⁴
= p − 4p·p + 6p·p² − 3p⁴
= p − 4p² + 6p³ − 3p⁴
= p(1 − 4p + 6p² − 3p³)
= p(1 − p)(1 − 3p + 3p²)
[Note: For relations of central moments in terms of moments about origin, see
Unit 3 of MST-002.]
Consider three independent trials, each with probability of success p and of
failure q = 1 − p, and let X denote the number of successes. Then
P[X = 0] = P[no success] = P(F F F) = q·q·q = q³
This can be written as
P[X = 0] = ³C₀ p⁰ q^(3−0)
[since ³C₀ = 1, p⁰ = 1, q^(3−0) = q³. Recall ⁿC_x = n!/(x!(n − x)!) (see Unit 4 of MST-001)]
P[X = 1] = Probability of hitting the target once
= P[(Success in the first trial and failure in the second and third trials)
or (success in the second trial and failure in the first and third
trials) or (success in the third trial and failure in the first two
trials)]
= P[(S F F) or (F S F) or (F F S)]
= P(S F F) + P(F S F) + P(F F S)
= P(S)·P(F)·P(F) + P(F)·P(S)·P(F) + P(F)·P(F)·P(S)          [trials are independent]
= p·q·q + q·p·q + q·q·p
= pq² + pq² + pq²
= 3pq²
This can also be written as
P[X = 1] = ³C₁ p¹ q^(3−1)          [³C₁ = 3, p¹ = p, q^(3−1) = q²]
Similarly,
P[X = 2] = P(S)·P(S)·P(F) + P(S)·P(F)·P(S) + P(F)·P(S)·P(S)
= p·p·q + p·q·p + q·p·p
= 3p²q
This can also be written as
P[X = 2] = ³C₂ p² q^(3−2)          [³C₂ = 3, q^(3−2) = q]
Similarly, P[X = 3] = P(S S S) = p³ = ³C₃ p³ q^(3−3). In general, this can be
written as
P[X = r] = ³C_r p^r q^(3−r);  r = 0, 1, 2, 3,
which is the probability of r successes in 3 trials. Here, ³C_r is the number of
ways in which r successes can happen in 3 trials.
The result can be generalised for n trials in a similar fashion and is given as
P[X = r] = ⁿC_r p^r q^(n−r);  r = 0, 1, 2, ..., n.
Binomial Expansion:
‘Bi’ means ‘two’. ‘Binomial expansion’ means the expansion of an expression
having two terms, e.g.
(X + Y)² = X² + 2XY + Y² = ²C₀X²Y⁰ + ²C₁X^(2−1)Y¹ + ²C₂X^(2−2)Y²,
(X + Y)³ = X³ + 3X²Y + 3XY² + Y³
         = ³C₀X³Y⁰ + ³C₁X^(3−1)Y¹ + ³C₂X^(3−2)Y² + ³C₃X^(3−3)Y³
So, in general,
(X + Y)ⁿ = ⁿC₀XⁿY⁰ + ⁿC₁X^(n−1)Y¹ + ⁿC₂X^(n−2)Y² + ... + ⁿC_nX^(n−n)Yⁿ
Remark 2:
i) The binomial distribution is the probability distribution of sum of n
independent Bernoulli variates.
ii) If X is binomially distributed r.v. with parameters n and p, then we may
write it as X ~ B(n, p).
iii) P[more than 3 heads] = P[X = 4] + P[X = 5] + P[X = 6]
= (1/64)·⁶C₄ + (1/64)·⁶C₅ + (1/64)·⁶C₆
= (1/64)[⁶C₄ + ⁶C₅ + ⁶C₆]
= (1/64)(15 + 6 + 1) = 22/64 = 11/32.
iv) P[at most 3 heads] = P[3 or less than 3 heads]
= P[X = 3] + P[X = 2] + P[X = 1] + P[X = 0]
= (1/64)·⁶C₃ + (1/64)·⁶C₂ + (1/64)·⁶C₁ + (1/64)·⁶C₀
= (1/64)[⁶C₃ + ⁶C₂ + ⁶C₁ + ⁶C₀]
= (1/64)(20 + 15 + 6 + 1) = 42/64 = 21/32.
v) P[at least 3 heads] = P[3 or more heads]
= P[X = 3] + P[X = 4] + P[X = 5] + P[X = 6]
or
= 1 − {P[X = 0] + P[X = 1] + P[X = 2]}
          [sum of probabilities of all possible values of a random variable is 1]
= 1 − 11/32          [already obtained in part (ii) of this example]
= 21/32.
vi) P[more than 6 heads] = P[7 or more heads]
= P[an impossible event]          [in six tosses, it is impossible to get more than six heads]
= 0
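The six-coin probabilities above are easy to recompute from the binomial p.m.f. The Python sketch below (an illustrative check, not part of the original text) uses exact binomial coefficients:

```python
from math import comb

def binom_pmf(n, p, x):
    # P[X = x] = C(n, x) * p**x * (1 - p)**(n - x)
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 0.5
p_at_least_4 = sum(binom_pmf(n, p, x) for x in range(4, 7))       # 11/32
p_at_most_3 = sum(binom_pmf(n, p, x) for x in range(0, 4))        # 21/32
p_at_least_3 = 1 - sum(binom_pmf(n, p, x) for x in range(0, 3))   # 21/32
```

All three values match the hand computations, since with p = 1/2 every term is an exact multiple of 1/64.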
Example 3: The chances of catching cold by workers working in an ice
factory during winter are 25%. What is the probability that out of 5 workers 4
or more will catch cold?
Solution: Let catching cold be the success and p be the probability of success
for each worker.
Here, n = 5, p = 0.25, q = 0.75, and by binomial distribution
P[X = x] = ⁿC_x p^x q^(n−x);  x = 0, 1, 2, ..., n
         = ⁵C_x (0.25)^x (0.75)^(5−x);  x = 0, 1, 2, ..., 5
Therefore, the required probability = P[X ≥ 4]
= P[X = 4 or X = 5]
= P[X = 4] + P[X = 5]
= ⁵C₄ (0.25)⁴(0.75)¹ + ⁵C₅ (0.25)⁵(0.75)⁰
= 5(0.25)⁴(0.75) + (0.25)⁵
= 0.014648 + 0.000977
= 0.015625
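Since p = 1/4 is a power of two, the answer to Example 3 is exactly 1/64 = 0.015625, which a short computation confirms (an illustrative sketch, not part of the original text):

```python
from math import comb

def pmf(n, p, x):
    # binomial probability mass function
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# P[4 or more workers catch cold], n = 5, p = 0.25
p_4_or_more = pmf(5, 0.25, 4) + pmf(5, 0.25, 5)   # 1/64 = 0.015625
```
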
Now, we are sure that you can try the following exercises:
E1) The probability of a man hitting a target is 1/4. He fires 5 times. What is
the probability of his hitting the target at least twice?
E2) A policeman fires 6 bullets on a dacoit. The probability that the dacoit will
be killed by a bullet is 0.6. What is the probability that the dacoit is still
alive?
9.4 MOMENTS OF BINOMIAL DISTRIBUTION
The r-th order moment about origin of a binomial variate X is
μ_r′ = E(X^r) = Σ_{x=0}^n x^r · P[X = x]
The first order moment about origin (i.e. the mean) is
μ₁′ = E(X) = Σ_{x=0}^n x · P[X = x]
= Σ_{x=0}^n x · ⁿC_x p^x q^(n−x)          [P[X = x] = ⁿC_x p^x q^(n−x); x = 0, 1, 2, ..., n]
= Σ_{x=1}^n x · ⁿC_x p^x q^(n−x)          [the first term, with x = 0, is zero and hence we may start from x = 1]
= Σ_{x=1}^n x · (n/x) · ⁿ⁻¹C_{x−1} p^x q^(n−x)
  [ⁿC_x = n!/(x!(n − x)!) = (n/x)·(n − 1)!/((x − 1)!((n − 1) − (x − 1))!) = (n/x)·ⁿ⁻¹C_{x−1};
  see Unit 4 of MST-001]
= Σ_{x=1}^n n · ⁿ⁻¹C_{x−1} p^(x−1)·p·q^((n−1)−(x−1))          [n − x = (n − 1) − (x − 1)]
= np Σ_{x=1}^n ⁿ⁻¹C_{x−1} p^(x−1) q^((n−1)−(x−1))
= np(q + p)^(n−1) = np          [since q + p = 1]
Hence, Mean = np.
μ₂′ = E(X²) = Σ_{x=0}^n x² · P[X = x] = Σ_{x=0}^n x² · ⁿC_x p^x q^(n−x)
Here, we write x² as x(x − 1) + x          [x(x − 1) + x = x² − x + x = x²]
This is done because, in the following expression, we get x(x − 1) in the
denominator:
ⁿC_x = n!/(x!(n − x)!) = [n(n − 1)/(x(x − 1))] · ⁿ⁻²C_{x−2}
Thus,
μ₂′ = Σ_{x=0}^n [x(x − 1) + x] ⁿC_x p^x q^(n−x)
= Σ_{x=0}^n x(x − 1) ⁿC_x p^x q^(n−x) + Σ_{x=0}^n x ⁿC_x p^x q^(n−x)
= Σ_{x=2}^n x(x − 1) ⁿC_x p^x q^(n−x) + μ₁′
= Σ_{x=2}^n x(x − 1)·[n(n − 1)/(x(x − 1))]·ⁿ⁻²C_{x−2} p^x q^(n−x) + μ₁′
= Σ_{x=2}^n n(n − 1) ⁿ⁻²C_{x−2} p^(x−2)·p²·q^((n−2)−(x−2)) + μ₁′
= n(n − 1)p² Σ_{x=2}^n ⁿ⁻²C_{x−2} p^(x−2) q^((n−2)−(x−2)) + μ₁′
= n(n − 1)p²(1) + μ₁′
  [sum of probabilities of all possible values of a binomial variate with parameters n − 2 and p]
= n(n − 1)p² + np
Hence, Variance = μ₂′ − (μ₁′)² = n(n − 1)p² + np − n²p²
= np[(n − 1)p + 1 − np] = np(1 − p) = npq
i.e. Variance = npq
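The results Mean = np and Variance = npq can be verified directly from the p.m.f. for any particular choice of parameters. Below is an illustrative Python sketch (not part of the original derivation) using the hypothetical values n = 10, p = 0.3:

```python
from math import comb

n, p = 10, 0.3
q = 1 - p
# full binomial p.m.f. for x = 0, 1, ..., n
probs = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]

mean = sum(x * pr for x, pr in enumerate(probs))                 # np  = 3.0
var = sum(x * x * pr for x, pr in enumerate(probs)) - mean ** 2  # npq = 2.1
```
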
μ₃′ = Σ_{x=0}^n x³ · P[X = x]
Let x³ = x(x − 1)(x − 2) + Bx(x − 1) + Cx.
Comparing coefficients of x², we have
0 = −3 + B ⇒ B = 3
Comparing coefficients of x, we have
0 = 2 − B + C ⇒ C = B − 2 = 3 − 2 ⇒ C = 1
Thus,
μ₃′ = Σ_{x=0}^n [x(x − 1)(x − 2) + 3x(x − 1) + x]·ⁿC_x p^x q^(n−x)
= Σ x(x − 1)(x − 2) ⁿC_x p^x q^(n−x) + 3Σ x(x − 1) ⁿC_x p^x q^(n−x) + Σ x ⁿC_x p^x q^(n−x)
= Σ_{x=3}^n x(x − 1)(x − 2)·[n(n − 1)(n − 2)/(x(x − 1)(x − 2))]·ⁿ⁻³C_{x−3} p^x q^(n−x)
  + 3n(n − 1)p² + np
[The expression within brackets in the second term is the first term of
R.H.S. in the derivation of μ₂′, and the expression in the third term is μ₁′,
as already obtained. Also,
ⁿC_x = n!/(x!(n − x)!) = [n(n − 1)(n − 2)/(x(x − 1)(x − 2))]·ⁿ⁻³C_{x−3}]
= Σ_{x=3}^n n(n − 1)(n − 2)·ⁿ⁻³C_{x−3} p^(x−3)·p³·q^((n−3)−(x−3)) + 3n(n − 1)p² + np
= n(n − 1)(n − 2)p³ Σ_{x=3}^n ⁿ⁻³C_{x−3} p^(x−3) q^((n−3)−(x−3)) + 3n(n − 1)p² + np
= n(n − 1)(n − 2)p³(1) + 3n(n − 1)p² + np
Hence, μ₃ = μ₃′ − 3μ₂′μ₁′ + 2(μ₁′)³ = npq(q − p).
Similarly,
μ₄′ = Σ_{x=0}^n x⁴ P[X = x]
Writing
x⁴ = x(x − 1)(x − 2)(x − 3) + 6x(x − 1)(x − 2) + 7x(x − 1) + x
and proceeding in a similar fashion as for μ₁′, μ₂′, μ₃′, we have
μ₄′ = n(n − 1)(n − 2)(n − 3)p⁴ + 6n(n − 1)(n − 2)p³ + 7n(n − 1)p² + np,
and hence μ₄ = npq[1 + 3(n − 2)pq].
Now, recall the measures of skewness and kurtosis which you have studied in
Unit 4 of MST-002.
These measures are given as follows:
β₁ = μ₃²/μ₂³ = [npq(q − p)]²/(npq)³ = (q − p)²/(npq),
β₂ = μ₄/μ₂² = npq[1 + 3(n − 2)pq]/(npq)² = 3 + (1 − 6pq)/(npq),
γ₁ = √β₁ = (q − p)/√(npq), and γ₂ = β₂ − 3 = (1 − 6pq)/(npq).
Remark 3:
i) As 0 < q < 1,
npq < np          [multiplying both sides of q < 1 by np > 0]
⇒ Variance < Mean
Hence, for the binomial distribution,
Mean > Variance.
Putting p = 2/3 in the equation of the mean, we have
n(2/3) = 4 ⇒ n = 6
∴ by binomial distribution,
P[X = x] = ⁿC_x p^x q^(n−x)
         = ⁶C_x (2/3)^x (1/3)^(6−x);  x = 0, 1, 2, ..., 6.
Thus, the required probability is
P[X ≥ 1] = P[X = 1] + P[X = 2] + P[X = 3] + ... + P[X = 6]
= 1 − P[X = 0]
= 1 − ⁶C₀ (2/3)⁰ (1/3)⁶ = 1 − 1/729 = 728/729.
Example 6: If X ~ B(n, p), find p if n = 6 and 9P[X = 4] = P[X = 2].
Solution: 9P[X = 4] = P[X = 2]
⇒ 9·⁶C₄ p⁴ q² = ⁶C₂ p² q⁴
⇒ 9p² = q²          [since ⁶C₄ = ⁶C₂ = 15]
⇒ 9p² = (1 − p)² = 1 + p² − 2p
⇒ 8p² + 2p − 1 = 0
⇒ 8p² + 4p − 2p − 1 = 0
⇒ 4p(2p + 1) − 1(2p + 1) = 0
⇒ (2p + 1)(4p − 1) = 0
⇒ 2p + 1 = 0 or 4p − 1 = 0
⇒ p = −1/2 or 1/4
But p = −1/2 is rejected          [probability can never be negative]
Hence, p = 1/4.
E4) Find the binomial distribution when the sum of the mean and variance of 5
trials is 4.8.
E5) The mean of a binomial distribution is 30 and standard deviation is 5.
Find the values of
i) n, p and q,
ii) Skewness, and
iii) Kurtosis.
p(x) = P[X = x] = ⁿC_x p^x q^(n−x)          … (1)
If we replace x by x + 1, we have
p(x + 1) = ⁿC_{x+1} p^(x+1) q^(n−x−1)          … (2)
Dividing (2) by (1), we get
p(x + 1)/p(x) = [ⁿC_{x+1}/ⁿC_x]·(p/q)
Now,
ⁿC_{x+1}/ⁿC_x = [n!/((x + 1)!(n − x − 1)!)] / [n!/(x!(n − x)!)]
= [x!(n − x)!]/[(x + 1)!(n − x − 1)!]
= (n − x)/(x + 1)
∴ p(x + 1)/p(x) = [(n − x)/(x + 1)]·(p/q)
⇒ p(x + 1) = [(n − x)/(x + 1)]·(p/q)·p(x)          … (3)
Suppose we are given an observed frequency distribution. We first find the
mean of the given frequency distribution and equate it to np. From this, we
find the value of p. Having obtained the value of p, we obtain
p(0) = qⁿ, where q = 1 − p.
Then the recurrence relation, i.e. p(x + 1) = [(n − x)/(x + 1)]·(p/q)·p(x), is
applied to find the values of p(1), p(2), .... After that, the expected
(theoretical) frequencies f(0), f(1), f(2), ... are obtained on multiplying each of
the corresponding probabilities, i.e. p(0), p(1), p(2), ..., by N.
In this way, the binomial distribution is fitted to the given data. Thus, fitting
a binomial distribution involves comparing the observed frequencies with the
expected frequencies to see how well the observed results fit the
theoretical (expected) results.
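The fitting procedure just described can be sketched in a few lines of Python (illustrative, not part of the original text; the function name fit_binomial is our own):

```python
def fit_binomial(freqs):
    # freqs[x] = observed frequency of x successes, x = 0, 1, ..., n
    n = len(freqs) - 1
    N = sum(freqs)
    mean = sum(x * f for x, f in enumerate(freqs)) / N
    p = mean / n                      # from mean = np
    q = 1 - p
    probs = [q ** n]                  # p(0) = q**n
    for x in range(n):
        # recurrence: p(x+1) = ((n - x)/(x + 1)) * (p/q) * p(x)
        probs.append(probs[-1] * (n - x) / (x + 1) * (p / q))
    return [N * pr for pr in probs]   # expected (theoretical) frequencies

expected = fit_binomial([15, 35, 90, 40, 20])   # data of Example 7 below
```

The expected frequencies always total N, since the recurrence generates the full p.m.f.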
Example 7: Four coins were tossed and number of heads noted. The
experiment is repeated 200 times.
The number of tosses showing 0, 1, 2, 3 and 4 heads were found distributed as
under. Fit a binomial distribution to these observed results assuming that the
nature of the coins is not known.
Number of heads: 0 1 2 3 4
Number of tosses 15 35 90 40 20
Solution: Here n = 4, N = 200.
First, we obtain the mean of the given frequency distribution as follows:
Mean = (Σ f·x)/(Σ f) = (0 × 15 + 1 × 35 + 2 × 90 + 3 × 40 + 4 × 20)/200
     = 415/200 = 2.075.
Equating this to np: 4p = 2.075 ⇒ p = 0.51875 and q = 0.48125. Then
p(0) = q⁴, and the recurrence p(x + 1) = [(4 − x)/(x + 1)]·(p/q)·p(x) gives the
remaining probabilities; multiplying each by N = 200 gives the expected
frequencies.
Remark 4: In the above example, if the nature of the coins had been known,
e.g. if it had been given that “the coins are unbiased” then we would have
taken
p = 1/2, and then the observed data would not have been used to find p. Such
a situation can be seen in problem E6).
Here are two exercises for you:
E6) Seven coins are tossed and number of heads noted. The experiment is
repeated 128 times and the following distribution is obtained:
Number of heads 0 1 2 3 4 5 6 7
Frequencies 7 6 19 35 30 23 7 1
Fit a binomial distribution to these data, assuming that the coins are unbiased.
E7) Out of 800 families with 4 children each, how many families would you
expect to have 3 boys and 1 girl, assuming equal probability of boys and
girls?
Now before ending this unit, let’s summarize what we have covered in it.
9.6 SUMMARY
The following main points have been covered in this unit:
1) A discrete random variable X is said to follow Bernoulli distribution with
parameter p if its probability mass function is given by
P[X = x] = p^x (1 − p)^(1−x);  x = 0, 1
         = 0; elsewhere
Its mean and variance are p and p(1 − p), respectively. The third and fourth
central moments of this distribution are p(1 − p)(1 − 2p) and
p(1 − p)(1 − 3p + 3p²), respectively.
2) A discrete random variable X is said to follow binomial distribution if it
assumes only a finite number of non-negative integer values and its
probability mass function is given by
P[X = x] = ⁿC_x p^x q^(n−x);  x = 0, 1, 2, ..., n
         = 0; elsewhere
where, n is the number of independent trials,
x is the number of successes in n trials,
p is the probability of success in each trial, and
q = 1 – p is the probability of failure in each trial.
3) The constants of the binomial distribution are:
Mean = np, Variance = npq,
μ₃ = npq(q − p), μ₄ = npq[1 + 3(n − 2)pq],
β₁ = (q − p)²/(npq), β₂ = 3 + (1 − 6pq)/(npq),
γ₁ = (1 − 2p)/√(npq), and γ₂ = (1 − 6pq)/(npq)
4) For a binomial distribution, Mean > Variance.
5) The recurrence relation for the probabilities of the binomial distribution is
p(x + 1) = [(n − x)/(x + 1)]·(p/q)·p(x),  x = 0, 1, 2, ..., n − 1
6) The expected frequencies of the binomial distribution are given by
f(x) = N·P[X = x] = N·ⁿC_x p^x q^(n−x);  x = 0, 1, 2, ..., n
9.7 SOLUTIONS/ANSWERS
E1) Let p be the probability of hitting the target (success) in a trial.
∴ n = 5, p = 1/4, q = 1 − 1/4 = 3/4,
and hence by binomial distribution, we have
P[X = x] = ⁿC_x p^x q^(n−x) = ⁵C_x (1/4)^x (3/4)^(5−x);  x = 0, 1, 2, 3, 4, 5.
Required probability = P[X ≥ 2]
= P[X = 2] + P[X = 3] + P[X = 4] + P[X = 5]
= 1 − P[X = 0] − P[X = 1]
= 1 − ⁵C₀(1/4)⁰(3/4)⁵ − ⁵C₁(1/4)¹(3/4)⁴
= 1 − 243/1024 − 405/1024 = 376/1024 = 47/128 ≈ 0.3672
E2) Let killing the dacoit by a bullet be the success, so that p = 0.6 and
q = 0.4 for each of the n = 6 bullets. The dacoit remains alive only if he is
hit by none of the bullets:
P[dacoit still alive] = P[X = 0] = ⁶C₀(0.6)⁰(0.4)⁶ = (0.4)⁶
= 0.0041
E3) Mean = np = 3          ... (1)
Variance = npq = 4          ... (2)
Dividing (2) by (1), we have
q = 4/3 > 1, and hence this is not possible
[q, being a probability, cannot be greater than 1]
E4) np + npq = 4.8          [given that Mean + Variance = 4.8]
⇒ 5p + 5pq = 4.8
⇒ 5[p + p(1 − p)] = 4.8
⇒ 5[p + p − p²] = 4.8
⇒ 5p² − 10p + 4.8 = 0
⇒ 25p² − 50p + 24 = 0          [multiplying by 5]
⇒ 25p² − 30p − 20p + 24 = 0
⇒ 5p(5p − 6) − 4(5p − 6) = 0
⇒ (5p − 6)(5p − 4) = 0
⇒ p = 6/5, 4/5
The first value, p = 6/5, is rejected          [probability can never exceed 1]
∴ p = 4/5 and hence q = 1 − p = 1/5.
Thus, the binomial distribution is
P[X = x] = ⁿC_x p^x q^(n−x)
         = ⁵C_x (4/5)^x (1/5)^(5−x);  x = 0, 1, 2, 3, 4, 5.
The binomial distribution in tabular form is given as
X      p(x)
0      ⁵C₀ (4/5)⁰(1/5)⁵ = 1/3125
1      ⁵C₁ (4/5)¹(1/5)⁴ = 20/3125
2      ⁵C₂ (4/5)²(1/5)³ = 160/3125
3      ⁵C₃ (4/5)³(1/5)² = 640/3125
4      ⁵C₄ (4/5)⁴(1/5)¹ = 1280/3125
5      ⁵C₅ (4/5)⁵(1/5)⁰ = 1024/3125
E5) np = 30, npq = 25
i) q = npq/np = 25/30 = 5/6, p = 1 − q = 1 − 5/6 = 1/6,
and n(1/6) = 30 ⇒ n = 180
ii) μ₂ = npq = 180(1/6)(5/6) = 25
μ₃ = npq(q − p) = 25(5/6 − 1/6) = 50/3
The moment coefficient of skewness is given by
β₁ = μ₃²/μ₂³ = (50/3)²/(25)³ = (2500/9)/15625 = 4/225
⇒ γ₁ = √β₁ = 2/15
iii) β₂ = 3 + (1 − 6pq)/(npq) = 3 + [1 − 6(1/6)(5/6)]/25 = 3 + (1/6)/25 = 3 + 1/150
γ₂ = β₂ − 3 = 1/150 > 0
So, the curve of the binomial distribution is leptokurtic.
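E5's arithmetic can be reproduced in code. The sketch below (illustrative, not part of the original solution) recovers q, p and n from the given mean and variance and then evaluates the moment coefficients:

```python
mean, var = 30, 25        # np = 30, npq = 25 (s.d. = 5)

q = var / mean            # npq / np = 5/6
p = 1 - q                 # 1/6
n = mean / p              # 180

mu3 = n * p * q * (q - p)           # npq(q - p) = 50/3
beta1 = mu3 ** 2 / var ** 3         # 4/225
gamma2 = (1 - 6 * p * q) / var      # 1/150 > 0 -> leptokurtic
```
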
1
E6) As the coin is unbiased, p = .
2
1 1
Here, n = 7, N = 128, p = , q 1 p .
2 2
7
1
n 1
p(0) = q = .
2 128
Expected frequencies are, therefore, obtained as follows
[using p(x + 1) = ((n - x)/(x + 1)) × (p/q) × p(x) = ((7 - x)/(x + 1)) p(x), since p = q]:

Number of heads (X)   (7 - x)/(x + 1)   p(x) = 7Cx (1/2)^7   Expected or theoretical frequency f(x) = N.p(x) = 128.p(x)
0                     7/1 = 7           1/128                 1
1                     6/2 = 3           7/128                 7
2                     5/3               21/128                21
3                     4/4 = 1           35/128                35
4                     3/5               35/128                35
5                     2/6 = 1/3         21/128                21
6                     1/7               7/128                 7
7                     0                 1/128                 1
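The expected frequencies of E6 can be regenerated directly from the binomial model (a minimal sketch):

```python
from math import comb

n, N = 7, 128
# f(x) = N * 7Cx (1/2)^7; since N = 2^7, these are exactly the binomial coefficients
expected = [round(N * comb(n, x) * 0.5**n) for x in range(n + 1)]
```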
1
E7) Here, the probability (p) to have a boy is 1/2 and the probability (q) to have a girl is 1/2, n = 4, N = 800.
Let X be the number of boys in a family.
Therefore, by binomial distribution, the probability of having 3 boys in a family of 4 children
= P[X = 3] [using P[X = x] = nCx p^x q^(n-x)]
= 4C3 (1/2)^3 (1/2)^(4-3) = 4 × (1/2)^4 = 1/4
Hence, the expected number of families having 3 boys and 1 girl
= N.p(3) = 800 × 1/4 = 200
Poisson Distribution
UNIT 10 POISSON DISTRIBUTION
Structure
10.1 Introduction
Objectives
10.1 INTRODUCTION
In Unit 9, you have studied binomial distribution which is applied in the cases
where the probability of success and that of failure do not differ much from
each other and the number of trials in a random experiment is finite. However,
there may be practical situations where the probability of success is very small,
that is, there may be situations where the event occurs rarely and the number of
trials may not be known. For instance, the number of accidents occurring at a
particular spot on a road everyday is a rare event. For such rare events, we
cannot apply the binomial distribution. To these situations, we apply Poisson
distribution. The concept of Poisson distribution was developed by a French
mathematician, Simeon Denis Poisson (1781-1840) in the year 1837.
In this unit, we define and explain Poisson distribution in Sec. 10.2. Moments
of Poisson distribution are described in Sec. 10.3 and the process of fitting a
Poisson distribution is explained in Sec. 10.4.
Objectives
After studying this unit, you would be able to:
know the situations where Poisson distribution is applied;
define and explain Poisson distribution;
know the conditions under which binomial distribution tends to Poisson
distribution;
compute the mean, variance and other central moments of Poisson
distribution;
obtain recurrence relation for finding probabilities of this distribution;
and
know as to how a Poisson distribution is fitted to the observed data.
10.2 POISSON DISTRIBUTION
In case of binomial distributions, as discussed in the last unit, we deal with
events whose occurrences and non-occurrences are almost equally important.
However, there may be events which do not occur as outcomes of a definite
number of trials of an experiment but occur rarely at random points of time and
for such events our interest lies only in the number of occurrences and not in
their non-occurrences. Examples of such events are:
i) Our interest may lie in how many printing mistakes are there on each page
of a book but we are not interested in counting the number of words
without any printing mistake.
ii) In production where control of quality is the major concern, it often
requires counting the number of defects (and not the non-defects) per item.
iii) One may intend to know the number of accidents during a particular time
interval.
Under such situations, the binomial distribution cannot be applied, as the value of n
is not definite and the probability of occurrence is very small. You can think of
other such situations yourself. The Poisson distribution, discovered by S.D.
Poisson (1781-1840) in 1837, can be applied to study these situations.
Poisson distribution is a limiting case of binomial distribution under the
following conditions:
i) n, the number of trials, is indefinitely large, i.e. n → ∞;
ii) p, the constant probability of success for each trial, is very small, i.e. p → 0;
iii) np = λ is a finite quantity.
Remark 1
i) If X follows Poisson distribution with parameter λ, then we shall use the
notation X ~ P(λ).
ii) If X and Y are two independent Poisson variates with parameters λ1 and λ2
respectively, then X + Y is also a Poisson variate with parameter λ1 + λ2. This
is known as the additive property of Poisson distribution.
10.3 MOMENTS OF POISSON DISTRIBUTION
The r-th order moment about origin of a Poisson variate is
mu'_r = E(X^r) = sum_{x=0}^{∞} x^r p(x) = sum_{x=0}^{∞} x^r e^(-λ) λ^x / x!
Therefore,
mu'_1 = sum_{x=0}^{∞} x e^(-λ) λ^x / x! = sum_{x=1}^{∞} e^(-λ) λ^x / (x - 1)!
= λ e^(-λ) sum_{x=1}^{∞} λ^(x-1) / (x - 1)!
= λ e^(-λ) [1 + λ/1! + λ^2/2! + λ^3/3! + ...]
= λ e^(-λ) e^λ [since e^λ = 1 + λ/1! + λ^2/2! + λ^3/3! + ...; see Unit 2 of MST-001]
= λ
Therefore, Mean = λ.
Next,
mu'_2 = sum_{x=0}^{∞} x^2 e^(-λ) λ^x / x!
= sum_{x=0}^{∞} [x(x - 1) + x] e^(-λ) λ^x / x! [As done in Unit 9 of this Course]
= sum_{x=2}^{∞} x(x - 1) e^(-λ) λ^x / x! + sum_{x=0}^{∞} x e^(-λ) λ^x / x!
= λ^2 e^(-λ) sum_{x=2}^{∞} λ^(x-2) / (x - 2)! + mu'_1
= λ^2 e^(-λ) e^λ + λ
= λ^2 + λ
Variance of X is given as V(X) = mu_2 = mu'_2 - (mu'_1)^2 = λ^2 + λ - λ^2 = λ.
Similarly,
mu'_3 = sum_{x=0}^{∞} x^3 p(x) = sum_{x=0}^{∞} x^3 e^(-λ) λ^x / x!
= λ^3 + 3λ^2 + λ [writing x^3 = x(x - 1)(x - 2) + 3x(x - 1) + x and proceeding as above]
Third order central moment is
mu_3 = mu'_3 - 3 mu'_2 mu'_1 + 2 (mu'_1)^3 = λ [On simplification]
Also,
mu'_4 = sum_{x=0}^{∞} x^4 e^(-λ) λ^x / x! = λ^4 + 6λ^3 + 7λ^2 + λ
and mu_4 = mu'_4 - 4 mu'_3 mu'_1 + 6 mu'_2 (mu'_1)^2 - 3 (mu'_1)^4 = 3λ^2 + λ [On simplification]
Therefore, measures of skewness and kurtosis are given by
beta_1 = (mu_3)^2 / (mu_2)^3 = λ^2/λ^3 = 1/λ, gamma_1 = sqrt(beta_1) = 1/sqrt(λ); and
beta_2 = mu_4 / (mu_2)^2 = (3λ^2 + λ)/λ^2 = 3 + 1/λ, gamma_2 = beta_2 - 3 = 1/λ.
For λ = 2,
P[X = x] = e^(-2) 2^x / x!; x = 0, 1, 2, ...
Thus, the desired probabilities are:
(a) P[arrival of no heavy truck] = P[X = 0]
= e^(-2) 2^0 / 0!
= e^(-2) = 0.1353
(b) P[X >= 2] = P[X = 2] + P[X = 3] + ...
= 1 - P[X = 1] - P[X = 0]
= 1 - e^(-2) 2^1/1! - e^(-2) 2^0/0!
= 1 - e^(-2) (1 + 2)
= 1 - 0.1353 × 3 = 1 - 0.4059 = 0.5941
Note: In most of the cases for Poisson distribution, if we are to compute
probabilities of the type P[X > a] or P[X >= a], we write them as
P[X > a] = 1 - P[X <= a] and
P[X >= a] = 1 - P[X < a], because n is not definite and hence we cannot
go up to the last value; the probability is therefore written in terms of its
complementary probability.
Example 2: If the probability that an individual suffers a bad reaction from an
injection of a given serum is 0.001, determine the probability that out of 1500
individuals
i) exactly 3,
ii) more than 2
individuals suffer from a bad reaction.
Solution: Let X be the Poisson variate, “Number of individuals suffering from
bad reaction”. Then,
n = 1500, p = 0.001,
λ = np = (1500)(0.001) = 1.5
By Poisson distribution,
P[X = x] = e^(-λ) λ^x / x! = e^(-1.5) (1.5)^x / x!; x = 0, 1, 2, ...
Thus,
i) The desired probability = P[X = 3]
= e^(-1.5) (1.5)^3 / 3!
= (0.2231)(3.375) / 6 [see the table given in the Appendix at the end of this unit]
= 0.1255
ii) The desired probability = P[X > 2]
= 1 - P[X <= 2]
= 1 - {P[X = 2] + P[X = 1] + P[X = 0]}
= 1 - e^(-1.5) [(1.5)^2/2! + 1.5 + 1]
= 1 - 3.625 e^(-1.5)
= 1 - 3.625 × 0.2231 = 1 - 0.8087 = 0.1913
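Both answers of Example 2 can be confirmed in a few lines (a sketch; full floating-point precision is used instead of the 4-decimal appendix table):

```python
from math import exp, factorial

lam = 1500 * 0.001          # lambda = np = 1.5

def pmf(x):
    """Poisson probability P(X = x) with parameter lam."""
    return exp(-lam) * lam**x / factorial(x)

p_exactly_3 = pmf(3)                              # part (i)
p_more_than_2 = 1 - (pmf(0) + pmf(1) + pmf(2))    # part (ii)
```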
Now, P[X = 1] = 2 P[X = 2]
Therefore e^(-λ) λ^1 / 1! = 2 × e^(-λ) λ^2 / 2!
λ = λ^2, i.e. λ - λ^2 = 0, i.e. λ(1 - λ) = 0, so λ = 0, 1
But λ = 0 is rejected
[since if λ = 0 then either n = 0 or p = 0, which implies that the Poisson distribution
does not exist in this case.]
Therefore λ = 1
Hence mean = λ = 1, and
Variance = λ = 1.
Example 5: If X and Y be two independent Poisson variates having means 1
and 2 respectively, find P[X + Y < 2].
Solution: As X ~ P(1), Y ~ P(2), therefore,
X + Y follows Poisson distribution with mean = 1 + 2 = 3.
Let X + Y = W. Hence, probability function of W is
P[W = w] = e^(-3) 3^w / w!; w = 0, 1, 2, ...
Thus, the required probability = P[X + Y < 2]
= P[W < 2]
= P[W = 0] + P[W = 1]
= e^(-3) 3^0 / 0! + e^(-3) 3^1 / 1!
= e^(-3) (1 + 3) = 4 × 0.0498 = 0.1992
To fit a Poisson distribution to the observed data, we find the theoretical (or
expected) frequencies corresponding to each value of the Poisson variate.
Process of finding the probabilities corresponding to each value of the Poisson
variate becomes easy if we use the recurrence relation for the probabilities of
Poisson distribution. So, in this section, we will first establish the recurrence
relation for probabilities and then define the Poisson frequency distribution
followed by the process of fitting a Poisson distribution.
Recurrence Formula for the Probabilities of Poisson Distribution
For a Poisson distribution with parameter λ, we have
p(x) = e^(-λ) λ^x / x! ... (1)
Changing x to x + 1, we have
p(x + 1) = e^(-λ) λ^(x+1) / (x + 1)! ... (2)
Dividing (2) by (1), we have
p(x + 1)/p(x) = [e^(-λ) λ^(x+1)/(x + 1)!] × [x!/(e^(-λ) λ^x)] = λ/(x + 1)
Therefore p(x + 1) = [λ/(x + 1)] p(x) ... (3)
This is the recurrence relation for probabilities of Poisson distribution. After
obtaining the value of p(0) using the Poisson probability function, i.e.
p(0) = e^(-λ) λ^0/0! = e^(-λ), we can obtain p(1), p(2), p(3), ..., on putting
x = 0, 1, 2, ... successively in (3).
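Relation (3) translates directly into code (a minimal sketch; the function name is ours):

```python
from math import exp, factorial

def poisson_probs(lam, kmax):
    """p(0), ..., p(kmax) via the recurrence p(x+1) = lam/(x+1) * p(x)."""
    probs = [exp(-lam)]               # p(0) = e^(-lam)
    for x in range(kmax):
        probs.append(lam / (x + 1) * probs[-1])
    return probs

probs = poisson_probs(1.0, 5)
```

Each probability is obtained from the previous one, so no factorials are recomputed.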
P[X >= 2] = 1 - P[X < 2]
= 1 - {P[X = 0] + P[X = 1]}
and the expected number of observations
= N.P[X >= 2]
= (100)(0.09025)
= 9.025
Number of Accidents   0      1     2    3    4   5
Frequencies           1970   422   71   13   3   1

Here the mean number of accidents is λ = 0.25, and therefore, by Poisson distribution,
p(0) = e^(-λ) = e^(-0.25)
Now, using the recurrence relation for probabilities of the Poisson distribution, i.e.
p(x + 1) = [λ/(x + 1)] p(x), and then multiplying each probability by N, we get the
expected frequencies as shown in the following table
E5) A typist commits the following mistakes per page in typing 100 pages.
Fit a Poisson distribution and calculate the theoretical frequencies.
Mistakes per page (X)   0    1    2    3   4   5
Frequency (f)           42   33   14   6   4   1
We now conclude this unit by giving a summary of what we have covered in it.
10.5 SUMMARY
The following main points have been covered in this unit:
1. A random variable X is said to follow Poisson distribution if it
assumes an indefinite number of non-negative integer values and its
probability mass function is given by:
p(x) = P[X = x] = e^(-λ) λ^x / x!; x = 0, 1, 2, 3, ... and λ > 0
     = 0; elsewhere
2. For Poisson distribution, Mean = Variance = mu_3 = λ, and mu_4 = 3λ^2 + λ.
3. For this distribution, beta_1 = 1/λ, gamma_1 = 1/sqrt(λ), beta_2 = 3 + 1/λ, gamma_2 = 1/λ.
4. Recurrence relation for probabilities of Poisson distribution is
p(x + 1) = [λ/(x + 1)] p(x), x = 0, 1, 2, 3, ...
5. Expected frequencies for a Poisson distribution are given by
f(x) = N.P[X = x] = N × e^(-λ) λ^x / x!; x = 0, 1, 2, ...
If you want to see what our solutions/answers to the exercises in the unit are,
we have given them in the following section.
10.6 SOLUTIONS/ANSWERS
Here λ = 0.25. By Poisson distribution,
P[X = x] = e^(-λ) λ^x / x! = e^(-0.25) (0.25)^x / x!, x = 0, 1, 2, ...
Therefore, P[at least one fatal accident]
= P[X >= 1] = 1 - P[X < 1] = 1 - P[X = 0]
= 1 - e^(-0.25) (0.25)^0 / 0! = 1 - e^(-0.25) = 1 - 0.78 = 0.22
λ^2 - 2λ = 0, i.e. λ(λ - 2) = 0
λ = 0, 2.
λ = 0 is rejected,
Therefore λ = 2
Hence, Mean = λ = 2.
Now, P[X = 0] = e^(-λ) λ^0 / 0! = e^(-λ) = e^(-2) = 0.1353,
[See table given in the Appendix at the end of this unit.]
and P[X = 4] = e^(-2) 2^4 / 4! = (16/24) e^(-2) = (2/3) e^(-2) = (2/3)(0.1353)
= 2(0.0451)
= 0.0902.
E4) Here p = 1/500, n = 10, N = 20000,
λ = np = 10 × (1/500) = 0.02
500
By Poisson frequency distribution,
f(x) = N.P[X = x] = 20000 × e^(-λ) λ^x / x!; x = 0, 1, 2, ...
Now,
i) The number of packets containing one defective
= f(1)
= 20000 × e^(-0.02) (0.02)^1 / 1!
= 20000 × 0.9802 × 0.02
= 392.08 ≈ 392
E5) The mean of the given distribution is computed as follows
X f fX
0 42 0
1 33 33
2 14 28
3 6 18
4 4 16
5 1 5
Total 100 100
Mean = Σfx / Σf = 100/100 = 1
Therefore λ = 1 and p(0) = e^(-λ) = e^(-1) = 0.3679.
Now, we obtain p(1), p(2), p(3), p(4), p(5) using the recurrence relation for
probabilities of the Poisson distribution, i.e. p(x + 1) = [λ/(x + 1)] p(x);
x = 0, 1, 2, 3, 4, and then obtain the expected frequencies as shown in the following table:

X   λ/(x + 1)           p(x)                              Expected/Theoretical frequency f(x) = N.P[X = x] = 100.P[X = x]
0   1/(0 + 1) = 1       p(0) = 0.3679                     36.79 ≈ 37
1   1/(1 + 1) = 0.5     p(1) = 1 × 0.3679 = 0.3679        36.79 ≈ 37
2   1/(2 + 1) = 0.3333  p(2) = 0.5 × 0.3679 = 0.184       18.4 ≈ 18
3   1/(3 + 1) = 0.25    p(3) = 0.3333 × 0.184 = 0.0613    6.13 ≈ 6
4   1/(4 + 1) = 0.2     p(4) = 0.25 × 0.0613 = 0.0153     1.53 ≈ 2
5   1/(5 + 1) = 0.1667  p(5) = 0.2 × 0.0153 = 0.0031      0.31 ≈ 0
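The whole fitting procedure of E5 condenses into a few lines (a sketch reproducing the table above):

```python
from math import exp

freqs = {0: 42, 1: 33, 2: 14, 3: 6, 4: 4, 5: 1}
N = sum(freqs.values())                          # 100
lam = sum(x * f for x, f in freqs.items()) / N   # sample mean = 1.0

expected, p = [], exp(-lam)        # p(0) = e^(-lam)
for x in range(6):
    expected.append(round(N * p))
    p = lam / (x + 1) * p          # recurrence p(x+1) = lam/(x+1) p(x)
```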
Appendix
Values of e^(-λ)

(0 < λ < 1)
λ     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0   1.0000 0.9900 0.9802 0.9704 0.9608 0.9512 0.9418 0.9324 0.9231 0.9139
0.1   0.9048 0.8958 0.8860 0.8781 0.8694 0.8607 0.8521 0.8437 0.8353 0.8270
0.2   0.8187 0.8106 0.8025 0.7945 0.7866 0.7788 0.7711 0.7634 0.7558 0.7483
0.3   0.7408 0.7334 0.7261 0.7189 0.7118 0.7047 0.6977 0.6907 0.6839 0.6771
0.4   0.6703 0.6636 0.6570 0.6505 0.6440 0.6376 0.6313 0.6250 0.6188 0.6126
0.5   0.6065 0.6005 0.5945 0.5886 0.5827 0.5770 0.5712 0.5655 0.5599 0.5543
0.6   0.5488 0.5434 0.5379 0.5326 0.5273 0.5220 0.5169 0.5117 0.5066 0.5016
0.7   0.4966 0.4916 0.4868 0.4819 0.4771 0.4724 0.4677 0.4630 0.4584 0.4538
0.8   0.4493 0.4449 0.4404 0.4360 0.4317 0.4274 0.4232 0.4190 0.4148 0.4107
0.9   0.4066 0.4025 0.3985 0.3946 0.3906 0.3867 0.3829 0.3791 0.3753 0.3716

(λ = 1, 2, 3, ..., 10)
λ       1      2      3      4      5      6      7      8      9      10
e^(-λ)  0.3679 0.1353 0.0498 0.0183 0.0067 0.0025 0.0009 0.0003 0.0001 0.00005
Note: To obtain values of e^(-λ) for other values of λ, use the laws of exponents, i.e. e^(-λ1-λ2) = e^(-λ1) × e^(-λ2); for example, e^(-2.25) = e^(-2) × e^(-0.25).
UNIT 11 DISCRETE UNIFORM AND HYPERGEOMETRIC DISTRIBUTIONS
Structure
11.1 Introduction
Objectives
11.1 INTRODUCTION
In the previous two units, we have discussed binomial distribution and its
limiting form i.e. Poisson distribution. Continuing the study of discrete
distributions, in the present unit, two more discrete distributions – Discrete
uniform and Hypergeometric distributions are discussed.
Discrete uniform distribution is applicable to those experiments where the
different values of random variable are equally likely. If the population is finite
and the sampling is done without replacement, i.e. if the events are random but
not independent, then we use the Hypergeometric distribution.
In this unit, discrete uniform distribution and hypergeometric distribution are
discussed in Secs. 11.2 and 11.3, respectively. We shall be discussing their
properties and applications also in these sections.
Objectives
After studying this unit, you should be able to:
define the discrete uniform and hypergeometric distributions;
compute their means and variances;
compute probabilities of events associated with these distributions; and
know the situations where these distributions are applicable.
Definition: A random variable X is said to have a discrete uniform
(rectangular) distribution if it takes any positive integer value from 1 to n,
and its probability mass function is given by
P[X = x] = 1/n for x = 1, 2, ..., n
         = 0, otherwise,
where n is called the parameter of the distribution.
For example, the random variable X, "the number on an unbiased die when
thrown", takes the positive integer values from 1 to 6 and follows the discrete
uniform distribution with probability mass function
P[X = x] = 1/6 for x = 1, 2, 3, 4, 5, 6
         = 0, otherwise.
Here,
E(X) = (n + 1)/2 [Obtained above]
and E(X^2) = sum_{x=1}^{n} x^2 p(x) = sum_{x=1}^{n} x^2 × (1/n)
= (1/n)[1^2 + 2^2 + 3^2 + ... + n^2]
= (1/n) × n(n + 1)(2n + 1)/6 [sum of squares of the first n natural numbers; see Unit 3 of Course MST-001]
= (n + 1)(2n + 1)/6
Variance = E(X^2) - [E(X)]^2
= (n + 1)(2n + 1)/6 - [(n + 1)/2]^2
= [(n + 1)/12] [2(2n + 1) - 3(n + 1)]
= (n + 1)(n - 1)/12
= (n^2 - 1)/12
Example 1: Find the mean and variance of a number on an unbiased die when
thrown.
Solution: Let X be the number on an unbiased die when thrown. Then X can
take the values 1, 2, 3, 4, 5, 6 with
P[X = x] = 1/6; x = 1, 2, 3, 4, 5, 6.
Hence, by uniform distribution, we have
Mean = (n + 1)/2 = (6 + 1)/2 = 7/2, and
Variance = (n^2 - 1)/12 = (6^2 - 1)/12 = 35/12.
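Both formulas can be checked exactly against the defining sums (a sketch using rational arithmetic; the function name is ours):

```python
from fractions import Fraction

def uniform_mean_var(n):
    """Mean and variance of the discrete uniform distribution on 1..n."""
    mean = sum(Fraction(x, n) for x in range(1, n + 1))
    second = sum(Fraction(x * x, n) for x in range(1, n + 1))
    return mean, second - mean**2

mean, var = uniform_mean_var(6)   # the die of Example 1
```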
Uniform Frequency Distribution
If an experiment, satisfying the requirements of discrete uniform distribution,
is repeated N times, then the expected frequency of a value of the random variable is
given by
f(x) = N.P[X = x] = N × (1/n); x = 1, 2, 3, ..., n.
Example 2: If an unbiased die is thrown 120 times, find the expected
frequency of appearing 1, 2, 3, 4, 5, 6 on the die.
Solution: Let X be the uniform discrete random variable, “the number on the
unbiased die when thrown”.
Therefore, P[X = x] = 1/6; x = 1, 2, ..., 6.
Hence, the expected frequencies of the value of random variable are given as
computed in the following table:
X   P[X = x]   Expected/Theoretical frequencies f(x) = N.P[X = x] = 120.P[X = x]
1   1/6        120 × 1/6 = 20
2   1/6        120 × 1/6 = 20
3   1/6        120 × 1/6 = 20
4   1/6        120 × 1/6 = 20
5   1/6        120 × 1/6 = 20
6   1/6        120 × 1/6 = 20
Each order of the draws has probability of the form
P(A1 ∩ A2 ∩ A3) = P(A1) P(A2 | A1) P(A3 | A1 ∩ A2)
[multiplication rule for dependent events], and adding the probabilities of all the
mutually exclusive orders gives the required probability
= (5C2 × 15C8) / 20C10.
Note: The result remains exactly same whether the items are drawn one by one
without replacement or drawn at once.
Let us now generalize the above argument for N balls, of which M are white
and N M are black. Of these, n balls are chosen at random without
replacement. Let X be a random variable that denotes the number of white balls
drawn. Then, the probability of getting X = x white balls among the n balls drawn is
given by
P[X = x] = (MCx × (N-M)C(n-x)) / NCn
Therefore, the mean is
E(X) = sum_{x=1}^{n} x × (MCx × (N-M)C(n-x)) / NCn
= sum_{x=1}^{n} x × (M/x) × ((M-1)C(x-1) × (N-M)C(n-x)) / NCn [since MCx = (M/x) × (M-1)C(x-1)]
= (M / NCn) sum_{x=1}^{n} (M-1)C(x-1) × (N-M)C(n-x)
= (M / NCn) [(M-1)C0 × (N-M)C(n-1) + (M-1)C1 × (N-M)C(n-2) + ... + (M-1)C(n-1) × (N-M)C0]
= (M / NCn) × (N-1)C(n-1)
[This result is obtained using properties of binomial coefficients and involves a
lot of calculation, so its derivation may be skipped. Notice that in the result the
left upper suffix and the right lower suffix are each the sum of the corresponding
suffixes of the binomial coefficients in every product term. However, the result
used in the above expression is boxed below for interested learners.]
We know that
(1 + x)^(m+n) = (1 + x)^m × (1 + x)^n [By the law of indices]
= [mC0 x^m + mC1 x^(m-1) + mC2 x^(m-2) + ... + mCm] × [nC0 x^n + nC1 x^(n-1) + nC2 x^(n-2) + ... + nCn]
Comparing the coefficients of x^(m+n-r) on both sides, we have
(m+n)Cr = mC0 × nCr + mC1 × nC(r-1) + ... + mCr × nC0
Hence,
E(X) = M × (N-1)C(n-1) / NCn
= M × [(N-1)! / ((n-1)!(N-n)!)] × [n!(N-n)! / N!]
= nM/N.
E(X^2) = E[X(X - 1)] + E(X) [since x^2 = x(x - 1) + x]
Now,
E[X(X - 1)] + E(X) = sum_{x=0}^{n} x(x - 1) × (MCx × (N-M)C(n-x)) / NCn + nM/N
= sum_{x=2}^{n} x(x - 1) × [M(M - 1)/(x(x - 1))] × ((M-2)C(x-2) × (N-M)C(n-x)) / NCn + nM/N
[since MCx = (M(M - 1)/(x(x - 1))) × (M-2)C(x-2)]
= [M(M - 1) / NCn] sum_{x=2}^{n} (M-2)C(x-2) × (N-M)C(n-x) + nM/N
= M(M - 1) × (N-2)C(n-2) / NCn + nM/N
[The result in the first term has been obtained using a property of binomial
coefficients, as done above for finding E(X).]
= M(M - 1) n(n - 1) / (N(N - 1)) + nM/N
Thus,
V(X) = E(X^2) - [E(X)]^2
= M(M - 1) n(n - 1) / (N(N - 1)) + nM/N - (nM/N)^2
= nM(N - M)(N - n) / (N^2 (N - 1)) [On simplification]
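Both hypergeometric moments can be verified exactly against the pmf (a sketch; N = 20, M = 5, n = 10 are arbitrary illustrative values):

```python
from fractions import Fraction
from math import comb

N, M, n = 20, 5, 10

def pmf(x):
    """P(X = x) = MCx (N-M)C(n-x) / NCn."""
    return Fraction(comb(M, x) * comb(N - M, n - x), comb(N, n))

support = range(max(0, n - (N - M)), min(M, n) + 1)
mean = sum(x * pmf(x) for x in support)
var = sum(x * x * pmf(x) for x in support) - mean**2
```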
We now conclude this unit by giving a summary of what we have covered in it.
11.4 SUMMARY
The following main points have been covered in this unit:
1) A random variable X is said to have a discrete uniform (rectangular)
distribution if it takes any positive integer value from 1 to n, and its
probability mass function is given by
P[X = x] = 1/n for x = 1, 2, ..., n
         = 0, otherwise.
11.5 SOLUTIONS/ANSWERS
E1) Let X be the number on the ticket drawn randomly from an urn containing
tickets numbered from 1 to 10.
Therefore, X is a discrete uniform random variable taking the values
1, 2, 3, 4, ..., 10 with probability of each of these values equal to 1/10.
Thus, the expected frequencies for the values of X are obtained as in the
following table:
X    P[X = x]   Expected/Theoretical frequency f(x) = N.P[X = x] = 150.P[X = x]
1    1/10       150 × 1/10 = 15
2    1/10       150 × 1/10 = 15
3    1/10       150 × 1/10 = 15
4    1/10       150 × 1/10 = 15
5    1/10       150 × 1/10 = 15
6    1/10       150 × 1/10 = 15
7    1/10       150 × 1/10 = 15
8    1/10       150 × 1/10 = 15
9    1/10       150 × 1/10 = 15
10   1/10       150 × 1/10 = 15
= (10C0 × 15C2) / 25C2
= (15 × 14) / (25 × 24)
= 0.35.
UNIT 12 GEOMETRIC AND NEGATIVE BINOMIAL DISTRIBUTIONS
Structure
12.1 Introduction
Objectives
12.1 INTRODUCTION
In Units 9 and 11, we have studied the discrete distributions – Bernoulli,
Binomial, Discrete Uniform and Hypergeometric. In each of these
distributions, the random variable takes finite number of values. There may
also be situations where the discrete random variable assumes countably
infinite values. Poisson distribution, wherein discrete random variable takes an
indefinite number of values with very low probability of occurrence of event,
has already been discussed in Unit 10. Dealing with some more situations
where discrete random variable assumes countably infinite values, we, in the
present unit, discuss geometric and negative binomial distributions. It is
pertinent to mention here that negative binomial distribution is a generalization
of geometric distribution. Some instances where these distributions can be
applied are “deaths of insects”, “number of insect bites”.
Like binomial distribution, geometric and negative binomial distributions also
have independent trials with constant probability of success in each trial. But,
in binomial distribution, the number of trials (n) is fixed whereas in geometric
distribution, trials are performed till first success and in negative binomial
distribution trials are performed till a certain number of successes.
Secs. 12.2 and 12.3 of this unit discuss geometric and negative binomial
distribution, respectively along with their properties.
Objectives
After studying this unit, you would be able to:
define the geometric and negative binomial distributions;
calculate the mean and variance of these distributions;
compute probabilities of events associated with these distributions;
identify the situations where these distributions can be applied; and
know about distinguishing features of these distributions like
memoryless property of geometric distribution.
12.2 GEOMETRIC DISTRIBUTION
Let us consider Bernoulli trials i.e. independent trials having the constant
probability ‘p’ of success in each trial. Each trial has two possible outcomes –
success or failure. Now, suppose the trial is performed repeatedly till we get
the success. Let X be the number of failures preceding the first success.
Example of such a situation is “tossing a coin until head turns up”. X defined
above may take the values 0, 1, 2, …. Letting q be the probability of failure in
each trial, we have
P[X = 0] = P[Zero failures preceding the first success]
= P(S) = p,
P[X = 1] = P[One failure preceding the first success]
= P[FS]
= P(F) P(S) [since the trials are independent]
= qp,
P[X = 2] = P[Two failures preceding the first success]
= P[FFS]
= P(F) P(F) P(S)
= q^2 p,
and so on.
Therefore, in general, probability of x failures preceding the first success is
P[X = x] = q^x p; x = 0, 1, 2, 3, ...
Notice that for x 0, 1, 2, 3,... the respective probabilities p, qp, q2p, q3p,…
are the terms of geometric progression series with common ratio q. That is
why, the above probability distribution is known as geometric distribution [see
Unit 3 of MST-001].
Hence, the above discussion leads to the following definition:
Definition: A random variable X is said to follow geometric distribution if it
assumes non-negative integer values and its probability mass function is given
by
P[X = x] = q^x p for x = 0, 1, 2, ...
         = 0, otherwise
Notice that
sum_{x=0}^{∞} q^x p = p + qp + q^2 p + q^3 p + ...
= p[1 + q + q^2 + q^3 + ...]
= p × 1/(1 - q) [sum of infinite terms of a G.P. with first term a and common ratio r is a/(1 - r); see Unit 3 of MST-001]
= p/p = 1.
Now, let us take up some examples of this distribution.
Example 1: An unbiased die is cast until a 6 appears. What is the probability that
it must be cast more than five times?
Solution: Let p be the probability of a success, i.e. getting a 6 in a throw of the die.
Therefore p = 1/6 and q = 1 - p = 5/6.
Let X be the number of failures preceding the first success.
Therefore, by geometric distribution,
P[X = x] = q^x p = (5/6)^x (1/6); x = 0, 1, 2, 3, ...
Thus, the desired probability = P[The die is to be cast more than five times]
= P[The number of throws is at least 6]
= P[The number of failures preceding the first success is at least 5]
= P[X >= 5]
= P[X = 5] + P[X = 6] + P[X = 7] + ...
= (5/6)^5 (1/6) + (5/6)^6 (1/6) + (5/6)^7 (1/6) + ...
= (5/6)^5 (1/6) [1 + 5/6 + (5/6)^2 + ...]
= (5/6)^5 (1/6) × 1/(1 - 5/6)
= (5/6)^5
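The geometric-series step can be cross-checked numerically (a sketch; the infinite sum is truncated where its terms become negligible):

```python
p, q = 1/6, 5/6

# P(X >= 5) summed term by term versus the closed form q^5
tail = sum(q**x * p for x in range(5, 500))
closed_form = q**5
```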
Let us now discuss some properties of geometric distribution.
Mean and Variance
Mean of the geometric distribution is given as
Mean = E(X) = sum_{x=0}^{∞} x q^x p
= p sum_{x=1}^{∞} x q^x = p sum_{x=1}^{∞} x q^(x-1) × q
= pq sum_{x=1}^{∞} x q^(x-1) = pq sum_{x=1}^{∞} d/dq (q^x)
[d/dq (q^x) is the derivative of q^x with respect to q, where x is kept constant;
d/dx (x^m) = m x^(m-1), where m is constant (see Unit 6 of MST-001)]
= pq d/dq [q + q^2 + q^3 + ...]
= pq d/dq [q/(1 - q)]
= pq × [(1 - q) + q] / (1 - q)^2
= pq / p^2
= q/p. ... (1)
Variance of the geometric distribution is
V(X) = E(X^2) - [E(X)]^2,
where
E(X^2) = sum_{x=0}^{∞} x^2 p(x)
= sum_{x=0}^{∞} [x(x - 1) + x] p(x)
= sum_{x=2}^{∞} x(x - 1) q^x p + q/p [Using (1) in the second term]
= pq^2 sum_{x=2}^{∞} x(x - 1) q^(x-2) + q/p [since q^x = q^(x-2) × q^2]
Now, x(x - 1) q^(x-2) = d^2/dq^2 (q^x) [treating x as constant], so
E(X^2) = pq^2 sum_{x=2}^{∞} d^2/dq^2 (q^x) + q/p
= pq^2 d^2/dq^2 [q^2 + q^3 + q^4 + ...] + q/p
= pq^2 d^2/dq^2 [q^2/(1 - q)] + q/p
= pq^2 d/dq [(2q - q^2)/(1 - q)^2] + q/p
[since d/dq (q^2/(1 - q)) = ((1 - q)(2q) + q^2)/(1 - q)^2 = (2q - q^2)/(1 - q)^2]
= pq^2 × [(1 - q)^2 (2 - 2q) + (2q - q^2) × 2(1 - q)] / (1 - q)^4 + q/p
= pq^2 × [(1 - q)(2 - 2q) + 2(2q - q^2)] / (1 - q)^3 + q/p
= pq^2 × [2 - 4q + 2q^2 + 4q - 2q^2] / p^3 + q/p [as p = 1 - q]
= pq^2 × 2/p^3 + q/p
= 2q^2/p^2 + q/p
Therefore,
V(X) = E(X^2) - [E(X)]^2
= 2q^2/p^2 + q/p - (q/p)^2
= q^2/p^2 + q/p
= (q/p)(q/p + 1) = (q/p) × (q + p)/p [since q + p = 1]
= q/p^2
Remark 1: Variance = q/p^2 = (q/p)(1/p) = Mean/p.
Since p < 1, we have Mean/p > Mean, and hence Variance > Mean.
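Mean q/p, variance q/p^2, and the inequality Variance > Mean can all be checked against truncated sums (a sketch; p = 0.3 is an arbitrary choice):

```python
p = 0.3
q = 1 - p
xs = range(2000)    # q^2000 is far below machine precision

mean = sum(x * q**x * p for x in xs)
var = sum(x * x * q**x * p for x in xs) - mean**2
```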
E2) Determine the geometric distribution for which the mean is 3 and variance
is 4.
Lack of Memory Property
Now, let us discuss the distinguishing property of the geometric distribution
i.e. the ‘lack of memory’ property or ‘forgetfulness property’. For example, in
a random experiment satisfying geometric distribution the wait up to 3 trials
(say) for the first success does not affect the probability that one will have to
wait for a further 5 trials if it is given that the first two trials are failures. The
geometric distribution is the only discrete distribution which has the
forgetfulness (memoryless) property. However, there is one continuous
distribution which also has the memoryless property and that is the exponential
distribution which we will study in Unit 15 of MST-003. The exponential
distribution is also the only continuous distribution having this property. It is
pertinent to mention here that in several aspects, the geometric distribution is
discrete analogs of the exponential distribution.
Let us now give mathematical/statistical discussion on ‘memoryless property’
of geometric distribution.
Suppose an event occurs at one of the trials 1, 2, 3, 4,… and the occurrence
time X has a geometric distribution with probability p. Let X be the number of
trials preceding to which one has to wait for successful attempt.
Thus, P X j P X j P X j 1 ...
= q jp q j1p q j 2 p ...
= q jp 1 q q 2 ...
1 1
= q jp qj p qj
1 q p
Now, let us consider the event X j k
P (X j k) X j
P X j
P X j k
[ X j k implies that j]
P X j
q j k
j
qk
q
P X j = q j already
P X k
obtained in this section
So, P X j k X j P X k
The above result reveals that the conditional probability that at least the first
j + k trials are unsuccessful, given that at least the first j trials were unsuccessful,
is the same as the probability that the first k trials are unsuccessful. So the
probability of the first success remains the same if we start counting k
unsuccessful trials from anywhere, provided all the trials preceding that point
are unsuccessful; i.e. the future does not depend on the past, it depends only on
the present. The geometric distribution thus forgets the preceding trials, and
hence this property is given the name "forgetfulness property", "memoryless
property" or "lack of memory" property.
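The memoryless identity P[X >= j + k | X >= j] = P[X >= k] is easy to check numerically (a sketch; p, j and k are arbitrary illustrative values):

```python
p = 0.25
q = 1 - p

def prob_ge(j):
    """P(X >= j) = q^j, where X counts failures before the first success."""
    return q**j

j, k = 3, 5
conditional = prob_ge(j + k) / prob_ge(j)   # P(X >= j+k | X >= j)
```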
P[First (r - 1) successes in x + r - 1 trials] = (x+r-1)C(r-1) p^(r-1) q^(x+r-1-(r-1))
= (x+r-1)C(r-1) p^(r-1) q^x, where q = 1 - p
[since, by binomial distribution, the probability of x successes in n trials with p as
the probability of success is nCx p^x q^(n-x).]
Therefore,
P[x failures preceding the r-th success]
= P[{First (r - 1) successes in x + r - 1 trials} and {success in the (x + r)-th trial}]
= P[First (r - 1) successes in x + r - 1 trials] × P[success in the (x + r)-th trial]
= (x+r-1)C(r-1) p^(r-1) q^x × p
= (x+r-1)C(r-1) p^r q^x
The above discussion leads to the following definition:
Definition: A random variable X is said to follow a negative binomial
distribution with parameters r (a positive integer) and p (0 < p < 1) if its
probability mass function is given by:
P[X = x] = (x+r-1)C(r-1) p^r q^x for x = 0, 1, 2, 3, ...
         = 0, otherwise
Now, as we know that nCr = nC(n-r) [see 'combination' in Unit 4 of MST-001],
(x+r-1)C(r-1) can be written as
(x+r-1)C(r-1) = (x+r-1)C(x+r-1-(r-1)) = (x+r-1)Cx
= (x + r - 1)(x + r - 2) ... (r + 1) r / x!
= (-1)^x (-r)(-r - 1)(-r - 2) ... (-r - x + 1) / x!
[writing the x factors of the numerator in reverse order and taking (-1) common from each of them]
= (-1)^x × (-r choose x)
Note: The symbol (n choose x) stands for nCx if n is a positive integer and equals
n(n - 1)(n - 2) ... (n - x + 1) / x!. We may also use the symbol (n choose x) when
n is any real number; in that case, though it no longer stands for nCx, it is still
equal to n(n - 1)(n - 2) ... (n - x + 1) / x!.
Therefore,
P[X = x] = (-1)^x (-r choose x) p^r q^x
= (-r choose x) (-q)^x p^r for x = 0, 1, 2, 3, ...
Here, the expression (-r choose x) (-q)^x p^r is similar in form to the binomial
probability (n choose x) p^x q^(n-x).
You have already studied in Unit 9 of this Course that (n choose x) p^x q^(n-x) is the
general term of (q + p)^n.
Similarly, (-r choose x) (-q)^x is the general term of the expansion of (1 - q)^(-r),
and hence
P[X = x] = (-r choose x) (-q)^x p^r is the general term of (1 - q)^(-r) p^r.
Therefore P[X = 0], P[X = 1], P[X = 2], ... are the successive terms of the expansion
of (1 - q)^(-r) p^r, and hence the sum of these probabilities
= (1 - q)^(-r) p^r
= p^(-r) p^r [since 1 - q = p]
= 1,
which must be so, this being a probability distribution.
Also, as the probabilities of the negative binomial distribution for
X = 0, 1, 2, ... are the successive terms of
(1 - q)^(-r) p^r = (1/p - q/p)^(-r),
which is a binomial expansion with negative index (-r), it is for this reason that the
probability distribution given above is called the negative binomial distribution.
Mean and Variance
Mean and variance of the negative binomial distribution can be obtained on
observing the form of this distribution and comparing it with the binomial
distribution as follows:
The probabilities of the binomial distribution for X = 0, 1, 2, ... are the successive
terms of the binomial expansion of (q + p)^n, and the mean and variance obtained
for that distribution are
Mean = np = (n)(p), i.e. the product of the index and the second term in (q + p);
Variance = npq = (n)(p)(q), i.e. the product of the index, the second term in (q + p)
and the first term in (q + p).
Similarly, the probabilities of the negative binomial distribution for X = 0, 1, 2, ...
are the successive terms of the expansion of (1/p - q/p)^(-r), and thus its mean
and variance are:
Mean = (index) × [second term in (1/p - q/p)] = (-r)(-q/p) = rq/p, and
Variance = (index) × [second term in (1/p - q/p)] × [first term in (1/p - q/p)]
= (-r)(-q/p)(1/p)
= rq/p^2.
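Both formulas can be verified against the negative binomial pmf by truncated summation (a sketch; r = 4 and p = 0.4 are arbitrary illustrative values):

```python
from math import comb

r, p = 4, 0.4
q = 1 - p

def nb_pmf(x):
    """P(X = x) = (x+r-1)C(r-1) p^r q^x: x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * q**x

xs = range(2000)
mean = sum(x * nb_pmf(x) for x in xs)
var = sum(x * x * nb_pmf(x) for x in xs) - mean**2
```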
Remark 2
i) If we take r = 1, we have P[X = x] = p q^x for x = 0, 1, 2, ..., which is the
geometric probability distribution.
Hence, geometric distribution is a particular case of negative binomial
distribution, and the latter may be regarded as a generalisation of the former.
ii) Putting r = 1 in the formulas of mean and variance of negative binomial
distribution, we have
Mean = (1)q/p = q/p, and
Variance = (1)q/p^2 = q/p^2,
which are the mean and variance of geometric distribution.
Hence, the expected number of misprints in the document till he catches the
20th misprint is 25.
Now, we are sure that you will be able to solve the following exercises:
E3) Find the probability that fourth five is obtained on the tenth throw of an
unbiased die.
E4) An item is produced by a machine in large numbers. The machine is
known to produce 10 per cent defectives. A quality control engineer is
testing the item randomly. What is the probability that at least 3 items
are examined in order to get 2 defectives?
E5) Find the expected number of children in a family which stops
producing children after having the second daughter. Assume, the male
and female births are equally probable.
We now conclude this unit by giving a summary of what we have covered in it.
12.4 SUMMARY
The following main points have been covered in this unit:
1) A random variable X is said to follow geometric distribution if it
assumes non-negative integer values and its probability mass function is
given by
P[X = x] = q^x p for x = 0, 1, 2, ...
         = 0, otherwise
2) For geometric distribution, mean = q/p and variance = q/p^2.
3) A random variable X is said to follow a negative binomial distribution
with parameters r (a positive integer) and p (0 < p < 1) if its probability
mass function is given by:
P[X = x] = (x+r-1)C(r-1) p^r q^x for x = 0, 1, 2, 3, ...
         = 0, otherwise
4) For negative binomial distribution, mean = rq/p and variance = rq/p^2.
5) For both these distributions, variance > mean.
12.5 SOLUTIONS/ANSWERS
E1) Let p be the probability of success i.e. hitting the target in an attempt.
Therefore p = 0.6 and q = 1 - p = 0.4.
Let X be the number of unsuccessful attempts preceding the first
successful attempt.
by geometric distribution,
P X x q x p for x 0, 1, 2, ...
x
= 0.4 0.6 for x 0, 1, 2, ...
E2) Here $p = \dfrac{1}{4}$ and hence $q = 1 - p = \dfrac{3}{4}$.
Now, let X be the number of failures preceding the first success.
$$\therefore\; P[X = x] = q^x p = \left(\frac{3}{4}\right)^{x} \left(\frac{1}{4}\right) \quad \text{for } x = 0, 1, 2, \ldots$$
This is the desired probability distribution.
E3) It is a negative binomial situation with
r = 4, $x + r = 10 \Rightarrow x = 6$, $p = \dfrac{1}{6}$ and hence $q = \dfrac{5}{6}$.
$$\therefore\; P[X = 6] = {}^{x+r-1}C_{r-1}\, p^r q^x = {}^{9}C_{3} \left(\frac{1}{6}\right)^4 \left(\frac{5}{6}\right)^6$$
$$= \frac{9 \times 8 \times 7}{6} \times \frac{1}{6^4} \times \left(\frac{5}{6}\right)^6 \approx 0.0217$$
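The arithmetic in E3 is easy to verify in a few lines; `neg_binom_pmf` below is a small helper (not from the text) implementing the p.m.f. stated in the summary:

```python
from math import comb

def neg_binom_pmf(x, r, p):
    """P[X = x]: probability of exactly x failures before the r-th success."""
    return comb(x + r - 1, r - 1) * p**r * (1 - p)**x

# E3: fourth five on the tenth throw -> r = 4, x = 10 - 4 = 6, p = 1/6
print(round(neg_binom_pmf(6, 4, 1/6), 4))       # 0.0217
# E4: at least 3 items to get 2 defectives -> 1 - P[X = 0] with r = 2, p = 0.1
print(round(1 - neg_binom_pmf(0, 2, 0.1), 2))   # 0.99
```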
E4) It is a negative binomial situation with r = 2, $x + r \ge 3 \Rightarrow x \ge 1$, p = 0.1 and
hence q = 0.9.
Now, the required probability $= P[X + r \ge 3]$
$= P[X \ge 1]$
$= 1 - P[X = 0]$
$= 1 - {}^{0+r-1}C_{r-1}\, p^r q^0 = 1 - (0.1)^2 = 0.99$
UNIT 13 NORMAL DISTRIBUTION
Structure
13.1 Introduction
Objectives
13.1 INTRODUCTION
In Units 9 to 12, we have studied standard discrete distributions. From this unit
onwards, we are going to discuss standard continuous univariate distributions.
This unit and the next unit deal with normal distribution. Normal distribution
has wide spread applications. It is being used in almost all data-based research
in the field of agriculture, trade, business, industry and the society. For
instance, normal distribution is a good approximation to the distribution of
heights of randomly selected large number of students studying at the same
level in a university.
The normal distribution has a unique position in probability theory, and it can
be used as approximation to most of the other distributions. Discrete
distributions occurring in practice including binomial, Poisson,
hypergeometric, etc. already studied in the previous block (Block 3) can also
be approximated by normal distribution. You will notice in the subsequent
courses that theory of estimation of population parameters and testing of
hypotheses on the basis of sample statistics have also been developed using
the concept of normal distribution as most of the sampling distributions tend to
normality for large samples. Therefore, study of normal distribution is very
important.
Due to various properties and applications of the normal distribution, we have
covered it in two units – Units 13 and 14. In the present unit, normal
distribution is introduced and explained in Sec. 13.2. Chief characteristics of
normal distribution are discussed in Sec. 13.3. Secs. 13.4, 13.5 and 13.6
describe the moments, mode, median and mean deviation about mean of the
distribution.
Objectives
After studying this unit, you would be able to:
introduce and explain the normal distribution;
know the conditions under which binomial and Poisson distributions
tend to normal distribution;
state various characteristics of the normal distribution;
compute the moments, mode, median and mean deviation about mean
of the distribution; and
solve various practical problems based on the above properties of
normal distribution.
$$\gamma_1 = \sqrt{\beta_1} = \frac{q - p}{\sqrt{npq}} = \frac{1 - 2p}{\sqrt{npq}}, \quad \text{and}$$
$$\gamma_2 = \beta_2 - 3 = \frac{1 - 6pq}{npq}.$$
From the above results, it may be noticed that if $n \to \infty$, then the moment
coefficient of skewness $(\gamma_1) \to 0$ and the moment coefficient of kurtosis
$\beta_2 \to 3$, i.e. $\gamma_2 \to 0$. Hence, as $n \to \infty$, the distribution becomes symmetrical
and the curve of the distribution becomes mesokurtic, which is the main
feature of normal distribution.
Normal Distribution as a Limiting Case of Poisson Distribution
You have already studied in Unit 10 of this course that Poisson distribution is a
limiting case of binomial distribution under the following conditions:
i) n, the number of trials, is indefinitely large, i.e. $n \to \infty$;
ii) p, the constant probability of success for each trial, is very small, i.e. $p \to 0$;
iii) np is a finite quantity, say λ.
As discussed above, there is a relation between the binomial and
normal distributions. It can, in fact, be shown that the Poisson distribution
approaches a normal distribution with standardized variable given by
$$Z = \frac{X - \lambda}{\sqrt{\lambda}}$$
as λ increases indefinitely.
For the Poisson distribution, you have already studied in Unit 10 of the course that
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{\lambda^2}{\lambda^3} = \frac{1}{\lambda}; \quad \text{and}$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\lambda^2 + \lambda}{\lambda^2} = 3 + \frac{1}{\lambda}, \quad \text{i.e. } \gamma_2 = \beta_2 - 3 = \frac{1}{\lambda}.$$
As with the binomial distribution, it may be noticed from the above results that
for the Poisson distribution the moment coefficient of skewness $(\gamma_1) \to 0$ and
the moment coefficient of kurtosis $\beta_2 \to 3$, i.e. $\gamma_2 \to 0$, as $\lambda \to \infty$. Hence,
as $\lambda \to \infty$, the distribution becomes symmetrical and the curve
of the distribution becomes mesokurtic, which is the main feature of normal
distribution.
Under the conditions discussed above, a random variable following a binomial
distribution or a Poisson distribution approaches a normal
distribution, which is defined as follows:
Definition: A continuous random variable X is said to follow the normal
distribution with parameters $\mu$ ($-\infty < \mu < \infty$) and $\sigma^2$ (> 0) if it takes on any real
value and its probability density function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty;$$
which may also be written as
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right], \quad -\infty < x < \infty.$$
Remark
i) The probability function represented by $f(x)$ may also be written as
$f(x;\, \mu, \sigma^2)$.
ii) If a random variable X follows the normal distribution with mean $\mu$ and
variance $\sigma^2$, then we may say "X is distributed as N($\mu$, $\sigma^2$)", which is
expressed as X ~ N($\mu$, $\sigma^2$).
iii) No continuous probability function, and hence not the normal distribution
either, can be used to obtain the probability of occurrence of a particular value
of the random variable. This is because such a probability is vanishingly small,
so instead of specifying the probability that the random variable takes a
particular value, we specify the probability of its lying within an
interval. For a detailed discussion of the concept, Sec. 5.4 of Unit 5 may be
referred to.
iv) If X ~ N($\mu$, $\sigma^2$), then $Z = \dfrac{X - \mu}{\sigma}$ is the standard normal variate, having
mean 0 and variance 1. The values of the mean and variance of the standard
normal variate are obtained as under, using the properties of
expectation and variance (see Unit 8 of this course):
$$E(Z) = E\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma}\, E(X - \mu) = \frac{1}{\sigma}\left[E(X) - \mu\right] = \frac{1}{\sigma}(\mu - \mu) = 0 \quad [\because E(X) = \text{mean of } X = \mu]$$
$$V(Z) = V\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma^2}\, V(X - \mu) = \frac{1}{\sigma^2}\, V(X) = \frac{\sigma^2}{\sigma^2} = 1. \quad [\because \text{variance of } X \text{ is } \sigma^2]$$
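These two results can also be observed empirically. A minimal simulation sketch, with μ = 80 and σ = 5 chosen arbitrarily:

```python
import random
import statistics

rng = random.Random(0)
mu, sigma = 80.0, 5.0          # any normal variate will do
xs = [rng.gauss(mu, sigma) for _ in range(100_000)]
zs = [(x - mu) / sigma for x in xs]   # the standardizing transform Z = (X - mu)/sigma

print(round(statistics.mean(zs), 2))       # close to 0
print(round(statistics.pvariance(zs), 2))  # close to 1
```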
v) The probability density function of the standard normal variate $Z = \dfrac{X - \mu}{\sigma}$
is given by
$$\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}, \quad -\infty < z < \infty.$$
vi) The graph of the normal probability function $f(x)$ with respect to x is the
famous 'bell-shaped' curve. The top of the bell is directly above the
mean $\mu$. For large values of $\sigma$, the curve tends to flatten out, and for
small values of $\sigma$, it has a sharp peak, as shown in Fig. 13.1.
Fig. 13.1
$\mu = 40$, $\sigma^2 = 25 \Rightarrow \sigma = \sqrt{25} = 5$ [$\because \sigma > 0$ always]
Now, the p.d.f. of the random variable X is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = \frac{1}{5\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-40}{5}\right)^2}, \quad -\infty < x < \infty$$
(ii) Here we are given X ~ N(−36, 20).
$\therefore$ in usual notations, we have
$\mu = -36$, $\sigma^2 = 20 \Rightarrow \sigma = \sqrt{20}$
Now, the p.d.f. of the random variable X is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = \frac{1}{\sqrt{20}\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x+36}{\sqrt{20}}\right)^2} = \frac{1}{2\sqrt{10\pi}}\, e^{-\frac{(x+36)^2}{40}}, \quad -\infty < x < \infty$$
(iii) Here we are given X ~ N(0, 2).
$\therefore$ in usual notations, we have
$\mu = 0$, $\sigma^2 = 2 \Rightarrow \sigma = \sqrt{2}$
Now, the p.d.f. of the random variable X is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = \frac{1}{\sqrt{2}\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-0}{\sqrt{2}}\right)^2} = \frac{1}{2\sqrt{\pi}}\, e^{-\frac{x^2}{4}}, \quad -\infty < x < \infty$$
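As a quick check on the three densities above, each can be evaluated at its mean, where the density peaks at $1/(\sigma\sqrt{2\pi})$. A small sketch in Python:

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

# The three cases of Example 1, evaluated at the mean
print(round(normal_pdf(40, 40, 5), 4))          # (i)   N(40, 25):  0.0798
print(round(normal_pdf(-36, -36, sqrt(20)), 4)) # (ii)  N(-36, 20): 0.0892
print(round(normal_pdf(0, 0, sqrt(2)), 4))      # (iii) N(0, 2):    0.2821
```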
Example 2: Below, in each case, is given the p.d.f. of a normally
distributed random variable. Obtain the parameters (mean and variance)
of the variable.
(i) $f(x) = \dfrac{1}{6\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-46}{6}\right)^2}$, $-\infty < x < \infty$
(ii) $f(x) = \dfrac{1}{4\sqrt{2\pi}}\, e^{-\frac{1}{32}(x-60)^2}$, $-\infty < x < \infty$
Solution: (i) $f(x) = \dfrac{1}{6\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-46}{6}\right)^2}$, $-\infty < x < \infty$
Comparing it with
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
we have
$\mu = 46$, $\sigma = 6$, i.e. mean = 46 and variance = 36.
(ii) $f(x) = \dfrac{1}{4\sqrt{2\pi}}\, e^{-\frac{1}{32}(x-60)^2} = \dfrac{1}{4\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-60}{4}\right)^2}$, $-\infty < x < \infty$
Comparing it with
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$$
we get
$\mu = 60$, $\sigma = 4$, i.e. mean = 60 and variance = 16.
iv) $f(x)$, being a probability density, can never be negative and hence no portion
of the curve lies below the x-axis.
v) Though the x-axis comes closer and closer to the normal curve as the
magnitude of x goes towards $\infty$ or $-\infty$, it never touches it.
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\sigma^4}{(\sigma^2)^2} = 3$$
i.e. the distribution is symmetrical and the curve is always mesokurtic.
Note: Not only $\mu_1$ and $\mu_3$, but all the odd-order central moments
are zero for a normal distribution.
$$P[\mu - \sigma < X < \mu + \sigma] = \int_{\mu-\sigma}^{\mu+\sigma} f(x)\,dx = 0.6827, \text{ or } P[-1 < Z < 1] = \int_{-1}^{1} \varphi(z)\,dz = 0.6827,$$
$$P[\mu - 2\sigma < X < \mu + 2\sigma] = \int_{\mu-2\sigma}^{\mu+2\sigma} f(x)\,dx = 0.9544, \text{ or } P[-2 < Z < 2] = \int_{-2}^{2} \varphi(z)\,dz = 0.9544, \text{ and}$$
$$P[\mu - 3\sigma < X < \mu + 3\sigma] = \int_{\mu-3\sigma}^{\mu+3\sigma} f(x)\,dx = 0.9973, \text{ or } P[-3 < Z < 3] = \int_{-3}^{3} \varphi(z)\,dz = 0.9973.$$
This property and its applications will be discussed in detail in Unit 14.
Let us now establish some of these properties.
Recall the gamma function, $\Gamma(n) = \int_0^\infty e^{-x} x^{n-1}\,dx$, $n > 0$;
e.g. $\int_0^\infty x^2 e^{-x}\,dx = \int_0^\infty x^{3-1} e^{-x}\,dx = \Gamma(3)$,
and $\int_0^\infty \dfrac{1}{\sqrt{x}}\, e^{-x}\,dx = \int_0^\infty x^{\frac{1}{2}-1} e^{-x}\,dx = \Gamma\!\left(\dfrac{1}{2}\right)$.
Some properties of the gamma function are:
i) If n > 1, $\Gamma(n) = (n-1)\,\Gamma(n-1)$;
ii) If n is a positive integer, then $\Gamma(n) = (n-1)!$;
iii) $\Gamma\!\left(\dfrac{1}{2}\right) = \sqrt{\pi}$.
Now, the first four central moments of normal distribution are obtained as
follows:
First Order Central Moment
As the first-order central moment ($\mu_1$) of any distribution is always zero [see Unit
3 of MST-002], the first-order central moment of the normal
distribution $= 0$.
Second Order Central Moment
$$\mu_2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx \quad \text{[See Unit 8 of MST-003]}$$
$$= \int_{-\infty}^{\infty} (x - \mu)^2 \cdot \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$$
Put $z = \dfrac{x - \mu}{\sigma} \Rightarrow x - \mu = \sigma z$. Differentiating, $\dfrac{dx}{\sigma} = dz \Rightarrow dx = \sigma\,dz$.
Also, when $x \to -\infty$, $z \to -\infty$, and when $x \to \infty$, $z \to \infty$.
$$\therefore\; \mu_2 = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} \sigma^2 z^2\, e^{-\frac{1}{2}z^2}\, \sigma\,dz = \frac{\sigma^2}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^2 e^{-\frac{z^2}{2}}\,dz = \frac{2\sigma^2}{\sqrt{2\pi}} \int_{0}^{\infty} z^2 e^{-\frac{z^2}{2}}\,dz$$
$$\left[\because \text{on changing } z \text{ to } -z \text{, the integrand } z^2 e^{-\frac{z^2}{2}} \text{ does not change, and } \int_{-\infty}^{\infty} f(z)\,dz = 2\int_{0}^{\infty} f(z)\,dz \text{ if } f(z) \text{ is an even function of } z\right]$$
Now, put $\dfrac{z^2}{2} = t \Rightarrow z^2 = 2t \Rightarrow z = (2t)^{\frac{1}{2}} \Rightarrow dz = \dfrac{1}{\sqrt{2t}}\,dt$
$$\therefore\; \mu_2 = \frac{2\sigma^2}{\sqrt{2\pi}} \int_{0}^{\infty} (2t)\, e^{-t}\, \frac{1}{\sqrt{2t}}\,dt = \frac{2\sqrt{2}\,\sigma^2}{\sqrt{2\pi}} \int_{0}^{\infty} t^{\frac{1}{2}}\, e^{-t}\,dt = \frac{2\sigma^2}{\sqrt{\pi}} \int_{0}^{\infty} t^{\frac{3}{2}-1} e^{-t}\,dt$$
$$= \frac{2\sigma^2}{\sqrt{\pi}}\, \Gamma\!\left(\frac{3}{2}\right) \quad \text{[By definition of the gamma function]}$$
$$= \frac{2\sigma^2}{\sqrt{\pi}} \cdot \frac{1}{2}\, \Gamma\!\left(\frac{1}{2}\right) \quad \text{[By Property (i) of the gamma function]}$$
$$= \frac{2\sigma^2}{\sqrt{\pi}} \cdot \frac{\sqrt{\pi}}{2} = \sigma^2 \quad \text{[By Property (iii) of the gamma function]}$$
Third Order Central Moment
$$\mu_3 = \int_{-\infty}^{\infty} (x - \mu)^3 f(x)\,dx = \int_{-\infty}^{\infty} (x - \mu)^3 \cdot \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$$
Put $z = \dfrac{x - \mu}{\sigma} \Rightarrow x - \mu = \sigma z \Rightarrow dx = \sigma\,dz$, and hence
$$\mu_3 = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} \sigma^3 z^3\, e^{-\frac{1}{2}z^2}\, \sigma\,dz = \frac{\sigma^3}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^3 e^{-\frac{1}{2}z^2}\,dz$$
Now, as the integrand $z^3 e^{-\frac{1}{2}z^2}$ changes to $-z^3 e^{-\frac{1}{2}z^2}$ on changing $z$ to $-z$, it
is an odd function of z. Therefore, using the property of the definite integral
$\int_{-a}^{a} f(z)\,dz = 0$ if $f(z)$ is an odd function of z, we have
$$\mu_3 = \frac{\sigma^3}{\sqrt{2\pi}} \times 0 = 0$$
Fourth Order Central Moment
$$\mu_4 = \int_{-\infty}^{\infty} (x - \mu)^4 f(x)\,dx = \int_{-\infty}^{\infty} (x - \mu)^4 \cdot \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$$
Putting $z = \dfrac{x - \mu}{\sigma} \Rightarrow dx = \sigma\,dz$,
$$\mu_4 = \frac{\sigma^4}{\sqrt{2\pi}} \int_{-\infty}^{\infty} z^4 e^{-\frac{1}{2}z^2}\,dz = \frac{2\sigma^4}{\sqrt{2\pi}} \int_{0}^{\infty} z^4 e^{-\frac{1}{2}z^2}\,dz$$
$$\left[\because \text{the integrand } z^4 e^{-\frac{z^2}{2}} \text{ does not change on changing } z \text{ to } -z \text{ and hence is an even function of } z\right]$$
Put $\dfrac{z^2}{2} = t \Rightarrow z^2 = 2t \Rightarrow 2z\,dz = 2\,dt \Rightarrow z\,dz = dt \Rightarrow dz = \dfrac{dt}{z} = \dfrac{dt}{\sqrt{2t}}$
$$\therefore\; \mu_4 = \frac{2\sigma^4}{\sqrt{2\pi}} \int_{0}^{\infty} (2t)^2\, e^{-t}\, \frac{dt}{\sqrt{2t}} = \frac{2^2\sqrt{2}\,\sigma^4}{\sqrt{2\pi}} \int_{0}^{\infty} t^{\frac{3}{2}}\, e^{-t}\,dt = \frac{4\sigma^4}{\sqrt{\pi}} \int_{0}^{\infty} t^{\frac{5}{2}-1} e^{-t}\,dt$$
$$= \frac{4\sigma^4}{\sqrt{\pi}}\, \Gamma\!\left(\frac{5}{2}\right) \quad \text{[By definition of the gamma function]}$$
$$= \frac{4\sigma^4}{\sqrt{\pi}} \cdot \frac{3}{2}\, \Gamma\!\left(\frac{3}{2}\right) = \frac{4\sigma^4}{\sqrt{\pi}} \cdot \frac{3}{2} \cdot \frac{1}{2}\, \Gamma\!\left(\frac{1}{2}\right) \quad \text{[By Property (i) of the gamma function]}$$
$$= \frac{3\sigma^4}{\sqrt{\pi}} \cdot \sqrt{\pi} = 3\sigma^4 \quad \text{[By Property (iii) of the gamma function]}$$
Thus, the first four central moments of the normal distribution are
$$\mu_1 = 0, \quad \mu_2 = \sigma^2, \quad \mu_3 = 0, \quad \mu_4 = 3\sigma^4.$$
$$\therefore\; \beta_1 = \frac{\mu_3^2}{\mu_2^3} = 0, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\sigma^4}{\sigma^4} = 3$$
Therefore, the moment coefficient of skewness $(\gamma_1) = 0$.
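The four central moments can also be verified numerically by integrating $(x-\mu)^k f(x)$ directly. A rough sketch using the midpoint rule (the parameters μ = 2, σ = 3 and the integration settings are arbitrary):

```python
from math import exp, pi, sqrt

def central_moment(k, mu, sigma, n=100_000, half_width=8):
    """k-th central moment of N(mu, sigma^2), by midpoint-rule integration
    over [mu - half_width*sigma, mu + half_width*sigma]."""
    a = mu - half_width * sigma
    h = 2 * half_width * sigma / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        f = exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))
        total += (x - mu) ** k * f * h
    return total

mu, sigma = 2.0, 3.0
print(round(central_moment(1, mu, sigma), 3))  # mu_1 = 0
print(round(central_moment(2, mu, sigma), 3))  # mu_2 = sigma^2 = 9
print(round(central_moment(3, mu, sigma), 3))  # mu_3 = 0
print(round(central_moment(4, mu, sigma), 3))  # mu_4 = 3*sigma^4 = 243
```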
Now, let us obtain the mode and median for normal distribution in the next
section.
$$f'(x) = -\frac{(x - \mu)}{\sigma^2}\, f(x) = 0$$
$$\Rightarrow x - \mu = 0, \text{ i.e. } x = \mu \quad [\because f(x) \neq 0]$$
Now, differentiating (2) w.r.t. x, we have
$$f''(x) = -\frac{(x - \mu)}{\sigma^2}\, f'(x) - \frac{1}{\sigma^2}\, f(x)$$
$$\therefore\; f''(x) \text{ at } x = \mu = -0 - \frac{f(\mu)}{\sigma^2} = -\frac{f(\mu)}{\sigma^2} < 0$$
$\therefore\; x = \mu$ is the point where the function attains its maximum value.
$\therefore$ Mode of X is $\mu$.
Median
Let M denote the median of the normally distributed random variable X.
We know that the median divides the distribution into two equal parts, so
$$\int_{-\infty}^{M} f(x)\,dx = \int_{M}^{\infty} f(x)\,dx = \frac{1}{2}$$
$$\Rightarrow \int_{-\infty}^{M} f(x)\,dx = \frac{1}{2}$$
$$\Rightarrow \int_{-\infty}^{\mu} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx + \int_{\mu}^{M} f(x)\,dx = \frac{1}{2}$$
In the first integral, let us put $z = \dfrac{x - \mu}{\sigma}$. Therefore, $dx = \sigma\,dz$.
Also, when $x = \mu$, $z = 0$, and when $x \to -\infty$, $z \to -\infty$.
Thus, we have
$$\int_{-\infty}^{0} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\,dz + \int_{\mu}^{M} f(x)\,dx = \frac{1}{2}$$
$$\Rightarrow \frac{1}{2} + \int_{\mu}^{M} f(x)\,dx = \frac{1}{2} \quad \left[\because Z \text{ is a s.n.v. with p.d.f. } \varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2} \text{, so } \int_{-\infty}^{0} \varphi(z)\,dz = \frac{1}{2}\right]$$
$$\Rightarrow \int_{\mu}^{M} f(x)\,dx = 0 \Rightarrow M = \mu \quad [\because f(x) \neq 0]$$
$\therefore$ Median of X is $\mu$.
Mean Deviation about Mean
$$\text{M.D. about mean} = \int_{-\infty}^{\infty} |x - \mu|\, \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$$
Put $z = \dfrac{x - \mu}{\sigma} \Rightarrow x - \mu = \sigma z \Rightarrow dx = \sigma\,dz$
$$\therefore\; \text{M.D. about mean} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} |\sigma z|\, e^{-\frac{1}{2}z^2}\,dz = \frac{\sigma}{\sqrt{2\pi}} \int_{-\infty}^{\infty} |z|\, e^{-\frac{1}{2}z^2}\,dz$$
Now, $|z|\, e^{-\frac{1}{2}z^2}$ (the integrand) is an even function of z, as it does not change on
changing z to $-z$; by the property
$$\int_{-a}^{a} f(x)\,dx = 2\int_{0}^{a} f(x)\,dx \text{ if } f(x) \text{ is an even function of } x,$$
we have
$$\text{M.D. about mean} = \frac{2\sigma}{\sqrt{2\pi}} \int_{0}^{\infty} |z|\, e^{-\frac{1}{2}z^2}\,dz$$
Now, as the range of z is from 0 to $\infty$, i.e. z takes non-negative values,
$|z| = z$ and hence
$$\text{M.D. about mean} = \frac{2\sigma}{\sqrt{2\pi}} \int_{0}^{\infty} z\, e^{-\frac{1}{2}z^2}\,dz$$
Put $\dfrac{z^2}{2} = t \Rightarrow z^2 = 2t \Rightarrow 2z\,dz = 2\,dt \Rightarrow z\,dz = dt$
$$\therefore\; \text{M.D. about mean} = \frac{2\sigma}{\sqrt{2\pi}} \int_{0}^{\infty} e^{-t}\,dt = \frac{2\sigma}{\sqrt{2\pi}} \left[\frac{e^{-t}}{-1}\right]_0^\infty = \frac{2\sigma}{\sqrt{2\pi}}\,[0 - (-1)] = \sqrt{\frac{2}{\pi}}\,\sigma$$
In practice, instead of $\sqrt{\dfrac{2}{\pi}}\,\sigma$, its approximate value $\dfrac{4}{5}\,\sigma$ is mostly used:
$$\sqrt{\frac{2}{\pi}} \approx \sqrt{\frac{2 \times 7}{22}} = \sqrt{\frac{7}{11}} \approx \sqrt{0.6364} \approx 0.7977 \approx 0.8 = \frac{4}{5} \text{ (approx.)}$$
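The result M.D. $= \sqrt{2/\pi}\,\sigma \approx \frac{4}{5}\sigma$ is easy to confirm by simulation; a minimal sketch with an arbitrary choice μ = 50, σ = 10:

```python
from math import pi, sqrt
import random

rng = random.Random(1)
mu, sigma = 50.0, 10.0
xs = [rng.gauss(mu, sigma) for _ in range(200_000)]
md = sum(abs(x - mu) for x in xs) / len(xs)   # sample mean deviation about the mean

print(round(md, 1))                    # simulated value
print(round(sigma * sqrt(2 / pi), 2))  # exact: 7.98, i.e. about (4/5)*sigma = 8
```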
Let us now take up some problems based on properties of Normal Distribution
in the next section.
If $X_1 \sim N(\mu_1, \sigma_1^2)$ and $X_2 \sim N(\mu_2, \sigma_2^2)$ are independent,
then
$$X_1 + X_2 \sim N(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2), \quad \text{and}$$
$$X_1 - X_2 \sim N(\mu_1 - \mu_2,\; \sigma_1^2 + \sigma_2^2) \quad \text{[See Property xiii (Section 13.3)]}$$
$\therefore$ i) $X_1 + X_2 \sim N(0 + 0,\; 1 + 1)$, i.e. $X_1 + X_2 \sim N(0, 2)$, and
ii) $X_1 - X_2 \sim N(0 - 0,\; 1 + 1)$, i.e. $X_1 - X_2 \sim N(0, 2)$
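The distribution of $X_1 - X_2$ can be checked by simulation — in particular, that the variances add even under subtraction:

```python
import random
import statistics

rng = random.Random(7)
n = 100_000
x1 = [rng.gauss(0, 1) for _ in range(n)]   # X1 ~ N(0, 1)
x2 = [rng.gauss(0, 1) for _ in range(n)]   # X2 ~ N(0, 1), independent of X1

diff = [a - b for a, b in zip(x1, x2)]
print(round(statistics.mean(diff), 2))       # theory: 0 - 0 = 0
print(round(statistics.pvariance(diff), 2))  # theory: 1 + 1 = 2 (variances add)
```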
Mean deviation about mean $= \sqrt{\dfrac{2}{\pi}}\,\sigma \approx \dfrac{4}{5}\,\sigma = 5$
Solution: Here $\mu = 0$, $\sigma^2 = 1 \Rightarrow \sigma = 1$.
$\therefore$ the first four central moments are
$$\mu_1 = 0, \quad \mu_2 = \sigma^2 = 1, \quad \mu_3 = 0, \quad \mu_4 = 3\sigma^4 = 3.$$
Mean of $X_1$ = $E(X_1) = 40$, Variance of $X_1$ = $Var(X_1) = 25$
Mean of $X_2$ = $E(X_2) = 60$, Variance of $X_2$ = $Var(X_2) = 36$
Now,
(i) Mean of X $= E(X) = E(2X_1 + 3X_2) = E(2X_1) + E(3X_2) = 2E(X_1) + 3E(X_2) = 2(40) + 3(60) = 260$
Variance of X $= 4\,Var(X_1) + 9\,Var(X_2) = 4(25) + 9(36) = 424$
(ii) $E(3X_1 - 2X_2) = E(3X_1) + E(-2X_2) = 3E(X_1) + (-2)E(X_2) = 3(40) - 2(60) = 0$
13.8 SUMMARY
The following main points have been covered in this unit:
1) A continuous random variable X is said to follow the normal distribution with
parameters $\mu$ ($-\infty < \mu < \infty$) and $\sigma^2$ (> 0) if it takes on any real value and its
probability density function is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$
2) If X ~ N($\mu$, $\sigma^2$), then $Z = \dfrac{X - \mu}{\sigma}$ is the standard normal variate.
13.9 SOLUTIONS/ANSWERS
E1) (i) Here we are given X ~ N$\left(\dfrac{1}{2}, \dfrac{4}{9}\right)$
$\therefore$ in usual notations, we have
$\mu = \dfrac{1}{2}$, $\sigma^2 = \dfrac{4}{9} \Rightarrow \sigma = \dfrac{2}{3}$
Now, the p.d.f. of the r.v. X is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = \frac{1}{\frac{2}{3}\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-1/2}{2/3}\right)^2} = \frac{3}{2\sqrt{2\pi}}\, e^{-\frac{9}{8}\left(x-\frac{1}{2}\right)^2}, \quad -\infty < x < \infty$$
(ii) Here we are given X ~ N(−40, 16)
$\therefore$ in usual notations, we have
$\mu = -40$, $\sigma^2 = 16 \Rightarrow \sigma = 4$
Now, the p.d.f. of the r.v. X is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = \frac{1}{4\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x+40}{4}\right)^2}, \quad -\infty < x < \infty$$
E2) (i) $f(x) = \dfrac{1}{2\sqrt{2\pi}}\, e^{-\frac{x^2}{8}}$, $-\infty < x < \infty$
$$= \frac{1}{2\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-0}{2}\right)^2}, \quad -\infty < x < \infty \qquad \ldots(1)$$
Comparing (1) with
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$$
we get
$\mu = 0$, $\sigma = 2 \Rightarrow \sigma^2 = 4$
$30 = \bar{x} - 5 \Rightarrow \bar{x} = 35 \Rightarrow$ Mean = 35
Given that the fourth moment about 35 is 768. But the mean is 35, and hence the
fourth moment about the mean = 768.
$$\Rightarrow \mu_4 = 768 \Rightarrow 3\sigma^4 = 768 \quad [\because \mu_4 = 3\sigma^4]$$
$$\Rightarrow \sigma^4 = \frac{768}{3} = 256 = (4)^4 \Rightarrow \sigma = 4.$$
UNIT 14 AREA PROPERTY OF NORMAL DISTRIBUTION
Structure
14.1 Introduction
Objectives
14.1 INTRODUCTION
In Unit 13, you have studied normal distribution and its chief characteristics.
Some characteristics including moments, mode, median, mean deviation about
mean have been established too in Unit 13. The area property of the normal
distribution was only touched upon in the preceding unit. The area property is
very important and has a lot of applications, and hence it needs to be studied in
detail. So, in Unit 14 this property, with its diversified applications, is
discussed in detail. Fitting of a normal distribution to observed data and
computation of expected frequencies are also discussed in one of the
sections, Sec. 14.3, of this unit.
Objectives
After studying this unit, you would be able to:
describe the importance of area property of normal distribution;
explain use of the area property to solve many practical life problems; and
fit a normal distribution to the observed data and compute the expected
frequencies using area property.
$$\int_{0}^{z_1} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\,dz = \int_{0}^{z_1} \varphi(z)\,dz$$
where $\varphi(z) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}$ is the probability density function of the standard normal
variate, and the definite integral $\int_{0}^{z_1} \varphi(z)\,dz$ represents the area
under the standard normal curve between the ordinates at Z = 0 and Z = z$_1$ (Fig.
14.2).
You need not evaluate the integral to find the area: a table is available giving
such areas for different values of z$_1$.
Here, we have transformed the integral from
$$\int_{\mu}^{x_1} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx \quad \text{to} \quad \int_{0}^{z_1} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^2}\,dz,$$
i.e. we have transformed the normal variate X to the standard normal variate (S.N.V.)
$$Z = \frac{X - \mu}{\sigma}.$$
This is because the computation of
$$\int_{\mu}^{x_1} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$$
would require separate tables for different values of $\mu$
and $\sigma$: as the normal variate X may have any values of mean and standard
deviation, different tables would be required for different $\mu$ and $\sigma$, so
infinitely many tables would have to be constructed, which is impossible. But the
beauty of the standard normal variate is that its mean is always 0 and its standard
deviation is always 1, as shown in Unit 13. So, whatever the values of the mean
and standard deviation of a normal variate may be, the mean and standard deviation
on transforming it to the standard normal variate are always 0 and 1
respectively, and hence only one table is required.
In particular,
$$P[\mu - \sigma < X < \mu + \sigma] = \int_{\mu-\sigma}^{\mu+\sigma} f(x)\,dx \quad \text{[See Fig. 14.3]}$$
$$= P[-1 < Z < 1] = \int_{-1}^{1} \varphi(z)\,dz \quad \left[\because \text{when } X = \mu - \sigma,\; Z = -1 \text{ and when } X = \mu + \sigma,\; Z = 1\right]$$
$$= 2\int_{0}^{1} \varphi(z)\,dz \quad \text{[By symmetry]}$$
Similarly,
$$P[\mu - 2\sigma < X < \mu + 2\sigma] = \int_{\mu-2\sigma}^{\mu+2\sigma} f(x)\,dx \quad \text{[See Fig. 14.4]}$$
and, for $Z = \dfrac{X - \mu}{\sigma}$, we have
$$P[-2 < Z < 2] = \int_{-2}^{2} \varphi(z)\,dz = 2\int_{0}^{2} \varphi(z)\,dz \quad \left[\because Z = -2 \text{ when } X = \mu - 2\sigma \text{ and } Z = 2 \text{ when } X = \mu + 2\sigma\right]$$
$$= 2 \times 0.4772 \quad \text{[From the table given in the Appendix at the end of the unit]}$$
$$= 0.9544$$
Likewise,
$$P[\mu - 3\sigma < X < \mu + 3\sigma] = 2 \times 0.49865 = 0.9973$$
$\therefore$ P[X lies within the range $\mu \pm 3\sigma$] = 0.9973,
and P[X lies outside the range $\mu \pm 3\sigma$] = 1 − 0.9973 = 0.0027,
which is very small. Hence we usually expect a normal variate to lie within
the range from $\mu - 3\sigma$ to $\mu + 3\sigma$, though theoretically it ranges from $-\infty$ to $\infty$.
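These areas can be reproduced without the table, since the standard normal distribution function can be written in terms of the error function, $\Phi(z) = \frac{1}{2}\left[1 + \mathrm{erf}(z/\sqrt{2})\right]$. A short check:

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

for k in (1, 2, 3):
    print(k, round(Phi(k) - Phi(-k), 4))
# k=1 -> 0.6827, k=2 -> 0.9545, k=3 -> 0.9973
# (0.9545 vs the table's 0.9544 is just four-figure table rounding)
```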
$\mu = 45$, $\sigma^2 = 16 \Rightarrow \sigma = \sqrt{16} = 4$ [$\because \sigma > 0$ always]
Now $Z = \dfrac{X - \mu}{\sigma} = \dfrac{X - 45}{4}$
(i) When X = 45, $Z = \dfrac{45 - 45}{4} = \dfrac{0}{4} = 0$
(ii) When X = 53, $Z = \dfrac{53 - 45}{4} = \dfrac{8}{4} = 2$
(iii) When X = 41, $Z = \dfrac{41 - 45}{4} = \dfrac{-4}{4} = -1$
(iv) When X = 47, $Z = \dfrac{47 - 45}{4} = \dfrac{2}{4} = 0.5$
Example 2: If the r.v. X is normally distributed with mean 80 and standard
deviation 5, then find
(i) P[X ≥ 95], (ii) P[X ≤ 72], (iii) P[60.5 < X < 90],
Solution: Here we are given that X is normally distributed with mean 80 and
standard deviation (S.D.) 5.
i.e. mean $\mu$ = 80 and variance $\sigma^2 = (S.D.)^2 = 25$.
If Z is the S.N.V., then $Z = \dfrac{X - \mu}{\sigma} = \dfrac{X - 80}{5}$
Now
(i) When X = 95, $Z = \dfrac{95 - 80}{5} = \dfrac{15}{5} = 3$
$\therefore\; P[X \ge 95] = P[Z \ge 3]$ [See Fig. 14.6]
$= 0.5 - P[0 \le Z \le 3]$
(iii) When X = 60.5, $Z = \dfrac{60.5 - 80}{5} = \dfrac{-19.5}{5} = -3.9$
When X = 90, $Z = \dfrac{90 - 80}{5} = \dfrac{10}{5} = 2$
$\therefore\; P[60.5 < X < 90] = P[-3.9 < Z < 2]$ [See Fig. 14.8]
$= P[-3.9 < Z < 0] + P[0 < Z < 2]$
$= P[0 < Z < 3.9] + P[0 < Z < 2]$ [$\because$ the normal curve is symmetrical about the line Z = 0]
$= P[0 < Z < 3.4] + P[0 < Z < 1]$
(v) When X = 64, $Z = \dfrac{64 - 80}{5} = \dfrac{-16}{5} = -3.2$
When X = 76, $Z = \dfrac{76 - 80}{5} = \dfrac{-4}{5} = -0.8$
$\therefore\; P[64 < X < 76] = P[-3.2 < Z < -0.8]$ [See Fig. 14.10]
(b) What is the lowest weight of the 100 heaviest male students?
(Assuming that the weights are normally distributed)
Solution: Let X be a normal variate, “The weights of the male students of the
university”. Here, we are given that µ = 60 kg, σ = 16 kg, therefore,
X ~ N(60, 256).
We know that if X ~ N($\mu$, $\sigma^2$), then the standard normal variate is given by
$$Z = \frac{X - \mu}{\sigma}.$$
Hence, for the given information, $Z = \dfrac{X - 60}{16}$
(a) i) For X = 55, $Z = \dfrac{55 - 60}{16} = -0.3125 \approx -0.31$.
Therefore,
P[X < 55] = P[Z < −0.31] = P[Z > 0.31] [See Fig. 14.11]
= 0.5 − P[0 < Z < 0.31] [$\because$ the area on each side of Z = 0 is 0.5]
= 0.5 − 0.1217 [Using the table of areas under the normal curve]
= 0.3783
Number of male students having weight more than 70 kg = N × P[X > 70]
= 1000 × 0.2643 ≈ 264
iii) For X = 45, $Z = \dfrac{45 - 60}{16} = -0.9375 \approx -0.94$
For X = 65, $Z = \dfrac{65 - 60}{16} = 0.3125 \approx 0.31$
$\therefore\; P[45 < X < 65] = P[-0.94 < Z < 0.31]$ [See Fig. 14.13]
Now, for $X = x_1$, $Z = \dfrac{x_1 - 60}{16} = z_1$ (say).
$$P[X \ge x_1] = \frac{100}{1000} = 0.1 \quad \text{[See Fig. 14.14]}$$
$$\Rightarrow P[Z \ge z_1] = 0.1$$
Therefore, the lowest weight of 100 heaviest male students is 80.48 kg.
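The same cutoff can be obtained directly from the inverse distribution function; a sketch using Python's `statistics.NormalDist` (the small difference from 80.48 comes from the table's two-decimal z value, 1.28):

```python
from statistics import NormalDist

# Example 3(b): weights ~ N(60, 16^2); the 100 heaviest of 1000 students
# lie above the 90th percentile of the distribution
weights = NormalDist(mu=60, sigma=16)
cutoff = weights.inv_cdf(0.90)
print(round(cutoff, 1))  # ~80.5 kg, close to the text's 80.48 kg
```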
Example 4: In a normal distribution 10% of the items are over 125 and 35% are under
60. Find the mean and standard deviation of the distribution.
Solution:
Fig. 14.15: Area Representing the Items under 60 and over 125
Let X ~ N(µ, σ2), where µ and σ2 are unknown and are to be obtained.
Here we are given
P[X > 125] = 0.1 and P[X < 60] = 0.35. [See Fig. 14.15]
We know that if X ~ N($\mu$, $\sigma^2$), then $Z = \dfrac{X - \mu}{\sigma}$.
For X = 60, $Z = \dfrac{60 - \mu}{\sigma} = -z_1$ (say) ... (1) [the $-$ve sign is taken because P[X < 60] < 0.5, so 60 lies below the mean]
For X = 125, $Z = \dfrac{125 - \mu}{\sigma} = z_2$ (say) ... (2)
Now $P[X < 60] = P[Z < -z_1] = 0.35$
$$\Rightarrow P[0 < Z < z_1] = 0.15 \Rightarrow z_1 = 0.39$$
Similarly, $P[X > 125] = P[Z > z_2] = 0.1 \Rightarrow P[0 < Z < z_2] = 0.40 \Rightarrow z_2 = 1.28$
$$\therefore\; \frac{60 - \mu}{\sigma} = -0.39 \;\; \ldots (3) \qquad \frac{125 - \mu}{\sigma} = 1.28 \;\; \ldots (4)$$
(4) − (3) gives
$$\frac{125 - 60}{\sigma} = 1.28 + 0.39 \Rightarrow \frac{65}{\sigma} = 1.67 \Rightarrow \sigma = \frac{65}{1.67} = 38.92$$
From Eq. (4), $125 - \mu = 1.28\,\sigma \Rightarrow \mu = 125 - 1.28 \times 38.92 = 75.18$
Hence mean = 75.18 and S.D. = 38.92.
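Example 4 can be cross-checked by solving the same two percentile equations with exact z values instead of the table's rounded ones:

```python
from statistics import NormalDist

z = NormalDist()            # standard normal
z1 = z.inv_cdf(0.35)        # X = 60 sits at the 35th percentile
z2 = z.inv_cdf(1 - 0.10)    # X = 125 sits at the 90th percentile

# 60 = mu + z1*sigma and 125 = mu + z2*sigma
sigma = (125 - 60) / (z2 - z1)
mu = 125 - z2 * sigma
print(round(mu, 2), round(sigma, 2))  # close to the text's 75.18 and 38.92
```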
Example 5: Find the quartile deviation of the normal distribution having mean $\mu$
and variance $\sigma^2$.
Solution: Let X ~ N($\mu$, $\sigma^2$). Let $Q_1$ and $Q_3$ be the first and third quartiles. Now,
as $Q_1$, $Q_2$ and $Q_3$ divide the distribution into four equal parts, the areas
under the normal curve to the left of Q1 , between Q1 and Q2 (Median), between
Q2 and Q3 and to the right of Q3 all are equal to 25 percent of the total area. This
has been shown in Fig. 14.16.
and when $X = Q_3$, $Z = \dfrac{Q_3 - \mu}{\sigma} = z_1$ (say).
Due to the symmetry of the normal curve, the values of Z corresponding to $Q_1$ and $Q_3$
are equal in magnitude, because they are equidistant from the mean.
$$\therefore\; \frac{Q_3 - \mu}{\sigma} = z_1 \Rightarrow Q_3 = \mu + z_1\sigma$$
E1) If X ~ N(150, 9) and Z is a S.N.V., i.e. $Z = \dfrac{X - \mu}{\sigma}$, then find the Z scores
corresponding to the following values of X:
(i) X = 165 (ii) X = 120
E2) Suppose X ~ N(25, 4); then find
(i) P[X < 22], (ii) P[X > 23], (iii) P[|X – 24| < 3], and (iv) P[|X – 21| > 2]
Normal Distribution
E3) Suppose X ~ N(30, 16); then find $\lambda$ in each case:
(i) P[X > $\lambda$] = 0.2492
(ii) P[X < $\lambda$] = 0.0496
E4) Let the random variable X denote the chest measurements (in cm) of
2000 boys, where X ~ N(85, 36).
a) Then find the number of boys having chests measurement
i) less than or equal to 87 cm,
ii) between 86 cm and 90 cm,
iii) more than 80 cm.
b) What is the lowest value of the chest measurement among the 100
boys having the largest chest measurements?
E5) In a particular branch of a bank, it is noted that the duration/waiting time
of the customers for being served by the teller is normally distributed
with mean 5.5 minutes and standard deviation 0.6 minutes. Find the
probability that a customer has to wait
a) between 4.2 and 4.5 minutes, (b) for less than 5.2 minutes, and (c)
more than 6.8 minutes
E6) Suppose that temperature of a particular city in the month of March is
normally distributed with mean 24 C and standard deviation 6 C . Find
the probability that temperature of the city on a day of the month of
March is
(a) less than 20 C (b) more than 26 C (c) between 23 C and 27 C
(ii) Find the standard normal variate $Z = \dfrac{X - \mu}{\sigma}$ corresponding to each lower
limit. Suppose the values of the standard normal variate are obtained as z₁,
z₂, z₃, …
(iii) Find P[Z ≤ z₁], P[Z ≤ z₂], P[Z ≤ z₃], … i.e. the areas under the normal curve
to the left of the ordinate at each value of Z obtained in step (ii), using the table
given in the Appendix at the end of the unit. Z = zᵢ may be to the right or left
of Z = 0.
If Z = zᵢ is to the left of Z = 0, as shown in the figure:
Then
P[Z ≤ zᵢ] = 0.5 – P[zᵢ ≤ Z ≤ 0]
= 0.5 – P[0 ≤ Z ≤ –zᵢ] [Due to symmetry]
e.g. if zᵢ = –2 (say),
then P[Z ≤ –2] = 0.5 – P[–2 ≤ Z ≤ 0]
= 0.5 – P[0 ≤ Z ≤ –(–2)]
= 0.5 – P[0 ≤ Z ≤ 2]
(iv) Obtain the areas for the successive class intervals on subtracting the area
corresponding to every lower limit from the area corresponding to the
succeeding lower limit.
e.g. suppose 10, 20, 30 are three successive lower limits.
Then areas corresponding to these limits are
P[X ≤ 10], P[X ≤ 20], P[X ≤ 30] respectively.
Now the difference P[X ≤ 30] – P[X ≤ 20] gives the area corresponding to
the interval 20–30.
(v) Finally, multiplying the differences obtained in step (iv), i.e. the areas
corresponding to the intervals, by N (the sum of the observed frequencies),
we get the expected frequencies.
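The procedure in steps (i)–(v) can be sketched in a few lines of Python; the class limits and fitted parameters below are those of the worked example that follows (μ = 22, σ = 10.54, N = 20):

```python
from statistics import NormalDist

dist = NormalDist(mu=22, sigma=10.54)
limits = [0, 10, 20, 30, 40, 50]   # class boundaries 0-10, ..., 40-50
N = 20                             # sum of observed frequencies

cdf = [dist.cdf(x) for x in limits]       # step (iii): areas to the left
cdf[0], cdf[-1] = 0.0, 1.0                # fold the tails into the end classes
expected = [round(N * (hi - lo), 1)       # steps (iv)-(v): differences times N
            for lo, hi in zip(cdf, cdf[1:])]
print(expected)   # expected frequencies, summing (up to rounding) to N = 20
```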
X f
0-10 3
10-20 5
20-30 8
30-40 3
40-50 1
Solution: First we are to find the mean and variance of the given frequency
distribution. This you can obtain yourself, as you did in Unit 2 of MST-002 and
at many other stages, so it is left as an exercise for you.
You will get the mean and variance as
$\mu$ = 22 and $\sigma^2$ = 111 respectively $\Rightarrow \sigma$ = 10.54
Hence, the equation of the normal curve is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = \frac{1}{10.54\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-22}{10.54}\right)^2}, \quad -\infty < x < \infty$$
Expected frequencies are computed as follows:
The areas under the normal curve shown in the fourth column of the above table
are obtained as follows:
P[Z < –∞] = 0 [$\because$ there is no value to the left of $-\infty$]
P[Z ≤ –2.09] = 0.5 – P[–2.09 ≤ Z ≤ 0] [See Fig. 14.19]
= 0.5 – P[0 ≤ Z ≤ 2.09] [Due to symmetry]
= 0.5 – 0.4817 [From the table given at the end of the unit]
= 0.0183
Similarly,
P[Z ≤ 1.71] = 0.5 + 0.4564 = 0.9564
P[Z ≤ 2.66] = 0.5 + 0.4961 = 0.9961
You can now try the following exercises:
E7) Fit a normal curve to the following distribution and find the expected
frequencies by area method.
14.4 SUMMARY
The main points covered in this unit are:
1) The area property and its various applications have been discussed in detail.
2) Quartile deviation has also been obtained using the area property in an
example.
3) Fitting of normal distribution using area property and computation of
expected frequencies using area method have been explained.
14.5 SOLUTIONS/ANSWERS
E1) We are given X ~ N(150, 9)
$\therefore$ in usual notations, we have
$\mu = 150$, $\sigma^2 = 9 \Rightarrow \sigma = 3$
Now, $Z = \dfrac{X - \mu}{\sigma} = \dfrac{X - 150}{3}$
(i) When X = 165, $Z = \dfrac{165 - 150}{3} = \dfrac{15}{3} = 5$
(ii) When X = 120, $Z = \dfrac{120 - 150}{3} = \dfrac{-30}{3} = -10$
E2) Here X ~ N(25, 4)
$\therefore$ in usual notations, we have
mean $\mu$ = 25, variance $\sigma^2 = 4 \Rightarrow \sigma = 2$
If Z is the S.N.V., then $Z = \dfrac{X - \mu}{\sigma} = \dfrac{X - 25}{2}$
i) When X = 22, $Z = \dfrac{22 - 25}{2} = \dfrac{-3}{2} = -1.5$
$\therefore\; P[X < 22] = P[Z < -1.5]$ [See Fig. 14.21]
$= P[Z > 1.5]$ [Due to the symmetry of the normal curve]
$= 0.5 - P[0 < Z < 1.5]$
ii) When X = 23, $Z = \dfrac{23 - 25}{2} = \dfrac{-2}{2} = -1$
$\therefore\; P[X > 23] = P[Z > -1]$ [See Fig. 14.22]
$= P[Z < 1]$ [Due to the symmetry of the normal curve]
$= 0.5 + P[0 < Z < 1]$
iii) $P[|X - 24| < 3] = P[-3 < X - 24 < 3]$ [$\because |x - a| < b \Leftrightarrow a - b < x < a + b$]
$= P[-3 + 24 < X < 3 + 24]$
$= P[21 < X < 27]$
When X = 21, $Z = \dfrac{21 - 25}{2} = \dfrac{-4}{2} = -2$
When X = 27, $Z = \dfrac{27 - 25}{2} = \dfrac{2}{2} = 1$
$\therefore\; P[|X - 24| < 3] = P[21 < X < 27]$ [See Fig. 14.23]
$= P[-2 < Z < 1]$
$= P[-2 < Z < 0] + P[0 < Z < 1]$
$= P[0 < Z < 2] + P[0 < Z < 1]$
$= 0.4772 + 0.3413 = 0.8185$
iv) $P[|X - 21| > 2] = P[X - 21 > 2 \text{ or } X - 21 < -2]$ [$\because |y| > a \Leftrightarrow y > a \text{ or } y < -a$]
$= P[X > 23 \text{ or } X < 19]$
For X = 19, $Z = \dfrac{19 - 25}{2} = \dfrac{-6}{2} = -3$
For X = 23, $Z = \dfrac{23 - 25}{2} = \dfrac{-2}{2} = -1$
$\therefore\; P[|X - 21| > 2] = P[X > 23 \text{ or } X < 19]$ [See Fig. 14.24]
$= P[Z > -1 \text{ or } Z < -3]$
E3) i) $P[0 < Z < z_1] = 0.2508 \Rightarrow z_1 = 0.67$
$$\therefore\; \frac{\lambda - 30}{4} = 0.67 \Rightarrow \lambda - 30 = 2.68 \Rightarrow \lambda = 32.68$$
ii) For $X = \lambda$, $Z = \dfrac{\lambda - 30}{4} = z_2$ (say) … (2)
Now $P[X < \lambda] = 0.0496 \Rightarrow z_2 = -1.65$
$$\therefore\; \frac{\lambda - 30}{4} = -1.65 \Rightarrow \lambda - 30 = -1.65 \times 4 = -6.60 \Rightarrow \lambda = 30 - 6.6 = 23.4$$
E4) We are given X ~ N(85, 36), N = 2000
i.e. $\mu = 85$ cm, $\sigma^2 = 36 \Rightarrow \sigma = 6$ cm, N = 2000
If X ~ N($\mu$, $\sigma^2$) and $Z = \dfrac{X - \mu}{\sigma}$, then we know that Z ~ N(0, 1)
a) i) For X = 87, $Z = \dfrac{87 - 85}{6} = \dfrac{2}{6} \approx 0.33$
Now P[X ≤ 87] = P[Z ≤ 0.33] [See Fig. 14.27]
= 0.5 + P[0 ≤ Z ≤ 0.33]
b) Let x1 be the lowest chest measurement amongst 100 boys having the
largest chest measurements.
Now, for $X = x_1$, $Z = \dfrac{x_1 - 85}{6} = z_1$ (say).
$$P[X \ge x_1] = \frac{100}{2000} = 0.05$$
$$\Rightarrow P[Z \ge z_1] = 0.05 \quad \text{[See Fig. 14.30]}$$
Fig. 14.30: Area Representing the 100 Boys having Largest Chest Measurements
Therefore, the lowest value of the chest measurement among the 100
boys having the largest chest measurement is 94.84 cm.
E5) We are given
$\mu$ = 5.5 minutes, $\sigma$ = 0.6 minutes
If X ~ N($\mu$, $\sigma^2$) and $Z = \dfrac{X - \mu}{\sigma}$, then we know that Z ~ N(0, 1)
a) For X = 4.2, $Z = \dfrac{4.2 - 5.5}{0.6} = \dfrac{-1.3}{0.6} \approx -2.17$
For X = 4.5, $Z = \dfrac{4.5 - 5.5}{0.6} = \dfrac{-1.0}{0.6} \approx -1.67$
Fig. 14.31: Area Representing Probability of Waiting Time between 4.2 and 4.5 Minutes
Fig. 14.32: Area Representing Probability of Waiting Time Less than 5.2 Minutes
Fig. 14.33: Area Representing Probability of Waiting Time Greater than 6.8 Minutes
E6) We know that if X ~ N($\mu$, $\sigma^2$) and $Z = \dfrac{X - \mu}{\sigma}$, then Z ~ N(0, 1)
a) For X = 20, $Z = \dfrac{20 - 24}{6} = \dfrac{-4}{6} \approx -0.67$
Fig. 14.34: Area Representing Probability of Temperature Less than 20°C
Fig. 14.35: Area Representing Probability of Temperature Greater than 26 C
b) For X = 26, $Z = \dfrac{26 - 24}{6} = \dfrac{2}{6} \approx 0.33$
P[X > 26] = P[Z > 0.33] [See Fig. 14.35]
= 0.5 – P[0 < Z < 0.33]
= 0.5 – 0.1293 [From the table of areas under the normal curve]
= 0.3707
Therefore, probability that temperature of the city is more than
26 C is 0.3707
c) For X = 23, $Z = \dfrac{23 - 24}{6} = \dfrac{-1}{6} \approx -0.17$
For X = 27, $Z = \dfrac{27 - 24}{6} = \dfrac{3}{6} = 0.5$
P[23 < X < 27] = P[–0.17 < Z < 0.5] [See Fig. 14.36]
= P[–0.17 < Z < 0] + P[0 < Z < 0.5]
= P[0 < Z < 0.17] + P[0 < Z < 0.5]
= 0.0675 + 0.1915 [From the table of areas under the normal curve]
= 0.2590
Therefore, probability that temperature of the city is between
23 C and 27 C is 0.2590
E7) Mean ($\mu$) = 73, variance ($\sigma^2$) = 39.75,
and hence S.D. ($\sigma$) = 6.3
The equation of the normal curve fitted to the given data is
$$f(x) = \frac{1}{6.3\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-73}{6.3}\right)^2}, \quad -\infty < x < \infty$$
Using the area method, the expected frequencies are obtained as follows:
Class interval | Lower limit X | Z = (X − 73)/6.3 | Area under the normal curve to the left of z | Difference between successive areas | Expected frequency (40 × col. V)
Below 60 | −∞ | −∞ | 0 | 0.0197 − 0 = 0.0197 | 0.8 ≈ 1
E8) P[X 40] 0.3,
100
33
P[40 X 50] 0.33, and
100
52
37 Area Property of
P[X 50] 0.37, Normal Distribution
100
Now, let X ~ N($\mu$, $\sigma^2$).
The standard normal variate is
$$Z = \frac{X - \mu}{\sigma}$$
When X = 40, $Z = \dfrac{40 - \mu}{\sigma} = -z_1$ (say) [It is taken as $-$ve as the area to the left of this value is 30%, the probability being 0.3]
When X = 50, $Z = \dfrac{50 - \mu}{\sigma} = z_2$ (say) [It is taken as $+$ve as the area to the right of this value is given as 37%]
Now,
$P[X < 40] = P[Z < -z_1] = 0.3$
$\Rightarrow P[0 < Z < z_1] = 0.2$
From the table at the end of this unit, $z_1 = 0.525$.
Similarly, $P[0 < Z < z_2] = 0.13 \Rightarrow z_2 = 0.33$
$$\therefore\; \frac{40 - \mu}{\sigma} = -0.525 \quad \text{and} \quad \frac{50 - \mu}{\sigma} = 0.33$$
$$\Rightarrow 40 - \mu = -0.525\,\sigma \quad \text{and} \quad 50 - \mu = 0.33\,\sigma$$
Solving these equations for $\mu$ and $\sigma$, we have
$\sigma$ = 11.7 and $\mu$ = 46.14
APPENDIX
AREAS UNDER NORMAL CURVE
The standard normal probability curve is given by
$$\varphi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}z^2\right), \quad -\infty < z < \infty$$
The following table gives probability corresponding to the shaded area as shown
in the following figure i.e. P[0 Z z] for different values of z
TABLE OF AREAS
z 0 1 2 3 4 5 6 7 8 9
0.0 .0000 .0040 .0080 .0120 .0160 .0199 .0239 .0279 .0319 .0359
0.1 .0398 .0438 .0478 .0517 .0557 .0596 .0636 .0675 .0714 .0753
0.2 .0793 .0832 .0871 .0910 .0948 .0987 .1026 .1064 .1103 .1141
0.3 .1179 .1217 .1255 .1293 .1331 .1368 .1406 .1443 .1480 .1517
0.4 .1554 .1591 .1628 .1664 .1700 .1736 .1772 .1808 .1844 .1879
0.5 .1915 .1950 .1985 .2019 .2054 .2088 .2123 .2157 .2190 .2224
0.6 .2257 .2291 .2324 .2357 .2389 .2422 .2454 .2486 .2517 .2549
0.7 .2580 .2611 .2642 .2673 .2703 .2734 .2764 .2794 .2823 .2852
0.8 .2881 .2910 .2939 .2967 .2995 .3023 .3051 .3078 .3106 .3133
0.9 .3159 .3186 .3212 .3238 .3264 .3289 .3315 .3340 .3365 .3389
1.0 .3413 .3438 .3461 .3485 .3508 .3531 .3554 .3577 .3599 .3621
1.1 .3643 .3665 .3686 .3708 .3729 .3749 .3770 .3790 .3810 .3830
1.2 .3849 .3869 .3888 .3907 .3925 .3944 .3962 .3980 .3997 .4015
1.3 .4032 .4049 .4066 .4082 .4099 .4115 .4131 .4147 .4162 .4177
1.4 .4192 .4207 .4222 .4236 .4251 .4265 .4279 .4292 .4306 .4319
1.5 .4332 .4345 .4357 .4370 .4382 .4394 .4406 .4418 .4429 .4441
1.6 .4452 .4463 .4474 .4484 .4495 .4505 .4515 .4525 .4535 .4545
1.7 .4554 .4564 .4573 .4582 .4591 .4599 .4608 .4616 .4625 .4633
1.8 .4641 .4649 .4656 .4664 .4671 .4678 .4686 .4693 .4699 .4706
1.9 .4713 .4719 .4726 .4732 .4738 .4744 .4750 .4756 .4761 .4767
2.0 .4772 .4778 .4783 .4788 .4793 .4798 .4803 .4808 .4812 .4817
2.1 .4821 .4826 .4830 .4834 .4838 .4842 .4846 .4850 .4854 .4857
2.2 .4861 .4864 .4868 .4871 .4875 .4878 .4881 .4884 .4887 .4890
2.3 .4893 .4896 .4898 .4901 .4904 .4906 .4909 .4911 .4913 .4916
2.4 .4918 .4920 .4922 .4925 .4927 .4929 .4931 .4932 .4934 .4936
2.5 .4938 .4940 .4941 .4943 .4945 .4946 .4948 .4949 .4951 .4952
2.6 .4953 .4955 .4956 .4957 .4959 .4960 .4961 .4962 .4963 .4964
2.7 .4965 .4966 .4967 .4968 .4969 .4970 .4971 .4972 .4973 .4974
2.8 .4974 .4975 .4976 .4977 .4977 .4978 .4979 .4979 .4980 .4981
2.9 .4981 .4982 .4982 .4983 .4984 .4984 .4985 .4985 .4986 .4986
3.0 .4987 .4987 .4987 .4988 .4988 .4989 .4989 .4989 .4990 .4990
3.1 .4990 .4991 .4991 .4991 .4992 .4992 .4992 .4992 .4993 .4993
3.2 .4993 .4993 .4994 .4994 .4994 .4994 .4994 .4995 .4995 .4995
3.3 .4995 .4995 .4995 .4996 .4996 .4996 .4996 .4996 .4996 .4997
3.4 .4997 .4997 .4997 .4997 .4997 .4997 .4997 .4997 .4997 .4998
3.5 .4998 .4998 .4998 .4998 .4998 .4998 .4998 .4998 .4998 .4998
3.6 .4998 .4998 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999
3.7 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999 .4999
3.9 .5000 .5000 .5000 .5000 .5000 .5000 .5000 .5000 .5000 .5000
UNIT 15 CONTINUOUS UNIFORM AND EXPONENTIAL DISTRIBUTIONS
Structure
15.1 Introduction
Objectives
15.1 INTRODUCTION
In Units 13 and 14, you have studied normal distribution with its various
properties and applications. Continuing our study on continuous distributions,
we, in this unit, discuss continuous uniform and exponential distributions. It
may be seen that discrete uniform and geometric distributions studied in Unit
11 and Unit 12 are the discrete analogs of continuous uniform and exponential
distributions. Like geometric distribution, exponential distribution also has the
memoryless property. You have also studied that geometric distribution is the
only discrete distribution which has the memoryless property. This feature is
also there in exponential distribution and it is the only continuous distribution
having the memoryless property.
The present unit discusses continuous uniform distribution in Sec. 15.2 and
exponential distribution in Sec. 15.3.
Objectives
After studying the unit, you would be able to:
define continuous uniform and exponential distributions;
state the properties of these distributions;
explain the memoryless property of exponential distribution; and
solve various problems on the situations related to these distributions.
f(x) = 1/(b − a) for a ≤ x ≤ b
     = 0, otherwise
The distribution is called uniform distribution since it assumes a constant (uniform) value for all x in (a, b). If we draw the graph of y = f(x) over the x-axis and between the ordinates x = a and x = b (say), it describes a rectangle, as shown in Fig. 15.1.
Fig. 15.1: Graph of uniform function (a rectangle of height 1/(b − a) over the interval from a to b)
For x ≥ b,
F(x) = P[X ≤ x] = ∫_{−∞}^{x} f(x) dx
= ∫_{−∞}^{a} f(x) dx + ∫_{a}^{b} f(x) dx + ∫_{b}^{x} f(x) dx
= ∫_{−∞}^{a} 0 dx + ∫_{a}^{b} (1/(b − a)) dx + ∫_{b}^{x} 0 dx
= 0 + (1/(b − a)) [x]_{a}^{b} + 0 = (b − a)/(b − a) = 1.
So,
F(x) = 0 for x < a
     = (x − a)/(b − a) for a ≤ x < b
     = 1 for x ≥ b
On plotting its graph, we have
Fig. 15.2: Graph of distribution function (F(x) rises linearly from 0 at x = a to 1 at x = b)
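The density and the distribution function just derived can be written as a short Python sketch; the interval (0, 5) used in the checks is only illustrative:

```python
def uniform_pdf(x, a, b):
    """Density of the continuous uniform distribution on (a, b)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def uniform_cdf(x, a, b):
    """Distribution function F(x) derived above: 0 below a,
    (x - a)/(b - a) on [a, b), and 1 from b onwards."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Illustrative checks on the interval (0, 5)
assert uniform_pdf(2.5, 0, 5) == 0.2
assert uniform_cdf(-1, 0, 5) == 0.0
assert uniform_cdf(2.5, 0, 5) == 0.5
assert uniform_cdf(7, 0, 5) == 1.0
```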
Mean = E(X) = ∫_a^b x · (1/(b − a)) dx = (1/(b − a)) [x²/2]_a^b = (b² − a²)/(2(b − a))
= (b + a)(b − a)/(2(b − a)) = (a + b)/2
E(X²) = ∫_a^b x² · (1/(b − a)) dx = (1/(b − a)) [x³/3]_a^b = (b³ − a³)/(3(b − a))
= (b − a)(b² + ab + a²)/(3(b − a))   [x³ − y³ = (x − y)(x² + xy + y²)]
= (a² + ab + b²)/3
Variance of X = E(X²) − [E(X)]² = (a² + ab + b²)/3 − ((a + b)/2)²
= [4(a² + ab + b²) − 3(a + b)²]/12
= (4a² + 4ab + 4b² − 3a² − 3b² − 6ab)/12
= (a² − 2ab + b²)/12 = (b − a)²/12
So, Mean = (a + b)/2 and Variance = (b − a)²/12.
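These two formulas can be verified numerically with a simple midpoint Riemann sum; the interval (a, b) = (2, 10) below is an arbitrary illustration:

```python
# Midpoint Riemann-sum check of Mean = (a + b)/2 and
# Variance = (b - a)**2 / 12 for a uniform density on (a, b).
a, b = 2.0, 10.0
n = 100_000
dx = (b - a) / n
xs = [a + (i + 0.5) * dx for i in range(n)]
pdf = 1.0 / (b - a)
mean = sum(x * pdf * dx for x in xs)
var = sum((x - mean) ** 2 * pdf * dx for x in xs)
assert abs(mean - (a + b) / 2) < 1e-6   # (2 + 10)/2 = 6
assert abs(var - (b - a) ** 2 / 12) < 1e-6
```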
Let us now take up some examples on continuous uniform distribution.
Example 1: If X is uniformly distributed with mean 2 and variance 12, find
P[X < 3].
Solution: Let X ~ U[a, b].
∴ The probability density function of X is
f(x) = 1/(b − a), a ≤ x ≤ b.
Now, as Mean = 2,
(a + b)/2 = 2
⇒ a + b = 4 … (1)
Also, Variance = 12
⇒ (b − a)²/12 = 12
⇒ (b − a)² = 144
⇒ b − a = 12
Variance = (b − a)²/12 = (12 − 0)²/12 = 144/12 = 12
⇒ S.D. = √12
Thus, the coefficient of variation
= (S.D./Mean) × 100   [Also see Unit 2 of MST-002]
= (√12/6) × 100 = 57.74%
Example 3: Metro trains are scheduled every 5 minutes at a certain station. A
person comes to the station at a random time. Let the random variable X count
the number of minutes he/she has to wait for the next train. Assume X has a
uniform distribution over the interval (0, 5). Find the probability that he/she
has to wait at least 3 minutes for the train.
Solution: As X follows uniform distribution over the interval (0, 5), the probability density function of X is
f(x) = 1/(b − a) = 1/(5 − 0) = 1/5, 0 < x < 5.
Thus, the desired probability is
P[X ≥ 3] = ∫_3^5 f(x) dx = ∫_3^5 (1/5) dx = (1/5)[x]_3^5 = (1/5)(5 − 3) = 2/5 = 0.4
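The value 0.4 can be cross-checked by simulating the person's arrival time; the seed and trial count below are arbitrary choices:

```python
import random

# X ~ U(0, 5): exact P[X >= 3] = (5 - 3)/5 = 0.4; Monte Carlo cross-check.
exact = (5 - 3) / 5

random.seed(42)
trials = 100_000
hits = sum(1 for _ in range(trials) if random.uniform(0, 5) >= 3)
estimate = hits / trials
assert abs(estimate - exact) < 0.01
```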
E2) A random variable X has a uniform distribution over (−2, 2). Find k for which P[X > k] = 1/2.
Now, let us discuss exponential distribution in the next section.
F(x) = P[X ≤ x] = ∫_0^x λe^{−λx} dx = λ[e^{−λx}/(−λ)]_0^x = −[e^{−λx} − e⁰] = 1 − e^{−λx}.
So, F(x) = 1 − e^{−λx} for x ≥ 0
        = 0, elsewhere.
Mean = E(X) = ∫_0^∞ x f(x) dx = ∫_0^∞ x λe^{−λx} dx
= λ{[x e^{−λx}/(−λ)]_0^∞ − ∫_0^∞ 1 · e^{−λx}/(−λ) dx}   [integrating by parts]
= λ{(0 − 0) + (1/λ) ∫_0^∞ e^{−λx} dx}
= ∫_0^∞ e^{−λx} dx = [e^{−λx}/(−λ)]_0^∞ = −(1/λ)(0 − 1) = 1/λ.
Now, E(X²) = ∫_0^∞ x² f(x) dx = ∫_0^∞ x² λe^{−λx} dx = λ ∫_0^∞ x² e^{−λx} dx
= λ{[x² e^{−λx}/(−λ)]_0^∞ − ∫_0^∞ 2x e^{−λx}/(−λ) dx}   [integrating by parts]
= λ{(0 − 0) + (2/λ) ∫_0^∞ x e^{−λx} dx}
= (2/λ) ∫_0^∞ x λe^{−λx} dx
= (2/λ) E(X)   [E(X) is the mean and has already been obtained]
= (2/λ)(1/λ) = 2/λ²
Thus, Variance = E(X²) − [E(X)]² = 2/λ² − (1/λ)² = 2/λ² − 1/λ² = 1/λ².
So, Mean = 1/λ and Variance = 1/λ².
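Both results can be checked by simulation with the standard library's random.expovariate, which draws from an exponential distribution with the given rate; λ = 2 below is an arbitrary illustration:

```python
import random

# For an exponential density f(x) = lam * e**(-lam * x),
# Mean = 1/lam and Variance = 1/lam**2; checked here by simulation.
lam = 2.0
random.seed(0)
n = 200_000
sample = [random.expovariate(lam) for _ in range(n)]
mean = sum(sample) / n
var = sum((x - mean) ** 2 for x in sample) / n
assert abs(mean - 1 / lam) < 0.01      # 1/lam = 0.5
assert abs(var - 1 / lam ** 2) < 0.01  # 1/lam**2 = 0.25
```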
Remark 1: Variance = 1/λ² = (1/λ)(1/λ) = Mean/λ, i.e. Mean/Variance = λ.
So,
Value of λ      Implies
λ < 1           Mean < Variance
λ = 1           Mean = Variance
λ > 1           Mean > Variance
Memoryless Property of Exponential Distribution
Now, let us discuss a very important property of exponential distribution, namely the memoryless (or forgetfulness) property. Like geometric distribution in the family of discrete distributions, exponential distribution is the only distribution in the family of continuous distributions which has the memoryless property. The memoryless property of exponential distribution is stated as:
If X has an exponential distribution, then for every constant a ≥ 0,
P[X ≤ x + a | X ≥ a] = P[X ≤ x] for all x,
i.e. the conditional probability of waiting up to the time 'x + a', given that the waiting time exceeds 'a', is the same as the probability of waiting up to the time 'x'. To make you understand the above
concept clearly let us take the following example: Suppose you purchase a TV
set, assuming that its life time follows exponential distribution, for which the
expected life time has been told to you 10 years (say). Now, if you use this TV
set for say 4 years and then you ask a TV mechanic, without informing him/her
that you had purchased it 4 years ago, regarding its expected life time. He/she,
if finds the TV set as good as new, will say that its expected life time is 10
years.
So, here, in the above example, 4 years period has been forgotten, in a way,
and for this example:
P[life time up to 10 years]
= P[life time up to 14 years | life time exceeds 4 years]
i.e. P[X ≤ 10] = P[X ≤ 14 | X ≥ 4]
or P[X ≤ 10] = P[X ≤ 10 + 4 | X ≥ 4]
Here a = 4 and x = 10.
Let us now prove the memoryless property of exponential distribution.
Proof: P[X ≤ x + a | X ≥ a] = P[(X ≤ x + a) ∩ (X ≥ a)]/P[X ≥ a]   [by conditional probability]
where
P[(X ≤ x + a) ∩ (X ≥ a)] = P[a ≤ X ≤ x + a] = ∫_a^{x+a} f(x) dx = ∫_a^{x+a} λe^{−λx} dx
= λ[e^{−λx}/(−λ)]_a^{x+a} = −[e^{−λ(x+a)} − e^{−λa}] = e^{−λa} − e^{−λx}·e^{−λa} = e^{−λa}(1 − e^{−λx}),
and P[X ≥ a] = ∫_a^∞ f(x) dx = ∫_a^∞ λe^{−λx} dx = λ[e^{−λx}/(−λ)]_a^∞ = −[0 − e^{−λa}] = e^{−λa}.
∴ P[X ≤ x + a | X ≥ a] = e^{−λa}(1 − e^{−λx})/e^{−λa} = 1 − e^{−λx}.
Also, P[X ≤ x] = ∫_0^x λe^{−λx} dx = 1 − e^{−λx}.
Hence P[X ≤ x + a | X ≥ a] = P[X ≤ x], which proves the memoryless property.
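The memoryless property can also be checked numerically from the distribution function F(t) = 1 − e^{−λt}; the values λ = 0.5, x = 3 and a = 4 below are arbitrary:

```python
import math

# Numerical check of P[X <= x + a | X >= a] = P[X <= x]
# using the distribution function F(t) = 1 - e**(-lam * t).
lam = 0.5

def F(t):
    return 1 - math.exp(-lam * t)

x, a = 3.0, 4.0
conditional = (F(x + a) - F(a)) / (1 - F(a))  # P[X <= x + a | X >= a]
assert abs(conditional - F(x)) < 1e-12
```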
∫_0^∞ A e^{−x} dx = 1
⇒ A[e^{−x}/(−1)]_0^∞ = 1
⇒ −A[0 − 1] = 1 ⇒ A = 1
∴ f(x) = e^{−x}, i.e. λ = 1
Hence, mean = 1/λ = 1/1 = 1,
and variance = 1/λ² = 1/1² = 1.
So, the mean and variance are equal for the given exponential distribution.
Example 5: Telephone calls arrive at a switchboard following an exponential distribution with parameter λ = 12 per hour. If we are at the switchboard, what is the probability that the waiting time for a call is
i) at least 15 minutes,
ii) not more than 10 minutes?
Solution: Let X be the waiting time (in hours) for a call. Then X follows exponential distribution with λ = 12, i.e.
f(x) = λe^{−λx} = 12e^{−12x}, x ≥ 0
ii) P[X ≤ 10 minutes] = P[X ≤ 1/6] = F(1/6) = 1 − e^{−12(1/6)}
= 1 − e^{−2} = 1 − (0.1353) = 0.8647
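Both parts of Example 5 can be computed directly in Python; note that part (i), whose worked steps are not shown above, is evaluated here simply as P[X ≥ 1/4] = e^{−3} from the same density:

```python
import math

# X = waiting time in hours, exponential with lam = 12 per hour.
lam = 12

# (i) P[X >= 15 minutes] = P[X >= 1/4] = e**(-12/4) = e**(-3)
p_i = math.exp(-lam * (15 / 60))

# (ii) P[X <= 10 minutes] = P[X <= 1/6] = 1 - e**(-12/6) = 1 - e**(-2)
p_ii = 1 - math.exp(-lam * (10 / 60))

assert abs(p_ii - 0.8647) < 5e-4   # matches the value in the text
```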
Now, we are sure that you can try the following exercises.
E3) What are the mean and variance of the exponential distribution given by
f(x) = 3e^{−3x}, x ≥ 0?
E4) Obtain the value of k > 0 for which the function given by
f(x) = 2e^{−kx}, x ≥ 0
is a probability density function.
We now conclude this unit by giving a summary of what we have covered in it.
15.4 SUMMARY
Following main points have been covered in this unit.
1) A random variable X is said to follow a continuous uniform (rectangular) distribution over an interval (a, b) if its probability density function is given by
f(x) = 1/(b − a) for a ≤ x ≤ b
     = 0, otherwise
2) For continuous uniform distribution, Mean = (a + b)/2 and Variance = (b − a)²/12.
3) A random variable X is said to follow exponential distribution with parameter λ > 0, if it takes any non-negative real value and its probability density function is given by
f(x) = λe^{−λx} for x ≥ 0
     = 0, elsewhere
4) For exponential distribution, Mean = 1/λ and Variance = 1/λ².
5) Mean > or = or < Variance according as λ > or = or < 1.
6) Exponential distribution is the only continuous distribution which has the memoryless property given by:
P[X ≤ x + a | X ≥ a] = P[X ≤ x].
15.5 SOLUTIONS/ANSWERS
E1) As X ~ U[−a, a], the probability density function of X is
f(x) = 1/(a − (−a)) = 1/2a, −a ≤ x ≤ a.
i) Given that P[X > 4] = 1/3
⇒ ∫_4^a (1/2a) dx = 1/3
⇒ (1/2a)[x]_4^a = 1/3
⇒ (a − 4)/2a = 1/3
⇒ 3a − 12 = 2a
⇒ a = 12.
ii) P[X < 1] = 3/4
⇒ ∫_{−a}^1 (1/2a) dx = 3/4
⇒ (1/2a)[x]_{−a}^1 = 3/4
⇒ (1/2a)(1 + a) = 3/4
⇒ 1 + a = (3/2)a
⇒ 2 + 2a = 3a
⇒ a = 2.
iii) P[|X| < 2] = P[|X| > 2]
⇒ P[−2 < X < 2] = P[X < −2 or X > 2]   [|X| < 2 means −2 < X < 2, and |X| > 2 means X < −2 or X > 2]
⇒ P[−2 < X < 2] = P[X < −2] + P[X > 2]   [by addition law of probability for mutually exclusive events]
⇒ ∫_{−2}^2 (1/2a) dx = ∫_{−a}^{−2} (1/2a) dx + ∫_2^a (1/2a) dx
⇒ 4/2a = (−2 + a)/2a + (a − 2)/2a
⇒ 4 = (a − 2) + (a − 2) = 2a − 4
⇒ 2a = 8
⇒ a = 4.
E2) As X ~ U[−2, 2],
f(x) = 1/4, −2 ≤ x ≤ 2.
Now P[X > k] = 1/2
⇒ ∫_k^2 (1/4) dx = 1/2
⇒ (1/4)[x]_k^2 = 1/2
⇒ (2 − k)/4 = 1/2
⇒ 2 − k = 2
⇒ k = 0.
E3) Comparing it with the exponential distribution given by
f(x) = λe^{−λx}, x ≥ 0,
we have λ = 3.
∴ Mean = 1/λ = 1/3 and Variance = 1/λ² = 1/9.
E4) As the given function is an exponential distribution, i.e. a p.d.f.,
∫_0^∞ f(x) dx = 1,
and comparing it with f(x) = λe^{−λx}, we have λ = 2 and λ = k
⇒ k = 2.
E5) Here λ = 1/20, so P[X ≤ x] = F(x) = 1 − e^{−λx} = 1 − e^{−x/20}
ii) P[First accident occurs on second week from starting of working day
on Tuesday till end of working day on Wednesday]
=P[First accident occurs after 7 working days
and before the end of 9 working days]
= P[7 < X 9]
= P[X 9] – P[X 7]
= (1 − e^{−9/20}) − (1 − e^{−7/20})
= e^{−7/20} − e^{−9/20}
= e^{−0.35} − e^{−0.45}
= 0.7047 − 0.6376   [See the table given at the end of Unit 10]
= 0.0671.
UNIT 16 GAMMA AND BETA DISTRIBUTIONS
Structure
16.1 Introduction
Objectives
16.1 INTRODUCTION
In Unit 15, you have studied continuous uniform and exponential
distributions. Here, we will discuss gamma and beta distributions. Gamma
distribution reduces to exponential distribution and beta distribution reduces
to uniform distribution for special cases. Gamma distribution is a
generalization of exponential distribution in the same sense as the negative
binomial distribution is a generalization of geometric distribution. In a sense,
the geometric distribution and negative binomial distribution are the discrete
analogs of the exponential and gamma distributions, respectively. The present
unit discusses the gamma and beta distributions which are defined with the
help of special functions known as gamma and beta functions, respectively.
So, before defining these distributions, we first define gamma and beta
functions in Sec. 16.2 of this unit. Then gamma distribution and beta
distribution of first kind followed by beta distribution of second kind are
discussed in Secs. 16.3 to 16.5.
Objectives
After studying this unit, you would be able to:
define beta and gamma functions;
define gamma and beta distributions;
discuss various properties of these distributions;
identify the situations where these distributions can be employed; and
solve various practical problems related to these distributions.
Beta Function
Definition: If m > 0, n > 0, the integral ∫_0^1 x^{m−1} (1 − x)^{n−1} dx is called a beta function and is denoted by β(m, n).
On the basis of the above discussion, you can try the following exercise.
E1) Express the following as a beta function:
i) ∫_0^1 x^{1/3} (1 − x)^{1/2} dx
ii) ∫_0^1 x² (1 − x)⁵ dx
iii) ∫_0^∞ x²/(1 + x)⁵ dx
iv) ∫_0^∞ x^{1/2}/(1 + x)² dx
Gamma Function
Though we have defined Gamma function in Unit 13, yet we are again
defining it with more properties, examples and exercises to make you clearly
understand this special function.
Definition: If n > 0, the integral ∫_0^∞ x^{n−1} e^{−x} dx is called a gamma function and is denoted by Γn,
e.g.
(i) ∫_0^∞ x² e^{−x} dx = Γ(2 + 1) = Γ3
(ii) ∫_0^∞ x^{1/2} e^{−x} dx = Γ(1/2 + 1) = Γ(3/2)
Some Important Results on Gamma Function
1. If n > 1, Γn = (n − 1) Γ(n − 1)
2. If n is a positive integer, Γn = (n − 1)!
3. Γ(1/2) = √π
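The three results can be spot-checked with math.gamma from Python's standard library:

```python
import math

# 1. Gamma(n) = (n - 1) * Gamma(n - 1) for n > 1
assert abs(math.gamma(4.5) - 3.5 * math.gamma(3.5)) < 1e-9
# 2. Gamma(n) = (n - 1)! for a positive integer n
assert math.gamma(5) == math.factorial(4)   # both equal 24
# 3. Gamma(1/2) = sqrt(pi)
assert abs(math.gamma(0.5) - math.sqrt(math.pi)) < 1e-12
```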
Relationship between Beta and Gamma Functions
β(m, n) = Γm Γn/Γ(m + n)
(ii) ∫_0^1 x (1 − x)^{10} dx
(iii) ∫_0^∞ (x/2) e^{−x} dx
Remark 1:
(i) It can be verified that ∫_0^∞ f(x) dx = 1.
Verification:
∫_0^∞ f(x) dx = ∫_0^∞ (λ^r e^{−λx} x^{r−1}/Γr) dx = (λ^r/Γr) ∫_0^∞ e^{−λx} x^{r−1} dx
Putting λx = y ⇒ λ dx = dy. Also, when x → 0, y → 0 and when x → ∞, y → ∞.
= (λ^r/Γr) ∫_0^∞ e^{−y} (y/λ)^{r−1} (dy/λ) = (1/Γr) ∫_0^∞ e^{−y} y^{r−1} dy
= (1/Γr) Γr   [using the gamma function defined in Sec. 16.2]
= 1
(ii) If X is a gamma variate with two parameters r > 0 and λ > 0, it is expressed as X ~ γ(λ, r).
(iii) If we put r = 1, we have
f(x) = λ e^{−λx} x⁰/Γ1 = λe^{−λx}, x ≥ 0,
which is probability density function of exponential distribution.
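The verification in (i) can also be carried out numerically: a midpoint Riemann sum of the gamma density over a long truncated range comes out to 1 (the parameters λ = 2, r = 3 are an arbitrary illustration):

```python
import math

# Midpoint Riemann-sum check that the gamma density
# f(x) = lam**r * e**(-lam*x) * x**(r-1) / Gamma(r) integrates to 1.
lam, r = 2.0, 3.0

def f(x):
    return lam ** r * math.exp(-lam * x) * x ** (r - 1) / math.gamma(r)

n, upper = 200_000, 40.0   # tail beyond x = 40 is negligible here
dx = upper / n
total = sum(f((i + 0.5) * dx) * dx for i in range(n))
assert abs(total - 1.0) < 1e-6
```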
f(x) = e^{−x} x^{r−1}/Γr, x ≥ 0, r > 0
It is known as gamma distribution with single parameter r. This form of the gamma distribution is also widely used. If X follows gamma distribution with r = 2 and λ = 1, then
P[X ≥ 5] = ∫_5^∞ (e^{−x} x^{2−1}/Γ2) dx = ∫_5^∞ x e^{−x} dx
= [x e^{−x}/(−1)]_5^∞ + ∫_5^∞ e^{−x} dx   [integrating by parts]
= (0 + 5e^{−5}) + [e^{−x}/(−1)]_5^∞
= 5e^{−5} + (0 + e^{−5})
= 6e^{−5}
= 6 × 0.0070   [See the table given at the end of Unit 10]
= 0.042
ii) In this case r = 1, λ = 1 and hence
P[X ≥ 5] = ∫_5^∞ (λ^r e^{−λx} x^{r−1}/Γr) dx = ∫_5^∞ ((1)¹ e^{−x} x⁰/Γ1) dx = ∫_5^∞ e^{−x} dx
= [e^{−x}/(−1)]_5^∞ = 0 + e^{−5} = e^{−5} = 0.0070
Alternatively,
as r = 1, it is a case of exponential distribution for which
f(x) = λe^{−λx} = e^{−x}, x ≥ 0
P[X ≥ 5] = ∫_5^∞ e^{−x} dx = [e^{−x}/(−1)]_5^∞ = 0 + e^{−5} = 0.0070
Here is an exercise for you.
E3) Telephone calls arrive at a switchboard at an average rate of 2 per minute. Let X denote the waiting time in minutes until the 4th call arrives; X follows gamma distribution. Write the probability density function of X. Also find its mean and variance.
Let us now discuss the beta distributions in the next two sections.
Now, we are in a position to define beta distribution, which is defined with the help of the beta function
β(m, n) = Γm Γn/Γ(m + n).
There are two kinds of beta distribution: beta distribution of first kind and beta distribution of second kind. Beta distribution of second kind is defined in the next section of the unit, whereas beta distribution of first kind is defined as follows:
Definition: A random variable X is said to follow beta distribution of first kind with parameters m > 0 and n > 0, if its probability density function is given by
f(x) = (1/β(m, n)) x^{m−1} (1 − x)^{n−1}, 0 < x < 1
     = 0, otherwise
The random variable X is known as beta variate of first kind and can be expressed as X ~ β₁(m, n).
Remark 5: If m = 1 and n = 1, then the beta distribution reduces to
f(x) = (1/β(1, 1)) x^{1−1} (1 − x)^{1−1}, 0 < x < 1
= (1/β(1, 1)) x⁰ (1 − x)⁰, 0 < x < 1
= 1/β(1, 1), 0 < x < 1
But β(1, 1) = Γ1 Γ1/Γ(1 + 1) = (0!)(0!)/1! = 1
Therefore, f(x) = 1/1 = 1, 0 < x < 1
i.e. f(x) = 1/(1 − 0), 0 < x < 1,
which is uniform distribution on (0, 1).
[p.d.f. of uniform distribution on (a, b) is f(x) = 1/(b − a), a < x < b]
So, continuous uniform distribution is a particular case of beta distribution.
Mean and Variance of Beta Distribution of First Kind
Mean and variance of this distribution are given as
Mean = m/(m + n)
Variance = mn/[(m + n)²(m + n + 1)]
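A numerical check of these two formulas by direct integration of the density, for the illustrative choice m = 4, n = 7:

```python
import math

# Direct numerical check of Mean = m/(m+n) and
# Variance = m*n / ((m+n)**2 * (m+n+1)) for an illustrative m, n.
m, n = 4.0, 7.0
beta_mn = math.gamma(m) * math.gamma(n) / math.gamma(m + n)

def f(x):
    return x ** (m - 1) * (1 - x) ** (n - 1) / beta_mn

steps = 200_000
dx = 1.0 / steps
xs = [(i + 0.5) * dx for i in range(steps)]
mean = sum(x * f(x) * dx for x in xs)
ex2 = sum(x * x * f(x) * dx for x in xs)
var = ex2 - mean ** 2
assert abs(mean - m / (m + n)) < 1e-6                          # 4/11
assert abs(var - m * n / ((m + n) ** 2 * (m + n + 1))) < 1e-6  # 7/363
```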
Example 4: Determine the constant C such that the function
f(x) = C x³ (1 − x)⁶, 0 < x < 1,
is a beta distribution of first kind. Also, find its mean and variance.
Solution: As f(x) is a beta distribution of first kind,
∫_0^1 f(x) dx = 1
⇒ ∫_0^1 C x³ (1 − x)⁶ dx = 1
⇒ C ∫_0^1 x^{4−1} (1 − x)^{7−1} dx = 1
⇒ C β(3 + 1, 6 + 1) = 1   [by definition of beta distribution of first kind]
⇒ C = 1/β(4, 7)
Now β(4, 7) = Γ4 Γ7/Γ(4 + 7) = Γ4 Γ7/Γ11 = (3! × 6!)/10! = (3 × 2 × 6!)/(10 × 9 × 8 × 7 × 6!) = 6/5040 = 1/840
⇒ C = 840
Thus, f(x) = 840 x³ (1 − x)⁶
= (1/β(4, 7)) x^{4−1} (1 − x)^{7−1}   [840 = 1/β(4, 7), just obtained above]
⇒ m = 4, n = 7
∴ Mean = m/(m + n) = 4/(4 + 7) = 4/11,
and Variance = mn/[(m + n)²(m + n + 1)] = (4 × 7)/[(4 + 7)²(4 + 7 + 1)] = 28/(121 × 12) = 7/363.
Now, you can try the following exercises.
E4) Using beta function, prove that
∫_0^1 60 x² (1 − x)³ dx = 1
16.5 BETA DISTRIBUTION OF SECOND KIND
Let us now define beta distribution of second kind.
Definition: A random variable X is said to follow beta distribution of second kind with parameters m > 0, n > 0 if its probability density function is given by
f(x) = (1/β(m, n)) · x^{m−1}/(1 + x)^{m+n}, 0 < x < ∞
     = 0, elsewhere
Remark 6: It can be verified that ∫_0^∞ (1/β(m, n)) · x^{m−1}/(1 + x)^{m+n} dx = 1.
Verification:
∫_0^∞ (1/β(m, n)) · x^{m−1}/(1 + x)^{m+n} dx = (1/β(m, n)) ∫_0^∞ x^{m−1}/(1 + x)^{m+n} dx
= (1/β(m, n)) · β(m, n)   [∫_0^∞ x^{m−1}/(1 + x)^{m+n} dx is another form of the beta function; see Sec. 16.2 of this unit]
= 1
i.e. ∫_0^∞ f(x) dx = 1.
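The mean and variance of this distribution (Mean = m/(n − 1) for n > 1 and Variance = m(m + n − 1)/[(n − 1)²(n − 2)] for n > 2) can be checked through the identity E[X^k] = β(m + k, n − k)/β(m, n), which follows from the integral form of the beta function just used; m = 4, n = 3 below is an illustration:

```python
import math

def beta(m, n):
    """Beta function via the beta-gamma relationship."""
    return math.gamma(m) * math.gamma(n) / math.gamma(m + n)

# For the beta distribution of the second kind,
# E[X**k] = beta(m + k, n - k) / beta(m, n) whenever n > k,
# since x**k * x**(m-1) / (1+x)**(m+n) integrates to beta(m + k, n - k).
m, n = 4, 3
mean = beta(m + 1, n - 1) / beta(m, n)   # m/(n-1) = 2
ex2 = beta(m + 2, n - 2) / beta(m, n)    # m(m+1)/((n-1)(n-2)) = 10
var = ex2 - mean ** 2                    # 6
assert abs(mean - m / (n - 1)) < 1e-9
assert abs(var - m * (m + n - 1) / ((n - 1) ** 2 * (n - 2))) < 1e-9
```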
∫_0^∞ k x³/(1 + x)⁷ dx = 1
⇒ k ∫_0^∞ x^{4−1}/(1 + x)^{4+3} dx = 1
⇒ k β(4, 3) = 1
⇒ k = 1/β(4, 3) = Γ(4 + 3)/(Γ4 Γ3) = Γ7/(Γ4 Γ3) = 6!/(3! × 2!) = (6 × 5 × 4)/2 = 60
Here m = 4, n = 3
∴ Mean = m/(n − 1) = 4/(3 − 1) = 4/2 = 2,
and Variance = m(m + n − 1)/[(n − 1)²(n − 2)] = 4(4 + 3 − 1)/[(3 − 1)²(3 − 2)] = (4 × 6)/(4 × 1) = 6.
E7) Obtain mean and variance for the beta distribution whose density is given by
f(x) = 60x²/(1 + x)⁷, 0 < x < ∞.
16.6 SUMMARY
The following main points have been covered in this unit:
1) A random variable X is said to follow gamma distribution with parameters r > 0 and λ > 0 if its probability density function is given by
f(x) = λ^r e^{−λx} x^{r−1}/Γr, x ≥ 0
     = 0, elsewhere
f(x) = (1/β(m, n)) x^{m−1} (1 − x)^{n−1}, 0 < x < 1
     = 0, otherwise
Its mean and variance are m/(m + n) and mn/[(m + n)²(m + n + 1)], respectively.
5) A random variable X is said to follow beta distribution of second kind with parameters m > 0, n > 0 if its probability density function is given by:
f(x) = (1/β(m, n)) · x^{m−1}/(1 + x)^{m+n}, 0 < x < ∞
     = 0, elsewhere
Its mean and variance are m/(n − 1), n > 1; and m(m + n − 1)/[(n − 1)²(n − 2)], n > 2, respectively.
6) Exponential distribution is a particular case of gamma distribution and continuous uniform distribution is a particular case of beta distribution.
16.7 SOLUTIONS/ANSWERS
E1) (i) ∫_0^1 x^{1/3} (1 − x)^{1/2} dx = β(1/3 + 1, 1/2 + 1) = β(4/3, 3/2)
(ii) ∫_0^1 x² (1 − x)⁵ dx = ∫_0^1 x^{3−1} (1 − x)^{6−1} dx = β(3, 6)
E2) (i) ∫_0^∞ e^{−x} x^{5/2} dx = Γ(5/2 + 1) = Γ(7/2)
= (5/2)(3/2)(1/2) Γ(1/2)   [Result 1 on gamma function (see Sec. 16.2)]
= (5/2)(3/2)(1/2) √π   [Result 3 on gamma function]
= (15/8) √π
(ii) ∫_0^1 x (1 − x)^{10} dx = β(1 + 1, 10 + 1)
= β(2, 11)
= Γ2 Γ11/Γ13
= (1! × 10!)/12!   [Result 2 on gamma function]
= 10!/(12 × 11 × 10!)
= 1/(12 × 11) = 1/132
(iii) ∫_0^∞ (x/2) e^{−x} dx = (1/2) Γ(1 + 1) = (1/2) Γ2 = 1/2
E3) Here λ = 2, r = 4.
∴ f(x) = λ^r e^{−λx} x^{r−1}/Γr, x ≥ 0
= 2⁴ e^{−2x} x³/Γ4, x ≥ 0
= 16 e^{−2x} x³/3!, x ≥ 0
= (8/3) x³ e^{−2x}, x ≥ 0
Mean = r/λ = 4/2 = 2,
Variance = r/λ² = 4/2² = 4/4 = 1.
E4) ∫_0^1 60 x² (1 − x)³ dx = 60 ∫_0^1 x^{3−1} (1 − x)^{4−1} dx = 60 β(3, 4)
= 60 × Γ3 Γ4/Γ7 = 60 × (2! × 3!)/6! = 60 × (2 × 6)/720 = 60 × (1/60) = 1
E5) ∫_0^1 k x^{−1/2} (1 − x)^{1/2} dx = 1
⇒ k β(1/2, 1/2 + 1) = 1, i.e. k β(1/2, 3/2) = 1
⇒ k = 1/β(1/2, 3/2) = Γ(1/2 + 3/2)/(Γ(1/2) Γ(3/2)) = Γ2/(√π × (1/2)√π) = 1/(π/2) = 2/π
Now, as the given p.d.f. of beta distribution of first kind is
f(x) = (1/β(1/2, 3/2)) x^{1/2−1} (1 − x)^{3/2−1}, 0 < x < 1
⇒ m = 1/2, n = 3/2
and hence mean = m/(m + n) = (1/2)/(1/2 + 3/2) = (1/2)/2 = 1/4
Variance = mn/[(m + n)²(m + n + 1)]
= (1/2)(3/2)/[(1/2 + 3/2)²(1/2 + 3/2 + 1)]
= (3/4)/(4 × 3) = (3/4)/12 = 3/48 = 1/16
E6) ∫_0^∞ x³/(1 + x)^{13/2} dx = ∫_0^∞ x^{4−1}/(1 + x)^{4+5/2} dx
= β(4, 5/2) = Γ4 Γ(5/2)/Γ(4 + 5/2) = Γ4 Γ(5/2)/Γ(13/2)
= [3! × (3/2)(1/2)√π]/[(11/2)(9/2)(7/2)(5/2)(3/2)(1/2)√π]
= 6/[(11/2)(9/2)(7/2)(5/2)]
= (6 × 16)/(11 × 9 × 7 × 5) = 96/3465 = 32/1155
E7) f(x) = 60x²/(1 + x)⁷, 0 < x < ∞
= 60 x^{3−1}/(1 + x)^{3+4}, 0 < x < ∞
= (1/β(3, 4)) x^{3−1}/(1 + x)^{3+4}, 0 < x < ∞   [β(3, 4) = Γ3 Γ4/Γ7 = (2 × 6)/720 = 1/60]
⇒ m = 3, n = 4
Hence, mean = m/(n − 1) = 3/(4 − 1) = 1,
and Variance = m(m + n − 1)/[(n − 1)²(n − 2)] = 3(3 + 4 − 1)/[(4 − 1)²(4 − 2)] = (3 × 6)/(9 × 2) = 1.
UNIT 9 CONCEPTS OF TESTING OF
HYPOTHESIS
Structure
9.1 Introduction
Objectives
9.2 Hypothesis
Simple and Composite Hypotheses
Null and Alternative Hypotheses
9.3 Critical Region
9.4 Type-I and Type-II Errors
9.5 Level of Significance
9.6 One-Tailed and Two-Tailed Tests
9.7 General Procedure of Testing a Hypothesis
9.8 Concept of p-Value
9.9 Relation between Confidence Interval and Testing of Hypothesis
9.10 Summary
9.11 Solutions/Answers
9.1 INTRODUCTION
In previous block of this course, we have discussed one part of statistical
inference, that is, estimation and we have learnt how we estimate the unknown
population parameter(s) by using point estimation and interval estimation. In
this block, we will focus on the second part of statistical inference which is
known as testing of hypothesis.
In our day-to-day life, we see different commercial advertisements in television, newspapers, magazines, etc., such as
(i) The refrigerator of a certain brand saves up to 20% on the electricity bill,
(ii) The motorcycle of a certain brand gives 60 km/liter mileage,
(iii) A detergent of a certain brand produces the cleanest wash,
(iv) Ninety-nine out of hundred dentists recommend brand A toothpaste for their patients to save the teeth against cavity, etc.
Now, the question may arise in our mind “can such types of claims be verified
statistically?” Fortunately, in many cases the answer is “yes”.
The technique of testing such types of claims or statements or assumptions is known as testing of hypothesis. The truth or falsity of a claim or statement is never known unless we examine the entire population. But practically this is not possible in most situations, so we take a random sample from the population under study and use the information contained in this sample to decide whether the claim is true or false.
This unit is divided into 11 sections. Section 9.1 is introductory in nature. In Section 9.2, we define the hypothesis. The concept and role of the critical region in testing of hypothesis are described in Section 9.3. In Section 9.4, we explore the types of errors in testing of hypothesis, whereas level of significance is explored in Section 9.5. In Section 9.6, we explore the types of tests in testing of hypothesis. The general procedure of testing a hypothesis is discussed in Section 9.7. In Section 9.8, the concept of p-value in decision making about the null hypothesis is discussed, whereas the relation between confidence interval and testing of hypothesis is discussed in Section 9.9. The unit ends by providing a summary of what we have discussed in this unit in Section 9.10 and solutions of exercises in Section 9.11.
Objectives
After reading this unit, you should be able to:
define a hypothesis;
formulate the null and alternative hypotheses;
explain what we mean by type-I and type-II errors;
explore the concept of critical region and level of significance;
define one-tailed and two-tailed tests;
describe the general procedure of testing a hypothesis;
explain the concept of p-value; and
test a hypothesis by using confidence interval.
Before coming to the procedure of testing of hypothesis, we will discuss the basic terms used in this procedure one by one in subsequent sections.
9.2 HYPOTHESIS
As we have discussed in the previous section, in our day-to-day life we see different commercial advertisements in television, newspapers, magazines, etc., and if someone is interested in testing such types of claims or statements then we come across the problem of testing of hypothesis. For example,
(i) a customer of motorcycle wants to test whether the claim of motorcycle
of certain brand gives the average mileage 60 km/liter is true or false,
(ii) the businessman of banana wants to test whether the average weight of a
banana of Kerala is more than 200 gm,
(iii) a doctor wants to test whether new medicine is really more effective for
controlling blood pressure than old medicine,
(iv) an economist wants to test whether the variability in incomes differ in
two populations,
(v) a psychologist wants to test whether the proportion of literates between
two groups of people is same, etc.
In all the cases discussed above, the decision maker is interested in making
inference about the population parameter(s). However, he/she is not interested
in estimating the value of parameter(s) but he/she is interested in testing a
claim or statement or assumption about the value of population parameter(s).
Such claim or statement is postulated in terms of hypothesis.
In statistics, a hypothesis is a statement or a claim or an assumption about the
value of a population parameter (e.g., mean, median, variance, proportion,
etc.).
Similarly, in case of two or more populations a hypothesis is comparative
statement or a claim or an assumption about the values of population
parameters. (e.g., means of two populations are equal, variance of one
population is greater than other, etc.). The plural of hypothesis is hypotheses.
In hypothesis testing problems, first of all we should begin by identifying the claim or statement or assumption or hypothesis to be tested and write it in words. Once the claim has been identified, we write it in symbolic form if possible. As in the above examples,
(i) Customer of motorcycle may write the claim or postulate the hypothesis "the motorcycle of certain brand gives the average mileage 60 km/liter." Here, we are concerned with the average mileage of the motorcycle, so let µ represent the average mileage; then our hypothesis becomes µ = 60 km/liter.
(ii) Similarly, the businessman of banana may write the statement or
postulate the hypothesis “the average weight of a banana of Kerala is
greater than 200 gm.” So our hypothesis becomes µ > 200 gm.
(iii) Doctor may write the claim or postulate the hypothesis "the new medicine is really more effective for controlling blood pressure than the old medicine." Here, we are concerned with the average effect of the medicines, so let µ1 and µ2 represent the average effects of the new and old medicines respectively on controlling blood pressure; then our hypothesis becomes µ1 > µ2.
(iv) Economist may write the statement or postulate the hypothesis "the variability in incomes differs in two populations." Here, we are concerned with the variability in income, so let σ₁² and σ₂² represent the variability in incomes in the two populations respectively; then our hypothesis becomes σ₁² ≠ σ₂².
(v) Psychologist may write the statement or postulate the hypothesis "the proportion of literates between two groups of people is the same." Here, we are concerned with the proportion of literates, so let P1 and P2 represent the proportions of literates of the two groups of people respectively; then our hypothesis becomes P1 = P2 or P1 − P2 = 0.
The hypothesis is classified according to its nature and usage as we will discuss
in subsequent subsections.
9.2.1 Simple and Composite Hypotheses
In a general sense, if a hypothesis specifies only one value or exact value of the population parameter then it is known as a simple hypothesis, and if a hypothesis specifies not just one value but a range of values that the population parameter may assume, it is called a composite hypothesis.
A hypothesis which completely specifies the parameter(s) of a theoretical population (probability distribution) is called a simple hypothesis; otherwise it is called a composite hypothesis.
As in the above examples, the hypothesis postulated in (i), µ = 60 km/liter, is a simple hypothesis because it gives a single value of the parameter (µ = 60), whereas the hypothesis postulated in (ii), µ > 200 gm, is a composite hypothesis because it does not specify the exact average value of the weight of a banana. It may be 260, 350, 400 gm or any other value.
Similarly, (iii) µ1 > µ2 or µ1 − µ2 > 0 and (iv) σ₁² ≠ σ₂² or σ₁² − σ₂² ≠ 0 are not simple hypotheses because they specify more than one value, as µ1 − µ2 = 4, µ1 − µ2 = 7, σ₁² − σ₂² = 2, σ₁² − σ₂² = 5, etc., and (v) P1 = P2 or P1 − P2 = 0 is a simple hypothesis because it gives a single value of the parameter, as P1 − P2 = 0.
9.2.2 Null and Alternative Hypotheses
As we have discussed, in hypothesis testing problems first of all we identify the claim or statement to be tested and write it in symbolic form. After that we write the complement or opposite of the claim or statement in symbolic form. In our example of motorcycle, the claim is µ = 60 km/liter and its complement is µ ≠ 60 km/liter. In (ii) the claim is µ > 200 gm and its complement is µ ≤ 200 gm. If the claim were µ < 200 gm, its complement would be µ ≥ 200 gm. The claim and its complement are formed in such a way that they cover all possibilities of the value of the population parameter.
Once the claim and its complement have been established, we decide which of these two is the null hypothesis and which is the alternative hypothesis. The thumb rule is that the statement containing equality is the null hypothesis. That is, the hypothesis which contains a symbol =, ≤ or ≥ is taken as the null hypothesis, and the hypothesis which does not contain equality, i.e. contains ≠, < or >, is taken as the alternative hypothesis. The null hypothesis is denoted by H0 and the alternative hypothesis is denoted by H1 or HA.
We state the null and alternative hypotheses in such a way that they cover all possibilities of the value of the population parameter.
In our example of motorcycle, the claim is µ = 60 km/liter and its complement is µ ≠ 60 km/liter. Since the claim µ = 60 km/liter contains the equality sign, we take it as the null hypothesis and the complement µ ≠ 60 km/liter as the alternative hypothesis, that is,
H0: µ = 60 km/liter and H1: µ ≠ 60 km/liter
In our second example of banana, the claim is µ > 200 gm and its complement is µ ≤ 200 gm. Since the complement µ ≤ 200 gm contains the equality sign, we take the complement as the null hypothesis and the claim µ > 200 gm as the alternative hypothesis, that is,
H0: µ ≤ 200 gm and H1: µ > 200 gm
Formally these hypotheses are defined as
The hypothesis which we wish to test is called the null hypothesis.
According to Prof. R.A. Fisher,
“A null hypothesis is a hypothesis which is tested for possible rejection under
the assumption that it is true.”
The hypothesis which complements to the null hypothesis is called alternative
hypothesis.
Note 1: Some authors use equality sign in null hypothesis instead of ≥ and ≤
signs.
The alternative hypothesis has two types:
(i) Two-sided (tailed) alternative hypothesis
(ii) One-sided (tailed) alternative hypothesis
If the alternative hypothesis gives the alternative of the null hypothesis in both directions (less than and greater than) of the value of the parameter specified in the null hypothesis then it is known as a two-sided alternative hypothesis, and if it gives an alternative in only one direction (less than or greater than), it is known as a one-sided alternative hypothesis. For example, if our alternative hypothesis is H1: θ ≠ 60 then it is a two-sided alternative hypothesis because it means that the value of parameter θ is greater than or less than 60. Similarly, if H1: θ > 60 then it is a right-sided alternative hypothesis because it means that the value of parameter θ is greater than 60, and if H1: θ < 60 then it is a left-sided alternative hypothesis because it means that the value of parameter θ is less than 60.
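To make the distinction concrete, here is a small Python sketch (standard library only) with entirely hypothetical numbers — the sample mean 58.5 km/liter, known σ = 5 and n = 49 are not from the text — comparing the p-values a two-sided and a left-sided alternative would give for the motorcycle example:

```python
from statistics import NormalDist

# Hypothetical illustration for the motorcycle example:
# H0: mu = 60 km/liter, with sample mean 58.5, known sigma = 5, n = 49
# (these numbers are made up for the sketch, not taken from the text).
mu0, xbar, sigma, n = 60.0, 58.5, 5.0, 49
z = (xbar - mu0) / (sigma / n ** 0.5)          # observed test statistic, -2.1

p_two_sided = 2 * NormalDist().cdf(-abs(z))    # for H1: mu != 60
p_left = NormalDist().cdf(z)                   # for H1: mu < 60
assert p_left < p_two_sided                    # one tail is half of two tails here
```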
In the testing procedure, we assume that the null hypothesis is true until there is sufficient evidence to prove that it is false. Generally, the hypothesis is tested with the help of a sample, so evidence in testing of hypothesis comes from a sample. If there is enough sample evidence to suggest that the null hypothesis is false then we reject the null hypothesis and support the alternative hypothesis. If the sample fails to provide us sufficient evidence against the null hypothesis, we do not say that the null hypothesis is true, because here we take the decision on the basis of a random sample, which is a small part of the population. To say that the null hypothesis is true we must study all observations of the population under study. For example, if someone wants to test that every person of India has two hands then to prove that this is true we must check all the persons of India, whereas to prove that it is false we need only one person who has one hand or no hand. So we can only say that there is not enough evidence against the null hypothesis.
Note 2: When we assume that null hypothesis is true then we are actually
assuming that the population parameter is equal to the value in the claim. In our
example of motorcycle, we assume that µ = 60 km/liter whether the null
hypothesis is µ = 60 km/liter or µ ≤ 60 km/liter or µ ≥ 60 km/liter.
Now, you can try the following exercises.
E1) A company manufactures car tyres. Company claims that the average life
of its tyres is 50000 miles. To test the claim of the company, formulate
the null and alternative hypotheses.
E2) Write the null and alternative hypotheses in case (iii), (iv) and (v) of our
example given in Section 9.2.
E3) A businessman of orange formulates different hypotheses about the
average weight of the orange which are given below:
(i) H0: µ = 100 (ii) H1: µ > 100 (iii) H0: µ ≤ 100 (iv) H1: µ ≠ 100
(v) H1: µ > 150 (vi) H0: µ = 130 (vii) H1: µ ≠ 0
Categorize the above cases into simple and composite hypotheses.
After describing the hypothesis and its types, our next point in the testing of hypothesis is the critical region, which will be described in the next section.
range of T10 is 0 ≤ T10 ≤ 1000. Now, we divide the whole space (0–1000) into two regions: the no-distinction awarded region (less than 750) and the distinction awarded region (greater than or equal to 750), as shown in Fig. 9.1. Here, 750 is the critical value which separates the no-distinction and distinction awarded regions.
Fig. 9.1: Non-rejection and critical regions for distinction award
On the basis of scores in all the papers of the selected student, we calculate the
10
value of the statistic T10 Xi . And calculated value may fall in distinction
i 1
award region or not, depending upon the observed value of test statistic.
For making a decision to reject or do not reject H0, we use the test statistic
T10 = X1 + X2 + … + X10 (sum of scores of 10 papers). If the calculated value of the
test statistic T10 lies in the no-distinction awarded region (critical region), that is,
T10 < 750, then we reject H0, and if it lies in the distinction awarded region
(non-rejection region), that is, T10 ≥ 750, then we do not reject H0. This is the
basic structure of the procedure of testing of hypothesis which needs two
regions like:
(i) Region of rejection of null hypothesis H0
(ii) Region of non-rejection of null hypothesis H0
The point of discussion in this test procedure is “how to fix the cut-off value
750”? What is the justification for this value? The distinction award region
may equally be T10 ≥ 800 or T10 ≥ 850 or T10 ≥ 900. So, there must be a
scientific justification for the cut-off value 750. In a statistical test procedure, it
is obtained by using the probability distribution of the test statistic.
The region of rejection is called critical region. It has a pre-fixed area, generally
denoted by α, corresponding to a cut-off value in the probability distribution of the
test statistic.
The rejection (critical) region lies in one tail or two tails on the probability
curve of the sampling distribution of the test statistic; this depends upon the
alternative hypothesis. Therefore, three cases arise:
Concepts of Testing of Hypothesis
Case I: If the alternative hypothesis is right-sided such as H1 : θ > θ0 or
H1 : θ1 > θ2 then the entire critical or rejection region of size α lies on
right tail of the probability curve of sampling distribution of the test
statistic as shown in Fig. 9.2.
Critical value is a value (or values) that separates the region of rejection from the
non-rejection region.
Fig. 9.2
Case II: If the alternative hypothesis is left-sided such as H1: θ < θ0 or
H1 : θ1 < θ2 then the entire critical or rejection region of size α lies on
left tail of the probability curve of sampling distribution of the test
statistic as shown in Fig. 9.3.
Fig. 9.3
Case III: If the alternative hypothesis is two-sided such as H1 : θ ≠ θ0 or
H1 : θ1 ≠ θ2, then critical or rejection regions of size α/2 lie on both
tails of the probability curve of the sampling distribution of the test
statistic, as shown in Fig. 9.4.
Fig. 9.4
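The three cases above can be checked numerically. The following is a minimal illustration (not from the text): it computes the cut-off values for each case from the standard normal distribution using Python's standard library, with α = 0.05 chosen only as an example.

```python
# Illustrative sketch: critical values for right-tailed, left-tailed and
# two-tailed rejection regions under a standard normal sampling distribution.
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal N(0, 1)

# Case I: right-tailed H1 -> entire region of size alpha in the right tail
right_cut = z.inv_cdf(1 - alpha)        # reject if statistic >= right_cut

# Case II: left-tailed H1 -> entire region of size alpha in the left tail
left_cut = z.inv_cdf(alpha)             # reject if statistic <= left_cut

# Case III: two-tailed H1 -> alpha/2 in each tail
two_cut = z.inv_cdf(1 - alpha / 2)      # reject if |statistic| >= two_cut

print(round(right_cut, 3), round(left_cut, 3), round(two_cut, 3))
```

For α = 0.05 these come out to about 1.645, −1.645 and ±1.96, the familiar Z-table critical values.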
Now, you can try the following exercise.
E4) If H0: θ = 60 and H1: θ ≠ 60, does the critical region lie in one tail or
two tails?
Testing of Hypothesis
9.4 TYPE-I AND TYPE-II ERRORS
In Section 9.3, we have discussed a rule that if the value of test statistic falls in
rejection (critical) region then we reject the null hypothesis and if it falls in the
non-rejection region then we do not reject the null hypothesis. A test statistic is
calculated on the basis of observed sample observations. But a sample is a
small part of the population about which decision is to be taken. A random
sample may or may not be a good representative of the population.
A faulty sample misleads the inference (or conclusion) relating to the null
hypothesis. For example, an engineer may infer that a packet of screws is sub-
standard when actually it is not. It is an error caused due to a poor or
inappropriate (faulty) sample. Similarly, a packet of screws may be inferred to be
good when actually it is sub-standard. So we can commit two kinds of errors while
testing a hypothesis, which are summarised in Table 9.1 given below:
Table 9.1: Type of Errors
Decision H0 True H1 True
Reject H0 Type-I Error Correct Decision
Do not reject H0 Correct Decision Type-II Error
Let us take a situation where a patient suffering from high fever goes to a
doctor. Suppose the doctor formulates the null and alternative hypotheses
as
H0 : The patient is a malaria patient
H1 : The patient is not a malaria patient
Then the following cases arise:
Case I: Suppose that the hypothesis H0 is really true, that is, the patient is
actually a malaria patient, and after observation, pathological and
clinical examination, the doctor rejects H0, that is, he / she declares
him / her a non-malaria patient. It is not a correct decision and he / she
commits an error in decision, known as type-I error.
Case II: Suppose that the hypothesis H0 is actually false, that is, the patient is
actually a non-malaria patient, and after observation, the doctor
rejects H0, that is, he / she declares him / her a non-malaria patient. It
is a correct decision.
Case III: Suppose that the hypothesis H0 is really true, that is, the patient is
actually a malaria patient, and after observation, the doctor does not
reject H0, that is, he / she declares him / her a malaria patient. It is a
correct decision.
Case IV: Suppose that the hypothesis H0 is actually false, that is, the patient is
actually a non-malaria patient, and after observation, the doctor does
not reject H0, that is, he / she declares him / her a malaria patient. It
is not a correct decision and he / she commits an error in decision,
known as type-II error.
Thus, we formally define type-I and type-II errors as below:
Type-I Error:
The decision relating to rejection of null hypothesis H0 when it is true is called
type-I error. The probability of committing the type-I error is called the size of the
test, denoted by α, and is given by
α = P [Reject H0 when H0 is true] = P [Reject H0 / H0 is true]
We reject the null hypothesis if the random sample / test statistic falls in the
rejection region, therefore,
α = P[X ∈ ω / H0]
where X = (X1, X2,…,Xn) is a random sample and ω is the rejection region, and
1 − α = 1 − P[Reject H0 / H0 is true]
= P[Do not reject H0 / H0 is true] = P[Correct decision]
The quantity (1 − α) is the probability of a correct decision and it correlates to the
concept of 100(1 − α)% confidence interval used in estimation.
Type-II Error:
The decision relating to non-rejection of null hypothesis H0 when it is false
(i.e. H1 is true) is called type-II error. The probability of committing type-II
error is generally denoted by β and is given by
β = P[Do not reject H0 when H0 is false]
= P[Do not reject H0 when H1 is true]
= P[Do not reject H0 / H1 is true]
= P[X ∈ ω̄ / H1], where ω̄ is the non-rejection region,
and
1 − β = 1 − P[Do not reject H0 / H1 is true]
= P[Reject H0 / H1 is true] = P[Correct decision]
The quantity (1 − β) is the probability of a correct decision and is also known as the
“power of the test”. Since it indicates the ability or power of the test to recognise
correctly that the null hypothesis is false, we wish for a test that yields a
large power.
We say that a statistical test is ideal if it minimises the probability of both types
of errors and maximises the probability of correct decision. But for fixed sample
size, α and β are so interrelated that a decrement in one results in an
increment in the other. So minimisation of both probabilities of type-I and type-II
errors simultaneously for fixed sample size is not possible without increasing the
sample size. Also, both types of errors will be at zero level (i.e. no error in
decision) if the size of the sample is equal to the population size. But this involves
huge cost if the population size is large, and it is not possible in all situations, such
as testing of blood.
Depending on the problem in hand, we have to choose the type of error which
has to be minimised. To appreciate this, consider a decision-making problem
with a rule that if we make a type-I error, we lose 10 rupees, and if we make a
type-II error we lose 1000 rupees. In this case, we try to minimise the type-II
error, since it is more expensive.
In another situation, suppose the Delhi police arrest a person whom they
suspect is a murderer. Now, the police have to test the hypothesis:
H0: Arrested person is innocent (not murderer)
H1: Arrested person is a murderer
The type-I error is
α = P [Reject H0 when it is true]
That is, the suspected person, who is actually innocent, will be sent to jail when
we reject H0 although H0 is true.
The type-II error is
β = P [Do not reject H0 when H1 is true]
That is, the arrested person is truly a murderer but is released by the police. Now,
we see that in this case type-I error is more serious than type-II error, because a
murderer may be arrested / punished later on, but sending an innocent person
to jail is serious.
Consider another situation. Suppose we want to test the null hypothesis
H0 : p = 0.5 against H1 : p ≠ 0.5 on the basis of tossing a coin once, where p is
the probability of getting a head in a single toss (trial). And we reject the null
hypothesis if a head appears and do not reject otherwise. The type-I error, that
is, the probability of rejecting H0 when it is true, can be calculated easily (as shown
in Example 1) but the computation of type-II error is not possible because there
are infinitely many alternatives for p such as p = 0.6, p = 0.1, etc.
Generally, strong control on α is necessary. It should be kept as low as
possible. In a test procedure, we prefix it at a very low level like α = 0.05 (5%) or
α = 0.01 (1%).
Now, it is time to do some examples relating to α and β.
Example 1: It is desired to test a hypothesis H0 : p = p0 = 1/2 against the
alternative hypothesis H1 : p = p1 = 1/4 on the basis of tossing a coin once,
where p is the probability of “getting a head” in a single toss (trial), and
agreeing to reject H0 if a head appears and accept H0 otherwise. Find the values
of α and β.
Solution: In such type of problems, first of all we search for the critical region.
Here, we have critical region ω = {head}.
Therefore, the probability of type-I error can be obtained as
α = P[Reject H0 when H0 is true]
= P[X ∈ ω / H0] = P[Head appears / H0]
= p0 = 1/2 [H0 is true, so we take the value of the parameter p given in H0]
Also,
β = P[Do not reject H0 when H1 is true]
= P[X ∈ ω̄ / H1] = P[Tail appears / H1]
= 1 − p1 = 1 − 1/4 = 3/4
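A quick numeric check of Example 1 (variable names below are ours): α is the chance of a head under H0 and β the chance of a tail under H1, and a small simulation agrees with the exact values.

```python
import random

# Example 1 setup: reject H0 (p = p0 = 1/2) if a head appears in one toss.
p0, p1 = 1 / 2, 1 / 4

alpha = p0        # P(head | H0): head lands in the critical region {head}
beta = 1 - p1     # P(tail | H1): tail lands in the non-rejection region

# Monte Carlo check of the same two probabilities
random.seed(0)
n = 100_000
alpha_sim = sum(random.random() < p0 for _ in range(n)) / n   # heads under H0
beta_sim = sum(random.random() >= p1 for _ in range(n)) / n   # tails under H1

print(alpha, beta)   # 0.5 0.75
```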
Obtain type-I and type-II errors when the critical region is X ≥ 0.4. Also obtain
the power function of the test.
Solution: Here, we have critical (rejection) and non-rejection regions as
ω = {X : X ≥ 0.4} and ω̄ = {X : X < 0.4}
We have to test the null hypothesis
H0 : θ = 1 against H1 : θ = 2
The size of type-I error is given by
α = P[X ∈ ω / H0] = P[X ≥ 0.4 / θ = 1]
Since P[X ≥ a] = ∫ from a to θ of f(x, θ) dx … (1)
Now, by using f(x, θ) = 1/θ; 0 ≤ x ≤ θ, we get from equation (1)
α = ∫ from 0.4 to 1 of (1/1) dx = [x] from 0.4 to 1 = 1 − 0.4 = 0.6
Similarly, the size of type-II error is given by
β = P[X ∈ ω̄ / H1] = P[X < 0.4 / θ = 2]
= ∫ from 0 to 0.4 of (1/2) dx = (1/2)[x] from 0 to 0.4 = 0.4/2 = 0.2
Hence, the power of the test = 1 − β = 1 − 0.2 = 0.8
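The two integrals above can be checked numerically. A minimal sketch (not part of the original solution), using the fact that for X ~ Uniform(0, θ) the probability of an interval is its length divided by θ:

```python
# Numeric check of Example 2: X ~ Uniform(0, theta), f(x) = 1/theta.
# Critical region X >= 0.4; H0: theta = 1, H1: theta = 2.
def uniform_prob(a, b, theta):
    """P(a <= X <= b) for X ~ Uniform(0, theta)."""
    lo, hi = max(a, 0.0), min(b, theta)
    return max(hi - lo, 0.0) / theta

alpha = uniform_prob(0.4, 1.0, theta=1)   # P(X >= 0.4 | theta = 1)
beta = uniform_prob(0.0, 0.4, theta=2)    # P(X < 0.4  | theta = 2)
power = 1 - beta

print(alpha, beta, power)   # 0.6 0.2 0.8
```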
Let us do one example based on types of tests.
Example 3: A company has replaced its original technology of producing
electric bulbs by CFL technology. The company manager wants to compare the
average life of bulbs manufactured by the original technology and the new CFL
technology. Write appropriate null and alternative hypotheses. Also state whether
the corresponding tests are one-tailed or two-tailed.
Solution: Suppose the average lives of original and CFL technology bulbs are
denoted by µ1 and µ2 respectively.
If the company manager is interested just to know whether any significant
difference exists in the average lifetimes of the two types of bulbs, then the null
and alternative hypotheses will be:
H0 : µ1 = µ2 [average lives of two types of bulbs are same]
H1 : µ1 ≠ µ2 [average lives of two types of bulbs are different]
Since the alternative hypothesis is two-tailed, the corresponding test will be
two-tailed.
If the company manager is interested just to know whether the average life of
CFL bulbs is greater than that of original technology bulbs, then our null and
alternative hypotheses will be
H0 : µ1 ≥ µ2 and H1 : µ1 < µ2
Since the alternative hypothesis is left-tailed, the corresponding test will be
one-tailed.
assumed value θ0 of the parameter θ. So we can take the null and
alternative hypotheses as
H0 : θ = θ0 and H1 : θ ≠ θ0 for two-tailed test
H0 : θ ≤ θ0 and H1 : θ > θ0
or for one-tailed test
H0 : θ ≥ θ0 and H1 : θ < θ0
In case of comparing the same parameter of two populations of interest,
say, θ1 and θ2, our null and alternative hypotheses would be
H0 : θ1 = θ2 and H1 : θ1 ≠ θ2 for two-tailed test
H0 : θ1 ≤ θ2 and H1 : θ1 > θ2
or for one-tailed test
H0 : θ1 ≥ θ2 and H1 : θ1 < θ2
Step II: After setting the null and alternative hypotheses, we establish a
criterion for rejection or non-rejection of the null hypothesis, that is,
decide the level of significance (α) at which we want to test our
hypothesis. Generally, it is taken as 5% or 1% (α = 0.05 or 0.01).
Step III: The third step is to choose an appropriate test statistic under H0 for
testing the null hypothesis, as given below:
Test statistic = (Statistic − Value of the parameter under H0) / (Standard error of the statistic)
After that, specify the sampling distribution of the test statistic,
preferably in a standard form like Z (standard normal), χ2, t, F or
any other well-known form in the literature.
Step IV: Calculate the value of the test statistic described in Step III on the
basis of observed sample observations.
Step V: Obtain the critical (or cut-off) value(s) in the sampling distribution
of the test statistic and construct the rejection (critical) region of size α.
Generally, critical values for various levels of significance are
presented in the form of tables for various standard sampling
distributions of the test statistic, such as the Z-table, χ2-table, t-table, etc.
Step VI: After that, compare the calculated value of the test statistic obtained
in Step IV with the critical value(s) obtained in Step V and
locate the position of the calculated test statistic, that is, whether it lies
in the rejection region or the non-rejection region.
Step VII: In testing of hypothesis, ultimately we have to reach a conclusion.
It is done as explained below:
(i) If the calculated value of the test statistic lies in the rejection region
at α level of significance, then we reject the null hypothesis. It means
that the sample data provide us sufficient evidence against the
null hypothesis and there is a significant difference between the
hypothesized value and the observed value of the parameter.
(ii) If the calculated value of the test statistic lies in the non-rejection
region at α level of significance, then we do not reject the null
hypothesis. It means that the sample data fail to provide us sufficient
evidence against the null hypothesis and the difference between the
hypothesized value and the observed value of the parameter is due to
the fluctuation of sampling.
Note 3: Nowadays the decision about the null hypothesis is taken with the help
of the p-value. The concept of p-value is very important, because computer
packages and statistical software such as SPSS, SAS, STATA, MINITAB,
EXCEL, etc. all provide the p-value. So, Section 9.8 is devoted to explaining the
concept of p-value.
Now, with the help of an example we explain the above procedure.
Example 4: Suppose it was found that the average weight of a potato was 50 gm
and the standard deviation was 5.1 gm nearly 5 years ago. We want to test that,
due to advancement in agricultural technology, the average weight of a potato has
increased. To test this, a random sample of 50 potatoes is taken and the sample
mean (X̄) is calculated as 52 gm. Describe the procedure to carry out
this test.
Solution: Here, we are given that
Specified value of population mean = µ0 = 50 gm,
Population standard deviation = σ = 5.1 gm,
Sample size = n = 50,
Sample mean = X̄ = 52 gm
To carry out the above test, we have to follow up the following steps:
Step I: First of all, we set up null and alternative hypotheses. Here, we want
to test that the average weight of potato has increased. So our claim is
“average weight of potato has increased”, i.e. µ > 50, and its
complement is µ ≤ 50. Since the complement contains the equality sign,
we take the complement as the null hypothesis and the claim as the
alternative hypothesis, that is,
H0 : µ ≤ 50 gm and H1: µ > 50 gm [Here, θ = µ]
Since the alternative hypothesis is right-tailed, so our test is right-
tailed.
Step II: After setting the null and alternative hypotheses, we fix level of
significance α. Suppose, α = 0.01 (= 1 % level).
Step III: Define a test statistic to test the null hypothesis as
Test statistic = (Statistic − Value of the parameter under H0) / (Standard error of the statistic)
T = (X̄ − 50) / (σ/√n)
Since the sample size is large (n = 50 > 30), by the central limit
theorem the sampling distribution of the test statistic approximately
follows the standard normal distribution (as explained in Unit 1 of this
course), i.e. T ~ N(0, 1)
Step IV: Calculate the value of the test statistic on the basis of the sample
observations as
T = (52 − 50) / (5.1/√50) = 2/0.72 = 2.78
Step V: Now, we find the critical value. The critical value or cut-off value for
the standard normal distribution is given in Table I (Z-table) in the
Appendix at the end of Block 1 of this course. From this table, the
critical value for a right-tailed test at α = 0.01 is zα = 2.33.
Step VI: Now, to take the decision about the null hypothesis, we compare the
calculated value of the test statistic with the critical value.
Since the calculated value of the test statistic (= 2.78) is greater than the
critical value (= 2.33), the calculated value of the test statistic lies in the
rejection region at 1% level of significance, as shown in Fig. 9.5. So
we reject the null hypothesis and support the alternative hypothesis. Since
the alternative hypothesis is our claim, we support the claim.
Thus, we conclude that the sample provides sufficient evidence in
support of the claim, so we may assume that the average weight of
potato has increased.
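The computation in Steps III–VI can be sketched in a few lines. This is an illustrative check (variable names are ours), using the exact normal quantile instead of the Z-table; note that the text rounds σ/√n to 0.72, giving 2.78, while the unrounded statistic is about 2.77.

```python
# Sketch of Example 4: right-tailed Z-test for the population mean.
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 50, 5.1, 50, 52, 0.01

z = (xbar - mu0) / (sigma / sqrt(n))        # test statistic, ~2.77
z_crit = NormalDist().inv_cdf(1 - alpha)    # right-tailed cut-off, ~2.33

reject = z >= z_crit                        # True -> reject H0 at alpha = 0.01
print(round(z, 2), round(z_crit, 2), reject)
```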
Fig. 9.5
Now, you can try the following exercise.
E9) What is the first step in testing of hypothesis?
Procedure of taking the decision about the null hypothesis on the basis of
p-value:
To take the decision about the null hypothesis based on the p-value, the p-value
is compared with the level of significance (α): if the p-value is equal to or less
than α, then we reject the null hypothesis, and if the p-value is greater than α, we
do not reject the null hypothesis.
The p-value for various tests can be obtained with the help of the tables given
in the Appendix of Block 1 of this course. But unless we are dealing with
the standard normal distribution, the exact p-value cannot be obtained from the
tables mentioned above. However, if we test our hypothesis with the help of
computer packages or software such as SPSS, SAS, MINITAB, STATA,
EXCEL, etc., then these packages present the p-value as part of the output for
each hypothesis testing procedure. Therefore, in this block we will also describe
the procedure of taking the decision about the null hypothesis on the basis of the
critical value as well as the p-value concept.
9.10 SUMMARY
In this unit, we have covered the following points:
1. Statistical hypothesis, null hypothesis, alternative hypothesis, simple &
composite hypotheses.
2. Type-I and Type-II errors.
3. Critical region.
4. One-tailed and two-tailed tests.
5. General procedure of testing a hypothesis.
6. Level of significance.
7. Concept of p-value.
8. Relation between confidence interval and testing of hypothesis.
H1 : σ1² ≠ σ2²
(v) Here, the psychologist wants to test whether the proportion of literates
between two groups of people is the same, so
Claim: P1 = P2 and complement: P1 ≠ P2
Since the claim contains the equality sign, we take the claim as the null
hypothesis and the complement as the alternative hypothesis, i.e.
H0 : P1 = P2
H1 : P1 ≠ P2
E3) Here, (i) and (vi) represent simple hypotheses because these
hypotheses tell us the exact value of the parameter (average weight of
orange) as µ = 100 and µ = 130.
The rest, (ii), (iii), (iv), (v) and (vii), represent composite hypotheses
because they do not tell us the exact value of the parameter µ.
E4) Since the alternative hypothesis H1 : θ ≠ 60 is two-tailed, the critical
region lies in two tails.
E5) Let A and B denote the number of white balls and black balls in the urn
respectively. Further, let X be the number of white balls drawn among
the two balls from the urn then we can take the null and alternative
hypotheses as
H0 : A = 4 & B = 2 and H1: A = 2 & B = 4
The critical region is given by
ω = {X : X < 2} = {X : X = 0, 1}
Thus,
α = P [Reject H0 when H0 is true]
= P[X ∈ ω / H0] = P[X = 0 / H0] + P[X = 1 / H0]
= (4C0 × 2C2)/6C2 + (4C1 × 2C1)/6C2
= (1 × 1)/15 + (4 × 2)/15 = 1/15 + 8/15
= 9/15 = 3/5
Similarly,
β = P [Do not reject H0 when H1 is true]
= P[X ∈ ω̄ / H1] = P[X = 2 / H1]
= (2C2 × 4C0)/6C2 = (1 × 1)/15 = 1/15
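E5's probabilities are hypergeometric and can be verified directly with binomial coefficients. A minimal check (the helper name is ours):

```python
# Check of E5: urn with `white` white and `black` black balls, 2 balls drawn
# without replacement, X = number of white balls drawn. Reject H0 (A=4, B=2)
# when X < 2, i.e. X in {0, 1}.
from math import comb

def pmf(x, white, black):
    """Hypergeometric P(X = x) when 2 balls are drawn."""
    return comb(white, x) * comb(black, 2 - x) / comb(white + black, 2)

alpha = pmf(0, 4, 2) + pmf(1, 4, 2)   # P(X in {0,1} | H0) = 9/15 = 3/5
beta = pmf(2, 2, 4)                   # P(X = 2 | H1) = 1/15

print(alpha, beta)
```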
E6) Since level of significance is the probability of type-I error so in this
case level of significance is 0.05 or 5%.
E7) Here, the alternative hypothesis is two-tailed therefore, the test will be
two-tailed test.
E8) Whether the test of testing a hypothesis is one-tailed or two-tailed
depends on the alternative hypothesis. So correct option is (ii).
E9) First step in testing of hypothesis is to setup null and alternative
hypotheses.
UNIT 10 LARGE SAMPLE TESTS
Structure
10.1 Introduction
Objectives
10.2 Procedure of Testing of Hypothesis for Large Samples
10.3 Testing of Hypothesis for Population Mean Using Z-Test
10.4 Testing of Hypothesis for Difference of Two Population Means Using
Z-Test
10.5 Testing of Hypothesis for Population Proportion Using Z-Test
10.6 Testing of Hypothesis for Difference of Two Population Proportions
Using Z-Test
10.7 Testing of Hypothesis for Population Variance Using Z-Test
10.8 Testing of Hypothesis for Two Population Variances Using Z-Test
10.9 Summary
10.10 Solutions /Answers
10.1 INTRODUCTION
In the previous unit, we defined the basic terms used in testing of hypothesis.
Having provided you the necessary material required for any test, we can move
towards discussing particular tests one by one. But before doing that, let us tell
you the strategy we are adopting here.
First, we categorise the tests under two heads:
Large sample tests
Small sample tests
After that, their unit wise distribution is done. In this unit, we will discuss large
sample tests whereas in Units 11 and 12 we will discuss small sample tests.
The tests which are described in these units are known as “parametric tests”.
Sometimes in our studies in the fields of economics, psychology, medicine, etc.
we take a sample of objects / units / participants / patients, etc. such as 70, 500,
1000, 10,000, etc. This situation comes under the category of large samples.
As a rule of thumb, a sample of size n is treated as a large sample only if it
contains more than 30 units (or observations, n > 30). And we know that, for a
large sample (n > 30), one statistical fact is that almost all sampling
distributions of the statistic(s) are closely approximated by the normal
distribution. Therefore, the test statistic, which is a function of sample
observations based on n > 30, can be assumed to follow the normal distribution
approximately (or exactly).
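This statistical fact can be illustrated with a small simulation (ours, not from the text): sample means of size n = 40 from a markedly non-normal (exponential) population already cluster the way N(µ, σ²/n) predicts, with mean near 1 and standard deviation near 1/√40 ≈ 0.158.

```python
# Illustration of the large-sample normal approximation for the sample mean.
import random
from statistics import mean, stdev

random.seed(1)
n, reps = 40, 5000

# Non-normal population: exponential with mean 1 (and SD 1)
means = [mean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]

# The 5000 sample means centre near 1 with spread near 1/sqrt(40)
print(round(mean(means), 3), round(stdev(means), 3))
```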
But the story does not end here. There are some other issues which need to be
taken care of. Some of these issues have been highlighted by making different
cases in each test, as you will see when you go through Sections 10.3 to 10.8 of
this unit.
This unit is divided into ten sections. Section 10.1 is introductory in nature.
General procedure of testing of hypothesis for large samples is described in
Section 10.2. In Section 10.3, testing of hypothesis for population mean is
discussed whereas in Section 10.4, testing of hypothesis for difference of two
population means with examples is described. Similarly, in Sections 10.5 and
10.6, testing of hypothesis for population proportion and difference of two
population proportions are explained respectively. Testing of hypothesis for
population variance and two population variances are described in Sections
10.7 and 10.8 respectively. The unit ends with a summary of what we have
discussed in this unit in Section 10.9 and solutions of exercises in Section 10.10.
Objectives
After studying this unit, you should be able to:
judge for a given situation whether we should go for a large sample test or
not;
apply the Z-test for testing the hypothesis about the population mean
and difference of two population means;
apply the Z-test for testing the hypothesis about the population
proportion and difference of two population proportions; and
apply the Z-test for testing the hypothesis about the population variance
and two population variances.
Step II: After setting the null and alternative hypotheses, we have to choose
level of significance. Generally, it is taken as 5% or 1% (α = 0.05 or
0.01). And accordingly rejection and non-rejection regions will be
decided.
Step III: The third step is to determine an appropriate test statistic, say, Z in
the case of large samples. Suppose Tn is the sample statistic, such as
the sample mean, sample proportion, sample variance, etc., for the
parameter θ; then for testing the null hypothesis, the test statistic is
given by
Z = (Tn − E(Tn)) / SE(Tn) = (Tn − E(Tn)) / √Var(Tn)
[We know that the SE of a statistic is the SD of the sampling
distribution of that statistic: SE(Tn) = SD(Tn) = √Var(Tn).]
Case I: When H0 : θ ≤ θ0 and H1 : θ > θ0 (right-tailed test)
In this case, the rejection (critical) region falls under the right tail of
the probability curve of the sampling distribution of the test statistic Z.
Fig. 10.1
Suppose zα is the critical value at α level of significance, so the entire
region greater than or equal to zα is the rejection region and less than
zα is the non-rejection region, as shown in Fig. 10.1.
If z (calculated value) ≥ zα (tabulated value), that means the
calculated value of the test statistic Z lies in the rejection region; then
we reject the null hypothesis H0 at α level of significance. Therefore,
we conclude that the sample data provide us sufficient evidence
against the null hypothesis and there is a significant difference between
the hypothesized or specified value and the observed value of the parameter.
If z < zα, that means the calculated value of the test statistic Z lies in
the non-rejection region; then we do not reject the null hypothesis H0
at α level of significance. Therefore, we conclude that the sample data
fail to provide us sufficient evidence against the null hypothesis and
the difference between the hypothesized value and the observed value
of the parameter is due to fluctuation of sampling,
so the population parameter θ may be θ0.
Case II: When H0 : θ ≥ θ0 and H1 : θ < θ0 (left-tailed test)
In this case, the rejection (critical) region falls under the left tail of the
probability curve of the sampling distribution of test statistic Z.
Suppose −zα is the critical value at α level of significance; then the
entire region less than or equal to −zα is the rejection region and
greater than −zα is the non-rejection region, as shown in Fig. 10.2.
Fig. 10.2
If z ≤ −zα, that means the calculated value of the test statistic Z lies in
the rejection region; then we reject the null hypothesis H0 at α level of
significance.
If z > −zα, that means the calculated value of the test statistic Z lies in
the non-rejection region; then we do not reject the null hypothesis H0
at α level of significance.
In case of two-tailed test: When H0 : θ = θ0 and H1 : θ ≠ θ0
In this case, the rejection region falls under both tails of the
probability curve of the sampling distribution of the test statistic Z.
Half the area α, i.e. α/2, will lie under the left tail and the other half
under the right tail. Suppose −zα/2 and zα/2 are the two critical values
on the left tail and right tail respectively. Therefore, the entire region
less than or equal to −zα/2 or greater than or equal to zα/2 forms the
rejection region, and the region between −zα/2 and zα/2 is the
non-rejection region, as shown in Fig. 10.3.
Fig. 10.3
These p-values for the Z-test can be obtained with the help of Table I (Z-table)
given in the Appendix at the end of Block 1 of this course (which gives the
probability P[0 ≤ Z ≤ z] for different values of z), as discussed in Unit 14 of
MST-003.
For example, if the test is right-tailed and the calculated value of the test
statistic Z is 1.23, then
p-value = P[Z ≥ z] = P[Z ≥ 1.23] = 0.5 − P[0 ≤ Z ≤ 1.23]
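Instead of reading P[0 ≤ Z ≤ 1.23] from the Z-table, the same p-value can be computed from the standard normal CDF. A minimal sketch:

```python
# p-value of a right-tailed Z-test with observed z = 1.23, from the exact CDF.
from statistics import NormalDist

z_obs = 1.23
p_value = 1 - NormalDist().cdf(z_obs)   # P(Z >= 1.23) = 0.5 - P(0 <= Z <= 1.23)

# About 0.1093, matching 0.5 - 0.3907 from the Z-table.
print(round(p_value, 4))
```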
SE(X̄) = √Var(X̄) = σ/√n … (2)
Now, follow the same procedure as we have discussed in the previous section,
that is, first of all we have to set up the null and alternative hypotheses. Since
here we want to test the hypothesis about the population mean, we can take the
null and alternative hypotheses as
H0 : µ = µ0 and H1 : µ ≠ µ0 for two-tailed test
H0 : µ ≤ µ0 and H1 : µ > µ0
or for one-tailed test
H0 : µ ≥ µ0 and H1 : µ < µ0
[Here, θ = µ and θ0 = µ0 if we compare it with the general procedure.]
For testing the null hypothesis, the test statistic Z is given by
Z = (X̄ − E(X̄)) / SE(X̄)
= (X̄ − µ0) / (σ/√n) [Using equations (1) and (2); under H0 we assume that µ = µ0]
[When we assume that the null hypothesis is true, we are actually assuming that
the population parameter is equal to the value in the null hypothesis. For
example, we assume that µ = 60 whether the null hypothesis is µ = 60 or
µ ≤ 60 or µ ≥ 60.]
The sampling distribution of the test statistic depends upon whether σ2 is known
or unknown. Therefore, two cases arise:
Case I: When σ2 is known
In this case, the test statistic follows the normal distribution with
mean 0 and variance unity when the sample size is large, whether the
population under study is normal or non-normal. If the sample size is
small, then the test statistic Z follows the normal distribution only
when the population under study is normal. Thus,
Z = (X̄ − µ0) / (σ/√n) ~ N(0, 1)
Case II: When σ2 is unknown
In this case, we estimate σ2 by the value of the sample variance (S2),
where
S2 = (1/(n − 1)) Σ (Xi − X̄)2, the sum being taken over i = 1 to n.
Then the test statistic (X̄ − µ0)/(S/√n) follows the t-distribution with
(n − 1) df whether the sample size is large or small, provided the
population under study is normal, as we have discussed in Unit 2 of
this course. But when the population under study is not normal and
the sample size is large, then this test statistic approximately follows
the normal distribution with mean 0 and variance unity, that is,
Z = (X̄ − µ0) / (S/√n) ~ N(0, 1)
After that, we calculate the value of the test statistic, as the case may be (σ2
known or unknown), and compare it with the critical value given in Table 10.1
at prefixed level of significance α. Take the decision about the null hypothesis
as described in the previous section.
From the above discussion of testing of hypothesis about the population mean,
we note the following points:
(i) When σ2 is known, we apply the Z-test for a large sample, whether the
population under study is normal or non-normal. But when the sample
size is small, we apply the Z-test only when the population under study
is normal.
(ii) When σ2 is unknown, we apply the t-test only when the population
under study is normal, whether the sample size is large or small. But
when the assumption of normality is not fulfilled and the sample size is
large, we can apply the Z-test.
(iii) When the sample is small, σ2 is known or unknown, and the form of the
population is not known, we apply a non-parametric test, as will be
discussed in Block 4 of this course.
Following examples will help you to understand the procedure more clearly.
Example 1: A light bulb company claims that the 100-watt light bulb it sells
has an average life of 1200 hours with a standard deviation of 100 hours. For
testing the claim, 50 new bulbs were selected randomly and allowed to burn
out. The average lifetime of these bulbs was found to be 1180 hours. Is the
company’s claim true at 5% level of significance?
Solution: Here, we are given that
Specified value of population mean = µ0 = 1200 hours,
Population standard deviation = σ = 100 hours,
Sample size = n = 50,
Sample mean = X̄ = 1180 hours.
In this example, the population parameter being tested is the population mean,
i.e. the average life of a bulb (µ), and we want to test the company’s claim that
the average life of a bulb is 1200 hours. So our claim is µ = 1200 and its
complement is µ ≠ 1200. Since the claim contains the equality sign, we can take
the claim as the null hypothesis and the complement as the alternative
hypothesis. So
H0 : µ = µ0 = 1200 [average life of a bulb is 1200 hours]
H1 : µ ≠ 1200 [average life of a bulb is not 1200 hours]
Also, the alternative hypothesis is two-tailed, so the test is a two-tailed test.
Here, we want to test the hypothesis regarding the mean when the population
SD (variance) is known and the sample size n = 50 (> 30) is large. So we will
go for the Z-test.
Thus, for testing the null hypothesis, the test statistic is given by
Z = (X̄ − µ0) / (σ/√n)
= (1180 − 1200) / (100/√50) = −20/14.14 = −1.41
The critical (tabulated) values for two-tailed test at 5% level of significance are
± zα/2 = ± z0.025 = ± 1.96.
Fig. 10.4
Since the calculated value of the test statistic Z (= −1.41) is greater than the
critical value (= −1.96) and less than the critical value (= 1.96), it lies in the
non-rejection region, as shown in Fig. 10.4, so we do not reject the null
hypothesis. Since the null hypothesis is the claim, we support the claim at 5%
level of significance.
Decision according to p-value:
The test is two-tailed, therefore,
p-value = 2P[Z ≥ |z|] = 2P[Z ≥ 1.41] = 2 × 0.0793 = 0.1586
Since the p-value (= 0.1586) is greater than α (= 0.05), we do not reject the null
hypothesis at 5% level of significance.
Decision according to confidence interval:
Here, the test is two-tailed; therefore, we construct a two-sided confidence interval for
the population mean.
Since the population standard deviation is known, we can use the
(1 − α)100% confidence interval for the population mean when the population
variance is known, which is given by
[X̄ − zα/2 σ/√n, X̄ + zα/2 σ/√n]
= [1180 − 1.96 × 100/√50, 1180 + 1.96 × 100/√50]
or [1180 − 27.71, 1180 + 27.71]
or [1152.29, 1207.71]
Since the 95% confidence interval for the average life of a bulb contains the value of
the parameter specified by the null hypothesis, that is, µ0 = 1200, we do
not reject the null hypothesis.
Thus, we conclude that the sample does not provide us sufficient evidence against
the claim, so we may assume that the company’s claim that the average life of a
bulb is 1200 hours is true.
Note 2: Here, we note that the decisions about the null hypothesis based on the three
approaches (critical value or classical, p-value and confidence interval) are the
same. Learners are advised to make the decision about the claim or
statement by using only one of the three approaches in the examination. Here,
we used all three approaches only to give you an idea of how they can be used in
a given problem. Those learners who opt for the biostatistics specialisation will
see and realise the importance of the confidence interval approach in Unit 16 of
MSTE-004.
Example 2: A manufacturer of ball point pens claims that a certain pen
manufactured by him has a mean writing-life of at least 460 A-4 size pages. A
purchasing agent selects a sample of 100 pens and puts them on test. The
mean writing-life of the sample was found to be 453 A-4 size pages with a standard
deviation of 25 A-4 size pages. Should the purchasing agent reject the
manufacturer’s claim at 1% level of significance?
Solution: Here, we are given that
Specified value of population mean = µ0 = 460,
Sample size = n = 100,
Sample mean = X̄ = 453,
Sample standard deviation = S = 25
Here, we want to test the manufacturer’s claim that the mean writing-life (µ) of
the pen is at least 460 A-4 size pages. So our claim is µ ≥ 460 and its complement
is µ < 460. Since the claim contains the equality sign, we take the claim as
the null hypothesis and the complement as the alternative hypothesis. So
H0 : µ ≥ µ0 = 460 and H1 : µ < 460
Since the p-value (= 0.0026) is less than α (= 0.01), we reject the null hypothesis
at 1% level of significance.
Therefore, we conclude that the sample provides us sufficient evidence against
the claim, so the purchasing agent rejects the manufacturer’s claim at 1% level
of significance.
Now, you can try the following exercises.
E4) A sample of 900 bolts has a mean length of 3.4 cm. Can the sample be regarded
as taken from a large population of bolts with mean length 3.25 cm
and standard deviation 2.61 cm at 5% level of significance?
E5) A big company uses thousands of CFL lights every year. The brand that
the company has been using in the past has an average life of 1200 hours. A
new brand is offered to the company at a price lower than they are paying
for the old brand. Consequently, a sample of 100 CFL lights of the new brand
is tested, which yields an average life of 1220 hours with a standard
deviation of 90 hours. Should the company accept the new brand at 5% level
of significance?
and
Var(X̄ − Ȳ) = Var(X̄) + Var(Ȳ) = σ1²/n1 + σ2²/n2
But we know that standard error = √(Variance)
SE(X̄ − Ȳ) = √(Var(X̄ − Ȳ)) = √(σ1²/n1 + σ2²/n2)   … (4)
Now, follow the same procedure as we have discussed in Section 10.2, that is,
first of all we have to set up the null and alternative hypotheses. Here, we want to
test the hypothesis about the difference of two population means, so we can take
the null hypothesis as
H0 : µ1 = µ2 (no difference in means)
[Here, θ1 = µ1 and θ2 = µ2 if we compare it with the general procedure.]
or H0 : µ1 − µ2 = 0 (difference in two means is 0)
and the alternative hypothesis as
H1 : µ1 ≠ µ2 for two-tailed test
or
H0 : µ1 ≤ µ2 and H1 : µ1 > µ2
or H0 : µ1 ≥ µ2 and H1 : µ1 < µ2 for one-tailed test
For testing the null hypothesis, the test statistic Z is given by
Z = [(X̄ − Ȳ) − E(X̄ − Ȳ)] / SE(X̄ − Ȳ)
or Z = [(X̄ − Ȳ) − (µ1 − µ2)] / √(σ1²/n1 + σ2²/n2)   [using equations (3) and (4)]
Since under the null hypothesis we assume that µ1 = µ2, we have
Z = (X̄ − Ȳ) / √(σ1²/n1 + σ2²/n2)
Now, the sampling distribution of the test statistic depends upon whether σ1² and σ2²
are known or unknown. Therefore, four cases arise:
Case I: When σ1² & σ2² are known and σ1² = σ2² = σ²
In this case, the test statistic follows the normal distribution with mean
0 and variance unity when the sample sizes are large, whether the
populations under study are normal or non-normal. But when the
sample sizes are small, the test statistic Z follows the normal
distribution only when the populations under study are normal, that is,
Z = (X̄ − Ȳ) / (σ √(1/n1 + 1/n2)) ~ N(0, 1)
Case II: When σ1² & σ2² are known and σ1² ≠ σ2²
In this case, the test statistic also follows the normal distribution as
described in Case I, that is,
Z = (X̄ − Ȳ) / √(σ1²/n1 + σ2²/n2) ~ N(0, 1)
Case III: When σ1² & σ2² are unknown and σ1² = σ2² = σ²
In this case, σ1² & σ2² are estimated by the values of the sample
variances S1² & S2² respectively, and the exact distribution of the test
statistic is difficult to derive. But when the sample sizes n1 and n2 are
large (> 30), then by the central limit theorem the test statistic is
approximately normally distributed with mean 0 and variance unity,
that is,
Z = (X̄ − Ȳ) / √(S1²/n1 + S2²/n2) ~ N(0, 1)
After that, we calculate the value of the test statistic and compare it with the
critical value given in Table 10.1 at the prefixed level of significance α. Take the
decision about the null hypothesis as described in Section 10.2.
From the above discussion of testing of hypothesis about population means, we note
the following points:
(i) When σ1² & σ2² are known, then we apply the Z-test whether the populations
under study are normal or non-normal, provided the samples are large. But when the
sample sizes are small, then we apply the Z-test only when the populations
under study are normal.
(ii) When σ1² & σ2² are unknown, then we apply the t-test only when the
populations under study are normal, whether the sample sizes are large or small.
But when the assumption of normality is not fulfilled and the sample sizes
are large, then we can apply the Z-test.
(iii) When the samples are small, σ1² & σ2² are known or unknown, and the
form of the populations is not known, then we apply a non-parametric
test, as will be discussed in Block 4 of this course.
Let us do some examples based on the above test.
Example 3: In two samples of women from Punjab and Tamilnadu, the mean
heights of 1000 and 2000 women are 67.6 and 68.0 inches respectively. If the
population standard deviations of Punjab and Tamilnadu are the same and equal to
5.5 inches, then can the mean heights of Punjab and Tamilnadu women be
regarded as the same at 1% level of significance?
Solution: We are given
n1 = 1000, n2 = 2000, X̄ = 67.6, Ȳ = 68.0 and σ1 = σ2 = σ = 5.5
Here, we wish to test that the mean heights of Punjab and Tamilnadu women are the
same. If µ1 and µ2 denote the mean heights of Punjab and Tamilnadu women
respectively, then our claim is µ1 = µ2 and its complement is µ1 ≠ µ2. Since the
claim contains the equality sign, we take the claim as the null hypothesis
and the complement as the alternative hypothesis. Thus,
H0 : µ1 = µ2 and H1 : µ1 ≠ µ2
Since the alternative hypothesis is two-tailed, the test is a two-tailed test.
Here, we want to test the hypothesis regarding two population means. The
standard deviations of both populations are known and the sample sizes are large,
so we should go for the Z-test.
So, for testing the null hypothesis, the test statistic Z is given by
Z = (X̄ − Ȳ) / √(σ1²/n1 + σ2²/n2)
  = (67.6 − 68.0) / √((5.5)²/1000 + (5.5)²/2000)
  = −0.4 / (5.5 √(1/1000 + 1/2000))
  = −0.4 / (5.5 × 0.0387) = −1.88
The critical (tabulated) values for two-tailed test at 1% level of significance are
± zα/2 = ± z0.005 = ± 2.58.
Since the calculated value of Z (= −1.88) is greater than the critical value
(= −2.58) and less than the critical value (= 2.58), it lies in the non-
rejection region as shown in Fig. 10.6, so we do not reject the null hypothesis,
i.e. we support the claim.
Decision according to p-value:
The test is two-tailed, therefore,
p-value = 2P[Z ≥ |z|] = 2P[Z ≥ 1.88] = 2 × 0.0301 = 0.0602
Since the p-value (= 0.0602) is greater than α (= 0.01), we do not reject the null
hypothesis at 1% level of significance.
Thus, we conclude that the samples do not provide us sufficient evidence
against the claim, so we may assume that the average heights of women of
Punjab and Tamilnadu are the same.
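Example 3 can be verified with a short script (standard library only; the variable names are ours):

```python
from statistics import NormalDist

# Example 3: two-sample Z-test with a known common SD (sigma = 5.5).
n1, n2, x_bar, y_bar, sigma = 1000, 2000, 67.6, 68.0, 5.5
z = (x_bar - y_bar) / (sigma * (1 / n1 + 1 / n2) ** 0.5)
p = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed p-value
print(round(z, 2))    # -1.88, between -2.58 and 2.58: do not reject H0
print(round(p, 2))    # about 0.06, greater than alpha = 0.01
```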
Example 4: A university conducts both face-to-face and distance-mode classes
for a particular course, intended to be identical. A sample of 50 students of the
face-to-face mode yields the mean and SD of examination results respectively as:
X̄ = 80.4, S1 = 12.8
and another sample of 100 distance-mode students yields the mean and SD of their
examination results in the same course respectively as:
Ȳ = 74.3, S2 = 20.5
Are both educational methods statistically equal at 5% level?
Solution: Here, we are given that
n1 = 50, X̄ = 80.4, S1 = 12.8;
p-value = 2P[Z ≥ |z|] = 2P[Z ≥ 2.23] = 2 × 0.0129 = 0.0258
Since the p-value (= 0.0258) is less than α (= 0.05), we reject the null hypothesis
at 5% level of significance.
Thus, we conclude that the samples provide us sufficient evidence against the
claim, so both methods of education, i.e. face-to-face and distance mode, are
not statistically equal.
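The test statistic computation for Example 4 is not reproduced in the text above; a sketch of the missing arithmetic (Case III: large samples, unknown variances) is:

```python
from statistics import NormalDist

# Example 4: two-sample Z-test with unknown variances, large samples.
n1, x_bar, s1 = 50, 80.4, 12.8
n2, y_bar, s2 = 100, 74.3, 20.5
z = (x_bar - y_bar) / (s1 ** 2 / n1 + s2 ** 2 / n2) ** 0.5
p = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2))    # 2.23, outside (-1.96, 1.96): reject H0
print(round(p, 3))    # about 0.026, in line with the quoted p-value 0.0258
```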
Now, you can try the following exercises.
E6) Two brands of electric bulbs are quoted at the same price. A buyer
tested a random sample of 200 bulbs of each brand and found the
following information:
            Mean Life (hrs.)   SD (hrs.)
Brand A          1300              41
Brand B          1280              46
Is there a significant difference in the mean lives of the two
brands of electric bulbs at 1% level of significance?
E7) Two research laboratories have identically produced drugs that provide
relief to BP patients. The first drug was tested on a group of 50 BP
patients and produced an average of 8.3 hours of relief with a standard
deviation of 1.2 hours. The second drug was tested on 100 patients,
producing an average of 8.0 hours of relief with a standard deviation of
1.5 hours. Does the first drug provide a significantly longer period of relief
at a significance level of 5%?
Case I: When the sample size is not sufficiently large, i.e. either of the conditions
np > 5 or nq > 5 is not met, then we use the exact binomial test. But the exact
binomial test is beyond the scope of this course.
Case II: When the sample size is sufficiently large, such that np > 5 and nq > 5,
then by the central limit theorem, the sampling distribution of the sample proportion p
is approximately normal with mean and variance as
E(p) = P and Var(p) = PQ/n   … (5)
But we know that standard error = √(Variance)
SE(p) = √(PQ/n)   … (6)
Now, follow the same procedure as we have discussed in Section 10.2: first of
all we set up the null and alternative hypotheses. Since here we want to test the
hypothesis about a specified value P0 of the population proportion, we can take
the null and alternative hypotheses as
H0 : P = P0 and H1 : P ≠ P0 for two-tailed test
[Here, θ = P and θ0 = P0 if we compare it with the general procedure.]
or
H0 : P ≤ P0 and H1 : P > P0
or H0 : P ≥ P0 and H1 : P < P0 for one-tailed test
For testing the null hypothesis, the test statistic Z is given by
Z = [p − E(p)] / SE(p)
Z = (p − P0) / √(P0Q0/n) ~ N(0, 1) under H0   [using equations (5) and (6)]
After that, we calculate the value of test statistic and compare it with the
critical value(s) given in Table 10.1 at prefixed level of significance α. Take
the decision about the null hypothesis as described in Section 10.2.
Let us do some examples of testing of hypothesis about population proportion.
Example 5: A machine produces a large number of items, out of which 25%
are found to be defective. To check this, the company manager takes a random
sample of 100 items and finds 35 items defective. Is there evidence of
deterioration of quality at 5% level of significance?
Solution: The company manager wants to check whether his machine produces
more than 25% defective items. Here, the attribute under study is defectiveness, and we
define success and failure as getting a defective or a non-defective item respectively.
Let P = population proportion of defective items = 0.25 (= P0)
and p = observed proportion of defective items in the sample = 35/100 = 0.35
Here, we want to test that the machine produces more defective items, that is, the
proportion of defective items (P) is greater than 0.25. So our claim is P > 0.25
and its complement is P ≤ 0.25. Since the complement contains the equality sign,
we take the complement as the null hypothesis and the claim as the
alternative hypothesis. So
H0 : P ≤ P0 = 0.25 and H1 : P > 0.25
p-value = P[Z ≥ z] = P[Z ≥ 2.31]
 = 0.0104
Since the p-value (= 0.0104) is less than α (= 0.05), we reject the null
hypothesis at 5% level of significance.
Thus, we conclude that the sample provides us sufficient evidence in favour of
the claim, so we may assume that deterioration in quality exists at 5%
level of significance.
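A sketch of Example 5’s right-tailed proportion test (the z value 2.31 quoted above) in Python:

```python
from statistics import NormalDist

# Example 5: right-tailed test for a single proportion.
n, p_hat, p0 = 100, 0.35, 0.25
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5
p_value = 1 - NormalDist().cdf(z)     # right-tail probability
print(round(z, 2))        # 2.31, greater than z_0.05 = 1.645: reject H0
print(round(p_value, 3))  # about 0.010
```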
Example 6: A die is thrown 9000 times and a throw of 2 or 5 is observed 3100
times. Can we regard the die as unbiased at 5% level of significance?
Solution: Let getting a 2 or 5 be our success, and getting a number other than 2
or 5 be a failure. Then, in the usual notation, we have
n = 9000, X = number of successes = 3100, p = 3100/9000 = 0.3444
Here, we want to test that the die is unbiased, and we know that if the die is
unbiased then the proportion or probability of getting a 2 or 5 is
P = Probability of getting a 2 or 5
  = Probability of getting 2 + Probability of getting 5
  = 1/6 + 1/6 = 1/3 = 0.3333
So our claim is P = 0.3333 and its complement is P ≠ 0.3333. Since the claim
contains the equality sign, we take the claim as the null hypothesis and the
complement as the alternative hypothesis. Thus,
H0 : P = P0 = 0.3333 and H1 : P ≠ 0.3333
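The remaining arithmetic of Example 6 is not shown in the text above; the sketch below is our completion of it (using the exact fraction 1/3 for P0), not the unit’s own working:

```python
from statistics import NormalDist

# Example 6: two-tailed test for a single proportion.
n = 9000
p_hat = 3100 / n                 # 0.3444
p0 = 1 / 3                       # proportion under H0 (unbiased die)
z = (p_hat - p0) / (p0 * (1 - p0) / n) ** 0.5
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2))       # about 2.24, outside (-1.96, 1.96): reject H0
print(round(p_value, 3)) # about 0.025, below alpha = 0.05
```

So the die would not be regarded as unbiased at the 5% level.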
p1 ~ N(P1, P1Q1/n1) and p2 ~ N(P2, P2Q2/n2)
That is,
p1 − p2 ~ N(P1 − P2, P1Q1/n1 + P2Q2/n2)
Now, follow the same procedure as we have discussed in Section 10.2: first of
all we have to set up the null and alternative hypotheses. Here, we want to test the
hypothesis about the difference of two population proportions, so we can take
the null hypothesis as
H0 : P1 = P2 (no difference in proportions)
[Here, θ1 = P1 and θ2 = P2 if we compare it with the general procedure.]
and the alternative hypothesis as
H1 : P1 ≠ P2 for two-tailed test
or
H0 : P1 ≤ P2 and H1 : P1 > P2
or H0 : P1 ≥ P2 and H1 : P1 < P2 for one-tailed test
or Z = [(p1 − p2) − (P1 − P2)] / √(P1Q1/n1 + P2Q2/n2)   [using equations (7) and (8)]
Since under the null hypothesis we assume that P1 = P2 = P, we have
Z = (p1 − p2) / √(PQ(1/n1 + 1/n2))
where Q = 1 − P.
Generally, P is unknown; it is then estimated by the value of the pooled proportion
P̂, where
P̂ = (n1p1 + n2p2)/(n1 + n2) = (X1 + X2)/(n1 + n2) and Q̂ = 1 − P̂
After that, we calculate the value of test statistic and compare it with the
critical value(s) given in Table 10.1 at prefixed level of significance α. Take
the decision about the null hypothesis as described in Section 10.2.
Now, it is time to do some examples of testing of hypothesis about the
difference of two population proportions.
Example 7: In a random sample of 100 persons from town A, 60 are found to
be high consumers of wheat. In another sample of 80 persons from town B, 40
are found to be high consumers of wheat. Do these data reveal a significant
difference between the proportions of high wheat consumers in town A and
town B ( at α = 0.05 )?
Solution: Here, the attribute under study is high consumption of wheat, and we
define success and failure as a person being a high consumer of wheat
and not a high consumer of wheat respectively.
We are given that
n1 = total number of persons in the sample from town A = 100
n2 = total number of persons in the sample from town B = 80
X1 = number of high consumers of wheat in the sample from town A = 60
X2 = number of high consumers of wheat in the sample from town B = 40
The sample proportion of high wheat consumers in town A is
p1 = X1/n1 = 60/100 = 0.60
and the sample proportion of high wheat consumers in town B is
p2 = X2/n2 = 40/80 = 0.50
Here, we want to test that the proportions of high consumers of wheat in the two
towns, say P1 and P2, are not the same. So our claim is P1 ≠ P2 and its complement
is P1 = P2. Since the complement contains the equality sign, we take the
complement as the null hypothesis and the claim as the alternative hypothesis.
Thus,
H0 : P1 = P2 = P and H1 : P1 ≠ P2
Before proceeding further, we first check whether the condition of normality is met:
n1p1 = 100 × 0.60 = 60 > 5, n1q1 = 100 × 0.40 = 40 > 5
n2p2 = 80 × 0.50 = 40 > 5, n2q2 = 80 × 0.50 = 40 > 5
We see that the condition of normality is met, so we can go for the Z-test.
The estimate of the combined proportion (P) of high wheat consumers in the two
towns is given by
P̂ = (n1p1 + n2p2)/(n1 + n2) = (X1 + X2)/(n1 + n2) = (60 + 40)/(100 + 80) = 5/9
Q̂ = 1 − P̂ = 1 − 5/9 = 4/9
For testing the null hypothesis, the test statistic Z is given by
Z = (p1 − p2) / √(P̂Q̂(1/n1 + 1/n2))
  = (0.60 − 0.50) / √((5/9)(4/9)(1/100 + 1/80))
  = 0.10/0.0745 = 1.34
The critical values for two-tailed test at 5% level of significance are ± zα/2
= ± z0.025 = ±1.96.
Since the calculated value of Z (= 1.34) is less than the critical value (= 1.96) and
greater than the critical value (= −1.96), the calculated value of Z lies in the
non-rejection region, so we do not reject the null hypothesis and reject the
alternative hypothesis, i.e. we reject the claim.
Decision according to p-value:
Since the test is two-tailed, therefore
p-value = 2P[Z ≥ |z|] = 2P[Z ≥ 1.34] = 2 × 0.0901 = 0.1802
Since the p-value (= 0.1802) is greater than α (= 0.05), we do not reject the null
hypothesis at 5% level of significance.
Thus, we conclude that the samples provide us sufficient evidence against
the claim, so we may assume that the proportion of high consumers of wheat in the
two towns A and B is the same.
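Example 7’s pooled two-proportion test can be cross-checked as follows (standard library only; the variable names are ours):

```python
from statistics import NormalDist

# Example 7: pooled two-proportion Z-test.
n1, x1, n2, x2 = 100, 60, 80, 40
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)        # 5/9
q_pool = 1 - p_pool                   # 4/9
z = (p1 - p2) / (p_pool * q_pool * (1 / n1 + 1 / n2)) ** 0.5
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(z, 2))        # 1.34, inside (-1.96, 1.96): do not reject H0
print(round(p_value, 2))  # about 0.18
```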
Example 8: A machine produced 60 defective articles in a batch of 400. After
overhauling, it produced 30 defectives in a batch of 300. Has the machine
improved due to overhauling? (Take α = 0.01.)
Solution: Here, the machine produces articles and the attribute under study is
defectiveness. We define success and failure as getting a defective or a
non-defective article respectively. Therefore, we are given that
X1 = number of defective articles produced by the machine before overhauling
= 60
X2 = number of defective articles produced by the machine after overhauling
= 30
and n1 = 400, n2 = 300,
Let p1 = observed proportion of defective articles in the sample before
overhauling
  = X1/n1 = 60/400 = 0.15
and p2 = observed proportion of defective articles in the sample after
overhauling
  = X2/n2 = 30/300 = 0.10
Here, we want to test that the machine has improved due to overhauling, which means
the proportion of defective articles is smaller after overhauling. If P1 and P2 denote the
proportions of defectives before and after overhauling the machine, then our claim
is P1 > P2 and its complement is P1 ≤ P2. Since the complement contains the
equality sign, we take the complement as the null hypothesis and the claim
as the alternative hypothesis. Thus,
H0 : P1 ≤ P2 and H1 : P1 > P2
Since the alternative hypothesis is right-tailed, the test is a right-tailed test.
Since P is unknown, the pooled estimate of the proportion is given by
P̂ = (X1 + X2)/(n1 + n2) = (60 + 30)/(400 + 300) = 90/700 = 9/70 and
Q̂ = 1 − P̂ = 1 − 9/70 = 61/70.
Before proceeding further, we first check whether the condition of normality is met:
n1p1 = 400 × 0.15 = 60 > 5, n1q1 = 400 × 0.85 = 340 > 5
n2p2 = 300 × 0.10 = 30 > 5, n2q2 = 300 × 0.90 = 270 > 5
We see that the condition of normality is met, so we can go for the Z-test.
For testing the null hypothesis, the test statistic is given by
Z = (p1 − p2) / √(P̂Q̂(1/n1 + 1/n2))
  = (0.15 − 0.10) / √((9/70)(61/70)(1/400 + 1/300))
  = 0.05/0.0256 = 1.95
The critical value for right-tailed test at 1% level of significance is
zα = z0.01 = 2.33.
Since the calculated value of Z (= 1.95) is less than the critical value (= 2.33), the
calculated value of Z lies in the non-rejection region, so we do not reject the
null hypothesis and reject the alternative hypothesis, i.e. we reject the claim at
1% level of significance.
Decision according to p-value:
Since the test is right-tailed, therefore,
p-value = P[Z ≥ z] = P[Z ≥ 1.95]
 = 0.5 − P[0 ≤ Z ≤ 1.95] = 0.5 − 0.4744 = 0.0256
Since the p-value (= 0.0256) is greater than α (= 0.01), we do not reject the null
hypothesis at 1% level of significance.
Thus, we conclude that the sample provides us sufficient evidence against the
claim, so the machine has not improved after overhauling.
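Example 8 can be replayed the same way. Note that carrying full precision gives z closer to 1.96 than the unit’s 1.95 (the unit rounds the standard error to 0.0256 first), but the decision at α = 0.01 is the same either way:

```python
# Example 8: right-tailed pooled two-proportion Z-test.
n1, x1, n2, x2 = 400, 60, 300, 30
p1, p2 = x1 / n1, x2 / n2             # 0.15 and 0.10
p_pool = (x1 + x2) / (n1 + n2)        # 9/70
q_pool = 1 - p_pool                   # 61/70
z = (p1 - p2) / (p_pool * q_pool * (1 / n1 + 1 / n2)) ** 0.5
print(round(z, 2))   # about 1.96, still below z_0.01 = 2.33: do not reject H0
```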
Now, you can try the following exercises.
E10) The proportions of literates in two districts A and B are compared.
Out of 100 persons selected at random from each
district, 50 from district A and 40 from district B are found to be literate. Test
whether the proportions of literate persons in the two districts A and B are
the same at 1% level of significance.
E11) In a large population, 30% of a random sample of 1200 persons had blue
eyes, and 20% of a random sample of 900 persons had blue eyes in another
population. Test whether the proportion of blue-eyed persons is the same in the
two populations at 5% level of significance.
SE(S²) = √(Var(S²)) = √(2σ⁴/n) = σ² √(2/n)   … (10)
The general procedure of this test is explained next.
As we have been doing so far in all tests, the first step in hypothesis testing problems is to
set up the null and alternative hypotheses. Here, we want to test the hypothesis about a
specified value σ0² of the population variance σ², so we can take our null and
alternative hypotheses as
and
Var(S1² − S2²) = Var(S1²) + Var(S2²) = 2σ1⁴/n1 + 2σ2⁴/n2
But we know that standard error = √(Variance)
SE(S1² − S2²) = √(Var(S1² − S2²)) = √(2σ1⁴/n1 + 2σ2⁴/n2)   … (12)
Now, follow the same procedure as we have discussed in Section 10.2, that is,
first of all we have to set up the null and alternative hypotheses. Here, we want to
test the hypothesis about the two population variances, so we can take our null
and alternative hypotheses as
H0 : σ1² = σ2² = σ² and H1 : σ1² ≠ σ2² for two-tailed test
or
H0 : σ1² ≤ σ2² and H1 : σ1² > σ2²
or H0 : σ1² ≥ σ2² and H1 : σ1² < σ2² for one-tailed test
or Z = [(S1² − S2²) − (σ1² − σ2²)] / √(2σ1⁴/n1 + 2σ2⁴/n2)   [using equations (11) and (12)]
Since under the null hypothesis we assume that σ1² = σ2² = σ², we have
Z = (S1² − S2²) / (σ² √(2(1/n1 + 1/n2))) ~ N(0, 1)
Generally, the population variances σ1² and σ2² are unknown, so we estimate them
by their corresponding sample variances S1² and S2², that is,
σ̂1² = S1² and σ̂2² = S2²
Thus, the test statistic Z is given by
Z = (S1² − S2²) / √(2S1⁴/n1 + 2S2⁴/n2) ~ N(0, 1)
After that, we calculate the value of the test statistic, as the case may be, and
compare it with the critical value given in Table 10.1 at the prefixed level of
significance α. Take the decision about the null hypothesis as described in
Section 10.2.
Note 3: When the populations under study are normal, then for testing the
hypothesis about the equality of population variances we use the F-test, which will be
discussed in Unit 12 of this course. Whereas when the form of the populations
under study is not known and the sample sizes are large, we apply the Z-test as
discussed above.
Now, it is time to do an example based on the above test.
Example 10: A comparative study of variation in weights (in pounds) of Army
soldiers and Navy sailors was made. The sample variance of the weights of 120
soldiers was 60 pound² and the sample variance of the weights of 160 sailors
was 70 pound². Test whether the soldiers and sailors have equal variation in
their weights. Use 5% level of significance.
Solution: We want to test that the Army soldiers and Navy sailors have equal variation
in their weights. If σ1² and σ2² denote the variances in the weights of Army
soldiers and Navy sailors respectively, then our claim is σ1² = σ2² and its complement is
σ1² ≠ σ2². Since the claim contains the equality sign, we take the claim as
the null hypothesis and the complement as the alternative hypothesis. Thus,
H0 : σ1² = σ2² and H1 : σ1² ≠ σ2²
Since the population variances are unknown, for testing the null hypothesis the
test statistic Z is given by
Z = (S1² − S2²) / √(2S1⁴/n1 + 2S2⁴/n2)
  = (60 − 70) / √(2(60)²/120 + 2(70)²/160)
  = −10 / √(60.0 + 61.25) = −10/11.01 = −0.91
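Example 10’s large-sample test for two variances can be checked numerically (variable names ours):

```python
# Example 10: large-sample Z-test for equality of two variances.
n1, s1_sq = 120, 60.0     # soldiers: sample size and sample variance
n2, s2_sq = 160, 70.0     # sailors: sample size and sample variance
se = (2 * s1_sq ** 2 / n1 + 2 * s2_sq ** 2 / n2) ** 0.5   # about 11.01
z = (s1_sq - s2_sq) / se
print(round(z, 2))   # -0.91, inside (-1.96, 1.96): do not reject H0
```

Since |Z| < 1.96, the hypothesis of equal variation is not rejected at the 5% level.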
E13) Two sources of raw materials for bulbs are under consideration by a bulb
manufacturing company. Both sources seem to have similar
characteristics, but the company is not sure about their respective
uniformity. A sample of 52 lots from source A yields a variance of 25 and a
sample of 40 lots from source B yields a variance of 12. Test whether the
variance of source A differs significantly from the variance of source B at
α = 0.05.
We now end this unit by giving a summary of what we have covered in it.
10.9 SUMMARY
In this unit we have covered the following points:
1. How to judge a given situation whether we should go for large sample test
or not.
2. Applying the Z-test for testing the hypothesis about the population mean
and difference of two population means.
3. Applying the Z-test for testing the hypothesis about the population
proportion and difference of two population proportions.
4. Applying the Z-test for testing the hypothesis about the population variance
and two population variances.
p-value = 2P[Z ≥ |z|] = 2P[Z ≥ 2.42]
n2 = 200, Ȳ = 1280, S2 = 46
Here, we want to test whether there is a significant difference in the mean
lives of the two brands of electric bulbs. If µ1 and µ2
denote the mean lives of the two brands of electric bulbs respectively, then
our claim is µ1 ≠ µ2 and its complement is µ1 = µ2. Since the
complement contains the equality sign, we take the complement
as the null hypothesis and the claim as the alternative hypothesis. Thus,
H0 : µ1 = µ2 and H1 : µ1 ≠ µ2
Since the alternative hypothesis is two-tailed, the test is a two-tailed
test.
We want to test the null hypothesis regarding the equality of two
population means. The standard deviations of both populations are
unknown, so we should go for the t-test if the populations are known
to be normal. But that is not the case here. Since the sample sizes are large (n1 and
n2 > 30), we go for the Z-test.
So for testing the null hypothesis, the test statistic Z is given by
Z = (X̄ − Ȳ) / √(S1²/n1 + S2²/n2)
  = (1300 − 1280) / √((41)²/200 + (46)²/200)
  = 20 / √(8.41 + 10.58) = 20/4.36 = 4.59
The critical (tabulated) values for two-tailed test at 1% level of
significance are ± zα/2 = ± z0.005 = ± 2.58.
Since the calculated value of the test statistic Z (= 4.59) is greater than the
critical value (= 2.58), it lies in the rejection region, so we
reject the null hypothesis and support the alternative hypothesis, i.e. we
support the claim at 1% level of significance.
Thus, we conclude that the samples provide us sufficient evidence in favour of
the claim, so there is a significant difference in the mean lives
of the two brands of electric bulbs.
E7) Given that
n1 = 50, X̄ = 8.3, S1 = 1.2;
Since the alternative hypothesis is right-tailed, the test is a right-tailed
test.
Before proceeding further, we first check whether the condition
of normality is met:
np = 200 × 0.9 = 180 > 5
Z = (S² − σ0²) / (σ0² √(2/n)) = ((7)² − (6)²) / (36 √(2/120))
  = 13/(36 × 0.129) = 13/4.64 = 2.80
The critical values for two-tailed test at 5% level of significance are
± zα/2 = ± z0.025 = ±1.96.
Since the calculated value of Z (= 2.80) is greater than the critical values
(= ±1.96), it lies in the rejection region, so we reject the null
hypothesis, i.e. we reject our claim at 5% level of significance.
Thus, we conclude that the sample provides us sufficient evidence against
the claim, so the standard deviation of the life of bulbs of the lot is not 6.0
hours.
E13) Here, we are given that
n1 = 52, S1² = 25
n2 = 40, S2² = 12
Here, we want to test whether the variance of source A differs significantly
from the variance of source B. If σ1² and σ2² denote the variances in the raw
materials of sources A and B respectively, then our claim is σ1² ≠ σ2² and
its complement is σ1² = σ2². Since the complement contains the equality
sign, we take the complement as the null hypothesis and the
claim as the alternative hypothesis. Thus,
H0 : σ1² = σ2² and H1 : σ1² ≠ σ2²
Since the alternative hypothesis is two-tailed, the test is a two-tailed
test.
Here, the distributions of the populations under study are not known and the
sample sizes are large (n1 = 52 > 30, n2 = 40 > 30), so we can go for the Z-
test.
Since the population variances are unknown, for testing the null
hypothesis the test statistic Z is given by
Z = (S1² − S2²) / √(2S1⁴/n1 + 2S2⁴/n2)
  = (25 − 12) / √(2(25)²/52 + 2(12)²/40)
  = 13 / √(24.04 + 7.20) = 13/5.59 = 2.33
The critical values for two-tailed test at 5% level of significance are
± zα/2 = ± z0.025 = ±1.96.
Since the calculated value of Z (= 2.33) is greater than the critical values (= ±1.96),
it lies in the rejection region, so we reject the null hypothesis and
support the alternative hypothesis, i.e. we support our claim at 5% level of
significance.
Thus, we conclude that the samples provide us sufficient evidence in favour of
the claim, so the variance of source A differs significantly from the variance of source
B.
UNIT 11 SMALL SAMPLE TESTS
Structure
11.1 Introduction
Objectives
11.2 General Procedure of t-Test for Testing a Hypothesis
11.3 Testing of hypothesis for Population Mean Using t-Test
11.4 Testing of Hypothesis for Difference of Two Population Means Using
t-Test
11.5 Paired t-Test
11.6 Testing of Hypothesis for Population Correlation Coefficient Using
t-Test
11.7 Summary
11.8 Solutions /Answers
11.1 INTRODUCTION
In previous unit, we have discussed the testing of hypothesis for large samples
in details. Recall that throughout the unit, we were making an assumption that
“if sample size is sufficiently large then test statistic follows approximately
standard normal distribution”. Also recall two points highlighted in this course,
i.e.
Cost of our study increases as sample size increases.
Sometime nature of the units in the population under study is such that they
destroyed under investigation.
If there are limited recourses in terms of money then first point listed above
force us not to go for large sample size when items /units under study are very
costly such as airplane, computer, etc. Second point listed above give an alarm
for not to go for large sample if population units are destroyed under
investigation.
So, we need an alternative technique to test the hypothesis based
on small sample(s). Small sample tests do this job for us. But in return they
demand one basic assumption: the population under study should be normal, as
you will see when you go through the unit. The t, χ2 and F-tests are some
commonly used small sample tests.
In this unit, we will discuss the t-test in detail, which is based on the t-distribution
described in Unit 3 of this course. The χ2 and F-tests will be discussed in the next
unit; they are based on the χ2 and F-distributions described in Unit 3 and Unit 4 of
this course respectively.
This unit is divided into eight sections. Section 11.1 describes the need for
small sample tests. The general procedure of the t-test for testing a hypothesis is
described in Section 11.2. In Section 11.3, we discuss testing of hypothesis for the
population mean using the t-test. Testing of hypothesis for the difference of two
population means when samples are independent is described in Section 11.4,
whereas in Section 11.5, the paired t-test for the difference of two population
means when samples are dependent (paired) is discussed. In Section 11.6,
testing of hypothesis for the population correlation coefficient is explained. The unit
ends by providing a summary of what we have discussed in this unit in Section
11.7 and solutions to the exercises in Section 11.8.
Before moving further, a humble suggestion: please revise what you
have learned in the previous two units. The concepts discussed there will help you
a lot in better understanding the concepts discussed in this unit.
Objectives
After studying this unit, you should be able to:
realize the importance of small sample tests;
know the procedure of t-test for testing a hypothesis;
describe the testing of hypothesis for a population mean using the t-test;
explain the testing of hypothesis for difference of two population means
when samples are independent using t-test;
describe the procedure for paired t-test for testing of hypothesis for
difference of two population means when samples are dependent or paired;
and
explain the testing of hypothesis for population correlation coefficient using
t-test.
Step II: After setting the null and alternative hypotheses, our next step is to
decide a criterion for rejection or non-rejection of the null
hypothesis, i.e. decide the level of significance α at which we want to
test our null hypothesis. We generally take α = 5% or 1%.
Step III: The third step is to determine an appropriate test statistic, say t,
for testing the null hypothesis. Suppose Tn is the sample statistic
(it may be the sample mean, sample correlation coefficient, etc.,
depending upon θ) for the parameter θ; then the test statistic t is
given by
t = (Tn − E(Tn)) / SE(Tn)
Case I: When H0: θ = θ0 and H1: θ > θ0 (right-tailed test)
In this case, the rejection (critical) region falls under the right tail of
the probability curve of the sampling distribution of test statistic t.
Suppose t(ν),α is the critical value at α level of significance; then the
entire region greater than or equal to t(ν),α is the rejection region and
less than t(ν),α is the non-rejection region, as shown in Fig. 11.1.
If tcal ≥ t(ν),α, the calculated value of test statistic t lies in the
rejection (critical) region (see Fig. 11.1), so we reject the null
hypothesis H0 at α level of significance. Therefore, we conclude that the
sample data provide us sufficient evidence against the null hypothesis and
there is a significant difference between the hypothesized value and the
observed value of the parameter.
If tcal < t(ν),α, the calculated value of test statistic t lies in the
non-rejection region, so we do not reject the null hypothesis H0 at α
level of significance. Therefore, we conclude that the sample data fail to
provide us sufficient evidence against the null hypothesis and the
difference between the hypothesized value and the observed value of the
parameter is due to fluctuation of sampling.
Case II: When H0: θ = θ0 and H1: θ < θ0 (left-tailed test)
In this case, the rejection (critical) region falls under the left tail of the
probability curve of the sampling distribution of test statistic t.
Suppose −t(ν),α is the critical value at α level of significance; then the
entire region less than or equal to −t(ν),α is the rejection region and
greater than −t(ν),α is the non-rejection region, as shown in Fig. 11.2.
If tcal ≤ −t(ν),α, the calculated value of test statistic t lies in the
rejection (critical) region, so we reject the null hypothesis H0 at α
level of significance.
If tcal > −t(ν),α, the calculated value of test statistic t lies in the
non-rejection region, so we do not reject the null hypothesis H0 at α
level of significance (see Fig. 11.2).
In case of two-tailed test, that is, when H0: θ = θ0 and H1: θ ≠ θ0:
In this case, the rejection region falls under both tails of the
probability curve of the sampling distribution of the test statistic t. Half
the area (α), i.e. α/2, will lie under the left tail and the other half under
the right tail. Suppose −t(ν),α/2 and t(ν),α/2 are the two critical values on
the left and right tails respectively. Therefore, the entire region less than
or equal to −t(ν),α/2 together with the region greater than or equal to
t(ν),α/2 forms the rejection region, and the region between −t(ν),α/2 and
t(ν),α/2 is the non-rejection region, as shown in Fig. 11.3.
If tcal ≥ t(ν),α/2 or tcal ≤ −t(ν),α/2, the calculated value of test
statistic t lies in the rejection (critical) region, so we reject the null
hypothesis H0 at α level of significance.
And if −t(ν),α/2 < tcal < t(ν),α/2, the calculated value of test
statistic t lies in the non-rejection region, so we do not reject the
null hypothesis H0 at α level of significance.
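The three decision rules above can be tried out numerically. Below is a minimal sketch using Python's SciPy library (our choice of tool for illustration; the unit itself works from the printed t-table), with an illustrative α and df:

```python
from scipy import stats

alpha, df = 0.05, 9  # illustrative level of significance and degrees of freedom

# Right-tailed test: reject H0 if t_cal >= t_(df),alpha
t_right = stats.t.ppf(1 - alpha, df)

# Left-tailed test: reject H0 if t_cal <= -t_(df),alpha
t_left = -t_right  # the t-distribution is symmetric about 0

# Two-tailed test: reject H0 if |t_cal| >= t_(df),alpha/2
t_two = stats.t.ppf(1 - alpha / 2, df)

print(round(t_right, 3), round(t_left, 3), round(t_two, 3))
```

The printed values reproduce the familiar t-table entries for 9 df: 1.833 for a one-tailed test and 2.262 for a two-tailed test at α = 0.05.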
Procedure of taking the decision about the null hypothesis on the basis of
p-value:
To take the decision about the null hypothesis on the basis of the p-value, the
p-value is compared with the given level of significance (α): if the p-value is
less than or equal to α, we reject the null hypothesis, and if the p-value is
greater than α, we do not reject the null hypothesis at α level of significance.
Since the test statistic t follows the t-distribution with ν df, and the
t-distribution is symmetrical about the line t = 0, if tcal
represents the calculated value of test statistic t then the p-value can be defined as:
For one-tailed test:
For H1: θ > θ0 (right-tailed test),
p-value = P[t ≥ tcal]
For H1: θ < θ0 (left-tailed test),
p-value = P[t ≤ tcal]
For two-tailed test, for H1: θ ≠ θ0,
p-value = 2P[t ≥ |tcal|]
These p-values for t-test can be obtained with the help of Table-II (t-table)
given in the Appendix at the end of Block 1 of this course. But this table gives
the t-values corresponding to the standard values of α such as 0.10, 0.05,
0.025, 0.01 and 0.005 only; therefore, exact p-values cannot be obtained from
this table, and we can only approximate the p-value for this test.
For example, if test is right-tailed and calculated (observed) value of test
statistic t is 2.94 with 9 df then p-value is obtained as:
Since the calculated value of test statistic t is based on 9 df, we use the
row for 9 df in the t-table and move across this row to find the values
between which the calculated t-value falls. Since the calculated t-value falls
between 2.821 and 3.250, which correspond to one-tailed areas α = 0.01 and
0.005 respectively, the p-value will lie between 0.005 and 0.01, that is,
0.005 < p-value < 0.01
If in the above example the test were two-tailed, then the two limits 0.005
and 0.01 would be doubled for the p-value, that is,
0.01 (= 2 × 0.005) < p-value < 0.02 (= 2 × 0.01)
Note 1: With the help of computer packages and software such as SPSS, SAS,
MINITAB, EXCEL, etc., we can find the exact p-values for the t-test.
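For instance, the approximate p-value worked out above for tcal = 2.94 with 9 df can be checked against the exact value. A minimal sketch using SciPy (our illustration; any of the packages named in Note 1 would do):

```python
from scipy import stats

t_cal, df = 2.94, 9

# Right-tailed test: p-value = P[t >= t_cal] (survival function of the t-distribution)
p_one = stats.t.sf(t_cal, df)

# Two-tailed test: p-value = 2 * P[t >= |t_cal|]
p_two = 2 * stats.t.sf(abs(t_cal), df)

print(p_one, p_two)
```

The exact one-tailed p-value indeed lies between 0.005 and 0.01, and the two-tailed p-value between 0.01 and 0.02, as read off the t-table above.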
Now, you can try the following exercise.
E1) If test is two-tailed and calculated value of test statistic t is 2.42 with 15
df then find the p-value for t-test.
H0: μ = μ0 and H1: μ ≠ μ0 (for two-tailed test)
or for one-tailed test
H0: μ = μ0 and H1: μ > μ0 or H1: μ < μ0
For testing the null hypothesis, the test statistic t is given by
t = (X̄ − μ0) / (S/√n) ~ t(n−1) under H0
where X̄ = (1/n) ΣXi is the sample mean and
S² = 1/(n − 1) Σ(Xi − X̄)² is the sample variance.
For computational simplicity, we may use the following formulae for X̄ and S²:
X̄ = a + (1/n) Σd and S² = 1/(n − 1) [Σd² − (Σd)²/n]
where d = (X − a), ‘a’ being the assumed arbitrary value.
Here, the test statistic t follows t-distribution with (n − 1) degrees of freedom as
we discussed in Unit 3 of this course.
After substituting the values of X̄, S and n, we get the calculated value of
test statistic t. Then we look for the critical (or cut-off or tabulated)
value(s) of test statistic t in the t-table. On comparing the calculated value
and critical value(s), we take the decision about the null hypothesis as
discussed in the previous section.
Let us do some examples of testing of hypothesis about population mean using
t-test.
Example 1: A manufacturer claims that a special type of projector bulb has an
average life of 160 hours. To check this claim, an investigator takes a sample
of 20 such bulbs, puts them on test, and obtains an average life of 167 hours
with standard deviation 16 hours. Assuming that the lifetime of such bulbs
follows a normal distribution, should the investigator accept the
manufacturer’s claim at 5% level of significance?
Solution: Here, we are given that
μ0 = 160, n = 20, X̄ = 167 and S = 16
Here, we want to test the manufacturer’s claim that a special type of projector
bulb has an average life (μ) of 160 hours. So the claim is μ = 160 and its
complement is μ ≠ 160. Since the claim contains the equality sign, we can take
the claim as the null hypothesis and the complement as the alternative
hypothesis. Thus,
H0: μ = μ0 = 160 and H1: μ ≠ 160
Since the alternative hypothesis is two-tailed, the test is two-tailed. The
population is normal, the population SD is unknown and n = 20 (< 30), so we
apply the t-test:
t = (X̄ − μ0)/(S/√n) = (167 − 160)/(16/√20) = 7/3.58 = 1.96
The critical values of test statistic t for two-tailed test corresponding to
(n − 1) = 19 df at 5% level of significance are ± t(19),0.025 = ± 2.093.
Since the calculated value of test statistic t (= 1.96) is greater than the
critical value (= −2.093) and less than the critical value (= 2.093), the
calculated value of the test statistic lies in the non-rejection region, as
shown in Fig. 11.4. So we do not reject the null hypothesis, i.e. we support
the manufacturer’s claim at 5% level of significance.
Fig. 11.4
Decision according to p-value:
Since the calculated value of test statistic t is based on 19 df, we use the
row for 19 df in the t-table and move across this row to find the values
between which the calculated t-value falls. Since the calculated t-value falls
between 1.729 and 2.093, corresponding to one-tailed areas α = 0.05 and 0.025
respectively, the p-value lies between 0.025 and 0.05, that is,
0.025 < p-value < 0.05
Since the test is two-tailed,
0.05 (= 2 × 0.025) < p-value < 0.10 (= 2 × 0.05)
Since the p-value is greater than α (= 0.05), we do not reject the null
hypothesis at 5% level of significance.
Thus, we conclude that the sample fails to provide us sufficient evidence
against the null hypothesis, so we may assume that the manufacturer’s claim is
true, and the investigator may accept the manufacturer’s claim at 5% level of
significance.
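Example 1 can be verified from its summary statistics alone. A minimal sketch in Python with SciPy (the variable names are ours), computing the test statistic, the two-tailed critical value and the exact p-value:

```python
from math import sqrt
from scipy import stats

n, xbar, s, mu0 = 20, 167, 16, 160           # sample size, sample mean, sample SD, hypothesized mean

t_cal = (xbar - mu0) / (s / sqrt(n))         # t = (X-bar − mu0)/(S/sqrt(n))
p_two = 2 * stats.t.sf(abs(t_cal), df=n - 1) # exact two-tailed p-value
t_crit = stats.t.ppf(0.975, df=n - 1)        # two-tailed critical value at alpha = 0.05

print(round(t_cal, 2), round(t_crit, 3), round(p_two, 3))
```

This reproduces t = 1.96 and the critical value 2.093, and confirms that the exact two-tailed p-value lies between 0.05 and 0.10, as approximated from the t-table above.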
Example 2: The mean share price of companies in the Pharma sector is Rs. 70.
The share prices of all companies change from time to time. After a month, a
sample of 10 Pharma companies was taken and their share prices were noted as
below:
70, 76, 75, 69, 70, 72, 68, 65, 75, 72
Assuming that the distribution of share prices follows normal distribution, test
whether mean share price is still the same at 1% level of significance?
Solution: Here, we wish to test that the mean share price (µ) of companies in
the Pharma sector is still Rs. 70 despite all the changes. So our claim is
µ = 70 and its complement is µ ≠ 70. Since the claim contains the equality
sign, we can take the claim as the null hypothesis and the complement as the
alternative hypothesis. Thus,
H0: µ = µ0 = 70 [mean share price of companies is still Rs. 70]
H1: µ ≠ 70 [mean share price of companies is not still Rs. 70]
Since the alternative hypothesis is two-tailed, the test is two-tailed.
Here, we want to test a hypothesis regarding the population mean when the
population SD is unknown. Also, the sample size is small, n = 10 (< 30), and
the population under study is normal, so we can apply the t-test for testing
the hypothesis about the population mean.
For testing the null hypothesis, the test statistic t is given by
t = (X̄ − µ0) / (S/√n) ~ t(n−1) … (1)
Calculation for X and S:
S. No. Sample value Deviation d2
(X) d = (X-a), a = 70
1 70 0 0
2 76 6 36
3 75 5 25
4 69 -1 1
5 70 0 0
6 72 2 4
7 68 -2 4
8 65 -5 25
9 75 5 25
10 72 2 4
Total 12 124
S² = 1/(n − 1) [Σd² − (Σd)²/n] = 1/(10 − 1) [124 − (12)²/10]
= (1/9)(124 − 14.4) = 12.18
Also, X̄ = a + (1/n) Σd = 70 + 12/10 = 71.2, so
t = (X̄ − µ0)/(S/√n) = (71.2 − 70)/(√12.18/√10) = 1.2/1.10 = 1.09
The critical values of test statistic t for two-tailed test corresponding to
(n − 1) = 9 df at 1% level of significance are ± t(9),0.005 = ± 3.250.
Since calculated value of test statistic t (= 1.09) is less than the critical value
(= 3.250) and greater than the critical value (= ‒3.250), that means calculated
value of t lies in non-rejection region as shown in Fig. 11.5. So we do not
reject the null hypothesis i.e. we support the claim at 1% level of significance.
Decision according to p-value:
Since the calculated value of test statistic t is based on 9 df, we use the
row for 9 df in the t-table and move across this row to find the values
between which the calculated t-value falls. Since all values in this row are
greater than the calculated t-value 1.09, and the smallest value 1.383
corresponds to one-tailed area α = 0.10, the p-value is greater than 0.10,
that is,
p-value > 0.10
Since the test is two-tailed,
p-value > 2 × 0.10 = 0.20
Since the p-value (> 0.20) is greater than α (= 0.01), we do not reject the
null hypothesis at 1% level of significance.
Thus, we conclude that the sample fails to provide us sufficient evidence
against the claim, so we may assume that the mean share price is still Rs. 70.
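Example 2's hand computation can be reproduced directly from the raw share prices with SciPy's one-sample t-test (our illustration; `ttest_1samp` is two-tailed by default):

```python
from scipy import stats

prices = [70, 76, 75, 69, 70, 72, 68, 65, 75, 72]
mu0 = 70  # hypothesized mean share price

res = stats.ttest_1samp(prices, popmean=mu0)  # two-tailed one-sample t-test
print(round(res.statistic, 2), round(res.pvalue, 3))
```

The reported statistic matches the hand-calculated t = 1.09, and the exact two-tailed p-value exceeds 0.20, consistent with the t-table bound derived above.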
Now, you can try the following exercises.
E2) A tyre manufacturer claims that the average life of a particular category
of his tyre is 18000 km when used under normal driving conditions. A
random sample of 16 tyres was tested. The mean and SD of life of the
tyres in the sample were 20000 km and 6000 km respectively.
Assuming that the life of the tyres is normally distributed, test the claim
of the manufacturer at 1% level of significance using appropriate test.
E3) It is known that the average weight of cadets of a centre follows normal
distribution. Weights of 10 randomly selected cadets from the same
centre are as given below:
48, 50, 62, 75, 80, 60, 70, 56, 52, 77
Can we say that average weight of all cadets of the centre from which
the above sample was taken is equal to 60 kg at 5% level of
significance?
and
S1² = 1/(n1 − 1) Σ(Xi − X̄)² and S2² = 1/(n2 − 1) Σ(Yi − Ȳ)²
are the sample variances, and the pooled variance is
Sp² = 1/(n1 + n2 − 2) [Σ(Xi − X̄)² + Σ(Yi − Ȳ)²]
For computational simplicity, we may use the following formulae for X̄, Ȳ and Sp²:
X̄ = a + (1/n1) Σd1, Ȳ = b + (1/n2) Σd2 and
Sp² = 1/(n1 + n2 − 2) [(Σd1² − (Σd1)²/n1) + (Σd2² − (Σd2)²/n2)]
where d1 = (X − a) and d2 = (Y − b), ‘a’ and ‘b’ being the assumed arbitrary
values.
Now, follow the same procedure as we discussed in Section 11.2; that is, first
of all we have to set up the null and alternative hypotheses. Here, we want to
test the hypothesis about the difference of two population means, so we can
take the null hypothesis as
H0: µ1 = µ2 (no difference in means)
[Here, θ = µ1 − µ2 and θ0 = 0 if we compare it with the general procedure.]
and the alternative hypothesis as
H1: µ1 ≠ µ2 (for two-tailed test)
or for one-tailed test
H0: µ1 = µ2 and H1: µ1 > µ2 or H1: µ1 < µ2
For testing the null hypothesis, the test statistic t is given by
t = (X̄ − Ȳ) / (Sp √(1/n1 + 1/n2)) ~ t(n1 + n2 − 2) under H0
Calculation for X̄, Ȳ and Sp:
Diet A Diet B
X d1 = (X−a) d1 2 Y d2 = (Y−b) d2 2
a =12 b = 16
12 0 0 14 −2 4
8 −4 16 13 −3 9
14 2 4 12 −4 16
16 4 16 15 −1 1
13 1 1 16 0 0
12 0 0 14 −2 4
8 −4 16 18 2 4
14 2 4 17 1 1
10 −2 4 21 5 25
9 −3 9 15 −1 1
Total −4 70 −5 65
X̄ = a + (1/n1) Σd1 = 12 + (−4/10) = 11.6
Ȳ = b + (1/n2) Σd2 = 16 + (−5/10) = 15.5
Sp² = 1/(n1 + n2 − 2) [(Σd1² − (Σd1)²/n1) + (Σd2² − (Σd2)²/n2)]
= 1/(10 + 10 − 2) [(70 − (−4)²/10) + (65 − (−5)²/10)]
= (1/18)(68.4 + 62.5) = 7.27
Sp = √7.27 = 2.70
Substituting these values in the test statistic, we have
t = (X̄ − Ȳ)/(Sp √(1/n1 + 1/n2)) = (11.6 − 15.5)/(2.70 × 0.45) = −3.9/1.215 = −3.21
The critical values of test statistic t for two-tailed test corresponding to
(n1 + n2 − 2) = 18 df at 5% level of significance are ± t(18),0.025 = ± 2.101.
Since calculated value of test statistic t (= −3.21) is less than critical values
(± 2.101) that means calculated value of test statistic t lies in rejection region,
so we reject the null hypothesis and support the alternative hypothesis i.e.
support our claim at 5% level of significance.
Thus, we conclude that the samples do not provide us sufficient evidence
against the claim, so diets A and B differ significantly in terms of gain in
weights of pigs.
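The pooled two-sample computation for the diet data can be cross-checked with SciPy's `ttest_ind` (our illustration; it assumes equal variances by default, matching the pooled t-test here). Note that full-precision arithmetic gives t ≈ −3.23, while the hand calculation above obtains −3.21 because of intermediate rounding:

```python
from scipy import stats

diet_a = [12, 8, 14, 16, 13, 12, 8, 14, 10, 9]   # gains in weight under diet A
diet_b = [14, 13, 12, 15, 16, 14, 18, 17, 21, 15]  # gains in weight under diet B

res = stats.ttest_ind(diet_a, diet_b)  # pooled (equal-variance) two-sample t-test
print(round(res.statistic, 2), round(res.pvalue, 4))
```

The two-tailed p-value is well below 0.05, agreeing with the rejection of H0 above.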
Example 4: The means of two random samples of sizes 10 and 8 drawn from
two normal populations are 210.40 and 208.92 respectively. The sum of
squares of the deviations from their means is 26.94 and 24.50 respectively.
Assuming that the populations are normal with equal variances, can the samples
be considered to have been drawn from normal populations having equal means?
Solution: In usual notations, we are given that
n1 = 10, n2 = 8, X̄ = 210.40, Ȳ = 208.92,
Σ(X − X̄)² = 26.94, Σ(Y − Ȳ)² = 24.50
Therefore,
Sp² = 1/(n1 + n2 − 2) [Σ(X − X̄)² + Σ(Y − Ȳ)²]
= 1/(10 + 8 − 2) (26.94 + 24.50) = 51.44/16 = 3.215
Sp = √3.215 = 1.79
We wish to test that both samples are drawn from normal populations having the
same mean. If µ1 and µ2 denote the means of the two normal populations
respectively, then our claim is µ1 = µ2 and its complement is µ1 ≠ µ2. Since
the claim contains the equality sign, we can take the claim as the null
hypothesis and the complement as the alternative hypothesis. Thus,
H0: µ1 = µ2 [means of both populations are equal]
H1: µ1 ≠ µ2 [means of both populations are not equal]
Since the alternative hypothesis is two-tailed, the test is two-tailed. It is
given that the two populations are normal with equal and unknown variances,
and the other assumptions of the t-test for testing a hypothesis about the
difference of two population means are also met, so we can apply this test.
For testing the null hypothesis, the test statistic t is given by
t = (X̄ − Ȳ) / (Sp √(1/n1 + 1/n2)) ~ t(n1 + n2 − 2) under H0
= (210.40 − 208.92) / (1.79 √(1/10 + 1/8)) = 1.48/(1.79 × 0.47) = 1.48/0.84 = 1.76
The critical values of test statistic t for two-tailed test corresponding to
(n1 + n2 − 2) = 16 df at 5% level of significance are ± t(16),0.025 = ± 2.12.
Since the calculated value of test statistic t (= 1.76) is less than the
critical value (= 2.12) and greater than the critical value (= −2.12), the
calculated value of test statistic t lies in the non-rejection region, so we
do not reject the null hypothesis, i.e. we support the claim.
Thus, we conclude that the samples fail to provide us sufficient evidence
against the claim, so we may assume that both samples are taken from normal
populations having equal means.
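Example 4 can likewise be checked from its summary figures. A sketch (variable names are ours; unrounded arithmetic gives t ≈ 1.74, slightly below the 1.76 obtained above with rounded intermediate values):

```python
from math import sqrt
from scipy import stats

n1, n2 = 10, 8
xbar, ybar = 210.40, 208.92
ss_x, ss_y = 26.94, 24.50                    # sums of squared deviations from the means

sp2 = (ss_x + ss_y) / (n1 + n2 - 2)          # pooled variance Sp^2
t_cal = (xbar - ybar) / (sqrt(sp2) * sqrt(1 / n1 + 1 / n2))
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)  # two-tailed critical value at alpha = 0.05

print(round(sp2, 3), round(t_cal, 2), round(t_crit, 2))
```

Since |t| falls below the critical value 2.12, the null hypothesis is not rejected, matching the conclusion above.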
Now, you can try the following exercises.
E4) Two different types of drugs A and B were tried on some patients for
increasing their weights. Six persons were given drug A and the other 7
persons were given drug B. The gains in weight (in pounds) are given
below:
Drug A 5 8 7 10 9 6 −
Drug B 9 10 15 12 14 8 12
Assuming that the memories of the children before and after the practice
session follow normal distributions, does the memory practice session improve
the performance of the children?
Solution: Here, we want to test that the memory practice session improves the
performance of the children. If µ1 and µ2 denote the mean digit repetition
before and after the practice, then our claim is µ1 < µ2 and its complement is
µ1 ≥ µ2. Since the complement contains the equality sign, we can take the
complement as the null hypothesis and the claim as the alternative hypothesis.
Thus,
H0: µ1 ≥ µ2 and H1: µ1 < µ2
S_D² = 1/(n − 1) [ΣD² − (ΣD)²/n] = (1/11) [32 − (−14)²/12]
= (1/11)(32 − 16.33) = (1/11)(15.67) = 1.42
S_D = √1.42 = 1.19
Substituting these values in equation (3), we have
t = D̄/(S_D/√n) = −1.17/(1.19/√12) = −1.17/0.34 = −3.44
The critical value of test statistic t for left-tailed test corresponding to
(n − 1) = 11 df at 5% level of significance is −t(11),0.05 = −1.796.
Since calculated value of test statistic t (= −3.44) is less than the critical value
(=−1.796), that means calculated value of t lies in rejection region, so we reject
the null hypothesis and support the alternative hypothesis i.e. support the claim
at 5% level of significance.
Thus, we conclude that samples fail to provide us sufficient evidence against
the claim so we may assume that memory practice session improves the
performance of children.
Example 6: Ten students were given a test in Statistics and after one month’s
coaching they were again given a test of the similar nature and the increase in
their marks in the second test over the first are shown below:
Roll No. 1 2 3 4 5 6 7 8 9 10
Increase in Marks 6 −2 8 −4 10 2 5 −4 6 0
S_D² = 1/(n − 1) [ΣD² − (ΣD)²/n] = (1/9) [301 − (27)²/10]
= (1/9)(301 − 72.9) = 228.1/9 = 25.34
S_D = √25.34 = 5.03
Substituting these values in equation (4), we have
t = D̄/(S_D/√n) = 2.7/(5.03/√10) = 2.7/1.59 = 1.70
The critical value of test statistic t for right-tailed test corresponding to
(n − 1) = 9 df at 1% level of significance is t(9),0.01 = 2.821.
Since calculated value of test statistic t (= 1.70) is less than the critical value
(= 2.821), that means calculated value of test statistic t lies in non-rejection
region, so we do not reject null hypothesis and reject the alternative hypothesis
i.e. we reject our claim at 1% level of significance.
Thus, we conclude that the sample provides us sufficient evidence against the
claim, so the students have not gained knowledge from the coaching.
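Since the paired t-test is a one-sample t-test on the differences, Example 6's increases in marks can be fed directly to SciPy's `ttest_1samp` (our illustration; the `alternative='greater'` keyword requires SciPy 1.6 or later):

```python
from scipy import stats

increase = [6, -2, 8, -4, 10, 2, 5, -4, 6, 0]  # second-test marks minus first-test marks

# Right-tailed test of H0: mean increase = 0 against H1: mean increase > 0
res = stats.ttest_1samp(increase, popmean=0, alternative='greater')
print(round(res.statistic, 2), round(res.pvalue, 3))
```

The statistic reproduces the hand-calculated t = 1.70, and the one-tailed p-value lies between 0.05 and 0.10, so H0 is not rejected at the 1% level, as concluded above.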
Now, you can try the following exercises.
E6) To verify whether the programme “Post Graduate Diploma in Applied
Statistics (PGDAST)” improved performance of the graduate students
in Statistics, a similar test was given to 10 participants both before and
after the programme. The original marks out of 100 (before course)
recorded in an alphabetical order of the participants are 42, 46, 50, 36,
44, 60, 62, 43, 70 and 53. After the course the marks in the same order
are 45, 46, 60, 42, 60, 72, 63, 43, 80 and 65. Assuming that marks of
the students before and after the course follow normal distribution. Test
whether the programme PGDAST has improved the performance of the
graduate students in Statistics at 5% level of significance?
E7) A drug is given to 8 patients and the increments in their blood pressure
are recorded to be 4, 0, 7, −2, 0, −3, 2, 0. Assume that increment in their
blood pressure follows normal distribution. Is it reasonable to believe
that the drug has no effect on the change of blood pressure at 5% level
of significance?
between −1 and +1, where −1 represents a perfect negative correlation, 0
represents no correlation, and +1 represents a perfect positive correlation.
Sometimes the sample data indicate a non-zero correlation even though in the
population the variables are uncorrelated (ρ = 0). For example, the prices of
tomato in Delhi (X) and in London (Y) are not correlated in the population
(ρ = 0), but paired sample data of 20 days of tomato prices at both places may
show a correlation coefficient r ≠ 0. In general, r ≠ 0 in sample data does
not ensure that ρ ≠ 0 holds in the population.
In this section, we will learn how to test the hypothesis that the population
correlation coefficient is zero.
Assumptions
This test works under the following assumptions:
(i) The characteristic under study follows normal distribution in both the
populations. In other words, both populations from which random
samples are drawn should be normal with respect to the characteristic of
interest.
(ii) Sample observations are random.
Let us consider a random sample (X1, Y1), (X2, Y2), …, (Xn, Yn) of size n taken
from a bivariate normal population. Let ρ and r be the correlation
coefficients of the population and the sample data respectively.
Here, we wish to test the hypothesis about the population correlation
coefficient (ρ), that is, the linear correlation between the two variables X
and Y in the population, so we can take the null and alternative hypotheses as
H0: ρ = 0 and H1: ρ ≠ 0 (for two-tailed test)
[Here, θ = ρ and θ0 = 0 if we compare it with the general procedure given in
Section 11.2.]
or for one-tailed test
H0: ρ = 0 and H1: ρ > 0 or H0: ρ = 0 and H1: ρ < 0
For testing the null hypothesis, the test statistic t is given by
t = r√(n − 2) / √(1 − r²) ~ t(n−2)
which follows the t-distribution with n − 2 degrees of freedom.
After substituting the values of r and n, we find the calculated value of test
statistic t. Then we look for the critical (or cut-off or tabulated) value(s)
of test statistic t in the t-table. On comparing the calculated value and
critical value(s), we take the decision about the null hypothesis as discussed
in Section 11.2.
Let us do some examples of testing of hypothesis that population correlation
coefficient is zero.
Example 7: A random sample of 18 pairs of observations from a normal
population gave a correlation coefficient of 0.7. Test whether the population
correlation coefficient is zero at 5% level of significance.
Solution: Given that
n = 18, r = 0.7
Here, we wish to test that the population correlation coefficient (ρ) is zero,
so our claim is ρ = 0 and its complement is ρ ≠ 0. Since the claim contains
the equality sign, we can take the claim as the null hypothesis and the
complement as the alternative hypothesis. Thus,
H0: ρ = 0 and H1: ρ ≠ 0
For testing the null hypothesis, the test statistic t is given by
t = r√(n − 2)/√(1 − r²) = 0.7 √(18 − 2)/√(1 − (0.7)²) = 2.8/0.71 = 3.94
The critical values of test statistic t for two-tailed test corresponding to
(n − 2) = 16 df at 5% level of significance are ± t(16),0.025 = ± 2.120.
Since the calculated value of test statistic t (= 3.94) is greater than the
critical values (= ± 2.120), the calculated value of test statistic t lies in
the rejection region, so we reject the null hypothesis, i.e. we reject the
claim at 5% level of significance.
Thus, we conclude that the sample provides us sufficient evidence against the
claim, so there exists a relationship between the two variables.
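The correlation test in Example 7 reduces to one line of arithmetic. A sketch (full precision gives t ≈ 3.92, versus 3.94 above from rounding √0.51 to 0.71):

```python
from math import sqrt
from scipy import stats

n, r = 18, 0.7  # sample size and sample correlation coefficient

t_cal = r * sqrt(n - 2) / sqrt(1 - r ** 2)  # t = r*sqrt(n-2)/sqrt(1-r^2)
t_crit = stats.t.ppf(0.975, df=n - 2)       # two-tailed critical value at alpha = 0.05

print(round(t_cal, 2), round(t_crit, 2))
```

Since t exceeds the critical value 2.12, H0: ρ = 0 is rejected, matching the conclusion above.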
Example 8: A random sample of 15 married couples was taken from a
population consisting of married couples between the ages of 30 and 40. The
correlation coefficient between the IQs of husbands and wives was found to be
0.68. Assuming that the IQs of husbands and wives follow normal distributions,
test whether the IQs of husbands and wives in the population are positively
correlated at 1% level of significance.
Solution: Given that
n = 15, r = 0.68
Here, we wish to test that the IQs of husbands and wives in the population are
positively correlated. If ρ denotes the correlation coefficient between the
IQs of husbands and wives in the population, then the claim is ρ > 0 and its
complement is ρ ≤ 0. Since the complement contains the equality sign, we can
take the complement as the null hypothesis and the claim as the alternative
hypothesis. Thus,
H0: ρ ≤ 0 and H1: ρ > 0
Since the alternative hypothesis is right-tailed so the test is right-tailed test.
Here, we want to test the hypothesis regarding population correlation
coefficient is zero and the populations under study follow normal distributions,
so we can go for t-test.
For testing the null hypothesis, the test statistic t is given by
t = r√(n − 2) / √(1 − r²)
= 0.68 √(15 − 2) / √(1 − (0.68)²) = (0.68 × 3.61)/0.73 = 3.36
The critical value of test statistic t for right-tailed test corresponding to
(n − 2) = 13 df at 1% level of significance is t(13),0.01 = 2.650.
Since calculated value of test statistic t (= 3.36) is greater than the critical value
(= 2.650), that means calculated value of test statistic t lies in rejection region,
so we reject the null hypothesis and support the alternative hypothesis i.e. we
support our claim at 1% level of significance.
Thus, we conclude that the sample fails to provide us sufficient evidence
against the claim, so we may assume that the correlation between IQs of
husbands and wives in the population is positive.
In the same way, you can try the following exercise.
E8) Twenty families were selected randomly from a colony to determine
whether a correlation exists between family income and the amount of
money spent per family member on food each month. The sample correlation
coefficient was computed as r = 0.40. Assuming that the family income
and the amount of money spent per family member on food each month
follow normal distributions, test whether there is a positive linear
relationship between the family income and the amount of money spent
per family member on food each month in the colony at 1% level of
significance.
We now end this unit by giving a summary of what we have covered in it.
11.7 SUMMARY
In this unit, we have discussed the following points:
1. Need of small sample tests.
2. Procedure of testing a hypothesis for t-test.
3. Testing of hypothesis for population mean using t-test.
4. Testing of hypothesis for difference of two population means when samples
are independent using t-test.
5. The procedure of paired t-test for testing of hypothesis for difference of two
population means when samples are dependent or paired.
6. Testing of hypothesis for population correlation coefficient using t-test.
56 −7 49
52 −11 121
77 14 196
Total ΣX = 630 Σ(X − X̄)² = 1252
Here, X̄ = ΣX/n = 630/10 = 63 and
S² = 1/(n − 1) Σ(X − X̄)² = (1/9) × 1252 = 139.11
S = √139.11 = 11.79
Putting the values in equation (5), we have
t = (X̄ − µ0)/(S/√n) = (63 − 60)/(11.79/√10) = 3/3.73 = 0.80
The critical values of test statistic t for two-tailed test corresponding to
(n − 1) = 9 df at 5% level of significance are ± t(9),0.025 = ± 2.262.
Since calculated value of test statistic t (= 0.80) is less than the critical
value (= 2.262) and greater than the critical value (= −2.262), that
means calculated value of test statistic t lies in non-rejection region so
we do not reject H0 i.e. we support the claim at 5% level of
significance.
Thus, we conclude that sample fails to provide sufficient evidence
against the claim so we may assume that the average weight of all the
cadets of given centre is 60 kg.
E4) Here, we want to test that there is no difference between drugs A and B
with regard to their mean weight increment. If µ1 and µ2 denote the
mean weight increments due to drug A and drug B respectively, then our
claim is µ1 = µ2 and its complement is µ1 ≠ µ2. Since the claim contains
the equality sign, we can take the claim as the null hypothesis and the
complement as the alternative hypothesis. Thus,
H0: µ1 = µ2 [effect of both drugs is the same]
H1: µ1 ≠ µ2 [effect of both drugs is not the same]
Since the alternative hypothesis is two-tailed so the test is two-tailed
test.
Since it is given that the increments in weight due to both drugs follow
normal distributions with equal and unknown variances, and the other
assumptions of the t-test for testing a hypothesis about the difference of
two population means are also met, we can apply this test.
For testing the null hypothesis, the test statistic t is given by
t = (X̄ − Ȳ) / (Sp √(1/n1 + 1/n2)) ~ t(n1 + n2 − 2) under H0 … (6)
X d1 = (X-a) d1 2 Y d2 = (Y-b) d2 2
a=8 b = 12
5 ‒3 9 9 ‒3 9
8 0 0 10 ‒2 4
7 ‒1 1 15 3 9
10 2 4 12 0 0
9 1 1 14 2 4
6 ‒2 4 8 ‒4 16
12 0 0
d 1 3 d 2
1 19 d 2 4 d22 42
From the above calculations, we have
X̄ = a + (1/n1) Σd1 = 8 + (−3/6) = 7.5
Ȳ = b + (1/n2) Σd2 = 12 + (−4/7) = 11.43
Sp² = 1/(n1 + n2 − 2) [(Σd1² − (Σd1)²/n1) + (Σd2² − (Σd2)²/n2)]
= 1/(6 + 7 − 2) [(19 − (−3)²/6) + (42 − (−4)²/7)]
= (1/11)(17.5 + 39.71) = 5.20
Sp = √5.20 = 2.28
Putting these values in equation (6), we have
t = (7.5 − 11.43) / (2.28 √(1/6 + 1/7)) = −3.93/(2.28 × 0.56) = −3.93/1.28 = −3.07
The critical values of test statistic t for two-tailed test corresponding to
(n1 + n2 − 2) = 11 df at 5% level of significance are ± t(11),0.025 = ± 2.201.
Since calculated value of test statistic t (= −3.07) is less than the critical
values (= ± 2.201) that means calculated value of test statistic t lies in
rejection region, so we reject the null hypothesis i.e. we reject the claim
at 5% level of significance.
Thus, we conclude that the samples provide us sufficient evidence against
the claim, so drugs A and B differ significantly; one of them is better
than the other.
E5) Here, we are given that
n1 = 13, X̄ = 4.6, S1 = 0.5,
Thus, we conclude that the samples fail to provide us sufficient evidence
against the claim, so we may assume that there is significant
improvement in wheat production due to the fertilizer.
E6) Here, we want to test whether the programme PGDAST has improved the
performance of the graduate students in Statistics. If µ1 and µ2
denote the average marks before and after the programme, then our claim
is µ1 < µ2 and its complement is µ1 ≥ µ2. Since the complement contains
the equality sign, we can take the complement as the null hypothesis and
the claim as the alternative hypothesis. Thus,
H0: µ1 ≥ µ2 and H1: µ1 < µ2
Since the alternative hypothesis is left-tailed so the test is left-tailed
test.
It is a situation of before and after, and the marks of the students
before and after the programme PGDAST follow normal distributions, so
the population of differences will also be normal. All the other
assumptions of the paired t-test are also met, so we can apply the
paired t-test.
For testing the null hypothesis, the test statistic t is given by
t = D̄ / (S_D/√n) ~ t(n−1) under H0 … (7)
Calculation for D̄ and S_D:
S. No. Marks before (X) Marks after (Y) D = (X − Y) D²
1 42 45 −3 9
2 46 46 0 0
3 50 60 −10 100
4 36 42 −6 36
5 44 60 −16 256
6 60 72 −12 144
7 62 63 −1 1
8 43 43 0 0
9 70 80 −10 100
10 53 65 −12 144
Total ΣD = −70 ΣD² = 790
S_D² = 1/(n − 1) [ΣD² − (ΣD)²/n] = (1/9) [790 − (−70)²/10]
= (1/9)(790 − 490) = 300/9 = 33.33
S_D = √33.33 = 5.77
Putting the values in equation (7), we have
t = D̄/(S_D/√n) = −7.0/(5.77/√10) = −7.0/1.825 = −3.83
The critical value of test statistic t for left-tailed test corresponding to
(n − 1) = 9 df at 5% level of significance is −t(9),0.05 = −1.833.
Since calculated value of test statistic t (= −3.83) is less than the critical
(tabulated) value (= −1.833), that means calculated value of test statistic
t lies in rejection region, so we reject the null hypothesis and support
the alternative hypothesis i.e. we support our claim at 5% level of
significance.
Thus, we conclude that samples fail to provide us sufficient evidence
against the claim so we may assume that the participants have
significant improvement after the programme “Post Graduate Diploma
in Applied Statistics (PGDAST)”.
E7) Here, we want to test that the drug has no effect on the change in blood
pressure. If µD denotes the average increment in blood pressure due to
the drug, then our claim is µD = 0 and its complement is µD ≠ 0.
Since the claim contains the equality sign, we can take the claim as
the null hypothesis and the complement as the alternative hypothesis. Thus,
H0: µD = 0 [the drug has no effect]
H1: µD ≠ 0 [the drug has some effect]
t = D̄/(S_D/√n) = 1/(3.25/√8) = 1/1.15 = 0.87
The critical values of test statistic t for two-tailed test corresponding to
(n − 1) = 7 df at 5% level of significance are ± t(7),0.025 = ± 2.365.
Since calculated value of test statistic t (= 0.87) is less than the critical
value (= 2.365) and greater than the critical value (=−2.365) that means
calculated value of test statistic t lies in non-rejection region, so we do
not reject the null hypothesis i.e. we support the claim at 5% level of
significance.
Thus, we conclude that samples fail to provide us sufficient evidence
against the claim so we may assume that the drug has no effect on the
change of blood pressure of patients.
E8) We are given that
n = 20, r = 0.40
and we wish to test that there is a positive linear relationship between
the family income and the amount of money spent per family member
on food each month in the colony. If ρ denotes the correlation coefficient
between the family income and the amount of money spent per family
member, then the claim is ρ > 0 and its complement is ρ ≤ 0. Since the
complement contains the equality sign, we can take the complement as
the null hypothesis and the claim as the alternative hypothesis. Thus,
H0 : ρ = 0 and H1 : ρ > 0
Since the alternative hypothesis is right-tailed, the test is right-tailed.
For testing the null hypothesis, the test statistic t is given by
t = r√(n − 2)/√(1 − r²)
  = (0.40 × √(20 − 2))/√(1 − (0.40)²) = (0.40 × 4.24)/0.92 = 1.84
Since the calculated value of the test statistic t (= 1.84) is less than the
critical value (= 2.552), the calculated value of the test statistic t lies in
the non-rejection region, so we do not reject the null hypothesis and reject
the alternative hypothesis, i.e. we reject our claim at 1% level of significance.
Thus, we conclude that the sample provides sufficient evidence against the
claim, so there is no positive linear correlation between the family income
and the amount of money spent per family member on food each month
in the colony.
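As a quick check, the statistic t = r√(n − 2)/√(1 − r²) can be evaluated directly. Note that carrying full precision gives ≈ 1.85, while the rounded intermediate values used above give 1.84; the decision is the same either way:

```python
import math

# t-test for a sample correlation coefficient (values from E8)
n, r = 20, 0.40

# Under H0: rho = 0, t = r*sqrt(n - 2)/sqrt(1 - r^2) follows t with n - 2 df
t = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
print(round(t, 2))  # 1.85
```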
UNIT 12 CHI-SQUARE AND F-TESTS
Structure
12.1 Introduction
Objectives
12.2 Testing of Hypothesis for Population Variance Using χ2-Test
12.3 Testing of Hypothesis for Two Population Variances Using F-Test
12.4 Summary
12.5 Solutions / Answers
12.1 INTRODUCTION
Recall from the previous unit that when we test the hypothesis about the
difference of the means of two populations, the t-test needs the assumption
that the variances of the two populations under study are equal. Besides this,
there are situations where we want to test a hypothesis about the variances of
two populations. For example, an economist may want to test whether the
variability in incomes differs in two populations. In such situations, we use the
F-test when the populations under study follow normal distributions.
Similarly, there are many other situations where we need to test a hypothesis
about a hypothetical or specified value of the variance of the population
under study. For example, the manager of an electric bulb company would
probably be interested in whether or not the variability in the life of the bulbs
is within acceptable limits, and the product controller of a milk company may
be interested in whether the variance of the amount of fat in the whole milk
processed by the company is no more than a specified level. In such situations,
we use the χ2-test when the population under study follows the normal
distribution.
This unit is divided into five sections. Section 12.1 describes the need for the
χ2 and F-tests. The χ2-test for testing the hypothesis about the population
variance is discussed in Section 12.2, and the F-test for the equality of the
variances of two populations is discussed in Section 12.3. The unit ends by
providing a summary of what we have discussed in this unit in Section 12.4
and solutions to the exercises in Section 12.5.
Objectives
After studying this unit, you should be able to:
For testing the hypothesis about the population variance, the test statistic χ2
is given by
χ2 = (n − 1)S²/σ0² = Σ(Xi − X̄)²/σ0² ~ χ²(n − 1) under H0 ... (1)
where S² = (1/(n − 1)) Σ(Xi − X̄)²
Here, the test statistic χ2 follows the chi-square distribution with (n − 1)
degrees of freedom, as we discussed in Unit 3 of this course.
After substituting the values of n, S² and σ0², we get the calculated value of
the test statistic. Let χ²cal be the calculated value of the test statistic χ2.
Obtain the critical value(s) or cut-off value(s) in the sampling distribution of
the test statistic χ2 and construct the rejection (critical) region of size α. The
critical values of the test statistic χ2 for various df and different levels of
significance are given in Table III of the Appendix at the end of Block 1 of
this course.
After doing all the calculations discussed above, we have to take the decision
about rejection or non-rejection of the null hypothesis. The procedure for
taking the decision about the null hypothesis is explained below:
For one-tailed test:
χ2 = Σ(Xi − X̄)²/σ0² ~ χ²(n − 1) under H0 ... (2)
Calculation for Σ(Xi − X̄)²:
From the calculation, we have
X̄ = (1/n) ΣX = (1/12) × 18 = 1.5 and Σ(Xi − X̄)² = 0.16
Putting the values of Σ(Xi − X̄)² and σ0² in equation (2), we have
χ2 = Σ(Xi − X̄)²/σ0² = 0.16/0.016 = 10
The critical value of the test statistic χ2 for a left-tailed test corresponding
to (n − 1) = 11 df at 5% level of significance is χ²(n−1),(1−α) = χ²(11),0.95 = 4.57.
Since the calculated value of the test statistic (= 10) is greater than the critical
value (= 4.57), the calculated value of the test statistic lies in the non-rejection
region as shown in Fig. 12.5, so we do not reject the null hypothesis and reject
the alternative hypothesis, i.e. we reject the claim at 5% level of significance.
Thus, we conclude that the sample provides sufficient evidence against the
claim, so the variance in the measurement of the instrument is not less than 0.016.
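The left-tailed χ2 decision above can be sketched in a few lines; the summary figures (n = 12, Σ(Xi − X̄)² = 0.16, σ0² = 0.016) and the tabulated critical value 4.57 are taken from the example:

```python
# Chi-square test for a population variance: chi2 = sum((x - xbar)^2) / sigma0^2
n = 12
ss = 0.16           # sum of squared deviations, sum of (x_i - xbar)^2
sigma0_sq = 0.016   # hypothesised population variance

chi2 = ss / sigma0_sq        # follows chi-square with df = n - 1 = 11 under H0
crit = 4.57                  # chi^2_(11),0.95 from tables (left-tailed test)
reject_h0 = chi2 < crit      # rejection region of a left-tailed test
print(round(chi2, 2), reject_h0)  # 10.0 False
```

Since 10 is not below 4.57, the null hypothesis is not rejected, matching the conclusion above.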
In the same way, you can try the following exercises.
E1) An ambulance agency claims that the standard deviation in the length of
serving times is less than 15 minutes. An investigator suspects that this
claim is wrong and takes a random sample of 20 serving times, which
has a standard deviation of 17 minutes. Assume that the service time of
the ambulance follows the normal distribution. Test at α = 0.01 whether
there is enough evidence to reject the agency’s claim.
E2) A cigarette manufacturer claims that the variance of nicotine content of
its cigarettes is 0.62. Nicotine content is measured in milligrams and is
normally distributed. A sample of 25 cigarettes has a variance of 0.65.
Test the manufacturer’s claim at 5% level of significance.
Since the calculated F-value (= 2.65) falls between 2.35 and 2.79,
corresponding to one-tailed areas α = 0.05 and 0.025 respectively, the
p-value lies between 0.025 and 0.05, that is,
0.025 ≤ p-value ≤ 0.05
If, in the above example, the test were two-tailed, then the two values 0.025
and 0.05 would be doubled for the p-value, that is,
0.05 ≤ p-value ≤ 0.10
Note 3: With the help of computer packages and software such as SPSS, SAS,
MINITAB, EXCEL, etc. we can find the exact p-value for F-test.
Let us do some examples based on this test.
Example 3: The following data relate to the number of items produced in a
shift by two workers A and B for some days:
A 26 37 40 35 30 30 40 26 30 35 45
B 19 22 24 27 24 18 20 19 25
Assuming that the parent populations are normal, can it be inferred that B is
more stable (or consistent) worker compared to A?
Solution: Here, we want to test that worker B is more stable than worker A.
As we know, the stability of data is related to the variance of the data: a
smaller variance implies that the data are more stable. Therefore, to compare
the stability of the two workers, it is enough to compare their variances. If σ1²
and σ2² denote the variances of worker A and worker B respectively, then our
claim is σ1² > σ2² and its complement is σ1² ≤ σ2². Since the complement
contains the equality sign, we can take the complement as the null hypothesis
and the claim as the alternative hypothesis. Thus,
H0 : σ1² = σ2² and H1 : σ1² > σ2²
Since the alternative hypothesis is right-tailed, the test is right-tailed.
For testing the null hypothesis, the test statistic F is given by
F = S1²/S2² ~ F(n1 − 1, n2 − 1) under H0 … (4)
where S1² = (1/(n1 − 1)) Σ(Xi − X̄)² and S2² = (1/(n2 − 1)) Σ(Yi − Ȳ)²
Calculation for S1² and S2²:

Produced by A (X)   X − X̄ (X̄ = 34)   (X − X̄)²   Produced by B (Y)   Y − Ȳ (Ȳ = 22)   (Y − Ȳ)²
26                  −8                64          19                  −3                9
37                  3                 9           22                  0                 0
40                  6                 36          24                  2                 4
35                  1                 1           27                  5                 25
30                  −4                16          24                  2                 4
30                  −4                16          18                  −4                16
40                  6                 36          20                  −2                4
26                  −8                64          19                  −3                9
30                  −4                16          25                  3                 9
35                  1                 1
45                  11                121
Total = 374         0                 380         Total = 198         0                 80
Therefore, we have
X̄ = (1/n1) ΣX = 374/11 = 34
and
Ȳ = (1/n2) ΣY = 198/9 = 22
Thus,
S1² = (1/(n1 − 1)) Σ(X − X̄)² = 380/10 = 38
S2² = (1/(n2 − 1)) Σ(Y − Ȳ)² = 80/8 = 10
Putting the values of S1² and S2² in equation (4), we have
F = 38/10 = 3.8
The critical (tabulated) value of the test statistic F for a right-tailed test
corresponding to (n1 − 1, n2 − 1) = (10, 8) df at 1% level of significance is
F(n1−1, n2−1),α = F(10,8),0.01 = 5.81.
Since the calculated value of the test statistic (= 3.8) is less than the critical
value (= 5.81), the calculated value of the test statistic lies in the non-rejection
region as shown in Fig. 12.9, so we do not reject the null hypothesis and reject
the alternative hypothesis, i.e. we reject the claim at 1% level of significance.
Thus, we conclude that the samples provide sufficient evidence against the
claim, so worker B is not a more stable (or consistent) worker compared to A.
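A minimal script reproducing Example 3 from the raw data (standard library only):

```python
# F-test of equality of variances for workers A and B (data of Example 3)
a = [26, 37, 40, 35, 30, 30, 40, 26, 30, 35, 45]
b = [19, 22, 24, 27, 24, 18, 20, 19, 25]

def sample_var(xs):
    """Unbiased sample variance S^2 with divisor n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

s1, s2 = sample_var(a), sample_var(b)
f = s1 / s2                       # F ~ F(n1 - 1, n2 - 1) under H0
print(s1, s2, round(f, 2))        # 38.0 10.0 3.8
```

The F value matches the hand computation, and 3.8 < 5.81 leads to non-rejection of H0.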
Example 4: Two random samples drawn from two normal populations gave
the following results:
Test whether both samples are from the same normal population.
Solution: Since we have to test whether both samples are from the same
normal population, we will test two hypotheses separately:
(i) the two population means are equal, i.e. H0 : µ1 = µ2, and
(ii) the two population variances are equal, i.e. H0 : σ1² = σ2².
From the given results (n1 = 9, n2 = 11), the sample variances work out as
S1² = (1/(n1 − 1)) Σ(X − X̄)² = 3.20 and
S2² = (1/(n2 − 1)) Σ(Y − Ȳ)² = 3.65
First, we want to test that the variances of both normal populations are equal,
so our claim is σ1² = σ2² and its complement is σ1² ≠ σ2². Thus, we can take
the null and alternative hypotheses as
H0 : σ1² = σ2² and H1 : σ1² ≠ σ2²
Since the alternative hypothesis is two-tailed, the test is two-tailed.
For testing this, the test statistic F is given by
F = S1²/S2² ~ F(n1 − 1, n2 − 1)
  = 3.20/3.65 = 0.88
The critical (tabulated) values of the test statistic F for a two-tailed test at
5% level of significance are F(n1−1, n2−1),α/2 = F(8,11),0.025 = 3.66 and
F(n1−1, n2−1),(1−α/2) = 1/F(n2−1, n1−1),α/2 = 1/F(11,8),0.025 = 1/4.24 = 0.24.
Since the calculated value of the test statistic (= 0.88) is less than the critical
value (= 3.66) and greater than the critical value (= 0.24), the calculated value
of the test statistic F lies in the non-rejection region, so we do not reject the
null hypothesis, i.e. we support the claim.
Thus, we conclude that both samples may be taken from normal populations
having equal variances.
Now, we test that the means of the two normal populations are equal, so our
claim is µ1 = µ2 and its complement is µ1 ≠ µ2. Thus, we can take the null and
alternative hypotheses as
H0 : µ1 = µ2 and H1 : µ1 ≠ µ2
The test statistic is given by
t = (X̄ − Ȳ)/(Sp √(1/n1 + 1/n2)) ~ t(n1 + n2 − 2) under H0 … (5)
where Sp² = [Σ(X − X̄)² + Σ(Y − Ȳ)²]/(n1 + n2 − 2)
          = (1/(9 + 11 − 2)) (26 + 32) = 58/18 = 3.22
Sp = √3.22 = 1.79
t = (59 − 60)/(1.79 √(1/9 + 1/11)) = −1/(1.79 × 0.45) = −1/0.81 = −1.23
The critical values of the test statistic t for (n1 + n2 − 2) = 18 df at 5% level
of significance for a two-tailed test are ±t(18),0.025 = ±2.101.
Since the calculated value of the test statistic t (= −1.23) is less than the
critical value (= 2.101) and greater than the critical value (= −2.101), the
calculated value of the test statistic t lies in the non-rejection region, so we
do not reject the null hypothesis, i.e. we support the claim.
Thus, we conclude that both samples may be taken from normal populations
having equal means.
Hence, overall we conclude that both samples may come from the same
normal population.
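The pooled t computation of Example 4 can be checked from its summary figures (n1 = 9, n2 = 11, means 59 and 60, pooled sum of squares 58, as used above). Note that carrying full precision gives t ≈ −1.24 rather than the −1.23 obtained with rounded intermediates:

```python
import math

# Pooled two-sample t statistic from summary figures
n1, n2 = 9, 11
xbar, ybar = 59, 60
ss_pooled = 58                                 # sum of both squared-deviation totals

sp2 = ss_pooled / (n1 + n2 - 2)                # pooled variance S_p^2
sp = math.sqrt(sp2)
t = (xbar - ybar) / (sp * math.sqrt(1 / n1 + 1 / n2))  # df = n1 + n2 - 2
print(round(sp2, 2), round(t, 2))              # 3.22 -1.24
```

Either value of t is well inside (−2.101, 2.101), so the conclusion is unchanged.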
Now, you can try the following exercises.
E3) Two sources of raw materials are under consideration by a bulb
manufacturing company. Both sources seem to have similar
characteristics but the company is not sure about their respective
uniformity. A sample of 12 lots from source A yields a variance of 125
and a sample of 10 lots from source B yields a variance of 112. Is it
likely that the variance of source A differs significantly from the
variance of source B at significance level α = 0.01?
E4) A laptop computer maker uses battery packs of two brands, A and B.
While both brands have the same average battery life between charges
(LBC), the computer maker seems to receive more complaints about
shorter LBC than expected for battery packs of brand A. The computer
maker suspects that this could be caused by higher variance in LBC for
brand A. To check that, ten new battery packs from each brand are
selected, installed on the same models of laptops, and the laptops are
allowed to run until the battery packs are completely discharged. The
following are the observed LBCs in hours:
Brand A 3.2 3.7 3.1 3.3 2.5 2.2 3.2 3.1 3.2 4.3
Brand B 3.4 3.6 3.0 3.2 3.2 3.2 3.0 3.1 3.2 3.2
12.4 SUMMARY
In this unit, we have discussed the following points:
1. Testing of hypothesis for population variance using χ2-test.
2. Testing of hypothesis for two population variances using F-test.
12.5 SOLUTIONS / ANSWERS
E1) Here, we are given that
σ0 = 15, n = 20, S = 17
Here, we want to test the agency’s claim that the standard deviation (σ)
of the length of serving times is less than 15 minutes. So our claim is
σ < 15 and its complement is σ ≥ 15. Since the complement contains the
equality sign, we can take the complement as the null hypothesis and the
claim as the alternative hypothesis. Thus,
H0 : σ = σ0 = 15 and H1 : σ < 15
E2) Here, we want to test the hypothesis about the population variance and
the sample size is small, n = 25 (< 30). Also, we are given that the
nicotine content of the cigarettes follows the normal distribution, so we
can go for the χ2-test for the population variance.
The test statistic is given by
χ2 = (n − 1)S²/σ0² ~ χ²(n − 1)
   = (24 × 0.65)/0.62 = 25.16
The critical (tabulated) values of the test statistic χ2 for a two-tailed
test corresponding to (n − 1) = 24 df at 5% level of significance are
χ²(n−1),α/2 = χ²(24),0.025 = 39.36 and χ²(n−1),(1−α/2) = χ²(24),0.975 = 12.40.
Since the calculated value of the test statistic (= 25.16) is less than the
critical value (= 39.36) and greater than the critical value (= 12.40), the
calculated value of the test statistic lies in the non-rejection region, so
we do not reject the null hypothesis, i.e. we support the claim at 5%
level of significance.
Thus, we conclude that the sample fails to provide sufficient evidence
against the claim, so we may assume that the manufacturer’s claim that
the variance of the nicotine content of the cigarettes is 0.62 milligram
is true.
E3) Here, we are given that
n1 = 12, S1² = 125, n2 = 10, S2² = 112
Here, we want to test that the variance of source A differs significantly
from the variance of source B. If σ1² and σ2² denote the variances in the
raw materials of sources A and B respectively, then our claim is
σ1² ≠ σ2² and its complement is σ1² = σ2². Since the complement
contains the equality sign, we can take the complement as the null
hypothesis and the claim as the alternative hypothesis. Thus,
H0 : σ1² = σ2² and H1 : σ1² ≠ σ2²
Since the alternative hypothesis is two-tailed, the test is two-tailed.
Here, we want to test the hypothesis about two population variances
and the sample sizes n1 = 12 (< 30) and n2 = 10 (< 30) are small. Also,
the populations under study are normal and both samples are
independent, so we can go for the F-test for two population variances.
For testing this, the test statistic is given by
F = S1²/S2² ~ F(n1 − 1, n2 − 1)
  = 125/112 = 1.11
The critical (tabulated) values of the test statistic F for a two-tailed test
corresponding to (n1 − 1, n2 − 1) = (11, 9) df at 5% level of significance
are F(n1−1, n2−1),α/2 = F(11,9),0.025 = 3.91 and
F(n1−1, n2−1),(1−α/2) = 1/F(n2−1, n1−1),α/2 = 1/F(9,11),0.025 = 1/3.59 = 0.28.
Since the calculated value of the test statistic (= 1.11) is less than the
critical value (= 3.91) and greater than the critical value (= 0.28), the
calculated value of the test statistic lies in the non-rejection region, so
we do not reject the null hypothesis and reject the alternative
hypothesis, i.e. we reject the claim at 5% level of significance.
Thus, we conclude that the samples fail to provide sufficient evidence
in favour of the claim, so we may assume that the variances of sources
A and B do not differ significantly.
E4) Here, we want to test that the LBCs of brand A have a larger variance
than those of brand B. If σ1² and σ2² denote the variances in the LBCs
of brands A and B respectively, then our claim is σ1² > σ2² and its
complement is σ1² ≤ σ2². Since the complement contains the equality
sign, we can take the complement as the null hypothesis and the claim
as the alternative hypothesis. Thus,
H0 : σ1² = σ2² and H1 : σ1² > σ2²
Since the alternative hypothesis is right-tailed, the test is right-tailed.
Here, we want to test the hypothesis about two population variances
and the sample sizes n1 = 10 (< 30) and n2 = 10 (< 30) are small. Also,
the populations under study are normal and both samples are
independent, so we can go for the F-test for two population variances.
For testing the null hypothesis, the test statistic is given by
F = S1²/S2² ~ F(n1 − 1, n2 − 1) under H0 … (6)
where S1² = (1/(n1 − 1)) Σ(X − X̄)² = (1/(n1 − 1)) [ΣX² − (ΣX)²/n1] and
S2² = (1/(n2 − 1)) Σ(Y − Ȳ)² = (1/(n2 − 1)) [ΣY² − (ΣY)²/n2].
Calculation for S1² and S2²:
LBCs of Brand A (X)   X²        LBCs of Brand B (Y)   Y²
3.7                   13.69     3.6                   12.96
3.2                   10.24     3.2                   10.24
3.3                   10.89     3.2                   10.24
3.1                   9.61      3.0                   9.00
2.5                   6.25      3.0                   9.00
2.2                   4.84      3.2                   10.24
3.1                   9.61      3.2                   10.24
3.2                   10.24     3.1                   9.61
4.3                   18.49     3.2                   10.24
3.2                   10.24     3.1                   9.61
Total = 31.8          104.10    Total = 31.8          101.38
From the calculation, we have
X̄ = (1/n1) ΣX = 31.8/10 = 3.18,
Ȳ = (1/n2) ΣY = 31.8/10 = 3.18
Thus,
S1² = (1/(n1 − 1)) [ΣX² − (ΣX)²/n1] = (1/9) [104.10 − (31.8)²/10]
    = (1/9)(2.98) = 0.33
S2² = (1/(n2 − 1)) [ΣY² − (ΣY)²/n2] = (1/9) [101.38 − (31.8)²/10]
    = (1/9)(0.26) = 0.03
Putting the values of S1² and S2² in equation (6), we have
F = 0.33/0.03 = 11
The critical (tabulated) value of the test statistic F for a right-tailed test
corresponding to (n1 − 1, n2 − 1) = (9, 9) df at 1% level of significance
is F(n1−1, n2−1),α = F(9,9),0.01 = 5.35.
Since the calculated value of the test statistic (= 11) is greater than the
critical value (= 5.35), the calculated value of the test statistic lies in the
rejection region, so we reject the null hypothesis and support the
alternative hypothesis, i.e. we support the claim at 1% level of
significance.
Thus, we conclude that the samples provide sufficient evidence in
favour of the claim, so the variance in the LBCs of brand A is greater
than the variance in the LBCs of brand B.
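The figures in E4 can be verified from the tabulated data. Carrying full precision gives F ≈ 11.6 rather than the 11 obtained from the rounded variances; either way F clearly exceeds the critical value 5.35:

```python
# Recomputing S1^2, S2^2 and F for E4 from the tabulated LBC data
x = [3.7, 3.2, 3.3, 3.1, 2.5, 2.2, 3.1, 3.2, 4.3, 3.2]  # brand A (table order)
y = [3.6, 3.2, 3.2, 3.0, 3.0, 3.2, 3.2, 3.1, 3.2, 3.1]  # brand B (table order)

def var_shortcut(v):
    """S^2 = [sum(v^2) - (sum v)^2 / n] / (n - 1), the computational form of E4."""
    n = len(v)
    return (sum(vi * vi for vi in v) - sum(v) ** 2 / n) / (n - 1)

s1, s2 = var_shortcut(x), var_shortcut(y)
f = s1 / s2
print(round(s1, 2), round(s2, 2), f > 5.35)  # 0.33 0.03 True
```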