Understanding Population and Sample in Statistics
INFERENTIAL STATISTICS
If a grain merchant wishes to buy wheat from a farmer, he assesses the quality of the wheat by
taking a handful of it from a bag and then decides whether or not to purchase. The wheat in the bag is
considered a population and the handful of wheat taken for inspection is called a sample.
Similarly, in a factory producing electric lamps, some lamps are picked at random by the quality
control department to check their quality. The lamps chosen for inspection form a sample and
the totality of the lamps manufactured is called the population.
The following is the formal statistical definition of population.
POPULATION The population is an aggregate of objects, animate or inanimate, under study.
Objects (animate or inanimate) in the population are also called statistical individuals.
The population may be finite or infinite according as the number of objects in it is finite or
infinite.
In the above discussion, the wheat in the wheat bag forms an infinite population whereas the totality
of electric lamps manufactured by the factory over a period of time forms a finite population.
If the population is infinite or the number of statistical individuals (objects) in the population is
extremely large, then for any statistical investigation complete enumeration of the population is
not possible. Even if the population is finite, complete enumeration is impracticable because of
administrative and financial implications, time factor etc. So, for any statistical investigation, we
take the help of sampling. Sampling is quite often used in our day-to-day practical life.
SAMPLE A finite subset of statistical individuals (objects) in a population is called a sample.
SAMPLE SIZE The number of statistical individuals (objects) in a sample is called the sample size.
The process of selecting samples from a population is called sampling. The purpose of sampling
is to draw inference about the population by examining the sample. For the purpose of
PARAMETERS The statistical constants or measures of the population, like mean ($\mu$), variance ($\sigma^2$) etc.
are called the parameters of the population.
STATISTICS The statistical measures or constants computed from the sample observations alone, like
mean ($\bar{X}$), variance ($s^2$) etc. are called statistics.
21.2 APPLIED MATHEMATICS-XII
In practice, values of population parameters viz. $\mu$, $\sigma$ etc. are not known and the corresponding
values of statistics obtained from samples are used for the analysis of the population. However,
the statistics based on different samples can vary from one sample to another. One of the
fundamental problems of sampling theory is to find out whether these variations in the statistic
obtained from different samples are significant or insignificant.
21.2.1 SAMPLING DISTRIBUTION
If we select a number of independent random samples of a definite size from a given population
and calculate some statistic (like mean, mode, median, standard deviation etc) from each
sample, we shall get a series of values of the statistic. These values obtained from the different
samples can be put in the form of a frequency distribution as given below.
Sample Number: 1, 2, 3, 4, ..., n
The distribution so formed of all possible values of a statistic is called the sampling distribution
or the probability distribution of that statistic.
Thus, if we draw 150 random samples from a given population and calculate their means, we
shall get a series of 150 means which would form a frequency distribution. This distribution is
called the sampling distribution of the means.
In general, if $S_1, S_2, S_3, \ldots, S_n$ are values of a statistic $S$ (like mean, variance etc.) obtained from
$n$ independent random samples of a definite size chosen from a given population, then
$S_1, S_2, S_3, \ldots, S_n$ form a sampling distribution of the statistic $S$. The mean ($\bar{S}$) and variance of the statistic
$S$ are given by
$$\bar{S} = \frac{1}{n} \sum_{i=1}^{n} S_i \quad \text{and} \quad \operatorname{Var}(S) = \frac{1}{n} \sum_{i=1}^{n} (S_i - \bar{S})^2$$
STANDARD ERROR (S.E.) The standard deviation of the sampling distribution of a statistic is known as
its standard error.
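The idea of a sampling distribution can be illustrated with a small simulation. The following is only a sketch under stated assumptions: the population (the integers 1 to 1000), the sample size of 25 and the function name are hypothetical choices, not taken from the text. It draws 150 random samples, as in the illustration above, and computes the mean, variance and standard error of the resulting sample means.

```python
import random
import statistics


def sampling_distribution_of_mean(population, sample_size, num_samples, seed=0):
    """Draw independent random samples and return the list of their means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical finite population: the integers 1..1000.
population = list(range(1, 1001))
means = sampling_distribution_of_mean(population, sample_size=25, num_samples=150)

mean_of_means = statistics.fmean(means)      # S-bar
var_of_means = statistics.pvariance(means)   # (1/n) * sum of (S_i - S-bar)^2
standard_error = var_of_means ** 0.5         # S.D. of the sampling distribution

print(len(means), round(mean_of_means, 2), round(standard_error, 2))
```

Here `statistics.pvariance` divides by the number of sample means, which matches the formula for Var(S) given above.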
21.3 STATISTICAL INFERENCES
testing of hypothesis as the theory of estimation is beyond the scope of this book. Hypothesis
testing begins with an assumption, called the null hypothesis, that we make about a population
parameter. The null hypothesis asserts that there is no significant difference between the sample
statistic and the corresponding population parameter or between two sample statistics. It is a
hypothesis of no difference. The null hypothesis is usually denoted by $H_0$. In the case of a single
statistic, $H_0$ will be that the sample statistic does not differ significantly from the hypothetical
parameter value and in the case of two sample statistics, $H_0$ will be that the sample statistics do
not differ significantly.
Having set up the null hypothesis, we compute the probability P that the deviation between the
observed sample statistic and the hypothetical parameter value might have occurred due to
fluctuations of sampling. If the deviation comes out to be significant (as measured by a test of
significance), the null hypothesis is rejected at the particular level of significance adopted and if the
deviation is not significant, the null hypothesis may be retained at that level.
Any hypothesis which is complementary to the null hypothesis is called an alternate
hypothesis, usually denoted by $H_1$. For example, if we want to test the null hypothesis that the
population has a specified mean $\mu_0$ (say) i.e. $H_0 : \mu = \mu_0$, then the alternative hypothesis could
be:
The significance level, also called the alpha level, is a term used in testing a hypothesis and is defined
below.
SIGNIFICANCE LEVEL
In a hypothesis test, the significance level, $\alpha$, is the probability of making a wrong
decision when the null hypothesis is true.
For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists
(between the population parameter and sample statistic or between statistics of two samples)
when there is no actual difference. In other words, if we use the same sampling method to select
different samples, then 5% of the samples drawn do not include the population parameter.
Thus, significance level $\alpha$ means that $100\alpha\%$ of the samples drawn do not include the
population parameter and so $(100 - 100\alpha)\% = 100(1 - \alpha)\%$ of the samples drawn include the
population parameter.
To graph a significance level of 0.05 in a two-tailed distribution, we need to shade the 5% of the
distribution that is furthest away from the null hypothesis value. In the following graph, the two shaded
areas are equidistant from the null hypothesis value and each has a probability of 0.025, for a
total of 0.05. These shaded areas are called the critical region for a two-tailed test. The critical
region defines how far away our sample statistic must be from the null hypothesis value before
we can say it is unusual enough to reject the null hypothesis. If the sample statistic falls within the
critical region, it is statistically significant at the 0.05 level. In Fig. 21.1, the
sample statistic does not fall within the critical region representing 0.01 level of significance. So,
we cannot reject the null hypothesis at 0.01 level of significance. This comparison tells us why
we need to choose the significance level before accepting or rejecting the null hypothesis.
[Fig. 21.1: critical regions of a two-tailed test, with the sample statistic value marked]
CONFIDENCE LEVEL A confidence level refers to the percentage of all possible samples that can be
expected to include the true population parameter.
For example, a 95% confidence level means that 95% of the samples drawn include the
population parameter and the remaining 5% of the samples drawn do not include the population
parameter.
Clearly, Confidence level = 1 - Significance level
CONFIDENCE INTERVAL A confidence interval is a range that could be expected to contain the
population parameter of interest.
Confidence intervals are intrinsically connected to confidence levels. When we say that
confidence interval for population parameter is [a, b] with a 95% confidence level, it means that
in 95% of the samples drawn the population parameter falls within the confidence interval.
In 1908, William Sealy Gosset gave a test popularly known as the t-test. Gosset was employed by the
Guinness Brewery in Dublin, Ireland, which did not permit employees to publish research
findings under their own names. So, Gosset adopted the pen-name 'Student' and published his
findings under this name. Thereafter, the t-distribution is commonly called Student's
t-distribution or simply Student's distribution.
The t-distribution is used when the sample size is 30 or less and the population standard deviation
is unknown.
Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$ from a normal population with mean $\mu$ and variance $\sigma^2$.
Then Student's t is defined by the statistic
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
where $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$ is the sample mean and $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2$ is an
unbiased estimate of the population variance.
It follows Student's t-distribution with $\nu = (n-1)$ degrees of freedom with probability density
function
$$f(t) = c \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad -\infty < t < \infty$$
where $c$ is a constant.
REMARK 1 The number of degrees of freedom of a statistic is the number of independent variates used to
compute that statistic. For example, if a small sample has n observations with m constraints on these
values i.e. m values are already available, then the number of degrees of freedom is $\nu = n - m$.
REMARK 2 If $x_1, x_2, \ldots, x_n$ are n observations in a sample, then for computing the sample mean $\bar{X}$, we use
all the values $x_1, x_2, \ldots, x_n$. Therefore, the mean $\bar{X}$ has n degrees of freedom. Since the standard deviation of
the sample depends on the mean, the standard deviation has $(n-1)$ degrees of freedom.
We observe that $f(-t) = f(t)$, so the probability curve is symmetric about the line $t = 0$.
As $t \to \infty$, $f(t) \to 0$ rapidly. So, the t-axis is an asymptote of the curve.
The curve resembles the standard normal probability curve and is bell shaped as shown in
Fig. 21.2.
[Fig. 21.2: the standard normal distribution compared with the t-distributions for $\nu = 15$ and $\nu = 8$]
As the number of degrees of freedom increases, the t-distribution curve moves closer to the
standard normal probability curve.
PROPERTIES OF t-DISTRIBUTION (i) The variable t of t-distribution ranges from $-\infty$ to $\infty$.
(ii) The probability curve is symmetric about the line $t = 0$; it resembles the standard
normal probability curve and is bell shaped as shown in Fig. 21.2. As the number of degrees of
freedom increases, the t-distribution curve moves closer to the standard normal probability
curve.
(iii) The variance of t-distribution is greater than one, but approaches one as the number of
degrees of freedom and therefore the sample size becomes large.
The values of t-distribution have been tabulated extensively. The values of $t_\nu(\alpha)$ (two tailed)
have been tabulated for $\alpha = 0.10, 0.05, 0.025, 0.01, 0.005$ and $\nu = 1, 2, 3, \ldots, 29, 30$ where $t_\nu(\alpha)$ is
such that the total area under the curve of t-distribution with $\nu$ degrees of freedom in the two tails
beyond $\pm t_\nu(\alpha)$ is equal to $\alpha$ (see Fig 21.3). That is, $t_\nu(\alpha)$ is such that if the random variable t has t-distribution with $\nu$
degrees of freedom, then
$$P(|t| > t_\nu(\alpha)) = \alpha$$
The table does not contain values of $t_\nu(\alpha)$ for $\alpha > 0.50$, because $t_\nu(1-\alpha) = -t_\nu(\alpha)$ as the
probability density function $f(t)$ is symmetrical about $t = 0$. When $\nu$ is more than 30, probabilities
related to t-distribution are usually approximated with the use of normal distributions.
[Fig. 21.3 Critical values of t-distribution: rejection regions of area $\alpha/2$ in each tail beyond $-t_\nu(\alpha)$ and $t_\nu(\alpha)$, with the acceptance region of area $(1-\alpha)$ between them]
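The properties listed above can be checked numerically. The sketch below is illustrative only: it assumes the standard normalising constant $c = \Gamma\left(\frac{\nu+1}{2}\right) / \left(\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)\right)$ for the density (the text leaves $c$ unspecified), and the function names and the use of Simpson's rule are choices made here for illustration. It compares the peak of the curve for $\nu = 8$ and $\nu = 15$ with the standard normal peak, and estimates the two-tailed area beyond a tabulated value such as $t_9(0.05) = 2.262$.

```python
import math


def t_pdf(t, nu):
    """Density of Student's t with nu degrees of freedom."""
    # c is the normalising constant of f(t) = c * (1 + t^2/nu)^(-(nu + 1)/2).
    # math.gamma overflows for very large nu; the tabulated range nu <= 30 is fine.
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1 + t * t / nu) ** (-(nu + 1) / 2)


def central_area(t_crit, nu, steps=2000):
    """P(|t| <= t_crit) by Simpson's rule on [-t_crit, t_crit]; steps must be even."""
    h = 2 * t_crit / steps
    total = t_pdf(-t_crit, nu) + t_pdf(t_crit, nu)
    for k in range(1, steps):
        weight = 4 if k % 2 else 2
        total += weight * t_pdf(-t_crit + k * h, nu)
    return total * h / 3


normal_peak = 1 / math.sqrt(2 * math.pi)  # height of the standard normal curve at 0
print(t_pdf(0, 8), t_pdf(0, 15), normal_peak)
print(1 - central_area(2.262, 9))  # two-tailed area beyond the tabulated value
```

The second printed value should be close to 0.05, which is what the two-tailed table entry means.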
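CONFIDENCE OR FIDUCIAL INTERVALS FOR POPULATION MEAN $\mu$ If $t_\nu(0.05)$ is the tabulated value of t
for $\nu = (n-1)$ degrees of freedom, then
$$P\left(|t| > t_\nu(0.05)\right) = 0.05 \;\Rightarrow\; P\left(|t| \le t_\nu(0.05)\right) = 0.95$$
Thus, the confidence interval for population mean $\mu$ with a 95% confidence level is given by
$$|t| \le t_\nu(0.05)$$
$$\Rightarrow \left| \frac{\bar{X} - \mu}{S/\sqrt{n}} \right| \le t_\nu(0.05)$$
$$\Rightarrow \left| \frac{\mu - \bar{X}}{S/\sqrt{n}} \right| \le t_\nu(0.05)$$
$$\Rightarrow |\mu - \bar{X}| \le \frac{S}{\sqrt{n}}\, t_\nu(0.05)$$
$$\Rightarrow \bar{X} - \frac{S}{\sqrt{n}}\, t_\nu(0.05) \le \mu \le \bar{X} + \frac{S}{\sqrt{n}}\, t_\nu(0.05)$$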
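Hence, the confidence interval with a 95% confidence level or at 5% level of significance is
$$\left[\bar{X} - \frac{S}{\sqrt{n}}\, t_\nu(0.05),\; \bar{X} + \frac{S}{\sqrt{n}}\, t_\nu(0.05)\right],$$
where $t_\nu(0.05)$ is the tabulated value of t for $\nu = (n-1)$ degrees of freedom.
Similarly, the confidence interval with a 99% confidence level or at 1% level of significance is
$$\left[\bar{X} - \frac{S}{\sqrt{n}}\, t_\nu(0.01),\; \bar{X} + \frac{S}{\sqrt{n}}\, t_\nu(0.01)\right].$$
If instead of S the sample variance $s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^2$ is given, then the confidence interval at 5%
level of significance is
$$\left[\bar{X} - \frac{s}{\sqrt{n-1}}\, t_\nu(0.05),\; \bar{X} + \frac{s}{\sqrt{n-1}}\, t_\nu(0.05)\right].$$
Similarly, the confidence interval at 1% level of significance is
$$\left[\bar{X} - \frac{s}{\sqrt{n-1}}\, t_\nu(0.01),\; \bar{X} + \frac{s}{\sqrt{n-1}}\, t_\nu(0.01)\right].$$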
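The two forms of the confidence interval can be sketched in code as follows. This is a minimal illustration, not a definitive implementation: the function names and the numbers in the usage lines are hypothetical, and the tabulated value of t must be supplied by the reader from a t-table.

```python
import math


def ci_from_unbiased_sd(mean, S, n, t_crit):
    """Interval mean -/+ (S / sqrt(n)) * t_crit, with S^2 the unbiased variance estimate."""
    half_width = S / math.sqrt(n) * t_crit
    return mean - half_width, mean + half_width


def ci_from_sample_sd(mean, s, n, t_crit):
    """Interval mean -/+ (s / sqrt(n - 1)) * t_crit, with s^2 the divide-by-n variance."""
    half_width = s / math.sqrt(n - 1) * t_crit
    return mean - half_width, mean + half_width


# Hypothetical numbers: n = 10, sample mean 50, S = 6, tabulated t_9(0.05) = 2.262.
low, high = ci_from_unbiased_sd(mean=50.0, S=6.0, n=10, t_crit=2.262)
print(round(low, 2), round(high, 2))
```

Since $s/\sqrt{n-1} = S/\sqrt{n}$, both functions return the same interval when `s` and `S` come from the same data.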
21.5 APPLICATIONS OF t-DISTRIBUTION
The t-distribution has a wide range of applications in Statistics.
In this chapter, we will discuss the following:
(i) To test if the sample mean ($\bar{X}$) differs significantly from the hypothetical value $\mu$ of the
population mean,
(ii) To test the significance of the difference between two sample means.
21.5.1 TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN MEAN OF A RANDOM SAMPLE
AND POPULATION MEAN
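In order to determine whether the mean $\bar{X}$ of a small random sample $x_1, x_2, x_3, \ldots, x_n$ drawn
from a normal population deviates significantly from the hypothetical value $\mu$ of the population
mean, we proceed as follows.
STEP I Set up the null hypothesis $H_0$: There is no significant difference between the sample mean $\bar{X}$ and the
population mean $\mu$ i.e. the sample has been drawn from the population with mean $\mu$.
STEP II Define the statistic
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
where $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$ = sample mean, $\mu$ = hypothetical mean of the population, $n$ = sample size and
$$S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} d_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} d_i \right)^2 \right], \quad d_i = x_i - A, \text{ where } A \text{ is the assumed mean.}$$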
If calculated $|t|$ < tabulated $t_{n-1}(0.05)$, the null hypothesis $H_0$ may be accepted at 0.05 level of
significance and we say that the difference between $\bar{X}$ and $\mu$ is not significant and hence the
sample might have been drawn from a population with mean $\mu$.
If calculated $|t|$ < tabulated $t_{n-1}(0.01)$, the null hypothesis $H_0$ may be accepted at 0.01 level of
significance and we say that the difference between $\bar{X}$ and $\mu$ is not significant.
If calculated $|t|$ > tabulated $t_{n-1}(0.05)$, the null hypothesis $H_0$ may be rejected at 0.05 level of
significance and we say that the difference between $\bar{X}$ and $\mu$ is significant at 5% level.
If calculated $|t|$ > tabulated $t_{n-1}(0.01)$, the null hypothesis $H_0$ may be rejected at 0.01 level of
significance and we say that the difference between $\bar{X}$ and $\mu$ is significant at 1% level.
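The decision rule above can be sketched as a pair of small functions. This is an illustration under stated assumptions: the function names and the data in the usage lines are hypothetical, the sample standard deviation $s$ is taken to be the divide-by-n form (so that $t = (\bar{X} - \mu)/(s/\sqrt{n-1})$), and the wording "do not reject" is used in place of "accept".

```python
import math


def one_sample_t(sample_mean, mu, s, n):
    """t = (X-bar - mu) / (s / sqrt(n - 1)), with s the divide-by-n standard deviation."""
    return (sample_mean - mu) / (s / math.sqrt(n - 1))


def decide(t_value, t_tabulated):
    """Two-tailed decision: reject H0 when calculated |t| exceeds the tabulated value."""
    return "reject H0" if abs(t_value) > t_tabulated else "do not reject H0"


# Hypothetical data: n = 17, sample mean 52, hypothesised mean 50, s = 5,
# and the tabulated value t_16(0.05) = 2.12.
t_value = one_sample_t(sample_mean=52.0, mu=50.0, s=5.0, n=17)
print(round(t_value, 3), decide(t_value, 2.12))
```

The same two calls, with the figures of any example in this section and the tabulated value quoted there, reproduce the comparison of calculated $|t|$ with tabulated $t$.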
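REMARK Let $s^2$ be the sample variance, then
$$s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^2$$
$$\Rightarrow \frac{n}{n-1}\, s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2$$
$$\Rightarrow \frac{n}{n-1}\, s^2 = S^2, \text{ where } S^2 \text{ is the unbiased estimate of population variance.}$$
$$\Rightarrow \frac{s^2}{n-1} = \frac{S^2}{n} \;\Rightarrow\; \frac{s}{\sqrt{n-1}} = \frac{S}{\sqrt{n}}$$
$$\therefore\; t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \;\Rightarrow\; t = \frac{\bar{X} - \mu}{s/\sqrt{n-1}}$$
Thus, we have
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \quad \text{or,} \quad t = \frac{\bar{X} - \mu}{s/\sqrt{n-1}}$$
where $S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2$ and $s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^2$.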
ILLUSTRATIVE EXAMPLES
EXAMPLE 1 A company has been producing steel tubes of mean inner diameter of 2 cm. A sample of 10
tubes gives a mean inner diameter of 2.01 cm and a variance of 0.004 cm². Is the difference in the values of
sample mean and population mean significant?
$$t = \frac{2.01 - 2.00}{\sqrt{0.004}} \times \sqrt{10 - 1} = \frac{0.01 \times 3}{0.063} = \frac{10}{21} = 0.476$$
The test statistic t follows Student's t-distribution with (10 - 1) = 9 degrees of freedom. We shall
now compare this calculated value of $|t|$ with the tabulated value of t for 9 degrees of freedom
and at a certain level of significance. It is given that $t_9(0.05) = 2.262$.
Clearly, $|t| < t_9(0.05)$ i.e. calculated $|t|$ < tabulated $t_9(0.05)$.
So, the null hypothesis $H_0$ is accepted at 5% level of significance. Hence, the difference in the
values of sample mean and population mean is not significant.
EXAMPLE 2 A machinist is making engine parts with axle diameter of 0.7 inch. A random sample of 10
parts shows mean diameter 0.742 inch with a standard deviation of 0.04 inch. On the basis of this sample,
would you say that the work is inferior? (Given $t_9(0.05) = 2.262$)
SOLUTION It is given that:
$\mu$ = Population mean = 0.7, $\bar{X}$ = Sample mean = 0.742
n = Sample size = 10 and s = Sample standard deviation = 0.04
We define,
Null Hypothesis $H_0$: There is no significant difference between the sample mean $\bar{X}$ and the
population mean $\mu$ or, the product is not inferior.
Alternate hypothesis $H_1$: The difference between the sample mean $\bar{X}$ and the population mean $\mu$ is
significant i.e. $\mu \ne \bar{X}$ or the product is inferior.
Let t be the test statistic given by
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n-1}} \quad \text{or,} \quad t = \frac{\bar{X} - \mu}{s} \times \sqrt{n-1}$$
$$\Rightarrow t = \frac{0.742 - 0.7}{0.04} \times \sqrt{10 - 1} = \frac{0.042 \times 3}{0.04} = \frac{126}{40} = 3.15$$
The test statistic t follows Student's t-distribution with (10 - 1) = 9 degrees of freedom. We shall
now compare this calculated value with the tabulated value of t for 9 degrees of freedom and at a
certain level of significance. It is given that $t_9(0.05) = 2.262$.
We observe that
$|t| = 3.15 > 2.262 = t_9(0.05)$
i.e. Calculated $|t|$ > Tabulated $t_9(0.05)$
So, the null hypothesis is rejected at 5% level of significance or the alternative hypothesis is
accepted at 5% level of significance. Hence, the sample mean $\bar{X}$ differs significantly from the population
mean $\mu$ i.e. the work is inferior.
EXAMPLE 3 A soap manufacturing company was distributing a particular brand of soap through a
large number of retail shops. Before a heavy advertisement campaign, the mean sales per week per shop
was 140 dozens. After the campaign a sample of 26 shops was taken and mean sales was found to be
147 dozens with standard deviation 16. Can you consider the advertisement effective? (Given
$t_{25}(0.05) = 2.06$)
SOLUTION It is given that
$\mu$ = Population mean = 140, $\bar{X}$ = Sample mean = 147
n = Sample size = 26, s = 16.
We define,
Null Hypothesis $H_0$: There is no significant difference in the mean sales before and after
advertisement.
Alternate hypothesis $H_1$: There is significant difference in the mean sales before and after
advertisement.
$$t = \frac{\bar{X} - \mu}{s} \times \sqrt{n-1} = \frac{147 - 140}{16} \times \sqrt{26 - 1} = \frac{7 \times 5}{16} = \frac{35}{16} = 2.187$$
The sample statistic t follows Student's t-distribution with $\nu = (26 - 1) = 25$ degrees of freedom.
We shall now compare this calculated value with the tabulated value of t for 25 degrees of
freedom and at a certain level of significance. It is given that $t_{25}(0.05) = 2.06$.
We observe that
$|t| = 2.187 > 2.06 = t_{25}(0.05)$
i.e.
Calculated $|t|$ > Tabulated $t_{25}(0.05)$
So, we reject the null hypothesis and accept the alternate hypothesis. Hence, we conclude that
the advertisement is effective for sales.
EXAMPLE 4 A random sample of size 16 has 53 as mean. The sum of the squares of the deviations taken
from mean is 150. Can this sample be regarded as taken from the population having 56 as mean? (Given
$t_{15}(0.01) = 2.95$)
SOLUTION We have,
n = Sample size = 16, $\bar{X}$ = Sample mean = 53, $\mu$ = Population mean = 56
and $\sum_{i=1}^{16} (x_i - \bar{X})^2 = 150$, where $x_1, x_2, \ldots, x_{16}$ are sample observations
$$\therefore\; S^2 = \frac{1}{16-1} \sum_{i=1}^{16} (x_i - \bar{X})^2 = \frac{1}{15} \times 150 = 10$$
We define,
Null Hypothesis $H_0$: The sample is drawn from the population having 56 as mean.
Alternate hypothesis $H_1$: The sample is not drawn from the population having 56 as mean.
Let the sample statistic t be given by
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{53 - 56}{\sqrt{10}/\sqrt{16}} = -\frac{12}{\sqrt{10}}, \text{ so that } |t| = 3.794$$
The sample statistic follows Student's t-distribution with $\nu = (16 - 1) = 15$ degrees of freedom.
We shall now compare this calculated value with the tabulated value of t for 15 degrees of
freedom at a certain level of significance. It is given that $t_{15}(0.01) = 2.95$.
Calculated $|t| = 3.794 > 2.95 = t_{15}(0.01)$
i.e. Calculated $|t|$ > Tabulated $t_{15}(0.01)$
So, we reject the null hypothesis. Consequently, the alternate hypothesis is accepted at 0.01 level
of significance. Hence, the sample is not taken from the population having 56 as mean.
EXAMPLE 5 A random sample of 17 values from a normal population has a mean of 105 cm and the sum of
the squares of deviations from this mean is 1225 cm². Is the assumption of a mean of 110 cm for the normal
population reasonable? Test under 5% and 1% levels of significance. Also, obtain the 95% and 99%
confidence limits. (Given $t_{16}(0.05) = 2.12$ and $t_{16}(0.01) = 2.921$).
SOLUTION We have,
$\mu$ = Population mean = 110, $\bar{X}$ = Sample mean = 105
n = Sample size = 17 and $\sum_{i=1}^{17} (x_i - \bar{X})^2 = 1225$.
$$\therefore\; s^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{X})^2 = \frac{1225}{17} = 72.0588 \;\Rightarrow\; s = 8.4887$$
We define,
Null Hypothesis $H_0$: There is no significant difference between sample mean and population
mean i.e. assumption that mean of the population is 110 cm is valid.
Alternate hypothesis $H_1$: Assumption that mean of the population is 110 cm is not valid.
Let t be the test statistic given by
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n-1}} \;\Rightarrow\; t = \frac{105 - 110}{8.4887} \times \sqrt{16} = \frac{-5 \times 4}{8.4887} = -2.3561$$
$$\Rightarrow |t| = 2.3561$$
The sample statistic follows Student's t-distribution with $\nu = (17 - 1) = 16$ degrees of freedom.
We shall now compare this calculated value with the tabulated value of t for 16 degrees of
freedom at 5% and 1% levels of significance.
At 5% level of significance: It is given that $t_{16}(0.05) = 2.12$. We find that
$|t| = 2.3561 > 2.12 = t_{16}(0.05)$
i.e.
Calculated $|t|$ > Tabulated $t_{16}(0.05)$
So, we reject the null hypothesis at 5% level of significance. Hence, the assumption that the
mean of the population is 110 cm is not valid at 5% level of significance.
The 95% confidence limits are
$$\bar{X} - \frac{s}{\sqrt{n-1}}\, t_{16}(0.05) \quad \text{and} \quad \bar{X} + \frac{s}{\sqrt{n-1}}\, t_{16}(0.05)$$
or, $105 - \frac{8.4887}{4} \times 2.12$ and $105 + \frac{8.4887}{4} \times 2.12$
EXAMPLE 6 Ten students are selected at random from a college and their heights are found to be 100, 104,
108, 110, 118, 120, 122, 124, 126 and 128 cms. In the light of these data, discuss the suggestion that the
mean height of the students of the college is 110 cms (Given $t_9(0.05) = 2.262$).
SOLUTION We define
Null Hypothesis $H_0$: There is no significant difference between the sample mean and hypothetical
population mean 110 cm.
Alternate hypothesis $H_1$: The sample mean is not same as the population mean.
Let the sample statistic t be given by
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
Let us now compute the sample mean ($\bar{X}$) and S.
Computation of $\bar{X}$ and S
  x_i      x_i - X̄     (x_i - X̄)²
  100       -16          256
  104       -12          144
  108        -8           64
  110        -6           36
  118         2            4
  120         4           16
  122         6           36
  124         8           64
  126        10          100
  128        12          144
 Σx_i = 1160          Σ(x_i - X̄)² = 864
$$\bar{X} = \frac{1}{10} \sum_{i=1}^{10} x_i = \frac{1160}{10} = 116$$
$$S^2 = \frac{1}{9} \sum_{i=1}^{10} (x_i - \bar{X})^2 \;\Rightarrow\; S^2 = \frac{1}{9} \times 864 \;\Rightarrow\; S = \frac{29.393}{3} = 9.798$$
$$\therefore\; t = \frac{\bar{X} - \mu}{S/\sqrt{n}} \;\Rightarrow\; t = \frac{116 - 110}{9.798} \times \sqrt{10} = \frac{6}{9.798} \times 3.162 = 1.94 \qquad [\because\ \mu = 110]$$
The sample statistic follows Student's t-distribution with $\nu = (10 - 1) = 9$ degrees of freedom. We
shall now compare this calculated value with the tabulated value of t for 9 degrees of freedom at
a certain level of significance. It is given that $t_9(0.05) = 2.262$.
Calculated $|t| = 1.94 < 2.262 = t_9(0.05)$
i.e.
Calculated $|t|$ < tabulated $t_9(0.05)$
So, we accept the null hypothesis. Hence, the sample mean does not differ significantly from the population mean.
Consequently, the mean height of the students of the college may be taken as 110 cm.
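The computation in the example on the heights of ten students can be cross-checked from the raw data. The sketch below uses Python's standard `statistics` module, where `stdev` divides by $n - 1$ (giving $S$) and `pstdev` divides by $n$ (giving $s$). The variable names are illustrative, and the data are the ten heights quoted in that example.

```python
import math
import statistics

heights = [100, 104, 108, 110, 118, 120, 122, 124, 126, 128]
mu = 110
n = len(heights)

mean = statistics.fmean(heights)
S = statistics.stdev(heights)    # divides by n - 1 (unbiased variance estimate)
s = statistics.pstdev(heights)   # divides by n

t_from_S = (mean - mu) / (S / math.sqrt(n))
t_from_s = (mean - mu) / (s / math.sqrt(n - 1))

print(mean, round(S, 3), round(t_from_S, 2), round(t_from_s, 2))
```

Both forms of the statistic give the same value, as the REMARK on $S$ and $s$ states.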
EXAMPLE 7 A random sample of 10 boys had the following I.Q.'s: 70, 120, 110, 101, 88, 83, 95, 98, 107,
100. Do these data support the assumption of a population mean I.Q. of 100? Find a reasonable range in
which most of the mean I.Q. values of samples of 10 boys lie. (Given $t_9(0.05) = 2.262$)
SOLUTION We have,
Null Hypothesis $H_0$: The data are consistent with the assumption of a mean I.Q. of 100 in the
population.
Alternate hypothesis $H_1$: The mean I.Q. of population $\ne$ 100.
Let the sample statistic t be given by
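$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}, \quad \text{where } S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{X})^2$$
  x_i     d_i = x_i - 90     d_i²
   70        -20             400
  120         30             900
  110         20             400
  101         11             121
   88         -2               4
   83         -7              49
   95          5              25
   98          8              64
  107         17             289
  100         10             100
           Σd_i = 72      Σd_i² = 2352
$$\bar{X} = 90 + \frac{72}{10} = 97.2$$
$$S^2 = \frac{1}{9} \left[ 2352 - \frac{(72)^2}{10} \right] = \frac{1833.6}{9} = 203.73$$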
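The assumed-mean computation in the table above can be verified with a short sketch. The variable names are illustrative, and the value of t printed at the end is computed by this sketch from the formula $t = (\bar{X} - \mu)/(S/\sqrt{n})$; it is not quoted from the text.

```python
import math

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
A = 90      # assumed mean, as in the table
mu = 100    # hypothesised population mean
n = len(iq)

d = [x - A for x in iq]
sum_d = sum(d)
sum_d2 = sum(v * v for v in d)

mean = A + sum_d / n
S2 = (sum_d2 - sum_d ** 2 / n) / (n - 1)   # unbiased variance via the shortcut
t = (mean - mu) / math.sqrt(S2 / n)

print(sum_d, sum_d2, round(mean, 1), round(S2, 2), round(t, 2))
```

The calculated $|t|$ can then be compared with the given $t_9(0.05) = 2.262$ in the usual way.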
The 95% confidence limits for the population mean are
$$\bar{X} - \frac{S}{\sqrt{n}}\, t_9(0.05) \quad \text{and} \quad \bar{X} + \frac{S}{\sqrt{n}}\, t_9(0.05)$$
or, $97.2 - \sqrt{\frac{203.73}{10}} \times 2.262$ and $97.2 + \sqrt{\frac{203.73}{10}} \times 2.262$
or, $97.2 - 4.514 \times 2.262$ and $97.2 + 4.514 \times 2.262$
or, $97.2 - 10.21$ and $97.2 + 10.21$
or, 86.99 and 107.41
Hence, the required 95% confidence interval is [86.99, 107.41].
EXERCISE 21.1
1. Ten cartons are taken at random from an automatic filling machine. The mean net weight of
the cartons is 11.8 kg and the standard deviation 0.15 kg. Does the sample mean differ
the mean 0.5 cm. What can we say about this process if a sample of 10 of these bearings has a
mean diameter of 0.506 cm and standard deviation of 0.004 cm? (Given $t_9(0.05) = 2.262$).
5. A machine is supposed to produce washers of mean thickness 0.12 cm. A sample of 10
washers was found to have a mean thickness of 0.128 and standard deviation 0.008. Test
whether the machine is working in proper order at 5% level of significance. (Given
$t_9(0.05) = 2.262$).
6. A random sample of 16 values from a normal population showed a mean of 41.5 and sum of
squares of deviations from mean equal to 135. Can it be assumed that the mean of the
population is 43.5? (Given $t_{15}(0.01) = 2.95$).
7. A sample of size 9 from a normal population gives $\bar{X} = 15.8$ and $s^2 = 10.3$. Find 99% confidence
interval for population mean. (Given $t_8(0.01) = 3.355$).
8. A random sample of size 16 has 53 as mean. The sum of the squares of deviations taken
from mean is 150. Find 95% and 99% confidence intervals for population mean. (Given
$t_{15}(0.01) = 2.95$ and $t_{15}(0.05) = 2.13$).
9. A random sample of 16 values from a normal population showed a mean of 41.5 inches and
the sum of squares of deviations from this mean equal to 135 square inches. Show that the
assumption of a mean of 43.5 inches for the population is not reasonable. Obtain 95% and
99% confidence intervals for the same. (Given $t_{15}(0.05) = 2.131$ and $t_{15}(0.01) = 2.947$).
10. The annual rainfall at a certain place is normally distributed with mean 45 cm. The rainfall
during the last five years was 48 cm, 42 cm, 40 cm, 44 cm and 43 cm. Can we conclude that
the average rainfall during the last five years is less than the normal rainfall? (Given
$t_4(0.05) = 2.132$).
11. The heights of 8 males participating in an athletic championship are found to be 175 cm, 168
cm, 165 cm, 170 cm, 167 cm, 160 cm, 173 cm and 168 cm. Can we conclude that the average
height is greater than 165 cm? (Given $t_7(0.05) = 1.895$).
12. The mean weekly sales of chocolate bar in general stores was 146.3 bars per store. After an
advertising campaign the mean weekly sales in 22 stores for a typical week increased to 153.7 bars and
showed a standard deviation of 17.2. Was the advertising campaign successful? (Given
$t_{21}(0.05) = 2.08$).
13. The foreman of ABC mining company has estimated the average quantity of iron ore
extracted to be 36.8 tonnes per shift and the sample standard deviation to be 2.8 tonnes per
shift, based upon a random selection of 4 shifts. Construct a 90% confidence interval around
this estimate. (Given $t_3(0.1) = 2.353$).
14. A random sample of size 20 from normal population gives a sample mean of 42 and
standard deviation of 6. Test the hypothesis that the population mean is 44. (Given
16. The manufacturer of a certain make of electric bulbs claims that his bulbs have a mean life
of 25 months with a standard deviation of 5 months. A random sample of 12 such bulbs
gave the following values:
Life in months: 24 26 32 28 20 18 23 27 29 34 20 28
ANSWERS
1. Yes 2. Not significant 3. Yes 4. Process is not under control
5. No 6. Yes 7. [11.99, 19.61]
8. [51.32, 54.68], [50.67, 55.33] 9. [39.902, 43.098], [39.29, 43.71]
10. No 11. Yes 12. Yes 13. [34.5, 39.10]
14. True 15. Yes; [87.494, 107.906] 16. Yes
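To test the significance of the difference between two sample means $\bar{X}_1$ and $\bar{X}_2$, the statistic used is
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \quad \text{or,} \quad t = \frac{\bar{X}_1 - \bar{X}_2}{S} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}, \quad \text{where } S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}}$$
The statistic t follows t-distribution with $\nu = n_1 + n_2 - 2$ degrees of freedom.
In terms of the sample observations $x_i$ and $y_i$,
$$S^2 = \frac{\sum_{i=1}^{n_1} (x_i - \bar{X}_1)^2 + \sum_{i=1}^{n_2} (y_i - \bar{X}_2)^2}{n_1 + n_2 - 2}$$
When the actual means are in fraction, the deviations are taken from assumed means. In such cases, S is
given by
$$S^2 = \frac{1}{n_1 + n_2 - 2} \left[ \left\{ \sum_{i=1}^{n_1} d_i^2 - \frac{1}{n_1} \left( \sum_{i=1}^{n_1} d_i \right)^2 \right\} + \left\{ \sum_{i=1}^{n_2} D_i^2 - \frac{1}{n_2} \left( \sum_{i=1}^{n_2} D_i \right)^2 \right\} \right]$$
where $d_i = x_i - A$ and $D_i = y_i - B$ are the deviations of the two samples from their assumed means A and B.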
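The two-sample statistic defined above can be sketched in code as follows. This is an illustration only: the function names and the summary data in the usage line are hypothetical, and $s_1$, $s_2$ are assumed to be the divide-by-n sample standard deviations, as in the formula for $S$.

```python
import math


def pooled_sd(n1, s1, n2, s2):
    """S = sqrt((n1*s1^2 + n2*s2^2) / (n1 + n2 - 2)), with s1, s2 divide-by-n S.D.s."""
    return math.sqrt((n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2))


def two_sample_t(n1, mean1, s1, n2, mean2, s2):
    """t = (X1-bar - X2-bar) / S * sqrt(n1*n2 / (n1 + n2)); returns (t, degrees of freedom)."""
    S = pooled_sd(n1, s1, n2, s2)
    t = (mean1 - mean2) / S * math.sqrt(n1 * n2 / (n1 + n2))
    return t, n1 + n2 - 2


# Hypothetical summary data for two samples.
t, dof = two_sample_t(n1=12, mean1=31.0, s1=4.0, n2=10, mean2=28.0, s2=5.0)
print(round(t, 3), dof)
```

The calculated $|t|$ is then compared with the tabulated value for `dof` degrees of freedom, exactly as in the one-sample case.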
EXAMPLE 1 For the following data examine if the means of two samples differ significantly:
              Size    Mean    S.D.
Sample I:       6      40       8
Sample II:      5      50      10
SOLUTION We have,
$n_1 = 6$, $\bar{X}_1 = 40$, $s_1 = 8$, $n_2 = 5$, $\bar{X}_2 = 50$ and $s_2 = 10$
$$\therefore\; S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{6 \times 64 + 5 \times 100}{6 + 5 - 2}} = \sqrt{\frac{884}{9}} = 9.910$$
We define
Null hypothesis $H_0$: The difference in the means of two samples is not significant.
Alternate hypothesis $H_1$: Means of two samples differ significantly.
Let t be the sample statistic given by
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \times \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
$$\Rightarrow t = \frac{40 - 50}{9.910} \times \sqrt{\frac{6 \times 5}{6 + 5}} = \frac{-10}{9.910} \times \sqrt{\frac{30}{11}} = \frac{-10 \times 1.651}{9.910} = -1.666$$
The sample statistic t follows t-distribution with $\nu = (6 + 5 - 2) = 9$ degrees of freedom. We shall
now compare this calculated value of t with the tabulated value for 9 degrees of freedom at a
given level of significance. It is given that $t_9(0.05) = 2.262$.
We find that calculated $|t| = 1.666$ < tabulated $t_9(0.05)$. So, we accept the null hypothesis at 5%
level of significance. Hence the difference in sample means is not significant.
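EXAMPLE 2 Two batches of the same product are tested for their mean life. Assuming that the lives of the
product follow a normal distribution with an unknown variance, test the hypothesis that the mean life is
the same for both the batches, given the following information:
Batch       Sample size    Mean life in hrs    Standard deviation
Batch I         10              750                  12
Batch II         8              820                  14
(Use $t_{16}(0.05) = 2.120$)
SOLUTION We define
Null hypothesis $H_0$: There is no significant difference between the two batches in their mean lives.
Alternate hypothesis $H_1$: There is a significant difference between the two batches in their mean lives.
Let t be the sample statistic given by
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \times \sqrt{\frac{n_1 n_2}{n_1 + n_2}}, \quad \text{where } S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}}$$
We have,
$n_1 = 10$, $n_2 = 8$, $\bar{X}_1 = 750$, $\bar{X}_2 = 820$, $s_1 = 12$ and $s_2 = 14$
$$\therefore\; S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} \;\Rightarrow\; S = \sqrt{\frac{10 \times 144 + 8 \times 196}{10 + 8 - 2}} = \sqrt{\frac{3008}{16}} = \sqrt{188} = 13.711$$
$$\therefore\; t = \frac{750 - 820}{13.711} \times \sqrt{\frac{10 \times 8}{10 + 8}} = \frac{-70}{13.711} \times \sqrt{\frac{80}{18}} = \frac{-70 \times 2}{13.711 \times 3} \times 3.162 = \frac{-140}{41.133} \times 3.162 = -10.762$$
The sample statistic follows Student's t-distribution with $\nu = (n_1 + n_2 - 2) = 16$ degrees of
freedom. Let us now compare the calculated value of t with the tabulated value of t at a given
level of significance. It is given that $t_{16}(0.05) = 2.120$. We find that calculated $|t| = 10.762$ is greater
than the tabulated $t_{16}(0.05)$. So, we reject the null
hypothesis at 5% level of significance and hence accept the alternate hypothesis. Hence, the
mean life is not the same for both the batches.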
EXAMPLE 3 Samples of two types of electric light bulbs were tested for length of life and following data
were obtained:
                 Type I                  Type II
Sample size      $n_1 = 8$               $n_2 = 7$
Sample means     $\bar{X}_1 = 1234$ hrs  $\bar{X}_2 = 1036$ hrs
Sample S.D.'s    $s_1 = 36$ hrs          $s_2 = 40$ hrs
Is the difference in the means sufficient to warrant that type I is superior to type II regarding length of
life? (Given $t_{13}(0.05) = 2.16$)
SOLUTION We define
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \times \sqrt{\frac{n_1 n_2}{n_1 + n_2}}, \quad \text{where } S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}}$$
Now,
$$S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{8 \times 36^2 + 7 \times 40^2}{8 + 7 - 2}} = \sqrt{\frac{10368 + 11200}{13}} = \sqrt{1659.076} = 40.731$$
The sample statistic t follows Student's t-distribution with $\nu = (8 + 7 - 2) = 13$ degrees of
freedom. Let us now compare this calculated value with the tabulated value of t at a given level
of significance. It is given that $t_{13}(0.05) = 2.16$.
We find that calculated $|t| = 9.391$ > tabulated $t_{13}(0.05)$. So, the null hypothesis is rejected at 5%
level of significance. Hence, the two types of electric bulbs differ significantly. Further, since $\bar{X}_1$
is much greater than $\bar{X}_2$, we conclude that type I bulbs are definitely superior to type II bulbs.
EXAMPLE 4 Samples of sales in similar shops in towns A and B regarding a new product yielded the
following information:
For town A: $\bar{X}_1 = 3.45$, $\sum x_i = 38$, $\sum x_i^2 = 228$, $n_1 = 11$
For town B: $\bar{X}_2 = 4.44$, $\sum y_i = 40$, $\sum y_i^2 = 222$, $n_2 = 9$
Is there any evidence of difference in sales in the two towns? (Given $t_{18}(0.05) = 2.10$)
SOLUTION We define
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \times \sqrt{\frac{n_1 n_2}{n_1 + n_2}}, \quad \text{where } S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}}$$
We have, $n_1 = 11$, $\bar{X}_1 = 3.45$, $\sum x_i = 38$, $\sum x_i^2 = 228$
$$S^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = 7.869$$
$$\Rightarrow S = \sqrt{7.869} = 2.805$$
$$\therefore\; t = \frac{\bar{X}_1 - \bar{X}_2}{S} \times \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
$$\Rightarrow t = \frac{3.45 - 4.44}{2.805} \times \sqrt{\frac{11 \times 9}{11 + 9}} = -0.353 \times 2.22 = -0.784$$
The sample statistic t follows Student's t-distribution with $\nu = n_1 + n_2 - 2 = 11 + 9 - 2 = 18$
degrees of freedom. Let us now compare the calculated $|t|$ with the tabulated value of t at a
given level of significance. It is given that $t_{18}(0.05) = 2.10$. We find that the calculated $|t|$ is less
than the tabulated $t_{18}(0.05)$. So, the null hypothesis is accepted at 5% level of significance. Hence,
there is no evidence of difference in sales in the two towns.
LXAMiM I 5 Two different types of drugs A and B were tried on certain patients for increasing weight, 5
persons were given drug A and 7 persons were given drug B. The increase in weights in pounds is given
below:
Drug A: 8 12 13 9 3
Drug B: 10 8 12 15 6 8 11
Do the two drugs differ significantly with regard to their effect in increasing weight (Given
ilO (0.05) =2.23) ^ ^
SOLUTION We define
Null hypothesis $H_0$: Two drugs do not differ significantly with regard to their effect of
increasing weight.
Alternate hypothesis $H_1$: Two drugs differ significantly with regard to their effect of increasing
weight.
The sample statistic $t$ is given by
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}, \quad \text{where } S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum (X_i - \bar{X}_1)^2 + \sum (Y_i - \bar{X}_2)^2\right]$$
Let us now compute $\bar{X}_1$, $\bar{X}_2$, $\sum (X_i - \bar{X}_1)^2$ and $\sum (Y_i - \bar{X}_2)^2$.
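Drug A                                               Drug B
$X_i$    $X_i - \bar{X}_1 = X_i - 9$    $(X_i - \bar{X}_1)^2$      $Y_i$    $Y_i - \bar{X}_2 = Y_i - 10$    $(Y_i - \bar{X}_2)^2$
8        -1        1                                 10       0         0
12       3         9                                 8        -2        4
13       4         16                                12       2         4
9        0         0                                 15       5         25
3        -6        36                                6        -4        16
                                                     8        -2        4
                                                     11       1         1
Total 45           62                                Total 70           54
$$\bar{X}_1 = \frac{1}{n_1}\sum X_i = \frac{45}{5} = 9, \quad \bar{X}_2 = \frac{1}{n_2}\sum Y_i = \frac{70}{7} = 10$$
$$S^2 = \frac{1}{n_1 + n_2 - 2}\left[\sum (X_i - \bar{X}_1)^2 + \sum (Y_i - \bar{X}_2)^2\right]$$
$$\Rightarrow S^2 = \frac{1}{5 + 7 - 2}(62 + 54) = 11.6$$
$$S = \sqrt{11.6} = 3.406$$
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
$$t = \frac{9 - 10}{3.406}\sqrt{\frac{5 \times 7}{5 + 7}} = -\frac{1}{3.406}\sqrt{\frac{35}{12}} = -\frac{1.7078}{3.406} = -0.501$$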
The sample statistic $t$ follows Student's t-distribution with $\nu = (5 + 7 - 2) = 10$ degrees of
freedom. Let us now compare the calculated $|t|$ with the tabulated value of $t$ at a given level of
significance. It is given that $t_{10}(0.05) = 2.23$. We find that the calculated $|t| = 0.501$ is less than the
tabulated $t_{10}(0.05) = 2.23$. So, the null hypothesis is accepted at 5% level of significance. Hence,
the null hypothesis $H_0$ holds true i.e. the drugs A and B do not differ significantly with regard to
their effect in increasing weight.
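The direct method used in this example translates readily into code. The snippet below is a minimal Python sketch, not taken from the original text; the function name `pooled_t` is a hypothetical helper. It computes the two sample means, the sums of squared deviations about them, the pooled $S$ and finally $t$, using the same formula as the solution.

```python
import math
from statistics import mean


def pooled_t(x, y):
    """Pooled two-sample t statistic from raw data (direct method).

    Hypothetical helper for illustration; returns (t, degrees of freedom).
    """
    n1, n2 = len(x), len(y)
    m1, m2 = mean(x), mean(y)
    # Sums of squared deviations about the respective sample means
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    dof = n1 + n2 - 2
    s = math.sqrt((ss1 + ss2) / dof)  # pooled standard deviation S
    t = (m1 - m2) / s * math.sqrt(n1 * n2 / (n1 + n2))
    return t, dof


drug_a = [8, 12, 13, 9, 3]
drug_b = [10, 8, 12, 15, 6, 8, 11]

t, dof = pooled_t(drug_a, drug_b)
print(f"t = {t:.3f}, degrees of freedom = {dof}")
# Compare |t| with the tabulated value given in the problem
print("reject H0" if abs(t) > 2.23 else "accept H0")
```

The script agrees with the hand computation: $t$ is about -0.501 on 10 degrees of freedom, so $H_0$ is accepted.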
REMARK In the above example $\bar{X}_1$ and $\bar{X}_2$ come out to be integral values and hence the direct method of
computing $\sum (X_i - \bar{X}_1)^2$ and $\sum (Y_i - \bar{X}_2)^2$ is used. In case $\bar{X}_1$ and (or) $\bar{X}_2$ come out to be fractional, then
the step deviation method is convenient for computing $S^2$.
EXAMPLE 6 The heights in inches of 6 randomly chosen sailors and 10 randomly chosen soldiers are given
as under:
Sailors: 63 65 68 69 71 72
Soldiers: 61 62 65 66 69 69 70 71 72 73
Do these figures show that the soldiers are on an average shorter than sailors? (Given $t_{14}(0.05) = 2.15$)
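SOLUTION We define
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
$$\text{where } S^2 = \frac{1}{n_1 + n_2 - 2}\left[\left\{\sum_{i=1}^{n_1} d_i^2 - \frac{1}{n_1}\left(\sum_{i=1}^{n_1} d_i\right)^2\right\} + \left\{\sum_{i=1}^{n_2} d_i'^2 - \frac{1}{n_2}\left(\sum_{i=1}^{n_2} d_i'\right)^2\right\}\right]$$
Here $d_i = X_i - 68$ and $d_i' = Y_i - 66$ are the deviations of the sailors' and soldiers' heights from the assumed means 68 and 66 respectively.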
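Sailors                                      Soldiers
$X_i$    $d_i = X_i - 68$    $d_i^2$         $Y_i$    $d_i' = Y_i - 66$    $d_i'^2$
63       -5                  25              61       -5                   25
65       -3                  9               62       -4                   16
68       0                   0               65       -1                   1
69       1                   1               66       0                    0
71       3                   9               69       3                    9
72       4                   16              69       3                    9
                                             70       4                    16
                                             71       5                    25
                                             72       6                    36
                                             73       7                    49
Total    0                   60              Total    18                   186
$$S^2 = \frac{1}{n_1 + n_2 - 2}\left[\left\{\sum d_i^2 - \frac{1}{n_1}\left(\sum d_i\right)^2\right\} + \left\{\sum d_i'^2 - \frac{1}{n_2}\left(\sum d_i'\right)^2\right\}\right]$$
$$\Rightarrow S^2 = \frac{1}{6 + 10 - 2}\left[\left\{60 - \frac{0^2}{6}\right\} + \left\{186 - \frac{(18)^2}{10}\right\}\right] = \frac{1}{14}(60 + 186 - 32.4) = \frac{213.6}{14} = 15.2571$$
$$S = \sqrt{15.2571} = 3.9060$$
$$t = \frac{\bar{X}_1 - \bar{X}_2}{S}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
$$t = \frac{68 - 67.8}{3.9060}\sqrt{\frac{6 \times 10}{6 + 10}} = \frac{0.2}{3.9060} \times 1.9365 = 0.0991$$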
The sample statistic $t$ follows Student's t-distribution with $\nu = n_1 + n_2 - 2 = (6 + 10 - 2) = 14$
degrees of freedom. Let us now compare this calculated value of $|t|$ with the tabulated value for
14 degrees of freedom at a given level of significance. It is given that $t_{14}(0.05) = 2.15$. Clearly, the
calculated value of $|t|$ is much less than the tabulated value. So, the null hypothesis $H_0$ is
accepted at 5% level of significance. Hence, on an average soldiers are not shorter than sailors.
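The step deviation computation used in this example can also be sketched in code. The following Python snippet is an illustrative sketch only; the helper name `ss_by_step_deviation` is hypothetical, and the assumed means 68 and 66 are the ones read off the table of deviations. It relies on the identity $\sum (X_i - \bar{X})^2 = \sum d_i^2 - \frac{1}{n}\left(\sum d_i\right)^2$, which holds for any assumed mean.

```python
import math


def ss_by_step_deviation(values, assumed_mean):
    """Sum of squared deviations about the sample mean, computed from
    deviations d = x - A about an assumed mean A (hypothetical helper)."""
    d = [v - assumed_mean for v in values]
    return sum(di * di for di in d) - sum(d) ** 2 / len(values)


sailors = [63, 65, 68, 69, 71, 72]
soldiers = [61, 62, 65, 66, 69, 69, 70, 71, 72, 73]

ss1 = ss_by_step_deviation(sailors, 68)   # assumed mean 68
ss2 = ss_by_step_deviation(soldiers, 66)  # assumed mean 66

n1, n2 = len(sailors), len(soldiers)
dof = n1 + n2 - 2
s = math.sqrt((ss1 + ss2) / dof)  # pooled standard deviation S
mean1 = sum(sailors) / n1
mean2 = sum(soldiers) / n2
t = (mean1 - mean2) / s * math.sqrt(n1 * n2 / (n1 + n2))

print(f"S^2 = {(ss1 + ss2) / dof:.4f}, t = {t:.4f}, degrees of freedom = {dof}")
```

Choosing a different assumed mean changes the individual deviations but not `ss1`, `ss2` or $t$, which is why any convenient round number may be used.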
EXERCISE 21.2
1. The mean life of a sample of 10 electric bulbs was found to be 1456 hours with standard
deviation of 423 hours. A second sample of 17 bulbs chosen from a different batch showed a
mean life of 1280 hours with standard deviation of 398 hours. Is there a significant
difference between the means of the two batches?
2. Strength tests carried out on samples of two yarns spun to the same count gave the
following results:
Yarn A    4    50    42
Yarn B    9    42    56
The strengths are expressed in kg. Is the difference in mean strengths indicative of a real
difference in the mean strengths of the sources from which the samples are drawn?
3. Samples of two types of electric bulbs were tested for length of life and the following data
were obtained:
                                    Type I    Type II
Number of bulbs in the sample       8         7
Mean (in hours)                     1134      1024
Standard deviation (in hours)       35        40
5. Two methods of performing a certain operation are compared. The following data are
obtained:
$n_1 = 15$, $\bar{X}_1 = 505$, $S_1^2 = 95$
$n_2 = 12$, $\bar{X}_2 = 57.2$, $S_2^2 = 5.7$
Is there a significant difference in the means of the two methods at 5% level of significance?
(Given $t_{25}(0.05) = 2.06$).
6. The means of the random samples of sizes 9 and 7 are 196.42 and 198.42 respectively. The
sums of the squares of deviations from the respective means are 26.94 and 18.74
respectively. Can the samples be considered to have been drawn from the same normal
population?
7. The I.Q.'s (intelligence quotients) of 16 students from one area of a city showed a mean of
107 with a standard deviation of 10 while the I.Q.’s of 14 students from another area of the
city showed a mean of 112 with a standard deviation of 8. Is there a significant difference
between the I.Q.’s of the two groups at (i) 1% and (ii) 5% level of significance?
8. Below are given the gains in weight (in kgs) of pigs fed on two diets A and B.
Gain in weight
Diet A: 25 32 30 34 24 14 32 24 30 31 35 25
Diet B: 44 34 22 10 47 31 40 30 32 15 18 21 35 39 22
Test, if the two diets differ significantly as regards their effect on increase in weight
($t_{25}(0.05) = 2.06$).
9. In a certain experiment to compare two types of animal foods A and B, the following results
of increase in weights were observed in animals:
Animal number:              1   2   3   4   5   6   7   8
Increase in weight, Food A: 49  53  51  52  47  50  52  53
Assuming that the two samples of animals are independent, can we conclude that food B is
10. The marks obtained by two groups of students in Mathematics test are given below:
                       Group A    Group B
Number of students:    15         11
Mean Marks:            42         38
15
On the basis of this data, can it be concluded that there is a significant difference in the
mean marks obtained by two groups? (Given $t_{24}(0.05) = 2.064$).
11. Two kinds of fertilizers were applied to 15 plots of one acre, other conditions remaining the
same. The yields in quintals are given below:
Fertilizer I: 14 20 34 48 32 40 30 44
Fertilizer II: 31 18 22 28 40 26 45
Examine the significance of difference between the mean yields due to the use of different
kinds of fertilizers. (Given $t_{13}(0.05) = 2.16$)
12. Two different types of drugs A and B were tried on certain patients for increasing weight. 6
persons were given drug A and 8 persons were given drug B. The increase in weight in
pounds is given below:
Drug A: 7 10 13 12 4
Drug B: 6 18 16 9 3 12 8
4. Significant 5. Significant 6. No
7. Not significant 8. Do not differ significantly 9. Yes
10. Not significant 11. No significant difference
12. No 13. Yes