Sampling Theory - Notes
Department of Mathematics
SAMPLING THEORY
Population: A large collection of individuals, attributes or numerical data is called a population or universe. It is the aggregate of objects, animate or inanimate, under study.
The population may be finite or infinite.
If the population is large, complete enumeration is usually not possible because of the cost and time involved; in some cases the units are also destroyed in the course of inspection (e.g. inspection of crackers). We therefore resort to sampling.
Sample: A finite subset of the population is called a sample. The size of the sample is denoted by n. Sampling is the process of drawing samples from a given population.
Example: The cars produced in India form the population, whereas the Maruti cars produced in India form a sample.
The statistical constants of the population, such as the mean ($\mu$) and the standard deviation ($\sigma$), are called parameters. Similarly, the constants of a sample drawn from the given population, such as the mean ($\bar{x}$) and the standard deviation ($s$), are called statistics.
Random sampling:
The selection of items from a population in such a way that each item has the same chance of being selected is called random sampling. For example, when a sample of size n is drawn from a finite population of size N, every element of the population has an equal chance of being included.
Sampling with replacement: each member of the population may be chosen more than once; the items are drawn one by one and each is put back into the population before the next draw. If N is the size of the finite population and n is the sample size, there are $N^n$ possible samples.
Sampling without replacement: no member can be chosen more than once; the items are drawn one by one and are not put back into the population before the next draw. In this case there are $\binom{N}{n}$ (that is, ${}^{N}C_{n}$) possible samples.
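The two sample counts can be checked by brute-force enumeration. The following is a minimal Python sketch; the function name `count_samples` and the four-element population are illustrative choices, not part of the notes.

```python
from itertools import combinations, product
from math import comb

def count_samples(population, n):
    """Count samples of size n drawn with and without replacement."""
    # With replacement: ordered draws, repeats allowed, N**n in total.
    with_replacement = sum(1 for _ in product(population, repeat=n))
    # Without replacement: unordered selections, no repeats, C(N, n) in total.
    without_replacement = sum(1 for _ in combinations(population, n))
    return with_replacement, without_replacement

population = [3, 7, 11, 15]          # illustrative population, N = 4
w, wo = count_samples(population, 2)
print(w, wo)                          # 16 6
print(w == len(population) ** 2, wo == comb(len(population), 2))
```

Enumeration is only practical for small N and n; for larger values the closed forms $N^n$ and $\binom{N}{n}$ should be used directly.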
Sampling distribution:
Given a population, consider the set of all samples of a certain size drawn from it. For each sample we compute a statistic such as the mean or the standard deviation. The value of the statistic varies from one sample to another. If we group these values according to their frequencies, the frequency distribution so formed is called a sampling distribution.
Consider a population with mean $\mu$ and standard deviation $\sigma$. Suppose we draw a set of samples of a certain size n from this population and find the mean $\bar{x}$ of each of these samples. The frequency distribution of these means is called the sampling distribution of means. Let the mean and the standard deviation of the sampling distribution of means be $\mu_{\bar{X}}$ and $\sigma_{\bar{X}}$ respectively.
Suppose the population is finite with size N and random sampling is done without replacement, i.e. the items are drawn one by one and are not put back into the population before the next draw. In this case there are $\binom{N}{n}$ samples, and we have
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}\left[\frac{N-n}{N-1}\right]$,
that is, $\sigma_{\bar{X}}^2 = c\,\frac{\sigma^2}{n}$, where $c = \frac{N-n}{N-1}$ is called the finite population correction factor. If N is very large, i.e. if the population is infinite, then $c \to 1$ as $N \to \infty$; the formula with c = 1 also applies when a finite population is sampled with replacement. Therefore
$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$
So the mean of the sampling distribution of means is equal to the population mean, and the corresponding standard error is $\sigma/\sqrt{n}$, where $\sigma$ is the standard deviation of the population.
If the population is normally distributed with mean $\mu$ and S.D. $\sigma$, then the means of all possible random samples of size n are also normally distributed, with mean $\mu$ and S.E. $\sigma/\sqrt{n}$.
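The standard error formulas, with and without the finite population correction factor, can be sketched in Python as follows. The function name `standard_error` and its optional argument are assumptions made for illustration; the numbers are those of the ball-bearing problem that appears later in these notes.

```python
from math import sqrt

def standard_error(sigma, n, population_size=None):
    """Standard error of the sample mean.

    If population_size (N) is given, sampling is taken to be without
    replacement and the factor (N - n) / (N - 1) is applied to the variance.
    """
    se = sigma / sqrt(n)
    if population_size is not None:
        se *= sqrt((population_size - n) / (population_size - 1))
    return se

# sigma = 1.36, n = 36, N = 1500
print(round(standard_error(1.36, 36), 4))        # with replacement
print(round(standard_error(1.36, 36, 1500), 4))  # without replacement
```

For N much larger than n the two values are nearly equal, which is why the correction is usually dropped for large populations.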
The central limit theorem is a very important theorem regarding the distribution of the mean of a sample when the parent population is non-normal and the sample size is large.
If the variable X has a non-normal distribution with mean $\mu$ and standard deviation $\sigma$, then the limiting distribution of
$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$, as $n \to \infty$,
is the standard normal distribution (i.e. with mean 0 and unit S.D.).
There is no restriction on the distribution of X except that it has a finite mean and variance. In practice the theorem is applied to samples of size 30 or more, which are regarded as large.
Dayananda Sagar College of Engineering
Statistical estimation is the method by which parameters are estimated with the aid of the corresponding statistics. An estimate is either a single value for the unknown true value of the parameter (a point estimate) or an interval in which the parameter is expected to lie (an interval estimate), determined on the basis of sample data from the population.
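1. Confidence interval:
Let S be a sample statistic whose sampling distribution is approximately normal with mean $\mu_S$ and standard deviation $\sigma_S$. The interval
$(S - z_c\sigma_S,\ S + z_c\sigma_S)$
is called the Z% confidence interval for $\mu_S$ if
$\frac{Z}{100} = P\{-z_c\sigma_S \le \mu_S - S \le z_c\sigma_S\} = P\left\{\frac{|S - \mu_S|}{\sigma_S} \le z_c\right\} = P\{|z| \le z_c\}$, where $z = \frac{S - \mu_S}{\sigma_S}$.
Then
$\frac{Z}{100} = P\{-z_c \le z \le z_c\} = 2P\{0 \le z \le z_c\} = 2\phi(z_c)$,
so that $Z = 2\phi(z_c) \times 100$, where $\phi(z_c)$ is the area under the standard normal curve between 0 and $z_c$.
Confidence limits: If the interval $(S - z_c\sigma_S,\ S + z_c\sigma_S)$ is the Z% confidence interval for $\mu_S$, then the quantities $S \pm z_c\sigma_S$ are called the Z% confidence limits. The number $z_c$ is called the corresponding confidence coefficient or critical value.
The length of the confidence interval $(S - z_c\sigma_S,\ S + z_c\sigma_S)$, i.e. $2z_c\sigma_S$, is called the error in the confidence level.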
Z (%)    z_c       Z (%)    z_c
50       0.6745    90       1.645
55       0.7639    95       1.96
60       0.843     95.44    2
65       0.9259    96       2.05
68.26    1         97       2.195
70       1.041     98       2.33
75       1.15      99       2.58
80       1.277     99.5     2.81
85       1.445     99.74    3
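Critical values such as those tabulated above can also be computed from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the function names are illustrative. Computed values may differ from table entries in the last digits because tables are rounded.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def z_critical(confidence_percent):
    """Return z_c such that P(-z_c <= z <= z_c) = confidence_percent / 100."""
    return STANDARD_NORMAL.inv_cdf(0.5 + confidence_percent / 200)

def confidence_percent(z_c):
    """Return Z = 2 * phi(z_c) * 100, phi being the area from 0 to z_c."""
    phi = STANDARD_NORMAL.cdf(z_c) - 0.5
    return 2 * phi * 100

for level in (90, 95, 99):
    print(level, round(z_critical(level), 3))
print(round(confidence_percent(2), 2))
```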
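The confidence interval for the population mean is $\left(\bar{X} - z_c\frac{s}{\sqrt{N}},\ \bar{X} + z_c\frac{s}{\sqrt{N}}\right)$, where N is the sample size and s is the standard deviation.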
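1. A random sample of size N = 100 is taken from a population with standard deviation $\sigma = 5.1$. Given that the sample mean is $\bar{X} = 21.6$, obtain the 95% confidence interval for the population mean $\mu$.
Here N = 100, $\sigma = 5.1$, $\bar{X} = 21.6$, and $z_c = 1.96$ for 95% confidence, so the interval is
$21.6 \pm 1.96 \times \frac{5.1}{\sqrt{100}} = 21.6 \pm 0.9996$, i.e. approximately (20.60, 22.60).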
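A confidence interval of this kind can be computed with a short helper. This is a sketch under the assumption that the standard deviation is known (or the sample is large) so that the normal critical value applies; the function name `mean_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sigma, n, confidence_percent=95):
    """Confidence interval for the population mean: mean +/- z_c * sigma / sqrt(n)."""
    z_c = NormalDist().inv_cdf(0.5 + confidence_percent / 200)
    margin = z_c * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Data of the example above: n = 100, sigma = 5.1, sample mean = 21.6
low, high = mean_confidence_interval(21.6, 5.1, 100, 95)
print(round(low, 2), round(high, 2))
```

The helper uses the exact critical value (about 1.95996 for 95%) rather than the rounded 1.96, so results can differ from hand calculations in the third or fourth decimal place.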
Statistical Decision
Introduction:
For reaching statistical decisions, we start with some assumptions or guesses about the
populations involved. Such assumptions / guesses, which may or may not be true, are called
Statistical Hypotheses.
Example: i) Suppose we wish to reach the decision that a certain coin is biased (that is, the coin shows more heads than tails or vice versa). To reach this decision, we start with the hypothesis that the coin is fair (not biased), with the sole purpose of rejecting it at the end. This hypothesis is a null hypothesis.
ii) Consider the situation where the probability of an event is, say, 1/3 according to some hypothesis. If, for arriving at some decision, we make the hypothesis that the probability is, say, 1/4, then the hypothesis we have made is an alternative hypothesis.
Type I error:
In a hypothesis test, a type I error occurs when the null hypothesis is rejected when it is in
fact true; that is, H0 is wrongly rejected.
For example, in a clinical trial of a new drug, the null hypothesis might be that the new drug
is no better, on average, than the current drug; i.e.
H0: there is no difference between the two drugs on average.
A type I error would occur if we concluded that the two drugs produced different effects
when in fact there was no difference between them.
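The following table gives a summary of the possible results of any hypothesis test:
Decision            H0 is true        H0 is false
Reject H0           Type I error      Right decision
Do not reject H0    Right decision    Type II error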
If we do not reject the null hypothesis, it may still be false (a type II error) as the sample may
not be big enough to identify the falseness of the null hypothesis (especially if the truth is
very close to hypothesis).
For any given set of data, type I and type II errors are inversely related; the smaller the risk of
one, the higher the risk of the other.
In a hypothesis test, a type II error occurs when the null hypothesis H0, is not rejected when it
is in fact false. For example, in a clinical trial of a new drug, the null hypothesis might be that
the new drug is no better, on average, than the current drug; i.e.
H0: there is no difference between the two drugs on average.
A type II error would occur if it was concluded that the two drugs produced the same effect,
i.e. there is no difference between the two drugs on average, when in fact they produced
different ones.
The probability of a type II error is generally unknown, but is symbolised by β and written
P(type II error) = β
Levels of significance:
$z = \frac{S - \mu_S}{\sigma_S}$ ----------- (1)
is the standard normal variate associated with S, so that for the distribution of z the mean is zero and the standard deviation is 1.
Accordingly, for the distribution of z, the Z% confidence interval is $(-z_c, z_c)$. This means that we can be Z% confident that, if the hypothesis H is true, the value of z will lie between $-z_c$ and $z_c$. Equivalently, there is a (100 - Z)% chance that the hypothesis H is true but the value of z lies outside the interval $(-z_c, z_c)$. If we reject the hypothesis H on the grounds that the value of z lies outside the interval $(-z_c, z_c)$, we would be making a type I error, and the probability of making this error is (100 - Z)%. Here we say that the hypothesis is rejected at a (100 - Z)% level of significance. Thus the level of significance is the maximum probability of a type I error that we are prepared to risk when rejecting a hypothesis.
The value of the normal variate z determined by formula (1) is usually called the z-score of the statistic S. It is this score that determines the "fate" of a hypothesis H, and it is called the test statistic.
Rule of decision:
“Reject a hypothesis H at a (100 – Z)% level of significance if the z – score of the statistic S,
determined on the basis of H, is outside the interval (-z c , zc). Do not reject the hypothesis
otherwise”.
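The rule of decision can be sketched as a small Python function. The name `z_test` and its arguments are illustrative assumptions; the example numbers (statistic 48, hypothesised mean 50, standard error 1) are borrowed from the battery problem later in these notes, used here as a two-sided test purely for illustration.

```python
def z_test(statistic, mean_under_h, standard_error, z_c=1.96):
    """Apply the rule of decision.

    Returns the z-score of the statistic and True if the hypothesis H
    is rejected, i.e. if the z-score lies outside (-z_c, z_c).
    """
    z = (statistic - mean_under_h) / standard_error
    reject = abs(z) > z_c
    return z, reject

z, reject = z_test(48, 50, 1.0, z_c=1.96)   # 5% level of significance
print(z, reject)                             # -2.0 True
z, reject = z_test(48, 50, 1.0, z_c=2.58)   # 1% level of significance
print(z, reject)                             # -2.0 False
```

The same z-score leads to rejection at the 5% level but not at the 1% level, which shows how the decision depends on the chosen level of significance.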
Critical region: The set of values of the test statistic for which the hypothesis is rejected is known as the critical region.
Normally, the test statistic we consider follows the normal distribution. For the rule of decision above, the critical region consists of the two tails of the normal curve outside the interval $(-z_c, z_c)$.
Student's t-distribution
Properties of the t distribution
The t distribution has the following properties:
The mean of the distribution is equal to 0.
The variance (defined when there are more than 2 degrees of freedom) is always greater than 1, although it is close to 1 when there are many degrees of freedom. With infinite degrees of freedom, the t distribution is the same as the standard normal distribution.
γ $t_{0.01}(\gamma)$ $t_{0.05}(\gamma)$
1 63.66 12.71
2 9.92 4.30
3 5.84 3.18
4 4.60 2.78
5 4.03 2.57
6 3.71 2.45
7 3.50 2.36
8 3.36 2.31
9 3.25 2.26
10 3.17 2.23
11 3.11 2.20
12 3.06 2.18
13 3.01 2.16
14 2.98 2.14
15 2.95 2.13
16 2.92 2.12
17 2.90 2.11
18 2.88 2.10
19 2.86 2.09
20 2.84 2.09
21 2.83 2.08
22 2.82 2.07
23 2.81 2.07
24 2.80 2.06
25 2.79 2.06
26 2.78 2.06
27 2.77 2.05
28 2.76 2.05
29 2.76 2.04
30 2.75 2.04
Chi - square test
In practice, expected (theoretical) frequencies $e_k$ are computed on the basis of a hypothesis $H_0$ and compared with the observed frequencies $f_k$. If under this hypothesis the value of $\chi^2$ computed from the formula
$\chi^2 = \sum_{k=1}^{n} \frac{(f_k - e_k)^2}{e_k}$
is greater than some critical value $\chi_c^2$, we conclude that the observed frequencies differ significantly from the expected frequencies and reject $H_0$ at the corresponding level of significance c. Otherwise we accept it, or at least do not reject it. This procedure is called the chi-square ($\chi^2$) test of hypothesis or significance.
Generally, the chi-square test is employed by taking c = 0.05 or 0.01. Tables giving the values of χ 2c for different
values of v are available. Below table gives the value of χ 2c at c = 0.05 and 0.01 levels and for v=1,2,……10.
Table of values of χ 2c (v) for c = 0.05 and 0.01
v $\chi^2_{0.05}(v)$ $\chi^2_{0.01}(v)$
1 3.84 6.64
2 5.99 9.21
3 7.82 11.34
4 9.49 13.28
5 11.07 15.09
6 12.59 16.81
7 14.07 18.48
8 15.51 20.09
9 16.92 21.67
10 18.31 23.21
The number of degrees of freedom v is determined by the formula v = n - m. Here n is the number of frequency pairs $(f_i, e_i)$ used in the computation of $\chi^2$, and m is the number of quantities that are needed (and used) in the calculation of the expected frequencies $e_i$. If $N = \sum f_i$ is the only quantity used in the calculation of $e_i$, then m = 1, so that v = n - 1.
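The chi-square statistic and the degrees of freedom can be computed as in the following Python sketch. The function names are illustrative; the data are the weekly accident counts from problem 16 a) later in these notes, where every expected frequency is 10.

```python
def chi_square(observed, expected):
    """Sum of (f_k - e_k)^2 / e_k over all frequency pairs."""
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))

def degrees_of_freedom(n_pairs, n_quantities_used=1):
    """v = n - m, where m quantities were used to obtain the expected frequencies."""
    return n_pairs - n_quantities_used

observed = [12, 8, 20, 2, 14, 10, 15, 6, 9, 4]
expected = [sum(observed) / len(observed)] * len(observed)  # 10 per week

chi2 = chi_square(observed, expected)
v = degrees_of_freedom(len(observed))
print(round(chi2, 2), v)
print(chi2 > 16.92)  # compare with the tabulated critical value for v = 9 at c = 0.05
```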
Goodness of Fit
When a hypothesis $H_0$ is accepted (or not rejected) on the basis of the chi-square test, we say that the expected frequencies calculated on the basis of $H_0$ form a good fit for the given frequencies. When $H_0$ is rejected, we say that the corresponding expected frequencies do not form a good fit.
It has been mentioned that the sampling distribution of $\chi^2$ is approximately identical with the chi-square distribution when the expected frequencies are at least equal to 5. Therefore the chi-square test is applicable only if every expected frequency is ≥ 5. If some expected frequencies are less than 5, neighbouring classes are clubbed together so that no expected frequency is less than 5.
Sampling Distribution
Q.No Question
1 a) Explain the following
i) Null hypothesis
ii) Alternative hypothesis
iii) Type I and type II error
iv) Level of significance
v) Standard error
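b) A population has mean 75 and standard deviation 12.
a) Random samples of size 121 are taken. Find the mean and standard deviation of the sampling distribution of the sample mean.
b) How would the answers to part a) change if the size of the samples were 400 instead of 121?
a) n = 121
$\mu_{\bar{x}} = \mu = 75$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{121}} = \frac{12}{11} = 1.09$
b) n = 400
$\mu_{\bar{x}} = \mu = 75$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{400}} = \frac{12}{20} = 0.6$
So if the size of the samples is changed from 121 to 400, $\mu_{\bar{x}}$ is unchanged and $\sigma_{\bar{x}}$ decreases from 1.09 to 0.6.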
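Given $\mu = 5.75$, $\sigma = 1.02$.
For n = 81:
$\mu_{\bar{x}} = \mu = 5.75$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1.02}{9} = 0.1133$
For the second sample size, with $\sqrt{n} = 5$:
$\mu_{\bar{x}} = \mu = 5.75$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1.02}{5} = 0.204$
$\mu_{\bar{x}}$ remains the same, but $\sigma_{\bar{x}}$ changes from 0.1133 to 0.204.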
b) The weights of 1500 ball bearings are normally distributed with a mean of 635 gms and S.D of
1.36gms. If 300 random samples of size 36 are drawn from this population, determine the
expected mean and S.D of the sampling distribution of means if sampling is done a) with
replacement b) without replacement.
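Here N = 1500, $\mu = 635$, $\sigma = 1.36$, n = 36.
a) With replacement:
Expected mean $\mu_{\bar{x}} = \mu = 635$
$\sigma_{\bar{x}} = \sqrt{\frac{\sigma^2}{n}} = \frac{1.36}{6} = 0.227$
b) Without replacement:
Expected mean $\mu_{\bar{x}} = \mu = 635$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{(1.36)^2}{36}} \times \sqrt{\frac{1500-36}{1500-1}} = 0.224$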
i.
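ii. $\mu_{\bar{x}} = \mu$, where $\mu_{\bar{x}}$ is the mean of this distribution and $\mu$ is the population mean.
The population is 3, 7, 11, 15, so N = 4 and the population mean is $\mu = \frac{1}{4}(3 + 7 + 11 + 15) = 9$.
Population variance $\sigma^2 = \frac{1}{4}\{(3-9)^2 + (7-9)^2 + (11-9)^2 + (15-9)^2\} = 20$
a) Consider the samples of size 2 drawn with replacement. They are as follows:
(3,3),(3,7),(3,11),(3,15),
(7,3),(7,7),(7,11),(7,15),
(11,3),(11,7),(11,11),(11,15),
(15,3),(15,7),(15,11),(15,15)
The sample means and their frequencies are as follows:
x: 3 5 7 9 11 13 15
f: 1 2 3 4 3 2 1
$\mu_{\bar{x}} = \frac{\sum f_i x_i}{\sum f_i} = \frac{144}{16} = 9$
$\sigma_{\bar{x}}^2 = \frac{\sum f_i x_i^2}{\sum f_i} - (\mu_{\bar{x}})^2 = \frac{1456}{16} - 9^2 = 10$
Thus $\mu_{\bar{x}} = 9$ and $\sigma_{\bar{x}} = \sqrt{10}$.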
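b) Consider the samples of size 2 drawn without replacement. They are as follows:
(3,7),(3,11),(3,15),(7,11),(7,15),(11,15)
The sample means are 5, 7, 9, 9, 11, 13.
$\mu_{\bar{x}} = \frac{1}{6}(5+7+9+9+11+13) = 9$, so $\mu_{\bar{x}} = \mu$.
$\sigma_{\bar{x}}^2 = \frac{1}{6}\{(5-9)^2 + (7-9)^2 + (9-9)^2 + (9-9)^2 + (11-9)^2 + (13-9)^2\} = \frac{40}{6} = \frac{20}{3}$
Now consider
$\frac{\sigma^2}{n}\left[\frac{N-n}{N-1}\right] = \frac{20}{2}\left[\frac{4-2}{4-1}\right] = \frac{20}{3}$
Hence $\frac{\sigma^2}{n}\left[\frac{N-n}{N-1}\right] = \sigma_{\bar{x}}^2$.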
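The results of this problem can be verified by enumerating every sample. The sketch below assumes the population 3, 7, 11, 15 used in the solution and exact rational arithmetic via `fractions.Fraction`; the helper name `mean_and_variance` is illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

def mean_and_variance(values):
    """Mean and variance (dividing by the number of values), computed exactly."""
    values = [Fraction(v) for v in values]
    mu = sum(values) / len(values)
    variance = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, variance

population = [3, 7, 11, 15]
n = 2
N = len(population)

mu, sigma2 = mean_and_variance(population)

# Sample means for all samples with and without replacement.
means_with = [Fraction(sum(s), n) for s in product(population, repeat=n)]
means_without = [Fraction(sum(s), n) for s in combinations(population, n)]

mu_w, var_w = mean_and_variance(means_with)
mu_wo, var_wo = mean_and_variance(means_without)

print(mu, sigma2)      # 9 20
print(mu_w, var_w)     # 9 10
print(mu_wo, var_wo)   # 9 20/3
print(var_w == sigma2 / n)
print(var_wo == sigma2 / n * Fraction(N - n, N - 1))
```

Both checks print True, confirming $\sigma_{\bar{x}}^2 = \sigma^2/n$ with replacement and the finite population correction without replacement.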
b) Certain tubes manufactured by a company have mean life time of 800 hours and S.D of
60hours. Find the probability that a random sample of 16 tubes from the group will have a
mean life time a) between 790 hours and 810 hours b) less than 785 hours c) more than 820
hours d) between 770 hours and 830 hours.
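Here $\mu = 800$, $\sigma = 60$, n = 16, so $\mu_{\bar{x}} = 800$ and $\sigma_{\bar{x}} = \frac{60}{\sqrt{16}} = 15$.
a) $P(790 < \bar{x} < 810) = P\left(\frac{790-800}{15} < z < \frac{810-800}{15}\right) = P(-0.67 < z < 0.67) = 2P(0 < z < 0.67) = 0.4972$
b) $P(\bar{x} < 785) = P\left(z < \frac{785-800}{15}\right) = P(z < -1) = P(z > 1) = 0.1587$
c) $P(\bar{x} > 820) = P\left(z > \frac{820-800}{15}\right) = P(z > 1.33) = 0.0918$
d) $P(770 < \bar{x} < 830) = P\left(\frac{770-800}{15} < z < \frac{830-800}{15}\right) = P(-2 < z < 2) = 2P(0 < z < 2) = 2(0.4772) = 0.9544$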
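Probabilities for a sample mean can be checked numerically. This is a sketch assuming the sample mean is normally distributed with mean $\mu$ and standard error $\sigma/\sqrt{n}$; the function name `sample_mean_probability` is hypothetical. Because the z-scores are not rounded to two decimals, the results can differ slightly from values read off a normal table.

```python
from math import sqrt
from statistics import NormalDist

def sample_mean_probability(mu, sigma, n, low=None, high=None):
    """P(low < sample mean < high); a bound of None means unbounded on that side."""
    dist = NormalDist(mu, sigma / sqrt(n))
    lower = dist.cdf(low) if low is not None else 0.0
    upper = dist.cdf(high) if high is not None else 1.0
    return upper - lower

# Tube lifetimes: mu = 800 hours, sigma = 60 hours, n = 16
print(round(sample_mean_probability(800, 60, 16, 790, 810), 4))
print(round(sample_mean_probability(800, 60, 16, high=785), 4))
print(round(sample_mean_probability(800, 60, 16, low=820), 4))
print(round(sample_mean_probability(800, 60, 16, 770, 830), 4))
```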
4 a) A prototype automotive tire has a design life of 38500 miles with S.D. of 2500 miles. Five such
tires are manufactured and tested. On the assumption that the actual population S.D. is 2500
miles, find the probability that the sample mean will be less than 36000 miles. Assume that the
distribution of lifetimes of such tires is normal.
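Working in thousands of miles, $\mu = 38.5$, $\sigma = 2.5$ and n = 5, so
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{5}} = 1.11803$ thousand miles.
Since the lifetimes are normally distributed, so is the sample mean, and
$P(\bar{x} < 36) = P\left(z < \frac{36 - 38.5}{1.11803}\right) = P(z < -2.24) = 0.0125$
That is, if the tires perform as designed, there is only about a 1.25% chance that the average of a sample of this size would be so low.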
b) An automobile battery manufacturer claims that its midgrade battery has a mean life of 50
months with a S.D. of 6 months. Suppose the distribution of battery lives of this particular
brand is approximately normal. a) On the assumption that the manufacturer claims are true,
find the probability that a randomly selected battery of this type will last less than 48 months.
b) On the same assumption, find the probability that mean of a random sample of 36 such
batteries will be less than 48 months.
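a) Let x be the life of a battery. Then $x \sim N(\mu, \sigma^2)$ with $\mu = 50$ and $\sigma = 6$, so $z = \frac{x - \mu}{\sigma} \sim N(0, 1)$.
$P(x < 48) = P\left(z < \frac{48 - \mu}{\sigma}\right) = P\left(z < \frac{48 - 50}{6}\right) = P(z < -0.33) = 0.3707$
b) n = sample size = 36
Therefore the mean of the sample mean is $\mu_{\bar{x}} = \mu = 50$ and
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{6}{\sqrt{36}} = 1$, so $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} \sim N(0, 1)$.
$P(\bar{x} < 48) = P\left(z < \frac{48 - 50}{1}\right) = P(z < -2) = 0.0228$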
5 a) The weights of 1500 ball bearings are normally distributed with a mean of 635 gms and S.D. of
1.36 gms. If 300 random samples of size 36 are drawn from this population. In the case of
random sampling with replacement, find how many random samples would have their mean
a)between 634.76gms and 635.24 gms, b) greater than 635.6 gms, c)less than 634.5 gms or
more than 635.24 gms
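X = weight of a ball bearing
N = population size = 1500
$\mu = 635$ gm
$\sigma = 1.36$ gm
Number of random samples = 300
n = sample size = 36
$X \sim N(\mu, \sigma^2)$, so $\bar{x} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}}^2)$ with
$\mu_{\bar{x}} = \mu = 635$
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1.36}{6} = 0.2267$
$z = \frac{\bar{x} - 635}{0.2267} \sim N(0, 1)$
a) Number of samples with mean between 634.76 g and 635.24 g
= 300 × P[634.76 < $\bar{x}$ < 635.24]
= 300 × P[-1.06 < z < 1.06]
= 300 × 2(0.3554)
= 300 × 0.7108
= 213.24
≈ 213
b) Number of samples with mean greater than 635.6 g
= 300 × P[$\bar{x}$ > 635.6]
= 300 × P[z > 2.6467]
= 300 × [0.5 - 0.4960]
= 300 × 0.004
= 1.2
≈ 1
c) Number of samples with mean less than 634.5 g or more than 635.24 g
= 300 × [P[z < -2.21] + P[z > 1.06]]
= 300 × [0.0136 + 0.1446]
= 300 × 0.1582
= 47.46
≈ 47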
b) 500 ball bearings have a mean weight of 142.30 gms and S.D. of 8.5 gms. Find the probability
that a random sample of 100 ball bearings chosen from this group will have a combined weight
a) between 14061 and 14175 gms b) more than 14460 gms
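Here n = 100 and $\mu_{\bar{x}} = \mu = 142.30$ gms. The solution takes $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{8.5}{\sqrt{100}} = 0.85$ gms, i.e. the finite population correction is not applied.
i) P[the combined weight of the sample lies between 14061 gms and 14175 gms]
= P[14061/n < $\bar{x}$ < 14175/n]
= P[140.61 < $\bar{x}$ < 141.75]
= P[-1.988 < Z < -0.647]
= 0.2345
ii) P[the combined weight of the sample is more than 14460 gms]
= P[$\bar{x}$ > 144.6]
= P[Z > 2.71]
= 0.0034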
6 a) The mean and S.D of the maximum loads supported by 60 cables are 11.09 tonnes and 0.73
tonnes respectively. Find a) 95% b) 99% confidence limits for mean of the maximum loads of
all cables by the company.
6 a) By data, let X be the maximum load supported by a cable. Then
$\bar{x} = 11.09$, $\sigma = 0.73$, n = 60
a) 95% confidence limits for the mean of the maximum loads are given by
$\bar{x} \pm 1.96\,(\sigma/\sqrt{n})$
$= 11.09 \pm 1.96\,(0.73/\sqrt{60})$
$= 11.09 \pm 0.18$
The limits are 10.91 tonnes and 11.27 tonnes.
b) 99% confidence limits for the mean of the maximum loads are given by
$\bar{x} \pm 2.58\,(\sigma/\sqrt{n})$
$= 11.09 \pm 2.58\,(0.73/\sqrt{60})$
$= 11.09 \pm 0.24$
The limits are 10.85 tonnes and 11.33 tonnes.
b) A sample of 900 men is found to have a mean height of 64inch. If this sample has been
drawn from a normal population with standard deviation 20 inch, find the 99% confidence
limits for the mean height of the men in the population.
6 b) Given n = 900
Let x be the height of a man.
$\bar{x} = 64$ inch, $\sigma = 20$ inch
99% confidence limits for the mean are given by
$\bar{x} \pm 2.58\,(\sigma/\sqrt{n})$
$= 64 \pm 2.58\,(20/30)$
$= 64 \pm 2.58\,(0.6667)$
$= 64 \pm 1.720086$
The limits are 62.279914 and 65.720086.
7 a) A sample of 5000 students in a college was taken and their average weight was found to be 62.5 kg with a standard deviation of 22 kg. Find the 95% confidence limits of the average weight of the students in the entire University.
7 a) n = 5000, $\bar{x}$ = average weight = 62.5 kg
$\sigma = 22$ kg
95% confidence limits for the mean weight are given by
$\bar{x} \pm 1.96\,(22/\sqrt{n})$
$= 62.5 \pm 1.96\,(22/\sqrt{5000})$
$= 62.5 \pm 0.6098$
The limits are 61.8902 and 63.1098.
b) Systolic blood pressure of 566 males was taken. Mean BP was found to be 128.8mm and SD
13.05mm. Find 95% confidence limits of BP within which the populations mean would lie.
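7 b) Let x be the systolic blood pressure.
n = 566, $\bar{x} = 128.8$ mm, $\sigma = 13.05$ mm
95% confidence limits for the mean are given by
$\bar{x} \pm 1.96\,(\sigma/\sqrt{n})$
$= 128.8 \pm 1.96\,(13.05/\sqrt{566})$
$= 128.8 \pm 1.07506$
The limits are 127.72494 and 129.87506.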
8 a) Standard deviation of blood sugar level in a population is 6 mg%. If population mean is not
known, within what limits is it likely to lie if a random sample of 100 has a mean of 80mg%?
b) To know the mean weights of all 10 year old boys in Delhi a sample of 225 was taken. The
mean weight of the sample was found to be 67 pounds with s.d. of 12 pounds. What can we
infer about the mean weight of the population?
8 b) x: weight of a 10-year-old boy in Delhi
n = 225, $\bar{x} = 67$ pounds, $\sigma = 12$ pounds, so $\sigma/\sqrt{n} = 12/15 = 0.8$
95% confidence limits for the population mean are $\bar{x} \pm 1.96\,(\sigma/\sqrt{n})$
= 67 ± 1.96 (0.8)
= 67 ± 1.568
The limits are 65.432 pounds and 68.568 pounds.
We can infer that the population mean weight lies in (65.432, 68.568) pounds with 95% confidence.
99% confidence limits for the population mean are
$\bar{x} \pm 2.58\,(\sigma/\sqrt{n})$
= 67 ± 2.58 (0.8)
= 67 ± 2.064
The limits are 64.936 and 69.064 pounds.
We can infer that the population mean weight lies in (64.936, 69.064) pounds with 99% confidence.
9 a) The mean and S.D of the diameters of a sample of 250 rivet heads manufactured by a company
are 7.2642 mm and 0.0058mm respectively. Find (a) 99% (b) 95% confidence limits for the
mean diameter of all the rivet heads manufactured by the company.
X: diameter of a rivet head
n = 250, $\bar{x} = 7.2642$ mm, $\sigma = 0.0058$ mm
b) Spring break can be a very expensive holiday. A sample of 80 students is surveyed, and the
average amount spent by students on travel and beverages is $593.84. The sample standard
deviation is approximately $369.34.Construct a 95% confidence interval for the population
mean amount of money spent by spring breakers.
10 a) 400 items are sampled from a normally distributed population with a sample mean x¯ of 22.1
and a population standard deviation(σ) of 12.8. Construct a 95% confidence interval for the
true population mean.
n = 400, $\bar{x} = 22.1$, $\sigma = 12.8$
95% confidence limits for the population mean are
$\bar{x} \pm 1.96\,(\sigma/\sqrt{n})$
$= 22.1 \pm 1.96\,(12.8/\sqrt{400})$
$= 22.1 \pm 1.2544$
The limits are 20.8456 and 23.3544.
b) The mean and S.D. marks of a sample of 100 students are 67.45 and 2.92 respectively. Find
(a) 95% (b) 99% confidence intervals for estimating the marks of the population.
11 a) A machine is expected to produce nails of length 3 inches. A random sample of 25 nails gave an average length of 3.1 inch with standard deviation 0.3. Can it be said that the machine is producing nails as per specification? ($t_{0.05}$ for 24 d.f. is 2.064)
$\mu = 3$ inch, $\bar{x} = 3.1$ inch, s = 0.3, n = 25
$t = \frac{\bar{x} - \mu}{s}\sqrt{n} = \frac{0.1}{0.3}\sqrt{25} = 1.67 < 2.064 = t_{0.05}$, 24 d.f.
Thus the hypothesis that the machine is producing nails as per specification is accepted at the 5% level of significance.
b) Ten individuals are chosen at random from a population and their heights in inches are found to
be 63,63,66,67,68,69,70,70,71,71. Test the hypothesis that the mean height of the universe is
66 inches. ( t0.05=2.262 for 9 d.f.)
= $t_{0.05}$
The hypothesis is rejected; that is, with 95% confidence we can say that the stimulus is, in general, accompanied by an increase in blood pressure.
b) A machinist is making engine parts with axle diameter of 0.7 inch. A random sample of 10
parts shows mean diameter 0.742 inch with a standard deviation of 0.04 inch. On the basis of
this sample, would you say that the work is inferior? ( t0.05=2.262 for 9 d.f.)
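x: axle diameter
n = 10
$\bar{x} = 0.742$
s = 0.04
$\mu = 0.7$
$t = \frac{\bar{x} - \mu}{s}\sqrt{n} = \frac{0.042}{0.04}\sqrt{10} = 3.3204 > 2.262 = t_{0.05}$, 9 d.f.
The hypothesis that the mean diameter is 0.7 inch is rejected; on the basis of the sample we can say that the work is inferior.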
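13 a) Show that 95% confidence limits for the mean $\mu$ of the population are $\bar{x} \pm t_{0.05}\frac{\sigma_s}{\sqrt{n}}$.
With 95% confidence, $|t| \le t_{0.05}$
$\Rightarrow \left|\frac{\bar{x} - \mu}{\sigma_s}\sqrt{n}\right| \le t_{0.05}$
$\Rightarrow -t_{0.05} \le \frac{\bar{x} - \mu}{\sigma_s}\sqrt{n} \le t_{0.05}$
$\Rightarrow -\frac{\sigma_s}{\sqrt{n}}\,t_{0.05} \le \bar{x} - \mu \le \frac{\sigma_s}{\sqrt{n}}\,t_{0.05}$
$\Rightarrow \bar{x} - \frac{\sigma_s}{\sqrt{n}}\,t_{0.05} \le \mu \le \bar{x} + \frac{\sigma_s}{\sqrt{n}}\,t_{0.05}$
∴ 95% confidence limits of $\mu$ are $\bar{x} \pm \frac{\sigma_s}{\sqrt{n}}\,t_{0.05}$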
b) A random sample of 10 measurements of the diameter of a sphere gave a mean of 12 cm and
standard deviation 0.15 cm. Find 95% confidence limits for the actual diameter. ( t 0.05=2.262
for 9 d.f.)
13 b)
X = diameter of the sphere
n = 10; $\bar{x} = 12$ cm; S = 0.15 cm
∴ 95% confidence limits are
$\bar{x} \pm t_{0.05}\frac{S}{\sqrt{n}}$
i.e. $12 \pm (2.262)\frac{0.15}{\sqrt{10}}$
i.e. 12 ± 0.1073
i.e. 11.8927, 12.1073
14 a) A random sample of 10 boys had the following I.Q.: 70, 120, 110, 101, 88, 83,95, 98, 107,100.
Do these data support the assumption of a population mean I.Q. of 100 at 5% level of
significance (t0.05=2.262 for 9 d.f.)
14 a)
X = IQ of a boy
$\bar{X}$ = (70 + 120 + 110 + 101 + 88 + 83 + 95 + 98 + 107 + 100) / 10
= 972 / 10
= 97.2
$S^2 = \frac{1}{9}\sum (x_i - \bar{X})^2 = \frac{1833.6}{9}$
= 203.7333
∴ S = 14.2735
∴ $|t| = \frac{|\bar{X} - \mu|}{S}\sqrt{n} = \frac{2.8}{14.2735}\sqrt{10} = 0.6203 < 2.262$
∴ The hypothesis that the population mean IQ is 100 is accepted at the 5% level of significance.
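The t statistic in these problems follows the convention $t = \frac{\bar{x} - \mu}{s}\sqrt{n}$ with $s^2$ computed using the divisor n - 1. A minimal Python sketch under that assumption is given below; the function name `t_statistic` is illustrative, and the critical value 2.262 is taken from the problem statement rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

def t_statistic(data, mu0):
    """t = (sample mean - mu0) / s * sqrt(n), with s using the divisor n - 1."""
    n = len(data)
    return (mean(data) - mu0) / stdev(data) * sqrt(n)

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
t = t_statistic(iq, 100)
t_critical = 2.262          # t_0.05 for 9 d.f., as given in the problem

print(round(mean(iq), 1), round(stdev(iq), 4), round(abs(t), 4))
print("reject" if abs(t) > t_critical else "do not reject")
```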
b) A random sample of size 25 from a normal population has the mean 47.5 and s.d. 8.4. Does this information refute the claim that the mean of the population is 42.1? ($t_{0.05}$ = 2.064 for 24 d.f.)
14 b)
n = 25; $\bar{X} = 47.5$; S = 8.4
$\mu = 42.1$
∴ $t = \frac{\bar{X} - \mu}{s}\sqrt{n}$
$= \frac{47.5 - 42.1}{8.4}\sqrt{25}$
= 45 / 14
= 3.2143 > 2.064 = $t_{0.05}$; d.f. = 24
∴ The hypothesis is rejected at the 5% level of significance.
15 a) A process for making certain bearings is under control if the diameter of the bearings have the
mean 0.5 cm. What can we say about this process if a sample of 10 of these bearings has a
mean diameter of 0.506 cm. and S.D. of 0.004cm?( t0.05=2.262 for 9 d.f.)
15 a)
X = diameter of a bearing
$\mu = 0.5$; n = 10; $\bar{X} = 0.506$; s = 0.004
∴ $t = \frac{\bar{X} - \mu}{s}\sqrt{n}$
$= \frac{0.006}{0.004}\sqrt{10}$
= 4.7434 > 2.262 = $t_{0.05}$, 9 d.f.
∴ The hypothesis is rejected at the 5% level of significance, i.e. the process is not under control.
16 a) The number of automobile accidents per week in a certain community are as follows:
12,8,20,2,14,10,15,6,9,4. Are these frequencies in agreement with the belief that the accident
conditions were the same during this 10 week period ?
16 a)
Total number of accidents in the 10 week period = 100
∴ number of accidents expected per week = 100/10 = 10
Oi = 12 8 20 2 14 10 15 6 9 4
Ei = 10 10 10 10 10 10 10 10 10 10
∴ $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$
$= \frac{1}{10}[4 + 4 + 100 + 64 + 16 + 0 + 25 + 16 + 1 + 36]$
$= \frac{266}{10}$
= 26.6 > 16.92 = $\chi^2_{0.05}$; 9 d.f.
∴ The hypothesis that the accident conditions were the same during this 10 week period is rejected at the 5% level of significance.
b) A sample analysis of examination results of 500 students was made. It was found that 220
students had failed, 170 had secured third class 90 had secured second class and 20 had secured
first class. Do these figures support the general examination result which is in the ratio 4:3:2:1
for the respective categories. (χ2 0.05=7.81 for 3 d.f.)
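16 b)
Let us take the hypothesis that these figures support the general result in the ratio 4:3:2:1.
The expected frequencies are
$\frac{4}{10} \times 500$; $\frac{3}{10} \times 500$; $\frac{2}{10} \times 500$; $\frac{1}{10} \times 500$
i.e. 200; 150; 100; 50
∴ We have
Observed frequency Oi: 220 170 90 20
Expected frequency Ei: 200 150 100 50
$\chi^2 = \frac{20^2}{200} + \frac{20^2}{150} + \frac{10^2}{100} + \frac{30^2}{50} = 2 + 2.667 + 1 + 18 = 23.667 > 7.81 = \chi^2_{0.05}$, 3 d.f.
∴ The hypothesis is rejected at the 5% level of significance; that is, the figures do not support the general result in the ratio 4:3:2:1.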
17 a) The following figures show the distribution of digits in numbers chosen at random from a
telephone directory. Test whether the digits may be taken to occur equally frequently in the
directory.(χ2 0.05=16.92 for 9 d.f.)
Digits 0 1 2 3 4 5 6 7 8 9
Frequency 1026 1107 997 966 1075 933 1107 972 964 853
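17 a)
Digits 0 1 2 3 4 5 6 7 8 9
Frequency 1026 1107 997 966 1075 933 1107 972 964 853
The total frequency is 10000, so under the hypothesis that the digits occur equally frequently each expected frequency is Ei = 10000/10 = 1000.
∴ $\chi^2 = \sum \frac{(E_i - O_i)^2}{E_i}$
= 58542 / 1000
= 58.542 > $\chi^2_{0.05}$, 9 d.f. = 16.92
∴ The hypothesis is rejected at the 5% level of significance.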
b) Fit a Poisson distribution for the following data and test the goodness of fit given that
(χ2 0.05=7.81 for 3 d.f.)
x 0 1 2 3 4
f 122 60 15 2 1
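17 b)
x 0 1 2 3 4
f 122 60 15 2 1
$\mu = \frac{\sum f_i x_i}{\sum f_i} = \frac{0 + 60 + 30 + 6 + 4}{200} = 0.5$
∴ $P(x) = \frac{m^x e^{-m}}{x!}$ with m = 0.5
$f(x) = 200 \times P(x) = \frac{121.3 \times (0.5)^x}{x!}$
∴ Oi = 122 60 15 2 + 1 = 3
Ei = 121 61 15 3
$\chi^2 = \frac{1}{121} + \frac{1}{61} + 0 + 0 = 0.025 < 7.815 = \chi^2_{0.05}$
∴ The Poisson distribution is a good fit.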
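The expected frequencies of a fitted Poisson distribution can be generated as in the following sketch. The function name `poisson_expected` is an assumption for illustration; clubbing of small classes and rounding are left to the reader, as in the worked solution.

```python
from math import exp, factorial

def poisson_expected(xs, freqs):
    """Fit a Poisson distribution by its mean and return (m, expected frequencies)."""
    total = sum(freqs)
    m = sum(x * f for x, f in zip(xs, freqs)) / total
    expected = [total * exp(-m) * m ** x / factorial(x) for x in xs]
    return m, expected

xs = [0, 1, 2, 3, 4]
freqs = [122, 60, 15, 2, 1]
m, expected = poisson_expected(xs, freqs)

print(m)
print([round(e, 2) for e in expected])
```

The unrounded expected frequencies do not sum exactly to the total, because the Poisson probabilities for values of x beyond the last class are not included.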
18 a)
x 0 1 2 3 4 5
f 173 168 37 18 3 1
We have $m = \frac{\sum f_i x_i}{\sum f_i} = \frac{313}{400} = 0.7825$
$P(x) = \frac{m^x e^{-m}}{x!} = \frac{(0.7825)^x e^{-0.7825}}{x!}$
∴ $f(x) = 400 \times P(x) = \frac{(182.9)(0.7825)^x}{x!}$
∴ f(0) = 183
f(1) = 143
f(2) = 56
f(3) = 15
f(4) = 3
f(5) = 0
∵ the last expected frequency is 0, we club it with the previous one.
∴ Oi 173 168 37 18 (3 + 1 = 4)
Ei 183 143 56 15 (3 + 0 = 3)
∴ $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 12.297$
= 12.3 > 9.49 = $\chi^2_{0.05}$
∴ At the 5% level of significance we say that the Poisson fit is not good.
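18 b)
Round & yellow Wrinkled & yellow Round & green Wrinkled & green Total
315 101 108 32 556
With expected frequencies Ei = 313, 104, 104, 35:
∴ $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$
$= \frac{2^2}{313} + \frac{3^2}{104} + \frac{4^2}{104} + \frac{3^2}{35}$
= 0.0128 + 0.0865 + 0.1538 + 0.257
≈ 0.510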
19 a) 200 digits were chosen at random from a set of tables. The frequencies of the digits are shown below. Use the chi-square test to assess the correctness of the hypothesis that the digits were distributed in equal number in the tables from which these were chosen. ($\chi^2_{0.05}$ = 16.92 for 9 d.f.)
Digit 0 1 2 3 4 5 6 7 8 9
Frequency 18 19 23 21 16 25 22 20 21 15
19 a)
No. of digits = 10, so each expected frequency is 200/10 = 20.
Digit 0 1 2 3 4 5 6 7 8 9
Oi 18 19 23 21 16 25 22 20 21 15
Ei 20 20 20 20 20 20 20 20 20 20
(Oi - Ei)² 4 1 9 1 16 25 4 0 1 25
$\chi^2 = \sum_{i=1}^{10} \frac{(O_i - E_i)^2}{E_i} = \frac{86}{20} = 4.3 < 16.92 = \chi^2_{0.05}$, 9 d.f.
∴ The hypothesis is accepted, i.e. the digits were distributed in equal number in the tables from which these were chosen.
19 b)
x 0 1 2 3 4
fi 419 352 154 56 19
x fi 0 352 308 168 76
∴ $m = \frac{\sum f_i x_i}{\sum f_i} = \frac{904}{1000} = 0.904$
∴ $e^{-m} = 0.4049$
∴ $P(x_i) = \frac{e^{-m} m^{x_i}}{x_i!}$
∴ $E_i = 1000 \times P(x_i)$
∴ $\chi^2 = \sum \frac{(E_i - O_i)^2}{E_i}$
you say that the dice are fair on the basis of the chi square test at 0.05 level of significance?.(χ2
Sum 2 3 4 5 6 7 8 9 10 11 12
Frequency 8 24 35 37 44 65 51 42 26 14 14
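20 a)
When a pair of dice is thrown, each sample point has the same probability, $\frac{1}{36}$.
P[Sum = 2] = P(1,1) = $\frac{1}{36}$
P[Sum = 3] = P(1,2) + P(2,1) = $\frac{2}{36}$
P[Sum = 4] = P(1,3) + P(3,1) + P(2,2) = $\frac{3}{36}$
Similarly, P[Sum = 5] = $\frac{4}{36}$, P[Sum = 6] = $\frac{5}{36}$, P[Sum = 7] = $\frac{6}{36}$, P[Sum = 8] = $\frac{5}{36}$,
P[Sum = 9] = $\frac{4}{36}$, P[Sum = 10] = $\frac{3}{36}$, P[Sum = 11] = $\frac{2}{36}$, P[Sum = 12] = $\frac{1}{36}$.
The total frequency is 360, so each expected frequency is 360 times the corresponding probability.
∴ We have
Sum 2 3 4 5 6 7 8 9 10 11 12
Oi 8 24 35 37 44 65 51 42 26 14 14
Ei 10 20 30 40 50 60 50 40 30 20 10
∴ $\chi^2 = \frac{4}{10} + \frac{16}{20} + \frac{25}{30} + \frac{9}{40} + \frac{36}{50} + \frac{25}{60} + \frac{1}{50} + \frac{4}{40} + \frac{16}{30} + \frac{36}{20} + \frac{16}{10} \approx 7.45$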
Fit a Poisson distribution to the data and test the goodness of fit.
x 0 1 2 3 4 5 6 7
f0 305 365 210 80 28 9 2 1
20 b)
x 0 1 2 3 4 5 6 7
f0 305 365 210 80 28 9 2 1
x f0 0 365 420 240 112 45 12 7
∴ $m = \frac{\sum x f}{\sum f} = \frac{1201}{1000} = 1.201$
∴ $P(x) = \frac{e^{-m} m^x}{x!} = \frac{0.3009\, m^x}{x!}$
x 0 1 2 3 4 5 6 7
Ei = 1000 × P(x) 300.9 361.37 217.004 86.87 26.08 6.26 1.25 0.2
Rounded ≃301 ≃361 ≃217 ≃87 ≃26 ≃6 ≃1 ≃0
∵ the last expected value is 0, we add it to the previous one.
So that the expected frequencies total 1000, we adjust the first and last values, taking f(0) = 301.5 and f(6) + f(7) = 1.5.
∴ We have,
Question Number    Question
1. Standard error of the sample mean, with µ as the population mean, σ as the standard deviation and sample size n, is given by
(a) σ (b) σ/n (c) σ/√n (d) none of these
2. A hypothesis is true, but is rejected. Then this is an error of type
(a) II (b) I (c) both I & II (d) none of these
3. A hypothesis is false but accepted, then there is an error of type
(a) II (b) I (c) both I & II (d) none of these
4. The finite population correction factor is
(a) (n-N)/(N-1) (b) (N-n)/(N-1) (c) (N-1)/(N-n) (d) none of these
5. The probability distribution of a statistic is called ______ distribution.
(a)Normal (b)Binomial (c) Sampling (d) none of these
6. Sample is a subset of
(a)Data (b) group (c) population (d) distribution
7. Any numerical value computed from population is called
(a)Statistic (b) bias (c) sampling error (d) parameter
8. In a random sampling, the probability of selecting an item from the population is
(a) Unknown (b ) undecided (c) known (d) zero
9. In sampling with replacement ,a sampling unit can be selected
(a )Only once (b) more than once (c) less than once (d) none of above
10. Sampling in which a sampling unit cannot be repeated more than once is called
(a) Sampling without replacement (b) simple sampling
(c) Sampling with replacement (d) none of above