Chapter 5 - Sample Statistics
DEFINITIONS :
• A random sample consists of independent, identically distributed random variables
X1 , X2 , · · · , Xn .
• A statistic is a function of X1 , X2 , · · · , Xn .
EXAMPLES :
( to be discussed in detail · · · )
• The sample standard deviation S = √(S²) .
For a random sample
X1 , X2 , · · · , Xn ,
• The sample range : the difference between the largest and the
smallest observation.
EXAMPLE : For the 8 observations
Sample mean :
X̄ = (1/8) ( − 0.737 + 0.511 − 0.083 + 0.066 − 0.562 − 0.906 + 0.358 + 0.359 ) = − 0.124 .
Sample variance :
S² = (1/8) { (−0.737 − X̄)² + (0.511 − X̄)² + (−0.083 − X̄)² + (0.066 − X̄)²
+ (−0.562 − X̄)² + (−0.906 − X̄)² + (0.358 − X̄)² + (0.359 − X̄)² } = 0.26 .
Sample standard deviation : S = √0.26 = 0.51 .
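These computations are easy to check numerically; a minimal Python sketch (not part of the original notes):

```python
# The 8 observations from the example above.
data = [-0.737, 0.511, -0.083, 0.066, -0.562, -0.906, 0.358, 0.359]
n = len(data)

mean = sum(data) / n                           # sample mean X-bar = -0.124
var = sum((x - mean) ** 2 for x in data) / n   # sample variance S^2 = 0.26 (divisor n)
std = var ** 0.5                               # sample standard deviation S = 0.51

print(round(mean, 3), round(var, 2), round(std, 2))
```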
EXAMPLE : ( continued · · · )
we also have
The Sample Mean
Suppose the population mean and standard deviation are µ and σ .
How well does the sample mean approximate the population mean ?
By the Central Limit Theorem
(X̄ − µ) / (σ/√n)
is approximately standard normal for large n , so that
P ( | (X̄ − µ) / (σ/√n) | ≤ z ) ≅ 1 − 2 Φ(−z) .
It follows that
P ( | (X̄ − µ) / (σ/√n) | ≤ z ) = P ( | X̄ − µ | ≤ σz/√n )
= P ( µ ∈ [ X̄ − σz/√n , X̄ + σz/√n ] )
≅ 1 − 2 Φ(−z) ,
We found : P ( µ ∈ [ X̄ − σz/√n , X̄ + σz/√n ] ) ≅ 1 − 2 Φ(−z) .
EXERCISE :
As in the preceding example, µ is unknown, σ = 3 , X̄ = 4.5 .
Use the formula
P ( µ ∈ [ X̄ − σz/√n , X̄ + σz/√n ] ) ≅ 1 − 2 Φ(−z) ,
to determine
• The 50 % confidence interval estimate of µ when n = 25 .
• The 50 % confidence interval estimate of µ when n = 100 .
• The 95 % confidence interval estimate of µ when n = 100 .
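A sketch of this computation in Python; `statistics.NormalDist` supplies Φ and its inverse, so z can be solved from 1 − 2 Φ(−z) = level :

```python
from statistics import NormalDist

def ci(xbar, sigma, n, level):
    """Interval [xbar - z*sigma/sqrt(n), xbar + z*sigma/sqrt(n)] with
    P(mu in interval) ~ level, i.e. Phi(-z) = (1 - level)/2."""
    z = -NormalDist().inv_cdf((1 - level) / 2)
    half = z * sigma / n ** 0.5
    return (xbar - half, xbar + half)

print(ci(4.5, 3, 25, 0.50))    # 50% interval, n = 25
print(ci(4.5, 3, 100, 0.50))   # 50% interval, n = 100
print(ci(4.5, 3, 100, 0.95))   # 95% interval, n = 100
```

Note that the 95 % interval is wider than the 50 % interval, and that quadrupling n halves the width.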
The Sample Variance
We defined the sample variance as
S² ≡ (1/n) Σ_{k=1}^n (Xk − X̄)² = Σ_{k=1}^n [ (Xk − X̄)² · (1/n) ] .
We have just argued that the sample variance
S² ≡ (1/n) Σ_{k=1}^n (Xk − X̄)² ,
Nevertheless, we will show that for large n their values are close !
FACT 1 : We (obviously) have that
X̄ = (1/n) Σ_{k=1}^n Xk implies Σ_{k=1}^n Xk = nX̄ .
FACT 2 : From
σ² ≡ Var(X) ≡ E[(X − µ)²] = E[X²] − µ² ,
we (obviously) have
E[X²] = σ² + µ² .
FACT 3 : The sample mean X̄ has mean µX̄ = µ and variance σ²X̄ = σ²/n .
FACT 4 : ( Useful for computing S² efficiently ) :
S² ≡ (1/n) Σ_{k=1}^n (Xk − X̄)² = [ (1/n) Σ_{k=1}^n Xk² ] − X̄² .
PROOF :
S² = (1/n) Σ_{k=1}^n (Xk − X̄)²
= (1/n) Σ_{k=1}^n ( Xk² − 2 Xk X̄ + X̄² )
= (1/n) [ Σ_{k=1}^n Xk² − 2X̄ Σ_{k=1}^n Xk + nX̄² ]   ( now use Fact 1 )
= (1/n) [ Σ_{k=1}^n Xk² − 2nX̄² + nX̄² ] = [ (1/n) Σ_{k=1}^n Xk² ] − X̄² .   QED !
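Fact 4 can be sanity-checked on random data (a sketch, not part of the notes):

```python
import random

random.seed(1)
xs = [random.gauss(0, 1) for _ in range(1000)]
n = len(xs)
xbar = sum(xs) / n

s2_direct = sum((x - xbar) ** 2 for x in xs) / n     # definition of S^2
s2_fact4 = sum(x * x for x in xs) / n - xbar ** 2    # Fact 4 shortcut
print(s2_direct, s2_fact4)                           # identical up to rounding
```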
THEOREM : The sample variance
S² ≡ (1/n) Σ_{k=1}^n (Xk − X̄)²
has expected value
E[S²] = (1 − 1/n) · σ² .
PROOF :
E[S²] = E[ (1/n) Σ_{k=1}^n (Xk − X̄)² ]
= E[ (1/n) Σ_{k=1}^n Xk² − X̄² ]   ( using Fact 4 )
= (1/n) Σ_{k=1}^n E[Xk²] − E[X̄²]
= σ² + µ² − ( σ²X̄ + µ²X̄ )   ( using Fact 2 n + 1 times ! )
= σ² + µ² − ( σ²/n + µ² ) = (1 − 1/n) σ² .   ( Fact 3 )   QED !
REMARK : Thus lim_{n→∞} E[S²] = σ² .
Most authors instead define the sample variance as
Ŝ² ≡ (1/(n − 1)) Σ_{k=1}^n (Xk − X̄)² .
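The difference between the two definitions is exactly the (1 − 1/n) factor of the theorem above; a simulation sketch with n = 5 and σ² = 1 (so E[S²] = 0.8 while E[Ŝ²] = 1):

```python
import random

random.seed(2)
n, trials = 5, 20000
s2_total = shat2_total = 0.0
for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    s2_total += ss / n            # S^2, divisor n (biased)
    shat2_total += ss / (n - 1)   # S-hat^2, divisor n-1 (unbiased)

print(s2_total / trials)      # near (1 - 1/5) * 1 = 0.8
print(shat2_total / trials)   # near 1.0
```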
EXAMPLE : The random sample of 120 values of a uniform
random variable on [−1, 1] in an earlier Table has
X̄ = (1/120) Σ_{k=1}^{120} Xk = 0.030 ,
S² = (1/120) Σ_{k=1}^{120} (Xk − X̄)² = 0.335 ,
S = √(S²) = 0.579 ,
while
µ = 0 ,
σ² = ∫_{−1}^{1} (x − µ)² (1/2) dx = 1/3 ,
σ = √(σ²) = 1/√3 = 0.577 .
• What do you say ?
EXAMPLE :
EXAMPLE : ( continued · · · )
Results :
X̄ = (1/500) Σ_{k=1}^{500} X̄k = − 0.00136 ,
S² = (1/500) Σ_{k=1}^{500} (X̄k − X̄)² = 0.00664 ,
S = √(S²) = 0.08152 .
EXERCISE :
• What is the value of E[X̄] ?
• Compare X̄ to E[X̄] .
• What is the value of Var(X̄) ?
• Compare S² to Var(X̄) .
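A simulation sketch for the exercise. The sample size behind each X̄k is not stated on this slide, so n = 50 below is purely an assumption for illustration; with it, Var(X̄) = (1/3)/50 ≈ 0.0067 :

```python
import random

random.seed(3)
n, m = 50, 500   # n = assumed size of each sample, m = number of sample means
means = [sum(random.uniform(-1, 1) for _ in range(n)) / n for _ in range(m)]

xbar = sum(means) / m                          # near E[X-bar] = mu = 0
s2 = sum((x - xbar) ** 2 for x in means) / m   # near Var(X-bar) = (1/3)/n
print(xbar, s2)
```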
Estimating the variance of a normal distribution
We have shown that
S² ≡ (1/n) Σ_{k=1}^n (Xk − X̄)² ≅ σ² .
How good is this approximation for normal random variables Xk ?
To answer this we need :
FACT 5 :
Σ_{k=1}^n (Xk − µ)² − Σ_{k=1}^n (Xk − X̄)² = n(X̄ − µ)² .
PROOF :
LHS = Σ_{k=1}^n { Xk² − 2 Xk µ + µ² − Xk² + 2 Xk X̄ − X̄² }
= − 2nµX̄ + nµ² + 2nX̄² − nX̄²   ( using Fact 1 )
= n ( X̄² − 2µX̄ + µ² ) = n(X̄ − µ)² = RHS .   QED !
Rewrite Fact 5
Σ_{k=1}^n (Xk − µ)² − Σ_{k=1}^n (Xk − X̄)² = n(X̄ − µ)² ,
as
Σ_{k=1}^n ( (Xk − µ)/σ )² − (n/σ²) · (1/n) Σ_{k=1}^n (Xk − X̄)² = ( (X̄ − µ)/(σ/√n) )² ,
and then as
Σ_{k=1}^n Zk² − (n/σ²) S² = Z² ,
where
S² is the sample variance ,
and
Z and Zk are standard normal because the Xk are normal .
We have found that
(n/σ²) S² = χ²_n − χ²_1 .
For normal random variables : ((n − 1)/σ²) Ŝ² has the χ²_{n−1} distribution .
SOLUTION :
P ( Ŝ ≥ 129 ) = P ( Ŝ² ≥ 129² ) = P ( ((n − 1)/σ²) Ŝ² ≥ (15/100²) · 129² )
≅ P ( χ²_15 ≥ 24.96 ) ≅ 5 %   ( from the χ² Table ) .
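The χ²_15 tail probability can be estimated by simulating sums of 15 squared standard normals (a Monte Carlo sketch using only the standard library):

```python
import random

random.seed(4)
trials = 100_000
hits = sum(
    sum(random.gauss(0, 1) ** 2 for _ in range(15)) >= 24.96
    for _ in range(trials)
)
print(hits / trials)   # near 0.05, as in the chi^2 Table
```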
The Chi-Square density functions for n = 5, 6, · · · , 15 .
(For large n they look like normal density functions .)
EXERCISE :
In the preceding example, also compute
P ( χ²_15 ≥ 24.96 )
using the standard normal approximation .
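For this exercise : χ²_n has mean n and variance 2n, so the standard normal approximation standardizes 24.96 accordingly (a sketch; note how rough the symmetric approximation is at n = 15):

```python
from statistics import NormalDist

z = (24.96 - 15) / (2 * 15) ** 0.5   # standardize with mean n = 15, variance 2n = 30
p = 1 - NormalDist().cdf(z)
print(p)   # about 0.034, noticeably below the exact ~5%
```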
EXERCISE :
Consider the same shipment of light bulbs :
EXAMPLE : For the data below from a normal population :
SOLUTION : We have n = 16 , X̄ = 0.00575 , Ŝ² = 0.02278 .
(n − 1) Ŝ² / σ² = 6.26 ⇒ σ² = (n − 1) Ŝ² / 6.26 = 15 · 0.02278 / 6.26 = 0.05458 ,
(n − 1) Ŝ² / σ² = 27.49 ⇒ σ² = (n − 1) Ŝ² / 27.49 = 15 · 0.02278 / 27.49 = 0.01243 .
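The interval computation as a sketch (taking the χ²_15 critical values 6.26 and 27.49 from the slide):

```python
n, s2_hat = 16, 0.02278
chi_lo, chi_hi = 6.26, 27.49            # chi^2_15 critical values (from the table)

sigma2_hi = (n - 1) * s2_hat / chi_lo   # upper endpoint for sigma^2
sigma2_lo = (n - 1) * s2_hat / chi_hi   # lower endpoint for sigma^2
print(sigma2_lo, sigma2_hi)
```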
Samples from Finite Populations
EXAMPLE :
• With replacement : The possible samples are
(1, 1) , (1, 2) , (1, 3) , (2, 1) , (2, 2) , (2, 3) , (3, 1) , (3, 2) , (3, 3) ,
each with equal probability 1/9 .
• Without replacement : The possible samples are
(1, 2) , (1, 3) , (2, 1) , (2, 3) , (3, 1) , (3, 2) ,
each with equal probability 1/6 .
The sample means X̄ are
3/2 , 2 , 3/2 , 5/2 , 2 , 5/2 ,
with expected value
E[X̄] = (1/6) ( 3/2 + 2 + 3/2 + 5/2 + 2 + 5/2 ) = 2 ,
The sample variances S² are
1/4 , 1 , 1/4 , 1/4 , 1 , 1/4 ,   ( Check ! )
with expected value
E[S²] = (1/6) ( 1/4 + 1 + 1/4 + 1/4 + 1 + 1/4 ) = 1/2 .
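The two enumerations can be reproduced with `itertools` (a sketch):

```python
from itertools import permutations, product

pop = [1, 2, 3]

def stats(samples):
    # Average the sample mean and the sample variance S^2 (divisor n) over all samples.
    means = [sum(s) / len(s) for s in samples]
    variances = [sum((x - sum(s) / len(s)) ** 2 for x in s) / len(s) for s in samples]
    return sum(means) / len(samples), sum(variances) / len(samples)

with_repl = list(product(pop, repeat=2))     # 9 ordered samples
without_repl = list(permutations(pop, 2))    # 6 ordered samples

print(stats(with_repl))      # E[X-bar] = 2, E[S^2] = 1/3
print(stats(without_repl))   # E[X-bar] = 2, E[S^2] = 1/2
```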
EXAMPLE : ( continued · · · )
EXAMPLE : ( continued · · · )
We have computed :
• Population statistics : µ = 2 , σ² = 2/3 ,
E[S²] = (1 − 1/2) σ² = 1/3 .
QUESTION :
Why is E[S 2 ] wrong for sampling without replacement ?
NOTE : When sampling without replacement from the finite population
{ 1 , 2 , 3 , · · · , N } ,
the observations are not independent ; for example ,
P (X2 = 1 | X1 = 1) = 0 .
The Sample Correlation Coefficient
We have
• | σX,Y | ≤ σX σY ,   ( the Cauchy-Schwarz inequality )
• Thus | ρX,Y | ≤ 1 .   ( Why ? )
• If X and Y are independent then ρX,Y = 0 .   ( Why ? )
Similarly, the sample correlation coefficient of a data set
{ (Xi , Yi) }_{i=1}^N ,
is defined as
RX,Y ≡ Σ_{i=1}^N (Xi − X̄)(Yi − Ȳ) / [ √( Σ_{i=1}^N (Xi − X̄)² ) √( Σ_{i=1}^N (Yi − Ȳ)² ) ] ;
The sample correlation coefficient
RX,Y ≡ Σ_{i=1}^N (Xi − X̄)(Yi − Ȳ) / [ √( Σ_{i=1}^N (Xi − X̄)² ) √( Σ_{i=1}^N (Yi − Ȳ)² ) ] .
In fact,
• If | RX,Y | = 1 then X and Y are related linearly .
Specifically,
• If RX,Y = 1 then Yi = cXi + d , for constants c, d , with c > 0 .
• If RX,Y = −1 then Yi = cXi + d , for constants c, d , with c < 0 .
Also,
• If | RX,Y | ≅ 1 then X and Y are almost linearly related .
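A direct implementation of RX,Y (a sketch; the linear cases above give exactly ±1):

```python
def sample_corr(xs, ys):
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / (sxx ** 0.5 * syy ** 0.5)

xs = [1.0, 2.0, 3.0, 4.0]
print(sample_corr(xs, [3 * x + 1 for x in xs]))    # c > 0: R = 1
print(sample_corr(xs, [-3 * x + 1 for x in xs]))   # c < 0: R = -1
```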
EXAMPLE :
A scatter diagram showing the average daily high temperature.
The sample correlation coefficient is RX,Y = 0.98 .
EXERCISE :
• The Table below shows class attendance and course grade/100 ,
as ( attendance , grade ) pairs :
(11, 47) (13, 43) (15, 70) (17, 72) (18, 96) (14, 61) (5, 25) (17, 74)
(16, 85) (13, 82) (16, 67) (17, 91) (16, 71) (16, 50) (14, 77) (12, 68)
(8, 62) (13, 71) (12, 56) (15, 81) (16, 69) (18, 93) (18, 77) (17, 48)
(14, 82) (17, 66) (16, 91) (17, 67) (7, 43) (15, 86) (18, 85) (17, 84)
(11, 43) (17, 66) (18, 57) (18, 74) (13, 73) (15, 74) (18, 73) (17, 71)
(14, 69) (15, 85) (17, 79) (18, 84) (17, 70) (15, 55) (14, 75) (15, 61)
(16, 61) (4, 46) (18, 70) (0, 29) (17, 82) (18, 82) (16, 82) (14, 68)
(9, 84) (15, 91) (15, 77) (16, 75)
• Any conclusions ?
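A sketch for the exercise, reading each table entry as an ( attendance , grade ) pair:

```python
pairs = [
    (11, 47), (13, 43), (15, 70), (17, 72), (18, 96), (14, 61), (5, 25), (17, 74),
    (16, 85), (13, 82), (16, 67), (17, 91), (16, 71), (16, 50), (14, 77), (12, 68),
    (8, 62), (13, 71), (12, 56), (15, 81), (16, 69), (18, 93), (18, 77), (17, 48),
    (14, 82), (17, 66), (16, 91), (17, 67), (7, 43), (15, 86), (18, 85), (17, 84),
    (11, 43), (17, 66), (18, 57), (18, 74), (13, 73), (15, 74), (18, 73), (17, 71),
    (14, 69), (15, 85), (17, 79), (18, 84), (17, 70), (15, 55), (14, 75), (15, 61),
    (16, 61), (4, 46), (18, 70), (0, 29), (17, 82), (18, 82), (16, 82), (14, 68),
    (9, 84), (15, 91), (15, 77), (16, 75),
]

n = len(pairs)
xbar = sum(a for a, _ in pairs) / n
ybar = sum(g for _, g in pairs) / n
sxy = sum((a - xbar) * (g - ybar) for a, g in pairs)
sxx = sum((a - xbar) ** 2 for a, _ in pairs)
syy = sum((g - ybar) ** 2 for _, g in pairs)
r = sxy / (sxx ** 0.5 * syy ** 0.5)
print(n, r)   # r is positive: better attendance goes with better grades
```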
Maximum Likelihood Estimators
EXAMPLE :
EXAMPLE : ( continued · · · )
E[S²] = (1 − 1/n) σ² .
The maximum likelihood procedure is the following :
Let
X1 , X2 , · · · , Xn ,
be independent, identically distributed ,
each having density function f (x ; σ) ,
with unknown parameter σ .
EXAMPLE : For our normal distribution with mean 0 we have
f (x1 , x2 , · · · , xn ; σ) = e^{ − (1/(2σ²)) Σ_{k=1}^n xk² } / ( √(2π) σ )^n .   ( Why ? )
EXAMPLE : ( continued · · · )
We had
d/dσ [ − (1/(2σ²)) Σ_{k=1}^n xk² − n log σ ] = 0 .
Differentiating gives
(1/σ³) Σ_{k=1}^n xk² − n/σ = 0 ,
from which
σ̂² = (1/n) Σ_{k=1}^n xk² .
EXERCISE :
Suppose a random variable has the general normal density function
f (x ; µ, σ) = ( 1/(√(2π) σ) ) e^{ − (x − µ)²/(2σ²) } ,
with unknown mean µ and unknown standard deviation σ .
EXERCISE : ( continued · · · )
µ̂ = (1/n) Σ_{k=1}^n Xk ,
σ̂ = [ (1/n) Σ_{k=1}^n (Xk − X̄)² ]^{1/2} ,
that is, µ̂ = X̄ and σ̂ = S , the sample standard deviation .
NOTE :
E[S²] = (1 − 1/n) σ² ≅ σ² .
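A sketch checking the closed-form estimates µ̂ = X̄ and σ̂ = S against a brute-force grid search over the log-likelihood (illustrative only; the true parameters 10 and 2 are arbitrary choices):

```python
import math
import random

random.seed(5)
xs = [random.gauss(10, 2) for _ in range(200)]
n = len(xs)

mu_hat = sum(xs) / n                                          # closed-form MLE of mu
sigma_hat = (sum((x - mu_hat) ** 2 for x in xs) / n) ** 0.5   # closed-form MLE of sigma

def loglik(mu, sigma):
    # Log-likelihood, dropping the constant -(n/2) log(2 pi).
    return -n * math.log(sigma) - sum((x - mu) ** 2 for x in xs) / (2 * sigma ** 2)

# No grid point around the closed-form estimates should do better:
grid = [(mu_hat + i * 0.02, sigma_hat + j * 0.02)
        for i in range(-25, 26) for j in range(-25, 26)]
best_mu, best_sigma = max(grid, key=lambda t: loglik(*t))
print(best_mu - mu_hat, best_sigma - sigma_hat)   # both 0: the MLE wins
```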
EXERCISE :
EXAMPLE : Consider the special exponential density function
f (x ; λ) = λ² x e^{−λx} for x > 0 , and f (x ; λ) = 0 for x ≤ 0 .
( Graphs of the density function f (x) and the distribution function F (x) , for 0 ≤ x ≤ 5 . )
EXAMPLE : ( continued · · · )
For the maximum likelihood estimator of λ , we have
f (x ; λ) = λ² x e^{−λx} , for x > 0 ,
so, assuming independence, the joint density function is
f (x1 , x2 , · · · , xn ; λ) = λ^{2n} x1 x2 · · · xn e^{ −λ(x1 + x2 + · · · + xn) } .
EXAMPLE : ( continued · · · )
We had
d/dλ [ 2n log λ + Σ_{k=1}^n log xk − λ Σ_{k=1}^n xk ] = 0 .
Differentiating gives
2n/λ − Σ_{k=1}^n xk = 0 ,
from which
λ̂ = 2n / Σ_{k=1}^n xk .
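This density is the Gamma density with shape 2 and rate λ, so the estimator λ̂ = 2n/Σxk can be checked on simulated data (a sketch; `random.gammavariate` takes shape and scale = 1/rate):

```python
import random

random.seed(6)
lam = 1.5
xs = [random.gammavariate(2, 1 / lam) for _ in range(50_000)]

lam_hat = 2 * len(xs) / sum(xs)
print(lam_hat)   # near the true rate 1.5
```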
EXERCISE :
• Verify that ∫_0^∞ f (x ; λ) dx = 1 .
• Also compute
E[X] = ∫_0^∞ x f (x ; λ) dx .
NOTE :
• Maximum likelihood estimates also work in the discrete case .
• In that case we maximize the probability mass function .
EXAMPLE :
Find the maximum likelihood estimator of p in the Bernoulli trial
P (X = 1) = p ,
P (X = 0) = 1 − p .
SOLUTION : We can write
P (x ; p) ≡ P (X = x) = p^x (1 − p)^{1−x} ,   ( x = 0, 1 )   (!)
EXAMPLE : ( continued · · · )
We found
P (x1 , x2 , · · · , xn ; p) = p^{ Σ_{k=1}^n xk } · (1 − p)^{ n − Σ_{k=1}^n xk } .
Taking the logarithm and differentiating gives
(1/p) Σ_{k=1}^n xk − n/(1 − p) + (1/(1 − p)) Σ_{k=1}^n xk = 0 .
EXAMPLE : ( continued · · · )
We found
(1/p) Σ_{k=1}^n xk − n/(1 − p) + (1/(1 − p)) Σ_{k=1}^n xk = 0 ,
from which
( 1/p + 1/(1 − p) ) Σ_{k=1}^n xk = n/(1 − p) .
Multiplying by 1 − p gives
( (1 − p)/p + 1 ) Σ_{k=1}^n xk = (1/p) Σ_{k=1}^n xk = n ,
from which
p̂ = (1/n) Σ_{k=1}^n xk = x̄ .
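So the maximum likelihood estimate of p is simply the observed fraction of successes, which a quick simulation confirms (a sketch):

```python
import random

random.seed(7)
p = 0.3
xs = [1 if random.random() < p else 0 for _ in range(100_000)]

p_hat = sum(xs) / len(xs)   # MLE: the sample proportion of 1's
print(p_hat)                # near the true p = 0.3
```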
EXERCISE :
where x is an integer, (0 ≤ x ≤ N ) .
Hypothesis Testing
EXAMPLE :
We assume that :
• The lifetime of the bulbs has indeed a normal distribution .
• The standard deviation is indeed σ = 100 hours.
• We test the lifetime of a sample of 25 bulbs .
Left : density function of X , also indicating µX ± σX ( µX = 1000 , σX = 100 ) .
Right : density function of X̄ (n = 25) , also indicating µX̄ ± σX̄ ( µX̄ = 1000 , σX̄ = 20 ) .
EXAMPLE : ( continued · · · )
EXAMPLE : ( continued · · · )
• Would you accept the hypothesis that the mean is 1000 hours ?
EXAMPLE : ( continued · · · )
960 ≤ X̄ ≤ 1040 .
P ( | X̄ − 1000 | ≤ 40 ) = 1 − 2 Φ( (960 − 1000)/(100/√25) ) = 1 − 2 Φ(−2) ≅ 95 % ,
P ( | X̄ − 1000 | ≥ 40 ) = 100 % − 95 % = 5 % .
Density function of X̄ (n = 25) , with µ = µX̄ = 1000 , σX̄ = 20 ,
P (960 ≤ X̄ ≤ 1040) ≅ 95 % .
EXAMPLE : ( continued · · · )
For µ = µX̄ = 980 : P ( 960 ≤ X̄ ≤ 1040 ) = (1 − 0.0013) − 0.1587 = 84 % .
µ = µX̄ = 980 : P (accept) = 84 %
µ = µX̄ = 1000 : P (accept) = 95 %
µ = µX̄ = 1040 : P (accept) = 50 %
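The three acceptance probabilities can be reproduced directly: X̄ is normal with standard deviation σX̄ = 20, and the acceptance region is 960 ≤ X̄ ≤ 1040 (a sketch):

```python
from statistics import NormalDist

def p_accept(mu, sigma_xbar=20.0, lo=960.0, hi=1040.0):
    d = NormalDist(mu, sigma_xbar)
    return d.cdf(hi) - d.cdf(lo)

for mu in (980, 1000, 1040):
    print(mu, p_accept(mu))   # 0.84, 0.95, 0.50
```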
EXAMPLE :
There are two hypotheses :
The density functions of X̄ (n = 25) , also indicating x̂ .
blue : (µ1 , σ1 ) = (1000, 100) , red : (µ2 , σ2 ) = (1100, 200) .
RECALL :
Left : probability of Type 1 error vs. x̂ . Right : probability of Type 2 error vs. x̂ .
(µ1 , σ1) = (1000, 100) , (µ2 , σ2) = (1100, 100) .
The probability of Type 1 and Type 2 errors versus x̂ .
Left : (µ1 , σ1) = (1000, 100) , (µ2 , σ2) = (1100, 100) .
Right : (µ1 , σ1) = (1000, 100) , (µ2 , σ2) = (1100, 200) .
Colors indicate sample size : 2 (red), 8 (blue), 32 (black) .
Curves of a given color intersect at the minimax x̂-value.
The probability of Type 1 and Type 2 errors versus x̂ .
NOTE :
is minimized .
The density functions of X̄ (n = 25) , with minimax value of x̂ .
The minimax value x̂* of x̂ is easily computed : At x̂* we have
P ( Type 1 Error ) = P ( Type 2 Error ) ,
⇐⇒
P (X̄ ≥ x̂* | µ = µ1) = P (X̄ ≤ x̂* | µ = µ2) ,
⇐⇒
Φ( (µ1 − x̂*)/(σ1/√n) ) = Φ( (x̂* − µ2)/(σ2/√n) ) ,
⇐⇒
(µ1 − x̂*)/(σ1/√n) = (x̂* − µ2)/(σ2/√n) ,   ( by monotonicity of Φ )
from which
x̂* = ( µ1 · σ2 + µ2 · σ1 ) / ( σ1 + σ2 ) .   ( Check ! )
Thus we have proved the following : the minimax value of x̂
is given by
x̂* = ( σ1 µ2 + σ2 µ1 ) / ( σ1 + σ2 ) .
EXERCISE :
For this x̂∗ find the probability of a Type 1 and a Type 2 Error ,
when
n=1 , n = 25 , n = 100 .
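A sketch for this exercise, assuming the parameters shown in the preceding figures, (µ1, σ1) = (1000, 100) and (µ2, σ2) = (1100, 200):

```python
from statistics import NormalDist

mu1, s1 = 1000.0, 100.0   # assumed H1 parameters (from the figures)
mu2, s2 = 1100.0, 200.0   # assumed H2 parameters

x_star = (mu1 * s2 + mu2 * s1) / (s1 + s2)   # minimax threshold

for n in (1, 25, 100):
    rt = n ** 0.5
    p1 = 1 - NormalDist(mu1, s1 / rt).cdf(x_star)   # P(Type 1 error)
    p2 = NormalDist(mu2, s2 / rt).cdf(x_star)       # P(Type 2 error)
    print(n, p1, p2)   # p1 = p2 at the minimax threshold; both shrink with n
```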
EXAMPLE ( Known standard deviation ) :
Do we accept H0 ?
SOLUTION ( Known standard deviation ) :
Given : n = 9 , σ = 0.2 , X̄ = 4.88 , µ = 5.0 , | X̄ − µ | = 0.12 .
Since
Z ≡ (X̄ − µ)/(σ/√n) is standard normal ,
EXAMPLE ( Unknown standard deviation, large sample ) :
P ( X̄ ≥ 4.847 ) < 5 % .
SOLUTION ( Unknown standard deviation, large sample ) :
CONCLUSION:
We (barely) accept H0 at level of significance 5 % .
( We would reject H0 at level of significance 10 % .)
EXAMPLE ( Unknown standard deviation, small sample ) :
NOTE :
If n ≤ 30 then the approximation σ ≅ Ŝ is not so accurate .
In this case it is better to use the "Student t-distribution" Tn−1 .
The T - distribution Table
n      α = 0.1    α = 0.05   α = 0.01   α = 0.005
5 -1.476 -2.015 -3.365 -4.032
6 -1.440 -1.943 -3.143 -3.707
7 -1.415 -1.895 -2.998 -3.499
8 -1.397 -1.860 -2.896 -3.355
9 -1.383 -1.833 -2.821 -3.250
10 -1.372 -1.812 -2.764 -3.169
11 -1.363 -1.796 -2.718 -3.106
12 -1.356 -1.782 -2.681 -3.055
13 -1.350 -1.771 -2.650 -3.012
14 -1.345 -1.761 -2.624 -2.977
15 -1.341 -1.753 -2.602 -2.947
EXAMPLE ( Testing a hypothesis on the standard deviation ) :
A sample of 16 items from a normal population has sample
standard deviation Ŝ = 2.58 .
Do you believe the population standard deviation satisfies σ ≤ 2.0 ?
EXERCISE :
Ŝ = 0.83 .
σ ≤ 1.2 ?
( Probably Yes ! )