Slide MathStat DS 2025

The document outlines a Bachelor program in Mathematical Statistics at NEU, detailing course information including lecture hours, software used, scoring criteria, and required materials. It covers various topics in statistics such as descriptive statistics, hypothesis testing, and regression analysis, along with recommended textbooks. The course emphasizes both theoretical and practical applications of statistical methods using software like Microsoft Excel and R.

MATHEMATICAL STATISTICS

Bachelor Program in Data Science

Bui Duong Hai


Faculty of Mathematical Economics - NEU
[Link]/buiduonghai Link

August 3, 2025

[Link]/buiduonghai Mathematical Statistics August 3, 2025 1 / 273


Course Information

Total time: Lecture: 45 hours; Tutorial: 10 hours


Software: Microsoft Excel, R
Scoring:
Attendance: 10%
Quiz on LMS: 20%
Group Assignment: 20%
Final exam, paper-based: 50%
Material:
Computer
Calculator
Online folder: [Link]/buiduonghai Link



Your purpose?

[Figure: diagram of possible purposes. Labels recoverable from the extraction include: Good jobs, Pass the course, High score, Certificate, Theory, Class exercises, Computer exercises, Book's exercises, Statistic works, Other]

Topics

1 Introduction
2 Descriptive Statistics
3 Sampling Distribution
4 Point Estimation
5 Confidence Interval
6 Hypothesis Testing - One Sample
7 Inferences on Two Samples
8 Analysis of Variance
9 Non-parametric Test
10 Linear regression
11 Basic Bayes Statistics



Textbook

1 Jay L. Devore, Kenneth N. Berk (2012), Modern Mathematical Statistics with Applications, 2nd Edition, Springer.
2 Irwin Miller, Marylees Miller (2014), John E. Freund’s Mathematical Statistics with Applications, 8th Edition, Pearson.
3 Robert V. Hogg, Joseph W. McKean, Allen T. Craig (2013), Introduction to Mathematical Statistics, 7th Edition, Pearson.
4 Paul Newbold, William L. Carlson, Betty M. Thorne (2013), Statistics for Business and Economics, 8th Edition, Pearson.



Lec01. INTRODUCTION

1.1. Concept
1.2. Branches of Statistics
1.3. Data Sources
1.4. Population and Sample
1.5. Structure of Classical Data
1.6. Types of Variable
1.7. Revision of Probability

Reference
Book [1] Chapter 1, pp.1 - 9
Revision of Probability: Book [1] Chapter 2,3,4,5



1.1. Concept
Opinion vs. Statistics

Opinion: It seems that at NEU the number of female students is greater than the number of male students.
Statistics: Of 16,000 students in total, 10,000 are female (62.5%) and 6,000 are male (37.5%).

Opinion: In general, the older the customers are, the less they spend.
Statistics:
Age     20-29   30-39   40+
Spend   35      28      23

Opinion: The growth trend will continue!
Statistics: [Link]


1.2. Branches of Statistics

Two main branches:


Descriptive Statistics
organize, summarize, and present data in a convenient and informative way

Inferential Statistics
predict, forecast, and verify knowledge by analyzing data

[Diagram: Data, Descriptive Statistics, Probability, Inferential Statistics; Descriptive Statistics is labelled "Basic Statistics" and Inferential Statistics is labelled "Mathematical Statistics"]


1.3. Data Sources

                Primary data                      Secondary data
From            Self-run surveys,                 Other parties:
                questionnaires, records           official reports, publications
Advantages      Relevant to the purpose,          Official, high accuracy,
                flexible, in-depth information    larger data, lower cost
Disadvantages   Costly, non-response,             Not completely relevant,
                missing information,              no further information
                measurement errors,
                method errors, ...


1.4. Population and Sample

         Population                       Sample
Set      all elements of interest         a subset of the population
Size     N, may be infinite               n, finite
Value    Parameter                        Statistic

[Figure: a population of N = 100 elements from which Sample 1 and Sample 2 are drawn]


1.5. Classical Data Set

Includes: Observation (in rows); Variable (in columns); Value (in cells)

No.  Name      Sex     Age  Eng. mark  Math score  ···
1    Anderson  Male    20   B          73          ···
2    Berky     Female  19   A          80          ···
3    Charles   Male    20   C          72          ···
⋮    ⋮         ⋮       ⋮    ⋮          ⋮           ⋮

1 variable: univariate
2 variables: bivariate
n variables: multivariate
Big data: methodology for many more types of data, with complex structure or unstructured data


1.6. Types of Variable
Includes: Qualitative and Quantitative
Qualitative (Categorical)
Nominal: incomparable value: name, address, occupation,...
Ordinal: comparable but cannot be calculated: rank, size,...

∗ Special: Binary: only two values

Quantitative (Scale)
Discrete: the number of values is countable
Continuous: the values are uncountable: time, length, weight,...

∗ By level of calculation
Interval: no true zero value; +, − are meaningful but ×, ÷ are not
Ratio: true zero value; all mathematical operations are meaningful
In practice: Nominal, Ordinal, Scale

1.6. Type of Variables

Qualitative Quantitative (Scale)

Nominal Ordinal Discrete Continuous

List, group List, group List, group, sort, rank


Sort, rank Calculate: +, −, ×, ÷, · · ·

Could be coded by number Could be used to rank



1.6. Type of Variables
Levels of measurement (decision flow)
More than 2 values?    No: Binary      Yes: next question
Could be compared?     No: Nominal     Yes: next question
Could be calculated?   No: Ordinal     Yes: next question
Could be ×, ÷?         No: Interval    Yes: Ratio


1.7. Revision: Probability

Probability
$P(A) = \dfrac{\mathrm{mes}(A)}{\mathrm{mes}(\Omega)}$
Probability of intersection

P(A ∩ B) = P(A)P(B|A)

P(A ∩ B) = P(A)P(B) ⇔ A, B independent


Probability of union

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)

P(A ∪ B) = P(A) + P(B) ⇔ A, B mutually exclusive


Extend for $A_1, A_2, \ldots, A_n$


1.7. Revision: Random Variable (r.v)
Discrete r.v.: PMF $p_x = P(X = x)$; Continuous r.v.: PDF $f(x)$
$\sum_i p_{x_i} = 1$ ; $\int_{-\infty}^{+\infty} f(x)\,dx = 1$

Expected value:
$E(X) = \sum_i x_i p_i$ ; $E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx$

Variance and Standard Deviation
$V(X) = E\big[(X - E(X))^2\big] = E(X^2) - \big(E(X)\big)^2$
$\sigma_X = \sqrt{V(X)}$

Covariance
$\mathrm{Cov}(X, Y) = E\big\{[X - E(X)][Y - E(Y)]\big\} = E(XY) - E(X)E(Y)$


1.7. Revision: Random Variable (cont.)

Correlation coefficient
$\rho_{X,Y} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$

$X, Y$ independent $\Rightarrow \mathrm{Cov}(X, Y) = \rho_{X,Y} = 0$

Moments: raw moment and central moment of order $k$
$\mu_k = E(X^k)$ ; $\mu_k^c = E\big[(X - E(X))^k\big]$
$E(X) = \mu_1$ ; $V(X) = \mu_2^c = \mu_2 - \mu_1^2$

Skewness and Kurtosis
$\mathrm{Skew} = \dfrac{E\big[(X - E(X))^3\big]}{\sigma^3}$ ; $\mathrm{Kurt} = \dfrac{E\big[(X - E(X))^4\big]}{\sigma^4}$


1.7. Properties of Parameters

Properties of Expected value and Variance, with constants $a, b, c$

$E(c) = c$                                  $V(c) = 0$
$E(X + c) = E(X) + c$                       $V(X + c) = V(X)$
$E(cX) = cE(X)$                             $V(cX) = c^2 V(X)$
$E(X \pm Y) = E(X) \pm E(Y)$                $V(X \pm Y) = V(X) + V(Y) \pm 2\,\mathrm{Cov}(X, Y)$
$E\big(\sum_i X_i\big) = \sum_i E(X_i)$     $V\big(\sum_i X_i\big) = \sum_i V(X_i)$ : $X_i$ independent

$V(aX \pm bY) = a^2 V(X) + b^2 V(Y) \pm 2ab\,\mathrm{Cov}(X, Y)$

$\rho(aX, bY) = \rho_{X,Y}$ when $ab > 0$ (the sign flips when $ab < 0$)


Example

Example 1.1
Players A and B play a game that has no draws; each is equally likely to
win any match. They intend to play 9 matches, and whoever wins more takes
a prize of 1 thousand USD. The game stops after 7 matches, with the score A:B at 4:3.
How should the money be divided?
Example 1.2
A couple has an online appointment between 0:00 and 1:00; whoever arrives first
waits only 20 minutes. Find the probability that they meet.
Example 1.3
Consider the Vietnamese gamble “danh de” and its profit X
Find E (X ) and V (X ) when playing 1 ([Link]) in one day;
Compare playing 10 mil in one day with playing 10 days, 1 mil each day.
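
A possible way to check Examples 1.1 and 1.2 numerically is sketched below in Python (an assumption of this note; the course itself uses Excel and R). For Example 1.1 it enumerates the two remaining matches; for Example 1.2 it assumes, as the usual reading of the problem, that the two arrival times are independent and uniform on the hour.

```python
from fractions import Fraction
from itertools import product

# Example 1.1: A leads 4:3 and 2 of the 9 matches remain.
# Each remaining match is won by A or B with probability 1/2.
remaining = 2
a_takes_prize = 0
for outcome in product("AB", repeat=remaining):
    a_total = 4 + outcome.count("A")
    b_total = 3 + outcome.count("B")
    if a_total > b_total:
        a_takes_prize += 1
p_a = Fraction(a_takes_prize, 2 ** remaining)
print("P(A takes the prize) =", p_a)
print("Shares of 1000 USD (A, B):", 1000 * p_a, 1000 * (1 - p_a))

# Example 1.2 (assumption: arrivals independent, uniform on [0, 1] hour).
# They meet when |x - y| <= 1/3; the complement in the unit square
# is two right triangles with legs of length 2/3.
wait = Fraction(20, 60)
p_meet = 1 - (1 - wait) ** 2
print("P(they meet) =", p_meet)
```

Under these assumptions the prize is split in proportion to each player's chance of finishing ahead, which is one common notion of a fair division.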



1.7. Revision: Common Discrete Distributions

Distribution        Prob. Mass Function                                                   E(X)        V(X)
Bernoulli B(1, p)   $P(X = x) = p^x (1 - p)^{1-x}$, $x = 0, 1$                            $p$         $p(1 - p)$
Binomial B(n, p)    $P(X = x) = C_n^x p^x (1 - p)^{n-x}$, $x = 0, 1, 2, ..., n$           $np$        $np(1 - p)$
Poisson P(λ)        $P(X = x) = \dfrac{\lambda^x e^{-\lambda}}{x!}$, $x = 0, 1, 2, ...$   $\lambda$   $\lambda$
Geometric G(p)      $P(X = x) = (1 - p)^{x-1} p$, $x = 1, 2, ...$                         $1/p$       $(1 - p)/p^2$


1.7. Revision: Common Continuous Distribution

Distribution                  Prob. Density Function f(x) > 0                                                               E(X)          V(X)
Uniform U(a, b)               $f(x) = \dfrac{1}{b - a}$, $x \in (a, b)$                                                     $(a + b)/2$   $(b - a)^2/12$
Exponential E(λ)              $f(x) = \lambda e^{-\lambda x}$, $x > 0$                                                      $1/\lambda$   $1/\lambda^2$
Normal N(µ, σ²)               $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$, $x \in \mathbb{R}$   $\mu$         $\sigma^2$
Standardized Normal N(0, 1)   $f(z) = \dfrac{1}{\sqrt{2\pi}} e^{-\frac{z^2}{2}}$, $z \in \mathbb{R}$                         $0$           $1$


1.7. Revision: Normal Distribution

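$X \sim N(\mu, \sigma^2) \Rightarrow Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1)$

$P(X < b) = P\!\left(Z < \dfrac{b - \mu}{\sigma}\right)$ ; $P(Z < b^\star) = P(X < \mu + b^\star \sigma)$

[Figure: densities of $Z \sim N(0, 1)$ and $X \sim N(\mu, \sigma^2)$, linked by $Z = (X - \mu)/\sigma$ and $X = \mu + Z\sigma$; the point $b$ on the $x$-axis corresponds to $b^\star = (b - \mu)/\sigma$ on the $z$-axis, so $b = \mu + b^\star \sigma$]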


1.7. Revision: Critical Value
Definition
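The critical value of level α of the Z distribution, denoted by $z_\alpha$, is the number such that

$P(Z > z_\alpha) = \alpha$

$z_{1-\alpha} = -z_\alpha$
$z_0 = +\infty$; $z_1 = -\infty$; $z_{0.5} = 0$
$z_{0.05} = 1.645$; $z_{0.025} = 1.96$

[Figure: density of $Z \sim N(0, 1)$ with the right-tail area α beyond $z_\alpha$, and $z_{1-\alpha} = -z_\alpha$ marked symmetrically on the left]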
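
The tabulated values can be reproduced with a short sketch. It is written in Python with the standard-library `statistics.NormalDist` (an assumption of this note, not the course's Excel or R workflow), and the helper name `z_critical` is hypothetical.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Upper critical value z_alpha, defined by P(Z > z_alpha) = alpha."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.05, 0.025, 0.5):
    print(alpha, round(z_critical(alpha), 3))

# Symmetry of the standard Normal: z_{1 - alpha} = -z_alpha
print(round(z_critical(0.95), 3), round(-z_critical(0.05), 3))
```

Because the definition uses the right tail, the helper passes `1 - alpha` to the inverse CDF, which works with left-tail probabilities.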


Example

Example 1.4
Income X is normally distributed with a mean of 500 USD and a variance of
400 USD².
(a) Find the probability that X > 510
(b) With probability 0.95, find the upper limit of X
(c) With probability 0.95, find the lower limit of X

Example 1.5
X ∼ N(µ, σ²). With probability (1 − α)
(a) Find the upper limit of X
(b) Find the lower limit of X
(c) Find an interval around the mean that X falls into
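
As a numerical check of Example 1.4, the sketch below uses Python's standard-library `NormalDist` (an assumption of this note; Excel or R would serve equally well). It reads "upper limit with probability 0.95" as the value b with P(X < b) = 0.95, and "lower limit" as the value a with P(X > a) = 0.95.

```python
from statistics import NormalDist

# Variance 400 USD^2 means a standard deviation of 20 USD
income = NormalDist(mu=500, sigma=400 ** 0.5)

p_above_510 = 1 - income.cdf(510)     # (a) P(X > 510)
upper_limit = income.inv_cdf(0.95)    # (b) P(X < upper_limit) = 0.95
lower_limit = income.inv_cdf(0.05)    # (c) P(X > lower_limit) = 0.95

print(round(p_above_510, 4))   # 0.3085
print(round(upper_limit, 1))   # 532.9
print(round(lower_limit, 1))   # 467.1
```

The two limits are µ ± z₀.₀₅·σ = 500 ± 1.645 × 20, which is the pattern Example 1.5 asks for in general form.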



1.7. Revision: Chi-squared Distribution
Definition
If $Z_i \sim N(0, 1)$ and independent, then $X = \sum_{i=1}^{v} Z_i^2$ is Chi-squared distributed with $v$ degrees of freedom, denoted by $X \sim \chi^2(v)$.

Critical value of level α, denoted by $\chi^2_{(v)\alpha}$ (see [1] Table A6 p.796)

$P(X > \chi^2_{(v)\alpha}) = \alpha$

[Figure: density of $\chi^2(v)$ with the right-tail area α beyond $\chi^2_{(v)\alpha}$]


1.7. Revision: Student Distribution

Definition
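If $Z \sim N(0, 1)$ and $X \sim \chi^2(v)$, independent, then $T = \dfrac{Z}{\sqrt{X/v}}$ is Student distributed with $v$ degrees of freedom, denoted by $T \sim T(v)$.

Critical value $t_{(v)\alpha}$: $P(T > t_{(v)\alpha}) = \alpha$ (see [1] Table A5 p.795)

$T(v) \xrightarrow{v \to \infty} N(0, 1)$; $t_{(v)\alpha} \xrightarrow{v \to \infty} z_\alpha$

[Figure: densities of $T(v)$ for $v = 2$ and $v = 100$, with $t_{(100)\alpha}$ marked]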


1.7. Revision: Fisher Distribution

Definition
If $X_1 \sim \chi^2(v_1)$ and $X_2 \sim \chi^2(v_2)$, independent, then $F = \dfrac{X_1/v_1}{X_2/v_2}$ is Fisher distributed with $v_1, v_2$ degrees of freedom, denoted by $F \sim F(v_1, v_2)$.

Critical value of level α, denoted by $f_{(v_1,v_2)\alpha}$ (see [1] Table A8 p.799)

$P(F > f_{(v_1,v_2)\alpha}) = \alpha$

$f_{(v_1,v_2)1-\alpha} = \dfrac{1}{f_{(v_2,v_1)\alpha}}$

Reference
For the χ², T, F distributions, see book [1] pp. 315 - 325.


Example

Example 1.6
Find the following critical values and their probability meaning
$\chi^2_{(20)0.05}$, $\chi^2_{(20)0.95}$
$t_{(20)0.05}$, $t_{(20)0.95}$
$t_{(200)0.025}$
$f_{(2,10)0.05}$, $f_{(10,2)0.05}$, $f_{(2,10)0.95}$

Example 1.7
The error of a measurement is N(0, 1), and the repair cost is the square of the error.
With probability 0.95, find the upper limit of the cost of a measurement
Find the probability that the total cost of 3 independent measurements is
greater than 6.49


1.7. Revision: Central Limit Theorem

Theorem (simplified)
If $X_1, X_2, ..., X_n$ are independent and identically distributed with mean $E(X)$ and variance $V(X)$, then, approximately as $n \to \infty$,

$T = \sum_{i=1}^{n} X_i \sim N\big(nE(X),\ nV(X)\big)$

and

$\bar{X} = \dfrac{\sum_{i=1}^{n} X_i}{n} \sim N\!\left(E(X),\ \dfrac{V(X)}{n}\right)$

In practice: n > 30 is enough to apply CLT. (see: Book [1] p.298)

Example 1.8
The weight of an egg is uniformly distributed on the interval (50, 62) g. Find
the probability that the average weight of 100 eggs is less than 56.5 g.
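
One way to work Example 1.8 is sketched below in Python (an assumption of this note; the course tools are Excel and R). It applies the CLT approximation stated above, with the mean and variance of U(a, b) taken from the table of continuous distributions.

```python
from statistics import NormalDist

a, b, n = 50, 62, 100

mean_x = (a + b) / 2          # E(X) of U(a, b): 56
var_x = (b - a) ** 2 / 12     # V(X) of U(a, b): 12

# CLT: the sample mean is approximately N(E(X), V(X) / n)
xbar = NormalDist(mu=mean_x, sigma=(var_x / n) ** 0.5)
p_lighter = xbar.cdf(56.5)

print(mean_x, var_x)
print(round(p_lighter, 3))    # roughly 0.93
```

The standard error is √(12/100) ≈ 0.346 g, so 56.5 g sits about 1.44 standard errors above the mean.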
Practice: Microsoft Excel and R

Distribution   Value          Excel 2016                R
N(µ, σ²)       f(x)           [Link](x, µ, σ, 0)        dnorm(x, µ, σ)
               P(X < x)       [Link](x, µ, σ, 1)        pnorm(x, µ, σ)
               q_β            [Link](β, µ, σ)           qnorm(β, µ, σ)
               x_α            [Link](1 − α, µ, σ)       qnorm(1 − α, µ, σ)
N(0, 1)        P(Z < z)       [Link](z, 0, 1, 1)        pnorm(z)
               q_β            [Link](β, 0, 1)           qnorm(β)
               z_α            [Link](1 − α, 0, 1)       qnorm(1 − α)
χ²(v)          P(X < x)       [Link](x, v, 1)           pchisq(x, v)
               χ²_(v)α        [Link](1 − α, v)          qchisq(1 − α, v)
T(v)           P(X < x)       [Link](x, v, 1)           pt(x, v)
               t_(v)α         [Link](1 − α, v)          qt(1 − α, v)
F(v1, v2)      P(X < x)       [Link](x, v1, v2, 1)      pf(x, v1, v2)
               f_(v1,v2)α     [Link](1 − α, v1, v2)     qf(1 − α, v1, v2)


Exercise - Lec01

Use Microsoft Excel or R to find the following probabilities and the corresponding critical values
P[X < 125 | X ∼ N(100, 20²)]
P[X < 1.25 | X ∼ N(0, 1)]
P[X < 1.25 | X ∼ T(10)]
P[X < 25 | X ∼ χ²(10)]
P[X < 2.5 | X ∼ F(10, 20)]
Use Microsoft Excel or R to find the following critical values and the corresponding probabilities
$z_{0.2}$
$t_{(10)0.15}$
$\chi^2_{(20)0.25}$
$f_{(20,30)0.12}$


Lec 2. DESCRIPTIVE STATISTICS

Table and Chart


Measures of Locations
Measures of Variability
Measures of Shape
Measures of Relationship
Statistics of Grouped Data
Reference
Book [1] Chapter 1, pp.9 - 49
Book [4] Chapter 1 + 2



2.1. TABLE & CHART
Sample data of size n
Listed data: $x_1, x_2, ..., x_n$
Grouped data: $x_1, x_2, ..., x_k$ with corresponding frequencies $f_1, f_2, ..., f_k$
Relative frequency (proportion): $\hat{p}_i = \dfrac{f_i}{n}$
$\sum_i f_i = n$ , $\sum_i \hat{p}_i = 1$

Frequency / Relative frequency table

Value            x1      x2      ···   xk
Frequency        f1      f2      ···   fk
Relative Freq.   p̂1      p̂2      ···   p̂k
Percentage       p̂1 %    p̂2 %    ···   p̂k %


Table and Chart — Example

Example 2.1
Data from customer survey, n = 50 observations

No.   Gender    Age      Waiting     Evaluation
      (M / F)   (Year)   (Minute)    (E, G, F, B)
1     Male      23       0 to 5      Good
2     Female    36       5 to 10     Excellent
⋮     ⋮         ⋮        ⋮           ⋮
50    Female    28       10 to 15    Fair
Table: Customer’s feedback

E = Excellent   F = Fair
G = Good        B = Bad


Table and Pie chart

Gender   Male   Female
Freq.    20     30
%        40%    60%
Table: Sex distribution
Figure: Gender (pie chart: Male 40%, Female 60%)

Evaluation   Excel.   Good   Fair   Bad
Freq.        10       25     9      6
%            20%      50%    18%    12%
Table: Evaluation distribution
[Figure: pie chart of Evaluation: Excellent 20%, Good 50%, Fair 18%, Bad 12%]


Column chart

Evaluation   Freq.   %
Excel.       10      20%
Good         25      50%
Fair         9       18%
Bad          6       12%
Table: Evaluation distribution

Figure: Evaluation distribution (column chart: Excel. 10, Good 25, Fair 9, Bad 6)


Cross table
      E    G    F   B   Sum
M     6    11   2   1   20
F     4    14   7   5   30
Sum   10   25   9   6   50
Table: Cross frequency

      E     G     F     B     Σ
M     12%   22%   4%    2%    40%
F     8%    28%   14%   10%   60%
Σ     20%   50%   18%   12%   100%
Table: Cross percentage

Figure: Evaluation by Gender (clustered columns; Male: 6, 11, 2, 1; Female: 4, 14, 7, 5)
Figure: Gender by Evaluation (the same frequencies grouped by evaluation level)


Row and Column Percentage Table
        E     G     F     B     Σ
M       30%   55%   10%   5%    100%
F       13%   47%   23%   17%   100%
Grand   20%   50%   18%   12%   100%
Table: Row-percent

     E      G      F      B      Grand
M    60%    44%    22%    17%    40%
F    40%    56%    78%    83%    60%
Σ    100%   100%   100%   100%   100%
Table: Column-percent

Figure: Evaluation by Gender (row percentages as clustered columns)
Figure: Gender by Evaluation (column percentages as stacked columns)

Scale variable: Column chart

Age 23 26 28 32 35 36 38 40 43 47 50 54 58 63 Sum
Freq. 1 2 2 2 4 3 5 8 7 4 5 4 2 1 50
% 2 4 4 4 8 6 10 16 14 8 10 8 4 2 100%

Figure: Distribution of Age (column chart, one column per observed age)


Scale variable: Column chart (continuous age axis)

Figure: Distribution of Age (the same frequencies drawn on an age axis from 23 to 63 in steps of one year, with zero-height columns at unobserved ages)


Scale variable: Grouped chart

Age 20-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64 Sum
Freq. 1 4 2 12 15 4 9 2 1 50
% 2 8 4 24 30 8 18 4 2 100%

Figure: Distribution of Age (column chart by 5-year age group)


Scale variable: Histogram

Age 20-29 30-39 40-49 50-59 60-69 Sum


Freq. 5 14 19 11 1 50
% 10 28 38 22 2 100%

Figure: Histogram of Age (10-year age groups)


Histogram & Cumulative chart

Distribution of Age
Age     Freq.   %     Cum.
20-29   5       10%   10%
30-39   14      28%   38%
40-49   19      38%   76%
50-59   11      22%   98%
60+     1       2%    100%
Table: Cumulative distribution

Cumulative chart: Ogive, Pareto chart
Figure: Histogram & cumulative curve of Age


Histogram & Cumulative chart

Distribution of Waiting time
Time      Freq.   %     Cum.
0 - 5     15      30%   30%
5 - 10    20      40%   70%
10 - 15   8       16%   86%
15 - 20   5       10%   96%
20+       2       4%    100%
Table: Distribution of Waiting time

Figure: Waiting time distribution (histogram with cumulative curve)


Shape of Histogram

[Figure: example histogram shapes: Negatively/Left-skewed, Bell-shaped, Positively/Right-skewed; Sharp, Flat; Symmetrical]


Stem and Leaf diagram

Data: 21, 22, 32, 34, 34, 35, 40, 42, 46, 48, 49, 52, 52, 57, 61

2 1 2
3 2 4 4 5
4 0 2 6 8 9
5 2 2 7
6 1

Data: 329, 332, 335, 337, 339, 340, 344, 345, 347, 350, 352, 355,
358, 361, 364, 365, 372



Scatter plot

Applied to paired samples (dependent samples)

Labor   Output     Labor   Output
11      80         15      250
11      130        16      220
12      150        17      210
13      110        17      260
13      150        18      240
13      200        18      200
15      170        19      240
14      180        19      280
Table: Output - Labor data

Figure: Output - Labor scatter plot (Labor on the horizontal axis, Output on the vertical axis)


Scatter plot and Correlation

[Figure: four scatter plots: (Weak) positively correlated; (Strong) positively correlated; Uncorrelated; Negatively correlated]


Scatter plot / Bubble chart

Applied to multivariate data

Project   R&D   Adv.   Profit
A         30    35     50
B         40    25     40
C         45    50     23
D         15    45     30
E         50    10     15
F         10    30     20
Table: Projects’ data

Figure: Projects’ bubble chart (R&D on the horizontal axis, Advertising on the vertical axis, bubble labelled with Profit)


2.2. MEASURES OF LOCATION

Mean
Median
Mode
Quartile and Quantile
Boxplot and Outlier

Example 2.2
Compare the wages in 3 firms ($ thousand) using their histograms
[Figure: three histograms on a scale of 1 to 10, labelled Red, Green, Blue]


Mean (Arithmetic Mean)

Sample and Population Mean
$\text{Mean} = \dfrac{\text{Sum of values}}{\text{Size}}$
Sample mean $\bar{x}$ and Population mean $\mu$
$\bar{x} = \dfrac{\sum_i x_i}{n}$ ; $\mu = \dfrac{\sum_i x_i}{N}$

The mean has the same unit as X
The mean is ‘sensitive’ to any change in an element’s value

Example 2.3
Compare the means of households’ income ($ thousand) in two areas
Area (A): 10, 11, 15, 17, 12
Area (B): 5, 5, 6, 8, 10, 50

Median

Median of data
The median is the middle value of the ordered data, denoted by $m_e$ or $\tilde{x}$.
The median is the value at position $\dfrac{n+1}{2}$ (for even n, the average of the two middle values).

The median is not affected by extreme values
The median is useful for comparing data sets with outliers

Example 2.4
Find the median
Data (A): 10, 11, 15, 17, 12
Data (B): 5, 5, 6, 8, 10, 50
Data (C): XS, XS, S, M, L, XL, XL
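
The contrast between mean and median on data (A) and (B) can be sketched as follows. The code is Python with the standard-library `statistics` module (an assumption of this note; the course uses Excel and R). For the ordinal data (C) it assumes the values are already listed in size order and simply reads off position (n + 1)/2.

```python
from statistics import mean, median

data_a = [10, 11, 15, 17, 12]
data_b = [5, 5, 6, 8, 10, 50]

for name, data in (("A", data_a), ("B", data_b)):
    print(name, "mean =", mean(data), "median =", median(data))

# Data (C) is ordinal: no mean, but the middle position still exists
data_c = ["XS", "XS", "S", "M", "L", "XL", "XL"]   # assumed already ordered by size
middle_position = (len(data_c) + 1) // 2           # position 4 of 7
median_c = data_c[middle_position - 1]
print("C median =", median_c)
```

In (B) the single value 50 pulls the mean up to 14 while the median stays at 7, which illustrates the sensitivity remark on the mean.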



Mode

Mode of data
The mode is the most frequent value, denoted by $m_0$. Data may have 0, 1, or
more than 1 mode.

Example 2.5
Data (A): 10, 11, 15, 17, 12
Data (B): 5, 5, 6, 8, 10, 50
Data (C): XS, XS, S, M, L, XL, XL
Data (D): Yellow, Red, Red, Green, Blue

          Nominal   Ordinal   Scale
Mean                          Yes
Median              Yes       Yes
Mode      Yes       Yes       Yes


Quartile - Interquartile Range

Quartiles and Interquartile Range of data
Three quartiles divide the ordered data into four equal parts, denoted by
Q1 (lower fourth), Q2 (median), Q3 (upper fourth).
Interquartile Range, IQR, or fourth spread $f_s$
$IQR = f_s = Q_3 - Q_1$

Divide data into
4 parts: 3 quartiles
5 parts: 4 quintiles
10 parts: 9 deciles
100 parts: 99 percentiles: $q_{0.01}, q_{0.02}, ..., q_{0.99}$


Quantile

Formula of Quantile
In ordered data of size n, the quantile of level β, denoted by $q_\beta$, is computed as
follows
$(n + 1)\beta = \{\text{integer}.\text{decimal}\}$
$q_\beta = x_{\text{int.}} + (\text{dec.})(x_{\text{int.}+1} - x_{\text{int.}})$

Example 2.6
Find the quartiles and the 30% quantile of the data
Data (A): 10, 11, 15, 17, 12
Data (B): 5, 5, 6, 8, 10, 50
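
The (n + 1)β rule above can be written as a small function. This is a Python sketch (an assumption of this note; the helper name `quantile` is hypothetical). The slide does not say what to do when the position falls before the first or after the last value, so the clamping to the extremes is also an assumption.

```python
def quantile(data, beta):
    """Quantile of level beta using the (n + 1) * beta position rule."""
    xs = sorted(data)
    n = len(xs)
    position = (n + 1) * beta
    k = int(position)           # integer part: 1-based position
    frac = position - k         # decimal part
    if k < 1:                   # before the first value: clamp (assumption)
        return xs[0]
    if k >= n:                  # at or beyond the last value: clamp (assumption)
        return xs[-1]
    return xs[k - 1] + frac * (xs[k] - xs[k - 1])

data_a = [10, 11, 15, 17, 12]
data_b = [5, 5, 6, 8, 10, 50]

for name, data in (("A", data_a), ("B", data_b)):
    levels = (0.25, 0.5, 0.75, 0.3)
    print(name, [round(quantile(data, b), 2) for b in levels])
```

For data (A), for instance, Q1 sits at position 6 × 0.25 = 1.5, halfway between the first and second ordered values, giving 10.5.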



Boxplot and Outlier
Five key points: $\{x_{\min}; Q_1; Q_2; Q_3; x_{\max}\}$ or $\{q_0; q_{0.25}; q_{0.5}; q_{0.75}; q_1\}$
Mild range: $(Q_1 - 1.5 \cdot IQR;\ Q_3 + 1.5 \cdot IQR)$
Outlier: smaller than $Q_1 - 1.5 \cdot IQR$ or greater than $Q_3 + 1.5 \cdot IQR$
Extreme: smaller than $Q_1 - 3 \cdot IQR$ or greater than $Q_3 + 3 \cdot IQR$
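
The outlier rules can be illustrated on the income data 5, 5, 6, 8, 10, 50 used elsewhere in this lecture. The sketch is in Python (an assumption of this note) and relies on `statistics.quantiles` with `method="exclusive"`, which places quartiles at positions (n + 1)β.

```python
from statistics import quantiles

data_b = [5, 5, 6, 8, 10, 50]

# Quartiles at positions (n + 1) * beta
q1, q2, q3 = quantiles(data_b, n=4, method="exclusive")
iqr = q3 - q1

mild_low, mild_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
extreme_low, extreme_high = q1 - 3 * iqr, q3 + 3 * iqr

outliers = [x for x in data_b if x < mild_low or x > mild_high]
extremes = [x for x in data_b if x < extreme_low or x > extreme_high]

print("Q1, Q3, IQR:", q1, q3, iqr)
print("Mild range:", (mild_low, mild_high))
print("Outliers:", outliers, "Extremes:", extremes)
```

Here the value 50 lies above Q3 + 1.5·IQR = 42.5 but below Q3 + 3·IQR = 65, so it is an outlier without being extreme.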

[Figure: boxplot with the five key points marked; the box has width IQR, the whiskers extend at most 1.5 · IQR on each side, and points beyond them are outliers]


Normal QQ plot

Example 2.7
Data is: 3, 3, 3, 4, 4, 5, 5, 7, 10
Is data approximately Normal distributed?

x     q       Normal
3     q0.1    -1.28
3     q0.2    -0.84
3     q0.3    -0.52
4     q0.4    -0.25
4     q0.5    0
5     q0.6    0.25
5     q0.7    0.52
7     q0.8    0.84
10    q0.9    1.28

[Figure: Normal QQ plot of the data (Normal quantiles from -2 to 2 on the horizontal axis, x on the vertical axis)]


Normal QQ plot

[Figure: examples of Normal QQ plots]


2.3. MEASURES OF VARIABILITY

Range
Interquartile Range
Variance
Standard Deviation
Coefficient of Variation
Standardized Value

Example 2.8
Compare the variability of the data
[Figure: four data sets (E), (F), (G), (H) plotted on a scale of 1 to 9]


Range and Interquartile Range

Formula
Range = $x_{\max} - x_{\min}$: width of the interval covering 100% of the values
IQR = $Q_3 - Q_1$: width of the interval covering the middle 50% of the values

Example 2.9
Find range and IQR of data
Data (A): 10, 11, 15, 17, 12
Data (B): 5, 5, 6, 8, 10, 50



Variance
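Deviation from the mean: $x_i - \text{mean}$ ⇒ $\sum_i (x_i - \text{mean}) = 0$
Sum of squares: $S_{xx} = \sum_i (x_i - \text{mean})^2$

Definition of Variance
Variance of sample $s^2$ and Variance of Population $\sigma^2$
$s^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$ ; $\sigma^2 = \dfrac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Other formulas for the sample variance
$s^2 = \dfrac{n}{n-1}\left[\dfrac{\sum_i x_i^2}{n} - (\bar{x})^2\right] = \dfrac{n}{n-1}\left[\overline{x^2} - (\bar{x})^2\right] = \dfrac{\sum_i x_i^2 - \big(\sum_i x_i\big)^2 / n}{n - 1}$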


Standard Deviation

Definition of S.D
Standard Deviation of sample $s$ and of Population $\sigma$
$s = \sqrt{s^2}$ ; $\sigma = \sqrt{\sigma^2}$

Variance and S.D both measure variability; the unit of the variance is the squared
unit of X, while the unit of the S.D is the same as that of X
$s_x^2 > s_y^2$: X is more variable (dispersed, fluctuating) than Y, and Y is
more stable (concentrated).

Example 2.10
Compare variance and standard deviation of samples
Data (A): 10, 11, 15, 17, 12
Data (B): 5, 5, 6, 8, 10, 50
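
A sketch of Example 2.10 in Python follows (an assumption of this note; the course uses Excel and R). It computes the sample variance from the definition with divisor n − 1 and cross-checks against `statistics.variance`; the helper name `sample_variance` is hypothetical.

```python
from statistics import mean, stdev, variance

data_a = [10, 11, 15, 17, 12]
data_b = [5, 5, 6, 8, 10, 50]

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(data)
    return sum((x - m) ** 2 for x in data) / (len(data) - 1)

for name, data in (("A", data_a), ("B", data_b)):
    print(name,
          "s^2 =", round(sample_variance(data), 3),
          "library s^2 =", round(variance(data), 3),
          "s =", round(stdev(data), 3))
```

The sums of squares are 34 for (A) and 1574 for (B), so s² is 8.5 and 314.8 respectively: (B) is far more dispersed, almost entirely because of the value 50.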



Coefficient of Variation

Definition of CV
Coefficient of Variation of Sample and Population
$CV = \dfrac{\text{Standard Deviation}}{\text{Mean}} \times 100\%$

Example 2.11
Compare the variability of samples (A) income: $ thousand, (B) income: $
thousand, (J) profit: $ million

Data   Values                 Mean   S.D   CV
(A)    10, 11, 15, 17, 12
(B)    5, 5, 6, 8, 10, 50
(J)    0, 1, 5, 7, 2


Standardized Value

Definition
Standardized value, or z-score, of value $x_i$, denoted by $z_i$
$z_i = \dfrac{x_i - \text{mean}}{\text{S.D}}$
Standardized data has zero mean and unit variance:
$\bar{z} = 0$ ; $s_z^2 = 1$

Example 2.12
Compare wage/week and work-time/week of a worker in a company

Variable        Value   Mean   S.D   z-score
Work-time (h)   46      40     2
Wage (USD)      300     240    30


2.4. MEASURES OF SHAPE

Skewness
Kurtosis and Adjusted Kurtosis

Skewness = 0; Kurtosis = 3
Normal distribution
Symmetric, bell-shaped



Skewness

Definition
Sample Skewness
$\text{skew} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^3 / n}{s_x^3}$

skew < 0        skew = 0       skew > 0
Left-tailed     Two-tailed     Right-tailed


Kurtosis
Definition
Sample Kurtosis
$\text{kurt} = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^4 / n}{s_x^4}$
Adjusted Kurtosis (Excess Kurtosis)
$\text{kurt}^* = \text{kurt} - 3$

kurt < 3, kurt∗ < 0    kurt = 3, kurt∗ = 0    kurt > 3, kurt∗ > 0
Platykurtic            Mesokurtic             Leptokurtic

Example
Example 2.13
Statistics for 4 data sets (E), (F), (G), (H)

         (E)    (F)    (G)    (H)
n        5      5      9      9
x̄        5      5      5      5
Range    4      8      8      4
q0.25    3.5    2      2.5    4
q0.5     5      5      5      5
q0.75    6.5    8      7.5    6
IQR      3      6      5      2
s²       2.5    10     7.5    1.5
s        1.58   3.16   2.74   1.22
CV       32%    63%    55%    24%
skew     0      0      0      0
kurt∗    -1.2   -1.2   -1.2   -0.61

[Figure: dot plots of (E), (F), (G), (H) on a scale of 1 to 9, each with its boxplot]

2.5. GROUPED DATA
Interval grouped frequency table
Value a0 - a1 a1 - a2 ··· ak−1 - ak
Frequency f1 f2 ··· fk
Using the middle value of each interval: $x_1 = \dfrac{a_0 + a_1}{2}$, ...
Value       x1   x2   ···   xk
Frequency   f1   f2   ···   fk

Formula
$\bar{x} = \dfrac{\sum_i f_i x_i}{n} = \dfrac{\sum_i f_i x_i}{\sum_i f_i}$
$s^2 = \dfrac{\sum_i f_i (x_i - \bar{x})^2}{n - 1} = \dfrac{n}{n-1}\left[\dfrac{\sum_i f_i x_i^2}{n} - (\bar{x})^2\right]$

Example

Example 2.14 Histogram - Box plot - Summary statistics

Descriptives
xi   fi
1    10
2    22
3    25
4    18
5    13
6    7
7    3
8    1
9    1

[Figure: histogram of the frequencies above a boxplot, on a scale of 1 to 9]

Central       Quartile        Variability    Shape
x̄ = 3.46      q0.25 = 2       s² = 2.857     skew = 0.73
me = 3        q0.5 = 3        s = 1.690      kurt = 3.42
m0 = 3        q0.75 = 4.25    CV = 0.489     kurt∗ = 0.42
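
The mean, variance and CV reported for Example 2.14 can be recomputed from the frequency table with the grouped-data formulas. This is a Python sketch (an assumption of this note; the course uses Excel and R); the quartiles, skewness and kurtosis are left out because they depend on the software convention used.

```python
values = list(range(1, 10))                  # x_i = 1, ..., 9
freqs = [10, 22, 25, 18, 13, 7, 3, 1, 1]     # f_i

n = sum(freqs)
x_bar = sum(f * x for x, f in zip(values, freqs)) / n
s2 = sum(f * (x - x_bar) ** 2 for x, f in zip(values, freqs)) / (n - 1)
s = s2 ** 0.5
cv = s / x_bar

print("n =", n)
print("mean =", round(x_bar, 2))
print("s^2 =", round(s2, 3), "s =", round(s, 3), "CV =", round(cv, 3))
```

With n = 100 this gives a mean of 3.46 and s² ≈ 2.857, so s ≈ 1.690 and CV ≈ 0.489.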



Pooled Sample

Example 2.15
Sample 1 of 4 elements has a mean of 10; sample 2 of 6 elements has a mean
of 12. What is the mean of the pooled sample that combines the two?
Example 2.16
Find the sample mean and sample variance of the pooled sample
(a) of two samples
sample   size   mean   variance
(1)      n1     x̄1     s1²
(2)      n2     x̄2     s2²

(b) of k samples whose sizes, means, and variances are $n_j, \bar{x}_j, s_j^2$, $j = 1, ..., k$,
respectively.
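
One possible answer to Examples 2.15 and 2.16 is sketched below in Python (an assumption of this note; the function names are hypothetical). It treats the pooled sample as the plain combination of all observations, so its variance splits into a within-sample part, Σ(n_j − 1)s_j², and a between-sample part, Σ n_j(x̄_j − x̄)², both divided by n − 1. The check data are the income samples (A) and (B) from earlier in this lecture.

```python
from statistics import mean, variance

def pooled_mean(sizes, means):
    """Mean of the combined sample: size-weighted average of the means."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)

def pooled_variance(sizes, means, variances):
    """Sample variance (divisor n - 1) of the combined sample."""
    n = sum(sizes)
    m = pooled_mean(sizes, means)
    within = sum((nj - 1) * vj for nj, vj in zip(sizes, variances))
    between = sum(nj * (mj - m) ** 2 for nj, mj in zip(sizes, means))
    return (within + between) / (n - 1)

# Example 2.15: sizes 4 and 6, means 10 and 12
mean_215 = pooled_mean([4, 6], [10, 12])
print("Example 2.15 pooled mean:", mean_215)

# Check of the variance formula against a direct computation
s1 = [10, 11, 15, 17, 12]
s2 = [5, 5, 6, 8, 10, 50]
by_formula = pooled_variance([len(s1), len(s2)],
                             [mean(s1), mean(s2)],
                             [variance(s1), variance(s2)])
direct = variance(s1 + s2)
print(round(by_formula, 4), round(direct, 4))
```

The same functions accept any number k of samples, which covers part (b).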



2.6. MEASURES OF RELATIONSHIP

Scatter Plot
Covariance
Correlation Coefficient

Paired sample data
No.   X    Y
1     x1   y1
2     x2   y2
⋮     ⋮    ⋮
n     xn   yn

[Figure: Scatter plot of Y against X]


Covariance and Correlation in Sample

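Sample of n pairs $(x_i, y_i)$, $i = 1, ..., n$

Definition
Sample Covariance
$\mathrm{cov}(x, y) = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
Correlation Coefficient
$r_{xy} = \dfrac{\mathrm{cov}(x, y)}{s_x s_y} = \dfrac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\ \sqrt{\sum_i (y_i - \bar{y})^2}}$

Other formula for Covariance
$\mathrm{cov}(x, y) = \dfrac{n}{n-1}\left[\dfrac{\sum x_i y_i}{n} - \dfrac{\sum x_i}{n} \cdot \dfrac{\sum y_i}{n}\right] = \dfrac{n}{n-1}\left[\overline{x \cdot y} - \bar{x} \cdot \bar{y}\right]$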


Covariance and Correlation in Population

Definition
Population Covariance
$\mathrm{Cov}(x, y) = \dfrac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$
Correlation Coefficient
$\rho_{xy} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}$

The correlation coefficient measures the ‘linear’ relationship between two variables

cov > 0    r = 1          perfectly linear, positively correlated
           r ∈ (0, 1)     positively correlated
cov = 0    r = 0          uncorrelated
cov < 0    r ∈ (−1, 0)    negatively correlated
           r = −1         perfectly linear, negatively correlated


Correlation

(a) r = 0.5 (b) r = 0.8 (c) r = 1

(d) r = 0 (e) r = −0.7 (f) r = −0.98



Example
Example 2.17

Income (x)        10   10   11   13   15   15   16   18   19   20
Expenditure (y)   9    5    10   10   12   13   9    14   14   15

No.   xi    yi    xi²    yi²    xi yi
1     10    9     100    81     90
2     10    5     100    25     50
3     11    10    121    100    110
4     13    10    169    100    130
5     15    12    225    144    180
6     15    13    225    169    195
7     16    9     256    81     144
8     18    14    324    196    252
9     19    14    361    196    266
10    20    15    400    225    300
Sum   147   111   2281   1317   1717

$\bar{x} = \dfrac{\sum x_i}{n}$ ; $s_x^2 = \dfrac{n}{n-1}\left[\overline{x^2} - (\bar{x})^2\right]$
$\bar{y} = \dfrac{\sum y_i}{n}$ ; $s_y^2 = \dfrac{n}{n-1}\left[\overline{y^2} - (\bar{y})^2\right]$
$\mathrm{cov}(x, y) = \dfrac{n}{n-1}\left[\overline{xy} - \bar{x}\,\bar{y}\right]$ ; $r_{x,y} = \dfrac{\mathrm{cov}(x, y)}{s_x s_y}$
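
The column sums and the resulting statistics of Example 2.17 can be verified with the sketch below. It is Python (an assumption of this note; the course uses Excel and R) and follows the shortcut formulas based on the sums rather than a library routine.

```python
income = [10, 10, 11, 13, 15, 15, 16, 18, 19, 20]    # x
spending = [9, 5, 10, 10, 12, 13, 9, 14, 14, 15]     # y

n = len(income)
sum_x, sum_y = sum(income), sum(spending)
sum_xx = sum(x * x for x in income)
sum_yy = sum(y * y for y in spending)
sum_xy = sum(x * y for x, y in zip(income, spending))

x_bar, y_bar = sum_x / n, sum_y / n
var_x = (sum_xx - n * x_bar ** 2) / (n - 1)
var_y = (sum_yy - n * y_bar ** 2) / (n - 1)
cov_xy = (sum_xy - n * x_bar * y_bar) / (n - 1)
r_xy = cov_xy / (var_x * var_y) ** 0.5

print("Sums:", sum_x, sum_y, sum_xx, sum_yy, sum_xy)
print("Means:", x_bar, y_bar)
print("s_x^2 =", round(var_x, 3), "s_y^2 =", round(var_y, 3))
print("cov =", round(cov_xy, 3), "r =", round(r_xy, 3))
```

The sums match the table (147, 111, 2281, 1317, 1717), and the correlation comes out at roughly 0.84, a fairly strong positive linear relationship.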



Properties of statistics

                            All elements
Statistic                   +a          ×b
Mean                        +a          ×b
Median, mode                +a          ×b
Min, Max                    +a          ×b
Quartile, quantile          +a          ×b
Interquartile range         constant    ×b
Variance                    constant    ×b²
Standard deviation          constant    ×|b|
Coefficient of variation    change      constant
Skewness                    constant    constant
Kurtosis                    constant    constant


Population parameter
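Population data of size N: $\{(x_i, y_i),\ i = 1, ..., N\}$
Mean
$\mu_x = \dfrac{\sum_i x_i}{N}$ ; $\mu_y = \dfrac{\sum_i y_i}{N}$
Variance
$\sigma_x^2 = \dfrac{\sum_i (x_i - \mu_x)^2}{N}$ ; $\sigma_y^2 = \dfrac{\sum_i (y_i - \mu_y)^2}{N}$
Standard deviation
$\sigma_x = \sqrt{\sigma_x^2}$ ; $\sigma_y = \sqrt{\sigma_y^2}$
Covariance
$\mathrm{Cov}(x, y) = \dfrac{\sum_i (x_i - \mu_x)(y_i - \mu_y)}{N} = \mu_{xy} - \mu_x \mu_y$
Correlation
$\rho_{x,y} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}$
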
Sample statistics vs Population parameters

                       Sample         Population     On the
Measure                statistic      parameter      same data
Size                   n              N              Equal
Mean                   x̄              µ              Equal
Median, mode           me, m0         Me, M0         Equal
Other locations        5 key points   5 key points   Equal
Interquartile range    IQR            IQR            Equal
Variance               s²             σ²             Unequal
Standard deviation     s              σ              Unequal
Coef. of Variation     CV             CV             Unequal
Covariance             cov(x, y)      Cov(x, y)      Unequal
Correlation            r_x,y          ρ_x,y          Equal


Practice: Microsoft Excel and R

Statistic               Excel function          R function
Size                    =count(var)             length(var)
Mean                    =average(var)           mean(var)
Median                  =median(var)            median(var)
Mode                    =mode(var)
Quartile j              =quartile(var, j)
Quantile β              =percentile(var, β)     quantile(var, β)
Sample Variance         =var(var)               var(var)
Sample S.D              =stdev(var)             sd(var)
Sample Covariance                               cov(var1, var2)
Population Variance     =var.p(var)
Population S.D          =stdev.p(var)
Population Covariance   =covar(var1, var2)
Correlation             =correl(var1, var2)     cor(var1, var2)


Practice: Summary Statistics

In Microsoft Excel
Install the Data Analysis ToolPak
Use Descriptive Statistics
Use Covariance
Use Correlation
In R
> summary(var)


Exercise - Lecture 02

Book Page Compulsory Optional


[1] 8 1, 2
31 30, 32, 35 38, 39
41 41, 42, 44 46, 54, 55,
44 60 75, 76
[4] 27 1.1 to 1.8
67 2.9 to 2.11

Summary statistics in Microsoft Excel
Install the Data Analysis ToolPak
Use Descriptive Statistics / Covariance / Correlation
In R
> summary(var)


Lec 3. SAMPLING DISTRIBUTION

3.1 Population
3.2 Random Sample [1] p.285
3.3 Sample Mean [1] p.296
3.4 Sample Variance [1] p.317
3.5 Sample Proportion

Reference
Book [1] Chapter 6, pp.284 - 381.
Book [2] Chapter 8
Book [4] Chapter 6



3.1. POPULATION

Population vs Random variable
A population is identified with a random variable X.
The population mean is E (X ), the population variance is V (X ).

Example 3.1
Number of dots when rolling a die
Population: {..., 1, 3, 6, 4, 2, ...}: an infinite number of values
R.v. X , X ∈ {1, 2, 3, 4, 5, 6}
Probability distribution
X   1     2     3     4     5     6
P   1/6   1/6   1/6   1/6   1/6   1/6
E (X ) = 3.5 is the population mean
V (X ) = 2.917 is the population variance
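
The two population parameters of the die follow directly from the probability table. The sketch below is Python with exact fractions (an assumption of this note; the course uses Excel and R).

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)                       # each face is equally likely

expected = sum(x * p for x in faces)                        # E(X)
variance = sum(x ** 2 * p for x in faces) - expected ** 2   # E(X^2) - E(X)^2

print("E(X) =", expected, "=", float(expected))
print("V(X) =", variance, "≈", round(float(variance), 3))
```

Exact arithmetic gives E(X) = 7/2 and V(X) = 35/12, which rounds to the 2.917 quoted above.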



Parameter

Random variable                            Population parameter
Distribution    Denoted      Parameter     Mean          Variance
Bernoulli       B(1, p)      p             p             p(1 − p)
Binomial        B(n, p)      n, p          np            np(1 − p)
Negative Bi.    NB(r, p)     r, p          r/p           r(1 − p)/p²
Geometric       G(p)         p             1/p           (1 − p)/p²
Poisson         P(λ)         λ             λ             λ
Uniform         U(a, b)      a, b          (a + b)/2     (b − a)²/12


Parameter

Random variable                            Population parameter
Distribution    Denoted            Parameter   Mean                      Variance
Exponential     E(λ)               λ           1/λ                       1/λ²
Erlang          EL(λ, r)           λ, r        r/λ                       r/λ²
Gamma           Γ(α, β)            α, β        αβ                        αβ²
Normal          N(µ, σ²)           µ, σ²       µ                         σ²
Log-normal      ln X ∼ N(µ, σ²)                $e^{(2\mu+\sigma^2)/2}$   $e^{2\mu+\sigma^2}(e^{\sigma^2} - 1)$
Chi-squared     χ²(v)              v           v                         2v
Student         T(v)               v           0                         v/(v − 2); v > 2

3.2. RANDOM SAMPLE
Population ≡ r.v. X. Population parameters: E (X ) = µ, V (X ) = σ².

Random sample
For a random variable X, a random sample of size n is a set of n random
variables, denoted by X = (X1 , X2 , ..., Xn ), such that
Xi are independent
Xi are identically distributed with X ; (Xi are iid., i = 1, ..., n).

For each element Xi :
E (X1 ) = E (X2 ) = · · · = E (Xn ) = µ
V (X1 ) = V (X2 ) = · · · = V (Xn ) = σ²
Random sample ⇔ Repeated sample
Observed sample
...is the set of observed values from a random sample, x = (x1 , x2 , ..., xn )


Statistic
Definition
A statistic is a function of the sample. Assume that the function is G.
Random sample: G = G(X1, X2, ..., Xn) is a random variable
Observed sample: g = G(x1, x2, ..., xn) is a number, the observed value

Example 3.2
Rolling a die, X is the number of dots; consider a sample of size 3 and its sample mean.
              Random sample              Observed sample
Sample        X = (X1, X2, X3)           x1 = (1, 3, 2); x2 = (1, 3, 4)
Sample mean   X̄ = (X1 + X2 + X3)/3       x̄1 = 2; x̄2 = 8/3
Probability   P(X̄ = 1) = 1/216           P(x̄1 = 1) = 0
              P(X̄ = 2) = ?               P(x̄1 = 2) = ?
              P(X̄ = 8/3) = ?             P(x̄1 = 8/3) = ?

Statistic
A statistic is a proxy for a population parameter computed from the sample. It should satisfy:
Stat. → Parameter as n → ∞
Mean of Stat. = Parameter
Variance of Stat. is small
Variance of Stat. → 0 as n → ∞
Probability distribution is specified

                  Random sample                  Observed sample
Sample            X = (X1, X2, ..., Xn)          x = (x1, x2, ..., xn)
Sample mean       X̄ = (Σ Xi)/n                   x̄ = (Σ xi)/n
Sample variance   S² = Σ (Xi − X̄)²/(n − 1)       s² = Σ (xi − x̄)²/(n − 1)
Sample S.D.       S = √S²                        s = √s²


3.3. SAMPLE MEAN

Sample Mean
From a population with mean µ and variance σ², take a sample (X1, X2, ..., Xn). The sample mean
$$\bar{X} = \frac{\sum_i X_i}{n}$$
is a random variable with
$$E(\bar{X}) = \mu ; \quad V(\bar{X}) = \frac{\sigma^2}{n} ; \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

X̄ disperses around the (population) mean
σ_X̄ is called the 'standard error of the mean' (S.E.)
V(X̄) → 0 as n → ∞
X̄ = X when n = 1


Prob. Distribution of Sample mean

Example 3.3
Find the distribution of the sample mean when rolling a die 1, 2, 3 times
n = 1:
X   1     2     3     4     5     6
P   1/6   1/6   1/6   1/6   1/6   1/6
n = 2 (sample mean for each pair of rolls):
      1     2     3     4     5     6
1     1     1.5   2     2.5   3     3.5
2     1.5   2     2.5   3     3.5   4
3     2     2.5   3     3.5   4     4.5
4     2.5   3     3.5   4     4.5   5
5     3     3.5   4     4.5   5     5.5
6     3.5   4     4.5   5     5.5   6


Example (cont.)
n = 2
X̄   1      1.5    2      2.5    3      3.5    4      4.5    5      5.5    6
P   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36

n = 3
X̄   1       4/3     5/3     2        7/3      8/3      ···   16/3    17/3    6
P   1/216   3/216   6/216   10/216   15/216   21/216   ···   6/216   3/216   1/216

[Figure: bar charts of the distribution of X̄ for n = 1, n = 2 and n = 3]
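The tables of Example 3.3 can be reproduced by brute-force enumeration. The sketch below is only an illustration; the helper name sample_mean_dist is hypothetical, and it lists all 6^n equally likely outcomes, so it is meant for small n.

```r
# Exact distribution of the sample mean of n fair-die rolls by enumeration
sample_mean_dist <- function(n) {
  rolls <- expand.grid(rep(list(1:6), n))   # all 6^n equally likely outcomes
  counts <- table(rowSums(rolls))           # number of outcomes for each total
  data.frame(mean = as.numeric(names(counts)) / n,
             prob = as.numeric(counts) / 6^n)
}

sample_mean_dist(2)   # 11 values from 1 to 6, probabilities k/36
sample_mean_dist(3)   # 16 values from 1 to 6, probabilities k/216
```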



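Distribution of Sample mean: Normal Population
Normal population
If the population is Normal, X ∼ N(µ, σ²), then X̄ is Normally distributed with
$$\mu_{\bar{X}} = E(\bar{X}) = \mu, \qquad \sigma_{\bar{X}}^2 = V(\bar{X}) = \frac{\sigma^2}{n}$$
$$\bar{X} \sim N\left(\mu_{\bar{X}}, \sigma_{\bar{X}}^2\right), \text{ or } N\left(\mu, \frac{\sigma^2}{n}\right), \text{ or } N\left(\mu, \left(\frac{\sigma}{\sqrt{n}}\right)^2\right)$$

[Figure: density of X̄ centred at µ for n = 4, n = 10 and n = 25]
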
Example

Example 3.4
Wage is Normally distributed with mean 200 and SD 40.
(a) Find the probability that the average wage of a sample is greater than
210, with n = 1; n = 4; n = 16; n = 100
(b) With probability 0.9, find the upper limit of the average wage of 16
workers
(c) With probability 0.9, find at least 3 intervals around the population mean
that the average wage of 16 workers falls in
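Parts (a) and (b) of Example 3.4 can be checked numerically. This is a minimal sketch using base R's pnorm and qnorm; the variable names are illustrative.

```r
mu <- 200; sigma <- 40

# (a) P(sample mean > 210) for several sample sizes
n <- c(1, 4, 16, 100)
p_above <- pnorm(210, mean = mu, sd = sigma / sqrt(n), lower.tail = FALSE)
round(setNames(p_above, paste0("n=", n)), 4)

# (b) upper limit c such that P(sample mean < c) = 0.9 when n = 16
upper16 <- mu + qnorm(0.9) * sigma / sqrt(16)
upper16
```

The probability shrinks as n grows because the standard error σ/√n shrinks.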

Example 3.5
For a Normal population, X ∼ N(µ, σ²), and sample size n, with probability
(1 − α), build the formula for intervals that the sample mean falls in. Which of
them is the narrowest?


Distribution of Sample mean: Non-Normal Population

Non-Normal population
If the population has mean µ and variance σ² but is not Normal, then for a large
sample (n > 30) the Central Limit Theorem applies and X̄ is approximately Normally
distributed:
$$\bar{X} \sim N\left(\mu_{\bar{X}}, \sigma_{\bar{X}}^2\right), \text{ or } N\left(\mu, \frac{\sigma^2}{n}\right), \text{ or } N\left(\mu, \left(\frac{\sigma}{\sqrt{n}}\right)^2\right)$$

For a Non-normal population and a small sample, there is no specified distribution of the
sample mean.
Example 3.6
The time (hours) a worker spends to finish a task is Exponentially distributed
with λ = 0.2. Randomly survey 50 workers.
(a) Find the probability that the sample mean is greater than 6 hours
(b) With probability 0.95, find the upper limit of the sample mean
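A sketch of Example 3.6 under the CLT approximation. As a cross-check (an addition, not part of the slides), the exact probability uses the fact that the mean of n i.i.d. Exponential(λ) variables is Gamma distributed with shape n and rate nλ.

```r
lambda <- 0.2; n <- 50
mu <- 1 / lambda        # population mean of Exponential(lambda)
sigma <- 1 / lambda     # population S.D. of Exponential(lambda)
se <- sigma / sqrt(n)   # standard error of the sample mean

p_clt <- pnorm(6, mean = mu, sd = se, lower.tail = FALSE)   # (a), CLT
upper_clt <- mu + qnorm(0.95) * se                          # (b), CLT

# exact cross-check for (a): sample mean ~ Gamma(shape = n, rate = n * lambda)
p_exact <- pgamma(6, shape = n, rate = n * lambda, lower.tail = FALSE)

c(clt = p_clt, exact = p_exact, upper = upper_clt)
```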
Interval for Sample mean

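Population mean and variance: µ, σ². For a Normal population, or a Non-normal
population with a large sample, with probability (1 − α) the interval of the sample mean
(acceptance interval) is:
Two-tailed
$$\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Right-tailed and Left-tailed
$$\mu - z_{\alpha}\frac{\sigma}{\sqrt{n}} < \bar{X} \quad\text{and}\quad \bar{X} < \mu + z_{\alpha}\frac{\sigma}{\sqrt{n}}$$

[Figure: two-tailed, right-tailed and left-tailed acceptance intervals around µ]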


3.4. SAMPLE VARIANCE
Population variance is σ².
If the population mean µ is known
$$S_\mu^2 = \frac{\sum_i (X_i - \mu)^2}{n} \Rightarrow E(S_\mu^2) = \sigma^2$$
If the population mean µ is unknown
$$MS = \frac{\sum_i (X_i - \bar{X})^2}{n} = \overline{X^2} - (\bar{X})^2$$
$$E(MS) = V(X) - V(\bar{X}) = \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\sigma^2$$
MS is a 'biased' proxy of the population variance.

Sample Variance
$$S^2 = \frac{n}{n-1} MS = \frac{\sum_i (X_i - \bar{X})^2}{n-1} \Rightarrow E(S^2) = \sigma^2$$

Distribution of Sample Variance: Normal population

Statistic   Expectation    Variance          Related distribution         Df
S_µ²        σ²             2σ⁴/n             nS_µ²/σ² ∼ χ²(n)             n
MS          (n − 1)σ²/n    2(n − 1)σ⁴/n²     nMS/σ² ∼ χ²(n − 1)           n − 1
S²          σ²             2σ⁴/(n − 1)       (n − 1)S²/σ² ∼ χ²(n − 1)     n − 1
That is, S² is proportional to a Chi-squared variable, denoted: S² ∝ χ²(n − 1).
Example 3.7
The variance of wage is 30 (USD²). Randomly survey 25 workers. With
probability 0.9, find the upper limit of the sample variance.


Normal Population

X ∼ N(µ, σ²)    Known µ                         Unknown µ
Known σ²        (X̄ − µ)/(σ/√n) ∼ N(0, 1)
                nS_µ²/σ² ∼ χ²(n)                (n − 1)S²/σ² ∼ χ²(n − 1)
Unknown σ²      (X̄ − µ)/(S_µ/√n) ∼ T(n)         (X̄ − µ)/(S/√n) ∼ T(n − 1)
                                                (n − 1)S²/σ² ∼ χ²(n − 1)


3.5. SAMPLE PROPORTION
In the population, X = 1 for event A, X = 0 for Ā (A does not occur). If P(A) = p then
X ∼ B(1, p).
Sample Proportion
X ∼ B(1, p), sample (X1, X2, ..., Xn); then the sample mean is the sample
proportion
$$\bar{X} = \frac{\sum_i X_i}{n} = \hat{p}$$
$$E(\bar{X}) = E(\hat{p}) = p ; \quad V(\bar{X}) = V(\hat{p}) = \frac{p(1-p)}{n}$$

Example 3.8
The probability that a voter says 'Yes' is 0.7. X = 1 if the voter says Yes, and 0
otherwise. Data is {Yes, Yes, No, No, Yes}.
(a) Write the sample in terms of X, find the sample mean and sample proportion
(b) Find the probability that the above sample occurs

Distribution of Sample Proportion

By the Central Limit Theorem, for n > 30 (more commonly: n ≥ 100), the sample
proportion is asymptotically Normally distributed
Distribution
With n large enough
$$\hat{p} \sim N\left(p, \frac{p(1-p)}{n}\right)$$

[Figure: distribution of p̂ for p = 0.7 with n = 1, n = 4, n = 16 and n = 32]


Example

Example 3.9
The probability that a customer of a commercial bank is in bad debt is 0.2. With
a sample of 200 customers, find
(a) the probability that the proportion of bad debt is greater than 22%
(b) the upper limit of the proportion of bad debt, with probability level 0.9
(c) the narrowest interval of the bad-debt proportion, with probability 0.95
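A sketch of Example 3.9 using the Normal approximation for p̂. For (c) the code takes the interval symmetric about p, which is the narrowest choice for a symmetric bell-shaped density; variable names are illustrative.

```r
p <- 0.2; n <- 200
se <- sqrt(p * (1 - p) / n)   # standard error of the sample proportion

p_a <- pnorm(0.22, mean = p, sd = se, lower.tail = FALSE)   # (a)
upper_b <- p + qnorm(0.90) * se                             # (b)
int_c <- p + c(-1, 1) * qnorm(0.975) * se                   # (c)

list(p_a = p_a, upper_b = upper_b, int_c = int_c)
```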

Example 3.10
The population proportion is p; for a sample of size n and probability level
(1 − α), find intervals for the sample proportion


Interval for Sample proportion

Population proportion p, large n: with probability (1 − α), the interval
of the sample proportion (acceptance interval) is:
Right-tailed and Left-tailed
$$p - z_{\alpha}\sqrt{\frac{p(1-p)}{n}} < \hat{p} \quad\text{and}\quad \hat{p} < p + z_{\alpha}\sqrt{\frac{p(1-p)}{n}}$$
Two-tailed
$$p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} < \hat{p} < p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}$$


Exercise - Lecture 03
Book Page Compulsory Optional
[1] 295 1, 2 5, 6
304 11, 12, 15 13, 16, 18, 19, 21, 22
312 27, 28 29, 33, 36, 38, 41
325 47, 48, 56
326 68, 69 71, 72, 74, 77
[2] 241
258
Using R to build acceptance interval for sample mean:
> mu <-...
> sigma <-...
> prob <-...
> size <-...
> ll <-...
> ul <-...
Lec 4. POINT ESTIMATION

4.1 Concepts [1] p.332


4.2 Criteria for Point Estimate [1] p.335
4.3 Percentile Matching Estimator
4.4 Moment Estimator [1] p.350
4.5 Maximum Likelihood Estimator [1] p.352
4.6 Fisher Information [1] p.371
4.7 Cramér-Rao inequality [1] p373

Reference
Book [1] Chapter 7, pp.284 - 381.
Book [2] Chapter 10
Book [3] Chapter 7



4.1. CONCEPTS
Unknown population parameter θ
µ, σ 2 of Normal distribution
p of Bernoulli, n, p of Binomial
λ of Poisson, λ of Exponential
...
Methodology
Based on random sample (X1 , X2 , ..., Xn )
Point estimate: single value θ̂
Interval estimate: (LL, UL) = (lowerlimit, upperlimit)

Example 4.1
Estimate for mean: $\bar{X} = \frac{\sum X_i}{n}$; $X_{mid} = \frac{X_{min} + X_{max}}{2}$
Estimate for variance: $S^2 = \frac{\sum (X_i - \bar{X})^2}{n-1}$; $MS = \frac{\sum (X_i - \bar{X})^2}{n}$

Estimator vs Estimate
An estimator is a random statistic on a random sample
An estimate is the observed value of the statistic from an observed sample
The S.D. of an estimator is called the Standard error (S.E.)

Example 4.2
               Random sample                           Observed sample
Estimation     (X1, X2, X3)                            (10, 14, 21)
for mean       X̄ = (X1 + X2 + X3)/3                    x̄ = 15
for variance   S² = Σ_{i=1..3} (Xi − X̄)²/(3 − 1)       s² = 31
               Estimator                               Estimate


4.2. CRITERIA FOR ESTIMATOR

Mean Squared Error (MSE) when estimating θ by the estimator θ̂
$$MSE = E(\hat\theta - \theta)^2 = \left(E(\hat\theta) - \theta\right)^2 + V(\hat\theta) = bias^2 + V(\hat\theta)$$

Unbiasedness
θ̂ is an unbiased estimator of θ if
E(θ̂) = θ, or bias = 0

Biased estimator
Over-estimate: E(θ̂) > θ, or bias > 0
Under-estimate: E(θ̂) < θ, or bias < 0


Criteria for Estimator
Efficiency
θ̂1 and θ̂2 are unbiased estimators
If V(θ̂1) < V(θ̂2) then θ̂1 is relatively more efficient than θ̂2
If V(θ̂1) is the minimum among all unbiased estimators → 'efficient'
estimator, or MVUE: minimum variance unbiased estimator, BUE:
best unbiased estimator

[Figure: scatter of estimates around the target for Biased, Unbiased and Efficient estimators]

Example

Example 4.3
From a population with mean µ and variance σ²,
(a) random sample (X1, X2, X3), find the MVUE of µ among
M1 = (1/3)X1 + (1/3)X2        M4 = (1/2)X1 + (1/3)X2 + (1/6)X3
M2 = (1/2)X1 + (1/2)X2        M5 = (1/2)X1 + (1/4)X2 + (1/4)X3
M3 = (1/3)X1 + (2/3)X2        M6 = (1/3)X1 + (1/3)X2 + (1/3)X3

(b) sample (X1, X2, ..., Xn), find the MVUE in the class of

µ̂ = α1 X1 + α2 X2 + · · · + αn Xn
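For part (a), a linear estimator Σ αi Xi of an i.i.d. sample has expectation (Σ αi)µ and variance (Σ αi²)σ², so it is enough to compare the weights. The sketch below does this bookkeeping; the list name w is illustrative.

```r
# Weights of M1..M6 on (X1, X2, X3)
w <- list(
  M1 = c(1/3, 1/3, 0),   M2 = c(1/2, 1/2, 0),   M3 = c(1/3, 2/3, 0),
  M4 = c(1/2, 1/3, 1/6), M5 = c(1/2, 1/4, 1/4), M6 = c(1/3, 1/3, 1/3)
)

total <- sapply(w, sum)                             # E(M) = total * mu
var_factor <- sapply(w, function(a) sum(a^2))       # V(M) = var_factor * sigma^2
unbiased <- abs(total - 1) < 1e-12

best <- names(which.min(var_factor[unbiased]))      # smallest variance among unbiased
data.frame(total, var_factor, unbiased)
best
```

M1 is excluded because its weights sum to 2/3, so it is biased.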



Example
Example 4.4
From a population with mean µ and variance σ², take two samples of size
n1 = 3, n2 = 5, with sample means X̄1, X̄2
(a) Find the MVUE among
N1 = (1/3)X̄1 + (1/3)X̄2        N3 = (1/3)X̄1 + (2/3)X̄2
N2 = (1/2)X̄1 + (1/2)X̄2        N4 = (2/3)X̄1 + (1/3)X̄2
(b) Find α1, α2 of the MVUE in the class of
µ̂ = α1 X̄1 + α2 X̄2
(c) In general, there are k samples of size nj with sample means X̄j, j = 1, ..., k.
Find the MVUE in the class of
µ̂ = α1 X̄1 + α2 X̄2 + · · · + αk X̄k

Example
Example 4.5
For a Normal population, which is the MVUE of the population variance σ² among

Estimator     S_µ²      MS                S²
Expectation   σ²        (n − 1)σ²/n       σ²
Variance      2σ⁴/n     2(n − 1)σ⁴/n²     2σ⁴/(n − 1)

Example 4.6
From a Normal population, sample 1 has size 3 and variance S1², sample 2
has size 5 and variance S2².
(a) Find the MVUE of σ² among all linear combinations of S1² and S2²
(b) In general, with k samples of size nj and sample variance Sj², j = 1, ..., k, find the
MVUE of σ² among all linear combinations of Sj²

4.3. PERCENTILE MATCHING ESTIMATOR

If a parameter can be calculated from a percentile / quantile, estimate the
quantile by the sample quantile → estimate the parameter
Using: median, 1st quartile,...

Example 4.7
Data is: 4, 4, 5, 6, 8, 9, 10. Find the percentile matching estimate of
(a) parameter λ of the Exponential distribution, in which the median is (ln 2)/λ
(b) parameters a, b of the Uniform distribution
(c) parameters µ, σ² of the Normal distribution


4.4. MOMENT ESTIMATOR

If a parameter can be calculated from moments, estimate the moments →
estimate the parameter
Estimate k parameters by the first k moments
Estimate E(X) by $\bar{X}$
Estimate E(X²) by $\overline{X^2}$
Estimate V(X) = E(X²) − (E(X))² by $\overline{X^2} - (\bar{X})^2 = MS$


Example 4.8
Data is: 4, 4, 5, 6, 8, 9, 10. Find the moment estimate of
(a) parameter λ of the Poisson distribution
(b) parameter λ of the Exponential distribution
(c) parameters µ, σ² of the Normal distribution
(d) parameters a, b of the Uniform distribution
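A sketch of the moment estimates in Example 4.8, matching E(X) to x̄ and V(X) to MS. The Uniform case solves (a + b)/2 = x̄ and (b − a)²/12 = MS, which gives x̄ ∓ √(3·MS); the list name est is illustrative.

```r
x <- c(4, 4, 5, 6, 8, 9, 10)
xbar <- mean(x)              # first sample moment
ms <- mean(x^2) - xbar^2     # second central sample moment (MS)

est <- list(
  poisson_lambda     = xbar,                     # E(X) = lambda
  exponential_lambda = 1 / xbar,                 # E(X) = 1/lambda
  normal             = c(mu = xbar, sigma2 = ms),
  uniform            = c(a = xbar - sqrt(3 * ms), b = xbar + sqrt(3 * ms))
)
est
```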



4.5. MAXIMUM LIKELIHOOD ESTIMATOR

A discrete rv X has PMF P(x); the probability that the sample (x1, x2, ..., xn)
occurs, P(x) = P(x1)P(x2) · · · P(xn), is the 'likelihood measure'.

Likelihood function
Rv X with parameter θ, random sample X = (X1, X2, ..., Xn); then the
likelihood function is
$$L(\mathbf{X}, \theta) = \begin{cases} \prod_{i=1}^{n} P(X_i, \theta) & \text{: discrete} \\ \prod_{i=1}^{n} f(X_i, \theta) & \text{: continuous} \end{cases}$$

Example 4.9
Probability is p = 0.7; which of the following samples is more likely?
x1 = (1, 0, 1); x2 = (0, 1, 0); x3 = (0, 0, 1); x4 = (1, 1, 1)



Example
Example 4.10
Sample is x = (1, 0, 1);
(a) which p is more likely? p = 0.3; p = 0.5; p = 0.7; p = 0.9
(b) Find the most likely value of p

p       L(x, p)
0.01    0.000099
0.02    0.000392
...     ...
0.66    0.148104
0.67    0.148137
0.68    0.147968
...     ...
0.98    0.019208
0.99    0.009801

[Figure: plot of L(p) against p for p from 0 to 1, vertical axis from 0 to 0.15]
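The table of Example 4.10 can be reproduced, and the maximum located numerically, with a short sketch. The function name lik is illustrative; optimize is base R's one-dimensional optimizer.

```r
x <- c(1, 0, 1)   # observed Bernoulli sample

# Likelihood of the sample for a single value of p: p^2 * (1 - p) here
lik <- function(p) prod(p^x * (1 - p)^(1 - x))

# (a) compare the candidate values
sapply(c(0.3, 0.5, 0.7, 0.9), lik)

# (b) numerical maximum over (0, 1); analytically the maximizer is 2/3
fit <- optimize(lik, interval = c(0, 1), maximum = TRUE)
p_hat <- fit$maximum
p_hat
```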
Maximum Likelihood Estimator (MLE)

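MLE
The MLE of θ is the θ̂ that maximizes the likelihood function or the logarithm of
the likelihood function
$$L(\mathbf{X}, \theta) \to \max \quad\text{or}\quad \ln L(\mathbf{X}, \theta) \to \max$$
$$L(\mathbf{X}, \theta) \to \max \Leftrightarrow \begin{cases} \dfrac{\partial L(\mathbf{X}, \theta)}{\partial\theta} = 0 \\ \dfrac{\partial^2 L(\mathbf{X}, \theta)}{\partial\theta^2} < 0 \end{cases}$$
$$\ln L(\mathbf{X}, \theta) \to \max \Leftrightarrow \begin{cases} \dfrac{\partial \ln L(\mathbf{X}, \theta)}{\partial\theta} = 0 \\ \dfrac{\partial^2 \ln L(\mathbf{X}, \theta)}{\partial\theta^2} < 0 \end{cases}$$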


Example

Example 4.11
Find the MLE of the parameter(s) in the distributions
(a) Bernoulli: $P(x, p) = p^x (1-p)^{1-x}$; $x \in \{0, 1\}$
(b) Poisson: $P(x, \lambda) = \dfrac{e^{-\lambda}\lambda^x}{x!}$; $x \in \{0, 1, 2, ...\}$
(c) Normal: $f(x, \mu, \sigma^2) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$; $x \in \mathbb{R}$
(d) Exponential: $f(x, \lambda) = \lambda e^{-\lambda x}$; $x > 0$


4.5. FISHER INFORMATION

Score and Information


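Score function
$$Score(\mathbf{X}, \theta) = \frac{\partial \ln L(\mathbf{X}, \theta)}{\partial\theta}$$
Fisher's information about θ in a sample of size n
$$I_n(\theta) = V(Score) = E\left(Score^2\right) = -E\left(\frac{\partial^2 \ln L(\mathbf{X}, \theta)}{\partial\theta^2}\right)$$

Properties
E(Score) = 0
If Fisher's information in one element is $I_X(\theta)$, then
$$I_n(\theta) = n I_X(\theta)$$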


4.6. EFFICIENCY

Cramér-Rao Inequality
For every unbiased estimator θ̂ of θ on a sample of n elements,
$$V(\hat\theta) \ge \frac{1}{I_n(\theta)}$$

The RHS of the inequality is the lower bound of the variance of unbiased estimators
An unbiased estimator θ̂ with V(θ̂) = RHS is efficient / MVUE / BUE

Example 4.12
Prove that
(a) p̂ is the MVUE of p in the Bernoulli distribution
(b) X̄ is the MVUE of µ in the Normal distribution
(c) S² is not the MVUE of σ² in the Normal distribution


Other Categories

Asymptotically unbiased estimator θ̂
$$\lim_{n\to\infty} E(\hat\theta) = \theta$$

Asymptotically efficient unbiased estimator θ̂
$$\lim_{n\to\infty} V(\hat\theta) = \frac{1}{I_n(\theta)}$$

Consistent estimator
$$\lim_{n\to\infty} P(|\hat\theta - \theta| < \varepsilon) = 1$$


In practice

Estimator                                                     Small sample   Large sample
Unbiased   Efficient                                          :))            :)))
           Not efficient, asymptotically efficient            :|             :))
           Not efficient, not asymptotically efficient        :|             :|
Biased     Asymptotically unbiased                            :((            :|
           Not asymptotically unbiased                        :((            :((
Consistent                                                    :((            :)


Summary

Distribution   Parameter   Method        Estimator   Property
Bernoulli      p           Moment, ML    p̂ = X̄       Efficient
Normal         µ           Quantile      Me          Asym. unbiased
                           Moment, ML    X̄           Efficient
               σ²          Quantile      ∗           Biased
                           Moment, ML    MS          Asym. unbiased
                           *Known µ      S_µ²        Efficient
                           *Adjust MS    S²          Asym. efficient
Poisson        λ           Moment, ML    X̄           Efficient
Exponential    λ           Moment, ML    1/X̄         Biased


Exercise - Lecture 04

Book Page Compulsory Optional


[1] 346 1, 2, 7 8, 9, 10, 11, 12
359 21, 27 22, 23, 30
378 42 43, 44, 45, 46
[2] 290
299
305

Using R to evaluate the likelihood function of a sample from a Normal population
> sample <-...
> mean <-...
> ms <-...
> density <-...
> likelihood <-...
LECTURE 5. CONFIDENCE INTERVAL

5.1 Concepts [1] p.332


5.2 CI for Normal Mean - Known σ [1] p.383
5.3 CI for Normal Mean - Unknown σ [1] p.401
5.4 CI for Normal Variance [1] p.409
5.5 CI for Proportion [1] p.395

Reference
Book [1] Chapter 8, pp.382 - 424.
Book [2] Chapter 11
Book [4] Chapter 8



5.1. CONCEPTS

Interval estimate ⇔ Confidence interval


Population parameter θ, needed to be estimated
Sample (X1 , X2 , ..., Xn )
Confidence level (1 − α)
Confidence Interval: (LL, UL)

P(LL < θ < UL) = 1 − α

Confidence width
w = UL − LL



5.2. C.I. FOR NORMAL MEAN - KNOWN σ

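X ∼ N(µ, σ²); σ is known. Find the C.I. for µ with confidence level (1 − α).
$$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \Rightarrow Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$
$$P\left(\mu - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \bar{X} < \mu + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Two-sided Confidence interval
$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
Other way: $\bar{X} \pm ME$, in which $ME = z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
C.I. width: w = 2ME
Sample size: $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{ME}\right)^2 = \left(\dfrac{2 z_{\alpha/2}\,\sigma}{w}\right)^2$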


C.I. for Normal Mean - Known σ

One-sided C.I.
Right tail (Lower bounded) and Left tail (Upper bounded) C.I.
$$P\left(\bar{X} - z_{\alpha}\frac{\sigma}{\sqrt{n}} < \mu\right) = 1 - \alpha \quad\text{and}\quad P\left(\mu < \bar{X} + z_{\alpha}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Replace X̄ from the random sample by x̄ from the observed sample.

Example 5.1
The weight of a product (g) is Normally distributed with variance 25 g².
A random survey of 16 products gives a total weight of 800 g.
(a) Find the 95% C.I. of the mean
(b) Find the lower-bounded 90% C.I. of the mean
(c) To reduce the C.I. width to 3 g at confidence level 95%, how many products
should be surveyed?
(d) The total weight of 24 other products is 1248 g. Find the 95% C.I. of the mean
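Parts (a) to (c) of Example 5.1 as a minimal R sketch. Variable names are illustrative; part (d) is left out because it depends on how the two samples are combined.

```r
sigma <- 5; n <- 16
xbar <- 800 / n               # observed sample mean
se <- sigma / sqrt(n)

ci95 <- xbar + c(-1, 1) * qnorm(0.975) * se    # (a) two-sided 95% C.I.
lower90 <- xbar - qnorm(0.90) * se             # (b) lower bound, 90%

# (c) smallest n giving width w = 3 at 95%: n = (2 * z * sigma / w)^2
n_needed <- ceiling((2 * qnorm(0.975) * sigma / 3)^2)

list(ci95 = ci95, lower90 = lower90, n_needed = n_needed)
```

The interval in (a) is 50 ± 2.45, the same numbers that appear on the "Random or Non-random" slide.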
C.I: Random or Non-random

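Which is correct?
$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$
$$P\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$$
$$P\left(\bar{X} - 1.96 \cdot \frac{5}{4} < \mu < \bar{X} + 1.96 \cdot \frac{5}{4}\right) = 0.95$$
$$P\left(50 - 1.96 \cdot \frac{5}{4} < \mu < 50 + 1.96 \cdot \frac{5}{4}\right) = 0.95$$
$$P(50 - 2.45 < \mu < 50 + 2.45) = 0.95$$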


Confidence Interval: random or non-random

[Figure: 95% confidence interval around X̄]


5.3. C.I. FOR NORMAL MEAN - UNKNOWN σ
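Replace σ by S: $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} \sim T(n - 1)$

C.I. for mean
Two-sided C.I.
$$\bar{X} - t^{(n-1)}_{\alpha/2}\frac{S}{\sqrt{n}} < \mu < \bar{X} + t^{(n-1)}_{\alpha/2}\frac{S}{\sqrt{n}}$$
Right-sided and Left-sided
$$\bar{X} - t^{(n-1)}_{\alpha}\frac{S}{\sqrt{n}} < \mu \quad\text{and}\quad \mu < \bar{X} + t^{(n-1)}_{\alpha}\frac{S}{\sqrt{n}}$$

The C.I. is for the mean; the interval for a single random observation is the prediction
interval (P.I.), with prediction level (1 − α)
$$\bar{X} \pm t^{(n-1)}_{\alpha/2}\, S \sqrt{1 + \frac{1}{n}}$$
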
Example

Example 5.2
Assume that the wage per hour ($) of financial experts is Normally distributed.
Data on 16 randomly surveyed experts are below
Wage (x)   20   22   24   26
Freq.       1    6    5    4
and $\sum_i x_i = 376$; $\sum_i x_i^2 = 8888$
(a) Calculate the sample mean and standard deviation
(b) Find the 95% confidence interval of the mean (population mean)
(c) Find the 90% prediction interval
(d) Find the 80% upper confidence interval of the mean
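A sketch of Example 5.2 built from the frequency table, using the t-based formulas of Section 5.3. Variable names are illustrative; the two-sided interval can be cross-checked against base R's t.test.

```r
# Rebuild the 16 observations from the frequency table
x <- rep(c(20, 22, 24, 26), times = c(1, 6, 5, 4))
n <- length(x)
xbar <- mean(x)   # (a) sample mean
s <- sd(x)        # (a) sample standard deviation (divisor n - 1)

ci95 <- xbar + c(-1, 1) * qt(0.975, n - 1) * s / sqrt(n)         # (b)
pi90 <- xbar + c(-1, 1) * qt(0.95, n - 1) * s * sqrt(1 + 1 / n)  # (c)
upper80 <- xbar + qt(0.80, n - 1) * s / sqrt(n)                  # (d)

list(mean = xbar, sd = s, ci95 = ci95, pi90 = pi90, upper80 = upper80)
t.test(x)$conf.int   # should agree with ci95
```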



5.4. C.I. FOR NORMAL VARIANCE

Variance of a Normal population
Two-sided C.I.
$$\frac{(n-1)S^2}{\chi^{2(n-1)}_{\alpha/2}} < \sigma^2 < \frac{(n-1)S^2}{\chi^{2(n-1)}_{1-\alpha/2}}$$
Right-sided and Left-sided
$$\frac{(n-1)S^2}{\chi^{2(n-1)}_{\alpha}} < \sigma^2 \quad\text{and}\quad \sigma^2 < \frac{(n-1)S^2}{\chi^{2(n-1)}_{1-\alpha}}$$

Example 5.3
A wage survey of 16 experts shows a mean of $23.5 and a variance of
3.467 ($²). Assume that wage is Normally distributed.
(a) Find the 95% confidence interval of the variance
(b) Find the 90% upper confidence bound of the standard deviation
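A sketch of Example 5.3 from the summary statistics. Note that the slides index critical values by upper-tail probability, while qchisq takes a lower-tail probability, so the slide's upper-tail critical value at α/2 is qchisq(1 − α/2, n − 1).

```r
n <- 16; s2 <- 3.467

# (a) two-sided 95% C.I. for the variance
ci95_var <- (n - 1) * s2 / qchisq(c(0.975, 0.025), df = n - 1)

# (b) 90% upper bound for the variance, then for the standard deviation
upper90_var <- (n - 1) * s2 / qchisq(0.10, df = n - 1)
upper90_sd <- sqrt(upper90_var)

list(ci95_var = ci95_var, upper90_sd = upper90_sd)
```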



5.5. C.I. FOR PROPORTION

 
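n large enough, $\hat{p} \sim N\left(p, \dfrac{p(1-p)}{n}\right)$
$$P\left(p - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} < \hat{p} < p + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right) = 1 - \alpha$$

Two-sided C.I.
$$p \in \frac{\hat{p} + z_{\alpha/2}^2/(2n)}{1 + z_{\alpha/2}^2/n} \pm z_{\alpha/2}\,\frac{\sqrt{\hat{p}(1-\hat{p})/n + z_{\alpha/2}^2/(4n^2)}}{1 + z_{\alpha/2}^2/n}$$
One-sided C.I.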


C.I. for Proportion - Large Sample

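Large sample (n ≥ 100)

Two-sided C.I.
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Shorten: $\hat{p} \pm ME$, in which $ME = z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
Sample size: $n = \dfrac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{ME^2} = \dfrac{4 z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{w^2}$
One-sided C.I.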


Example

Example 5.4
In 200 insurance customers, there were 48 claims.
(a) Find the 90% C.I. for claim proportion in customers
(b) Keep ME = 3%, C.I 90%, how many customers should be surveyed?
(c) Keep ME = 3%, with 200 customers, what is confidence level?
(d) Find the 95% C.I of number of claims in 1000 customers.
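Parts (a) to (c) of Example 5.4 with the large-sample formulas above, as a minimal sketch. Variable names are illustrative, and (b) assumes p̂ stays at its observed value when the sample grows.

```r
n <- 200; claims <- 48
p_hat <- claims / n
se <- sqrt(p_hat * (1 - p_hat) / n)

ci90 <- p_hat + c(-1, 1) * qnorm(0.95) * se                          # (a)
n_needed <- ceiling(qnorm(0.95)^2 * p_hat * (1 - p_hat) / 0.03^2)    # (b)
conf_level <- 2 * pnorm(0.03 / se) - 1                               # (c)

list(ci90 = ci90, n_needed = n_needed, conf_level = conf_level)
```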

Example 5.5
To estimate a proportion from a sample of n = 400 with confidence level 90%,
what is the maximum value that the ME of the C.I. could take?
Example 5.6
Catch 500 fish in the lake, mark them, then release them. Then catch 1000
fish, of which 50 are marked. Find the right-sided (lower-bounded)
90% C.I. of the total number of fish in the lake.

Exercise - Lecture 05

Book Page Compulsory Optional


[1] 390 1, 2, 3, 4 5, 6, 7, 8, 9
399 12, 13 15, 19, 23, 25, 28
407 35, 37 42
411 47 48
420 59 61, 62, 64, 65
[2] 325
329
331
333
Using R to build CI for mean, variance of Normality variable
> sample <-...
> mean <-...
> ssq <-...
> conflevel <-...
R code: C.I. for Mean: Known sigma

> # Two-sided C.I. for the mean of a Normal population, sigma known
> ci_mean <- function(sample,sigma,alpha){
> n <- length(sample)
> ME <- qnorm(1-alpha/2)*sigma/sqrt(n)   # margin of error
> ll <- mean(sample) - ME
> ul <- mean(sample) + ME
> c(ll, ul)
> }
> sample <- c(...)
> sigma <-...
> alpha <- 0.05
> ci_mean(sample,sigma,alpha)


R code: C.I. for Mean: Unknown sigma

> # Two-sided C.I. for the mean of a Normal population, sigma unknown
> ci_mean <- function(sample,alpha){
> n <- length(sample)
> ME <- qt(1-alpha/2,n-1)*sd(sample)/sqrt(n)   # margin of error
> ll <- mean(sample) - ME
> ul <- mean(sample) + ME
> c(ll, ul)
> }
> sample <- c(...)
> alpha <- 0.05
> ci_mean(sample, alpha)


R code: C.I. for Variance

> # Two-sided C.I. for the variance of a Normal population
> ci_var <- function(sample,alpha) {
> n <- length(sample)
> ll <- (n-1)*var(sample)/qchisq(1-alpha/2,n-1)
> ul <- (n-1)*var(sample)/qchisq(alpha/2,n-1)
> c(ll, ul)
> }
> sample <- c(...)
> alpha <- 0.05
> ci_var(sample,alpha)


R code: C.I. for Proportion

> # Two-sided C.I. for a proportion (formula of Section 5.5)
> ci_p <- function(n,freq,alpha) {
> ph <- freq/n
> za <- qnorm(1-alpha/2)
> ph2 <- (ph + za^2/(2*n))/(1+za^2/n)   # adjusted centre
> me <- sqrt(ph*(1-ph)/n + za^2/(4*n^2))/(1+za^2/n)
> ll <- ph2 - za*me
> ul <- ph2 + za*me
> c(ll, ul)
> }
> n <- ...
> freq <- ...
> alpha <- 0.05
> ci_p(n,freq,alpha)

Lec 6. HYPOTHESIS TESTING - ONE SAMPLE

6.1 Concepts [1] p.426


6.2 Testing Procedure [1] p.428
6.3 Test for Normal Mean - known σ [1] p.436
6.4 Test for Normal Mean - unknown σ [1] p.443
6.5 Test for Normal Variance
6.6 Test for Proportion [1] p.450
6.7 Most Powerful test [1] p.473
6.8 Likelihood Ratio test [1] p.475

Reference
Book [1] Chapter 9, pp. 425 – 483.
Book [2] Hypothesis Testing, pp.337 – 390.



6.1. CONCEPTS

A statement to be tested is a hypothesis
Parametric hypothesis: about a parameter θ
Non-parametric hypothesis: about distribution, independence
Hypotheses pair
H0 : Null hypothesis
H1 : Alternative hypothesis
Using information from the sample to decide:
Reject H0 : not reject H1
Not reject H0 : reject H1
Simple hypotheses pair about parameter θ
$$\begin{cases} H_0: \theta = \theta_0 \\ H_1: \theta = \theta_1 \end{cases}$$


Hypotheses Pair
General hypotheses pair
$$\begin{cases} H_0: \theta \in \Omega_0 \\ H_1: \theta \in \Omega_1 \end{cases}, \quad \Omega_0 \cap \Omega_1 = \emptyset$$

3 composite hypotheses pairs: (1) upper-tail (2) lower-tail (3) two-tail
(1) H0 : θ = θ0, H1 : θ > θ0
(2) H0 : θ = θ0, H1 : θ < θ0
(3) H0 : θ = θ0, H1 : θ ≠ θ0

Example 6.1
Determine the hypotheses pair, and the conclusion when rejecting and when not rejecting H0
(a) Last year, average income was 120. This year, income has increased
(b) Average price does not differ from 15
(c) Proportion of bad debt has been over 10%
(d) The variability of stock price is greater than 25 ($²)

Types of Error

Null hypothesis H0 is the reference
Error type 1: Reject H0 when H0 is True
Error type 2: Not reject H0 when H0 is False

Decision         H0 is TRUE                   H0 is FALSE
Reject H0        Error type 1, Prob. = α      Correct, Prob. = 1 − β
Not reject H0    Correct, Prob. = 1 − α       Error type 2, Prob. = β
P(ET.1) = α: significance level
P(ET.2) = β
1 − β: power of the test


Example

Example 6.2
Testing for ’capacity’ of employee.
H0 : capacity level is ’Good’
Using the ’Entry test’
Statistic: Test’s score (0 to 100)
Critical value: 80
Reject region: ”Score < 80” → Error type 1: P(ET 1) = α
Statistic ≥ 80: Not reject H0 → Error type 2: P(ET 2) = β
Change α → change critical value and Reject region → change β
To reduce α → increase β
Given α, choose Statistic that minimize β



6.2. PROCEDURE OF TESTING

CLASSICAL WAY (Rejection region approach)
Step 1. Hypotheses pair
Step 2. Significance level α → Critical value, Rejection region
Step 3. From observed sample → Statistical value
Step 4.
• If Statistical value ∈ Rejection region → Reject H0
• If Statistical value ∉ Rejection region → NOT Reject H0
Step 5. Final conclusion

ALTERNATIVE WAY (P-value approach)
From observed sample → P-value
If P-value < α → Reject H0
If P-value ≥ α → Not Reject H0


Example

Example 6.3
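Assume that X ∼ N(µ, σ²), σ = 2, and µ is unknown. Test the simple
hypotheses pair
$$\begin{cases} H_0: \mu = 6 \\ H_1: \mu = 9 \end{cases}$$
(a) Randomly survey n = 1; the critical value is 7, rejection region: X > 7. Find
α = P(ET.1), β = P(ET.2)
α = P(X > 7 | µ = 6) =
β = P(X ≤ 7 | µ = 9) =

[Figure: densities centred at 6 and 9 with critical value 7; α is the area to the right of 7 under H0, β the area to the left of 7 under H1]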
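The two error probabilities can be filled in numerically. The sketch below wraps them in a small function of n, so the same code covers the n = 1 case here and the n = 4 case of the next example; the function name errors and its default arguments are illustrative.

```r
# alpha and beta for H0: mu = mu0 vs H1: mu = mu1, rejection region: mean > crit
errors <- function(n, mu0 = 6, mu1 = 9, sigma = 2, crit = 7) {
  se <- sigma / sqrt(n)   # S.D. of the sample mean
  c(alpha = pnorm(crit, mean = mu0, sd = se, lower.tail = FALSE),
    beta  = pnorm(crit, mean = mu1, sd = se))
}

round(errors(1), 4)   # n = 1
round(errors(4), 4)   # n = 4: both error probabilities are smaller
```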



Example

Example 6.4

(b) n = 4, rejection region X̄ > 7
X̄ ∼ N(µ, σ_X̄²), σ_X̄ = 1
α = P(X̄ > 7 | µ = 6) =
β = P(X̄ ≤ 7 | µ = 9) =

[Figure: densities of X̄ centred at 6 and 9 with critical value 7]


Example
Example 6.5
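Generalization: X ∼ N(µ, σ²), σ is known. Test
$$\begin{cases} H_0: \mu = \mu_0 \\ H_1: \mu = \mu_1 \quad (\mu_1 > \mu_0) \end{cases}$$
sample size n, statistic X̄, rejection region X̄ > c
Given α, find the critical value c:
$$P(\bar{X} > c \mid \mu_0) = \alpha \Rightarrow c = \mu_0 + z_{\alpha}\frac{\sigma}{\sqrt{n}}$$
Rejection region (does not depend on µ1)
$$\bar{X} > \mu_0 + z_{\alpha}\frac{\sigma}{\sqrt{n}}$$
Find β:
$$\beta = P(\bar{X} \le c \mid \mu_1) = P\left(Z < z_{\alpha} + \frac{\mu_0 - \mu_1}{\sigma/\sqrt{n}}\right)$$
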
6.3. TEST FOR NORMAL MEAN - KNOWN σ

Statistic: $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

Hyp. pair                       Rejection region     P-value
H0 : µ = µ0, H1 : µ > µ0        Z > z_α              P(Z > Zstat)
H0 : µ = µ0, H1 : µ < µ0        Z < −z_α             P(Z > −Zstat)
H0 : µ = µ0, H1 : µ ≠ µ0        |Z| > z_{α/2}        2P(Z > |Zstat|)

May apply to a Non-normal population with a large sample
Not realistic in practice


Example

Example 6.6
Price is Normally distributed with variance 25 ($²). A survey of 100 observations gives a
sample mean of 24 ($).
(a) Test the hypothesis that the average price is higher than 23 ($), at
significance levels 5% and 1%
(b) Find the P-value of the test in (a)
(c) If the true mean is 24.8 ($), find the power of the test in (a)
(d) Test the hypothesis that the average price is 24.5 ($), at the 5% significance level,
and find the P-value
(e) Find the P-value of the test that the mean price is less than 25.5 ($)
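Parts (a) to (c) of Example 6.6 as a minimal sketch of the upper-tail Z-test. Variable names are illustrative, and the power in (c) is computed for the 5% test.

```r
sigma <- 5; n <- 100; xbar <- 24; mu0 <- 23
se <- sigma / sqrt(n)

z <- (xbar - mu0) / se                          # test statistic
reject <- z > qnorm(1 - c(0.05, 0.01))          # (a) decisions at 5% and 1%
p_value <- pnorm(z, lower.tail = FALSE)         # (b)

# (c) power at true mean 24.8: P(sample mean > critical value | mu = 24.8)
crit <- mu0 + qnorm(0.95) * se
power <- pnorm(crit, mean = 24.8, sd = se, lower.tail = FALSE)

list(z = z, reject_5_1 = reject, p_value = p_value, power = power)
```

H0 is rejected at 5% but not at 1%, which agrees with a P-value between the two levels.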



6.4. TEST FOR NORMAL MEAN - UNKNOWN σ

Statistic: $T = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$

Hyp. pair                       Rejection region                   P-value
H0 : µ = µ0, H1 : µ > µ0        $T > t^{(n-1)}_{\alpha}$           P(T > Tstat)
H0 : µ = µ0, H1 : µ < µ0        $T < -t^{(n-1)}_{\alpha}$          P(T > −Tstat)
H0 : µ = µ0, H1 : µ ≠ µ0        $|T| > t^{(n-1)}_{\alpha/2}$       2P(T > |Tstat|)


Example

Example 6.7
A random survey of the wages of 16 persons gives a sample mean of 23.5 and a
sample variance of 3.467. Assume that wage is Normally distributed.
(a) At the 5% significance level, test the hypothesis that the mean wage is higher
than 22.5.
(b) Estimate the P-value of the test in (a)
(c) Compute the P-value of the test in (a) with Microsoft Excel or R
(d) At the 10% significance level, test the hypothesis that the mean wage is less than 24, and
estimate the P-value of the test


T-Test for Mean in R

> x <- c(rep(20,1),rep(22,6),rep(24,5),rep(26,4))


> t.test(x, mu = 22.5, alternative = "greater")

One Sample t-test


data: x
t = 2.1483, df = 15, p-value = 0.02421
alternative hypothesis: true mean is greater than 22.5
95 percent confidence interval:
22.684 Inf
sample estimates:
mean of x
23.5



6.5. TEST FOR NORMAL VARIANCE

Statistic: $\chi^2 = \dfrac{(n-1)S^2}{\sigma_0^2}$

Hyp. pair                          Rejection region                                                               P-value
H0 : σ² = σ0², H1 : σ² > σ0²       $\chi^2 > \chi^{2(n-1)}_{\alpha}$                                              P(χ² > χ²stat)
H0 : σ² = σ0², H1 : σ² < σ0²       $\chi^2 < \chi^{2(n-1)}_{1-\alpha}$                                            P(χ² < χ²stat)
H0 : σ² = σ0², H1 : σ² ≠ σ0²       $\chi^2 > \chi^{2(n-1)}_{\alpha/2}$ or $\chi^2 < \chi^{2(n-1)}_{1-\alpha/2}$   2P(χ² > χ²stat) if s² > σ0²; 2P(χ² < χ²stat) if s² < σ0²

Example 6.8
For 16 workers, the sample mean is 23.5 and the sample variance is 3.467. At the 5% level, test
the hypothesis that the variance is greater than 2.6, and estimate the P-value
of the test.
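Example 6.8 as a minimal sketch of the upper-tail Chi-squared test. Variable names are illustrative; qchisq and pchisq take lower-tail probabilities unless lower.tail = FALSE is given.

```r
n <- 16; s2 <- 3.467; sigma2_0 <- 2.6

chi2 <- (n - 1) * s2 / sigma2_0                           # test statistic
crit <- qchisq(0.95, df = n - 1)                          # upper 5% critical value
p_value <- pchisq(chi2, df = n - 1, lower.tail = FALSE)   # P(chi-squared > statistic)

c(chi2 = chi2, critical = crit, p_value = p_value)
chi2 > crit   # TRUE would mean: reject H0
```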
6.6. TEST FOR PROPORTION

For a large sample

Statistic: $Z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$

Hyp. pair                       Rejection region     P-value
H0 : p = p0, H1 : p > p0        Z > z_α              P(Z > Zstat)
H0 : p = p0, H1 : p < p0        Z < −z_α             P(Z > −Zstat)
H0 : p = p0, H1 : p ≠ p0        |Z| > z_{α/2}        2P(Z > |Zstat|)


Example

Example 6.9
In a random survey of 400 customers choosing between products A and B,
223 favor A, the others favor B.
(a) At the 5% level, test the hypothesis that more than 50% favor A
(b) Find the P-value of the test in question (a)
(c) If the true value is p = 0.6, find the power of the test in question (a)
(d) Suppose that among 400 surveyees, 164 favor A, 158 favor B, and the others are
indifferent. Test the hypothesis that surveyees prefer A to B, at 5%,
and find the P-value of the test.
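Parts (a) to (c) of Example 6.9 as a minimal sketch of the large-sample Z-test for a proportion. Variable names are illustrative; the power uses the standard error under the true p = 0.6.

```r
n <- 400; p_hat <- 223 / n; p0 <- 0.5
se0 <- sqrt(p0 * (1 - p0) / n)              # standard error under H0

z <- (p_hat - p0) / se0                     # test statistic
reject <- z > qnorm(0.95)                   # (a) upper-tail test at 5%
p_value <- pnorm(z, lower.tail = FALSE)     # (b)

# (c) power at p = 0.6: P(p_hat > critical value | p = 0.6)
crit <- p0 + qnorm(0.95) * se0
power <- pnorm(crit, mean = 0.6, sd = sqrt(0.6 * 0.4 / n), lower.tail = FALSE)

list(z = z, reject = reject, p_value = p_value, power = power)
```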



Test for Proportion - Small sample

Significance level α, test for p. In a sample of size n, the frequency X ∼ B(n, p).
For the hypotheses pair H0 : p = p0, H1 : p > p0
Rejection region: freq ≥ crit
Critical value: P(X ≥ crit | p = p0) < α
Max of P(error type 1) = P(X ≥ freqstat)
For the hypotheses pair H0 : p = p0, H1 : p < p0
Rejection region: freq ≤ crit
Critical value: P(X ≤ crit | p = p0) < α
Max of P(error type 1) = P(X ≤ freqstat)


Example 6.10
Sample of 20 observations, test the hypotheses with
H0 : p = 0.3. Find the rejection region and the
corresponding maximum P(type 1 error) for:
(a) H1 : p > 0.3, significance level 5%
(b) H1 : p > 0.3, significance level 1%
(c) H1 : p < 0.3, significance level 5%
(d) H1 : p ≠ 0.3, significance level 5%
(e) Test p > 0.3, frequency is 9. What is the P-value
of the test?

B(n = 20)
x     p = 0.3
0     .0008
1     .0068
2     .0278
3     .0716
4     .1304
5     .1789
6     .1916
7     .1643
8     .1144
9     .0654
10    .0308
11    .0120
12    .0039
13    .0010
14    .0002
15    .0000
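The critical values for parts (a) to (c) and the P-value for (e) can be read off the exact Binomial tail probabilities. This is a sketch with illustrative helper names (crit_upper, crit_lower), following the rule "tail probability under H0 below α" from the small-sample slide.

```r
n <- 20; p0 <- 0.3
k <- 0:n

upper_tail <- pbinom(k - 1, n, p0, lower.tail = FALSE)   # P(X >= k | p0)
lower_tail <- pbinom(k, n, p0)                           # P(X <= k | p0)

# smallest (largest) critical value whose tail probability is below alpha
crit_upper <- function(alpha) min(k[upper_tail < alpha])
crit_lower <- function(alpha) max(k[lower_tail < alpha])

c(a = crit_upper(0.05), b = crit_upper(0.01), c = crit_lower(0.05))
upper_tail[k == crit_upper(0.05)]   # maximum P(type 1 error) in (a)
upper_tail[k == 9]                  # (e) P-value when the frequency is 9
```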
Example

Example 6.11
Apply the Central Limit Theorem to build the rejection region for the test
(a) X ∼ U(0, b), test b > b0
(b) X ∼ E(λ), test λ > λ0

Example 6.12
With the following sample
12, 12, 14, 15, 15, 17, 18, 18, 21, 22, 22, 24, 25, 25, 25, 26, 28, 28, 29, 30
(a) Test the hypothesis H0 : median = 20 against H1 : median > 20,
significance level 5%
(b) Test the hypothesis H0 : median = 20 against H1 : median ≠ 20,
significance level 10%


6.7. MOST POWERFUL TEST

Most powerful
Given a hypotheses pair and significance level α, the rejection region that maximizes the power of
the test is the 'most powerful rejection region', and the test is the 'most powerful test'
Simple pair H0 : θ = θ0, H1 : θ = θ1 → Uniformly most powerful test (UMP)

Neyman-Pearson Lemma
For $H_0: \theta = \theta_0$, $H_1: \theta = \theta_1$ and sample (x1, x2, ..., xn), let
$$\Lambda = \frac{L(x_1, x_2, ..., x_n, \theta_0)}{L(x_1, x_2, ..., x_n, \theta_1)}$$
With sig. level α and constant k, the rejection region R is UMP if
Λ ≤ k inside R
Λ ≥ k outside R


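Most Powerful Test

Example 6.13
Prove that the rejection region in part 6.3 is UMP
X ∼ N(µ, σ²), σ is known, test $H_0: \mu = \mu_0$ against $H_1: \mu = \mu_1$
$$\Lambda = \frac{L(X_1, X_2, ..., X_n, \mu_0)}{L(X_1, X_2, ..., X_n, \mu_1)} = \exp\left(\frac{\sum (X_i - \mu_1)^2 - \sum (X_i - \mu_0)^2}{2\sigma^2}\right)$$
$$\Lambda \le k \iff (\mu_0 - \mu_1)\bar{X} \le \frac{\sigma^2 \ln k}{n} + \frac{\mu_0^2 - \mu_1^2}{2} = a$$
If $\mu_1 < \mu_0$: $P\left(\bar{X} \le \dfrac{a}{\mu_0 - \mu_1} \,\Big|\, \mu = \mu_0\right) = \alpha \Leftrightarrow \bar{X} \le \mu_0 - z_{\alpha}\dfrac{\sigma}{\sqrt{n}}$
If $\mu_1 > \mu_0$: $P\left(\bar{X} \ge \dfrac{a}{\mu_0 - \mu_1} \,\Big|\, \mu = \mu_0\right) = \alpha \Leftrightarrow \bar{X} \ge \mu_0 + z_{\alpha}\dfrac{\sigma}{\sqrt{n}}$
Equivalent to the Z-test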


6.8. LIKELIHOOD RATIO TEST

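General hypotheses pair, with Ω0 ∩ Ω1 = ∅, Ω0 ∪ Ω1 = Ω
$$\begin{cases} H_0: \theta \in \Omega_0 \\ H_1: \theta \in \Omega_1 \end{cases}$$

Maximum likelihood value, sample X = (X1, X2, ..., Xn)
$$L(\Omega_0) = \max_{\theta \in \Omega_0} L(\mathbf{X}, \theta) ; \quad L(\Omega) = \max_{\theta \in \Omega} L(\mathbf{X}, \theta)$$
Likelihood Ratio: $\Lambda = \dfrac{L(\Omega_0)}{L(\Omega)}$
Rejection region with sig. level α
$$\Lambda \le k \quad\text{such that}\quad P(\Lambda \le k \mid H_0) = \alpha$$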


Likelihood Ratio Test

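Example 6.14
X ∼ N(µ, σ²), σ is unknown, test $H_0: \mu = \mu_0$ against $H_1: \mu \ne \mu_0$
Ω0 = {µ0}, Ω1 = R \ {µ0} ⇒ Ω = R
L(Ω0) = L(X, µ = µ0), with σ² replaced by $\hat\sigma_0^2 = \dfrac{\sum (X_i - \mu_0)^2}{n}$
L(Ω) = L(X, µ = X̄), with σ² replaced by $MS = \dfrac{\sum (X_i - \bar{X})^2}{n}$
Since the MLE of σ² is MS,
$$\Lambda = \frac{L(\mathbf{X}, \mu_0)}{L(\mathbf{X}, \bar{X})} = \left(\frac{\sum (X_i - \bar{X})^2}{\sum (X_i - \mu_0)^2}\right)^{n/2}$$
It can be proven that
$$\Lambda \le k \Leftrightarrow \left|\frac{\bar{X} - \mu_0}{S/\sqrt{n}}\right| \ge \sqrt{(n-1)\left(\frac{1}{k^{2/n}} - 1\right)}$$
Equivalent to the T-test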


Likelihood Ratio Test

Test for Normal mean: X ∼ N(µ, σ²), σ is unknown
$$\begin{cases} H_0: \mu = \mu_0 \\ H_1: \mu \ne \mu_0 \end{cases}$$
Sample (X1, X2, ..., Xn), $L = \prod_{i=1}^{n} f(X_i, \mu, \sigma^2)$
$$L(\Omega_0) = \prod_{i=1}^{n} f(X_i, \mu_0, \hat\sigma_0^2) ; \quad L(\Omega) = \prod_{i=1}^{n} f(X_i, \bar{X}, MS)$$
$$\Lambda = \frac{L(\Omega_0)}{L(\Omega)} ; \quad \chi^2 = -2\ln(\Lambda)$$
Rejection region: $\{\chi^2 : \chi^2 > \chi^{2(1)}_{\alpha}\}$


Likelihood Ratio Test

Test for Normal mean and variance: X ∼ N(µ, σ²),
$$\begin{cases} H_0: \mu = \mu_0 \text{ and } \sigma^2 = \sigma_0^2 \\ H_1: \mu \ne \mu_0 \text{ or } \sigma^2 \ne \sigma_0^2 \end{cases}$$
$$L(\Omega_0) = \prod_{i=1}^{n} f(X_i, \mu_0, \sigma_0^2) ; \quad L(\Omega) = \prod_{i=1}^{n} f(X_i, \bar{X}, MS) ; \quad \Lambda = \frac{L(\Omega_0)}{L(\Omega)}$$
Rejection region: $\{\chi^2 = -2\ln(\Lambda) : \chi^2 > \chi^{2(2)}_{\alpha}\}$

Example 6.15
From a Normal population, the sample is

(17, 18, 18, 18, 19, 19, 20, 22, 22, 23, 25, 25)

Use the likelihood ratio test to test the hypothesis that the mean is 22 and the
variance is 20.


Likelihood Ratio Test

Solution 6.15 From data: $\bar{x} = 20.5$; $ms = 7.25$

$H_0: \mu = 22$ and $\sigma^2 = 20$

$x_i$   $f(x_i, \mu_0, \sigma_0^2)$   $f(x_i, \bar{x}, ms)$
17      0.0477                        0.0637
18      0.0598                        0.0963
18      0.0598                        0.0963
18      0.0598                        0.0963
19      0.0712                        0.1269
19      0.0712                        0.1269
20      0.0807                        0.1456
22      0.0892                        0.1269
22      0.0892                        0.1269
23      0.087                         0.0963
25      0.0712                        0.0367
25      0.0712                        0.0367
$\Pi$   1.464E-14                     2.786E-13

$L(\Omega_0)$ =
$L(\Omega)$ =
$\Lambda$ =
Statistical value =
Critical value =
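
The table above can be reproduced numerically. The following is a minimal sketch in R, assuming a 5% significance level (the example does not state one) and working with log-likelihoods to avoid multiplying very small densities; the variable names are illustrative only.

```r
# Sample from Example 6.15
x <- c(17, 18, 18, 18, 19, 19, 20, 22, 22, 23, 25, 25)
mu0 <- 22        # hypothesised mean
sigma2_0 <- 20   # hypothesised variance

# MLEs over the full parameter space: sample mean and MS (divisor n)
xbar <- mean(x)
ms <- mean((x - xbar)^2)

# Log-likelihoods under H0 and over the full parameter space
logL0 <- sum(dnorm(x, mean = mu0, sd = sqrt(sigma2_0), log = TRUE))
logL <- sum(dnorm(x, mean = xbar, sd = sqrt(ms), log = TRUE))

lambda <- exp(logL0 - logL)         # likelihood ratio
chisq <- -2 * (logL0 - logL)        # statistic -2 ln(Lambda)
critical <- qchisq(0.95, df = 2)    # assumed alpha = 0.05, two restricted parameters
pvalue <- pchisq(chisq, df = 2, lower.tail = FALSE)
reject <- chisq > critical
```

Under these assumptions the statistic is about 5.88, just below the critical value of about 5.99, so $H_0$ would not be rejected at 5%.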



Exercise - Lecture 06
Book   Page   Compulsory   Optional
[1]    434    1, 2, 3      9, 10, 11, 13
       447    15, 16       19, 20, 21, 26, 30, 33
       454    36, 38       39, 42, 44
       465    45, 46       47, 48, 49, 54, 57
       481    72, 74       75, 79, 88, 89
[2]    345
       354
       369
Using R to build hypothesis testing for variance
> sample <-...
> siglevel <-...
> chisq <-...
> critical <-...
> pvalue <-...
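
One possible way to fill in the skeleton above is sketched here. It is only an illustration: it assumes the standard chi-squared test for a single variance with a two-sided alternative, reuses the Example 6.15 sample as placeholder data, and takes 20 as the hypothesised variance. None of these choices come from the exercise itself.

```r
# Placeholder data (assumption): the sample from Example 6.15
sample <- c(17, 18, 18, 18, 19, 19, 20, 22, 22, 23, 25, 25)
sigma2_0 <- 20     # hypothesised variance (assumption)
siglevel <- 0.05

n <- length(sample)
# Statistic (n - 1) S^2 / sigma0^2, chi-squared with n - 1 df under H0
chisq <- (n - 1) * var(sample) / sigma2_0
# Lower and upper critical values for the two-sided test
critical <- qchisq(c(siglevel / 2, 1 - siglevel / 2), df = n - 1)
# Two-sided p-value: twice the smaller tail probability
pvalue <- 2 * min(pchisq(chisq, df = n - 1),
                  pchisq(chisq, df = n - 1, lower.tail = FALSE))
reject <- chisq < critical[1] || chisq > critical[2]
```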
Lec 7. INFERENCES ON TWO SAMPLES

7.1 Dependent and Independent Samples [1] pp. 515-516


7.2 Inferences for Two Means [1] pp. 485-515
7.3 Inferences for Two Variances [1] pp. 527-531
7.4 Inferences for Two Proportions [1] pp. 519-525
7.5 Test for Correlation [1] pp. 668-669

Reference
Book [1] Chapter 10, pp. 484 - 551; Chapter 12, pp. 668 - 669
Book [2] Chapter 13
Book [4] Chapter 10, Chapter 11



7.1. DEPENDENT & INDEPENDENT SAMPLES

Compare parameters of two populations

Population   Variable   Mean      Variance       Proportion
[1]          $X_1$      $\mu_1$   $\sigma_1^2$   $p_1$
[2]          $X_2$      $\mu_2$   $\sigma_2^2$   $p_2$

Using information from two samples

Sample   Size    Mean          Variance   Proportion
(1)      $n_1$   $\bar{X}_1$   $S_1^2$    $\hat{p}_1$
(2)      $n_2$   $\bar{X}_2$   $S_2^2$    $\hat{p}_2$


Pair and Independent Samples
Example 7.1

Paired (related, dependent) samples

Revenue
Shop   Jan   Feb
(1)    72    76
(2)    75    79
(3)    70    77
(4)    82    80
(5)    70    75
(6)    83    89

[?] Is the mean of Revenue in Feb higher than that in Jan?

Unpaired (independent) samples

Sales
Firm A   Firm B
77       97
79       86
76       85
80       93
82       81
83       72
         88
         90
         82

The same data in long format:

Firm    A    A    A    A    A    A    B    B    B    B    B    B    B    B    B
Sales   77   79   76   80   82   83   97   86   85   93   81   72   88   90   82

[?] Is there a difference in mean between A and B?

Pair and Independent Samples

                        Paired samples      Independent samples
$X_1, X_2$ from         Same individuals    Different and independent individuals
Sizes $n_1, n_2$        Must be equal       Can be different
Order of observations   Cannot be changed   Can be changed

Example 7.2
Paired or independent samples?
(a) Income and Expenditure of households?
(b) Income of households in urban and rural areas?
(c) Wage of undergraduates and graduates?
(d) Microeconomics and Macroeconomics scores of students?
(e) Microeconomics scores of K60 and K61 students?

7.2. INFERENCES FOR TWO MEANS

Variables $X_1 \sim N(\mu_1, \sigma_1^2)$, $X_2 \sim N(\mu_2, \sigma_2^2)$

Testing between $\mu_1$ and $\mu_2$
\[ H_0: \mu_1 = \mu_2 \qquad H_1: \mu_1 \; (\ne, >, <) \; \mu_2 \]
If $\mu_1 \ne \mu_2$, i.e. $\mu_1 - \mu_2 \ne 0$, find a confidence interval for $\mu_1 - \mu_2$.
Or testing
\[ H_0: \mu_1 - \mu_2 = \Delta \qquad H_1: \mu_1 - \mu_2 \; (\ne, >, <) \; \Delta \]


Logic of Inferences for Two Means

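Samples               Case                                                                 Test     Interval   Page
Paired samples        $d = X_1 - X_2$                                                      T-test   C.I.       p.509
Independent samples   Known $\sigma_1^2, \sigma_2^2$                                       Z-test   C.I.       p.485
Independent samples   Unknown $\sigma_1^2, \sigma_2^2$, assume $\sigma_1^2 = \sigma_2^2$   T-test   C.I.       p.504
Independent samples   Unknown $\sigma_1^2, \sigma_2^2$, assume $\sigma_1^2 \ne \sigma_2^2$ T-test   C.I.       p.499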


7.2.1. Pair data: Test for Two Means
Let $d_i = X_{1i} - X_{2i}$, then $\mu_1 - \mu_2 = \mu_d$
Hypotheses pair
\[ H_0: \mu_1 = \mu_2, \; H_1: \mu_1 \ne \mu_2 \iff H_0: \mu_d = 0, \; H_1: \mu_d \ne 0 \]
Statistic
\[ T = \frac{\bar{d}}{S_d / \sqrt{n}} \]
Rejection region at significance level $\alpha$
\[ |T| > t_{(n-1)\alpha/2} \]
C.I. $(1 - \alpha)$ for the difference between two means $(\mu_1 - \mu_2) = \mu_d$
\[ \bar{d} \pm t_{(n-1)\alpha/2} \frac{S_d}{\sqrt{n}} \]

Example: Pair samples

Example 7.3

Revenue
Shop   Jan   Feb   d
(1)    72    76    4
(2)    75    79    4
(3)    70    77    7
(4)    82    80    -2
(5)    70    75    5
(6)    83    89    6

$\bar{d} = 4$; $s_d^2 = 10$

Hypotheses pair
\[ H_0: \mu_{Feb} = \mu_{Jan}, \; H_1: \mu_{Feb} > \mu_{Jan} \iff H_0: \mu_d = 0, \; H_1: \mu_d > 0 \]

(a) At 5%, is the mean of Revenue in Feb higher than that in Jan?
(b) C.I. 90% of the difference?


Two Means Test in R: Paired sample

> xJan <- c(72,75,70,82,70,83)
> xFeb <- c(76,79,77,80,75,89)
> d <- xFeb - xJan
> t.test(d, mu = 0, alternative = "greater")

One Sample t-test


data: d
t = 3.0984, df = 5, p-value = 0.01345
alternative hypothesis: true mean is greater than 0
95 percent confidence interval:
1.398584 Inf
sample estimates:
mean of x
4
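
The one-sided call above reports only a lower confidence bound. For a two-sided 90% interval for the paired difference, a short sketch follows; it assumes the same `xJan` and `xFeb` vectors and checks `t.test` against the formula $\bar{d} \pm t_{(n-1)\alpha/2} S_d/\sqrt{n}$.

```r
xJan <- c(72, 75, 70, 82, 70, 83)
xFeb <- c(76, 79, 77, 80, 75, 89)
d <- xFeb - xJan
n <- length(d)

# T statistic for H0: mu_d = 0
tstat <- mean(d) / (sd(d) / sqrt(n))

# Two-sided 90% confidence interval from t.test
ci90 <- t.test(d, conf.level = 0.90)$conf.int

# The same interval computed directly from the formula
manual <- mean(d) + c(-1, 1) * qt(0.95, df = n - 1) * sd(d) / sqrt(n)
```

The lower end of the two-sided 90% interval coincides with the one-sided 95% bound printed above, because both use $t_{(5)0.05}$.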



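7.2.2. Independent data: Known $\sigma_1^2, \sigma_2^2$
Hypotheses pair
\[ H_0: \mu_1 = \mu_2, \; H_1: \mu_1 \ne \mu_2 \quad \text{or} \quad H_0: \mu_1 - \mu_2 = 0, \; H_1: \mu_1 - \mu_2 \ne 0 \]
Statistic
\[ Z = \frac{(\bar{X}_1 - \bar{X}_2) - 0}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \]
Rejection region at significance level $\alpha$
\[ |Z| > z_{\alpha/2} \]
C.I. $(1 - \alpha)$ for the difference between two means $(\mu_1 - \mu_2)$
\[ (\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \]
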
Example

Example 7.4
Assume that wages in industries A and B are normally distributed with
standard deviations of 10 and 15, respectively. A random survey of 20 workers
in industry A and 25 workers in industry B gives sample means of
240 and 260.
(a) At 5%, test the hypothesis that the average wage in A is lower than that
in B
(b) Find the P-value of the above test
(c) Find a 90% confidence interval for the difference between the two average
wages
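
Example 7.4 can be worked directly from the summary figures. The sketch below follows the known-variance Z formulas of section 7.2.2; the variable names are illustrative.

```r
# Summary data from Example 7.4
n_A <- 20; mean_A <- 240; sd_A <- 10
n_B <- 25; mean_B <- 260; sd_B <- 15

# Standard error of the difference with known variances
se <- sqrt(sd_A^2 / n_A + sd_B^2 / n_B)
z <- (mean_A - mean_B) / se

# (a) H1: mu_A < mu_B, reject when Z < -z_alpha
reject <- z < -qnorm(0.95)

# (b) Left-tail p-value
pvalue <- pnorm(z)

# (c) 90% confidence interval for mu_A - mu_B
ci90 <- (mean_A - mean_B) + c(-1, 1) * qnorm(0.95) * se
```

Here Z is about -5.35, so $H_0$ is rejected at 5%, and the 90% interval is roughly (-26.15, -13.85).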



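7.2.3. Independent data: Unknown variances, $\sigma_1^2 = \sigma_2^2$

Statistic
\[ T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_p^2}{n_1} + \dfrac{S_p^2}{n_2}}}, \qquad S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2} \]

Hypotheses pair                                  Rejection region
$H_0: \mu_1 = \mu_2$; $H_1: \mu_1 \ne \mu_2$     $|T| > t_{(n_1+n_2-2)\alpha/2}$
$H_0: \mu_1 = \mu_2$; $H_1: \mu_1 > \mu_2$       $T > t_{(n_1+n_2-2)\alpha}$
$H_0: \mu_1 = \mu_2$; $H_1: \mu_1 < \mu_2$       $T < -t_{(n_1+n_2-2)\alpha}$

C.I. for the difference
\[ (\mu_1 - \mu_2) \in (\bar{X}_1 - \bar{X}_2) \pm t_{(n_1+n_2-2)\alpha/2} \sqrt{\frac{S_p^2}{n_1} + \frac{S_p^2}{n_2}} \]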


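7.2.4. Independent data: Unknown variances, $\sigma_1^2 \ne \sigma_2^2$

Statistic
\[ T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}, \qquad v = \frac{(S_1^2/n_1 + S_2^2/n_2)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}} \]

Hypotheses pair                                  Rejection region
$H_0: \mu_1 = \mu_2$; $H_1: \mu_1 \ne \mu_2$     $|T| > t_{(v)\alpha/2}$
$H_0: \mu_1 = \mu_2$; $H_1: \mu_1 > \mu_2$       $T > t_{(v)\alpha}$
$H_0: \mu_1 = \mu_2$; $H_1: \mu_1 < \mu_2$       $T < -t_{(v)\alpha}$

If $n_1, n_2 > 30$ then $t_{(v)\alpha} \approx z_\alpha$
C.I. for the difference
\[ (\mu_1 - \mu_2) \in (\bar{X}_1 - \bar{X}_2) \pm t_{(v)\alpha/2} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \]
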
Example

Example 7.5

Sales
Firm A   Firm B
77       97
79       86
76       85
80       93
82       81
83       72
         88
         90
         82

Sales are Normally distributed

$\bar{x}_A = 79.5$; $s_A^2 = 7.5$
$\bar{x}_B = 86.0$; $s_B^2 = 53.5$

(a) Test for equal means between A and B, assuming equal variances, at significance level 5%, and estimate the P-value of the test
(b) Test for equal means between A and B, assuming unequal variances, at significance level 5%, and estimate the P-value of the test
(c) Find a 95% C.I. for the difference between the two means


Two Means Test in R: Equal variance

> xA <- c(77,79,76,80,82,83)
> xB <- c(97,86,85,93,81,72,88,90,82)
> t.test(xA, xB, alternative = "two.sided", var.equal = TRUE)

Two Sample t-test


data: xA and xB
t = -2.061, df = 13, p-value = 0.0599
alternative hypothesis: true difference in means is not
equal to 0
95 percent confidence interval:
-13.3134141 0.3134141
sample estimates:
mean of x mean of y
79.5 86.0



Two Means Test in R: Unequal variances

> xA <- c(77,79,76,80,82,83)
> xB <- c(97,86,85,93,81,72,88,90,82)
> t.test(xA, xB, alternative = "two.sided", var.equal = FALSE)

Welch Two Sample t-test


data: xA and xB
t = -2.4233, df = 10.944, p-value = 0.03391
alternative hypothesis: true difference in means is not
equal to 0
95 percent confidence interval:
-12.4072744 -0.5927256
sample estimates:
mean of x mean of y
79.5 86.0



Test for Two Samples

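Samples               Case                                                          Statistic
Paired samples        $d = X_1 - X_2$                                               $T = \dfrac{\bar{d}}{S_d/\sqrt{n}}$
Independent samples   Known $\sigma_1^2, \sigma_2^2$                                $Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
Independent samples   Unknown $\sigma_1^2, \sigma_2^2$; $\sigma_1^2 = \sigma_2^2$   $T = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2/n_1 + S_p^2/n_2}}$
Independent samples   Unknown $\sigma_1^2, \sigma_2^2$; $\sigma_1^2 \ne \sigma_2^2$ $T = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$

With unknown variances, the F-test for variances decides between the last two cases.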


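7.3. INFERENCES FOR TWO VARIANCES

Statistic
\[ F = \frac{S_1^2}{S_2^2}, \qquad f_{(v_1,v_2)1-\alpha} = \frac{1}{f_{(v_2,v_1)\alpha}} \]

Hypotheses pair                                                     Rejection region
$H_0: \sigma_1^2 = \sigma_2^2$; $H_1: \sigma_1^2 \ne \sigma_2^2$    $F > f_{(n_1-1,n_2-1)\alpha/2}$ or $F < f_{(n_1-1,n_2-1)1-\alpha/2}$
$H_0: \sigma_1^2 = \sigma_2^2$; $H_1: \sigma_1^2 > \sigma_2^2$      $F > f_{(n_1-1,n_2-1)\alpha}$
$H_0: \sigma_1^2 = \sigma_2^2$; $H_1: \sigma_1^2 < \sigma_2^2$      $F < f_{(n_1-1,n_2-1)1-\alpha}$

C.I. for the ratio
\[ \frac{S_1^2/S_2^2}{f_{(n_1-1,n_2-1)\alpha/2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{S_1^2/S_2^2}{f_{(n_1-1,n_2-1)1-\alpha/2}} \]
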
Example

Example 7.6

Sales
Firm A   Firm B
77       97
79       86
76       85
80       93
82       81
83       72
         88
         90
         82

Sales are Normally distributed

$\bar{x}_A = 79.5$; $s_A^2 = 7.5$
$\bar{x}_B = 86.0$; $s_B^2 = 53.5$

(a) Test for equality of variances at 5%, and estimate the P-value of the test
(b) Which conclusion of the test for equal means is correct?
(c) Find a 95% C.I. for the ratio of the two variances


Variances Test in R

> xA <- c(77,79,76,80,82,83)
> xB <- c(97,86,85,93,81,72,88,90,82)
> var.test(xA, xB, ratio = 1, alternative = "two.sided")

F test to compare two variances


data: xA and xB
F = 0.14019,
num df = 5, denom df = 8, p-value = 0.04457
alternative hypothesis: true ratio of variances is not
equal to 1
95 percent confidence interval:
0.02910087 0.94726710
sample estimates:
ratio of variances 0.1401869



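7.4. INFERENCES FOR TWO PROPORTIONS

Statistic
\[ Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \bar{p} = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2} = \frac{f_1 + f_2}{n_1 + n_2} \]

Hypotheses pair                         Rejection region
$H_0: p_1 = p_2$; $H_1: p_1 \ne p_2$    $|Z| > z_{\alpha/2}$
$H_0: p_1 = p_2$; $H_1: p_1 > p_2$      $Z > z_\alpha$
$H_0: p_1 = p_2$; $H_1: p_1 < p_2$      $Z < -z_\alpha$

C.I. for the difference
\[ (p_1 - p_2) \in (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]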


Example

Example 7.7
Among 200 male customers, 140 say “satisfied”; among 300 female customers, 156
say “satisfied”.
(a) At 5%, test the hypothesis that the proportion of satisfied customers is higher among males than among females
(b) Find the P-value of the test
(c) Find a 90% C.I. for the difference between the two proportions
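
A sketch of Example 7.7 built from the formulas of section 7.4 is given below. It computes the pooled Z statistic by hand rather than through a packaged test; the variable names are illustrative.

```r
# Counts from Example 7.7
n1 <- 200; f1 <- 140   # males: sample size, number satisfied
n2 <- 300; f2 <- 156   # females: sample size, number satisfied

p1 <- f1 / n1
p2 <- f2 / n2
p_pool <- (f1 + f2) / (n1 + n2)   # pooled proportion under H0

# (a) H1: p1 > p2, reject when Z > z_alpha
z <- (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
reject <- z > qnorm(0.95)

# (b) Right-tail p-value
pvalue <- pnorm(z, lower.tail = FALSE)

# (c) 90% C.I. uses the unpooled standard error
se_ci <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci90 <- (p1 - p2) + c(-1, 1) * qnorm(0.95) * se_ci
```

Z comes out near 4.01, so $H_0$ is rejected at 5%, and the 90% interval for $p_1 - p_2$ is roughly (0.109, 0.251).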



7.5. CORRELATION TEST

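Statistic
\[ T = \frac{R\sqrt{n - 2}}{\sqrt{1 - R^2}} \]

Hypotheses pair                     Rejection region
$H_0: \rho = 0$; $H_1: \rho \ne 0$  $|T| > t_{(n-2)\alpha/2}$
$H_0: \rho = 0$; $H_1: \rho > 0$    $T > t_{(n-2)\alpha}$
$H_0: \rho = 0$; $H_1: \rho < 0$    $T < -t_{(n-2)\alpha}$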

Example 7.8
Sample correlation of Present and Midterm of 83 students is 0.1726. At
5% test for correlation, and find the P-value of the test.
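
Example 7.8 needs only the sample correlation and the sample size. The following sketch applies the T statistic above with a two-sided alternative, which is an assumption since the example does not state the direction.

```r
r <- 0.1726   # sample correlation from Example 7.8
n <- 83       # number of students

# T statistic with n - 2 degrees of freedom
tstat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided test at 5% (assumed alternative: rho != 0)
critical <- qt(0.975, df = n - 2)
pvalue <- 2 * pt(abs(tstat), df = n - 2, lower.tail = FALSE)
reject <- abs(tstat) > critical
```

T is about 1.58, below the critical value of about 1.99, so the correlation is not significant at 5% under this two-sided reading.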
Summary Example

Example 7.9
Result table of wages from a firm
Each cell lists $n$; $\bar{x}$; $s^2$ on the first row and $\sum x_i$; $\sum x_i^2$ on the second row

             Senior                 Junior
Post grad.   40; .......; .......   60; .......; .......
             1000; 25202.8          1020; 17882.8
Grad         50; .......; .......   80; .......; .......
             1150; 27077.2          1200; 18971.7

(a) Test for equality of variances between Post and Grad in Senior; in Junior
(b) Test for equality of means between Post and Grad in Senior; in Junior
(c) Test for equality of variances and means between Senior and Junior in
Post; in Grad
(d) Confidence interval of the difference between means of Post and Grad in
Senior; in Junior; between means of Senior and Junior in Post; in Grad

Exercises - Lecture 07

Book Page Compulsory Optional


[1] 495 1, 2, 4 5, 6, 7, 11, 17
505 21, 25 26, 36, 37
517 39 42, 44
525 48, 50 55, 57, 59
531 61, 62 64, 65, 68
545 85, 86 90, 91, 92

Using R to build Z test for two proportion, correlation



Lec 8. ANALYSIS OF VARIANCE

8.1 One-Factor ANOVA [1] pp. 553-582


8.2 Two-Factor ANOVA without Interaction [1] pp. 583-596
8.3 Two-Factor ANOVA With Interaction [1] pp. 597-612

Reference
Book [1] Chapter 10, pp. 553-612
Book [4] Chapter 15



8.1. ONE-FACTOR ANOVA

Example 8.1
Analyze deviation of Wage in two firms A and B
Quantitative variable: Wage
Factor: “training”

Firm A                             Firm B
Wage       5    6    8     9       Wage       5    6     8    9
Trained?   No   No   Yes   Yes     Trained?   No   Yes   No   Yes

Measure the variability of each firm's Wage
In which firm is the effect of Training stronger?


Analysis of Variance
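Firm A's Wage analysis

Values $x_{ij}$                                 5      6      8      9
Total mean $\bar{\bar{x}}$                      7
Deviations $x_{ij} - \bar{\bar{x}}$             -2     -1     +1     +2
Groups $i$                                      No     No     Yes    Yes
Group mean $\bar{x}_i$                          $\bar{x}_{No} = 5.5$; $\bar{x}_{Yes} = 8.5$
Dev. by Treatment $\bar{x}_i - \bar{\bar{x}}$   -1.5   -1.5   +1.5   +1.5
Error $x_{ij} - \bar{x}_i$                      -0.5   +0.5   -0.5   +0.5

Total Sum of Squares: $SST = \sum (x_{ij} - \bar{\bar{x}})^2 = 10$
Treatment Sum of Squares: $SSTr = \sum (\bar{x}_i - \bar{\bar{x}})^2 = 9$ (summed over all four observations)
Error Sum of Squares: $SSE = \sum (x_{ij} - \bar{x}_i)^2 = 1$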
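
The decomposition above can be checked in a few lines, and the same code answers the question of Example 8.1 for Firm B. This is a small sketch; `ss_decomp` is a hypothetical helper written for this illustration, not a built-in R function.

```r
# Hypothetical helper: splits the total sum of squares by a grouping factor
ss_decomp <- function(x, g) {
  grand <- mean(x)
  group_mean <- ave(x, g)   # each observation replaced by its group mean
  c(SST = sum((x - grand)^2),
    SSTr = sum((group_mean - grand)^2),
    SSE = sum((x - group_mean)^2))
}

wage <- c(5, 6, 8, 9)
firmA <- ss_decomp(wage, c("No", "No", "Yes", "Yes"))
firmB <- ss_decomp(wage, c("No", "Yes", "No", "Yes"))
```

Both firms have SST = 10, but training accounts for 9 of it in Firm A and only 1 in Firm B, so the effect of Training is stronger in Firm A.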



ANOVA Table

Degrees of freedom
Df in total: 4 - 1 = 3
Df of Treatment (Between-sample): 2 - 1 = 1
Df of Error (Within-sample): 4 - 2 = 2
Table

Sources     SS         df   MS
Treatment   SSTr = 9   1    MSTr = SSTr/df = 9/1 = 9
Error       SSE = 1    2    MSE = SSE/df = 1/2 = 0.5
Total       SST = 10   3


ANOVA
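Factor F: I categories → I groups
Group i: $n_i$ observations
Total observations $n = \sum_i n_i$
Total mean: $\bar{\bar{x}} = \sum_i \sum_j x_{ij} / n$
Group mean: $\bar{x}_i = \sum_j x_{ij} / n_i$

\[ SST = \sum_i \sum_j (x_{ij} - \bar{\bar{x}})^2 \]
\[ SSTr = \sum_i n_i (\bar{x}_i - \bar{\bar{x}})^2 \]
\[ SSE = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 \]

If the number of observations in each category (sample) is J, then $n = I \times J$

Group   Observations                           Mean
$G_1$   $x_{1,1}, x_{1,2}, \dots, x_{1,n_1}$   $\bar{x}_1$
$G_2$   $x_{2,1}, x_{2,2}, \dots, x_{2,n_2}$   $\bar{x}_2$
...     ...                                    ...
$G_I$   $x_{I,1}, x_{I,2}, \dots, x_{I,n_I}$   $\bar{x}_I$
All     $\sum x_{ij}$                          $\bar{\bar{x}}$
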
ANOVA Table

Sources     SS     df      MS                     F
Treatment   SSTr   I - 1   MSTr = SSTr/(I - 1)    F = MSTr/MSE
Error       SSE    n - I   MSE = SSE/(n - I)
Total       SST    n - 1


F-test for Means

Assumptions
Normality: in each category $X_i \sim N(\mu_i, \sigma_i^2)$
Homoscedasticity: the variances $V(X_i) = \sigma_i^2$ are equal
Samples are random and independent
Testing
Hypotheses pair
\[ H_0: \mu_1 = \mu_2 = \cdots = \mu_I \qquad H_1: \text{not } H_0 \]
$H_0$: equality of means, i.e. the factor does not affect the mean

Statistic:
\[ F = \frac{MSTr}{MSE} = \frac{SSTr/(I - 1)}{SSE/(n - I)} \]
Rejection region: $F > f_{(I-1,n-I)\alpha}$


ANOVA

Example 8.2

Wage of Zones
Z1   Z2   Z3
8    9    5
6    6    6
9    9    5
8    7    7
7    6    6
     8

The same data in long format:

Zone   1  1  1  1  1  2  2  2  2  2  2  3  3  3  3  3
Wage   8  6  9  8  7  9  6  9  7  6  8  5  6  5  7  6

$\bar{\bar{x}} = 7.0$; $\bar{x}_1 = 7.6$; $\bar{x}_2 = 7.5$; $\bar{x}_3 = 5.8$
$SST = 28$; $SSTr = 10.5$

(a) Build the ANOVA table
(b) Test for equality of means


ANOVA Table

Example 8.2

Sources SS df MS F

Treatment

Error

Total

Hypothesis testing
$H_0: \mu_1 = \mu_2 = \mu_3$: Zone does not affect the wage
$F_{stat}$ =
$f_{(2,13)0.05} = 3.8$; $f_{(2,13)0.01} = 6.7$


One-Factor ANOVA in R

Example 8.2
> zone <- c(rep('z1',5), rep('z2',6), rep('z3',5))
> wage <- c(8,6,9,8,7,9,6,9,7,6,8,5,6,5,7,6)
> fit <- aov(wage ~ zone)
> summary(fit)

            Df Sum Sq Mean Sq F value Pr(>F)
zone         2   10.5   5.250     3.9 0.0471 *
Residuals   13   17.5   1.346
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


8.2. Two-Factor ANOVA Without Interaction

Factor A: I categories $A_1, A_2, \dots, A_I$
Factor B: J categories $B_1, B_2, \dots, B_J$
In each case $(A_i, B_j)$: $n_{ij}$ observations

         $B_1$               $B_2$               ...   $B_J$               Mean
$A_1$    $x_{1,1,1}, \dots$  $x_{1,2,1}, \dots$  ...   $x_{1,J,1}, \dots$  $\bar{x}_{A_1}$
$A_2$    $x_{2,1,1}, \dots$  $x_{2,2,1}, \dots$  ...   $x_{2,J,1}, \dots$  $\bar{x}_{A_2}$
...      ...                 ...                 ...   ...                 ...
$A_I$    $x_{I,1,1}, \dots$  $x_{I,2,1}, \dots$  ...   $x_{I,J,1}, \dots$  $\bar{x}_{A_I}$
Mean     $\bar{x}_{B_1}$     $\bar{x}_{B_2}$     ...   $\bar{x}_{B_J}$     $\bar{\bar{x}}$


Two-Factor ANOVA
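\[ SST = \sum_i \sum_j \sum_k (x_{ijk} - \bar{\bar{x}})^2 \]
\[ SSA = \sum_i n_{A_i} (\bar{x}_{A_i} - \bar{\bar{x}})^2 \]
\[ SSB = \sum_j n_{B_j} (\bar{x}_{B_j} - \bar{\bar{x}})^2 \]
\[ SSE = \sum_i \sum_j \sum_k (x_{ijk} - \bar{x}_{A_i} - \bar{x}_{B_j} + \bar{\bar{x}})^2 \]
\[ SST = SSA + SSB + SSE \]

Data in long format:

A       B       X
$A_1$   $B_1$   $x_{1,1,1}$
$A_1$   $B_1$   $x_{1,1,2}$
...     ...     ...
$A_1$   $B_2$   $x_{1,2,1}$
...     ...     ...
$A_1$   $B_J$   $x_{1,J,1}$
...     ...     ...
$A_2$   $B_1$   $x_{2,1,1}$
...     ...     ...
$A_2$   $B_J$   $x_{2,J,1}$
...     ...     ...
$A_I$   $B_1$   $x_{I,1,1}$
...     ...     ...
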
Two-Factor ANOVA Without Interaction: Table

Sources    SS    df              MS                          F
Factor A   SSA   I - 1           MSA = SSA/(I - 1)           FA = MSA/MSE
Factor B   SSB   J - 1           MSB = SSB/(J - 1)           FB = MSB/MSE
Error      SSE   n - I - J + 1   MSE = SSE/(n - I - J + 1)
Total      SST   n - 1


Two-Factor ANOVA Without Interaction: Testing

Testing for the effect of factor A
$H_0: \mu_{A_1} = \mu_{A_2} = \dots = \mu_{A_I}$: Factor A does not affect the means
Rejection region
\[ F_A = \frac{MSA}{MSE} > f_{(I-1,\,n-I-J+1)\alpha} \]
Testing for the effect of factor B
$H_0: \mu_{B_1} = \mu_{B_2} = \dots = \mu_{B_J}$: Factor B does not affect the means
Rejection region
\[ F_B = \frac{MSB}{MSE} > f_{(J-1,\,n-I-J+1)\alpha} \]


Two-Factor ANOVA without Interaction: Example
Example 8.3

Sales by Advertising (rows) and Zone (columns):

Advertising   Inner   Sub   Outer
High          12      11    10
              14      12    11
Medium        10      9     8
              9       7     8
Low           9       5     7
              10      6     6
None          6       7     5
              5       6     5

The same data in long format (first rows):

Adv   Zone   Sales
H     I      12
H     I      14
H     S      11
H     S      12
H     O      10
H     O      11
M     I      10
M     I      9
M     S      9
M     S      7
M     O      8
M     O      8
L     I      9
L     I      10
...   ...    ...

(a) Two-way ANOVA table without interaction?
(b) At 5%, test for the effect of the Factors

Two-Factor ANOVA without Interaction: Example

Example 8.3

Sales by Advertising (rows) and Zone (columns):

Advertising   Inner    Sub      Outer    Mean
High          12, 14   11, 12   10, 11   11.667
Medium        10, 9    9, 7     8, 8     8.5
Low           9, 10    5, 6     7, 6     7.167
None          6, 5     7, 6     5, 5     5.667
Mean          9.375    7.875    7.5      8.25

$SST = 154.5$; $SSA(Adv) = 117.5$; $SSB(Zone) = 15.75$


Two-Factor ANOVA without Interaction: Example

Sources SS df MS F

Advertising

Zone

Error

Total



Two-Factor ANOVA without Interaction: Example

Testing for the effect of Advertising

$H_0: \mu_{High} = \mu_{Med} = \mu_{Low} = \mu_{None}$
$F_{stat}$ =
$F_{crit}$ =
Conclusion

Testing for the effect of Zone

$H_0: \mu_{Inner} = \mu_{Sub} = \mu_{Outer}$
$F_{stat}$ =
$F_{crit}$ =
Conclusion


Two-Factor ANOVA without Interaction in R

Example 8.3
> adv <- c(rep('h',6),rep('m',6),rep('l',6),rep('n',6))
> z1 <- c('i','i','s','s','o','o')
> zone <- c(rep(z1,4))
> sales <- c(12,14,11,12,10,11,10,9,9,7,8,8,9,10,
             5,6,7,6,6,5,7,6,5,5)
> fit <- aov(sales ~ adv + zone)
> summary(fit)

Df Sum Sq Mean Sq F value Pr(>F)


adv 3 117.50 39.17 33.176 1.52e-07 ***
zone 2 15.75 7.88 6.671 0.0068 **
Residuals 18 21.25 1.18
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05
‘.’ 0.1 ‘ ’ 1
8.3. TWO-FACTOR ANOVA WITH INTERACTION

Factor A, Factor B
Interaction of A and B (A ∗ B)

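         $B_1$                                  $B_2$                                  ...   $B_J$                                  Mean
$A_1$    $x_{1,1,1}, \dots \Rightarrow \bar{x}_{11}$   $x_{1,2,1}, \dots \Rightarrow \bar{x}_{12}$   ...   $x_{1,J,1}, \dots \Rightarrow \bar{x}_{1J}$   $\bar{x}_{A_1}$
$A_2$    $x_{2,1,1}, \dots \Rightarrow \bar{x}_{21}$   $x_{2,2,1}, \dots \Rightarrow \bar{x}_{22}$   ...   $x_{2,J,1}, \dots \Rightarrow \bar{x}_{2J}$   $\bar{x}_{A_2}$
...      ...                                    ...                                    ...   ...                                    ...
$A_I$    $x_{I,1,1}, \dots \Rightarrow \bar{x}_{I1}$   $x_{I,2,1}, \dots \Rightarrow \bar{x}_{I2}$   ...   $x_{I,J,1}, \dots \Rightarrow \bar{x}_{IJ}$   $\bar{x}_{A_I}$
Mean     $\bar{x}_{B_1}$                        $\bar{x}_{B_2}$                        ...   $\bar{x}_{B_J}$                        $\bar{\bar{x}}$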


Two-Factor ANOVA with Interaction: Sum of Squares

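\[ SST = \sum_i \sum_j \sum_k (x_{ijk} - \bar{\bar{x}})^2 \]
\[ SSA = \sum_i n_{A_i} (\bar{x}_{A_i} - \bar{\bar{x}})^2 \]
\[ SSB = \sum_j n_{B_j} (\bar{x}_{B_j} - \bar{\bar{x}})^2 \]
\[ SSI = \sum_i \sum_j \sum_k (\bar{x}_{ij} - \bar{x}_{A_i} - \bar{x}_{B_j} + \bar{\bar{x}})^2 \]
\[ SSE = \sum_i \sum_j \sum_k (x_{ijk} - \bar{x}_{ij})^2 \]
\[ SST = SSA + SSB + SSI + SSE \]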


Two-Factor ANOVA with Interaction: Table

Sources       SS    df               MS                           F
Factor A      SSA   I - 1            MSA = SSA/(I - 1)            FA = MSA/MSE
Factor B      SSB   J - 1            MSB = SSB/(J - 1)            FB = MSB/MSE
Interaction   SSI   (I - 1)(J - 1)   MSI = SSI/((I - 1)(J - 1))   FI = MSI/MSE
Error         SSE   n - IJ           MSE = SSE/(n - IJ)
Total         SST   n - 1


Two-Factor ANOVA With Interaction: Testing
Testing for the effect of factor A
$H_0: \mu_{A_1} = \mu_{A_2} = \dots = \mu_{A_I}$: Factor A does not affect the means
Rejection region
\[ F_A = \frac{MSA}{MSE} > f_{(I-1,\,n-IJ)\alpha} \]
Testing for the effect of factor B
$H_0: \mu_{B_1} = \mu_{B_2} = \dots = \mu_{B_J}$: Factor B does not affect the means
Rejection region
\[ F_B = \frac{MSB}{MSE} > f_{(J-1,\,n-IJ)\alpha} \]
Testing for the effect of the Interaction of A and B
$H_0$: Interaction A*B does not affect the means
Rejection region
\[ F_I = \frac{MSI}{MSE} > f_{((I-1)(J-1),\,n-IJ)\alpha} \]

Two-Factor ANOVA with Interaction: Example

Example 8.4

Sales by Advertising (rows) and Zone (columns):

Advertising   Inner    Sub      Outer
H             12, 14   11, 12   10, 11
M             10, 9    9, 7     8, 8
L             9, 10    5, 6     7, 6
N             6, 5     7, 6     5, 5

⇒ Cell means:

Adv.   I       S       O      Mean
H      13      11.5    10.5   11.667
M      9.5     8       8      8.5
L      9.5     5.5     6.5    7.167
N      5.5     6.5     5      5.667
Mean   9.375   7.875   7.5    8.25

$SST = 154.5$; $SSA_{Adv} = 117.5$; $SSB_{Zone} = 15.75$; $SSI_{Adv*Zone} = 13.25$


Two-Factor ANOVA with Interaction: Example

Sources SS df MS F

Advertising

Zone

Interaction

Error

Total



Two-Factor ANOVA with Interaction: Example
Testing for the effect of Advertising
$H_0$: Advertising does not affect the means
$F_{stat}$ =
$F_{crit}$ =
Conclusion

Testing for the effect of Zone

$H_0$: Zone does not affect the means
$F_{stat}$ =
$F_{crit}$ =
Conclusion

Testing for the effect of Interaction

$H_0$: Interaction of Adv and Zone does not affect the means
$F_{stat}$ =
$F_{crit}$ =
Conclusion

Two-Factor ANOVA with Interaction in R

Example 8.4
> adv <- ...
> zone <- ...
> sales <- ...
> fit <- aov(sales ~ adv*zone)
> summary(fit)

Df Sum Sq Mean Sq F value Pr(>F)


adv 3 117.50 39.17 58.750 1.91e-07 ***
zone 2 15.75 7.88 11.812 0.00146 **
adv:zone 6 13.25 2.21 3.312 0.03674 *
Residuals 12 8.00 0.67
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05
‘.’ 0.1 ‘ ’ 1
Exercises - Lecture 08

Book Page Compulsory Optional


[1] 563 1, 2, 6 4, 8
594 35, 36 38, 40
606 49 52



Lec 9. NON-PARAMETRIC TEST

9.1 Goodness of Fit test [1] pp. 553-582


9.2 Independence Test [1] pp. 583-596
9.3 Normality Test

Reference
Book [1] Chapter 13, pp. 723 - 757
Book [3] Chapter 10
Book [4] Chapter 14



9.1. GOODNESS OF FIT TEST

Multinomial population: each element is assigned to one and only one of several categories (groups)
There is a theoretical population distribution over the categories
Testing the hypothesis: the population distribution follows a specified rule, denoted by R

Example 9.1
Testing for
Proportions of products on the market in four quality levels A, B, C, D are 10%, 20%, 30%, 40%, respectively.
Distribution of customers over the week is uniform (same proportion each day)
Number of claims in one day at an insurance company follows a Poisson distribution


Chi-squared Test

Hypotheses pair
\[ H_0: \text{Distribution is } R \qquad H_1: \text{Distribution is not } R \]

Chi-squared statistic
\[ \chi^2 = \sum_{i=1}^{k} \frac{(F_i - E_i)^2}{E_i} \]

Rejection region: $\chi^2_{stat} > \chi^2_{(k-m-1)\alpha}$

in which
k: number of categories (groups)
$F_i$: observed frequency in each group
$E_i$: expected frequency under R in each group
m: number of estimated parameters


Multinomial Population Test
Example 9.2
There are 4 quality levels: A, B, C, D. A marketing research report states that
“the proportions of the four quality levels A, B, C, D among products are 10%,
20%, 30%, 40%, respectively”.
A random sample of 400 products contains 50, 60, 125, 165 products of levels
A, B, C, D, respectively.
(a) At the 5% significance level, is there enough evidence to say the report
is inappropriate?
(b) Estimate the P-value of the test

Example 9.3
Test for a uniform distribution of the weekday data
Day                 Mon   Tue   Wed   Thu   Fri   Sat   Sun
Num. of customers   123   115   118   130   148   154   150
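
Example 9.3 is a goodness-of-fit test with equal expected frequencies. The sketch below computes the statistic by hand and compares it with `chisq.test`, whose default null hypothesis is equal probabilities; the 5% level is an assumption, since the example does not state one.

```r
customers <- c(Mon = 123, Tue = 115, Wed = 118, Thu = 130,
               Fri = 148, Sat = 154, Sun = 150)

# Under H0 (uniform) every day has the same expected frequency
k <- length(customers)
expected <- rep(sum(customers) / k, k)

chisq <- sum((customers - expected)^2 / expected)
dof <- k - 1                      # no parameter is estimated, m = 0
critical <- qchisq(0.95, dof)     # assumed alpha = 0.05
pvalue <- pchisq(chisq, dof, lower.tail = FALSE)

# Built-in equivalent: the default p is uniform over the categories
builtin <- chisq.test(customers)
```

The statistic is about 11.99 against a critical value of about 12.59, so uniformity is not rejected at 5%, although the p-value of roughly 0.06 is close to the threshold.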



Multinomial Population Test: Example

Levels   Observed frequency $F_i$   Proportion $p_i$   Expected frequency $E_i$   $(F_i - E_i)^2 / E_i$
A        50
B        60
C        125
D        165
Sum      400

$H_0$: Proportions of A, B, C, D are 10%, 20%, 30%, 40%
$\chi^2_{stat}$ =
Critical value $\chi^2_{(k-m-1)\alpha}$ =


Multinomial Test in R

Example 9.2
> freq <- c(50,60,125,165)
> prop <- c(0.1,0.2,0.3,0.4)
> chisq.test(freq, p = prop)

Chi-squared test for given probabilities


data: freq
X-squared = 7.8646, df = 3, p-value = 0.04889



Example
Example 9.4
At 5%, test the hypothesis that the proportions of Fail are equal among the 3 groups
Group       Pass   Fail
Morning     50     150
Afternoon   60     180
Evening     50     70
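
One way to approach Example 9.4 is to treat it as a chi-squared test on the 3 × 2 table of Group by Result, since equal Fail proportions across groups is the same as Result not depending on Group. The sketch below takes that route with `chisq.test`; reading the example this way is an assumption.

```r
# 3 x 2 table of counts from Example 9.4
counts <- matrix(c(50, 150,
                   60, 180,
                   50, 70),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(Group = c("Morning", "Afternoon", "Evening"),
                                 Result = c("Pass", "Fail")))

res <- chisq.test(counts)

res$expected    # expected frequencies R_i * C_j / n
res$statistic   # chi-squared statistic
res$parameter   # degrees of freedom (3 - 1) * (2 - 1) = 2
res$p.value
```

The statistic is about 12.83 with 2 degrees of freedom, well above the 5% critical value of about 5.99, so equality of the Fail proportions would be rejected.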

Example 9.5
At the 5% significance level, use the following data to test the hypotheses
X       0    1    2    3    4   5   6
Freq.   70   74   36   13   3   3   1

(a) X is Poisson distributed with mean 2
(b) X is Poisson distributed

9.2. INDEPENDENCE TEST

Two qualitative variables A and B
A includes I categories $A_1, A_2, \dots, A_I$; B includes J categories $B_1, B_2, \dots, B_J$
Observed frequency of $(A_i, B_j)$ is $F_{ij}$
Contingency table and its marginal sums

         $B_1$      $B_2$      ...   $B_J$      $\sum$
$A_1$    $F_{11}$   $F_{12}$   ...   $F_{1J}$   $R_1$
$A_2$    $F_{21}$   $F_{22}$   ...   $F_{2J}$   $R_2$
...      ...        ...        ...   ...        ...
$A_I$    $F_{I1}$   $F_{I2}$   ...   $F_{IJ}$   $R_I$
$\sum$   $C_1$      $C_2$      ...   $C_J$      $n$


Independence Hypothesis

Hypotheses
\[ H_0: A \text{ and } B \text{ are independent} \qquad H_1: A \text{ and } B \text{ are not independent} \]

Expected proportion of $A_i$ is estimated by $\dfrac{R_i}{n}$
Expected proportion of $B_j$ is estimated by $\dfrac{C_j}{n}$
Expected proportion of $(A_i, B_j)$ is estimated by $\dfrac{R_i}{n} \times \dfrac{C_j}{n}$
Expected frequency in $(A_i, B_j)$ is
\[ E_{ij} = \frac{R_i}{n} \times \frac{C_j}{n} \times n = \frac{R_i C_j}{n} \]
Degrees of freedom: $IJ - (I - 1) - (J - 1) - 1 = (I - 1)(J - 1)$


Expected Frequency Table

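Observed table:

         $B_1$      ...   $B_J$      $\sum$
$A_1$    $F_{11}$   ...   $F_{1J}$   $R_1$
...      ...        ...   ...        ...
$A_I$    $F_{I1}$   ...   $F_{IJ}$   $R_I$
$\sum$   $C_1$      ...   $C_J$      $n$

⇒ Expected table:

         $B_1$                     ...   $B_J$                     $\sum$
$A_1$    $E_{11} = R_1 C_1 / n$    ...   $E_{1J} = R_1 C_J / n$    $R_1$
...      ...                       ...   ...                       ...
$A_I$    $E_{I1} = R_I C_1 / n$    ...   $E_{IJ} = R_I C_J / n$    $R_I$
$\sum$   $C_1$                     ...   $C_J$                     $n$

If $\chi^2 = \sum_i \sum_j \dfrac{(F_{ij} - E_{ij})^2}{E_{ij}} > \chi^2_{((I-1)(J-1))\alpha}$: reject $H_0$

Cramér's V: $V = \sqrt{\dfrac{\chi^2 / n}{\min\{I - 1, J - 1\}}}$; the smaller V is, the weaker the association between A and B.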


Independence Test: Example
Example 9.6

A sample of student votes on a new policy in a university is shown below

Frequency table: Vote (rows) by Year (columns)

Vote       1st   2nd   3rd
Disagree   10    16    5
Neutral    3     5     3
Agree      8     20    30

The raw data are in long format, one row per student:

Year   Vote
1st    D
...    ...
1st    N
...    ...
1st    A
2nd    D
...    ...

(a) At 5%, test the hypothesis that Year and Vote are independent
(b) Estimate the P-value of the test

Independence Test: Example

Observed $F_{ij}$: Vote (rows) by Year (columns)

Vote     1st   2nd   3rd   $\sum$
D        10    16    5     31
N        3     5     3     11
A        8     20    30    58
$\sum$   21    41    38    100

Expected $E_{ij}$: Vote (rows) by Year (columns)

Vote     1st   2nd   3rd   $\sum$
D                          31
N                          11
A                          58
$\sum$   21    41    38    100

$H_0$: Year and Vote are independent

$\chi^2_{stat}$ =
$\chi^2_{crit}$ =
P-value =


Independence Test in R

Example 9.6
> year <- c(rep('1st',21),rep('2nd',41),
            rep('3rd',38))
> vote <- c(rep('d',10),rep('n',3),rep('a',8),
            rep('d',16),rep('n',5),rep('a',20),rep('d',5),
            rep('n',3),rep('a',30))
> data <- data.frame(year,vote)
> table <- table(data$vote,data$year)
> chisq.test(table)

Pearson’s Chi-squared test


data: table
X-squared = 12.128, df = 4, p-value = 0.01643
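
The expected frequencies and Cramér's V for Example 9.6 can also be computed step by step from the frequency table. The sketch below starts from the 3 × 3 counts rather than the raw votes; the object names are illustrative.

```r
# Observed frequencies: Vote (rows) by Year (columns)
obs <- matrix(c(10, 16, 5,
                3, 5, 3,
                8, 20, 30),
              nrow = 3, byrow = TRUE,
              dimnames = list(Vote = c("D", "N", "A"),
                              Year = c("1st", "2nd", "3rd")))
n <- sum(obs)

# Expected frequencies E_ij = R_i * C_j / n
expected <- outer(rowSums(obs), colSums(obs)) / n

chisq <- sum((obs - expected)^2 / expected)
dof <- (nrow(obs) - 1) * (ncol(obs) - 1)
pvalue <- pchisq(chisq, dof, lower.tail = FALSE)

# Cramer's V from the chi-squared statistic
cramers_v <- sqrt((chisq / n) / min(nrow(obs) - 1, ncol(obs) - 1))
```

This reproduces the statistic 12.128 and gives V of about 0.25. Two expected counts fall below 5 (2.31 and 4.18), which is why R prints a warning that the chi-squared approximation may be inaccurate.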



9.3. RANK TEST

Paired samples of ordinal variables $\{(X_i, Y_i), i = 1, \dots, n\}$
Values are ranks, not scale values
Spearman's rank correlation, with $d_i = X_i - Y_i$
\[ r_S = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \]
Hypotheses pair
\[ H_0: X, Y \text{ are not associated} \qquad H_1: X, Y \text{ are associated} \]

Statistical value and rejection region

$|r_S| > s_{(n-1)\alpha/2}$ if $n \le 30$
$|Z| = |r_S \sqrt{n - 1}| > z_{\alpha/2}$ if $n > 30$


Rank Test critical value

df .05 .025 .01 .005 df .05 .025 .01 .005


5 .900 18 .399 .476 .564 .625
6 .829 .886 .943 19 .388 .462 .549 .608
7 .714 .786 .893 20 .377 .450 .534 .591
8 .643 .738 .833 .881 21 .368 .438 .521 .576
9 .600 .683 .783 .833 22 .359 .428 .508 .562
10 .564 .648 .745 .794 23 .351 .418 .496 .549
11 .523 .623 .736 .818 24 .343 .409 .485 .537
12 .497 .591 .703 .780 25 .336 .400 .475 .526
13 .475 .566 .673 .745 26 .329 .392 .464 .515
14 .457 .545 .646 .716 27 .323 .385 .456 .505
15 .441 .525 .623 .689 28 .317 .377 .448 .496
16 .425 .507 .601 .666 29 .311 .370 .440 .487
17 .412 .490 .582 .645 30 .305 .364 .432 .478



Example

Example 9.7
Customer’s response about design and quality, using Likert scale:

Design 2 3 5 4 3 4 3 5 2 3
Quality 3 4 4 3 5 3 4 5 4 2
(1: very bad; 2: bad; 3: neutral; 4: good; 5: very good)

(a) Compute the Pearson correlation and test for correlation
(b) Compute the Spearman correlation and test for association.

Example 9.8
Testing for association between economic and social effect rank in 10 firms

Economic 2 4 5 7 6 9 1 8 10 3
Social 6 5 7 8 4 10 3 9 2 1
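
For Example 9.8 the ranks contain no ties, so Spearman's formula can be applied directly. The sketch below computes $r_S$ by hand and cross-checks it against `cor` with `method = "spearman"`.

```r
economic <- c(2, 4, 5, 7, 6, 9, 1, 8, 10, 3)
social <- c(6, 5, 7, 8, 4, 10, 3, 9, 2, 1)

n <- length(economic)
d <- economic - social

# Spearman's rank correlation from the rank differences
r_s <- 1 - 6 * sum(d^2) / (n * (n^2 - 1))

# Cross-check with the built-in estimator
r_builtin <- cor(economic, social, method = "spearman")
```

Here $\sum d_i^2 = 100$ and $r_S \approx 0.394$. This is below the .025-column entries of the critical value table for both the df = 9 and df = 10 rows (0.683 and 0.648), so association would not be concluded at 5% whichever row is taken to apply.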



9.4. NORMALITY TEST

Jarque-Bera Test
$H_0$: Variable is Normally distributed
Sample size: n
Skewness: $Skew = \dfrac{\sum (x_i - \bar{x})^3 / n}{s^3}$
Kurtosis: $Kurt^* = \dfrac{\sum (x_i - \bar{x})^4 / n}{s^4} - 3$
Statistic
\[ JB = \chi^2 = n\left( \frac{Skew^2}{6} + \frac{Kurt^{*2}}{24} \right) = n\left( \frac{Skew^2}{6} + \frac{(Kurt - 3)^2}{24} \right) \]
Rejection region: $JB > \chi^2_{(2)\alpha}$


Example

Example 9.9
Test for Normality with the following sample

x 20 21 22 23 24 25
freq. 10 16 23 22 17 12

Calculation table
$x_i$   $f_i$   $x_i f_i$   $f_i (x_i - \bar{x})^2$   $f_i (x_i - \bar{x})^3$   $f_i (x_i - \bar{x})^4$
20 10 200 65.54 -167.77 429.5
21 16 336 38.94 -60.74 94.76
22 23 506 7.21 -4.04 2.26
23 22 506 4.26 1.87 0.82
24 17 408 35.25 50.76 73.1
25 12 300 71.44 174.32 425.34
sum 100 2256 222.64 -5.6 1025.78
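
The calculation table leads to the Jarque-Bera statistic as sketched below. Two assumptions are made that the example does not state: $s^2$ is taken with divisor n (matching the moment sums in the table), and the test is run at the 5% level.

```r
values <- 20:25
freq <- c(10, 16, 23, 22, 17, 12)
x <- rep(values, freq)   # expand the frequency table to raw data

n <- length(x)
xbar <- mean(x)
s2 <- mean((x - xbar)^2)   # assumption: variance with divisor n

skew <- mean((x - xbar)^3) / s2^1.5
kurt_excess <- mean((x - xbar)^4) / s2^2 - 3   # Kurt* = Kurt - 3

jb <- n * (skew^2 / 6 + kurt_excess^2 / 24)
critical <- qchisq(0.95, df = 2)   # assumed alpha = 0.05
reject <- jb > critical
```

Under these assumptions JB is about 3.61, below the critical value of about 5.99, so Normality would not be rejected.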



Normality test | Goodness of fit test
Example 9.10
Test for Normality with the following sample
x       20 - 22   22 - 24   24 - 26   26 - 28   28 - 30
freq.   4         8         18        14        6
Calculation table
$x_{iL}$ - $x_{iU}$   $f_i$   $x_i$   $f_i x_i$   $f_i x_i^2$   $p_i$   $E_i$    $(f_i - E_i)^2 / E_i$
20 - 22               4       21      84          1764          0.053   2.65     0.688
22 - 24               8       23      184         4232          0.201   10.053   0.419
24 - 26               18      25      450         11250         0.346   17.325   0.026
26 - 28               14      27      378         10206         0.274   13.721   0.006
28 - 30               6       29      174         5046          0.125   6.251    0.01
Sum                   50              1270        32498         1       50       1.149

$\bar{x} = 25.4$; $ms = 4.8$
$p_i = P(x_{iL} < X < x_{iU} \mid X \sim N(\mu = \bar{x}, \sigma^2 = ms))$
Kolmogorov-Smirnov test

Test for distribution of variable:


$H_0$: X is D-distributed with cumulative distribution function $F(x)$
Ascending ordered sample (x1 , x2 , ..., xn )
Let C (xi ) = i/n, ∆i = |C (xi ) − F (xi )|

D = max{∆i }; KS = D/ n
If KS > kα : reject H0
Critical value
$\alpha$   0.01              0.05              0.1               0.15              0.2
n = 10     0.490             0.410             0.368             0.342             0.322
15         0.404             0.338             0.304             0.283             0.266
20         0.356             0.294             0.264             0.246             0.231
30         0.290             0.240             0.220             0.200             0.190
> 35       $1.63/\sqrt{n}$   $1.36/\sqrt{n}$   $1.22/\sqrt{n}$   $1.14/\sqrt{n}$   $1.07/\sqrt{n}$


9.5. KOLMOGOROV-SMIRNOV TEST

Example 9.11
Test the hypothesis that X is Normally distributed with a mean of 3 and an SD of
1, using the following data

1, 1.3, 1.4, 1.8, 2, 2.8, 2.9, 3, 3.6, 3.8, 4

i xi C (xi ) F (xi ) ∆i i xi C (xi ) F (xi ) ∆i


1 1 0.1 0.023 0.077 6 2.8 0.6 0.421 0.179
2 1.3 0.2 0.045 0.155 7 2.9 0.7 0.460 0.240
3 1.4 0.3 0.055 0.245 8 3 0.8 0.500 0.300
4 1.8 0.4 0.115 0.285 9 3.6 0.9 0.726 0.174
5 2 0.5 0.159 0.341 10 4 1 0.841 0.159



Exercise - Lecture 09

Book Page Compulsory Optional


[1] 730 3, 4, 6, 9 5, 11,
752 25, 26, 27, 29 33
754 36, 38, 40, 41

Excel goodness of fit test: CHISQ.TEST(actual range, expected range)
R: chisq.test(actual data, p = theoretical prob.)


LECTURE 10. LINEAR REGRESSION

10.1. Linear Regression model


10.2. OLS Estimation
10.3. Multiple Regression

Reference
Book [1] Chapter 12, pp.613 - 757
Book



10.1. Linear Regression Model

Quantitative variables: Y and X
Paired sample data $(Y_i, X_i)$, $i = 1, \dots, n$
Relationship: Y depends on X, X explains Y

Example 10.1
X is experience (years), Y is wage, sample of 5 staff

x   1   2   2   3   4
y   4   6   5   7   9

Correlation: $r_{xy} = 0.98$

[Figure: scatter plot of y: wage against x: experience]


Linear Regression Model

In a sample, the linear regression is, in general,
\[ \hat{y} = b_0 + b_1 x \]
For each observation
\[ y_i = b_0 + b_1 x_i + e_i \]
Coefficients: $b_0$, $b_1$
$b_0$: intercept term, $b_1$: slope
$\hat{y}$: fitted value
$e_i$ are residuals

[Figure: plot of y: wage against x: experience]


Population Regression Model

In the population, Y is a continuous random variable, X is non-random
Model in the population
\[ Y = \beta_0 + \beta_1 X + \varepsilon \]
$b_0$ is an estimator of $\beta_0$
$b_1$ is an estimator of $\beta_1$
$\varepsilon$: random error
If $E(\varepsilon \mid X) = 0$, then
\[ E(Y \mid X) = \beta_0 + \beta_1 X \]


10.2. OLS Estimation
Find $b_0$, $b_1$ such that
\[ \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \to \min \]

If $n > 2$ and $s_x^2 > 0$, the OLS estimators are
\[ b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}; \qquad b_0 = \bar{y} - b_1 \bar{x} \]

Example 10.1
With the data of 5 observations
$b_1 = 1.6538$; $b_0 = 2.231$
Regression: $y_i = 2.231 + 1.6538 x_i + e_i$
On average, the wage of staff with no experience is 2.231 units; when
experience increases by 1 year, wage increases by 1.6538 units
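
The estimates of Example 10.1 follow directly from the OLS formulas. The sketch below evaluates them by hand on the five observations and compares the result with `lm`; the object names are illustrative.

```r
x <- c(1, 2, 2, 3, 4)   # experience
y <- c(4, 6, 5, 7, 9)   # wage

# OLS slope and intercept from the closed-form formulas
b1 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
b0 <- mean(y) - b1 * mean(x)

# Fitted values, residuals and coefficient of determination
fitted_y <- b0 + b1 * x
e <- y - fitted_y
r2 <- 1 - sum(e^2) / sum((y - mean(y))^2)

# Cross-check against lm()
coef_lm <- unname(coef(lm(y ~ x)))
```

Here $\sum (x_i - \bar{x})(y_i - \bar{y}) = 8.6$ and $\sum (x_i - \bar{x})^2 = 5.2$, giving $b_1 = 8.6/5.2 \approx 1.6538$ and $b_0 \approx 2.2308$.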
OLS Estimation — Standard error

Assumption: $V(\varepsilon) = \sigma^2$
The estimate of $\sigma^2$ is
\[ \hat{\sigma}^2 = \frac{\sum e_i^2}{n - 2} \]
Standard error of regression: $\hat{\sigma} = \sqrt{\hat{\sigma}^2}$
Accuracy of the estimators $b_0$, $b_1$ around $\beta_0$, $\beta_1$ is measured by $V(b_0)$, $V(b_1)$
Standard errors of the estimated coefficients
\[ se(b_0) = \sqrt{V(b_0)}, \qquad se(b_1) = \sqrt{V(b_1)} \]


Goodness of Fit

Besides x, the residuals also explain y
Total variation of y: $SST$ (total)
Variation of y due to x: $SS_{Reg}$ (regression)
Variation of y due to e: $SS_{Err}$ (error)
Coefficient of Determination
\[ R^2 = \frac{SS_{Reg}}{SST} = 1 - \frac{SS_{Err}}{SST} \]
Meaning: the proportion (or %) of the total variation in y that is explained by the
model (by variation in x)


R - practice

> x <- c(1,2,2,3,4)


> y <-c(4,6,5,7,9)

> summary(lm(y ~ x))


Residuals:
1 2 3 4 5
0.1154 0.4615 -0.5385 -0.1923 0.1538

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.2308 0.5015 4.448 0.02113 *
x 1.6538 0.1923 8.600 0.00331 **
--
Multiple R-squared: 0.961, Adjusted R-squared: 0.948



10.3. MULTIPLE REGRESSION AND INFERENCE

Regress y on k independent variables $x_1, x_2, \dots, x_k$
Population model:
\[ y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \varepsilon \]
In a sample of n observations:
\[ y_i = b_0 + b_1 x_{1i} + \cdots + b_k x_{ki} + e_i \]
OLS estimator: such that $\sum e_i^2 \to \min$
Assumption $V(\varepsilon) = \sigma^2$, then $\hat{\sigma}^2 = \dfrac{\sum e_i^2}{n - k - 1}$
Standard errors of the estimated coefficients: $se(b_j)$, $j = 0, \dots, k$


Inference of Coefficient

$\beta_j$ is an unknown parameter
The point estimate of $\beta_j$ is $b_j$
Use $b_j$ and $se(b_j)$ to establish a $(1 - \alpha)$ confidence interval and to
test $\beta_j$ at significance level $\alpha$.
Assumption: Normal distribution of the random error: $\varepsilon \sim N(0, \sigma^2)$
Confidence interval
\[ \beta_j \in b_j \pm se(b_j) \, t_{(n-k-1)\alpha/2} \]


Coefficient T-test

Model: y = β0 + β1 x1 + · · · + βk xk + ε
Hypotheses pair

$H_0: \beta_j = 0$: coefficient is insignificant
$H_1: \beta_j \neq 0$: coefficient is significant

If $|T| = \left|\dfrac{b_j - 0}{se(b_j)}\right| > t^{(n-k-1)}_{\alpha/2}$ then reject $H_0$.
General T-test
Hypotheses pair: $H_0: \beta_j = b^*$ against $H_1: \beta_j \{\neq, >, <\}\, b^*$
Statistic: $T = \dfrac{b_j - b^*}{se(b_j)}$
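
The test can be sketched as a small helper. The function `coef_t_test` and its arguments are hypothetical names introduced for illustration; the inputs are the rounded slope estimate and standard error from the simple wage regression of this lecture (n = 5, k = 1).

```r
coef_t_test <- function(b, se, df, b_star = 0,
                        alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  t_stat <- (b - b_star) / se
  p_val <- switch(alternative,
    two.sided = 2 * pt(-abs(t_stat), df),
    greater   = pt(t_stat, df, lower.tail = FALSE),
    less      = pt(t_stat, df))
  list(t = t_stat, p_value = p_val)
}

# H0: beta1 = 0 against H1: beta1 != 0
slope_test <- coef_t_test(b = 1.6538, se = 0.1923, df = 5 - 1 - 1)
# t is about 8.60 and the p-value about 0.0033, as in the R output

# General T-test: H0: beta1 = 1 against H1: beta1 > 1
one_sided <- coef_t_test(1.6538, 0.1923, df = 3, b_star = 1,
                         alternative = "greater")
# t is about 3.40
```

Rejecting when the p-value is below α is equivalent to comparing |T| with the critical value.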



Overall Significance F-test

Model: $y = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + \varepsilon$
Hypotheses pair

$H_0: \beta_1 = \dots = \beta_k = 0$: model is overall insignificant
$H_1$: at least one $\beta_j \neq 0$ $(j = 1, \dots, k)$: model is overall significant

If $F = \dfrac{R^2/k}{(1 - R^2)/(n-k-1)} > f^{(k,\,n-k-1)}_{\alpha}$ then reject $H_0$.
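
A short sketch of the F statistic computed from $R^2$ alone. The helper name `f_overall` is hypothetical; the inputs $R^2 = 0.7232$, n = 12 and k = 2 are the rounded values reported in the R output of Example 10.2.

```r
f_overall <- function(r2, n, k, alpha = 0.05) {
  f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))
  f_crit <- qf(1 - alpha, df1 = k, df2 = n - k - 1)
  p_val  <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)
  list(F = f_stat, critical = f_crit, p_value = p_val,
       reject = f_stat > f_crit)
}

res <- f_overall(r2 = 0.7232, n = 12, k = 2)
# F is about 11.76 against a critical value of about 4.26, so H0 is rejected
```

The p-value is close to the 0.003087 printed by `summary()`; the small difference comes from using the rounded $R^2$.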

Example 10.2
Data on 12 workers: exp is experience (years), edu is education (years)
> exp <- c(1, 2, 2, 3, 4, 5, 7, 10, 10, 12, 15, 16)
> edu <- c(13, 12, 16, 11, 15, 15, 10, 15, 13, 11, 13, 15)
> wage <- c(6, 6, 12, 6, 11, 8, 8, 10, 11, 10, 15, 13)



Example

Example 10.2

> summary(lm(wage ~ exp + edu))


Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.17703 3.57673 -0.609 0.55779
exp 0.40085 0.09815 4.084 0.00274 **
edu 0.67453 0.26252 2.569 0.03021 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Multiple R-squared: 0.7232, Adjusted R-squared: 0.6617
F-statistic: 11.76 on 2 and 9 DF, p-value: 0.003087

(a) Interpret the results
(b) Test the overall significance of the model and the significance of each coefficient
(c) Find confidence intervals for the coefficients
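
One possible way to obtain the numbers needed for (b) and (c) in R is sketched below. Storing the vectors in a data frame named `workers` is a choice made here for illustration, not part of the original example.

```r
workers <- data.frame(
  exp  = c(1, 2, 2, 3, 4, 5, 7, 10, 10, 12, 15, 16),
  edu  = c(13, 12, 16, 11, 15, 15, 10, 15, 13, 11, 13, 15),
  wage = c(6, 6, 12, 6, 11, 8, 8, 10, 11, 10, 15, 13)
)
fit <- lm(wage ~ exp + edu, data = workers)
s   <- summary(fit)

# (b) Overall F-test: statistic with its degrees of freedom and p-value
f         <- s$fstatistic
p_overall <- unname(pf(f["value"], f["numdf"], f["dendf"],
                       lower.tail = FALSE))

# (b) T-tests: estimate, standard error, t value and p-value per coefficient
coef_table <- coef(s)

# (c) 95% confidence intervals for all coefficients
ci95 <- confint(fit, level = 0.95)
round(ci95, 4)
# the interval for exp is about (0.1788, 0.6229)
```

With α = 0.05 both slope coefficients have p-values below α, while the intercept does not.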



Relationship Analysis

Relationship between X and Y


Default: X explains Y

Variable Variable Y
X Qualitative Quantitative
Qualitative Pie chart, bar chart Histograms, bar chart
Z-test for p1 , p2 T-test for µ1 , µ2
Independence Chisq test ANOVA, F-test for means
Spearman’s correlation test
Quantitative Scatter plot, bubble chart
Pearson’s correlation test
Regression



LECTURE 11. BAYESIAN STATISTICS

11.1. Concept
11.2. Bayesian distribution
11.3. Beta-Binomial inference
11.4. Gamma-Poisson inference
11.5. Normal-normal inference

Reference
Book [1] Chapter 14, pp.776 - 783
Statisticians who use traditional statistical inference are called frequentists:
the parameter is a fixed, non-random value.
Bayesian statistics: the parameter is a random variable.



11.1. Concept

Traditional statistics: frequentist statistics

                  Frequentist Statistics        Bayesian Statistics
Probability       Objective,                    Subjective,
                  observed by relative freq.    updated by information
Starting belief   No                            Yes
Parameter θ       Non-random                    Random variable
Compute for θ     Estimator $\hat{\theta}$      E(θ)
Interval for θ    Confidence interval           Credible interval
Hypothesis        Testing with α                Prob. of correct hypothesis
Sample size       Large enough                  Any (n ≥ 1)



11.2. Bayesian Distribution

Total Probability and Bayes’ Theorem


Total probability: if $H_1, H_2, \dots, H_n$ form a partition, then the probability of A is
$$P(A) = P(H_1)P(A|H_1) + \dots + P(H_n)P(A|H_n)$$
and
$$P(H_j|A) = \frac{P(A \cap H_j)}{P(A)} = \frac{P(H_j)P(A|H_j)}{\sum_i P(H_i)P(A|H_i)}$$
$P(H_j)$ is the prior probability; $P(H_j|A)$ is the posterior probability

Example 11.1
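The proportion of people who have the virus is 20%. People are checked by a test that
gives the correct result with probability 0.9. Those who test positive (+) are kept in
hospital; those who test negative (-) are released.
(a) Find the proportions of people with and without the virus in hospital.
(b) People in hospital are tested again; find the probabilities of having and not having
the virus among people with two positive results (++).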
Bayes probability

Example 11.1

Hi     P(Hi)    P(+|Hi)    P(+ ∩ Hi)    P(Hi|+)
Has    0.2      0.9        0.18         0.692
No     0.8      0.1        0.08         0.308
Sum    1                   0.26         1

For the second test

Hi     P(Hi|+)    P(+ + |Hi)    P(+ + ∩ Hi)    P(Hi|+ +)
Has    0.692      0.9           0.623          0.953
No     0.308      0.1           0.031          0.047
Sum    1                        0.654          1
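
Both tables follow from the same update step, sketched below. The helper name `bayes_update` is hypothetical, and the second step assumes that the two test results are independent given the true status, which is what the second table implicitly uses.

```r
bayes_update <- function(prior, likelihood) {
  joint <- prior * likelihood   # P(H_i) * P(+ | H_i)
  joint / sum(joint)            # divide by the total probability P(+)
}

prior <- c(Has = 0.2, No = 0.8)
p_pos <- c(Has = 0.9, No = 0.1)   # P(+ | H_i) for a test that is right 90% of the time

post1 <- bayes_update(prior, p_pos)   # after the first positive result
post2 <- bayes_update(post1, p_pos)   # posterior becomes the new prior

round(rbind(first = post1, second = post2), 3)
# first:  Has 0.692, No 0.308
# second: Has 0.953, No 0.047
```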



Posterior Probability distribution

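Sample $x = (x_1, x_2, \dots, x_n)$

If θ is discrete: $\theta \in \{\theta_1, \theta_2, \dots, \theta_m\}$, with prior distribution $P(\theta_i)$ and
$\sum_i P(\theta_i) = 1$, then the posterior distribution is
$$P(\theta_j|x) = \frac{P(\theta_j)P(x|\theta_j)}{\sum_i P(\theta_i)P(x|\theta_i)}$$
If θ is continuous, with prior density $f_\theta(\theta)$ and $\int_{-\infty}^{+\infty} f_\theta(\theta)\,d\theta = 1$, then
the posterior density is
$$f_{\theta|x}(\theta|x) = \frac{P(x|\theta)f_\theta(\theta)}{\int_{-\infty}^{+\infty} P(x|\theta)f_\theta(\theta)\,d\theta}$$
$P(x|\theta)$ is the likelihood function of x given θ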
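
To make the continuous formula concrete, here is a small numerical sketch. The Uniform(0, 1) prior and the sample with 3 successes and 1 failure are assumptions chosen only for illustration; the denominator is evaluated with `integrate()`.

```r
# Assumed prior: Uniform(0, 1) on a success probability p
prior <- function(p) dunif(p, 0, 1)

# Likelihood of an assumed sample with 3 successes and 1 failure
likelihood <- function(p) p^3 * (1 - p)

# Denominator of the posterior formula
norm_const <- integrate(function(p) likelihood(p) * prior(p), 0, 1)$value

posterior <- function(p) likelihood(p) * prior(p) / norm_const

posterior(0.5)      # 1.25
dbeta(0.5, 4, 2)    # 1.25, the same density in closed form, Beta(4, 2)
```

The agreement with `dbeta()` is an instance of the Binomial - Beta conjugacy treated later in this lecture.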



Inference of parameter p | Discrete

Example 11.2

p is discrete: which value among 0.1, 0.3, 0.5, 0.7, 0.9 is most plausible?
Sample x = (Y, Y, Y, N)
"Belief": P(p = 0.1) = P(p = 0.3) = · · · = P(p = 0.9) = 0.2
Prior and posterior distribution:
pi     P(pi)      P(x|pi)    P(x ∩ pi)    P(pi|x)
0.1    0.2        0.0009     0.0002       0.0035
0.3    0.2        0.0189     0.0038       0.0732
0.5    0.2        0.0625     0.0125       0.2422
0.7    0.2        0.1029     0.0206       0.3987
0.9    0.2        0.0729     0.0146       0.2824
sum    1                     0.0516       1
       (prior)                            (posterior)
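
The table can be reproduced with a few vectorised lines; the variable names are illustrative.

```r
p_grid <- c(0.1, 0.3, 0.5, 0.7, 0.9)
prior  <- rep(0.2, length(p_grid))      # equal belief in each value

# Likelihood of the sample (Y, Y, Y, N): three successes and one failure
likelihood <- p_grid^3 * (1 - p_grid)

joint     <- prior * likelihood
posterior <- joint / sum(joint)

round(data.frame(p = p_grid, prior, likelihood, joint, posterior), 4)
p_grid[which.max(posterior)]   # 0.7 has the highest posterior probability
```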



Conjugacy & Credible Interval

Conjugacy
If the distribution of X ($P(x)$ or $f_X(x)$) is such that the posterior $f_{\theta|x}(\theta|x)$ and the prior
$f_\theta(\theta)$ are in the same class, then the distribution of X and the distribution of θ are
'conjugate'.

3 common conjugate families of distributions

Binomial - Beta
Poisson - Gamma
Normal - Normal

Credible Interval
An interval $(\theta_1, \theta_2)$ such that the posterior probability $P(\theta_1 < \theta < \theta_2 \mid x) = 1 - \alpha$ is a
credible interval at level $(1 - \alpha)$.



11.3. Binomial - Beta conjugacy
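X ∼ B(n, p); 0 ≤ p ≤ 1, assumed: p ∼ Beta(α, β)

Beta distribution Beta(α, β)
$$f_p(p) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)}\, p^{\alpha-1}(1-p)^{\beta-1}, \quad 0 \le p \le 1,\ \alpha > 0,\ \beta > 0$$
$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1}e^{-x}\,dx$; for integer α: $\Gamma(\alpha) = (\alpha-1)!$
$$E(p) = \frac{\alpha}{\alpha+\beta}, \qquad V(p) = \frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$$
[Figure: density f(p) on 0 ≤ p ≤ 1 for (α, β) = (1, 1), (1, 3), (2, 2), (2, 5), (4, 1)]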
Binomial - Beta conjugacy

Proposition
X ∼ B(n, p), prior distribution p ∼ Beta(α, β). If a sample of n observations
has x successes and n − x failures, then the posterior distribution is

p|x ∼ Beta(α + x, β + n − x)
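
The reasoning behind this result is a one-line kernel calculation: multiply the Binomial likelihood by the Beta prior density and keep only the factors that depend on p.

```latex
\begin{aligned}
f_{p|x}(p \mid x) &\propto P(x \mid p)\, f_p(p) \\
 &\propto p^{x}(1-p)^{n-x} \cdot p^{\alpha-1}(1-p)^{\beta-1} \\
 &= p^{(\alpha+x)-1}\,(1-p)^{(\beta+n-x)-1}
\end{aligned}
```

The last line is the kernel of a Beta(α + x, β + n − x) density, so the posterior stays in the Beta family.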

Example 11.3
There is no information about p (the probability of Yes), so the prior belief about p is Uniform
on (0, 1).
(a) The first sample is (N, Y, N, N, N). Find the posterior distribution, expectation,
variance and the 95% upper-bounded credible interval of p.
(b) Do the same when a second sample (Y, N, Y, N, N) is added to the
data.



Binomial - Beta conjugacy

Solution 11.3
$p \sim U(0, 1) = Beta(1, 1)$, $E(p) = 0.5$; $V(p) = 1/12 = 0.0833$
(a) Sample $x_1$ with n = 5, x = 1
Posterior distribution: $p|x_1 \sim Beta(1 + 1, 1 + 4) = Beta(2, 5)$
$E(p) = \dfrac{2}{7}$; $V(p) = \dfrac{10}{392} = 0.0255$
$P(p < UL) = 0.95 \Leftrightarrow \int_0^{UL} 30p(1-p)^4\,dp = 0.95 \Leftrightarrow UL = 0.5818$
(b) Sample $x_2$ with n = 5, x = 2
Posterior distribution: $p|x_1, x_2 \sim Beta(2 + 2, 5 + 3) = Beta(4, 8)$
$E(p) = \dfrac{1}{3}$; $V(p) = \dfrac{32}{1872} = 0.0171$
$P(p < UL) = 0.95 \Leftrightarrow \int_0^{UL} 1320p^3(1-p)^7\,dp = 0.95 \Leftrightarrow UL = 0.5644$
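
The posterior summaries can be checked in R, where `qbeta()` solves the integral equation for UL directly. The helper `beta_summary` is a hypothetical name introduced for this sketch.

```r
beta_summary <- function(a, b, level = 0.95) {
  c(mean  = a / (a + b),
    var   = a * b / ((a + b)^2 * (a + b + 1)),
    upper = qbeta(level, a, b))   # UL with P(p < UL) = level
}

prior <- c(1, 1)             # Uniform(0, 1) = Beta(1, 1)
post1 <- prior + c(1, 4)     # (a) 1 success, 4 failures -> Beta(2, 5)
post2 <- post1 + c(2, 3)     # (b) 2 successes, 3 failures -> Beta(4, 8)

s1 <- beta_summary(post1[1], post1[2])
s2 <- beta_summary(post2[1], post2[2])
round(rbind(a = s1, b = s2), 4)
# a: mean 0.2857, var 0.0255, upper 0.5818
# b: mean 0.3333, var 0.0171, upper 0.5644
```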



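11.4. Poisson - Gamma conjugacy

X ∼ P(λ); λ > 0, assumed: λ ∼ Gamma(α, β)

Distribution Gamma(α, β)
$$f_\lambda(\lambda) = \frac{1}{\beta^{\alpha}\Gamma(\alpha)}\, \lambda^{\alpha-1}e^{-\lambda/\beta}, \quad \lambda > 0,\ \alpha > 0,\ \beta > 0$$
$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha-1}e^{-x}\,dx$; for integer α: $\Gamma(\alpha) = (\alpha-1)!$
$E(\lambda) = \alpha\beta$
$V(\lambda) = \alpha\beta^2$
[Figure: density f(λ) for λ > 0 with (α, β) = (1, 1), (2, 1), (3, 1), (1, 2), (2, 2)]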



Poisson - Gamma conjugacy

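Proposition
X ∼ P(λ), prior distribution λ ∼ Gamma(α, β). Given a sample $x = (x_1, x_2, \dots, x_n)$,
the posterior distribution is
$$\lambda|x \sim Gamma(\alpha^*, \beta^*) \quad \text{with} \quad \alpha^* = \alpha + \sum_{i=1}^{n} x_i, \quad \beta^* = \frac{\beta}{1 + n\beta}$$
Here β is the scale parameter (density proportional to $e^{-\lambda/\beta}$), so the rate $1/\beta$ increases by n: $1/\beta^* = 1/\beta + n$.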
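
A short kernel calculation shows where the posterior parameters come from. It assumes the Gamma density is written with β as a scale parameter (kernel $e^{-\lambda/\beta}$), as in the Gamma(α, β) density of this lecture; only factors that depend on λ are kept.

```latex
\begin{aligned}
f_{\lambda|x}(\lambda \mid x)
 &\propto \Big(\prod_{i=1}^{n} e^{-\lambda}\lambda^{x_i}\Big)\,
          \lambda^{\alpha-1}e^{-\lambda/\beta} \\
 &= \lambda^{\alpha+\sum_{i} x_i-1}\; e^{-\lambda\,(n+1/\beta)}
\end{aligned}
```

This is a Gamma kernel with shape $\alpha + \sum x_i$ and rate $n + 1/\beta$, that is, scale $\beta/(1 + n\beta)$. If β is read as a rate parameter instead, the same calculation gives the update β + n.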

Example 11.5
The number of customers in one hour is Poisson distributed with mean λ.
The prior belief is that λ is Gamma distributed with mean 10 and variance
20. Data for 5 hours show the numbers of customers: 8, 9, 12, 13, 15.
(a) Find the posterior distribution, mean and variance of λ.
(b) Find the 90% upper-bounded credible interval of λ.
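
One way to work through this example in R is sketched below. It assumes the scale parameterisation of the Gamma density given in this lecture ($E(\lambda) = \alpha\beta$, $V(\lambda) = \alpha\beta^2$), so the Poisson update adds n to the rate $1/\beta$; the variable names are illustrative.

```r
prior_mean <- 10
prior_var  <- 20

# Solve alpha * beta = 10 and alpha * beta^2 = 20 for the prior parameters
scale0 <- prior_var / prior_mean   # beta  = 2
shape0 <- prior_mean / scale0      # alpha = 5

x <- c(8, 9, 12, 13, 15)           # customers in each of 5 hours
n <- length(x)

# Conjugate update: shape adds the total count, rate adds the sample size
shape1 <- shape0 + sum(x)          # 62
rate1  <- 1 / scale0 + n           # 5.5
scale1 <- 1 / rate1

post_mean <- shape1 * scale1       # (a) posterior mean, about 11.27
post_var  <- shape1 * scale1^2     # (a) posterior variance, about 2.05

# (b) UL with P(lambda < UL | x) = 0.90
upper90 <- qgamma(0.90, shape = shape1, rate = rate1)
```

The data pull the mean from 10 up to about 11.27 and shrink the variance from 20 to about 2.05.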



11.5. Normal - Normal conjugacy

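Proposition
$X \sim N(\mu, \sigma^2)$, σ is known, prior distribution $\mu \sim N(\eta, \tau^2)$. Given a sample
$x = (x_1, x_2, \dots, x_n)$, the posterior distribution is
$$\mu|x \sim N(\eta^*, \tau^{2*}) \quad \text{with} \quad \eta^* = \frac{\eta\sigma^2 + n\bar{x}\tau^2}{\sigma^2 + n\tau^2}, \quad \tau^{2*} = \frac{\sigma^2\tau^2}{\sigma^2 + n\tau^2}$$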

Example 11.6
Price is normally distributed with mean µ and variance 9. The prior belief
is that the mean is normally distributed, µ ∼ N(20, 4). A random sample is (22, 23, 24, 28, 30).
(a) What is the posterior distribution of µ?
(b) Find the narrowest 95% credible interval of µ.
(c) A second sample is (20, 24, 23, 21, 20). Compare the posterior
distributions of µ obtained by (i) updating in a second step with the new sample, and (ii)
combining the two samples into one.
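
A sketch of the three parts in R follows. It reads N(20, 4) as prior mean 20 and prior variance 4, in line with the $N(\eta, \tau^2)$ notation of the proposition; the helper `nn_update` is a hypothetical name.

```r
nn_update <- function(eta, tau2, x, sigma2) {
  n <- length(x)
  list(eta  = (eta * sigma2 + n * mean(x) * tau2) / (sigma2 + n * tau2),
       tau2 = sigma2 * tau2 / (sigma2 + n * tau2))
}

sigma2 <- 9
s1 <- c(22, 23, 24, 28, 30)
s2 <- c(20, 24, 23, 21, 20)

# (a) Posterior after the first sample: mean 688/29, variance 36/29
post1 <- nn_update(eta = 20, tau2 = 4, x = s1, sigma2 = sigma2)

# (b) For a normal posterior the narrowest interval is the symmetric one
ci95 <- qnorm(c(0.025, 0.975), mean = post1$eta, sd = sqrt(post1$tau2))
# about (21.54, 25.91)

# (c) Two-step update versus one update with the pooled sample
post_seq <- nn_update(post1$eta, post1$tau2, s2, sigma2)
post_all <- nn_update(20, 4, c(s1, s2), sigma2)
all.equal(post_seq, post_all)   # TRUE: both give mean 22.857, variance 0.735
```

Sequential and pooled updating agree here because the posterior from the first sample carries all the information that sample provides about µ.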
11.6. Predictive Inference

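Inference is not about the posterior distribution of the parameter θ, but about the variable X
with density $f(x, \theta)$
Prior distribution of X: $P(x, \theta)$ or $f_X(x, \theta)$
Posterior distribution of X: $P(x, \theta|x)$ or $f_X(x, \theta|x)$
Inference about X is based on the posterior distribution of X.
Posterior distribution function of X
$$\begin{aligned}
F(x, \theta|x) = P[X < x|(\theta|x)] &= \int_{-\infty}^{+\infty} P[X < x|(\theta|x)]\, f_{\theta|x}(\theta|x)\,d\theta \\
 &= \int_{-\infty}^{+\infty} \left( \int_{-\infty}^{x} f_X[t|(\theta|x)]\,dt \right) f_{\theta|x}(\theta|x)\,d\theta
\end{aligned}$$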
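
As a small illustration for a discrete X, take the Beta(2, 5) posterior of p from Solution 11.3(a) and ask for the probability that the next observation is Yes. Averaging $P(X = \text{Yes} \mid p) = p$ over the posterior density is the discrete analogue of the integral above; the variable names are illustrative.

```r
a <- 2
b <- 5   # posterior p | x ~ Beta(2, 5)

# P(next observation is Yes | x) = integral of p * f(p | x) over (0, 1)
pred_yes <- integrate(function(p) p * dbeta(p, a, b),
                      lower = 0, upper = 1)$value

pred_yes       # about 0.2857
a / (a + b)    # the same value: the posterior mean 2/7
```

For a single Yes/No outcome the predictive probability equals the posterior mean of p.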



Exercise - Lecture 11

Book Page Compulsory Optional


[1] 782 24, 25, 28

Bayesian statistics by R and Python: next course

THE END
