Slide 2023.2 MI2036 Chap4
HYPOTHESIS TESTING
HANOI – 2024
Email: [email protected]
Nguyễn Thị Thu Thủy (SAMI-HUST) ProSta-CHAP4 1/96
CHAPTER OUTLINE
✍ After careful study of this chapter you should be able to do the following:
1 Understand the concepts of hypothesis testing and significance testing.
2 Understand binary hypothesis testing and multiple hypothesis testing.
CONTENT
1 4.1 BASIC CONCEPTS OF HYPOTHESIS TESTING
4.1.1 Random Sample and Sample Mean
4.1.2 Central Limit Theorem
4.1.3 Statistical Inference
4.1.4 Two Categories of Statistical Inference
2 4.2 Significance Testing
4.2.1 Significance Level
4.2.2 Design a Significance Test
4.2.3 Two Kinds of Errors
4.2.4 Problems
3 4.3 Binary Hypothesis Testing
4.3.1 Likelihood Functions
4.3.2 Type I Error and Type II Error
4.3.3 Design of a Binary Hypothesis Test
4.3.4 Four Methods of Choosing A0
4.1 BASIC CONCEPTS OF HYPOTHESIS TESTING 4.1.1 Random Sample and Sample Mean
Random Sample
■ To define the random sample, consider repeated independent trials of an experiment.
■ Each trial results in one observation of a random variable, X. After n trials, we have sample values of the n random variables X1, . . . , Xn, all with the same CDF as X.
■ Define (X1, X2, . . . , Xn) to be a random sample of size n from X.
✍ The random variables X1, X2, . . . , Xn are a random sample of size n if
■ the Xi's are independent random variables, and
■ every Xi has the same probability distribution.
Sample Mean
Let E(X) = µ and Var(X) = σ². The sample mean X̄ has expected value and variance

E(X̄) = µ  and  Var(X̄) = σ²/n.   (2)

Indeed, E(X̄) = (1/n) Σ_{i=1}^{n} E(Xi) = (1/n) · nµ = µ and Var(X̄) = (1/n²) Σ_{i=1}^{n} Var(Xi) = (1/n²) · nσ² = σ²/n.
4.1 BASIC CONCEPTS OF HYPOTHESIS TESTING 4.1.2 Central Limit Theorem

Zn = (X̄n − µ)/(σ/√n)

✍ The normal approximation for Zn depends on the sample size n. The following figures show the distributions of average scores obtained when tossing one, two, three, and five dice, respectively.
■ To use the central limit theorem, we observe that we can express the iid sum X = X1 + X2 + · · · + Xn as

X = σ√n Zn + nµ.   (4)
Example 1
A modem transmits one million bits. Each bit is 0 or 1 independently with equal
probability. (a) Estimate the probability of at least 502,000 ones. (b) Transmit one
million bits. Let A denote the event that there are at least 499,000 ones but no
more than 501,000 ones. What is P (A)?
Solution. Let Xi be the value of bit i (0 or 1). Because Xi ∼ B(p) with p = 0.5, E(Xi) = 0.5 and V(Xi) = 0.25 for all i. The number of ones in one million bits is X = X1 + X2 + · · · + X_{10^6}.
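Both probabilities can be evaluated numerically with the De Moivre–Laplace correction. A minimal Python sketch, assuming the setup above (n = 10⁶ Bernoulli(0.5) bits, so µ = 500000 and σ = 500); the helper `Phi` is our own, built from `math.erf`:

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF, built from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p = 10**6, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))   # mu = 500000, sigma = 500

# (a) P(X >= 502000), with the De Moivre-Laplace 0.5 correction
p_a = 1 - Phi((502000 - 0.5 - mu) / sigma)

# (b) P(499000 <= X <= 501000)
p_b = Phi((501000 + 0.5 - mu) / sigma) - Phi((499000 - 0.5 - mu) / sigma)

print(p_a)   # about 1 - Phi(4), i.e. roughly 3e-5
print(p_b)   # about 2*Phi(2) - 1, i.e. roughly 0.9545
```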
✍ To appreciate why the ±0.5 terms increase the accuracy of the approximation, consider the following simple but dramatic example in which k1 = k2. Let K be a binomial (n = 20, p = 0.4) random variable, so that E(K) = 8 and Var(K) = 4.8. Then

P(8 ≤ K ≤ 8) ≈ P(7.5 ≤ X ≤ 8.5) = Φ(0.5/√4.8) − Φ(−0.5/√4.8) = 2Φ(0.23) − 1 = 0.1819.

■ The exact value is P(K = 8) = C(20,8) (0.4)⁸ (0.6)¹² ≈ 0.1797.
Example 3
K is the number of heads in 100 flips of a fair coin. What is P (50 ≤ K ≤ 51)?
Solution.
■ Since K is a binomial (n = 100, p = 1/2) random variable,

P(50 ≤ K ≤ 51) = P(K = 50) + P(K = 51) = C(100,50)(0.5)¹⁰⁰ + C(100,51)(0.5)¹⁰⁰ ≈ 0.1576.

■ Since E(K) = 50 and σK = 5, the ordinary central limit theorem approximation produces

P(50 ≤ K ≤ 51) ≈ Φ((51 − 50)/5) − Φ((50 − 50)/5) = Φ(0.2) − Φ(0) = 0.57926 − 0.5 = 0.0793.
■ This approximation error of roughly 50% occurs because the ordinary central limit
theorem approximation ignores the fact that the discrete random variable K has
two probability masses in an interval of length 1.
■ As we see next, the De Moivre–Laplace approximation is far more accurate.
P(50 ≤ K ≤ 51) ≈ Φ((51 + 0.5 − 50)/5) − Φ((50 − 0.5 − 50)/5) = Φ(0.3) − Φ(−0.1) = 0.61791 + 0.53983 − 1 = 0.1577.
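The three values compared above are easy to check numerically. A minimal Python sketch (the helper `Phi` is our own, built from `math.erf`):

```python
from math import comb, erf, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

n, p = 100, 0.5
mu, sigma = n * p, sqrt(n * p * (1 - p))   # 50 and 5

# exact binomial probability P(K = 50) + P(K = 51)
exact = sum(comb(n, k) * p**n for k in (50, 51))

# ordinary CLT approximation (no continuity correction)
ordinary = Phi((51 - mu) / sigma) - Phi((50 - mu) / sigma)

# De Moivre-Laplace approximation (with the 0.5 correction)
corrected = Phi((51 + 0.5 - mu) / sigma) - Phi((50 - 0.5 - mu) / sigma)

print(exact, ordinary, corrected)   # ~0.1576, ~0.0793, ~0.1577
```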
✍
■ Although the central limit theorem approximation provides a useful means of
calculating events related to complicated probability models, it has to be used with
caution. When the events of interest are confined to outcomes at the edge of the
range of a random variable, the central limit theorem approximation can be quite
inaccurate.
■ In these applications, it is necessary to resort to more complicated methods than a
central limit theorem approximation to obtain useful results. In particular, it is often
desirable to provide guarantees in the form of an upper bound rather than the
approximation offered by the central limit theorem.
Statistical Inference
Example 4
Suppose X1 , . . . , Xn are iid samples of an exponential (λ) random variable X with
unknown parameter λ. Using the observations X1 , . . . , Xn , each of the statistical
inference methods can answer questions regarding the unknown λ. For each of the
methods, we state the underlying assumptions of the method and a question that
can be addressed by the method.
1 Significance Test: Assuming λ is a constant, should we accept or reject the
hypothesis that λ = 3.5?
2 Hypothesis Test: Assuming λ is a constant, does λ equal 2.5, 3.5, or 4.5?
■ For the hypothesis test, the answer must be one of the numbers 2.5, 3.5, or 4.5.
4.2 Significance Testing 4.2.1 Significance Level
Significance Level
■ A significance test begins with the hypothesis, H0 , that a certain probability model
describes the observations of an experiment.
The question addressed by the test has two possible answers: accept the hypothesis
or reject it.
■ The significance level of the test is defined as the probability of rejecting the
hypothesis if it is true.
The test divides S, the sample space of the experiment, into an event space
consisting of an acceptance set A and a rejection set R = Ac .
■ If the observation s ∈ A, we accept H0 .
■ If s ∈ R, we reject the hypothesis.
■ The significance level is
α = P (s ∈ R) (8)
■ To design a significance test, we start with a value of α and then determine a set R
that satisfies Equation (8).
■ In many applications, H0 is referred to as the null hypothesis.
In these applications, there is a known probability model for an experiment. Then
the conditions of the experiment change and a significance test is performed to
determine whether the original probability model remains valid.
■ The null hypothesis states that the changes in the experiment have no effect on the
probability model.
■ Since E(N) is large, we can use the central limit theorem and approximate (N − µN)/σN by the standard normal random variable Z,(3) so that

α ≃ P(|Z| ≥ c/√1000) = 2(1 − Φ(c/√1000)) = 0.05.

■ In this case, Φ(c/√1000) = 0.975 and c = 1.96 × √1000 = 61.9806. Therefore, if we observe more than 1000 + 61 calls or fewer than 1000 − 61 calls, we reject the null hypothesis at significance level 0.05.

(3) If N is a Poisson random variable with E(N) = λ and Var(N) = λ, then Z = (N − λ)/√λ is approximately a standard normal random variable. The approximation is good for λ > 5.
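The threshold can be computed directly, assuming the setup of this example (N Poisson with E(N) = Var(N) = 1000 under H0, two-sided rejection set {|N − 1000| ≥ c}):

```python
from math import sqrt

lam = 1000          # E(N) = Var(N) under the null hypothesis
alpha = 0.05
z = 1.96            # Phi(1.96) = 0.975, read from the table

c = z * sqrt(lam)   # half-width of the acceptance interval
print(c)                   # ~61.98
print(lam - c, lam + c)    # reject outside roughly (938, 1062)
```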
4.2 Significance Testing 4.2.3 Two Kinds of Errors
In a significance test, two kinds of errors are possible. Statisticians refer to them as Type
I errors and Type II errors with the following definitions:
1 Type I error (False Rejection): Reject H0 when H0 is true.
2 Type II error (False Acceptance): Accept H0 when H0 is false.
✍ Remark
■ Although a significance test does not specify a complete probability model as an
alternative to the null hypothesis, the nature of the experiment influences the choice
of the rejection set, R.
■ In Example 5, we implicitly assume that the alternative to the null hypothesis is a
probability model with an expected value that is either higher than 1000 or lower
than 1000.
■ In the following example, the alternative is a model with an expected value that is lower than the expected value under the null hypothesis.
■ Under the null hypothesis, H0, the probability model after the people take the diet pill is N(190, 24²), the same as before taking the pill.
■ The sample mean, X̄, is a normal random variable with expected value µX̄ = 190 and standard deviation σX̄ = 24/√64 = 3.
■ To design the significance test, it is necessary to find R such that P(X̄ ∈ R) = 0.01.
■ If we reject the null hypothesis, we will decide that the pill is effective and release it
to the public.
■ In this example, we want to know whether the pill has caused people to lose weight.
If they gain weight, we certainly do not want to declare the pill effective.
Therefore, we choose the rejection set R to consist entirely of weights below the original expected value: R = {X̄ ≤ r0}.
■ We choose r0 so that the probability that we reject the null hypothesis is 0.01:
P(X̄ ∈ R) = P(X̄ ≤ r0) = Φ((r0 − 190)/3) = 0.01.

Since Φ(−2.33) = 0.01, it follows that (r0 − 190)/3 = −2.33, or r0 = 183.01.
■ Thus we will reject the null hypothesis and accept that the diet pill is effective at
significance level 0.01 if the sample mean of the population weight drops to 183.01
pounds or less.
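The threshold r0 follows from the two numbers in the example (µ = 190, σX̄ = 24/√64 = 3, and the table value Φ(−2.33) ≈ 0.01):

```python
z_99 = 2.33                            # Phi(-2.33) ~ 0.01, read from the table
mu0, sigma_xbar = 190, 24 / 64**0.5    # sigma_xbar = 3

r0 = mu0 - z_99 * sigma_xbar           # reject H0 when the sample mean <= r0
print(r0)   # 183.01
```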
✍ Remark
■ Note the difference between the symmetrical rejection set in Example 5 and the one-sided rejection set in this example. The symmetrical rejection set is part of a two-tail significance test; the one-sided rejection set is part of a one-tail significance test.
Table of the standard normal CDF Φ(z). Each entry gives the digits of Φ(z) after "0."; for example, Φ(0.23) = 0.59095.

z 0 1 2 3 4 5 6 7 8 9
0.0 50000 50399 50798 51197 51595 51994 52392 52790 53188 53586
0.1 53983 54380 54776 55172 55567 55962 56356 56749 57142 57535
0.2 57926 58317 58706 59095 59483 59871 60257 60642 61026 61409
0.3 61791 62172 62556 62930 63307 63683 64058 64431 64803 65173
0.4 65542 65910 66276 66640 67003 67364 67724 68082 68439 68793
0.5 69146 69497 69847 70194 70540 70884 71226 71566 71904 72240
0.6 72575 72907 73237 73565 73891 74215 74537 74857 75175 75490
0.7 75804 76115 76424 76730 77035 77337 77637 77935 78230 78524
0.8 78814 79103 79389 79673 79955 80234 80511 80785 81057 81327
0.9 81594 81859 82121 82381 82639 82894 83147 83398 83646 83891
1.0 84134 84375 84614 84850 85083 85314 85543 85769 85993 86214
1.1 86433 86650 86864 87076 87286 87493 87698 87900 88100 88298
1.2 88493 88686 88877 89065 89251 89435 89617 89796 89973 90147
1.3 90320 90490 90658 90824 90988 91149 91309 91466 91621 91774
1.4 91924 92073 92220 92364 92507 92647 92786 92922 93056 93189
1.5 93319 93448 93574 93699 93822 93943 94062 94179 94295 94408
1.6 94520 94630 94738 94845 94950 95053 95154 95254 95352 95449
1.7 95543 95637 95728 95818 95907 95994 96080 96164 96246 96327
1.8 96407 96485 96562 96638 96712 96784 96856 96926 96995 97062
1.9 97128 97193 97257 97320 97381 97441 97500 97558 97615 97670
2.0 97725 97778 97831 97882 97932 97982 98030 98077 98124 98169
2.1 98214 98257 98300 98341 98382 98422 98461 98500 98537 98574
2.2 98610 98645 98679 98713 98745 98778 98809 98840 98870 98899
2.3 98928 98956 98983 99010 99036 99061 99086 99111 99134 99158
2.4 99180 99202 99224 99245 99266 99285 99305 99324 99343 99361
2.5 99379 99396 99413 99430 99446 99461 99477 99492 99506 99520
2.6 99534 99547 99560 99573 99585 99598 99609 99621 99632 99643
2.7 99653 99664 99674 99683 99693 99702 99711 99720 99728 99736
2.8 99744 99752 99760 99767 99774 99781 99788 99795 99801 99807
2.9 99813 99819 99825 99831 99836 99841 99846 99851 99856 99861
3.0 99865  3.1 99903  3.2 99931  3.3 99952  3.4 99966
3.5 99977  3.6 99984  3.7 99989  3.8 99993  3.9 99995
4.0 999968  4.5 999997  5.0 9999997
Problems
Exercise 1
Let K be the number of heads in n = 100 flips of a coin. Devise significance tests for
the hypothesis H that the coin is fair such that
(a) The significance level α = 0.05 and the rejection set R has the form
{|K − E(K)| > c}.
(b) The significance level α = 0.01 and the rejection set R has the form {K > c}.
✍ Solution
(a) We wish to develop a significance test of the form P(|K − E(K)| > c) = 0.05. Since E(K) = 50 and σK = 5, this requires 2(1 − Φ(c/σK)) = 0.05, i.e., 2Φ(c/σK) − 1 = 0.95, and c = ?
(b) We wish to develop a test of the form P(K > c) = 0.01: P(K > c) ≃ 1 − Φ((c − µK)/σK) = 0.01, and c = ?
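The two thresholds left as "c = ?" follow from the table values Φ(1.96) = 0.975 and Φ(2.33) ≈ 0.99; a short Python sketch, assuming K is binomial(100, 1/2) so µK = 50 and σK = 5:

```python
mu_K, sigma_K = 50.0, 5.0   # K is binomial(100, 1/2)

# (a) two-sided: 2*Phi(c/sigma) - 1 = 0.95  =>  c/sigma = 1.96
c_a = 1.96 * sigma_K         # 9.8

# (b) one-sided: 1 - Phi((c - mu)/sigma) = 0.01  =>  (c - mu)/sigma = 2.33
c_b = mu_K + 2.33 * sigma_K  # 61.65

print(c_a, c_b)
```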
4.2 Significance Testing 4.2.4 Problems
Problems
Exercise 2
The duration of a voice telephone call is an exponential random variable T with
expected value E(T ) = 3 minutes. Data calls tend to be longer than voice calls on
average. Observe a call and reject the null hypothesis that the call is a voice call if
the duration of the call is greater than t0 minutes.
(a) Write a formula for α, the significance level of the test, as a function of t0.
(b) What is the value of t0 that produces a significance level α = 0.05?
✍ Solution
(a) The rejection region is R = {T > t0} and α = P(T > t0) = ∫_{t0}^{∞} fT(t) dt = e^{−t0/3}.
(b) t0 = −3 ln α, and t0 = ?
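Inverting α = e^{−t0/3} gives the threshold numerically:

```python
from math import log

alpha = 0.05
t0 = -3 * log(alpha)   # solves e^{-t0/3} = alpha
print(t0)              # ~8.99 minutes
```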
Problems
Exercise 3
A class has 2n (a large number) students. The students are separated into two
groups A and B each with n students. Group A students take exam A and earn iid
scores X1 , . . . , Xn . Group B students take exam B, earning iid scores Y1 , . . . , Yn . The
two exams are similar but different; however, the exams were designed so that a
student's score X on exam A or Y on exam B has the same mean and variance σ² = 100. For each exam, we form the sample mean statistic
X̄A = (X1 + · · · + Xn)/n,  ȲB = (Y1 + · · · + Yn)/n.
Problems
Exercise 3 (continued)
Based on the statistic D = X̄A − ȲB, use the central limit theorem to design a significance test at significance level α = 0.05 for the hypothesis H0 that a student's score on the two exams has the same mean µ and variance σ² = 100. What is the rejection region if n = 100? Make sure to specify any additional assumptions that you need to make; however, try to make as few additional assumptions as possible.
✍ Solution
We reject H0 if the difference in sample means is large. That is, R = {|D| ≥ d0}, where α = P(|D| ≥ d0) = 2(1 − Φ(d0/σD)) and d0 = ? If n = 100, d0 = ?
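A numerical sketch of the missing threshold, under the additional assumption that the two groups' scores are independent, so Var(D) = Var(X̄A) + Var(ȲB) = 2σ²/n:

```python
from math import sqrt

sigma2, n = 100, 100
sigma_D = sqrt(2 * sigma2 / n)   # assumes the two sample means are independent
d0 = 1.96 * sigma_D              # solves 2*(1 - Phi(d0/sigma_D)) = 0.05
print(sigma_D, d0)               # sqrt(2) and ~2.77
```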
4.3 Binary Hypothesis Testing 4.3.1 Likelihood Functions
Likelihood Functions
■ In a binary hypothesis test, there are two hypothetical probability models, H0 and H1, and two possible conclusions: accept H0 as the true model, or accept H1.
■ There is also a probability model for H0 and H1 , conveyed by the numbers P (H0 )
and P (H1 ) = 1 − P (H0 ). These numbers are referred to as the a priori probabilities
or prior probabilities of H0 and H1 . They reflect the state of knowledge about the
probability model before an outcome is observed.
■ The complete experiment for a binary hypothesis test consists of two
sub-experiments.
1 The first sub-experiment chooses a probability model from sample space S ′ = {H0 , H1 }. The
probability models H0 and H1 have the same sample space, S.
2 The second sub-experiment produces an observation corresponding to an outcome, s ∈ S.
Likelihood Functions
■ When the observation leads to a random vector X, we call X the decision statistic.
Often, the decision statistic is simply a random variable X.
■ When the decision statistic X is discrete, the probability models are conditional probability
mass functions PX|H0 (x) and PX|H1 (x).
■ When X is a continuous random vector, the probability models are conditional probability
density functions fX|H0 (x) and fX|H1 (x).
■ In the terminology of statistical inference, these functions are referred to as
likelihood functions.
For example, fX|H0 (x) is the likelihood of x given H0 .
■ The design of a binary hypothesis test represents a trade-off between the two error
probabilities, PFA = P (A1 |H0 ) and PMISS = P (A0 |H1 ).
■ To understand the trade-off, consider an extreme design in which A0 = S consists
of the entire sample space and A1 = ∅ is the empty set. In this case, PFA = 0 and
PMISS = 1.
Now let A1 expand to include an increasing proportion of the outcomes in S. As A1
expands, PFA increases and PMISS decreases. At the other extreme, A0 = ∅, which
implies PMISS = 0. In this case, A1 = S and PFA = 1.
A graph representing the possible values of PFA and PMISS is referred to as a
receiver operating curve (ROC). Examples appear in Figure 2.
Example 8
The noise voltage in a radar detection system is a standard normal random variable,
N . When a target is present, the received signal is X = v + N volts with v ≥ 0.
Otherwise, the received signal is X = N volts. Periodically, the detector performs a
binary hypothesis test with H0 as the hypothesis no target and H1 as the hypothesis
target present. The acceptance sets for the test are A0 = {X < x0 } and
A1 = {X ≥ x0 }. Find PMISS and PFA as functions of x0 .
Solution
■ To perform the calculations, we observe that
under hypothesis H0, X = N is a normal N(0, 1) random variable;
under hypothesis H1, X = v + N is a normal N(v, 1) random variable.
■ Therefore,

PMISS = P(A0|H1) = P(X < x0|H1) = Φ(x0 − v)

and

PFA = P(A1|H0) = P(X ≥ x0|H0) = 1 − Φ(x0).
Figure 3: The probability of a miss and the probability of a false alarm as a function of the threshold x0 for Example 8
The same data also appears in the corresponding receiver operating curves of Figure 4.
Figure 4: The corresponding receiver operating curve for the system. We see that the ROC improves as
v increases
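The two error curves (and hence one point of the ROC per threshold) can be computed directly from PFA = 1 − Φ(x0) and PMISS = Φ(x0 − v); the sweep values below are illustrative:

```python
from math import erf, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def radar_errors(x0, v):
    p_fa = 1 - Phi(x0)     # P(X >= x0 | H0), X ~ N(0, 1)
    p_miss = Phi(x0 - v)   # P(X <  x0 | H1), X ~ N(v, 1)
    return p_fa, p_miss

# sweep the threshold to trace one receiver operating curve
for x0 in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(x0, radar_errors(x0, v=2.0))
```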
Example 9
A modem transmits a binary signal to another modem. Based on a noisy
measurement, the receiving modem must choose between hypothesis H0 (the
transmitter sent a 0) and hypothesis H1 (the transmitter sent a 1). A false alarm
occurs when a 0 is sent but a 1 is detected at the receiver. A miss occurs when a 1 is
sent but a 0 is detected. For both types of error, the cost is the same.
■ The MAP test minimizes PERR, the total probability of error of a binary hypothesis test. The law of total probability (Chapter 1) relates PERR to the a priori probabilities of H0 and H1 and to the two conditional error probabilities, PFA = P(A1|H0) and PMISS = P(A0|H1):

PERR = P(H0)PFA + P(H1)PMISS.   (9)
■ When the two types of errors have the same cost, as in Example 9, minimizing PERR
is a sensible strategy.
✍ Remark
■ Note that P(H0|s) and P(H1|s) are referred to as the a posteriori probabilities of H0 and H1.
Just as the a priori probabilities P (H0 ) and P (H1 ) reflect our knowledge of H0 and
H1 prior to performing an experiment, P (H0 |s) and P (H1 |s) reflect our knowledge
after observing s.
■ Theorem 2 states that in order to minimize PERR it is necessary to accept the hypothesis with the higher a posteriori probability.
■ A test that follows this rule is a maximum a posteriori probability (MAP) hypothesis
test.
Equation (10) is another statement of the MAP decision rule. It contains the three
probability models that are assumed to be known:
■ The a priori probabilities of the hypotheses: P(H0) and P(H1).
■ The likelihood function of H0: P(s|H0).
■ The likelihood function of H1: P(s|H1).
When the outcomes of an experiment yield a random vector X as the decision statistic,
we can express the MAP rule in terms of conditional PMFs or PDFs.
■ If X is discrete, we take X = xi to be the outcome of the experiment.
■ If the sample space S of the experiment is continuous, we interpret the conditional PDF values fX|Hi(x) as the likelihoods.
Theorem 3
For an experiment that produces a random vector X, the MAP hypothesis test is
Discrete: x ∈ A0 if PX|H0(x)/PX|H1(x) ≥ P(H1)/P(H0); x ∈ A1 otherwise.
Continuous: x ∈ A0 if fX|H0(x)/fX|H1(x) ≥ P(H1)/P(H0); x ∈ A1 otherwise.
✍ Remark Theorem 3 states that H0 is the better conclusion if the evidence in favor of
H0 , based on the experiment, outweighs the prior evidence in favor of H1 .
Example 10
With probability p, a digital communications system transmits a 0. It transmits a 1
with probability 1 − p. The received signal is either X = −v + N volts, if the
transmitted bit is 0; or v + N volts, if the transmitted bit is 1. The voltage ±v is
the information component of the received signal, and N , a normal N (0, σ 2 )
random variable, is the noise component. Given the received signal X, what is the
minimum probability of error rule for deciding whether 0 or 1 was sent?
■ When p = 1/2, the threshold x∗ = 0 and the conclusion depends only on whether
the evidence in the received signal favors 0 or 1, as indicated by the sign of x.
■ When p ̸= 1/2, the prior information shifts the decision threshold x∗ .
■ The shift favors 1 (x∗ < 0) if p < 1/2.
■ The shift favors 0 (x∗ > 0) if p > 1/2.
Example 11
Find the error probability of the communications system of Example 10.
Solution
Applying Equation (9), we can write the probability of an error as

PERR = pP(X > x*|H0) + (1 − p)P(X < x*|H1).

Given H0, X is N(−v, σ²). Given H1, X is N(v, σ²). Consequently,

PERR = p[1 − Φ((x* + v)/σ)] + (1 − p)Φ((x* − v)/σ)
     = p[1 − Φ((σ/2v) ln(p/(1 − p)) + v/σ)] + (1 − p)Φ((σ/2v) ln(p/(1 − p)) − v/σ).
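The error probability above is easy to evaluate for given p, v, and σ. A minimal sketch using the MAP threshold x* = (σ²/2v) ln(p/(1 − p)) implied by the formula:

```python
from math import erf, log, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def p_err(p, v, sigma):
    """Minimum error probability of the binary channel of Example 10."""
    xstar = (sigma**2 / (2 * v)) * log(p / (1 - p))   # MAP threshold
    return p * (1 - Phi((xstar + v) / sigma)) + (1 - p) * Phi((xstar - v) / sigma)

# with p = 1/2 the threshold is 0 and PERR reduces to 1 - Phi(v/sigma)
print(p_err(0.5, 2.0, 1.0))
```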
4.3 Binary Hypothesis Testing 4.3.4 Four Methods of Choosing A0
Example 12
At a computer disk drive factory, the manufacturing failure rate is the probability
that a randomly chosen new drive fails the first time it is powered up. Normally the
production of drives is very reliable, with a failure rate q0 = 10−4 . However, from
time to time there is a production problem that causes the failure rate to jump to
q1 = 10−1 . Let Hi denote the hypothesis that the failure rate is qi .
Every morning, an inspector chooses drives at random from the previous day’s
production and tests them. If a failure occurs too soon, the company stops
production and checks the critical part of the process. Production problems occur at
random once every ten days, so that P (H1 ) = 0.1 = 1 − P (H0 ). Based on N , the
number of drives tested up to and including the first failure, design a MAP
hypothesis test. Calculate the conditional error probabilities PFA and PMISS and the
total error probability PERR .
Solution
Given a failure rate of qi, N is a geometric random variable with PMF

PN(n) = qi(1 − qi)^{n−1} for n = 1, 2, . . . , and PN(n) = 0 otherwise,

and expected value 1/qi (see Chapter 2). That is, PN|Hi(n) = qi(1 − qi)^{n−1} for n = 1, 2, . . . and PN|Hi(n) = 0 otherwise. Therefore, by Theorem 3, the MAP design states

n ∈ A0 if (q0(1 − q0)^{n−1})/(q1(1 − q1)^{n−1}) ≥ P(H1)/P(H0); n ∈ A1 otherwise,

which simplifies to n ∈ A0 if n ≥ n* ≈ 45.8.
■ This implies that the inspector tests at most 45 drives in order to reach a conclusion about the failure rate. If the first failure occurs before test 46, the company assumes that the failure rate is 10⁻¹. If the first 45 drives pass the test, then N ≥ 46 and the company assumes that the failure rate is 10⁻⁴.
■ The error probabilities are PFA = P(N ≤ 45|H0) = 1 − (1 − q0)⁴⁵ ≈ 0.0045 and PMISS = P(N ≥ 46|H1) = (1 − q1)⁴⁵ ≈ 0.0087.
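The threshold and error probabilities of this MAP design can be verified numerically:

```python
from math import log

q0, q1 = 1e-4, 1e-1   # normal and faulty failure rates
p0, p1 = 0.9, 0.1     # a priori probabilities of H0 and H1

# accept H0 iff q0*(1-q0)**(n-1) >= (p1/p0) * q1*(1-q1)**(n-1);
# solving for n gives the threshold below
n_star = 1 + log((p1 * q1) / (p0 * q0)) / log((1 - q0) / (1 - q1))
print(n_star)                 # ~45.8, so A0 = {n >= 46}

p_fa = 1 - (1 - q0)**45       # reject H0 although the rate is q0
p_miss = (1 - q1)**45         # accept H0 although the rate is q1
p_err = p0 * p_fa + p1 * p_miss
print(p_fa, p_miss, p_err)    # ~0.0045, ~0.0087, PERR ~ 0.0049
```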
■ The MAP test implicitly assumes that both types of errors (miss and false alarm)
are equally serious. This is not the case in many important situations.
■ Consider an application in which C = C10 units is the cost of a false alarm (decide
H1 when H0 is correct) and C = C01 units is the cost of a miss (decide H0 when
H1 is correct). In this situation the expected cost of test errors is
E(C) = P (H0 )P (A1 |H0 )C10 + P (H1 )P (A0 |H1 )C01 (15)
✍ Remark
In this test, we note that only the relative cost C01 /C10 influences the test, not the
individual costs or the units in which cost is measured. A ratio > 1 implies that misses
are more costly than false alarms. Therefore, a ratio > 1 expands A1 , the acceptance set
for H1 , making it harder to miss H1 when it is correct. On the other hand, the same
ratio contracts A0 and increases the false alarm probability, because a false alarm is less
costly than a miss.
Example 13
Continuing the disk drive test of Example 12, the factory produces 1,000 disk drives
per hour and 10,000 disk drives per day. The manufacturer sells each drive for $100.
However, each defective drive is returned to the factory and replaced by a new
drive. The cost of replacing a drive is $200, consisting of $100 for the replacement
drive and an additional $100 for shipping, customer support, and claims processing.
Further, remedying a production problem results in 30 minutes of lost production.
Based on the decision statistic N , the number of drives tested up to and including
the first failure, what is the minimum cost test?
■ By comparison, the MAP test, which minimizes the probability of an error rather than the expected cost, has an expected cost
Introduction
■ Given an observation, the MAP test minimizes the probability of accepting the
wrong hypothesis and the minimum cost test minimizes the cost of errors.
■ However, the MAP test requires that we know the a priori probabilities P(Hi) of the competing hypotheses, and the minimum cost test requires that we know in addition the relative costs of the two types of errors.
In many situations, these costs and a priori probabilities are difficult or even
impossible to specify.
■ The Neyman-Pearson test minimizes PMISS subject to the false alarm probability constraint PFA = α, where α is a constant that indicates our tolerance of false alarms.
Because PFA = P (A1 |H0 ) and PMISS = P (A0 |H1 ) are conditional probabilities, the
test does not require the a priori probabilities P (H0 ) and P (H1 ).
Neyman-Pearson Test

x ∈ A0 if L(x) = fX|H0(x)/fX|H1(x) ≥ γ; x ∈ A1 otherwise,   (18)

where γ is chosen so that ∫_{L(x)<γ} fX|H0(x) dx = α.
Neyman-Pearson Test

x ∈ A0 if L(x) = PX|H0(x)/PX|H1(x) ≥ γ; x ∈ A1 otherwise,   (19)

where γ is the largest possible value such that Σ_{L(x)<γ} PX|H0(x) ≤ α.
Neyman-Pearson Test
Example 14
Continuing the disk drive factory test of Example 12, design a Neyman-Pearson test
such that the false alarm probability satisfies PFA ≤ α = 0.01. Calculate the
resulting miss and false alarm probabilities.
Solution
■ The Neyman-Pearson test is

n ∈ A0 if L(n) = PN|H0(n)/PN|H1(n) ≥ γ; n ∈ A1 otherwise.
■ We see from Equation (13) that this is the same as the MAP test with
P (H1 )/P (H0 ) replaced by γ.
■ Thus, just like the MAP test, the Neyman-Pearson test must be a threshold test of the form n ∈ A0 if n ≥ n*; n ∈ A1 otherwise.
Neyman-Pearson Test
■ Since

PFA = P(N ≤ n* − 1|H0) = 1 − (1 − q0)^{n*−1} ≤ α,

this implies

n* ≤ 1 + ln(1 − α)/ln(1 − q0) = 1 + ln(0.99)/ln(0.9999) ≈ 101.49.
■ Thus, we can choose n∗ = 101 and still meet the false alarm probability constraint.
■ The error probabilities are PFA = P(N ≤ 100|H0) = 1 − (1 − q0)¹⁰⁰ ≈ 0.01 and PMISS = P(N ≥ 101|H1) = (1 − q1)¹⁰⁰ ≈ 2.66 × 10⁻⁵.
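The Neyman-Pearson threshold and its error probabilities for this design can be checked numerically:

```python
from math import floor, log

q0, q1, alpha = 1e-4, 1e-1, 0.01

# largest n* with P(N <= n*-1 | H0) = 1 - (1-q0)**(n*-1) <= alpha
n_star = floor(1 + log(1 - alpha) / log(1 - q0))
print(n_star)                        # 101

p_fa = 1 - (1 - q0)**(n_star - 1)    # ~0.00995, within the alpha budget
p_miss = (1 - q1)**(n_star - 1)      # ~2.66e-5
print(p_fa, p_miss)
```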
Neyman-Pearson Test
✍ Remark
We see that a one percent false alarm probability yields a dramatic reduction in the
probability of a miss. Although the Neyman-Pearson test minimizes neither the overall
probability of a test error nor the expected cost E(C), it may be preferable to either the
MAP test or the minimum cost test. In particular, customers will judge the quality of the
disk drives and the reputation of the factory based on the number of defective drives
that are shipped. Compared to the other tests, the Neyman-Pearson test results in a
much lower miss probability and far fewer defective drives being shipped.
Introduction
■ Similar to the Neyman-Pearson test, the maximum likelihood (ML) test is another test that does not require knowledge of the a priori probabilities P(Hi).
Remark
■ Comparing Theorem 2 and Definition 3, we see that in the absence of information about the a priori probabilities, the ML test is the same as the MAP test with P(H0) = P(H1) = 1/2.
Remark
When the decision statistic of the experiment is a random vector X, we can express the
ML rule in terms of conditional PMFs or PDFs, just as we did for the MAP rule.
Theorem 7
If an experiment produces a random vector X, the ML decision rule states
Discrete: x ∈ A0 if PX|H0(x)/PX|H1(x) ≥ 1; x ∈ A1 otherwise.   (21)
Continuous: x ∈ A0 if fX|H0(x)/fX|H1(x) ≥ 1; x ∈ A1 otherwise.   (22)
Example 15
Continuing the disk drive test of Example 12, design the maximum likelihood test
for the factory state based on the decision statistic N , the number of drives tested
up to and including the first failure.
Solution
■ The ML hypothesis test corresponds to the MAP test with P(H0) = P(H1) = 0.5.
■ In this case, Equation (14) implies n* = 66.62, or A0 = {n ≥ 67}.
■ The conditional error probabilities under the ML rule are PFA = P(N ≤ 66|H0) = 1 − (1 − q0)⁶⁶ ≈ 0.0066 and PMISS = P(N ≥ 67|H1) = (1 − q1)⁶⁶ ≈ 9.6 × 10⁻⁴.
Problems
Exercise 4
The duration of a voice telephone call is an exponential random variable V with
expected value E(V ) = 3 minutes. The duration of a data call is an exponential
random variable D with expected value E(D) = µD > 3 minutes. The null
hypothesis of a binary hypothesis test is H0 : a call is a voice call. The alternative
hypothesis is H1 : a call is a data call. The probability of a voice call is P (V ) = 0.8.
The probability of a data call is P (D) = 0.2. A binary hypothesis test measures T
minutes, the duration of a call. The decision is H0 if T ≤ t0 minutes. Otherwise, the
decision is H1 .
(a) Write a formula for the false alarm probability as a function of t0 and µD .
(b) Write a formula for the miss probability as a function of t0 and µD .
(c) Calculate the maximum likelihood decision time t0 = tML for µD = 6 minutes
and µD = 10 minutes.
4.3 Binary Hypothesis Testing 4.3.5 Problems
Problems
Solution
Likelihood functions: fT|H0(t) = (1/3)e^{−t/3}, t ≥ 0, and fT|H1(t) = (1/µD)e^{−t/µD}, t ≥ 0.
The acceptance regions are A0 = {T ≤ t0} and A1 = {T > t0}.
(a) PFA = P(A1|H0) = ∫_{t0}^{∞} fT|H0(t) dt = e^{−t0/3}.
(b) PMISS = P(A0|H1) = ∫_{0}^{t0} fT|H1(t) dt = 1 − e^{−t0/µD}.
(c) t ∈ A0 if t ≤ tML = ln(µD/3)/(1/3 − 1/µD). When µD = 6, tML = ?; when µD = 10, tML = ?
(d) t ∈ A0 if t ≤ tMAP = ln(4µD/3)/(1/3 − 1/µD). When µD = 6, tMAP = ?; when µD = 10, tMAP = ?
Problems
Exercise 5
In the radar system of Example 8, the probability that a target is present is
P (H1 ) = 0.01. In the case of a false alarm, the system issues an unnecessary alert at
the cost of C10 = 1 unit. The cost of a miss is C01 = 104 units because the target
could cause a lot of damage. When the target is present, the voltage is X = 4 + N ,
a Gaussian (4, 12 ) random variable. When there is no target present, the voltage is
X = N , the Gaussian (0, 12 ) random variable. In a binary hypothesis test, the
acceptance sets are A0 = {X ≤ x0 } and A1 = {X > x0 }.
(a) What is the average cost of the MAP policy?
(b) What is the average cost of the minimum cost policy?
Problems
Solution
(a) Given H0, X is Gaussian (0, 1²). Given H1, X is Gaussian (4, 1²).
The MAP rule simplifies to x ∈ A0 if x ≤ xMAP = 2 − (1/4) ln(P(H1)/P(H0)), and xMAP = ?
So PFA = P(X > xMAP|H0) = ? and PMISS = P(X ≤ xMAP|H1) = ?
Hence E(CMAP) = C10 PFA P(H0) + C01 PMISS P(H1) = ?
(b) The minimum cost test is x ∈ A0 if x ≤ xMC = 2 − (1/4) ln(C01 P(H1)/(C10 P(H0))), and xMC = ?
So PFA = P(X ≥ xMC|H0) = ? and PMISS = P(X < xMC|H1) = ?
Hence E(CMC) = C10 PFA P(H0) + C01 PMISS P(H1) = ?
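The quantities left as "?" can be evaluated with the numbers given in the exercise (P(H1) = 0.01, C10 = 1, C01 = 10⁴, means 0 and 4, unit variance):

```python
from math import erf, log, sqrt

def Phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

p1, p0 = 0.01, 0.99
c10, c01 = 1.0, 1e4

x_map = 2 - 0.25 * log(p1 / p0)                 # MAP threshold
x_mc = 2 - 0.25 * log(c01 * p1 / (c10 * p0))    # minimum cost threshold

def expected_cost(x0):
    p_fa = 1 - Phi(x0)      # X ~ N(0, 1) under H0
    p_miss = Phi(x0 - 4)    # X ~ N(4, 1) under H1
    return c10 * p_fa * p0 + c01 * p_miss * p1

print(x_map, expected_cost(x_map))   # ~3.15, cost ~19.7
print(x_mc, expected_cost(x_mc))     # ~0.85, cost ~0.28
```

Accounting for the relative cost of a miss shifts the threshold down sharply and cuts the expected cost by roughly a factor of 70.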
4.4 Multiple Hypothesis Test
Introduction
■ There are many applications in which an experiment can conform to more than two
known probability models, all with the same sample space S. A multiple hypothesis
test is a generalization of a binary hypothesis test.
■ There are M hypothetical probability models: H0, H1, . . . , HM−1. We perform an experiment and, based on the outcome, we come to the conclusion that a certain Hm is the true probability model.
■ The design of the experiment consists of dividing S into an event space consisting of mutually exclusive sets A0, A1, . . . , AM−1, where the outcome s ∈ Am leads to the conclusion that Hm is the true model.
Example 16
A computer modem is capable of transmitting 16 different signals. Each signal
represents a sequence of four bits in the digital bit stream at the input to the
modem. The modem receiver examines the received signal and produces four bits in
the bit stream at the modem’s output. The design of the modem considers the task
of the receiver to be a test of 16 hypotheses H0 , H1 , . . . , H15 , where H0 represents
0000, H1 represents 0001,. . . , and H15 represents 1111. The sample space of the
experiment is an ensemble of possible received signals. The test design places each
outcome s in a set Ai such that the event s ∈ Ai leads to the output of the four-bit
sequence corresponding to Hi .
Remark
■ For a multiple hypothesis test, the MAP hypothesis test and the ML hypothesis test generalize directly from the binary case. The MAP test maximizes the probability of a correct decision,

PCORRECT = Σ_{i=0}^{M−1} P(Hi)P(Ai|Hi).   (23)
Theorem 8
For an experiment that produces a random variable X, the MAP multiple
hypothesis test is
Discrete xi ∈ Am if P (Hm )PX|Hm (xi ) ≥ P (Hj )PX|Hj (xi ) for all j (24)
Continuous x ∈ Am if P (Hm )fX|Hm (x) ≥ P (Hj )fX|Hj (x) for all j (25)
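The continuous rule of Theorem 8 can be sketched in a few lines. The example below is hypothetical: three Gaussian models with unit variance, where the priors and signal levels are illustrative values, not taken from the text:

```python
from math import exp, pi, sqrt

def gauss_pdf(x, mu, sigma=1.0):
    """PDF of a N(mu, sigma^2) random variable."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

def map_decide(x, priors, means):
    """Accept H_m maximizing P(H_m) * f_{X|H_m}(x) (Theorem 8, continuous case)."""
    scores = [p * gauss_pdf(x, mu) for p, mu in zip(priors, means)]
    return max(range(len(scores)), key=scores.__getitem__)

priors = [0.5, 0.3, 0.2]   # hypothetical a priori probabilities
means = [0.0, 2.0, 4.0]    # hypothetical signal levels under H0, H1, H2

print(map_decide(-0.5, priors, means))  # 0
print(map_decide(3.8, priors, means))   # 2
```

Ties between hypotheses are broken in favor of the smaller index, mirroring the "≥ for all j" convention in the theorem.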