
Slides On Hypotheses Testing

Uploaded by

Vishal Kumar

Review

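$\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ is written as $\bar{X} \pm z_{\alpha/2}\,\mathrm{S.E.}(\bar{X})$

100(1-α) % confidence interval estimator of population mean µ

$\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

The length of this interval, $2\,z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$,
will be less than or equal to b when the sample size n is such that

$n \ge \left(\frac{2\,z_{\alpha/2}\,\sigma}{b}\right)^2$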
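The interval and sample-size formulas above can be checked numerically. The following is a minimal Python sketch, assuming σ is known; the function names `z_interval` and `min_sample_size` and the numbers passed to them are illustrative assumptions, not part of the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, alpha=0.05):
    """Two-sided 100(1 - alpha)% confidence interval for the mean, sigma known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


def min_sample_size(sigma, b, alpha=0.05):
    """Smallest n for which the interval length 2*z*sigma/sqrt(n) is at most b."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((2 * z * sigma / b) ** 2)


# Hypothetical numbers, chosen only for illustration.
low, high = z_interval(xbar=11.6, sigma=4, n=20)
n_needed = min_sample_size(sigma=4, b=2)
print(round(low, 3), round(high, 3), n_needed)
```

Rounding the sample size up with `ceil` keeps the interval length at or below b, since the length shrinks as n grows.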
Review
A 100(1-α) % upper confidence bound for μ is

$\bar{X} + z_{\alpha}\,\frac{\sigma}{\sqrt{n}}$

A 100(1-α) % lower confidence bound for μ is

$\bar{X} - z_{\alpha}\,\frac{\sigma}{\sqrt{n}}$

The half-width of the 95% confidence interval is commonly called the margin of error


Review
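If the standard deviation of the population is unknown, the 100(1-α) % confidence interval for μ is

$\bar{X} \pm t_{n-1,\alpha/2}\,\frac{S}{\sqrt{n}}$

A 100(1-α) % upper confidence bound for μ is

$\bar{X} + t_{n-1,\alpha}\,\frac{S}{\sqrt{n}}$

A 100(1-α) % lower confidence bound for μ is

$\bar{X} - t_{n-1,\alpha}\,\frac{S}{\sqrt{n}}$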
Review
Confidence interval estimate of p
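$\hat{p} \pm z_{\alpha/2}\,\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

Ideally we should use the t-distribution, since the standard error is
estimated from the sample, but the z-distribution is also fine because the
sample size is generally large when estimating p

Length of confidence interval

$2\,z_{\alpha/2}\,\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$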


Review
Estimation of Standard Deviation

Chi-Squared Distribution:

$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$
Testing Statistical Hypotheses

Statistical inference is the science of drawing conclusions about a
population based on information from a sample

Given a hypothesis about the population, we must then decide whether this
hypothesis is consistent with the data obtained in the sample
Testing Statistical Hypotheses

A statistical hypothesis is a statement about the nature of a population

It is stated in terms of a population parameter, not a sample statistic

To test a statistical hypothesis, we must decide whether that hypothesis
appears to be consistent with the data of the sample
Testing Statistical Hypotheses
The value of the population parameter specified in the null hypothesis is usually determined
in one of three ways.

• First, it may result from past experience or knowledge of the process or even from
previous tests or experiments. The objective of hypothesis testing, then, is usually to
determine whether the parameter value has changed.

• Second, this value may be determined from some theory or model regarding the process
under study. Here the objective of hypothesis testing is to verify the theory or model.

• Third, the value of the population parameter results from external considerations, such
as design or engineering specifications, or from contractual obligations. In this situation,
the usual objective of hypothesis testing is conformance testing.
Testing Statistical Hypotheses

Suppose that a tobacco firm claims that it has discovered a new way of
curing tobacco leaves that will result in a mean nicotine content of a
cigarette of 1.5 milligrams or less.

Now a researcher is skeptical of this claim and indeed believes that the
mean will exceed 1.5 milligrams.

To disprove the claim of the tobacco firm, the researcher has decided to
test its hypothesis that the mean is less than or equal to 1.5 milligrams.
Testing Statistical Hypotheses

The statistical hypothesis to be tested, which is called the null hypothesis
and is denoted by H0, is thus that the mean nicotine content is less than
or equal to 1.5 milligrams.

H0: μ ≤ 1.5

The alternative to the null hypothesis, which the tester is actually trying to
establish, is called the alternative hypothesis and is designated by H1. For
our example, H1 is the hypothesis that the mean nicotine content exceeds
1.5 milligrams, which can be written symbolically as

H1: μ > 1.5


Testing Statistical Hypotheses

The null hypothesis, denoted by H0, is a statement about a population
parameter.

The alternative hypothesis is denoted by H1.

The null hypothesis will be rejected if it appears to be inconsistent with the
sample data and will not be rejected otherwise.

The decision of whether to reject the null hypothesis is based on the
value of a test statistic.
Testing Statistical Hypotheses
A test statistic is a statistic whose value is determined from the sample data.

Depending on the value of this test statistic (TS), the null hypothesis will be
rejected or not.

The critical region, also called the rejection region (C), is that set of values of
the test statistic for which the null hypothesis is rejected.

Reject H0 if TS is in C
Do not reject H0 if TS is not in C
Testing Statistical Hypotheses
H0: μ = 50 and H1: μ ≠ 50
Now take a sample, measure the sample average and decide:

[Figure: axis of sample-average values marked 48, 50 and 52; H0 is rejected below 48 and above 52 and accepted (not rejected) between them.]

The rejection of the null hypothesis H0 is a strong statement that H0 does
not appear to be consistent with the observed data.

The result that H0 is not rejected is a weak statement that should be
interpreted to mean that H0 is consistent with the data (“Fail to Reject H0”
rather than “Accept H0”).
Testing Statistical Hypotheses
In testing a given null hypothesis, two different types of errors can
result

False Positive (Type I or α error):
Test rejects H0 when H0 is true

False Negative (Type II or β error):
Test does not reject H0 when H0 is false

Decision             H0 is True           H0 is False
Fail to Reject H0    No Error             Type II or β Error
Reject H0            Type I or α Error    No Error
Type I (α) and Type II (β) Errors

• The size of the critical region, and consequently the probability of a type I error α, can
always be reduced by appropriate selection of the critical values.

• Type I and type II errors are related. A decrease in the probability of one type of error
generally results in an increase in the probability of the other, provided that the sample size
n does not change.

• An increase in sample size reduces β provided that α is held constant.

• When the null hypothesis is false, β increases as the true value of the parameter
approaches the value hypothesized in the null hypothesis. The value of β decreases as the
difference between the true mean and the hypothesized value increases.
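The relationships in the list above can be seen by computing β directly for a two-sided z test. This is a minimal sketch under stated assumptions: the population is normal with known σ, and the function name `type_ii_error` and all numeric values are hypothetical, chosen only to show the direction of each effect.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu_true, mu0, sigma, n, alpha=0.05):
    """P(do not reject H0: mu = mu0) for the two-sided z test when the mean is mu_true."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    shift = sqrt(n) * (mu_true - mu0) / sigma  # mean of the test statistic
    # The statistic is N(shift, 1); H0 is not rejected when it lies in (-z, z).
    return std_normal.cdf(z - shift) - std_normal.cdf(-z - shift)


# Hypothetical values: H0: mu = 50, sigma = 2.
beta_base = type_ii_error(mu_true=51, mu0=50, sigma=2, n=10)
beta_larger_n = type_ii_error(mu_true=51, mu0=50, sigma=2, n=40)
beta_farther = type_ii_error(mu_true=53, mu0=50, sigma=2, n=10)
beta_smaller_alpha = type_ii_error(mu_true=51, mu0=50, sigma=2, n=10, alpha=0.01)
print(beta_base, beta_larger_n, beta_farther, beta_smaller_alpha)
```

Under these assumptions, β falls when n grows or when the true mean moves away from the hypothesized value, and rises when α is made smaller at a fixed n.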
Power of a Statistical Test

The power of a statistical test is the probability of rejecting the
null hypothesis H0 when the alternative hypothesis is true.

The power is computed as 1 − β, and power can be interpreted as
the probability of correctly rejecting a false null hypothesis.


Testing Statistical Hypotheses

The objective of a statistical test of the null hypothesis H0 is not to
determine whether H0 is true, but rather to determine if its truth is
consistent with the resultant data.

The classical procedure for testing a null hypothesis is to fix a small
significance level α and then require that the probability of rejecting H0 when
H0 is true is less than or equal to α, where α is commonly 0.10, 0.05
or 0.01.
Testing Statistical Hypotheses

If you are trying to establish a certain hypothesis, then that hypothesis
should be designated as the alternative hypothesis.

Similarly, if you are trying to discredit a hypothesis, that hypothesis should
be designated the null hypothesis.
Testing Statistical Hypotheses

Consider a trial in which a judge must decide between hypothesis
“A” that the defendant is guilty and hypothesis “B” that he or she is
innocent.

(a) In the framework of hypothesis testing which of the hypotheses
should be the null hypothesis?

B should be the null hypothesis, since the defendant is presumed not guilty
unless proved otherwise.
Testing Statistical Hypotheses
Tests Concerning the Mean of a normal population when variance is known

Suppose that X1, . . . , Xn are a sample from a normal distribution having an
unknown mean μ and a known variance σ², and suppose we want to test the null
hypothesis that the mean μ is equal to some specified value against the alternative
that it is not. That is, we want to test

H0: μ = μ0
against the alternative hypothesis
H1: μ ≠ μ0
for a specified value μ0.
Testing Statistical Hypotheses
Suppose we want the test to have significance level α.
The sample mean is the natural point estimator of the population mean μ, and
for a population with known standard deviation,

100(1-α) % confidence interval estimator of population mean µ

$\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

The length of this interval is $2\,z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

Therefore, if the difference between the sample mean and the hypothesized mean is
within the range $\pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
for significance level α, the null hypothesis is not rejected
Testing Statistical Hypotheses

Reject H0 if $\frac{\sqrt{n}}{\sigma}\,|\bar{X} - \mu_0| \ge z_{\alpha/2}$

Do not reject H0 otherwise


Testing Statistical Hypotheses

[Figure: standard normal density of the test statistic √n(X̄ − μ0)/σ; reject H0 in the two tails beyond −zα/2 and zα/2, do not reject H0 in between.]
Testing Statistical Hypotheses
Suppose that if a signal of intensity μ is emitted from a particular star, then the value received
at an observatory on earth is a normal random variable with mean μ and standard deviation 4.
In other words, the value of the signal emitted is altered by random noise, which is normally
distributed with mean 0 and standard deviation 4. It is suspected that the intensity of the signal
is equal to 10. Test whether this hypothesis is plausible if the same signal is independently
received 20 times and the average of the 20 values received is 11.6. Use the 5 percent level of
significance.
If μ represents the actual intensity of the signal emitted, then the null hypothesis we want
to test is
H0: μ = 10 against the alternative H1: μ ≠ 10

To begin, we compute the value of the statistic

$\frac{\sqrt{n}}{\sigma}\,|\bar{X} - \mu_0| = \frac{\sqrt{20}}{4}\,|11.6 - 10| = 1.79$

At significance level 0.05, H0 is not rejected as long as this statistic lies
within z0.025 = 1.96 of zero. The observed value of 1.79 is within this range.
Testing Statistical Hypotheses
Since this value is less than z0.025 = 1.96, the null hypothesis is not rejected.

In other words, we conclude that the data are consistent with the null
hypothesis that the value of the signal is equal to 10.

The reason for this is that a sample mean as far from the value 10 as the
one observed would occur, when H0 is true, over 5 percent of the time.

Note, however, that if the significance level were chosen to be α = 0.1,
as opposed to α = 0.05,
then the null hypothesis would be rejected (since zα/2 = z0.05 = 1.645).
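The decision rule for the signal example can be reproduced in a few lines. This is a minimal sketch using the example's values (σ = 4, n = 20, sample mean 11.6, μ0 = 10); the function name `two_sided_z_test` is an assumption introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_z_test(xbar, mu0, sigma, n, alpha):
    """Return (|test statistic|, critical value, reject?) for H0: mu = mu0."""
    statistic = sqrt(n) * abs(xbar - mu0) / sigma
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return statistic, critical, statistic >= critical


for alpha in (0.05, 0.10):
    stat, crit, reject = two_sided_z_test(xbar=11.6, mu0=10, sigma=4, n=20, alpha=alpha)
    print(f"alpha={alpha}: statistic={stat:.3f}, critical={crit:.3f}, reject H0={reject}")
```

The same statistic is compared against a smaller critical value at α = 0.1, which is why the decision changes between the two levels.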
Testing Statistical Hypotheses

It is important to note that the “correct” level of significance to use in
any given hypothesis-testing situation depends on the individual
circumstances of that situation.

If rejecting H0 resulted in a large cost that would be wasted if H0
were indeed true, then we would probably elect to be conservative
and choose a small significance level.
P-Value

[Figure: standard normal density of the test statistic √n(X̄ − μ0)/σ; reject H0 in the two tails beyond −zα/2 and zα/2, do not reject H0 in between.]
Recall that each critical value zα corresponds to a tail probability α under the standard
normal curve. So instead of checking whether the hypothesis is rejected at a given level α,
we can compute the value v of the test statistic √n (X̄ − μ0)/σ and find the probability
that a standard normal random variable is at least as extreme as v.

This probability is the significance level up to which the hypothesis is not rejected. It is
called the p value.
P-Value

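$\frac{\sqrt{n}}{\sigma}\,|\bar{X} - \mu_0| = \frac{\sqrt{20}}{4}\,|11.6 - 10| = 1.79$

[Figure: standard normal density with shaded tail areas of 3.67% below −1.79 and 3.67% above 1.79, and a central area of 92.66%.]

P-value is the area of the shaded region

P-Value = 1 − P(−1.79 < Z < 1.79)

= 1 − 0.9266 = 0.0734 or 7.34%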


Testing Statistical Hypotheses

The p value is the smallest significance level at which the data lead to
rejection of the null hypothesis.

It gives the probability that data as unsupportive of H0 as those observed
will occur when H0 is true.

A small p value (say, 0.05 or less) is a strong indicator that the null
hypothesis is not true.

The smaller the p value, the greater the evidence for the falsity of H0.

p value > 0.1     Data provide weak evidence against H0
p value ~ 0.05    Data provide moderate evidence against H0
p value < 0.01    Data provide strong evidence against H0
Testing Statistical Hypotheses

It is standard practice to report the observed P-value along with the decision that is
made regarding the null hypothesis.

Clearly, the P-value provides a measure of the credibility of the null hypothesis.

Specifically, it is the risk that we have made an incorrect decision if we reject the null
hypothesis H0.

The “P-value” is not the probability that the null hypothesis is false, nor is “1 − P”
the probability that the null hypothesis is true.

The null hypothesis is either true or false (there is no probability associated with
this), so the proper interpretation of the P-value is in terms of the risk of wrongly
rejecting the null hypothesis H0.
Testing Statistical Hypotheses
Suppose that the average of the 20 values in previous example is equal to 10.8.
In this case the absolute value of the test statistic is

$\frac{\sqrt{n}}{\sigma}\,|\bar{X} - \mu_0| = \frac{\sqrt{20}}{4}\,|10.8 - 10| = 0.894$

P {|Z| ≥ 0.894} = 2P{Z ≥ 0.894} = 0.371
Testing Statistical Hypotheses

On the other hand, if the value of the sample mean were 7.8,
then the absolute value of the test statistic would be 2.46

P {|Z| ≥ 2.46} = 2P{Z ≥ 2.46} = 0.014

Thus, H0 would be rejected at all significance levels above 0.014 and
would not be rejected for lower significance levels.
Testing Statistical Hypotheses

Intensity of the actual signal (null hypothesis) is equal to 10

Observed Value    P-Value
11.6              0.0734
10.8              0.371
7.8               0.014
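The p values in the table can be recomputed as follows. This is a minimal sketch using the example's σ = 4, n = 20 and μ0 = 10; the function name `two_sided_p_value` is an assumption. Because the test statistic is not rounded to two decimals here, the results may differ slightly in the third decimal from figures read off a normal table.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(xbar, mu0, sigma, n):
    """p value = P{|Z| >= v} = 2 P{Z >= v}, where v is the absolute test statistic."""
    v = sqrt(n) * abs(xbar - mu0) / sigma
    return 2 * (1 - NormalDist().cdf(v))


p_values = {xbar: two_sided_p_value(xbar, mu0=10, sigma=4, n=20) for xbar in (11.6, 10.8, 7.8)}
for xbar, p in p_values.items():
    print(f"sample mean {xbar}: p value = {p:.4f}")
```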
In previous example, assuming a 0.05 significance level, what is the probability that
the null hypothesis (that the signal intensity is equal to 10) will not be rejected
when the actual signal value is 9.2?
For σ = 4 and n = 20, the significance-level-0.05 test of
H0: μ = 10 against H1: μ ≠ 10
is to reject H0 if $\frac{\sqrt{20}}{4}\,|\bar{X} - 10| \ge z_{0.025} = 1.96$

That is, H0 will be rejected if the sample mean is at least 11.753 or at most 8.247.

Now if the true (population) mean is 9.2, we have to calculate the probability
that the sample mean will be at least 11.753 or at most 8.247, where the sample
mean has standard error 4/√20 = 0.894

$P(\text{rejection of } H_0) = P(\bar{X} \ge 11.753) + P(\bar{X} \le 8.247) \approx 0.145$,
so the probability of not rejecting H0 is about 1 − 0.145 = 0.855.
That is, when the true signal value is 9.2, there is an 85.5% chance that the 0.05
significance level test will not reject the null hypothesis that the signal value is equal to
10
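The type II error calculation above can be reproduced on the scale of the sample mean. This is a minimal sketch using the example's values; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, mu_true, sigma, n, alpha = 10, 9.2, 4, 20, 0.05

se = sigma / sqrt(n)                     # standard error of the sample mean
z = NormalDist().inv_cdf(1 - alpha / 2)  # z_{0.025}
lower, upper = mu0 - z * se, mu0 + z * se  # H0 is rejected outside (lower, upper)

# Distribution of the sample mean when the true signal value is mu_true.
xbar_dist = NormalDist(mu=mu_true, sigma=se)
p_reject = xbar_dist.cdf(lower) + (1 - xbar_dist.cdf(upper))
beta = 1 - p_reject  # probability that H0 is not rejected

print(f"reject if mean <= {lower:.3f} or >= {upper:.3f}")
print(f"P(reject) = {p_reject:.4f}, beta = {beta:.4f}")
```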
Testing Statistical Hypotheses
Approach to Hypothesis Testing with Fixed Probability
1. State the null and alternative hypotheses.
2. Choose a fixed significance level α.
3. Choose an appropriate test statistic and establish the critical region based
on α.
4. Reject H0 if the computed test statistic is in the critical region. Otherwise, do
not reject.
5. Draw scientific or engineering conclusions.

Significance Testing (P-Value Approach)


1. State null and alternative hypotheses.
2. Choose an appropriate test statistic.
3. Compute the P-value based on the computed value of the test statistic.
4. Use judgment based on the P-value and knowledge of the scientific
system.
Testing Statistical Hypotheses

Two-Sided vs One-Sided Tests


Testing Statistical Hypotheses
One-Sided Tests
So far we have been considering two-sided hypothesis-testing problems in which the
null hypothesis is that μ is equal to a specified value μ0 and the test is to reject this
hypothesis if sample mean is either too much larger or too much smaller than μ0.

However, in many situations, the hypothesis we are interested in testing is that the mean
is less than or equal to some specified value μ0 versus the alternative that it is greater
than that value. That is, we are often interested in testing

H0: μ ≤ μ0 against the alternative H1: μ > μ0


Testing Statistical Hypotheses

Since we would want to reject H0 only when the sample mean is much
larger than μ0 (and no longer when it is much smaller), it can be shown,
in exactly the same fashion as was done in the two-sided case, that the
significance-level-α test is to

Reject H0 if $\frac{\sqrt{n}}{\sigma}\,(\bar{X} - \mu_0) \ge z_{\alpha}$

Do not reject H0 otherwise


Testing Statistical Hypotheses

Testing H0: μ ≤ μ0 against H1: μ > μ0

[Figure: standard normal density of the test statistic √n(X̄ − μ0)/σ; do not reject H0 to the left of zα, reject H0 in the upper tail beyond zα.]
This test can be carried out alternatively by first computing the value of the test
statistic

$\frac{\sqrt{n}}{\sigma}\,(\bar{X} - \mu_0)$

The p value is then equal to the probability that a standard normal random variable is
at least as large as this value. That is, if the value of the test statistic is v, then

p value = P{Z ≥ v}

The null hypothesis is then rejected at any significance level greater than or equal to the
p value.
Testing Statistical Hypotheses

In similar fashion, we can test the null hypothesis

H0: μ ≥ μ0 against H1: μ < μ0

by first computing the value v of the test statistic √n (X̄ − μ0)/σ.

The p value then equals the probability that a standard normal is less
than or equal to this value, and the null hypothesis is rejected if the
significance level is at least as large as the p value.
All cigarettes presently being sold have an average nicotine content of at least 1.5 mg per
cigarette. A firm that produces cigarettes claims that it has discovered a new technique for
curing tobacco leaves that results in an average nicotine content of a cigarette of less than 1.5
mg. To test this claim, a sample of 20 of the firm’s cigarettes was analyzed. If it were known
that the standard deviation of a cigarette’s nicotine content was 0.7 mg, what
conclusions could be drawn, at the 5 percent level of significance, if the average nicotine
content of these 20 cigarettes were 1.42 mg?
To see if the results establish the firm’s claim, let us see if they would lead to rejection of the
hypothesis that the firm’s cigarettes do not have an average nicotine content lower than 1.5
milligrams. That is, we should test

H0: μ ≥ 1.5 against H1: μ < 1.5


Since the value of the test statistic is

$\frac{\sqrt{n}}{\sigma}\,(\bar{X} - \mu_0) = \frac{\sqrt{20}}{0.7}\,(1.42 - 1.5) = -0.511$

it follows that the p value is

p value = P{Z ≤ −0.511} = 0.305


Since the p value exceeds 0.05, the foregoing data do not enable us to
reject the null hypothesis and conclude that the mean content per
cigarette is less than 1.5 milligrams.

In other words, even though the evidence supports the cigarette
producer’s claim (since the average nicotine content of those cigarettes
tested was indeed less than 1.5 milligrams), that evidence is not strong
enough to prove the claim. This is because a result at least as
supportive of the alternative hypothesis H1 as that obtained would be
expected to occur 30.5 percent of the time when the mean nicotine
content was 1.5 milligrams per cigarette.
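The one-sided calculation for the nicotine example can be checked with a short script. This is a minimal sketch using the example's values (σ = 0.7, n = 20, sample mean 1.42, μ0 = 1.5, α = 0.05); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, xbar, alpha = 1.5, 0.7, 20, 1.42, 0.05

# Test statistic for H0: mu >= mu0 against H1: mu < mu0.
v = sqrt(n) * (xbar - mu0) / sigma

# Small values of the statistic support H1, so the p value is a lower-tail area.
p_value = NormalDist().cdf(v)  # P{Z <= v}
reject = p_value <= alpha

print(f"statistic = {v:.3f}, p value = {p_value:.3f}, reject H0 = {reject}")
```

For the opposite one-sided test (H1: μ > μ0), the p value would instead be the upper-tail area `1 - NormalDist().cdf(v)`.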
Testing Statistical Hypotheses
Hypothesis tests concerning the mean μ of a normal population with
known variance σ².
Testing Statistical Hypotheses
We have assumed so far that the underlying population distribution is
the normal distribution. However, we have only used this assumption
to conclude that √n(X̄ − μ)/σ has a standard normal distribution. But
by the central limit theorem this result will approximately hold, no
matter what the underlying population distribution, as long as n is
reasonably large.

A rule of thumb is that a sample size of n ≥ 30 will almost always
suffice. Indeed, for many population distributions, a value of n as
small as 4 or 5 will result in a good approximation. Thus, all the
hypothesis tests developed so far can often be used even when the
underlying population distribution is not normal.
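The central limit theorem argument can be illustrated by simulation. The sketch below is a hypothetical experiment, not part of the slides: it draws samples from an exponential population (clearly non-normal, with mean 1 and standard deviation 1), applies the two-sided z test to a true null hypothesis, and estimates how often H0 is rejected. If the normal approximation is adequate, the rate should be close to α. The function name, the seed and the number of trials are assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def rejection_rate(n, trials=4000, alpha=0.05, seed=0):
    """Estimate how often the two-sided z test rejects a TRUE H0 for exponential data."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    mu0 = sigma = 1.0  # an exponential population with rate 1 has mean 1 and sd 1
    rejections = 0
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        statistic = sqrt(n) * abs(fmean(sample) - mu0) / sigma
        rejections += statistic >= z  # True counts as 1
    return rejections / trials


rate = rejection_rate(n=30)
print(f"estimated type I error rate at n = 30: {rate:.3f}")
```

The estimate varies with the seed and the number of trials, so it should be read as approximate.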

Problem Set P 406


Testing Statistical Hypotheses
Hypothesis tests concerning the mean μ of a normal population with
unknown variance σ².
Testing Statistical Hypotheses
Hypothesis Tests Concerning Population Proportion

Tests concerning the proportion of members of a population that possess
a certain characteristic

We suppose that the population is very large (in theory, of infinite size),
and we let p denote the unknown proportion of the population with the
characteristic. We will be interested in testing the null hypothesis

H0: p ≤ p0 against the alternative

H1: p > p0 for a specified value p0


Testing Statistical Hypotheses
If a random selection of n elements of the population is made, then X, the number with
the characteristic, will have a binomial distribution with parameters n and p. Now it
should be clear that we want to reject the null hypothesis that the proportion is less than
or equal to p0 only when X is sufficiently large. Hence, if the observed value of X is x,
then the p value of these data will equal the probability that at least as large a value
would have been obtained if p had been equal to p0 (which is the largest possible value
of p under the null hypothesis). That is, if we observe that X is equal to x, then

p value = P{X ≥ x}

where X is a binomial random variable with parameters n and p0.

The p value can now be computed by using the normal approximation.

The null hypothesis should then be rejected at any significance level
that is greater than or equal to the p value.
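The p value P{X ≥ x} can be computed exactly from the binomial distribution or approximately from the normal distribution. The following is a minimal sketch; the function names, the use of a continuity correction, and the data (14 of 100 sampled items with the characteristic, p0 = 0.1) are assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_upper_tail(x, n, p0):
    """Exact p value P{X >= x} for X ~ Binomial(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1))


def normal_approx_upper_tail(x, n, p0):
    """Normal approximation to P{X >= x}, with a continuity correction of 0.5."""
    mean, sd = n * p0, sqrt(n * p0 * (1 - p0))
    return 1 - NormalDist().cdf((x - 0.5 - mean) / sd)


# Hypothetical data for testing H0: p <= 0.1 against H1: p > 0.1.
n, x, p0 = 100, 14, 0.10
exact = binomial_upper_tail(x, n, p0)
approx = normal_approx_upper_tail(x, n, p0)
print(f"exact p value = {exact:.4f}, normal approximation = {approx:.4f}")
```

With these hypothetical data both p values exceed 0.05, so H0 would not be rejected at the 5 percent level.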
Testing Statistical Hypotheses

Hypothesis Tests Concerning p, the Proportion of a Large Population
that Has a Certain Characteristic
Testing Statistical Hypotheses
Hypothesis Tests on Population Variance
