
IFT Notes R06 Hypothesis Testing

CFA Level 1 2022


R06 Hypothesis Testing 2022 Level I Notes

R06 Hypothesis Testing

1. Introduction
2. The Process of Hypothesis Testing
2.1 Stating the Hypotheses
2.2 Two-Sided vs. One-Sided Hypotheses
2.3 Selecting the Appropriate Hypotheses
3. Identify the Appropriate Test Statistic
3.1 Test Statistics
3.2 Identifying the Distribution of the Test Statistic
4. Specify the Level of Significance
5. State the Decision Rule
5.1 Determining Critical Values
5.2 Decision Rules and Confidence Intervals
5.3 Collect the Data and Calculate the Test Statistic
6. Make a Decision
7. The Role of p-Values
8. Multiple Tests and Interpreting Significance
9. Tests Concerning a Single Mean
10. Test Concerning Differences Between Means with Independent Samples
11. Test Concerning Differences Between Means with Dependent Samples
12. Testing Concerning Tests of Variances (Chi-Square Test)
12.1 Tests of a Single Variance
12.2 Test Concerning the Equality of Two Variances (F-Test)
13. Parametric vs. Nonparametric Tests
14. Tests Concerning Correlation
14.1 Parametric Test of a Correlation
14.2 Tests Concerning Correlation: The Spearman Rank Correlation Coefficient
15. Test of Independence Using Contingency Table Data
Summary
Practice Questions

This document should be read in conjunction with the corresponding reading in the 2022 Level I CFA®
Program curriculum. Some of the graphs, charts, tables, examples, and figures are copyright
2021, CFA Institute. Reproduced and republished with permission from CFA Institute. All rights
reserved.

© IFT. All rights reserved 1



Required disclaimer: CFA Institute does not endorse, promote, or warrant the accuracy or quality of
the products or services offered by IFT. CFA Institute, CFA®, and Chartered Financial Analyst® are
trademarks owned by CFA Institute.

Version 1.2


1. Introduction
Hypothesis testing is the process of making judgments about a larger group (a population)
on the basis of observing a smaller group (a sample). The results of such a test help us
evaluate whether the sample evidence supports or contradicts our hypothesis.
For example, let’s say you are a researcher and you believe that the average return on all
Asian stocks was greater than 2%. To test this belief, you can draw samples from a
population of all Asian stocks and employ hypothesis testing procedures. The results of this
test can tell you if your belief is statistically valid.
In this reading we will look at hypothesis tests concerning the mean, variance and
correlation.
2. The Process of Hypothesis Testing
A hypothesis is defined as a statement about one or more populations. In order to test a
hypothesis, we follow these steps:
1. State the hypothesis.
2. Identify the appropriate test statistic and its probability distribution.
3. Specify the significance level.
4. State the decision rule.
5. Collect data and calculate the test statistic.
6. Make a decision.
2.1 Stating the Hypotheses
For each hypothesis test, we always state two hypotheses: the null hypothesis (H0) and the
alternative hypothesis (Ha).
Null hypothesis (H0): It is the hypothesis that the researcher wants to reject.
Alternative hypothesis (Ha): It is the hypothesis that the researcher wants to prove. If the
null hypothesis is rejected then the alternative hypothesis is considered valid.
Suppose you are a researcher and believe that the average return on all Asian stocks was
greater than 2%. In this case, you are making a statement about the population mean (µ) of
all Asian stocks.
For this example, the null and alternative hypotheses are:
H0: µ ≤ 2%
Ha: µ > 2%
(The value 2% is known as µ0, the hypothesized value of the population mean.)
Instructor’s Note:
An easy way to differentiate between the two hypotheses is to remember that the null
hypothesis always contains some form of the equal sign.


2.2 Two-Sided vs. One-Sided Hypotheses


The alternative hypothesis can be one-sided or two-sided depending on the proposition
being tested. A one-sided test is also called a one-tailed test, and a two-sided test is also
called a two-tailed test.
If we want to determine whether the value of a population parameter is less than
(or greater than) a hypothesized value, we use a one-tailed test. However, if we want to
determine whether the value of a population parameter is different from a
hypothesized value, we use a two-tailed test.
Two-sided test: Suppose we want to test if the population mean is equal to 2%. The null and
alternative hypothesis can be expressed as:
H0: µ = 2%
Ha: µ ≠ 2%
Since the alternative hypothesis contains a ≠ sign this is a two-sided test.
One-sided test (right side): Suppose we want to test if the population mean is greater than
2%. The null and alternative hypothesis can be expressed as:
H0: µ ≤ 2%
Ha: µ > 2%
Since the alternative hypothesis contains a > sign this is a one-sided test, and we are
interested in the right side.
Instructor’s Note
The sign in the alternative hypothesis points to the direction of the tail that we should use in
our test. Since in our example the alternative hypothesis has a ‘>’ sign it points to the right,
therefore we are interested in the right tail.
One-sided test (left side): Suppose we want to test if the population mean is less than 2%.
The null and alternative hypothesis can be expressed as:
H0: µ ≥ 2%
Ha: µ < 2%
Since the alternative hypothesis contains a < sign this is a one-sided test, and we are
interested in the left side.
2.3 Selecting the Appropriate Hypotheses
The easiest approach is to specify the alternative hypothesis first and then specify the null.
Using a ‘<’ or ‘>’ sign in the alternative hypothesis instead of a ‘≠’ sign reflects the belief of
the researcher more strongly. However, a researcher may sometimes select a two-sided test
to depict neutrality in his beliefs.


3. Identify the Appropriate Test Statistic


3.1 Test Statistics
A test statistic is calculated from sample data and is compared to a critical value to decide
whether or not we can reject the null hypothesis. The test statistic that should be used
depends on what we are testing. For example, the test statistic for the test of a population
mean is calculated as:
$$\text{test statistic} = \frac{\text{sample statistic} - \text{value of the parameter under } H_0}{\text{standard error of the sample statistic}} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$$

Continuing with our Asian stocks example, suppose we want to test if the population mean is
greater than a particular hypothesized value. We draw 36 observations and get a sample
mean of 4. We are also told that the standard deviation of the population is 4. If the
hypothesized value of the population mean is 2, the test statistic is calculated as:
$$\text{test statistic} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{4 - 2}{4 / \sqrt{36}} = 3$$

However, if the hypothesized value of the population mean is 0, then the test statistic is
calculated as:
$$\text{test statistic} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{4 - 0}{4 / \sqrt{36}} = 6$$
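The two calculations above can be reproduced with a short Python sketch. The function name `z_statistic` and its parameter names are illustrative assumptions, not something defined in the curriculum; the inputs are the figures from the Asian stocks example.

```python
import math


def z_statistic(sample_mean, hypothesized_mean, population_sd, n):
    """Return (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    standard_error = population_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


# Asian stocks example: n = 36, sample mean = 4, population standard deviation = 4.
print(round(z_statistic(4, 2, 4, 36), 2))  # hypothesized mean of 2
print(round(z_statistic(4, 0, 4, 36), 2))  # hypothesized mean of 0
```

The same sample gives a larger test statistic when the hypothesized mean is further from the sample mean.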

3.2 Identifying the Distribution of the Test Statistic


Exhibit 4 from the curriculum shows which test-statistics should be used depending on what
we want to test and their corresponding distributions.
Instructor’s Note: You will understand this table better, after you finish reading the
remaining sections.
What We Want to Test | Test Statistic | Probability Distribution of the Statistic | Degrees of Freedom
Test of a single mean | $t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$ | t-distributed | n − 1
Test of the difference in means | $t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}}$ | t-distributed | n1 + n2 − 2
Test of the mean of differences | $t = \frac{\bar{d} - \mu_{d0}}{s_{\bar{d}}}$ | t-distributed | n − 1
Test of a single variance | $\chi^2 = \frac{s^2 (n - 1)}{\sigma_0^2}$ | Chi-square distributed | n − 1
Test of the difference in variances | $F = \frac{s_1^2}{s_2^2}$ | F-distributed | n1 − 1, n2 − 1
Test of a correlation | $t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$ | t-distributed | n − 2
Test of independence (categorical data) | $\chi^2 = \sum_{i=1}^{m} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$ | Chi-square distributed | (r − 1)(c − 1)


4. Specify the Level of Significance


In reaching a statistical decision, we can make two possible errors:
• Type I error: We may reject a true null hypothesis.
• Type II error: We may fail to reject a false null hypothesis.
The following table shows the possible outcomes of a test.
Decision | True condition: H0 true | True condition: H0 false
Do not reject H0 | Correct decision | Type II error
Reject H0 (accept Ha) | Type I error | Correct decision
The probability of a Type I error is also known as the level of significance of a test and is
denoted by ‘α’. A related term, confidence level is calculated as (1 - α ). For example, a level
of significance of 5% for a test means that there is a 5% probability of rejecting a true null
hypothesis and corresponds to the 95% confidence level.
Controlling the two types of errors involves a trade-off. If we decrease the probability of a
Type I error by specifying a smaller significance level (e.g., 1% instead of 5%), we
increase the probability of a Type II error. The only way to reduce both types of error
simultaneously is by increasing the sample size, n.
The probability of a Type II error is denoted by ‘β’. The power of a test is calculated as (1 - β).
It represents the probability of correctly rejecting the null when it is false.
The different probabilities associated with the hypothesis testing decisions are presented in
the table below:

Decision | True condition: H0 true | True condition: H0 false
Do not reject H0 | Confidence level (1 - α) | β
Reject H0 (accept Ha) | Level of significance (α) | Power of the test (1 - β)
The most commonly used levels of significance are: 10%, 5% and 1%.
5. State the Decision Rule
A decision rule involves determining the critical values based on the level of significance;
and comparing the test statistic with the critical values to decide whether to reject or not
reject the null hypothesis. When we reject the null hypothesis, the result is said to be
statistically significant.
5.1 Determining Critical Values
One-tailed test:
Continuing with our Asian stocks example, suppose we want to test if the population mean is
greater than 2%. Say we want to test our hypothesis at the 5% significance level. This is a
one-tailed test and we are only interested in the right tail of the distribution. (If we were
trying to assess whether the population mean is less than 2%, we would be interested in the
left tail.)
The critical value is also known as the rejection point for the test statistic. Graphically, this
point separates the acceptance and rejection regions for a set of values of the test statistic.
This is shown below:

The region to the left of the critical value is the ‘acceptance region’. This represents the set of
values for which we do not reject (accept) the null hypothesis. The region to the right of the
critical value is known as the ‘rejection region’.
Using the z-table and a 5% level of significance, the critical value is z0.05 = 1.645, which these notes round to 1.65.
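As a cross-check on the z-table, the rejection points can be computed from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the two helper names are illustrative assumptions. The second helper splits the significance level across both tails, which is what a two-sided test requires.

```python
from statistics import NormalDist


def one_tailed_critical_value(alpha):
    """Right-tail rejection point that leaves probability alpha in the tail."""
    return NormalDist().inv_cdf(1 - alpha)


def two_tailed_critical_values(alpha):
    """Pair of rejection points with alpha / 2 in each tail."""
    upper = NormalDist().inv_cdf(1 - alpha / 2)
    return -upper, upper


print(round(one_tailed_critical_value(0.05), 3))
print(tuple(round(v, 2) for v in two_tailed_critical_values(0.05)))
```

For a left-tailed test the rejection point is simply the negative of the right-tail value.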


Two-tailed test:
In a two-tailed test, two critical values exist, one positive and one negative. For a two-sided
test at the 5% level of significance, we split the level of significance equally between the left
and right tails, i.e., 0.05 / 2 = 0.025 in each tail.
This corresponds to rejection points of +1.96 and -1.96. Therefore, we reject the null
hypothesis if we find that the test statistic is less than -1.96 or greater than +1.96. We fail to
reject the null hypothesis if -1.96 ≤ test statistic ≤ +1.96. Graphically, this can be shown as:

5.2 Decision Rules and Confidence Intervals


The above figure also illustrates the relationship between confidence intervals and
hypothesis tests. A confidence interval specifies the range of values that may contain the
hypothesized value of the population parameter. The 5% level of significance in the
hypothesis tests corresponds to a 95% confidence interval. When the hypothesized value of
the population parameter is outside the corresponding confidence interval, the null
hypothesis is rejected. When the hypothesized value of the population parameter is inside
the corresponding confidence interval, the null hypothesis is not rejected.
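The link between the two approaches can be sketched in Python using the Asian stocks figures from Section 3.1 (sample mean 4, population standard deviation 4, n = 36). The function names are illustrative assumptions, and the sketch covers only the two-sided z-test with a known population standard deviation.

```python
import math
from statistics import NormalDist


def confidence_interval(sample_mean, population_sd, n, alpha):
    """Two-sided (1 - alpha) confidence interval for the population mean."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * population_sd / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


def reject_null(sample_mean, hypothesized_mean, population_sd, n, alpha):
    """Reject H0 when the hypothesized mean falls outside the interval."""
    lower, upper = confidence_interval(sample_mean, population_sd, n, alpha)
    return not (lower <= hypothesized_mean <= upper)


lower, upper = confidence_interval(4, 4, 36, 0.05)
print(round(lower, 2), round(upper, 2))
print(reject_null(4, 2, 4, 36, 0.05))  # hypothesized mean of 2
print(reject_null(4, 3, 4, 36, 0.05))  # hypothesized mean of 3 (hypothetical)
```

A hypothesized mean of 2 lies outside the interval, which agrees with the test statistic of 3 exceeding the two-tailed rejection point of 1.96.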
5.3 Collect the Data and Calculate the Test Statistic
In this step we first ensure that the sampling procedure does not include biases, such as
sample selection or time bias. Then, we cleanse the data by removing inaccuracies and other
measurement errors in the data. Once we are convinced that the sample data is unbiased and
accurate, we use it to calculate the appropriate test statistic.
6. Make a Decision
A statistical decision simply consists of rejecting or not rejecting the null hypothesis. If the
test statistic lies in the rejection region, we will reject H0. On the other hand, if the test
statistic lies in the acceptance region, then we cannot reject H0.


An economic decision takes into consideration all economic issues relevant to the decision,
such as transaction costs, risk tolerance, and the impact on the existing portfolio. Sometimes
a test may indicate a result that is statistically significant, but it may not be economically
significant.
7. The Role of p-Values
The p-value is the smallest level of significance at which the null hypothesis can be rejected.
It can be used in the hypothesis testing framework as an alternative to using rejection points.
• If the p-value is lower than our specified level of significance, we reject the null
hypothesis.
• If the p-value is greater than our specified level of significance, we do not reject the
null hypothesis.
For example, if the p-value of a test is 4%, then the hypothesis can be rejected at the 5% level
of significance, but not at the 1% level of significance.
Relationship between test-statistic and p-value
A more extreme test statistic (further into the rejection tail) implies a lower p-value.
A less extreme test statistic implies a higher p-value.
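For a z-test, the p-value can be read directly off the standard normal distribution. The sketch below is a minimal illustration; the function name `z_p_value` and the `tail` argument are assumptions made for this example.

```python
from statistics import NormalDist


def z_p_value(z, tail):
    """p-value of a z-statistic for a 'right', 'left', or 'two' tailed test."""
    cdf = NormalDist().cdf
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two'")


# Asian stocks example from Section 3.1: z = 3 in a right-tailed test.
p = z_p_value(3.0, "right")
print(round(p, 5))
print(p < 0.05)  # reject H0 at the 5% level when the p-value is below alpha
```

A test statistic of 3 sits far into the right tail, so the p-value is well below both the 5% and 1% levels of significance.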
8. Multiple Tests and Interpreting Significance
A Type I error represents a false positive result – rejecting the null when it is true. When
multiple hypothesis tests are run on the same population, some tests will give false positive
results. The expected portion of false positives is called the false discovery rate (FDR). For
example, if you run 100 tests at a 5% level of significance and every null hypothesis is in fact
true, you will get five false positives on average. This issue is called the multiple testing problem.
To overcome this issue, the false discovery approach is used to adjust the p-values when
you run multiple tests. The researcher first ranks the p-values from the various tests from
lowest to highest. He then makes the following comparison, starting with the lowest p-value
(with k = 1), p(1):
$$p(k) \le \alpha \times \frac{k}{\text{Number of tests}}$$
where k is the rank of the p-value.

This comparison is repeated until we find the highest ranked p(k) for which this condition
holds. For example, say we perform this check for k=1, k=2, k=3, and k=4; and we find that
the condition holds true for k=4. Then we can say that the first four tests ranked on the basis
of the lowest p-values are significant.
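The ranking procedure can be sketched in Python as follows. The function name and the six p-values are hypothetical, chosen only so that the condition holds up to k = 4, as in the illustration above.

```python
def count_significant_tests(p_values, alpha):
    """Return the highest rank k for which p(k) <= alpha * k / number of tests."""
    ranked = sorted(p_values)
    number_of_tests = len(ranked)
    highest_rank = 0
    for k, p in enumerate(ranked, start=1):
        if p <= alpha * k / number_of_tests:
            highest_rank = k
    return highest_rank


# Hypothetical p-values from six tests, checked at a 5% level of significance.
p_values = [0.03, 0.001, 0.5, 0.012, 0.2, 0.02]
print(count_significant_tests(p_values, 0.05))
```

The tests with the four lowest p-values are treated as significant. A plain comparison of each p-value with 5% would flag the same four here, but the adjusted cut-offs are stricter for the lower ranks.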
9. Tests Concerning a Single Mean
One of the decisions we need to make in hypothesis testing is deciding which test statistic
and which corresponding probability distribution to use. We use the following table to make
this decision:

Sampling from | Population variance | Small sample size (n<30) | Large sample size (n≥30)
Normal distribution | Variance known | z | z
Normal distribution | Variance unknown | t | t (or z)
Non-normal distribution | Variance known | NA | z
Non-normal distribution | Variance unknown | NA | t (or z)
If the population variance is known and our sample size is large, we can use the z-statistic
and z-distribution to compute the critical value. However, if we do not know the population
variance and we have a small sample size, then we have to use the t-statistic and t-
distribution to compute the critical values.
Example
An analyst believes that the average return on all Asian stocks was less than 2%. The sample
size is 36 observations with a sample mean of -3. The standard deviation of the population is
4. Will he reject the null hypothesis at the 5% level of significance?
Solution:
In this case, our null and alternative hypotheses are:
H0: µ ≥ 2
Ha: µ < 2
The standard error of the sample mean is:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{36}} = 0.67$$
The test statistic is:
$$\text{test statistic} = \frac{\text{sample statistic} - \text{value of the parameter under } H_0}{\text{standard error of the sample statistic}} = \frac{-3 - 2}{0.67} = -7.5$$
The critical value corresponding to a 5% level of significance is -1.65.
When we consider the left tail of the distribution, our decision rule is as follows: Reject the
null hypothesis if the test statistic is less than the critical value and vice versa. Since our
calculated test statistic of -7.5 is less than the critical value of -1.65, we reject the null
hypothesis.

Example
Fund Alpha has been in existence for 20 months and has achieved a mean monthly return of
2% with a sample standard deviation of 5%. The expected monthly return for a fund of this
nature is 1.60%. Assuming monthly returns are normally distributed, are the actual results
consistent with an underlying population mean monthly return of 1.60%?
Solution:
The null and alternative hypotheses for this example will be:
H0: µ = 1.60 versus Ha: µ ≠ 1.60

$$\text{test statistic} = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} = \frac{2 - 1.60}{5 / \sqrt{20}} = 0.36$$
Using this formula, we see that the value of the test statistic is 0.36.
The critical values at a 0.05 level of significance can be found in the t-distribution
table. Since this is a two-tailed test, we should look at a tail probability of 0.05/2 = 0.025
with df = n - 1 = 20 – 1 = 19. This gives us two values of -2.093 and +2.093 (approximately ±2.1).
Since our test statistic of 0.36 lies between -2.1 and +2.1, i.e., the acceptance region, we do
not reject the null hypothesis.
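The Fund Alpha test can be reproduced with the sketch below. The critical value is not computed; it is the t-table value for df = 19 and a tail probability of 0.025, entered as a constant. The function and constant names are illustrative assumptions.

```python
import math

# Two-tailed rejection point from the t-table: df = 19, 0.025 in each tail.
T_CRITICAL_DF19 = 2.093


def t_statistic(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))


# Fund Alpha: 20 months, mean monthly return 2%, sample standard deviation 5%.
t = t_statistic(2.0, 1.60, 5.0, 20)
reject = abs(t) > T_CRITICAL_DF19
print(round(t, 2), reject)
```

Because the sample standard deviation is large relative to the 0.40 percentage point gap between the sample mean and the hypothesized mean, the statistic stays well inside the acceptance region.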
10. Test Concerning Differences Between Means with Independent
Samples
Instructor’s Note:
Focus on the basics of this topic, the probability of being tested on the details is low.
In this section, we will learn how to test the difference between the means of two
independent and normally distributed populations. We perform this test by drawing a
sample from each group. If it is reasonable to believe that the populations are normally
distributed and that the samples are independent of each other, we can proceed with the test.
We may assume that the population variances are either equal or unequal. However, the
curriculum focuses on tests under the assumption that the population variances are equal.
The test statistic is calculated as:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\left(\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}\right)^{1/2}}$$

The term $s_p^2$ is known as the pooled estimator of the common variance. It is calculated by the
following formula:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
The number of degrees of freedom is n1 + n2 – 2.
Example
(This is based on Example 9 from the curriculum.)
An analyst wants to test if the returns for an index are different for two different time
periods. He gathers the following data:
                   | Period 1 | Period 2
Mean               | 0.01775% | 0.01134%
Standard deviation | 0.31580% | 0.38760%
Sample size        | 445 days | 859 days


Note that these periods are of different lengths and the samples are independent; that is,
there is no pairing of the days for the two periods.
Test whether there is a difference between the mean daily returns in Period 1 and in Period
2 using a 5% level of significance.

Solution:
The first step is to formulate the null and alternative hypotheses. Since we want to test
whether the two means were equal or different, we define the hypotheses as:
H0: µ1 - µ2 = 0
Ha: µ1 - µ2 ≠ 0
We then calculate the test statistic:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(445 - 1)(0.09973) + (859 - 1)(0.15023)}{445 + 859 - 2} = 0.1330$$
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\left(\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}\right)^{1/2}} = \frac{(0.01775 - 0.01134) - 0}{\left(\frac{0.1330}{445} + \frac{0.1330}{859}\right)^{1/2}} = 0.3009$$

For a 0.05 level of significance, we find the t-value for 0.05/2 = 0.025 using df = 445 + 859 -
2 = 1302. The critical t-values are ±1.962. Since our test statistic of 0.3009 lies in the
acceptance region, we fail to reject the null hypothesis.
We conclude that there is insufficient evidence to indicate that the returns are different for
the two time periods.
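The arithmetic in this example is easy to get wrong by hand, so a short Python sketch is given here as a check. The function names are illustrative assumptions; the inputs are the period means, standard deviations, and sample sizes from the table above.

```python
import math


def pooled_variance(sd_1, n_1, sd_2, n_2):
    """Pooled estimator of the common variance from two sample standard deviations."""
    return ((n_1 - 1) * sd_1**2 + (n_2 - 1) * sd_2**2) / (n_1 + n_2 - 2)


def pooled_t_statistic(mean_1, sd_1, n_1, mean_2, sd_2, n_2, hypothesized_diff=0.0):
    """t-statistic for the difference in means, assuming equal population variances."""
    sp2 = pooled_variance(sd_1, n_1, sd_2, n_2)
    standard_error = math.sqrt(sp2 / n_1 + sp2 / n_2)
    return ((mean_1 - mean_2) - hypothesized_diff) / standard_error


sp2 = pooled_variance(0.31580, 445, 0.38760, 859)
t = pooled_t_statistic(0.01775, 0.31580, 445, 0.01134, 0.38760, 859)
print(round(sp2, 4), round(t, 4))
print(abs(t) > 1.962)  # compare with the two-tailed critical value from the text
```

The difference in means is tiny compared with its standard error, so the statistic is nowhere near the rejection points of ±1.962.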
11. Test Concerning Differences Between Means with Dependent Samples
Instructor’s Note:
Focus on the basics of this topic, the probability of being tested on the details is low.
In the previous section, in order to perform hypothesis tests on differences between means
of two populations, we assumed that the samples were independent. What if the samples are
not independent? For example, suppose you want to conduct tests on the mean monthly
return on Toyota stock and mean monthly return on Honda stock. These two samples are
believed to be dependent, as they are impacted by the same economic factors.
In such situations, we conduct a t-test that is based on data arranged in paired
observations. Paired observations are observations that are dependent because they have
something in common.
We will now discuss the process for conducting such a t-test.
Example:
Suppose that we gather data regarding the mean monthly returns on stocks of Toyota and
Honda for the last 20 months, as shown in the table below:

Month | Mean return of Toyota stock | Mean monthly return of Honda stock | Difference in mean monthly returns (di)
1 | 0.5% | 0.4% | 0.1%
2 | 0.7% | 1.0% | -0.3%
3 | 0.3% | 0.7% | -0.4%
… | … | … | …
20 | 0.9% | 0.6% | 0.3%
Average | 0.750% | 0.600% | 0.075%
Here is a simplified process for conducting the hypothesis test:
Step 1: Define the null and alternate hypotheses
We believe that the mean difference is not 0. Hence the null and alternate hypotheses are:
H0 : µd = µd0 versus Ha : µd ≠ µd0
µd stands for the population mean difference and µd0 stands for the hypothesized value for
the population mean difference.
Step 2: Calculate the test-statistic
Determine the sample mean difference using:
$$\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i$$
For the data given, the sample mean difference is 0.075%.
Calculate the sample standard deviation. The process for calculating the sample standard
deviation has been discussed in an earlier reading. The simplest method is to plug the
numbers (0.1, -0.3, -0.4…0.3) into a financial calculator. The entire data set has not been
provided. We’ll take it as a given that the sample standard deviation is 0.150%.
Use this formula to calculate the standard error of the mean difference:
$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$
For our data this is 0.150 / √20 = 0.03354.
We now have the required data to calculate the test statistic using a t-test. This is calculated
using the following formula using n - 1 degrees of freedom:
$$t = \frac{\bar{d} - \mu_{d0}}{s_{\bar{d}}}$$
For our data, the test statistic is (0.075 – 0) / 0.03354 = 2.24.

Step 3: Determine the critical value based on the level of significance


We will use a 5% level of significance. Since this is a two-tailed test we have a probability of
2.5% (0.025) in each tail. This critical value is determined from a t-table using a one-tailed
probability of 0.025 and df = 20 – 1 = 19. This value is 2.093.
Step 4: Compare the test statistic with the critical value and make a decision
In our case, the test statistic (2.24) is greater than the critical value (2.093). Hence we will
reject the null hypothesis.
Conclusion: The data seems to indicate that the mean difference is not 0.
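Steps 2 to 4 can be sketched in Python from the summary figures alone, since the full data set is not provided. The function name is an illustrative assumption, and the critical value of 2.093 is the t-table value quoted in Step 3 rather than something the code computes.

```python
import math


def paired_t_statistic(mean_diff, sd_diff, n, hypothesized_diff=0.0):
    """Return (t-statistic, standard error) for a test of the mean of differences."""
    standard_error = sd_diff / math.sqrt(n)
    return (mean_diff - hypothesized_diff) / standard_error, standard_error


# Sample mean difference 0.075%, sample standard deviation 0.150%, 20 paired months.
t, standard_error = paired_t_statistic(0.075, 0.150, 20)
print(round(standard_error, 5), round(t, 2))
print(abs(t) > 2.093)  # two-tailed test at 5% with df = 19
```

With the individual monthly differences available, the mean and standard deviation could be computed first with `statistics.mean` and `statistics.stdev` and then passed to the same function.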
The hypothesis test presented above is based on the belief that the population mean
difference is not equal to 0. If µd0 is the hypothesized value for the population mean
difference, then we can formulate the following hypotheses:
1. If we believe the population mean difference is greater than µd0:
H0 : µd ≤ µd0 versus Ha : µd > µd0
2. If we believe the population mean difference is less than µd0:
H0 : µd ≥ µd0 versus Ha : µd < µd0
3. If we believe the population mean difference is not equal to µd0:
H0 : µd = µd0 versus Ha : µd ≠ µd0
12. Testing Concerning Tests of Variances (Chi-Square Test)
Instructor’s Note:
Focus on the basics of this topic, the probability of being tested on the details is low.
12.1 Tests of a Single Variance
In tests concerning the variance of a single normally distributed population, we use the chi-
square test statistic, denoted by χ2.
Properties of the chi-square distribution
The chi-square distribution is asymmetrical and like the t-distribution, is a family of
distributions. This means that a different distribution exists for each possible value of
degrees of freedom, n - 1. Since the variance is a squared term, the minimum value can only
be 0. Hence, the chi-square distribution is bounded below by 0. The graph below shows the
shape of a chi-square distribution:


There are three hypotheses that can be formulated ($\sigma^2$ represents the true population
variance and $\sigma_0^2$ represents the hypothesized variance):
1. H0: $\sigma^2 = \sigma_0^2$ versus Ha: $\sigma^2 \ne \sigma_0^2$. This is used when we believe the population
variance is different from the hypothesized variance. It is a two-tailed test.
2. H0: $\sigma^2 \ge \sigma_0^2$ versus Ha: $\sigma^2 < \sigma_0^2$. This is used when we believe the population
variance is less than the hypothesized variance. It is a one-tailed test.
3. H0: $\sigma^2 \le \sigma_0^2$ versus Ha: $\sigma^2 > \sigma_0^2$. This is used when we believe the population variance
is greater than the hypothesized variance. It is a one-tailed test.
After drawing a random sample from a normally distributed population, we calculate the
test statistic using the following formula using n - 1 degrees of freedom:
$$\chi^2 = \frac{(n - 1)s^2}{\sigma_0^2}$$
where:
n = sample size
s² = sample variance
We then determine the critical values using the level of significance and degrees of freedom.
The chi-square distribution table is used to calculate the critical value.
Example
Consider Fund Alpha which we discussed in an earlier example. This fund has been in
existence for 20 months. During this period the standard deviation of monthly returns was
5%. You want to test a claim by the fund manager that the standard deviation of monthly
returns is less than 6%.
Solution:
The null and alternate hypotheses are: H0: σ² ≥ 36 versus Ha: σ² < 36
Note that the hypothesized standard deviation is 6%. Since we are dealing with population
variance, we will square this number to arrive at a variance of 36 (in units of percent squared).
We then calculate the value of the chi-square test statistic:
χ² = (n - 1) s² / σ0² = 19 × 25 / 36 = 13.19
Next, we determine the rejection point based on df = 19 and significance = 0.05. Using the
chi-square table, we find that this number is 10.117.
Since the test statistic (13.19) is higher than the rejection point (10.117) we cannot reject H0.
In other words, the sample standard deviation is not small enough to validate the fund
manager’s claim that population standard deviation is less than 6%.
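A minimal Python sketch of this left-tailed test is shown below. The function name is an illustrative assumption, and the rejection point of 10.117 is the chi-square table value quoted above, entered as a constant.

```python
# Left-tail rejection point from the chi-square table: df = 19, alpha = 0.05.
CHI_SQUARE_CRITICAL_DF19 = 10.117


def chi_square_statistic(sample_variance, hypothesized_variance, n):
    """Return (n - 1) * s^2 / sigma_0^2."""
    return (n - 1) * sample_variance / hypothesized_variance


# Fund Alpha: sample standard deviation 5% (variance 25), hypothesized 6% (variance 36).
chi_square = chi_square_statistic(5.0**2, 6.0**2, 20)
reject = chi_square < CHI_SQUARE_CRITICAL_DF19  # left-tailed test
print(round(chi_square, 2), reject)
```

Note the direction of the comparison: because the alternative hypothesis uses a ‘<’ sign, the null is rejected only when the statistic falls below the rejection point.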


12.2 Test Concerning the Equality of Two Variances (F-Test)


In order to test the equality or inequality of two variances, we use an F-test which is the ratio
of sample variances.
The assumptions for a F-test to be valid are:
• The samples must be independent.
• The populations from which the samples are taken are normally distributed.
Properties of the F-distribution
The F-distribution, like the chi-square distribution, is a family of asymmetrical distributions
bounded from below by 0. Each F-distribution is defined by two values of degrees of
freedom, called the numerator and denominator degrees of freedom. As shown in the figure
below, the F-distribution is skewed to the right and is truncated at zero on the left hand side.

The rejection region is always in the right-side tail of the distribution.


When working with F-tests, there are three hypotheses that can be formulated:
1. H0: $\sigma_1^2 = \sigma_2^2$ versus Ha: $\sigma_1^2 \ne \sigma_2^2$. This is used when we believe the two population
variances are not equal.
2. H0: $\sigma_1^2 \le \sigma_2^2$ versus Ha: $\sigma_1^2 > \sigma_2^2$. This is used when we believe the variance of the first
population is greater than the variance of the second population.
3. H0: $\sigma_1^2 \ge \sigma_2^2$ versus Ha: $\sigma_1^2 < \sigma_2^2$. This is used when we believe the variance of the first
population is less than the variance of the second population.
The term $\sigma_1^2$ represents the population variance of the first population and $\sigma_2^2$ represents
the population variance of the second population.
The formula for the test statistic of the F-test is:
$$F = \frac{s_1^2}{s_2^2}$$
where:
$s_1^2$ = the sample variance of the first sample, with $n_1$ observations
$s_2^2$ = the sample variance of the second sample, with $n_2$ observations


A convention is to put the larger sample variance in the numerator and the smaller sample
variance in the denominator.
df1 = n1 – 1 numerator degrees of freedom
df2 = n2 – 1 denominator degrees of freedom
The test statistic is then compared with the critical values found using the two degrees of
freedom and the F-tables.
Finally, a decision is made whether to reject or not to reject the null hypothesis.
Example
You are investigating whether the population variance of the Indian equity market changed
after the deregulation of 1991. You collect 120 months of data before and after deregulation.
Variance of returns before deregulation was 13. Variance of returns after deregulation was
18. Check your hypothesis at a confidence level of 99%.
Solution:
Null and alternative hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 \ne \sigma_2^2$
F-statistic: $F = 18/13 = 1.38$ (the larger sample variance goes in the numerator)
df = 119 for both the numerator and the denominator
α = 0.01, which means 0.005 in each tail. From the F-table: critical value = 1.6
Since the F-statistic (1.38) is less than the critical value (1.6), we do not reject the null hypothesis.
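The same comparison can be sketched in a few lines of code. The function name is hypothetical, and the critical value of 1.6 is taken from the F-table value quoted in the solution rather than computed.

```python
def f_statistic(variance_a, variance_b):
    """F-statistic using the convention: larger sample variance on top."""
    larger = max(variance_a, variance_b)
    smaller = min(variance_a, variance_b)
    return larger / smaller


variance_before = 13   # variance of returns before deregulation
variance_after = 18    # variance of returns after deregulation
critical_value = 1.6   # from the F-table: df = (119, 119), 0.005 in the upper tail

f_stat = f_statistic(variance_before, variance_after)

# With the larger variance in the numerator, only the right tail is checked.
reject_h0 = f_stat > critical_value

print(round(f_stat, 2), reject_h0)
```

Putting the larger variance in the numerator guarantees F ≥ 1, which is why only the upper critical value is needed even though the test is two-sided.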
13. Parametric vs. Nonparametric Tests
The hypothesis-testing procedures we have discussed so far have two characteristics in
common:
• They are concerned with parameters, such as the mean and variance.
• Their validity depends on a set of assumptions.
Any procedure which has either of the two characteristics is known as a parametric test.
Nonparametric tests are not concerned with a parameter and/or make few assumptions
about the population from which the samples are drawn. We use nonparametric procedures
in four situations:
• Data do not meet distributional assumptions.
• Data have outliers.
• Data are given in ranks. (Example: relative size of the company and use of
derivatives.)
• The hypothesis does not concern a parameter. (Example: Is a sample random or not?)
14. Tests Concerning Correlation
The strength of the linear relationship between two variables is assessed through the
correlation coefficient. The significance of a correlation coefficient is tested by using
hypothesis tests concerning correlation.
There are two hypotheses that can be formulated (ρ represents the population correlation
coefficient):
• H0 : ρ = 0
• Ha : ρ ≠ 0
This test is used when we believe the population correlation is not equal to 0, or it is
different from the hypothesized correlation. It is a two-tailed test.
14.1 Parametric Test of a Correlation
As long as the two variables are normally distributed, we can use the sample correlation, r,
for our hypothesis testing. The formula for the t-test is
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
where: n – 2 = degrees of freedom if H0 is true.
The magnitude of r needed to reject the null hypothesis H0: ρ= 0 decreases as sample size n
increases due to the following:
i. As n increases, the number of degrees of freedom increases and the absolute value of
the critical value tc decreases.
ii. As n increases, the absolute value of the numerator increases, leading to larger-
magnitude t-values.
In other words, as n increases, the probability of Type-II error decreases, all else equal.
Example
The sample correlation between the oil prices and monthly returns of energy stocks in a
Country A is 0.7986 for the period from January 2014 through December 2018. Can we
reject a null hypothesis that the underlying or population correlation equals 0 at the 0.05
level of significance?
Solution:
H0 : ρ = 0 → true correlation in the population is 0.
Ha : ρ ≠ 0 → correlation in the population is different from 0.
From January 2014 through December 2018, there are 60 months, so n = 60. We use the
following statistic to test the above.
$$t = \frac{0.7986\sqrt{60-2}}{\sqrt{1-0.7986^2}} = \frac{6.0820}{0.6019} = 10.1052$$


At the 0.05 significance level, the critical value for this test statistic is 2.00 (n = 60, degrees of
freedom = 58). When the test statistic is either larger than 2.00 or smaller than −2.00, we can
reject the hypothesis that the correlation in the population is 0. The test statistic is 10.1052,
so we can reject the null hypothesis.
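As a rough check of this example, the sketch below recomputes the t-statistic and also solves the same formula for the smallest |r| that would be significant at a given critical value. Both function names are our own, and the critical value of 2.00 is the table value used in the solution, not something the code derives.

```python
import math


def correlation_t_stat(r, n):
    """t-statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def min_significant_r(t_critical, n):
    """Smallest |r| that rejects H0, from solving the t formula for r."""
    return t_critical / math.sqrt(t_critical ** 2 + n - 2)


r, n = 0.7986, 60
t_critical = 2.00  # two-tailed table value for df = 58 at the 0.05 level

t_stat = correlation_t_stat(r, n)
reject_h0 = abs(t_stat) > t_critical
threshold_r = min_significant_r(t_critical, n)

print(round(t_stat, 4), reject_h0, round(threshold_r, 3))
```

With 60 observations, any sample correlation larger in magnitude than about 0.25 would already be significant at this critical value, which illustrates why the required magnitude of r falls as n grows.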
14.2 Tests Concerning Correlation: The Spearman Rank Correlation Coefficient
The Spearman rank correlation coefficient is equivalent to the usual correlation coefficient
but is calculated on the ranks of two variables within their respective samples.
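A minimal sketch of this idea follows: rank each variable, then apply the usual correlation formula to the ranks. The helper names and the sample data are hypothetical, and tied values are given their average rank, which is a common convention but an assumption here.

```python
import math


def average_ranks(values):
    """Rank values from 1 (smallest); ties receive the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Usual (Pearson) correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data for illustration only.
x = [10, 20, 30, 40, 50]
y = [3, 1, 4, 2, 5]
r_s = spearman(x, y)
print(r_s)
```

Because only the ranks enter the calculation, a single extreme observation affects the result far less than it would affect the ordinary correlation coefficient.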
15. Test of Independence Using Contingency Table Data
A chi-square distributed test statistic is used to test for independence of two categorical
variables. This nonparametric test compares actual frequencies with those expected on the
basis of independence.
The test statistic is calculated as:
$$\chi^2 = \sum_{i=1}^{m} \frac{(O_{ij} - E_{ij})^2}{E_{ij}},$$
where:
$$E_{ij} = \frac{(\text{Total row } i) \times (\text{Total column } j)}{\text{Overall total}}.$$

This test statistic has degrees of freedom of (r − 1)(c − 1), where r is the number of
categories for the first variable and c is the number of categories of the second variable.
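To make the mechanics concrete, here is a minimal sketch that builds the expected frequencies from the row and column totals and sums the contributions of every cell. The function name and the 2 × 2 table of counts are hypothetical; the degrees of freedom are computed as (r − 1)(c − 1).

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `observed` is a list of rows, each a list of observed frequencies.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            # Expected frequency under independence.
            e_ij = row_totals[i] * col_totals[j] / overall_total
            stat += (o_ij - e_ij) ** 2 / e_ij

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


# Hypothetical counts for two categorical variables with two categories each.
observed = [
    [20, 30],
    [30, 20],
]
chi_sq, df = chi_square_independence(observed)
print(chi_sq, df)
```

The statistic would then be compared with the right-tail chi-square table value for the computed degrees of freedom; large values indicate that the observed counts are far from what independence would imply.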


Summary
LO.a: Define a hypothesis, describe the steps of hypothesis testing, and describe and
interpret the choice of the null and alternative hypotheses.
A hypothesis is a statement about the value of a population parameter developed for the
purpose of testing a theory.
In order to test a hypothesis, we follow these steps:
1. State the hypothesis.
2. Identify the appropriate test statistic and its probability distribution.
3. Specify the significance level.
4. State the decision rule.
5. Collect data and calculate the test statistic.
6. Make a decision.
The null hypothesis (H0) is the hypothesis that the researcher wants to reject. It should
always include some form of the ‘equal to’ condition.
The alternative hypothesis (Ha) is the hypothesis that the researcher wants to prove. If the
null hypothesis is rejected then the alternative hypothesis is considered valid.
LO.b: Compare and contrast one-tailed and two-tailed tests of hypotheses.
In one-tailed tests, we are assessing if the value of a population parameter is greater than or
less than a hypothesized value.
In two-tailed tests, we are assessing if the value of a population parameter is different from a
hypothesized value.
LO.c: Explain a test statistic, Type I and Type II errors, a significance level, how
significance levels are used in hypothesis testing, and the power of a test.
A test statistic is a quantity, calculated on the basis of a sample, and is used to decide
whether to reject or not to reject the null hypothesis. The formula for computing the test
statistic is:
$$\text{test statistic} = \frac{\text{sample statistic} - \text{value of the parameter under } H_0}{\text{standard error of the sample statistic}}$$
In reaching a statistical decision, we can make two possible errors: We may reject a true null
hypothesis (a Type I error), or we may fail to reject a false null hypothesis (a Type II error).
The level of significance of a test is the probability of a Type I error. As α gets smaller the
critical value gets larger and it becomes more difficult to reject the null hypothesis.
The power of a test is the probability of correctly rejecting the null (rejecting the null when it
is false). It is expressed as:
Power of a test = 1 – P (Type II error)


LO.d: Explain a decision rule and the relation between confidence intervals and
hypothesis tests, and determine whether a statistically significant result is also
economically meaningful.
A decision rule consists of comparing the computed test statistic to the critical values
(rejection points) based on the level of significance to decide whether to reject or not to
reject the null hypothesis.
A confidence interval gives us the range of values within which a population parameter is
expected to lie. Confidence intervals and hypothesis tests are linked through critical values.
In a two-tailed test, the null hypothesis is rejected only if the hypothesized value of the
parameter lies outside the corresponding confidence interval, which is equivalent to the test
statistic lying beyond the critical values.
The statistical decision consists of rejecting or not rejecting the null hypothesis. The
economic decision takes into consideration all economic issues relevant to the decision.
LO.e: Explain and interpret the p-value as it relates to hypothesis testing.
The p-value is the smallest level of significance at which the null hypothesis can be rejected.
It can be used in the hypothesis testing framework as an alternative to using rejection points.
• If the p-value is lower than our specified level of significance, we reject the null
hypothesis.
• If the p-value is greater than our specified level of significance, we do not reject the
null hypothesis.
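The decision rule above can be illustrated with a small sketch. It assumes a right-tailed z-test and a hypothetical test statistic of 2.05; the function names are our own, and the standard normal distribution comes from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist


def right_tail_p_value(z):
    """p-value for a right-tailed z-test: P(Z > z) under H0."""
    return 1.0 - NormalDist().cdf(z)


def reject_null(p_value, alpha):
    """Reject H0 when the p-value is lower than the significance level."""
    return p_value < alpha


z = 2.05  # hypothetical test statistic, for illustration only
p_value = right_tail_p_value(z)

print(round(p_value, 4))
print(reject_null(p_value, 0.05))  # compare with a 5% significance level
print(reject_null(p_value, 0.01))  # compare with a 1% significance level
```

The same p-value leads to rejection at the 5% level but not at the 1% level, which is what "the smallest level of significance at which the null hypothesis can be rejected" means in practice.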
LO.f: Describe how to interpret the significance of a test in the context of multiple
tests.
The false discovery approach is used to adjust the p-values when you run multiple tests. The
researcher first ranks the p-values from the various tests from lowest to highest. He then
makes the following comparison, starting with the lowest p-value (with k = 1), p(1):
$$p(1) \le \alpha \times \frac{\text{Rank of } i}{\text{Number of tests}}$$
This comparison is repeated until we find the highest ranked p(k) for which this condition
holds. If, say, k is 4, then the first four tests (ranked on the basis of the lowest p-values) are
said to be significant.
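The ranking and comparison steps can be sketched as follows. The function name and the five p-values are hypothetical; the logic follows the description above, keeping every test up to the highest rank k whose p-value satisfies the condition.

```python
def false_discovery_significant(p_values, alpha):
    """Return the indices of tests judged significant under the ranked rule.

    Each ranked p-value p(k) is compared with alpha * k / (number of tests);
    all tests up to the highest rank that passes are treated as significant.
    """
    number_of_tests = len(p_values)
    order = sorted(range(number_of_tests), key=lambda i: p_values[i])

    highest_passing_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / number_of_tests:
            highest_passing_rank = rank

    return sorted(order[:highest_passing_rank])


# Hypothetical p-values from five separate tests.
p_values = [0.001, 0.045, 0.012, 0.30, 0.02]
significant = false_discovery_significant(p_values, alpha=0.05)
print(significant)
```

Note that the test with p = 0.045 would be significant on its own at the 5% level, yet it is not flagged here because its adjusted threshold at rank 4 is 0.05 × 4/5 = 0.04.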
LO.g: Identify the appropriate test statistic and interpret the results for a hypothesis
test concerning the population mean of both large and small samples when the
population is normally or approximately normally distributed and the variance is (1)
known or (2) unknown.
We use the following table to decide which test statistic and which corresponding
probability distribution to use for hypothesis testing.


Sampling from              Variance            Small sample size (n < 30)   Large sample size (n ≥ 30)
Normal distribution        Variance known      z                            z
Normal distribution        Variance unknown    t                            t (or z)
Non-normal distribution    Variance known      NA                           z
Non-normal distribution    Variance unknown    NA                           t (or z)
LO.h: Identify the appropriate test statistic and interpret the results for a hypothesis
test concerning the equality of the population means of two at least approximately
normally distributed populations based on independent random samples with equal
assumed variances.
When we can assume that the two populations are normally distributed and that the
unknown population variances are equal, the t-test based on independent random samples
is given by:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\left(\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}\right)^{1/2}}$$
The number of degrees of freedom is n1 + n2 – 2. The term $s_p^2$ is known as the pooled
estimator of the common variance. A pooled estimate is an estimate drawn from the
combination of two different samples. It is calculated as:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
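A minimal sketch of these two formulas is given below. The function names and all input values are hypothetical and chosen only to keep the arithmetic easy to follow; the hypothesized difference in means defaults to zero.

```python
import math


def pooled_variance(n1, var1, n2, var2):
    """Pooled estimator of the common variance from two samples."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def pooled_t_stat(mean1, mean2, n1, var1, n2, var2, hypothesized_diff=0.0):
    """Return (t-statistic, degrees of freedom) for the equal-variance t-test."""
    sp2 = pooled_variance(n1, var1, n2, var2)
    standard_error = math.sqrt(sp2 / n1 + sp2 / n2)
    t_stat = ((mean1 - mean2) - hypothesized_diff) / standard_error
    return t_stat, n1 + n2 - 2


# Hypothetical sample summaries (means and variances of two samples).
sp2 = pooled_variance(10, 16.0, 10, 20.0)
t_stat, df = pooled_t_stat(12.0, 10.0, 10, 16.0, 10, 20.0)
print(sp2, round(t_stat, 4), df)
```

The resulting statistic would be compared with the t-table value for n1 + n2 – 2 degrees of freedom at the chosen significance level.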
LO.i: Identify the appropriate test statistic and interpret the results for a hypothesis
test concerning the mean difference of two normally distributed populations.
In cases where we have a test concerning the mean difference of two normally distributed
populations that are dependent, we conduct a t-test that is based on data arranged in paired
observations.
The hypothesis is formed on the difference between means of two populations e.g. H0: µd =
µd0 versus Ha: µd ≠ µd0
In order to arrive at the test statistic, we first determine the sample mean difference using:
$$\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$$
The standard error of the mean difference is computed as follows:
$$s_{\bar{d}} = \frac{s_d}{\sqrt{n}}$$
Once we have these two values, we can calculate the test statistic using a t-test. This is
calculated using the following formula using n - 1 degrees of freedom:
$$t = \frac{\bar{d} - \mu_{d0}}{s_{\bar{d}}}$$
The value of calculated test statistic is compared with the t-distribution values in the usual
manner to arrive at a decision on our hypothesis.
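The paired-comparisons calculation can be sketched in a few lines. The function name and the list of paired differences are hypothetical; the sample standard deviation of the differences uses the usual n − 1 divisor.

```python
import math


def paired_t_stat(differences, hypothesized_mean_diff=0.0):
    """Return (t-statistic, degrees of freedom) for a paired comparisons test."""
    n = len(differences)
    d_bar = sum(differences) / n
    # Sample standard deviation of the differences (n - 1 in the denominator).
    s_d = math.sqrt(sum((d - d_bar) ** 2 for d in differences) / (n - 1))
    standard_error = s_d / math.sqrt(n)
    return (d_bar - hypothesized_mean_diff) / standard_error, n - 1


# Hypothetical differences between paired observations.
differences = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, df = paired_t_stat(differences)
print(round(t_stat, 4), df)
```

Working with the differences directly is what accounts for the dependence between the two samples: each pair contributes a single observation to the test.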
LO.j: Identify the appropriate test statistic and interpret the results for a hypothesis
test concerning (1) the variance of a normally distributed population and (2) the
equality of the variances of two normally distributed populations based on two
independent random samples.
In tests concerning the variance of a single normally distributed population, we use the
chi-square test statistic, denoted by $\chi^2$. After drawing a random sample from a normally
distributed population, we calculate the test statistic using the following formula using n - 1
degrees of freedom:
$$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$$
We then determine the critical values using the level of significance and degrees of freedom.
The chi-square distribution table is used to calculate the critical value.
In order to test the equality or inequality of two variances, we use an F-test. The test
statistic is computed as:
$$F = \frac{s_1^2}{s_2^2}$$
The test statistic is then compared with the critical values found using the two degrees of
freedom and the F-tables. Finally, a decision is made whether to reject or not to reject the
null hypothesis.
LO.k: Compare and contrast parametric and nonparametric tests, and describe
situations where each is the more appropriate type of test.
A parametric test is a hypothesis test concerning a parameter or a hypothesis test based on
specific distributional assumptions. In contrast, a nonparametric test is either not concerned
with a parameter or makes minimal assumptions about the population from which the
sample is drawn.
A nonparametric test is primarily used in three situations: when data do not meet
distributional assumptions, when data is given in ranks, or when the hypothesis we are
addressing does not concern a parameter.
LO.l: Explain parametric and nonparametric tests of the hypothesis that the
population correlation coefficient equals zero, and determine whether the hypothesis
is rejected at a given level of significance.


Parametric test: The significance of a correlation coefficient is tested by using hypothesis
tests concerning correlation. The formula for the t-test is:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$
where: n – 2 = degrees of freedom
Non-parametric test: The Spearman rank correlation coefficient is equivalent to the usual
correlation coefficient but is calculated on the ranks of two variables within their respective
samples.
LO.m: Explain tests of independence based on contingency table data.
A chi-square distributed test statistic is used to test for independence of two categorical
variables. This nonparametric test compares actual frequencies with those expected on the
basis of independence.
This test statistic has degrees of freedom of (r − 1)(c − 1), where r is the number of
categories for the first variable and c is the number of categories of the second variable.


Practice Questions
1. David Jones is a researcher and wants to test if the mean return of his bond investments
is more than 5% per year. In this case, the null and alternative hypotheses would be best
defined as:
A. H0: µ = 5 versus Ha: µ ≠ 5.
B. H0: µ ≤ 5 versus Ha: µ > 5.
C. H0: µ ≥ 5 versus Ha: µ < 5.

2. Which of the following statements requires a two-tailed test?
A. H0: µ ≤ 0 versus Ha: µ > 0.
B. H0: µ ≥ 0 versus Ha: µ < 0.
C. H0: µ = 0 versus Ha: µ ≠ 0.

3. All else being equal, specifying a smaller significance level in a hypothesis test will most
likely increase the probability of:
A. type I error.
B. type II error.
C. both type I and type II errors.

4. All else being equal, increasing the sample size for a hypothesis test will most likely
decrease the probability of:
A. type I error.
B. type II error.
C. both type I and type II errors.

5. The significance level of a test is 0.05 and the probability of a Type II error is 0.2. What is
the power of the test?
A. 0.05.
B. 0.80.
C. 0.95.

6. A researcher formulates a null hypothesis that the mean of a distribution is equal to 10. He
obtains a p-value of 0.02. Using a 5% level of significance, the best conclusion is to:
A. reject the null hypothesis.
B. accept the null hypothesis.
C. decrease the level of significance.

7. Which of the following statistics is most likely used for the mean of a non-normal
distribution with unknown variance and a small sample size?
A. z-test statistic.
B. t-test statistic.
C. There is no test statistic for such a scenario.

8. You believe that the average return of all stocks in the S&P 500 is greater than 10%. You
draw a sample of 49 stocks. The average return of these 49 stocks is 12%. The standard
deviation of returns of all stocks in the S&P 500 is 4%. Using a 5% level of significance, which
of the following conclusions is most appropriate?
A. We can conclude that the average return of all stocks in the S&P 500 is greater than
10%.
B. We can conclude that the average return of all stocks in the S&P 500 is less than 10%.
C. We can conclude that the average return of all stocks in the S&P 500 is equal to 10%.

9. The appropriate test statistic to test the hypothesis that the variance of a normally
distributed population is equal to 8 is the:
A. t-test.
B. F-test.
C. χ2 test.

10. A researcher drew two samples from two normally distributed populations. The mean and
standard deviation of the first sample were 5 and 52 respectively. The mean and standard
deviation of the second sample were 7 and 56 respectively. The number of observations in
the first sample was 32 and second sample was 35. Given a null hypothesis of $\sigma_1^2 = \sigma_2^2$
versus an alternative hypothesis of $\sigma_1^2 \ne \sigma_2^2$, which of the following is most likely to be the
test statistic?
A. 0.021.
B. 0.862.
C. 1.160.

11. An investment analyst will most likely use a non-parametric test in which of the following
situations?
A. When the data provided is not given in ranks.
B. When the data does not meet distributional assumptions.
C. When the hypothesis being addressed concerns a parameter.

12. The following table shows the sample correlations of the monthly returns for two different
mutual funds with the S&P 500. The correlations are based on 36 monthly observations.
The funds are as follows:
Fund 1: Small cap fund
Fund 2: Emerging equity fund.
S&P 500: US domestic stock index


Correlation with    S&P 500
Fund 1              0.32
Fund 2              0.36
S&P 500             1
Using a 5 percent significance level, which of the following conclusions is most accurate?
(Critical t-value for 34 df, using a 5 percent significance level and a two tailed test is 2.032)
A. Fund 1 is correlated to S&P 500.
B. Fund 2 is correlated to S&P 500.
C. Both funds are correlated to S&P 500.


Solutions

1. B is correct. The null hypothesis is what the researcher wants to reject. The alternative
hypothesis is what the researcher wants to prove, and it is accepted when the null
hypothesis is rejected.

2. C is correct. A two-tailed test for the population mean is structured as: Ho: µ = 0 versus Ha:
µ ≠ 0.

3. B is correct. Specifying a smaller significance level decreases the probability of a Type I
error (rejecting a true null hypothesis), but increases the probability of a Type II error
(not rejecting a false null hypothesis). As the level of significance decreases, the null
hypothesis is less frequently rejected.

4. C is correct. The only way to avoid the trade-off between the two types of errors is to
increase the sample size; increasing sample size (all else being equal) reduces the
probability of both types of errors.

5. B is correct. The power of a test = 1 – P(Type II error) = 1 – 0.2 = 0.8.

6. A is correct. The p-value for a hypothesis is the smallest significance level for which the
hypothesis would be rejected. As the p-value is less than the stated level of significance,
we reject the null hypothesis.

7. C is correct. The statistic for a small sample size of a non-normal distribution with
unknown variance is not available. z-test statistic is used for a large sample size of a non-
normal distribution with known variance while t-test statistic is used for large sample
size of a non-normal distribution with unknown variance.
Sampling from              Variance            Small sample size   Large sample size
Normal distribution        Variance known      z                   z
Normal distribution        Variance unknown    t                   t (or z)
Non-normal distribution    Variance known      NA                  z
Non-normal distribution    Variance unknown    NA                  t (or z)

8. A is correct.
Step 1: State the hypothesis
H0: µ ≤ 10%
Ha: µ > 10%

© IFT. All rights reserved 28


R06 Hypothesis Testing 2022 Level I Notes

Step 2: Calculate the test statistic
The population variance is known, hence we will use the z-statistic.
$$z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \frac{12 - 10}{4/\sqrt{49}} = 3.5$$
Step 3: Calculate the critical value
This is a one-tailed test and we will be looking at the right tail. Using the Z–table and 5%
level of significance
Critical value = Z0.05= 1.65
Step 4: Decision
Since the test statistic (3.5) > critical value (1.65), we reject H0. Hence at the 5% level of
significance, the data support your belief that the average return of all stocks in the
S&P 500 is greater than 10%.

9. C is correct. In tests concerning the variance of a single normally distributed population,
we use the chi-square test statistic, denoted by $\chi^2$.
Types of Test Statistics
Hypothesis test of          Use
One population mean         t-statistic or z-statistic
Two population means        t-statistic
One population variance     Chi-square statistic
Two population variances    F-statistic

10. C is correct. The test that compares the variances using two independent samples from
two different populations makes use of the F-distributed test statistic.
The formula for the test statistic of the F-test is:
$$F = \frac{s_1^2}{s_2^2}$$
where:
$s_1^2$ = the sample variance of the first sample, with $n_1$ observations
$s_2^2$ = the sample variance of the second sample, with $n_2$ observations
A convention is to put the larger sample variance in the numerator and the smaller sample
variance in the denominator.
df1 = n1 – 1 numerator degrees of freedom
df2 = n2 – 1 denominator degrees of freedom
The test statistic is then compared with the critical values found using the two degrees of
freedom and the F-tables.
The smaller variance is the denominator, thus:
$$F = \frac{56^2}{52^2} = 1.16$$
Option B is incorrect because it uses $52^2/56^2 = 0.862$.

© IFT. All rights reserved 29


R06 Hypothesis Testing 2022 Level I Notes

11. B is correct. We use nonparametric procedures in three situations:
• Data does not meet distributional assumptions.
• Data are given in ranks. (Example: relative size of the company and use of
derivatives.)
• The hypothesis does not concern a parameter. (Example: Is a sample random or not?)

12. B is correct.
The critical t-value for n − 2 = 34 df, using a 5 percent significance level and a two-tailed
test, is 2.032.
First take the correlation between Fund 1 and the S&P 500. Its calculated t-value is:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.32\sqrt{36-2}}{\sqrt{1-0.32^2}} = 1.97$$
Since the t-value is not more than 2.032, we cannot conclude that the correlation of this
fund with the S&P 500 is significantly different from 0.
Now take the correlation between Fund 2 and the S&P 500. Its calculated t-value is:
$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}} = \frac{0.36\sqrt{36-2}}{\sqrt{1-0.36^2}} = 2.25$$
Since the t-value is more than 2.032, we can conclude that the correlation of this fund
with the S&P 500 is significantly different from 0.
