
Math 10 (Sec 28) – Statistics – Winter 2016 Syllabus

Instructor: Maurice (Mo) Geraghty
Office Location/Phone: S-49A (408) 864-5383
Email: [email protected]
Website: http://nebula2.deanza.edu/~mo
Office Hours: M 12:30-1:20, Tu 6:20-7:00, W 11:30-12:20, Th 11:30-1:00 (in LCW110)

Required Materials: Textbook – Collaborative Statistics by Illowsky/Dean (online or printed copy)


Textbook – Inferential Statistics and Hypothesis Testing by Geraghty (online only)

Calculator – Scientific Calculator is sufficient. Cell phone calculators are not allowed on
exams.

Access to a computer outside of class; we will be using the computer lab and Minitab. Also,
you will need an e-mail address and access to the Internet. Course topics, homework, exam
information, handouts, data sets, and other information will be posted on the website.

Grading: Grading will be based on the following criteria. Grades are not negotiable.

Grading Scale (points)
460 - 446 = A+    445 - 427 = A    426 - 414 = A-
413 - 400 = B+    399 - 381 = B    380 - 368 = B-
367 - 345 = C+    344 - 322 = C    321 - 299 = D+
298 - 276 = D     0 - 275 = F

Grading Criteria
Exams:     200 pts
Final:     100 pts
Labs:      110 pts
Homework:   50 pts

Homework: Completed Homework must be turned in by the due date, but should be completed daily. Homework
assignments may also be posted on the website. There is no credit for late homework.

Exams: There will be two exams during the quarter. Your final exam score will replace your lowest scoring exam if
it improves your grade. There are no make-up exams.

Final Exam: A comprehensive exam will be given on the final exam date.

Computer Lab: Lab classes will be held in the math computer lab: S44. You will use Minitab and other statistical
software in analyzing data, learning statistical models and working on the class material. Computer
labs can be done in groups of no more than four people for a common grade and must be turned in by
email on the due date. There is no credit for late labs received after midnight on the due
date.

Adding/Dropping: If you choose not to complete the course, it is your responsibility to officially drop or withdraw
from the course by the deadline date. I will not sign late drop or withdrawal forms.

Attendance: It is expected that you attend both the lecture and labs. Attendance means arriving on time and
staying the entire scheduled period.

Changes: Information in this syllabus may be changed during the quarter, but you will be informed in advance.

Other Information: All students are expected to understand the college policy on cheating as outlined in the
student handbook. Plagiarism (submitting another’s work as your own) will result in an
immediate failure for the course for your entire group.

Cell phones and other electronic devices need to be turned off or silenced. Please arrive
on time and stay the entire period.

Read the Frequently Asked Questions on the website for other policies and procedures.
Student Learning Outcomes (SLO's) are also posted on the class website.
If you feel that you may need an accommodation based on the impact of a disability, you
should contact me privately to discuss your specific needs. Also, please contact Disability
Support Services (864-8753) or Educational Diagnostic Center (864-8839) for information or
questions about eligibility, services and accommodations for physical (DSS), psychological
(DSS) or learning (EDC) disabilities.
Tentative Schedule - Math 10 - Sec 28
Winter Quarter - 2016
Monday Tuesday Wednesday Thursday Friday
Jan 4 5 6 7 8
Part 1 Part 1
HW 0
Lab 1 Due
Jan 11 12 13 14 15
Part 1/2 Part 2 Drop Deadline
HW 1 (Jan 17)
Lab 2 Due
Jan 18 19 20 21 22
Holiday Part 3
HW 2
Lab 3 Due
Jan 25 26 27 28 29
Part 3/4 Part 4/Review
HW 3
Lab 4 Due
Feb 1 2 3 4 5
Exam 1 Part 5
Part 4/5 HW 4
Lab 5 Due
Feb 8 9 10 11 12
Part 5/6 Part 6 Holiday
HW 5
Lab 6 Due
Feb 15 16 17 18 19
Part 6 Part 6

Lab 7 Due
Feb 22 23 24 25 26
Holiday Part 7 Withdraw Deadline
HW 6
Lab 8 Due
Feb/Mar 29 1 2 3 4
Part 7 Review/Part 8

Lab 9 Due
Mar 7 8 9 10 11
Exam 2 Part 8
Part 8
HW 7 Lab 10 Due
Mar 14 15 16 17 18
Part 8/9 Part 9
HW 8
Review Lab 11 Due
Mar 21 22 23 24 25
Final Exam
4:00-6:00
HW 9
Slides   Topic                                          Illowsky/Dean                              Geraghty
Part 1   Descriptive Statistics                         1 (all), 2 (all), 6.3, 12.4, 12.6, 12.7    Sec 4 - outliers
Part 2   Probability                                    3 (all)
Part 3   Discrete Random Variables                      4 (omit 4.7)
Part 4   Continuous Random Vars/Central Limit Theorem   5 (all), 6 (all), 7 (omit 7.3)             Sec 4
Part 5   Confidence Intervals                           8 (all)                                    Sec 5
Part 6   1 Population Hypothesis Testing/Power          9 (all), 11.6                              Sec 6
Part 7   2 Population Hypothesis Testing                10 (omit 10.4), 13.5                       Sec 7
Part 8   Chi-Square tests/ANOVA                         11 (all), 13 (all)                         Sec 8
Part 9   Regression                                     12
DE ANZA COLLEGE – DEPARTMENT OF MATHEMATICS

Inferential Statistics and


Hypothesis Testing
A Holistic Approach
Maurice A. Geraghty
1/1/2016

Supplementary material for an introductory lower division course in Probability and Statistics

Inference and Hypothesis Testing – A Holistic Approach


Supplementary Material for an Introductory Lower Division
Course in Probability and Statistics
Maurice A. Geraghty, De Anza College
originally published September 1, 2009
revised January 1, 2016

1. Introduction – a Classroom Story and an Inspiration
2. The Six Blind Men and the Elephant
3. Two News Stories of Research
4. Review and Central Limit Theorem
5. Point Estimation and Confidence Intervals
6. One Population Hypothesis Testing
7. Two Population Inference
8. Chi-square Tests and One Factor Analysis of Variance (ANOVA)
9. Glossary of Statistical Terms used in Inference
10. Flash Animations
11. PowerPoint Slides
12. Notes and Sources

1. Introduction - A Classroom Story and an Inspiration


Several years ago, I was teaching an introductory Statistics course at De Anza College where I had
several high-achieving students who were dedicated to learning the material and who frequently asked me
questions during class and office hours. Like many students, they were able to understand the material
on descriptive statistics and interpreting graphs. Unlike many introductory Statistics students, they had
excellent math and computer skills and went on to master probability, random variables and the Central
Limit Theorem.

However, when the course turned to inference and hypothesis testing, I watched these students’
performance deteriorate. One student asked me after class to again explain the difference between the
Null and Alternative Hypotheses. I tried several methods, but it was clear these students never really
understood the logic or the reasoning behind the procedure. These students could easily perform the
calculations, but they had difficulty choosing the correct model, setting up the test, and stating the
conclusion.

These students, to their credit, continued to work hard; they wanted to understand the material, not
simply pass the class. Since these students had excellent math skills, I went deeper into the explanation
of Type II error and the statistical power function. Although they could compute power and sample size
for different criteria, they still didn’t conceptually understand hypothesis testing.

On my long drive home, I was listening to National Public Radio’s Talk of the Nation 1 where there was a
discussion on the difference between the reductionist and holistic approaches to the sciences, which the
commentator described as the western tradition vs. the eastern tradition. The reductionist or western
method of analyzing a problem, mechanism or phenomenon is to look at the component pieces of the
system being studied. For example, a nutritionist breaks a potato down into vitamins, minerals,
carbohydrates, fats, calories, fiber and proteins. Reductionist analysis is prevalent in all the sciences,
including Inferential Statistics and Hypothesis Testing.

Holistic or eastern tradition analysis is less concerned with the component parts of a problem,
mechanism or phenomenon but instead how this system operates as a whole, including its surrounding
environment. For example, a holistic nutritionist would look at the potato in its environment: when it
was eaten, with what other foods, how it was grown, or how it was prepared. In holism, the potato is
much more than the sum of its parts.

Consider these two renderings of fish:

The first image is a drawing of fish anatomy by John Cimbaro used by the La Crosse Fish Health
Center. 2 This drawing tells us a lot about how a fish is constructed, and where the vital organs
are located. There is much detail given to the scales, fins, mouth and eyes.

The second image is a watercolor by the Chinese artist Chen Zheng-Long 3. In this artwork, we learn
very little about fish anatomy, seeing only minimalistic eyes, scales and fins. However, the artist
shows how fish are social creatures, how their fins move to swim and the type of plants they like.
Unlike the first drawing, we learn much more about the interaction of the fish in its surrounding
environment and much less about how a fish is built.

This illustrative example shows the difference between reductionist and holistic analyses. Each
rendering teaches something important about the fish: The reductionist drawing of the fish anatomy
helps explain how a fish is built and the holistic watercolor helps explain how a fish relates to its
environment. Both the reductionist and holistic methods add to knowledge and understanding, and
both philosophies are important. Unfortunately, much of Western science has been dominated by the
reductionist philosophy, including the backbone of the scientific method, Inferential Statistics.

Although science has traditionally been reluctant to embrace holistic philosophy, and often hostile to
including it in the scientific method, many now support a multicultural or multi-philosophical
approach. In his book Holism and Reductionism in Biology and Ecology 4, Looijen claims
that “holism and reductionism should be seen as mutually dependent, and hence co-operating
research programs than as conflicting views of nature or of relations between sciences.” Holism
develops the “macro-laws” that reductionism needs to “delve deeper” into understanding or explaining
a concept or phenomenon. I believe this claim applies to the study of Statistics as well.

I realize that the problem of my high-achieving students being unable to comprehend hypothesis testing
could be cultural – these were international students who may have been schooled under a more
holistic philosophy. The Introductory Statistics curriculum and most texts give an incomplete
explanation of the logic of Hypothesis Testing, eliminating or barely explaining such topics as Power, the
consequence of Type II error or Bayesian alternatives. The problem is how to supplement an
Introductory Statistics course with a holistic philosophy without depriving the students of the required
reductionist course curriculum – all in one quarter or semester!

I believe it is possible to teach the concept of Inferential Statistics holistically. This course material is a
result of that inspiration, which was designed to supplement, not replace, a traditional course textbook
or workbook. This supplemental material includes:

• Examples of deriving research hypotheses from general questions and explanatory conclusions
consistent with the general question and test results.
• An in-depth explanation of statistical power and type II error.

• Techniques for checking the validity of model assumptions and identifying potential outliers
using graphs and summary statistics.
• Replacement of the traditional step-by-step “cookbook” for hypothesis testing with interrelated
procedures.
• De-emphasis of algebraic calculations in favor of a conceptual understanding using computer
software to perform tedious calculations.
• Interactive Flash animations to explain the Central Limit Theorem, inference, confidence
intervals, and the general hypothesis testing model including Type II error and power.
• PowerPoint Slides of the material for classroom demonstration.
• Excel Data sets for use with computer projects and labs.

This material is limited to one population hypothesis testing but could easily be extended to other
models. My experience has been that once students understand the logic of hypothesis testing, the
introduction of new models is a minor change in the procedure.

2. The Six Blind Men and the Elephant

This old story from China or India was made into the poem The Blind Men and the Elephant by John
Godfrey Saxe 5. Six blind men find excellent empirical evidence from different parts of the elephant and
all come to reasoned inferences that match their observations. Their research is flawless and their
conclusions are completely wrong, showing the necessity of including holistic analysis in the scientific
process.

Here is the poem in its entirety:

It was six men of Indostan, to learning much inclined,
who went to see the elephant (Though all of them were blind),
that each by observation, might satisfy his mind.

The first approached the elephant, and, happening to fall,
against his broad and sturdy side, at once began to bawl:
"God bless me! but the elephant, is nothing but a wall!"

The second feeling of the tusk, cried: "Ho! what have we here,
so very round and smooth and sharp? To me 'tis mighty clear,
this wonder of an elephant, is very like a spear!"

The third approached the animal, and, happening to take,
the squirming trunk within his hands, "I see," quoth he,
"the elephant is very like a snake!"

The fourth reached out his eager hand, and felt about the knee:
"What most this wondrous beast is like, is mighty plain," quoth he;
"'Tis clear enough the elephant is very like a tree."

The fifth, who chanced to touch the ear, said: "E'en the blindest man
can tell what this resembles most; Deny the fact who can,
This marvel of an elephant, is very like a fan!"

The sixth no sooner had begun, about the beast to grope,
than, seizing on the swinging tail, that fell within his scope,
"I see," quoth he, "the elephant is very like a rope!"

And so these men of Indostan, disputed loud and long,
each in his own opinion, exceeding stiff and strong,
Though each was partly in the right, and all were in the wrong!

So, oft in theologic wars, the disputants, I ween,
tread on in utter ignorance, of what each other mean,
and prate about the elephant, not one of them has seen!

-John Godfrey Saxe



3. Two News Stories of Research

The first story is about a drug that was thought to be effective in research, but was pulled from the
market when it was found to be ineffective in practice.

FDA Orders Trimethobenzamide Suppositories Off the market 6

FDA today ordered makers of unapproved suppositories containing trimethobenzamide hydrochloride
to stop manufacturing and distributing those products.

Companies that market the suppositories, according to FDA, are Bio Pharm, Dispensing Solutions,
G&W Laboratories, Paddock Laboratories, and Perrigo New York. Bio Pharm also distributes the
products, along with Major Pharmaceuticals, PDRX Pharmaceuticals, Physicians Total Care,
Qualitest Pharmaceuticals, RedPharm, and Shire U.S. Manufacturing.

FDA had determined in January 1979 that trimethobenzamide suppositories lacked "substantial
evidence of effectiveness" and proposed withdrawing approval of any NDA for the products.

"There's a variety of reasons" why it has taken FDA nearly 30 years to finally get the suppositories
off the market, Levy said.

At least 21 infant deaths have been associated with unapproved carbinoxamine-containing products,
Levy noted.

Many products with unapproved labeling may be included in widely used pharmaceutical reference
materials, such as the Physicians' Desk Reference, and are sometimes advertised in medical journals,
he said.

Regulators urged consumers using suppositories containing trimethobenzamide to contact their health
care providers about the products.

The second story is about promising research that was abandoned because the test data showed no
significant improvement for patients taking the drug.

Drug Found Ineffective Against Lung Disease 7

Treatment with interferon gamma-1b (Ifn-g1b) does not improve survival in people with a fatal lung
disease called idiopathic pulmonary fibrosis, according to a study that was halted early after no
benefit to participants was found.

Previous research had suggested that Ifn-g1b might benefit people with idiopathic pulmonary fibrosis,
particularly those with mild to moderate disease.

The new study included 826 people, ages 40 to 79, who lived in Europe and North America. They
were given injections of either 200 micrograms of Ifn-g1b (551 people) or a placebo (275) three times
a week.

After a median of 64 weeks, 15 percent of those in the Ifn-g1b group and 13 percent in the placebo
group had died. Symptoms such as flu-like illness, fatigue, fever and chills were more common
among those in the Ifn-g1b group than in the placebo group. The two groups had similar rates of
serious side effects, the researchers found.

"We cannot recommend treatment with interferon gamma-1b since the drug did not improve survival
for patients with idiopathic pulmonary fibrosis, which refutes previous findings from subgroup
analyses of survival in studies of patients with mild-to-moderate physiological impairment of
pulmonary function," Dr. Talmadge E. King Jr., of the University of California, San Francisco, and
colleagues wrote in the study published online and in an upcoming print issue of The Lancet.

The negative findings of this study "should be regarded as definite, [but] they should not discourage
patients to participate in one of the several clinical trials currently underway to find effective
treatments for this devastating disease," Dr. Demosthenes Bouros, of the Democritus University of
Thrace in Greece, wrote in an accompanying editorial.

Bouros added that people deemed suitable "should be enrolled early in the transplantation list, which
is today the only mode of treatment that prolongs survival."

Although these are both stories of failures in using drugs to treat diseases, they represent two different
aspects of hypothesis testing. In the first story, the suppositories were thought to be effective in treatment
from the initial trials, but were later shown to be ineffective in the general population. This is an
example of what statisticians call Type I Error, supporting a hypothesis (the suppositories are effective)
that later turns out to be false.

In the second story, researchers chose to abandon research when the interferon was found to be
ineffective in treating lung disease during clinical trials. Now this may have been the correct decision,
but what if this treatment was truly effective and the researchers just had an unusual group of test
subjects? This would be an example of what statisticians call Type II Error, failing to support a
hypothesis (the interferon is effective) that later turns out to be true. Unlike the first story, we will never
get to find out the answer to this question since the treatment will not be released to the general public.

In a traditional Introductory Statistics course, very little time is spent analyzing the potential error shown
in the second story. However, both types of error are important and will be explored in this course
material.

4. Review and Central Limit Theorem


4.1 Empirical Rule

A student asked me about the distribution of exam scores after she saw her score of 87 out of 100. I told
her the distribution of test scores was approximately bell-shaped with a mean score of 75 and a
standard deviation of 10. Most people would have an intuitive grasp of the mean score as being the
“average student’s score” and would say this student did better than average. However, having an
intuitive grasp of standard deviation is more challenging. The Empirical Rule is a helpful tool in
explaining standard deviation.

The standard deviation is a measure of variability or spread from the center of the data as defined by
the mean. The Empirical Rule states that for bell-shaped data:

68% of the data is within 1 standard deviation of the mean.
95% of the data is within 2 standard deviations of the mean.
99.7% of the data is within 3 standard deviations of the mean.

In the example, our interpretation would be:

68% of students scored between 65 and 85.
95% of students scored between 55 and 95.
99.7% of students scored between 45 and 105.

The student who scored an 87 would be in the upper 16% of the class, more than one standard
deviation above the mean score.
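
The Empirical Rule intervals above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the exam scores follow a Normal model with the stated mean of 75 and standard deviation of 10; the variable names are illustrative only.

```python
from statistics import NormalDist

mean, sd = 75, 10  # exam mean and standard deviation from the example

# Empirical Rule coverage for 1, 2 and 3 standard deviations
coverage = {1: 0.68, 2: 0.95, 3: 0.997}
intervals = {k: (mean - k * sd, mean + k * sd) for k in coverage}
for k, (low, high) in intervals.items():
    print(f"about {coverage[k]:.1%} of scores fall between {low} and {high}")

# Share of the class above a score of 87, assuming an exactly Normal model
share_above_87 = 1 - NormalDist(mean, sd).cdf(87)
print(f"share scoring above 87: {share_above_87:.3f}")
```

Under the Normal assumption, roughly 11.5% of scores exceed 87, which is consistent with the statement that this student is in the upper 16% of the class.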

4.2 The Z-score

Related to the Empirical Rule is the Z-score which measures how many standard deviations a particular
data point is above or below the mean. Unusual observations would have a Z-score over 2 or under -2.
Extreme observations would have Z-scores over 3 or under -3 and should be investigated as potential
outliers.

Formula for Z-score: $Z = \dfrac{X_i - \bar{X}}{s}$

The student who received an 87 on the exam would have a Z-score of 1.2, meaning her score was well
above average, but not highly unusual.

Interpreting Z-score for Several Students

Test Score    Z-score    Interpretation
87            +1.2       well above average
71            -0.4       slightly below average
99            +2.4       unusually above average
39            -3.6       extremely below average
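
The Z-scores in this table can be checked with a short sketch. It assumes the same exam mean (75) and standard deviation (10) as the Empirical Rule example; the helper names `z_score` and `describe` are hypothetical and simply encode the cutoffs of 2 and 3 given in the text.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

def describe(z):
    # Cutoffs follow the text: beyond 2 is unusual, beyond 3 is extreme.
    if abs(z) > 3:
        return "extreme (potential outlier)"
    if abs(z) > 2:
        return "unusual"
    return "not unusual"

scores = [87, 71, 99, 39]
z_values = [z_score(x, 75, 10) for x in scores]
for x, z in zip(scores, z_values):
    print(f"{x}: Z = {z:+.1f}, {describe(z)}")
```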

4.3 The Sample Mean as a Random Variable – Central Limit Theorem

In the section on descriptive statistics, we studied the sample mean, $\bar{X}$, as a measure of central tendency.
Now we want to consider $\bar{X}$ as a Random Variable.

We start with a Random Sample X1, X2, …, Xn where the random variables Xi all have the same
probability distribution and are mutually independent. The sample mean is a function of
these random variables (add them up and divide by the sample size), so $\bar{X}$ is a random variable. So what
is the Probability Distribution Function (PDF) of $\bar{X}$?

To answer this question, conduct the following experiment. We will roll samples of n dice, determine
the mean roll, and create a PDF for different values of n.

For the case n=1, the distribution of the sample mean is the same as the distribution of the random
variable. Since each face of the die has the same chance of being rolled, the distribution is rectangular
shaped, centered at 3.5:

For the case n=2, the distribution of the sample mean starts to take on a triangular shape since some
values are more likely to be rolled than others. For example, there are six ways to roll a total of 7 and get a
sample mean of 3.5, but only one way to roll a total of 2 and get a sample mean of 1. Notice the PDF is
still centered at 3.5.

For the case n=10, the PDF of the sample mean now takes on a familiar bell shape that looks like a
Normal Distribution. The center is still at 3.5 and the values are now more tightly clustered around the
mean, implying that the standard deviation has decreased.

Finally, for the case n=30, the PDF continues to look like the Normal Distribution centered around the
same mean of 3.5, but more tightly clustered than the prior example:

This die-rolling example demonstrates the Central Limit Theorem’s three important observations about
the PDF of $\bar{X}$ compared to the PDF of the original random variable.

1. The mean stays the same.
2. The standard deviation gets smaller.
3. As the sample size increases, the PDF of $\bar{X}$ is approximately Normal.
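
The first two observations can be verified exactly for the die-rolling experiment. The sketch below is one possible way to do it, not the method used to draw the original graphs: it builds the exact distribution of the total of n fair dice and then computes the mean and standard deviation of the sample mean. The function names are hypothetical.

```python
from fractions import Fraction
from math import sqrt

def dice_sum_distribution(n):
    """Exact distribution of the total of n fair six-sided dice (total -> probability)."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                new[total + face] = new.get(total + face, 0) + prob / 6
        dist = new
    return dist

def mean_and_sd_of_sample_mean(n):
    """Mean and standard deviation of the average roll of n dice."""
    dist = dice_sum_distribution(n)
    mean = sum(Fraction(t, n) * p for t, p in dist.items())
    var = sum((Fraction(t, n) - mean) ** 2 * p for t, p in dist.items())
    return float(mean), sqrt(var)

for n in (1, 2, 10, 30):
    m, s = mean_and_sd_of_sample_mean(n)
    print(f"n={n:2d}: mean of sample mean = {m:.2f}, standard deviation = {s:.3f}")
```

The mean stays at 3.5 for every n, while the standard deviation shrinks in proportion to one over the square root of n.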

Central Limit Theorem

If X1, X2, …, Xn is a random sample from a population that has a mean $\mu$ and a standard
deviation $\sigma$, and n is sufficiently large, then:

1. $\mu_{\bar{X}} = \mu$
2. $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
3. The Distribution of $\bar{X}$ is approximately Normal.

Combining all of the above into a single formula:

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$$

where Z represents the Standard Normal Distribution.

This powerful result allows us to use the sample mean $\bar{X}$ as an estimator of the population mean $\mu$. In
fact, most inferential statistics practiced today would not be possible without the Central Limit
Theorem.

Example:

The mean height of American men (ages 20-29) is µ = 69.2 inches. If a random sample of 60 men in this
age group is selected, what is the probability the mean height for the sample is greater than 70 inches?
Assume σ = 2.9”.

Due to the Central Limit Theorem, we know the distribution of the sample mean will be approximately
Normal:

$$P(\bar{X} > 70) = P\left(Z > \frac{70 - 69.2}{2.9/\sqrt{60}}\right) = P(Z > 2.14) = 0.0162$$

Compare this to the much larger probability that one randomly chosen man will be over 70 inches tall
(assuming individual heights are Normally distributed):

$$P(X > 70) = P\left(Z > \frac{70 - 69.2}{2.9}\right) = P(Z > 0.28) = 0.3897$$

This example demonstrates how the sample mean will cluster towards the population mean as the
sample size increases.
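
Both probabilities in this example can be recomputed with Python's standard library. This is a sketch under the example's assumptions (µ = 69.2, σ = 2.9, n = 60, Normal model); because it does not round Z to two decimal places first, the results differ slightly in the later decimal places from the table-based values.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 69.2, 2.9, 60
std_normal = NormalDist()  # Standard Normal Distribution, mean 0 and sd 1

# Sample mean of 60 men: standard deviation is sigma / sqrt(n)
z_mean = (70 - mu) / (sigma / sqrt(n))
p_mean = 1 - std_normal.cdf(z_mean)

# One man chosen at random: standard deviation is sigma
z_one = (70 - mu) / sigma
p_one = 1 - std_normal.cdf(z_one)

print(f"sample mean: Z = {z_mean:.2f}, P = {p_mean:.4f}")
print(f"one man:     Z = {z_one:.2f}, P = {p_one:.4f}")
```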

5. Point Estimation and Confidence Intervals


5.1 Inferential Statistics

The reason we conduct statistical research is to obtain an understanding about phenomena in a
population. For example, we may want to know if a potential drug is effective in treating a disease. Since
it is not feasible or ethical to distribute an experimental drug to the entire population, we instead must
study a small subset of the population called a sample. We then analyze the sample and make an
inference about the population based on the sample. Using probability theory and the Central Limit
Theorem, we can then measure the reliability of the inference.

Example: Lupe is trying to sell her house and needs to determine the market value of the home. The
population in this example would be all the homes that are similar to hers in the neighborhood.

Lupe’s realtor chooses for the sample nine recent homes in this neighborhood that sold in the last six
months. The realtor then adjusts some of the sales prices to account for differences between Lupe’s
home and the sold homes.

Sampled Homes: Adjusted Sales Price
$420,000    $440,000    $470,000
$430,000    $450,000    $470,000
$430,000    $460,000    $480,000

Next the realtor takes the mean of the adjusted sample and recommends to Lupe a market value for
Lupe’s home of $450,000. The realtor has made an inference about the mean value of the population.

To measure the reliability of the inference, the realtor should look at factors like: the sample size being
small, values of homes may have changed in the last six months, or that Lupe’s home is not exactly like
the sampled homes.
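
The realtor's point estimate can be reproduced from the nine adjusted sales prices. The sketch below also reports the sample standard deviation, which the text does not compute; it is included only as an illustrative indication of how much the sampled prices vary.

```python
from statistics import mean, stdev

# Adjusted sales prices of the nine sampled homes
prices = [420_000, 440_000, 470_000,
          430_000, 450_000, 470_000,
          430_000, 460_000, 480_000]

point_estimate = mean(prices)  # sample mean used as the market value estimate
spread = stdev(prices)         # sample standard deviation (not given in the text)

print(f"sample size: {len(prices)}")
print(f"point estimate of market value: ${point_estimate:,.0f}")
print(f"sample standard deviation: ${spread:,.0f}")
```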

5.2 Point Estimation

The example above is an example of Estimation, a branch of Inferential Statistics where sample statistics
are used to estimate the values of a population parameter. Lupe’s realtor was trying to estimate the
population mean ($\mu$) based on the sample mean ($\bar{X}$).

                      Sample Statistic        Population Parameter
Mean                  $\bar{X}$          ⟶    $\mu$
Standard Deviation    $s$                ⟶    $\sigma$
Proportion            $\hat{p}$          ⟶    $p$

In the example above, Lupe’s realtor estimated the population mean of similar homes in Lupe’s
neighborhood by using the sample mean of $450,000 from the adjusted price of the sampled homes.

Interval Estimation

A point estimate is our “best” estimate of a population parameter, but will most likely not exactly equal
the parameter. Instead, we will choose a range of values called an Interval Estimate that is likely to
include the value of the population parameter.

If the Interval Estimate is symmetric, the distance from the Point Estimator to either endpoint of the
Interval Estimate is called the Margin of Error.

In the example above, Lupe’s realtor could instead say the true population mean is probably between
$425,000 and $475,000, allowing a $25,000 Margin of Error from the original estimate of
$450,000. This Interval estimate could also be reported as $450,000 ± $25,000.

5.3 Confidence Intervals

Using probability and the Central Limit Theorem, we can design an Interval Estimate called a Confidence
Interval that has a known probability (Level of Confidence) of capturing the true population parameter.

5.3.1 Confidence Interval for Population Mean

To find a confidence interval for the population mean (𝜇) when the population standard deviation (𝜎) is
known, and n is sufficiently large, we can use the Standard Normal Distribution probability distribution
function to calculate the critical values for the Level of Confidence:

Constructing Confidence Intervals for $\mu$

c = Level of Confidence    Zc = Critical Value
90%                        1.645
95%                        1.960
99%                        2.576
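
Critical values like these come from the inverse of the Standard Normal cumulative distribution. A minimal sketch using Python's standard library, where `z_critical` is a hypothetical helper name, returns approximately 1.645, 1.960 and 2.576 for the 90%, 95% and 99% levels:

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value: leaves (1 - confidence) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Zc = {z_critical(level):.3f}")
```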

Example: The Dean wants to estimate the mean number of hours worked per week by students. A
sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours. The point
estimate is 24 hours (sample mean). What is the 95% confidence interval for the average number of
hours worked per week by the students?

$$24 \pm \frac{1.96 \cdot 4}{\sqrt{49}} = 24 \pm 1.12 = (22.88, 25.12) \text{ hours per week}$$

The margin of error for the confidence interval is 1.12 hours. We can say with 95% confidence that the mean
number of hours worked by students is between 22.88 and 25.12 hours per week.

If the level of confidence is increased, then the margin of error will also increase. For example, if we
increase the level of confidence to 99% for the above example, then:

$$24 \pm \frac{2.576 \cdot 4}{\sqrt{49}} = 24 \pm 1.47 = (22.53, 25.47) \text{ hours per week}$$
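
The two intervals for the Dean's example can be recomputed as follows. This is a sketch that treats the standard deviation of 4 hours as known, as the example does; `z_interval` is a hypothetical helper, and the critical value is computed rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sd, n, confidence):
    """Confidence interval for a mean using the Standard Normal critical value."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin, margin

for level in (0.95, 0.99):
    low, high, margin = z_interval(24, 4, 49, level)
    print(f"{level:.0%}: 24 ± {margin:.2f} = ({low:.2f}, {high:.2f}) hours per week")
```

Raising the level of confidence from 95% to 99% widens the margin of error from about 1.12 to about 1.47 hours.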

Some important points about Confidence Intervals

• The confidence interval is constructed from random variables calculated from sample data and
attempts to predict an unknown but fixed population parameter with a certain level of confidence.
• Increasing the level of confidence will always increase the margin of
error.
• It is impossible to construct a 100% Confidence Interval without
taking a census of the entire population.
• Think of the population mean like a dart that always goes to the same spot, and the confidence
interval as a moving target that tries to “catch the dart.” A 95% confidence interval would be like
a target that has a 95% chance of catching the dart.

5.3.2 Confidence Interval for Population Mean using Sample Standard Deviation – Student’s t
Distribution

The formula for the confidence interval for the mean requires the knowledge of the population standard
deviation ($\sigma$). In most real-life problems, we do not know this value for the same reasons we do not
know the population mean. This problem was solved by the English statistician William Sealy Gosset, an
employee of the Guinness brewery in Dublin. Gosset, however, was prohibited by Guinness from using his
own name in publishing scientific papers. He published under the name “Student”, and therefore the
distribution he discovered was named "Student's t-distribution" 8.

Characteristics of Student’s t Distribution

• It is continuous, bell-shaped, and symmetrical about zero like the z distribution.
• There is a family of t-distributions sharing a
mean of zero but having different standard
deviations based on degrees of freedom.
• The t-distribution is more spread out and
flatter at the center than the Z-distribution,
but approaches the Z-distribution as the
sample size gets larger.

Confidence Interval for 𝝁

$$\bar{X} \pm t_c \frac{s}{\sqrt{n}} \quad \text{with degrees of freedom} = n - 1$$

Example

Last year Sally belonged to a Health Maintenance Organization (HMO) that had a population average
rating of 62 (on a scale from 0-100, with ‘100’ being best); this was based on records accumulated about
the HMO over a long period of time. This year Sally switched to a new HMO. To assess the population
mean rating of the new HMO, 20 members of this HMO are polled and they give it an average rating of
65 with a standard deviation of 10. Find and interpret a 95% confidence interval for the population
average rating of the new HMO.

The t distribution will have 20 − 1 = 19 degrees of freedom. Using a table or technology, the critical value
for the 95% confidence interval will be $t_c = 2.093$.

$$65 \pm \frac{2.093 \cdot 10}{\sqrt{20}} = 65 \pm 4.68 = (60.32,\ 69.68) \text{ HMO rating}$$

With 95% confidence we can say that the rating of Sally’s new HMO is between 60.32 and 69.68. Since
the quantity 62 is in the confidence interval, we cannot say with 95% certainty that the new HMO is
either better or worse than the previous HMO.
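
As a rough check of the HMO example, the sketch below computes the interval in Python. It takes the critical value 2.093 (95% confidence, 19 degrees of freedom) as given from a table rather than computing it, because the standard library has no t distribution; the function name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(mean, s, n, t_crit):
    """Confidence interval for a mean using the sample standard deviation."""
    margin = t_crit * s / sqrt(n)
    return mean - margin, mean + margin

# t_crit = 2.093 is the table value for 95% confidence with n - 1 = 19 degrees of freedom.
low, high = t_interval(mean=65, s=10, n=20, t_crit=2.093)
print(f"({low:.2f}, {high:.2f})")

# The old HMO's rating of 62 falls inside the interval, so the data are inconclusive.
print("62 inside interval:", low <= 62 <= high)
```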

5.3.3 Confidence Interval for Population Proportion

Recall from the section on random variables the binomial distribution where 𝑝 represented the
proportion of successes in the population. The binomial model was analogous to coin-flipping, or yes/no
question polling. In practice, we want to use sample statistics to estimate the population proportion (𝑝).

The sample proportion ( 𝑝̂ ) is the proportion of successes in the sample of size n and is the point
estimator for 𝑝. Under the Central Limit Theorem, if 𝑛𝑝 > 5 and 𝑛(1 − 𝑝) > 5, the distribution of the
sample proportion 𝑝̂ will have an approximately Normal Distribution.

Normal Distribution for 𝑝̂ if Central Limit Theorem conditions are met:

$$\mu_{\hat{p}} = p \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$

Using this information we can construct a confidence interval for 𝑝, the population proportion:

Confidence interval for 𝒑:

$$\hat{p} \pm Z\sqrt{\frac{p(1-p)}{n}} \approx \hat{p} \pm Z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Example

200 California drivers were randomly sampled and it was discovered that 25 of these drivers were illegally
talking on a cell phone without the use of a hands-free device. Find the point estimator for the
proportion of drivers who are using their cell phones illegally and construct a 99% confidence interval.

The point estimator for 𝑝 is $\hat{p} = \frac{25}{200} = 0.125$, or 12.5%.

A 99% confidence interval for 𝑝 is:


$$0.125 \pm 2.576\sqrt{\frac{0.125(1-0.125)}{200}} = 0.125 \pm 0.060$$

The margin of error for this poll is 6% and we can say with 99% confidence that the true percentage of
drivers who are using their cell phones illegally is between 6.5% and 18.5%.
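
The cell phone example can be sketched in Python as follows. The function name `proportion_interval` is illustrative; the critical value is computed with `statistics.NormalDist` rather than read from a table, so it may differ from 2.576 in later decimal places.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Approximate confidence interval for a population proportion."""
    p_hat = successes / n                              # point estimator
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)         # uses p_hat in place of p
    return p_hat, margin, (p_hat - margin, p_hat + margin)

p_hat, margin, (low, high) = proportion_interval(25, 200, 0.99)
print(f"p_hat = {p_hat:.3f}, margin = {margin:.3f}, interval = ({low:.3f}, {high:.3f})")
```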

5.3.4 Point Estimator for Population Standard Deviation

We often want to study the variability, volatility or consistency of a population. For example, two
investments both have expected earnings of 6% per year, but one investment is much riskier, having
higher ups and downs. To estimate variation or volatility of a data set, we will use the sample standard
deviation (𝑠) as a point estimator of the population standard deviation (𝜎).

Example

Investments A and B are both known to have a rate of return of 6% per year. Over the last 24 months,
Investment A has a sample standard deviation of 3% per month, while for Investment B, the sample
standard deviation is 5% per month. We would say that Investment B is more volatile and riskier than
Investment A due to the higher estimate of the standard deviation.

To create a confidence interval for an estimate of standard deviation, we need to introduce a new
distribution, called the Chi-square (χ²) distribution.

The Chi-square (χ²) Distribution

The Chi-square distribution is a family of distributions related to the Normal Distribution as it represents
a sum of independent squared standard Normal Random Variables. Like the Student’s t distribution, the
degrees of freedom will be n-1 and determine the shape of the distribution. Also, since the Chi-square
represents squared data, the inference will be about the variance rather than the standard deviation.

Characteristics of Chi-square (χ²) Distribution

• It is positively skewed
• It is non-negative
• It is based on degrees of freedom (n-1)
• When the degrees of freedom change,
a new distribution is created
• $\frac{(n-1)s^2}{\sigma^2}$ will have a Chi-square distribution.

5.3.5 Confidence Interval for Population Variance and Standard Deviation

Since the Chi-square represents squared data, we can construct confidence intervals for the population
variance (𝜎²), and take the square root of the endpoints to get a confidence interval for the population
standard deviation. Due to the skewness of the Chi-square distribution the resulting confidence interval
will not be centered at the point estimator, so the margin of error form used in the prior confidence
intervals doesn’t make sense here.

Confidence Interval for population variance (𝝈²)

• The confidence interval is NOT symmetric since the chi-square distribution is not symmetric.
• Take the square root of both endpoints to get a confidence interval for the population
standard deviation (𝜎).

$$\left( \frac{(n-1)s^2}{\chi^2_R},\ \frac{(n-1)s^2}{\chi^2_L} \right)$$

Example

In performance measurement of investments, standard deviation is a measure of volatility or risk. Twenty
monthly returns from a mutual fund show an average monthly return of 1% and a sample standard
deviation of 5%. Find a 95% confidence interval for the monthly standard deviation of the mutual fund.

The Chi-square distribution will have 20 − 1 = 19 degrees of freedom. Using technology, the two critical
values are $\chi^2_L = 8.90655$ and $\chi^2_R = 32.8523$.

The confidence interval for 𝜎 is:

$$\left( \sqrt{\frac{(19)5^2}{32.8523}},\ \sqrt{\frac{(19)5^2}{8.90655}} \right) = (3.8,\ 7.3)$$

One can say with 95% confidence that the standard deviation for this mutual fund is between 3.8% and
7.3% per month.
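
The mutual fund interval can be checked with the sketch below. It takes the two chi-square critical values for 19 degrees of freedom (8.90655 and 32.8523) as given, since the Python standard library does not provide chi-square quantiles; the function name `sigma_interval` is illustrative.

```python
from math import sqrt

def sigma_interval(s, n, chi2_left, chi2_right):
    """Confidence interval for sigma from chi-square critical values."""
    df = n - 1
    var_low = df * s**2 / chi2_right   # larger critical value gives the lower endpoint
    var_high = df * s**2 / chi2_left   # smaller critical value gives the upper endpoint
    # Square roots convert the variance interval into one for the standard deviation.
    return sqrt(var_low), sqrt(var_high)

low, high = sigma_interval(s=5, n=20, chi2_left=8.90655, chi2_right=32.8523)
print(f"({low:.1f}, {high:.1f})")
```

The point estimate of 5% is not at the midpoint of the result, which reflects the skewness of the chi-square distribution.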

6. One Population Hypothesis Testing


In the prior section we used statistical inference to make an estimate of a population parameter and
measure the reliability of the estimate through a confidence interval. In this section, we will explore in
detail the use of statistical inference in testing a claim about a population parameter, which is the heart
of the scientific method used in research.

6.1 Procedures of Hypotheses Testing and the Scientific Method

The actual conducting of a hypothesis test is only a small part of the scientific method. After
formulating a general question, the scientific method consists of: the designing of an experiment, the
collecting of data through observation and experimentation, the testing of hypotheses, and the reporting
of overall conclusions. The conclusions themselves lead to other research ideas, making this process a
continuous flow of adding to the body of knowledge about the phenomena being studied.

Others may choose a more formalized and detailed set of procedures, but the general concepts of
inspiration, design, experimentation, and conclusion allow one to see the whole process.

6.2 Formulate General Research Questions

Most general questions start with an inspiration or an idea about a topic or phenomenon of interest.
Some examples of general questions:

• (Health Care) Would a public single payer health care system be more effective than the current
private insurance system?
• (Labor) What is the effect of undocumented immigration and outsourcing of jobs on the current
unemployment rate?
• (Economy) Is the federal economic stimulus package effective in lessening the impact of the
recession?
• (Education) Are colleges too expensive for students today?

It is important not to be too specific in choosing these general questions. Based on available or
potentially available data, we can decide later what specific research hypotheses will be formulated and
tested to address the general question. During the data collection and testing process other ideas may
come up and we may choose to redefine the general question. However, we always want to have an
overriding purpose for our research.

6.3 Design Research Hypotheses and Experiment

After developing a general question and having some sense of the data that is available or to be
collected, it is time to design an experiment and a set of hypotheses.

6.3.1 Hypotheses and Hypothesis Testing

For purposes of testing, we need to design hypotheses that are statements about population
parameters. Some examples of hypotheses:

• More than 20% of juvenile offenders are caught and sentenced to prison.
• The mean monthly income for college graduates is more than $5000.
• The mean standardized test score for schools in Cupertino is the same as the mean score for schools in
Los Altos.
• The lung cancer rates in California are lower than the rates in Texas.
• The standard deviation of the New York Stock Exchange today is greater than 10 percentage
points per year.

These same hypotheses could be written in symbolic notation:

• 𝑝 > 0.20
• 𝜇 > 5000
• 𝜇1 = 𝜇2
• 𝑝1 < 𝑝2
• 𝜎 > 10

Hypothesis Testing is a procedure, based on sample evidence and probability theory, used to determine
whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and
should be rejected. This hypothesis that is tested is called the Null Hypothesis designated by the symbol
Ho. If the Null Hypothesis is unreasonable and needs to be rejected, then the research supports an
Alternative Hypothesis designated by the symbol Ha.

Null Hypothesis (Ho): A statement about the value of a population parameter that is assumed to be true
for the purpose of testing.

Alternative Hypothesis (Ha): A statement about the value of a population parameter that is assumed to
be true if the Null Hypothesis is rejected during testing.

From these definitions it is clear that the Alternative Hypothesis will necessarily contradict the Null
Hypothesis; both cannot be true at the same time. Some other important points about hypotheses:

• Hypotheses must be statements about population parameters, never about sample statistics.
• In most hypothesis tests, equality ( =, ≤, ≥ ) will be associated with the Null Hypothesis while
non-equality (≠, <, > ) will be associated with the Alternative Hypothesis.
• It is the Null Hypothesis that is always tested in an attempt to “disprove” it and support the
Alternative Hypothesis. This process is analogous in concept to a “proof by contradiction” in
Mathematics or Logic, but supporting a hypothesis with a level of confidence is not the same as
an absolute mathematical proof.

Examples of Null and Alternative Hypotheses:

• 𝐻𝑜 : 𝑝 ≤ 0.20 𝐻𝑎 : 𝑝 > 0.20
• 𝐻𝑜 : 𝜇 ≤ 5000 𝐻𝑎 : 𝜇 > 5000
• 𝐻𝑜 : 𝜇1 = 𝜇2 𝐻𝑎 : 𝜇1 ≠ 𝜇2
• 𝐻𝑜 : 𝑝1 ≥ 𝑝2 𝐻𝑎 : 𝑝1 < 𝑝2
• 𝐻𝑜 : 𝜎 ≤ 10 𝐻𝑎 : 𝜎 > 10

6.3.2 Statistical Model and Test Statistic

To test a hypothesis we need to use a statistical model that describes the behavior of the data and the
type of population parameter being tested. Because of the Central Limit Theorem, many statistical models
are from the Normal Family, most importantly the Z, t, χ², and F distributions. Other models that are
used when the Central Limit Theorem is not appropriate are called non-parametric models and will not
be discussed here.

Each chosen model has requirements of the data called model assumptions that should be checked for
appropriateness. For example, many models require that the sample mean have approximately a Normal
Distribution, which may not be true for some smaller or heavily skewed data sets.

Once the model is chosen, we can then determine a test statistic, a value derived from the data that is
used to decide whether to reject or fail to reject the Null Hypothesis.

Some Examples of Statistical Models and Test Statistics

Statistical Model                      Test Statistic

Mean vs. Hypothesized Value            $t = \dfrac{\bar{X} - \mu_o}{s/\sqrt{n}}$

Proportion vs. Hypothesized Value      $Z = \dfrac{\hat{p} - p_o}{\sqrt{\dfrac{p_o(1-p_o)}{n}}}$

Variance vs. Hypothesized Value        $\chi^2 = \dfrac{(n-1)s^2}{\sigma_o^2}$
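
The three test statistics in the table can be written as small Python functions. This is a minimal sketch: the function names and the sample inputs are hypothetical and chosen only to exercise each formula; they are not results from the text.

```python
from math import sqrt

def t_statistic(x_bar, mu_0, s, n):
    """Mean vs. hypothesized value (sample standard deviation s)."""
    return (x_bar - mu_0) / (s / sqrt(n))

def z_statistic_proportion(p_hat, p_0, n):
    """Proportion vs. hypothesized value; the standard error uses p_0, not p_hat."""
    return (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)

def chi2_statistic(s, sigma_0, n):
    """Variance vs. hypothesized value."""
    return (n - 1) * s**2 / sigma_0**2

# Hypothetical inputs, for illustration only.
print(t_statistic(x_bar=65, mu_0=62, s=10, n=20))
print(z_statistic_proportion(p_hat=0.125, p_0=0.10, n=200))
print(chi2_statistic(s=5, sigma_0=4, n=20))
```

Each function plugs the hypothesized value from the Null Hypothesis into the formula, which is why the proportion test uses the hypothesized proportion in its standard error.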

6.3.3 Errors in Decision Making

Whenever we make a decision or support a position, there is always a chance we make the wrong
choice. The hypothesis testing process requires us either to reject the Null Hypothesis and support
the Alternative Hypothesis or to fail to reject the Null Hypothesis. This creates the possibility of two
types of error:

• Type I Error
Rejecting the null hypothesis when
it is actually true.

• Type II Error
Failing to reject the null hypothesis
when it is actually false.

In designing hypothesis tests, we need to carefully consider the probability of making either one of
these errors.

Example:

Recall the two news stories discussed earlier in Section 3. In the first story, a drug company marketed a
suppository that was later found to be ineffective (and often dangerous) in treatment. Before marketing
the drug, the company determined that the drug was effective in treatment, which means the company
rejected a Null Hypothesis that the suppository had no effect on the disease. This is an example of Type I
error.

In the second story, research was abandoned when the testing showed Interferon was ineffective in
treating a lung disease. The company in this case failed to reject a Null Hypothesis that the drug was
ineffective. What if the drug really was effective? Did the company make Type II error? Possibly, but
since the drug was never marketed, we have no way of knowing the truth.

These stories highlight the problem of statistical research: errors can be analyzed using probability
models, but there is often no way of identifying specific errors. For example, there are unknown
innocent people in prison right now because a jury made Type I error in wrongfully convicting
defendants. We must be open to the possibility of modification or rejection of currently accepted
theories when new data is discovered.

In designing an experiment, we set a maximum probability of making Type I error. This probability is
called the level of significance or significance level of the test and designated by the Greek letter α.

The analysis of Type II error is more problematic as there are many possible values that would satisfy the
Alternative Hypothesis. For a specific value of the Alternative Hypothesis, the design probability of
making Type II error is called Beta (β), which will be analyzed in detail later in this section.

6.3.4 Critical Value and Rejection Region

Once the significance level of the test is chosen, it is then possible to find region(s) of the probability
distribution function of the test statistic that would allow the Null Hypothesis to be rejected. This is
called the Rejection Region, and the boundary between the Rejection Region and the “Fail to Reject”
region is called the Critical Value.

There can be more than one critical value and rejection region. What matters is that the total area of the
rejection region equals the significance level α.

[Figures: One-tailed Hypothesis Test; Two-tailed Hypothesis Test]

6.3.5 One and Two tailed Tests

A test is one-tailed when the Alternative Hypothesis, Ha , states a direction, such as:

H0: The mean income of females is less than or equal to the mean income of males.
Ha : The mean income of females is greater than the mean income of males.

Since equality is usually part of the Null Hypothesis, it is the Alternative Hypothesis which determines
which tail to test.

A test is two-tailed when no direction is specified in the Alternative Hypothesis Ha , such as:

H0 : The mean income of females is equal to the mean income of males.
Ha : The mean income of females is not equal to the mean income of males.

In a two-tailed test, the significance level is split into two parts since there are two rejection regions. In
hypothesis testing where the statistical model is symmetrical (e.g., the Standard Normal Z or Student’s t
distribution) these two regions would be equal. There is a relationship between a confidence interval
and a two-tailed test: If the level of confidence for a confidence interval is equal to 1-α, where α is the
significance level of the two-tailed test, the critical values would be the same.
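
To make the splitting of α concrete, the sketch below computes critical values for a Standard Normal test statistic. The function name `z_critical_values` and its `tail` argument are illustrative choices, not notation from the text.

```python
from statistics import NormalDist

def z_critical_values(alpha, tail):
    """Critical value(s) of a Z test for a 'left', 'right', or 'two' tailed test."""
    z = NormalDist()
    if tail == "right":
        return (z.inv_cdf(1 - alpha),)       # all of alpha in the upper tail
    if tail == "left":
        return (z.inv_cdf(alpha),)           # all of alpha in the lower tail
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2)     # alpha / 2 in each tail
        return (-upper, upper)
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(z_critical_values(0.05, "right"))
print(z_critical_values(0.05, "two"))
```

With α = 0.05 the two-tailed critical values are about ±1.96, the same multiplier used for a 95% confidence interval, while the one-tailed critical value is about 1.645.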

Here are some examples for testing the mean µ against a hypothesized value µ0:

Ha: µ>µ0 means test the upper tail and is also called a right-tailed test.
Ha: µ<µ0 means test the lower tail and is also called a left-tailed test.
Ha: µ≠µ0 means test both tails.

Deciding when to conduct a one-tailed or two-tailed test is often controversial, and many authorities even
go so far as to say that only two-tailed tests should be conducted. Ultimately, the decision depends on the
wording of the problem. If we want to show that a new diet reduces weight, we would conduct a
lower-tailed test since we don’t care if the diet causes weight gain. If instead we wanted to determine
whether the mean crime rate in California was different from the mean crime rate in the United States,
we would run a two-tailed test, since “different” means either greater than or less than.

6.4 Collect and Analyze Experimental Data

After designing the experiment, the next procedure would be to actually collect and verify the data. For
the purposes of statistical analysis, we will assume that all sampling is either random, or uses an
alternative technique that adequately simulates a random sample.

6.4.1 Data Verification

After collecting the data but before running the test, we need to verify the data. First, get a picture of
the data by making a graph (histogram, dot plot, box plot, etc.). Check for skewness, shape, and any
potential outliers in the data.

6.4.2 Working with Outliers

An outlier is a data point that is far removed from the other entries in the data set. Outliers could be
caused by:

• Mistakes made in recording data
• Data that don’t belong in the population
• True rare events

The first two cases are simple to deal with as we can correct errors or remove data that do not
belong in the population. The third case is more problematic as extreme outliers will increase the
standard deviation dramatically and heavily skew the data.

In The Black Swan, Nassim Nicholas Taleb argues that some populations with extreme outliers should not be
analyzed with traditional confidence intervals and hypothesis testing. 9 He defines a Black Swan to be an
unpredictable extreme outlier that causes dramatic effects on the population. A recent example of a
Black Swan was the catastrophic drop in the value of unregulated Credit Default Swap (CDS) real estate
insurance investments, which caused the near collapse of the international banking system in 2008. The
traditional statistical analysis that measured the risk of the CDS investments did not take into account
the consequence of a rapid increase in the number of foreclosures of homes. In this case, statistics that
measure investment performance and risk were useless and created a false sense of security for large
banks and insurance companies.

Example

Here are the quarterly home sales for 10 realtors:

2 2 3 4 5 5 6 6 7 50

                        With Outlier    Without Outlier
Mean                    9.00            4.44
Median                  5.00            5.00
Standard Deviation      14.51           1.81
Interquartile Range     3.00            3.50

In this example, the number 50 is an outlier. When calculating summary statistics, we can see that the
mean and standard deviation are dramatically affected by the outlier, while the median and the
interquartile range (which are based on the ranking of the data) are hardly changed. One solution when
dealing with a population with extreme outliers is to use inferential statistics using the ranks of the data,
also called non-parametric statistics.
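
The effect of the outlier on the mean, median, and standard deviation can be reproduced with Python's `statistics` module, as sketched below. The interquartile range is left out of this sketch because quartile conventions differ between software packages; the variable names are illustrative.

```python
from statistics import mean, median, stdev

sales = [2, 2, 3, 4, 5, 5, 6, 6, 7, 50]
without_outlier = [x for x in sales if x != 50]

for label, data in (("With outlier", sales), ("Without outlier", without_outlier)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{label}: mean={mean(data):.2f}, median={median(data):.2f}, sd={stdev(data):.2f}")
```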

Using Box Plot to find outliers

• The “box” is the region between the 1st and 3rd quartiles.
• Possible outliers are more than 1.5 IQRs from the box (inner fence).
• Probable outliers are more than 3 IQRs from the box (outer fence).
• In the box plot below of the realtor example, the dotted lines represent the “fences” that are
1.5 and 3 IQRs from the box. See how the data point 50 is well outside the outer fence and
therefore an almost certain outlier.
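
The fence rule can be sketched in Python as follows. The quartile convention here is an assumption: Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data (excluding the middle value when n is odd), which reproduces the interquartile ranges of 3.00 and 3.50 reported for the realtor data. Other software may use a different convention and give slightly different fences. The function names are illustrative.

```python
from statistics import median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    ordered = sorted(data)
    half = len(ordered) // 2            # the middle value is excluded when n is odd
    return median(ordered[:half]), median(ordered[-half:])

def classify_outliers(data):
    """Return the IQR, both fences, and the points beyond each fence."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    possible = [x for x in data if not inner[0] <= x <= inner[1]]
    probable = [x for x in data if not outer[0] <= x <= outer[1]]
    return iqr, inner, outer, possible, probable

sales = [2, 2, 3, 4, 5, 5, 6, 6, 7, 50]
iqr, inner, outer, possible, probable = classify_outliers(sales)
print("IQR:", iqr, "inner fence:", inner, "outer fence:", outer)
print("possible outliers:", possible, "probable outliers:", probable)
```

For the realtor data the upper outer fence is 15, so the value 50 lies far beyond it.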

6.4.3 The Logic of Hypothesis Testing

After the data is verified, we want to conduct the hypothesis test and come up with a decision, whether
or not to reject the Null Hypothesis. The decision process is similar to a “proof by contradiction” used in
mathematics:

• We assume Ho is true before observing data and design Ha to be the complement of Ho.
• Observe the data (evidence). How unusual are these data under Ho?
• If the data are too unusual, we have “proven” Ho is false: Reject Ho and support Ha (strong
statement).
• If the data are not too unusual, we fail to reject Ho. This “proves” nothing and we say the data are
inconclusive (weak statement).
• We can never “prove” Ho , only “disprove” it.
• “Prove” in statistics means support with (1-α)100% certainty. (Example: if α=.05, then we are at
least 95% confident in our decision to reject Ho.)

6.4.4 Decision Rule – Two methods, Same Decision

Earlier we introduced the idea of a test statistic, which is a value calculated from the data under the
appropriate Statistical Model that can be compared to the critical value of the Hypothesis
test. If the test statistic falls in the rejection region of the statistical model, we reject the Null
Hypothesis.

Recall that the critical value was determined by design based on the chosen level of significance α. The
preferred method of making decisions is to calculate the probability of getting a result at least as extreme
as the value of the test statistic. This probability is called the p-value, and can be compared directly to
the significance level.

• p-value: the probability, assuming that the null hypothesis is true, of getting a value of the test
statistic at least as extreme as the computed value for the test.
• If the p-value is smaller than the significance level α, H0 is rejected.
• If the p-value is larger than the significance level α, H0 is not rejected.

Comparing p-value to α

Both the p-value and α are probabilities of getting results as extreme as the data assuming Ho is true.

The p-value is determined by the data and is related to the actual probability of making Type I error
(rejecting a true Null Hypothesis). The smaller the p-value, the smaller the chance of making Type I
error and, therefore, the more likely we are to reject the Null Hypothesis.

The significance level α is determined by design and is the maximum probability we are willing to accept
of rejecting a true H0.

Two Decision Rules lead to the same decision.

1. If the test statistic lies in the rejection region, reject Ho. (critical value method)
2. If the p-value < α, reject Ho. (p-value method)

This p-value method of comparison is preferred to the critical value method because the rule is the
same for all statistical models: Reject Ho if p-value < α.

Let’s see why these two rules are equivalent by analyzing a test of mean vs. hypothesized value.

Decision is Reject Ho
• Ho: µ = 10
Ha: µ > 10
• Design: Critical value is determined by
significance level α.
• Data Analysis: p-value is determined by
test statistic
• Test statistic falls in rejection region.
• p-value (blue) < α (purple)
• Reject Ho.
• Strong statement: Data supports the
Alternative Hypothesis.

In this example, the test statistic lies in the rejection region (the area to the right of the critical value).
The p-value (the area to the right of the test statistic) is less than the significance level (the area to the
right of the critical value). The decision is Reject Ho.

Decision is Fail to Reject Ho


• Ho: µ = 10
Ha: µ > 10
• Design: critical value is determined by
significance level α.
• Data Analysis: p-value is determined by
test statistic
• Test statistic does not fall in the rejection region.
• p-value (blue) > α (purple)
• Fail to Reject Ho.
• Weak statement: Data is inconclusive and does
not support the Alternative Hypothesis.

In this example, the Test Statistic does not lie in the Rejection Region. The p-value (the area to the right
of the test statistic) is greater than the significance level (the area to the right of the critical value). The
decision is Fail to Reject Ho.
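
The equivalence of the two decision rules for a right-tailed test can be sketched in Python. This example assumes a Standard Normal (Z) test statistic; the function name and the two sample test statistics are hypothetical and are not values from the text.

```python
from statistics import NormalDist

def right_tailed_decisions(z_stat, alpha):
    """Apply both decision rules to a right-tailed Z test; True means reject Ho."""
    z = NormalDist()
    critical_value = z.inv_cdf(1 - alpha)    # area alpha lies to the right of this value
    p_value = 1 - z.cdf(z_stat)              # area to the right of the test statistic
    reject_by_critical_value = z_stat > critical_value
    reject_by_p_value = p_value < alpha
    return reject_by_critical_value, reject_by_p_value

for z_stat in (2.10, 1.20):                  # hypothetical test statistics
    print(z_stat, right_tailed_decisions(z_stat, alpha=0.05))
```

Both rules agree because a test statistic to the right of the critical value necessarily leaves less than α of the area to its right.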

6.5 Report Conclusions in Non-statistical Language

The hypothesis test has been conducted and we have reached a decision. We must now communicate
these conclusions so they are complete, accurate, and understood by the targeted audience. How a
conclusion is written is open to subjective analysis, but here are a few suggestions:

6.5.1 Be consistent with the results of the Hypothesis Test.

Rejecting Ho requires a strong statement in support of Ha, while failing to reject Ho does NOT support
Ho, but requires a weak statement of insufficient evidence to support Ha.

Example: A researcher wants to support the claim that, on average, students send more than 1000 text
messages per month and the research hypotheses are Ho: µ=1000 vs. Ha: µ>1000

Conclusion if Ho is rejected: The mean number of text messages sent by students exceeds 1000.

Conclusion if Ho is not rejected: There is insufficient evidence to support the claim that the mean
number of text messages sent by students exceeds 1000.

6.5.2 Use language that is clearly understood in the context of the problem.

Do not use technical language or jargon, but instead refer back to the language of the original general
question or research hypotheses. Saying less is better than saying more.

Example: A test supported the Alternative Hypothesis that housing prices and size of homes in square feet
were positively correlated. Compare these two conclusions and decide which is clearer:

• Conclusion 1: By rejecting the Null Hypothesis we are inferring that the Alternative Hypothesis
is supported and that there exists a significant correlation between the independent and
dependent variables in the original problem comparing home prices to square footage.
• Conclusion 2: Homes with more square footage generally have higher prices.

[Figure: scatter plot “Housing Prices and Square Footage” (Price vs. Size)]

6.5.3 Limit the inference to the population that was sampled.

Care must be taken to describe the population being sampled and understand that any claim is
limited to this sampled population. If a survey was taken of a subgroup of a population, then the
inference applies only to the subgroup.

For example, studies by pharmaceutical companies will only test adult patients, making it difficult to
determine effective dosage and side effects for children. “In the absence of data, doctors use their
medical judgment to decide on a particular drug and dose for children. ‘Some doctors stay away from
drugs, which could deny needed treatment,’ Blumer says. ‘Generally, we take our best guess based on
what's been done before.’ The antibiotic chloramphenicol was widely used in adults to treat infections
resistant to penicillin. But many newborn babies died after receiving the drug because their immature
livers couldn't break down the antibiotic.” 10 We can see in this example that applying inference of the
drug testing results on adults to the un-sampled children led to tragic results.

6.5.4 Report sampling methods that could question the integrity of the random sample assumption.

In practice it is nearly impossible to choose a random sample, and scientific sampling techniques that
attempt to simulate a random sample need to be checked for bias caused by under-sampling.

Telephone polling was found to under-sample young people during the 2008 presidential campaign
because of the increase in cell phone only households. Since young people were more likely to favor
Obama, this caused bias in the polling numbers. Additionally, caller ID has dramatically reduced the
percentage of successful connections with people being surveyed. The pollster Jay Leve of SurveyUSA
said telephone polling was “doomed” and said his company was already developing new methods for
polling. 11

Sampling that didn’t occur over the weekend may exclude many full time workers while self-selected
and unverified polls (like ratemyprofessors.com) could contain immeasurable bias.

6.5.5 Conclusions should address the potential or necessity of further research, sending the process
back to the first procedure.

Answers often lead to new questions. If changes are recommended in a researcher’s conclusion, then
further research is usually needed to analyze the impact and effectiveness of the implemented changes.
There may have been limitations in the original research project (such as funding resources, sampling
techniques, unavailability of data) that warrant a more comprehensive study.

For example, a math department modifies its curriculum based on performance statistics for an
experimental course. The department would want to do further study of student outcomes to assess the
effectiveness of the new program.

6.6 Test of Mean vs. Hypothesized Value – A Complete Example

A food company has a policy that the stated contents of a product match the actual results. A General
Question might be “Does the stated net weight of a food product match the actual weight?” The
quality control statistician decides to test the 16 ounce bottle of Soy Sauce and must now design the
experiment.

The quality control statistician has been given the authority to sample 36 bottles of soy sauce and knows
from past testing that the population standard deviation is 0.5 ounces. The model will be a test of
population mean vs. hypothesized value of 16 oz. A two-tailed test is selected since the company is
concerned about both overfilling and underfilling the bottles; the stated policy is that the stated weight
match the actual weight of the product.

Research Hypotheses: Ho: µ=16 (The filling machine is operating properly)
Ha: µ ≠16 (The filling machine is not operating properly)

Since the population standard deviation is known, the test statistic will be

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$

This model is appropriate since the sample size assures the distribution of the sample mean is
approximately Normal from the Central Limit Theorem.

Type I error would be to reject the Null Hypothesis and say the machine is not running properly when in
fact it was operating properly. Since the company does not want to needlessly stop production and
recalibrate the machine, the statistician chooses to limit the probability of Type I error by setting the
level of significance (α) to 5%.

The statistician now conducts the experiment and samples 36 bottles in the last hour and determines
from a box plot of the data that there is one unusual observation of 17.56 ounces. The value is
rechecked and kept in the data set.

Next, the sample mean and the test statistic are calculated.

$$\bar{X} = 16.12 \text{ ounces} \qquad Z = \frac{16.12 - 16}{0.5/\sqrt{36}} = 1.44$$

The decision rule under the critical value method would be to reject the Null Hypothesis when the
value of the test statistic is in the rejection region. In other words, reject Ho when Z > 1.96 or Z < -1.96.

Based on this result, the decision is fail to reject Ho since the test statistic does not fall in the
rejection region.

Alternatively (and preferably) the statistician would use the p-value method of decision rule. The p-value
for a two-tailed test must include all values (positive and negative) more extreme than the Test Statistic,
so in this example we find the probability that Z < -1.44 or Z > 1.44 (the area shaded blue).

Using a calculator, computer software or a Standard Normal table, the p-value=0.1498. Since the p-
value is greater than α, the decision again is fail to reject Ho.
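
The soy sauce test can be worked end to end in a few lines of Python, as sketched below with the values from the example. The variable names are illustrative. Software gives a p-value of about 0.1499, which differs from the table-based 0.1498 only in the fourth decimal place because the table rounds the Normal probability.

```python
from math import sqrt
from statistics import NormalDist

x_bar, mu_0, sigma, n, alpha = 16.12, 16, 0.5, 36, 0.05

z_stat = (x_bar - mu_0) / (sigma / sqrt(n))           # test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)          # two-tailed critical value
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))     # area in both tails

reject_by_critical_value = abs(z_stat) > z_crit
reject_by_p_value = p_value < alpha

print(f"Z = {z_stat:.2f}, critical values = ±{z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject Ho" if reject_by_p_value else "Fail to reject Ho")
```

Both decision rules give the same answer here: the test statistic is inside ±1.96 and the p-value is larger than 0.05.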

Finally the statistician must report the conclusions and make a recommendation to the company’s
management:

“There is insufficient evidence to conclude that the machine that fills 16 ounce soy sauce bottles is
operating improperly. This conclusion is based on 36 measurements taken during a single hour’s
production run. I recommend continued monitoring of the machine during different employee shifts to
account for the possibility of potential human error.”

The statistician makes the weak statement and is not stating that the machine is running properly, only
that there is not enough evidence to state that the machine is running improperly. The statistician is
also reporting concerns about the sampling of only one shift of employees (restricting the inference to
the sampled population) and recommends repeating the experiment over several shifts.

6.7 Type II Error and Statistical Power

In the prior example, the statistician failed to reject the Null Hypothesis because the p-value
(0.1498) exceeded the significance level of 5%. However, the statistician could have made Type II
error if the machine is really operating improperly. One of the important and often overlooked tasks
is to analyze the probability of making Type II error (β). Usually statisticians look at statistical
power, which is the complement of β.

Beta (β): The probability of failing to reject the null hypothesis when it is actually false.
Power (or Statistical Power): The probability of rejecting the null hypothesis when it is actually false.

Both beta and power are calculated for specific possible values of the Alternative Hypothesis.

If a hypothesis test has low power, then it would be difficult to reject Ho, even if Ho were false; the research
would be a waste of time and money. However, analyzing power is difficult in that there are many
values of the population parameter that support Ha. For example, in the soy sauce bottling example, the
Alternative Hypothesis was that the mean was not 16 ounces. This means the machine could be filling
the bottles with a mean of 16.0001 ounces, making Ha technically true. So when analyzing power and
Type II error we need to choose a value for the population mean under the Alternative Hypothesis (µa)
that is “practically different” from the mean under the Null Hypothesis (µo). This practical difference is
called the effect size.

µo: The value of the population mean under the Null Hypothesis
µa: The value of the population mean under the Alternative Hypothesis
Effect Size: The “practical difference” between µo and µa = | 𝜇𝑜 − 𝜇𝑎 |

Suppose we are conducting a one-tailed test of the population mean:

Ho: µ = µ0 Ha: µ > µ0

Consider the two graphs shown to the right. The top graph is the distribution of the sample mean
under the Null Hypothesis that we covered in an earlier section. The area to the right of the
critical value is the rejection region.

We now add the bottom graph which represents the distribution of the sample mean under the
Alternative Hypothesis for the specific value µa.

We can now measure the Power of the test (the area in green) and beta (the area in purple) on the
lower graph.

There are several methods of increasing Power, but they all have trade-offs:

Example

Bus brake pads are claimed to last on average at least 60,000 miles and the company wants to test this
claim. The bus company considers a “practical” value for purposes of bus safety to be that the pads last
at least 58,000 miles. If the standard deviation is 5,000 and the sample size is 50, find the power of the
test when the mean is really 58,000 miles. (Assume α = .05)

First, find the critical value of the test.

Reject Ho when Z < -1.645

Next, find the value of $\bar{X}$ that corresponds to the critical value.

$$\bar{X} = \mu_o + Z\frac{\sigma}{\sqrt{n}} = 60000 - (1.645)(5000)/\sqrt{50} = 58837$$

Ho is rejected when $\bar{X} < 58837$

Finally, find the probability of rejecting Ho if Ha is true.

$$P(\bar{X} < 58837) = P\left(Z < \frac{58837 - \mu_a}{\sigma/\sqrt{n}}\right) = P\left(Z < \frac{58837 - 58000}{5000/\sqrt{50}}\right) = P(Z < 1.18) = .8810$$

Therefore, this test has 88% power and β would be 12%

Power Calculation Values

Input Values
𝜇𝑜 = 60,000 miles
𝜇𝑎 = 58,000 miles
𝛼 = 0.05
𝑛 = 50
𝜎 = 5000 miles

Calculated Values
Effect Size = 2000 miles
Critical Value = 58,837 miles
𝛽 = 0.1190 or about 12%
Power = 0.8810 or about 88%
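The three steps of the power calculation (critical Z, critical sample mean, probability under µa) can be sketched in Python. This is an illustrative sketch, not the method used to produce the table above; the function name `power_lower_tailed_z` is hypothetical and only the standard library is used.

```python
from math import sqrt
from statistics import NormalDist


def power_lower_tailed_z(mu_0, mu_a, sigma, n, alpha):
    """Power of a lower-tailed Z test of Ho: mu >= mu_0 when the true mean is mu_a."""
    std_error = sigma / sqrt(n)
    # Step 1: critical Z value (negative for a lower-tailed test).
    z_crit = NormalDist().inv_cdf(alpha)
    # Step 2: sample mean that corresponds to the critical value.
    x_crit = mu_0 + z_crit * std_error
    # Step 3: probability of landing in the rejection region when mu = mu_a.
    power = NormalDist(mu=mu_a, sigma=std_error).cdf(x_crit)
    return x_crit, power


x_crit, power = power_lower_tailed_z(mu_0=60000, mu_a=58000, sigma=5000, n=50, alpha=0.05)
beta = 1 - power

print(f"Critical value = {x_crit:.0f} miles")
print(f"Power = {power:.2f}, beta = {beta:.2f}")
```

Because the script does not round Z to two decimal places, its power differs from the hand calculation in the third decimal place, but both round to about 88%.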

6.8 New Models for One Population Inference, Similar Procedures

The procedures outlined for the test of population mean vs. hypothesized value with known population
standard deviation will apply to other models as well. All that really changes is the test statistic.

Examples of some other one population models:

• Test of population mean vs. hypothesized value, population standard deviation unknown.
• Test of population proportion vs. hypothesized value.
• Test of population standard deviation (or variance) vs. hypothesized value.

6.8.1 Test of population mean with unknown population standard deviation

The test statistic for the one sample case changes to a Student’s t distribution with degrees of freedom
equal to n-1:

$$t = \frac{\bar{X} - \mu_o}{s/\sqrt{n}}$$

The shape of the t distribution is similar to the Z, except the tails are fatter, so the logic of the decision
rule is the same as the Z test statistic.

Example

Humerus bones from the same species have approximately the same length-to-width ratios. When
fossils of humerus bones are discovered, archaeologists can determine the species by examining this
ratio. It is known that Species A has a mean ratio of 9.6. A similar Species B has a mean ratio of 9.1
and is often confused with Species A. 21 humerus bones were unearthed in an area that was originally
thought to be inhabited by Species A. (Assume all unearthed bones are from the same species.)

1. Design a hypothesis test where the alternative claim would be that the humerus bones were not from
Species A.

Research Hypotheses
Ho: µ = 9.6 (The humerus bones are from Species A)
Ha: µ ≠ 9.6 (The humerus bones are not from Species A)

Significance level: α =.05

Test Statistic (Model): t-test of mean vs. hypothesized value, unknown standard deviation
Model Assumptions: we may need to check the data for extreme skewness as the distribution of the
sample mean is assumed to be approximately the Normal Distribution.

2. Determine the power of this test if the bones actually came from Species B (assume a standard
deviation of 0.7)

Information needed for Power Calculation
• µo = 9.6 (Species A)
• µa = 9.1 (Species B)
• Effect Size = | µo − µa | = 0.5
• s = 0.7 (given)
• α = .05
• n = 21 (sample size)
• Two tailed test

Results using Online Power Calculator 12
• Power = .8755
• β = 1 − Power = .1245
• If humerus bones are from Species B, the test has an 87.55% chance of correctly rejecting Ho and a
maximum Type II error of 12.45%

3. Conduct the test at a 5% significance level and state overall conclusions.

From MegaStat 13, p-value = .0308 and α = .05. Since p-value < α, Ho is rejected and we support Ha.

Conclusion: The evidence supports the claim (p-value<.05) that the humerus bones are not from Species
A. The small sample size limited the power of the test, which prevented us from making a more
definitive conclusion. Recommend testing to see if bones are from Species B or other unknown species.
We are assuming since the bones were unearthed in the same location, they came from the same
species.

6.8.2 Test of population proportion vs. hypothesized value.

When our data is categorical and there are only two possible choices (for example a yes/no question on
a poll), we may want to make a claim about a proportion or a percentage of the population (𝑝) being
compared to a particular value (𝑝𝑜 ). We will then use the sample proportion (𝑝̂ )to test the claim.

Test of proportion vs. hypothesized value

$p$ = population proportion
$p_o$ = population proportion under Ho
$\hat{p}$ = sample proportion
$p_a$ = population proportion under Ha

Test Statistic:

$$Z = \frac{\hat{p} - p_o}{\sqrt{\frac{p_o(1 - p_o)}{n}}}$$

Requirement for Normality Assumption: $np(1 - p) > 5$

Example

In the past, 15% of the mail order solicitations for a certain charity resulted in a financial
contribution. A new solicitation letter has been drafted and will be sent to a random sample of
potential donors. A hypothesis test will be run to determine if the new letter is more effective.
Determine the sample size so that (1) the test will be run at the 5% significance level and (2) if the
letter has an 18% success rate (an effect size of 3%), the power of the test will be 95%. After
determining the sample size, conduct the test.

• Ho: p ≤ 0.15 (The new letter is not more effective.)
• Ha: p > 0.15 (The new letter is more effective.)
• Test Statistic – Z-test of proportion vs. hypothesized value.

Information needed for Sample Size Calculation
• po = 0.15 (current letter)
• pa = 0.18 (potential new letter)
• Effect Size = | pa − po | = 0.03
• Desired Power = 0.95
• α = .05
• One tailed test

Results using online Power Calculator and Megastat
• Sample size = 1652
• The charity sent out 1652 new solicitation letters to potential donors and ran the test, receiving
286 positive responses.
• p-value for test = 0.0042



Since p-value < α, reject Ho and support Ha. Since the p-value is actually less than 0.01, we would go
further and say that the data supports rejecting Ho for α = .01.
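The sample size and p-value in this example can be approximated with a short script. This is a sketch under stated assumptions: the sample-size formula is the common Normal-approximation formula for a one-tailed test of a proportion, which may not be what the online calculator actually uses, and both function names are hypothetical. Only the standard library is needed.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sample_size_one_tailed_proportion(p_0, p_a, alpha, power):
    """Normal-approximation sample size for a one-tailed test of a proportion (assumed formula)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(power)
    numerator = z_alpha * sqrt(p_0 * (1 - p_0)) + z_beta * sqrt(p_a * (1 - p_a))
    return ceil((numerator / abs(p_a - p_0)) ** 2)


def upper_tailed_proportion_test(successes, n, p_0):
    """Return (z, p_value) for Ho: p <= p_0 vs Ha: p > p_0."""
    p_hat = successes / n
    z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
    return z, 1 - STD_NORMAL.cdf(z)


n_required = sample_size_one_tailed_proportion(p_0=0.15, p_a=0.18, alpha=0.05, power=0.95)
z, p_value = upper_tailed_proportion_test(successes=286, n=1652, p_0=0.15)

print(f"Required sample size: {n_required}")
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
```

With 286 positive responses out of 1652 letters, the sample proportion is about 17.3%, which is what drives the small p-value.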

Conclusion: The evidence supports the claim that the new letter is more effective. The 1652 test letters
were selected as a random sample from the charity’s mailing list. All letters were sent in the same time
period. The letters needed to be sent in a specific time period, so we were not able to control for
seasonal or economic factors. We recommend testing both solicitation methods over the entire year to
eliminate seasonal effects and to create a control group.

6.8.3 Test of population standard deviation (or variance) vs. hypothesized value.

We often want to make a claim about the variability, volatility or consistency of a population random
variable. Hypothesized values for the population variance σ2 or standard deviation σ are tested with the
Chi-square (χ2) distribution.

Examples of Hypotheses:

• Ho: σ = 10 Ha: σ ≠ 10
• Ho: σ2 = 100 Ha: σ2 > 100

The sample variance s2 is used in calculating the Chi-square Test Statistic.

Test of variance vs. hypothesized value

$\sigma^2$ = population variance
$\sigma_o^2$ = population variance under Ho
$s^2$ = sample variance

Test Statistic:

$$\chi^2 = \frac{(n - 1)s^2}{\sigma_o^2} \qquad n - 1 = \text{degrees of freedom}$$

Example

A state school administrator claims that the standard deviation of test scores for 8th grade students
who took a life-science assessment test is less than 30, meaning the results for the class show
consistency. An auditor wants to support that claim by analyzing 41 students' recent test scores. The
test will be run at 1% significance level.

Design:

Research Hypotheses:

• Ho: Standard deviation for test scores equals 30.
• Ha: Standard deviation for test scores is less than 30.

Hypotheses In terms of the population variance:

• Ho: σ2 = 900
• Ha: σ2 < 900

Results:

Decision: Reject Ho

Conclusion:

The evidence supports the claim (p-value<.01) that the standard deviation for 8th grade test scores is less
than 30. The 41 test scores were the results of the recently administered exam to the 8th grade students.
Since the exams were for the current class only, there is no assurance that future classes will achieve
similar results. Further research would be to compare results to other schools that administered the
same exam and to continue to analyze future class exams to see if the claim is holding true.

7. Two Population Inference

In this section we consider expanding the concepts from the prior section to design and conduct
hypothesis testing with two samples. Although the logic of hypothesis testing will remain the same, care
must be taken to choose the correct model. We will first consider comparing two population means.

7.1 Independent vs. dependent sampling

In designing a two population test of means, first determine whether the experiment involves data that
is collected by independent or dependent sampling.

7.1.1 Independent sampling

The data is collected by two simple random samples from separate and unrelated populations. This data
will then be used to compare the two population means. This is typical of an experimental or treatment
population versus a control population.

INDEPENDENT SAMPLING

• $n_1$ is the sample size from population 1.
• $n_2$ is the sample size from population 2.
• $\bar{X}_1$ is the sample mean from population 1.
• $\bar{X}_2$ is the sample mean from population 2.
• $s_1$ is the sample standard deviation from population 1.
• $s_2$ is the sample standard deviation from population 2.

Example

A community college mathematics department wants to know if an experimental algebra course has
higher success rates when compared to a traditional course. The mean grade points for 80 students in
the experimental course (treatment) is compared to the mean grade points for 100 students in the
traditional course (control).

7.1.2 Dependent sampling

The data consists of a single population and two measurements. A simple random sample is taken from
the population and pairs of measurements are collected. This is also called related sampling or matched
pair design. Dependent sampling actually reduces to a one population model of differences.

DEPENDENT SAMPLING

• $n$ is the sample size from the population, the number of pairs.
• $\bar{X}_d$ is the sample mean of the differences of each pair.
• $s_d$ is the sample standard deviation of the differences of each pair.

Example

An instructor of a statistics course wants to know if student scores are different on the second midterm
compared to the first exam. The first and second midterm scores for 35 students are taken and the mean
difference in scores is determined.

7.2 Independent sampling models

We will first consider the case when we want to compare the population means of two populations
using independent sampling.

7.2.1 Distribution of the difference of two sample means

Suppose we wanted to test the hypothesis Ho: $\mu_1 = \mu_2$. We have point estimators for both $\mu_1$
and $\mu_2$, namely $\bar{X}_1$ and $\bar{X}_2$, which have approximately Normal Distributions under the
Central Limit Theorem, but it would be useful to combine them both into a single estimator. Fortunately
it is known that if two independent random variables have a Normal Distribution, then so do their sum
and difference. Therefore we can restate the hypothesis as Ho: $\mu_1 - \mu_2 = 0$ and use the difference
of sample means $\bar{X}_1 - \bar{X}_2$ as a point estimator for the difference in population means
$\mu_1 - \mu_2$.

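Distribution of $\bar{X}_1 - \bar{X}_2$ under the Central Limit Theorem

$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2 \qquad \sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \quad \text{if } n_1 \text{ and } n_2 \text{ are sufficiently large.}$$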

7.2.2 Comparing two means, independent sampling: Model when population variances known

When the population variances are known, the hypothesis Ho: $\mu_1 = \mu_2$ can be tested with the
Normal distribution Z test statistic shown above. Also, if both sample sizes n1 and n2 exceed 30, this
model can also be used.

Example

Are larger homes more likely to have pools? The square footage (size) data for single family homes in
California was separated into two populations: Homes with pools and homes without pools. We have data
from 130 homes with pools and 95 homes without pools.

Example - Design
Research Hypotheses: Ho: µ1≤µ2 (Homes with pools do not have more mean square footage)
Ha: µ1>µ2 (Homes with pools do have more mean square footage)

Since both sample sizes are over 30, the model will be a Large sample Z test comparing two population
means with independent sampling. This model is appropriate since the sample sizes assure that the
distribution of the sample mean is approximately Normal from the Central Limit Theorem. A one-tailed
test is selected since we want to support the claim that homes with pools are larger. The test statistic
will be

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$

Type I error would be to reject the Null Hypothesis and claim homes with pools are larger, when they are
not larger. It was decided to limit this error by setting the level of significance (α) to 1%.

The decision rule under the critical value method would be to reject the Null Hypothesis when the value
of the test statistic is in the rejection region. In other words, reject Ho when Z > 2.326. The decision
under the p-value method is to reject Ho if the p-value is < α.

Example - Data/Results

Since the test statistic (Z = 4.19) is greater than the critical value (2.326), Ho is rejected. Also,
since the p-value (0.000013) is less than α (0.01), the decision is Reject Ho.

Example - Conclusion

The researcher makes the strong statement that homes with pools have a significantly higher mean
square footage than homes without pools.

7.2.3 Model when population variances unknown, but assumed to be equal

In the case when the population standard deviations are unknown, it seems logical to simply replace the
population standard deviations for each population with the sample standard deviations and use a t-
distribution as we did for the one population case. However, this is not so simple when the sample size
for either group is under 30.

We will consider two models. The first model (which we prefer to use since it has higher power)
assumes the population variances are equal and is called the pooled variance t-test. In this model we
combine or “pool” the two sample standard deviations into a single estimate called the pooled standard
deviation, sp. If the central limit theorem is working, we can then substitute sp for s1 and s2 and get
a t-distribution with n1 + n2 - 2 degrees of freedom:

Pooled variance t-test to compare the means for two independent populations

Model Assumptions
• Independent Sampling
• $\bar{X}_1 - \bar{X}_2$ approximately Normal
• $\sigma_1^2 = \sigma_2^2$

Test Statistic

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \qquad s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Degrees of freedom = $n_1 + n_2 - 2$

Example

A recent EPA study compared the highway fuel economy of domestic and imported passenger cars. A
sample of 15 domestic cars revealed a mean of 33.7 MPG (miles per gallon) with a standard deviation of
2.4 mpg. A sample of 12 imported cars revealed a mean of 35.7 mpg with a standard deviation of 3.9. At
the .05 significance level can the EPA conclude that the MPG is higher on the imported cars?

Example - Design
It is best to associate the subscript 2 with the control group; in this case we will let domestic cars
be population 2.
Research Hypotheses: Ho: µ1≤µ2 (Imported compact cars do not have a higher mean MPG)
Ha: µ1>µ2 (Imported compact cars have a higher mean MPG)

We will assume the population variances are equal ($\sigma_1^2 = \sigma_2^2$), so the model will be a
Pooled variance t-test. This model is appropriate if the distribution of the differences of sample means
is approximately Normal from the Central Limit Theorem. A one-tailed test is selected based on Ha.

Type I error would be to reject the Null Hypothesis and claim imports have a higher mean MPG, when
they do not have higher MPG. The test will be run at a level of significance (α) of 5%.

The degrees of freedom for this test are 25, so the decision rule under the critical value method would
be to reject Ho when t > 1.708. The decision under the p-value method is to reject Ho if the p-value is < α.

Example - Data/Results

$$s_p = \sqrt{\frac{(12 - 1)3.86^2 + (15 - 1)2.16^2}{15 + 12 - 2}} = 3.03 \qquad t = \frac{(35.76 - 33.59) - 0}{3.03\sqrt{\frac{1}{12} + \frac{1}{15}}} = 1.85$$

Since 1.85 > 1.708, the decision would be to Reject Ho. Also the p-value is calculated to be .0381 which
again shows that the result is significant at the 5% level.
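The pooled standard deviation and test statistic can be reproduced from the summary statistics used in the Data/Results calculation. This is a minimal sketch with a hypothetical function name; the critical value 1.708 is taken from the text rather than computed, since the standard library has no t distribution.

```python
from math import sqrt


def pooled_t_statistic(mean_1, s_1, n_1, mean_2, s_2, n_2):
    """Return (s_p, t, df) for the pooled variance t-test of Ho: mu_1 = mu_2."""
    df = n_1 + n_2 - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    s_p = sqrt(((n_1 - 1) * s_1**2 + (n_2 - 1) * s_2**2) / df)
    t = (mean_1 - mean_2) / (s_p * sqrt(1 / n_1 + 1 / n_2))
    return s_p, t, df


# Population 1 = imported cars, population 2 = domestic cars (values from Data/Results).
s_p, t, df = pooled_t_statistic(mean_1=35.76, s_1=3.86, n_1=12,
                                mean_2=33.59, s_2=2.16, n_2=15)
t_critical = 1.708  # one-tailed, alpha = .05, 25 df (from the text)

print(f"s_p = {s_p:.2f}, t = {t:.2f}, df = {df}")
print("Reject Ho" if t > t_critical else "Fail to reject Ho")
```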

Example - Conclusion

Imported compact cars have a significantly higher mean MPG rating when compared to domestic cars.

7.2.4 Model when population variances unknown, but assumed to be unequal

In the prior example, we assumed the population variances were equal. However, when looking at the
box plot of the data or the sample standard deviations, it appears that the import cars have more
variability in MPG than domestic cars, which would violate the assumption of equal variances required for
the Pooled Variance t-test.

Fortunately, there is an alternative model that has been developed for when population variances are
unequal, called the Behrens-Fisher model 14, or the unequal variances t-test.

Unequal variance t-test to compare the means for two independent populations

Model Assumptions
• Independent Sampling
• $\bar{X}_1 - \bar{X}_2$ approximately Normal
• $\sigma_1^2 \ne \sigma_2^2$

Test Statistic

$$t' = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \qquad df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$

The degrees of freedom will be less than or equal to $n_1 + n_2 - 2$, so this test will usually have less power
than the pooled variance t-test.

Example

We will repeat the prior example to see if we can support the claim that imported compact cars have
higher mean MPG when compared to domestic compact cars. This time we will assume that the
population variances are not equal.

Example - Design
Again we will let domestic cars be population 2.
Research Hypotheses: Ho: µ1≤µ2 (Imported compact cars do not have a higher mean MPG)
Ha: µ1>µ2 (Imported compact cars have a higher mean MPG)

We will assume the population variances are unequal ($\sigma_1^2 \ne \sigma_2^2$), so the model will be an unequal
variance t-test. This model is appropriate if the distribution of the differences of sample means is
approximately Normal from the Central Limit Theorem. A one-tailed test is selected based on Ha.

Type I error would be to reject the Null Hypothesis and claim imports have a higher mean MPG, when
they do not have higher MPG. The test will be run at a level of significance (α) of 5%.

The degrees of freedom for this test are 16 (see calculation below), so the decision rule under the
critical value method would be to reject Ho when t > 1.746. The decision under the p-value method is to
reject Ho if the p-value is < α.

Example - Data/Results
$$df = \frac{\left(\frac{2.16^2}{15} + \frac{3.86^2}{12}\right)^2}{\frac{(2.16^2/15)^2}{15 - 1} + \frac{(3.86^2/12)^2}{12 - 1}} = 16$$

$$t = \frac{(35.76 - 33.59) - 0}{\sqrt{\frac{2.16^2}{15} + \frac{3.86^2}{12}}} = 1.74$$

Since 1.74 < 1.746, the decision would be Fail to Reject Ho. Also the p-value is calculated to be .0504
which again shows that the result is not significant (barely) at the 5% level.
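The unequal variance test statistic and its degrees of freedom can be checked the same way. This is a sketch with a hypothetical function name; the degrees of freedom are truncated to a whole number to match the value of 16 used in the text, and the critical value 1.746 is taken from the text.

```python
from math import floor, sqrt


def unequal_variance_t_statistic(mean_1, s_1, n_1, mean_2, s_2, n_2):
    """Return (t, df) for the unequal variance t-test of Ho: mu_1 = mu_2."""
    v_1 = s_1**2 / n_1  # estimated variance of the first sample mean
    v_2 = s_2**2 / n_2  # estimated variance of the second sample mean
    t = (mean_1 - mean_2) / sqrt(v_1 + v_2)
    df = (v_1 + v_2) ** 2 / (v_1**2 / (n_1 - 1) + v_2**2 / (n_2 - 1))
    return t, df


# Population 1 = imported cars, population 2 = domestic cars (values from Data/Results).
t, df = unequal_variance_t_statistic(mean_1=35.76, s_1=3.86, n_1=12,
                                     mean_2=33.59, s_2=2.16, n_2=15)
df_whole = floor(df)  # round down to be conservative
t_critical = 1.746    # one-tailed, alpha = .05, 16 df (from the text)

print(f"t = {t:.2f}, df = {df:.2f} (use {df_whole})")
print("Reject Ho" if t > t_critical else "Fail to reject Ho")
```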

Example - Conclusion

There is insufficient evidence to claim imported compact cars have a significantly higher mean MPG
rating when compared to domestic cars.

You can see the lower power of this test when compared to the pooled variance t-test example where
Ho was rejected. We always prefer to run the test with higher power when appropriate.

7.3 Dependent sampling – matched pairs t-test

The independent models shown above compared samples that were not related. However, it is often
advantageous to have related samples that are paired up: two measurements from a single
population. The model we will consider here is called the matched pairs t-test also known as the paired
difference t-test. The advantage of this design is that we can eliminate variability due to other factors
not being studied, increasing the power of the design.

In this model we take the difference of each pair and create a new population of differences, so in effect,
the hypothesis test is a one population test of mean that we already covered in the prior section.

Matched pairs t-test to compare the means for two dependent populations

Model Assumptions
• Dependent Sampling
• $X_d = X_1 - X_2$
• $\bar{X}_d = \bar{X}_1 - \bar{X}_2$ approximately Normal

Test Statistic

$$t = \frac{\bar{X}_d - \mu_d}{s_d/\sqrt{n}} \qquad df = n - 1$$

Example

An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz
and Avis. A random sample of 15 cities is obtained and the following rental information obtained.

At the .05 significance level can the testing agency conclude that there is a difference in the rental
charged?

Notice in this example that cities are the single population being sampled and two measurements (Hertz
and Avis) are being taken from each city. Using the matched pair design, we can eliminate the variability
due to cities being differently priced (Honolulu is cheap because you can’t drive very far on Oahu!)

Example - Design

Research Hypotheses: Ho: µ1=µ2 (Hertz and Avis have the same mean price for compact cars.)
Ha: µ1≠µ2 (Hertz and Avis do not have the same mean price for compact cars.)

Model will be matched pair t-test and these hypotheses can be restated as: Ho: µd=0 Ha: µd≠0

The test will be run at a level of significance (α) of 5%.

Model is two tailed matched pairs t-test with 14 degrees of freedom. Reject Ho if t < -2.145 or t >2.145.

Example - Data/Results

We take the difference for each pair and find the sample mean and
standard deviation.
$$\bar{X}_d = 1.80 \qquad s_d = 2.513 \qquad n = 15$$

$$t = \frac{1.80 - 0}{2.513/\sqrt{15}} = 2.77$$

Reject Ho under either the critical value or p-value method.
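Because the matched pairs test reduces to a one population test on the differences, the statistic needs only the mean and standard deviation of the differences. This is a minimal sketch with a hypothetical function name, using the summary values above; the critical value 2.145 comes from the text.

```python
from math import sqrt


def matched_pairs_t_statistic(mean_diff, sd_diff, n, mu_d=0.0):
    """Return (t, df) for the matched pairs t-test of Ho: mu_d = 0."""
    t = (mean_diff - mu_d) / (sd_diff / sqrt(n))
    return t, n - 1


t, df = matched_pairs_t_statistic(mean_diff=1.80, sd_diff=2.513, n=15)
t_critical = 2.145  # two-tailed, alpha = .05, 14 df (from the text)

print(f"t = {t:.2f}, df = {df}")
print("Reject Ho" if abs(t) > t_critical else "Fail to reject Ho")
```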

Example – Conclusion

There is a difference in mean price for compact cars between Hertz and Avis. Avis has lower mean
prices.

The advantage of the matched pair design is clear in this example. The sample standard deviation for the
Hertz prices is $5.23 and for Avis it is $5.62. Much of this variability is due to the cities, and the matched
pairs design dramatically reduces the standard deviation to $2.51, meaning the matched pairs t-test has
significantly more power in this example.

7.4 Independent sampling – comparing two population variances or standard deviations

Sometimes we want to test if two populations have the same spread or variation, as measured by
variance or standard deviation. This may be a test on its own or a way of checking assumptions when
deciding between two different models (e.g.: pooled variance t-test vs. unequal variance t-test). We will
now explore testing for a difference in variance between two independent samples.

7.4.1 F distribution

The F distribution is a family of distributions related to the Normal Distribution. There are two different
degrees of freedom, usually represented as numerator (dfnum) and denominator (dfden). Also, since the F
represents squared data, the inference will be about the variance rather than the standard deviation.

Characteristics of F Distribution

• It is positively skewed
• It is non-negative
• There are 2 different degrees of freedom (dfnum, dfden)
• When the degrees of freedom change, a new distribution is created
• The expected value is approximately 1 (exactly dfden/(dfden − 2) when dfden > 2).

7.4.2 F test for equality of variances

Suppose we wanted to test the Null Hypothesis that two population standard deviations are equal,
Ho: $\sigma_1 = \sigma_2$. This is equivalent to testing that the population variances are equal:
$\sigma_1^2 = \sigma_2^2$. We will now instead write these as an equivalent ratio:
Ho: $\sigma_1^2 / \sigma_2^2 = 1$ or Ho: $\sigma_2^2 / \sigma_1^2 = 1$. This is the logic behind the
F test: if two population variances are equal, then the ratio of sample variances from each population
will have an F distribution. F will always be an upper tailed test in practice, so the larger variance
goes in the numerator. The test statistics are summarized in the table.

7.4.3 Example - Stand alone test

A stockbroker at a brokerage firm reported that the mean rate of return on a sample of 10 software
stocks (population 1) was 12.6 percent with a standard deviation of 4.9 percent. The mean rate of return
on a sample of 8 utility stocks (population 2) was 10.9 percent with a standard deviation of 3.5
percent. At the .05 significance level, can the broker conclude that there is more variation in the
software stocks?

Example - Design

Research Hypotheses: Ho: σ1≤σ2 (Software stocks do not have more variation)
Ha: σ1>σ2 (Software stocks do have more variation)

Model will be F test for variances and the test statistic from the table will be $F = s_1^2 / s_2^2$.
The degrees of freedom for numerator will be n1-1=9 and the degrees of freedom for denominator will be
n2-1=7.

The test will be run at a level of significance (α) of 5%.

Critical Value for F with dfnum=9 and dfden=7 is 3.68. Reject Ho if F >3.68.

Example - Data/Results
$F = 4.9^2 / 3.5^2 = 1.96$, which is less than critical value, so Fail to Reject Ho.

Example – Conclusion

There is insufficient evidence to claim more variation in the software stock.
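The F statistic and its two degrees of freedom follow directly from the sample standard deviations. This is a minimal sketch with a hypothetical function name; the critical value 3.68 is taken from the text because the standard library does not include the F distribution.

```python
def f_statistic(s_1, n_1, s_2, n_2):
    """Return (F, df_num, df_den) with population 1's sample variance in the numerator."""
    return s_1**2 / s_2**2, n_1 - 1, n_2 - 1


# Population 1 = software stocks, population 2 = utility stocks.
f, df_num, df_den = f_statistic(s_1=4.9, n_1=10, s_2=3.5, n_2=8)
f_critical = 3.68  # alpha = .05 with 9 and 7 df (from the text)

print(f"F = {f:.2f} with ({df_num}, {df_den}) degrees of freedom")
print("Reject Ho" if f > f_critical else "Fail to reject Ho")
```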



7.4.4 Example - Testing model assumptions

When comparing two means from independent samples, you have a choice between the more powerful
pooled variance t-test (assumption is $\sigma_1^2 = \sigma_2^2$) or the weaker unequal variance t-test (assumption is
$\sigma_1^2 \ne \sigma_2^2$). We can now design a hypothesis test to help us choose the appropriate model. Let us revisit
the example of comparing the mpg for import and domestic compact cars. Consider this example a "test
before the main test" to help choose the correct model for comparing means.

Example - Design

Research Hypotheses: Ho: σ1=σ2 (choose the pooled variance t-test to compare means)
Ha: σ1≠σ2 (choose the unequal variance t-test to compare means)

Model will be F test for variances and the test statistic from the table will be $F = s_1^2 / s_2^2$
(s1 is larger). The degrees of freedom for numerator will be n1-1=11 and the degrees of freedom for
denominator will be n2-1=14.

The test will be run at a level of significance (α) of 10%, but use the α=.05 table for a two-tailed test.

Critical Value for F with dfnum=11 and dfden=14 is 2.57. Reject Ho if F >2.57.

We will also run this test using the p-value method in Megastat.

Example - Data/Results

$F = 14.894 / 4.654 = 3.20$, which is more than critical value, Reject Ho.

Also p-value = 0.0438 < 0.10 which also makes the result significant.

Example – Conclusion

Do not assume equal variances and run the unequal variance t-test to compare population means.

In Summary

This flowchart summarizes which of the four models to choose when comparing two population means. In
addition, you can use the F-test for equality of variances to make the decision between the pooled
variance t-test and the unequal variance t-test.

8. Chi-square Tests and Analysis of Variance (ANOVA)

Often we want to test claims about the characteristics of qualitative or categorical non-numeric
data. In Section 6, we covered a test of one population proportion. In reality, this was a test of a
categorical variable with 2 choices (success, failure). Now in this section, we will expand our study of
hypothesis tests involving categorical data to include categorical random variables with more than two
choices using a goodness-of-fit test. In addition, we will compare two categorical variables for
independence. Both of these models will use a Chi-square test statistic, by looking at deviations
between the observed values and expected values of the data.

8.1.1 Chi-Square Goodness-of-Fit test

A financial services company had anecdotal evidence that people were calling in sick on Monday and
Friday more frequently than Tuesday, Wednesday or Thursday. The speculation was that some
employees were using sick days to extend their weekends. A researcher for the company was asked to
determine if the data supported a significant difference in absenteeism due to the day of the week.

The categorical variable of interest here is “Day of Week” an employee called in sick (Monday through
Friday). This is an example of a multinomial random variable, where we will observe a fixed number of
trials (the total number of sick days sampled) and at least 2 possible outcomes. (A binomial random
variable is a special case of the multinomial random variable where there are exactly 2 possible outcomes
and was studied in Section 6 as a Z Test of Proportion.)

The Chi-square goodness-of-fit test is used to test if observed data from a categorical variable is
consistent with an expected assumption about the distribution of that variable.

Chi-square Goodness of Fit Test

Model Assumptions
• $O_i$ = Observed in category i
• $p_i$ = Expected proportion in category i
• $E_i = np_i$ = Expected in category i
• $E_i \ge 5$ for each i

Test Statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \qquad df = k - 1$$

k = number of categories
n = sample size

8.1.2 Chi-Square Goodness-of-Fit test - Example 1 - equal expected frequencies

A researcher for the financial services company collected 400 records of what
day of the week employees called in sick to work. Can the researcher conclude
that the proportion of employees who call in sick is not the same for each day of
the week? Design and conduct a hypothesis test at the 1% significance level.

Day of Week Frequency


Monday 95
Tuesday 65
Wednesday 60
Thursday 80
Friday 100
TOTAL 400

Research Hypotheses: Ho: There is no difference in the proportion of employees who call in sick
due to the day of the week.
Ha: There is a difference in the proportion of employees who call in sick
due to the day of the week.

We can also state the hypotheses in terms of population parameters, pi for each category. Under the
null hypothesis we would expect 20% of sick days to occur on each weekday.

Research Hypotheses: Ho: p1 = p2 = p3 = p4 = p5 = 0.20


Ha: At least one pi is different than what was stated in Ho

Statistical Model: Chi-square goodness-of-fit test.

Important Assumption: The Expected Value of Each Category needs to be greater than or equal to 5. In
this example, $E_i = n p_i = (400)(.20) = 80 \geq 5$ for each category, so the model is appropriate.

Test Statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$, df = 5-1 = 4

Decision Rule (Critical Value Method): Reject Ho if χ2>13.277 (α=.01, 4df)

Results:

Day of Week    Observed Frequency    Expected Proportion    Expected Frequency    (Oi − Ei)^2 / Ei
               Oi                    pi                     Ei
Monday         95                    .20                    80                    2.8125
Tuesday        65                    .20                    80                    2.8125
Wednesday      60                    .20                    80                    5.0000
Thursday       80                    .20                    80                    0.0000
Friday         100                   .20                    80                    5.0000
TOTAL          400                   1.00                   400                   15.625

Since the Test Statistic is in the Rejection Region, the decision is to Reject Ho. Under the p-value
method, Ho is also rejected since the p-value = P(χ2>15.625) = 0.004 which is less than the Significance
Level α of 1%.

Conclusion: There is a difference in the proportion of employees who call in sick due to the day of
the week. Employees are more likely to call in sick on days close to the weekend.
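
The arithmetic in this example can be checked with a short Python sketch. This is only an illustration and not the Minitab workflow used in the course: the helper names chi_square_gof and chi_square_sf_even_df are hypothetical, and the p-value uses a closed-form sum that is valid only when the degrees of freedom are even (here df = 4).

```python
import math

def chi_square_gof(observed, proportions):
    """Return (test statistic, df, expected counts) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1, expected

def chi_square_sf_even_df(x, df):
    """P(chi-square > x); this closed form holds only for even df."""
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))

# Sick-day counts for Monday through Friday, with equal proportions under Ho.
observed = [95, 65, 60, 80, 100]
stat, df, expected = chi_square_gof(observed, [0.20] * 5)
p_value = chi_square_sf_even_df(stat, df)

print(f"chi-square = {stat:.3f}, df = {df}, p-value = {p_value:.4f}")
print("Reject Ho" if stat > 13.277 else "Fail to reject Ho")
```

The critical value 13.277 is the one stated in the decision rule above; the sketch simply compares the computed statistic against it.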

8.1.3 Chi-Square Goodness-of-Fit test - Example 2 - different expected frequencies.

In the prior example, the Null Hypothesis was that all categories had the same proportion, in other
words there was no difference in counts due to the choices of a categorical variable. Another set of
hypotheses using this same Chi-square goodness-of-fit test can be used to compare the results of a
current experiment to prior results. In these tests, it is quite likely that prior proportions were not the
same.

In the 2010 United States census, data was collected on how people get to work, their method of
commuting. The results are shown in the graph to the right. Suppose you wanted to know if people who
live in the San Jose metropolitan area (Santa Clara County) commute with similar proportions as the
United States. We will sample 1000 workers from Santa Clara County and conduct a Chi-square
goodness-of-fit test. Design and conduct a hypothesis test at the 5% significance level.

Research Hypotheses: Ho: Workers in Santa Clara county choose methods of commuting that match
the United States averages.
Ha: Workers in Santa Clara county choose methods of commuting that do not
match the United States averages.

We can also state the hypotheses in terms of population parameters, pi for each category. Under the
null hypothesis we would expect each method of commuting in Santa Clara County to occur in the same
proportion as in the 2010 United States census.

Research Hypotheses: Ho: p1 = .763 p2 = .098 p3 = .050 p4 = .028 p5 = .018 p6 = .043


Ha: At least one pi is different than what was stated in Ho

Statistical Model: Chi-square goodness-of-fit test.

Important Assumption: The Expected Value of Each Category needs to be greater than or equal to 5. In
this example check the lowest pi: $E_5 = n p_5 = (1000)(.018) = 18 \geq 5$, so the model is appropriate.

Test Statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$, df = 6-1 = 5

Decision Rule (Critical Value Method): Reject Ho if χ2>11.071 (α=.05, 5 df)

After designing the experiment, we collected the sample from Santa Clara County, shown in the Observed
Frequency Column of the table below. The Expected Proportion and Expected Frequency Columns are
calculated using the U.S. 2010 Census.

Results:

Method Of Commuting    Observed Frequency    Expected Proportion    Expected Frequency    (Oi − Ei)^2 / Ei
                       Oi                    pi                     Ei
Drive Alone            764                   0.763                  763                   0.0013
Carpooled              105                   0.098                  98                    0.5000
Public Transit         34                    0.050                  50                    5.1200
Walked                 20                    0.028                  28                    2.2857
Other Means            30                    0.018                  18                    8.0000
Worked from Home       47                    0.043                  43                    0.3721
TOTAL                  1000                  1.000                  1000                  16.2791

Since the Test Statistic of 16.2791 exceeds the critical value of 11.071, the decision is to Reject Ho.
Under the p-value method, Ho is also rejected since the p-value = P(χ2>16.2791) = 0.006 which is less
than the Significance Level α of 5%.

Conclusion: Workers in Santa Clara County do not have the same frequencies of method of commuting as
the entire United States.
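
The same check works when the expected proportions differ by category. The sketch below is an illustration under stated assumptions: chi_square_sf is a hypothetical helper that builds the upper-tail probability for any whole-number df from a standard recurrence for the incomplete gamma function, so that no statistical package is needed. In practice, software such as Minitab would report the p-value directly.

```python
import math

def chi_square_sf(x, df):
    """Upper-tail probability P(chi-square > x) for a whole-number df >= 1."""
    half = x / 2
    if df % 2 == 0:
        prob, a = math.exp(-half), 1.0               # exact tail for df = 2
    else:
        prob, a = math.erfc(math.sqrt(half)), 0.5    # exact tail for df = 1
    # Each pass adds two degrees of freedom:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2:
        prob += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return prob

# Observed Santa Clara County counts and 2010 census proportions from the table above.
observed = [764, 105, 34, 20, 30, 47]
proportions = [0.763, 0.098, 0.050, 0.028, 0.018, 0.043]

n = sum(observed)
expected = [n * p for p in proportions]
smallest_expected = min(expected)   # model assumption: this should be at least 5

stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi_square_sf(stat, df)

print(f"smallest expected count = {smallest_expected:.1f}")
print(f"chi-square = {stat:.4f}, df = {df}, p-value = {p_value:.3f}")
```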

8.2.1 Chi-Square Test of Independence

In 2014, Colorado became the first state to legalize the recreational use of marijuana. Other states have
joined Colorado, while some have decriminalized or authorized the medical use of marijuana. The
question is should marijuana be legalized in all states. Suppose we took a poll of 1000 American adults
and asked "Should marijuana be legal or not legal for recreational use" and got the following results:

Marijuana
should be Count Percent
Legal 500 50%
Not Legal 450 45%
Don't know 50 5%
Total 1000 100%

The interpretation of this poll is that 50% of adults polled favored the legalization of marijuana for
recreational use, while 45% opposed it. The remaining 5% were undecided.

At this time, you might have questions and want to explore this poll in more depth. For example, are
younger people more likely to support legalization of marijuana? Do other demographic characteristics
such as gender, ethnicity, sexual orientation, or religion affect people's opinions about legalization?

Let us explore the possibility of difference of opinion due to gender. Are men more likely (or less likely)
to oppose legalization of marijuana compared to women?

In the example above, suppose we have exactly 500 men and 500 women in the survey. What would we
expect to see in the data if there was no difference in opinion between men and women?

8.2.2 Two-way tables

Two-way or contingency tables are used to summarize two categorical variables, also known as
bivariate categorical data. In order to create a two-way table, the researcher must cross-tabulate the
two responses for each categorical question.

In the example above, the two categorical variables are gender and opinion on marijuana legalization.
Gender has two choices (male or female) while opinion on marijuana legalization has three choices
(legal, not legal and unsure).

In the example above, suppose we have exactly 500 men and 500 women in the survey. What would we
expect to see in the data if there was no difference in opinion between men and women? We could then
simply apply the total percentages to each group.

To create a hypothetical two-way table if there was no difference in opinion between men and women,
apply the total percentages for each choice of Opinion to the total number for each choice of Gender.
e.g.: Men/Legal would be 50% of 500 or 250 people.

Marijuana should be    Men     Women    Total
Legal                  50%     50%      50%
Not Legal              45%     45%      45%
Unsure                 5%      5%       5%
Total                  100%    100%     100%

Marijuana should be    Men     Women    Total
Legal                  250     250      500
Not Legal              225     225      450
Unsure                 25      25       50
Total                  500     500      1000

Let’s review from probability what independence means. If two events A and B are independent, then the
following statements are true:

P(A given B) = P(A)


P(B given A) = P(B)
P(A and B) = P(A)P(B)

You can pick any two events in the table above to verify that Gender and Opinion of Legalization of
Marijuana are independent events. For example, compare the events Not Legal and Men.

P(Not Legal given Men) = 225/500 = 45% same as P(Not Legal) = 45%
P(Men given Not Legal) = 225/450 = 50% same as P(Men) = 50%
P(Not Legal and Men) = 225/1000 = 22.5% same as P(Not Legal)P(Men) = (45%)(50%) = 22.5%

Based on these probability rules we can calculate the expected value of any pair of independent events
by using the following formula:

Expected Value = (Row Total)(Column Total)/(Grand Total)

For example, looking at the events Not Legal and Men:

Expected Value = (450)(500)/(1000) = 225
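
As a rough sketch, the expected-value formula can be applied to every cell of a two-way table at once. The function name expected_counts is a hypothetical helper chosen for this illustration; the counts are those of the hypothetical no-difference table above.

```python
def expected_counts(table):
    """Expected count for each cell: (row total)(column total)/(grand total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Rows: Legal, Not Legal, Unsure. Columns: Men, Women.
no_difference = [[250, 250], [225, 225], [25, 25]]
expected = expected_counts(no_difference)

for label, row in zip(["Legal", "Not Legal", "Unsure"], expected):
    print(label, row)
```

Because this table was built to show no difference between men and women, the expected counts reproduce the table itself, which is what independence means here.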

What if the events are not independent? Let's review the same survey. What would we expect to see in
the data if there was a difference in opinion between men and women? Let's say women were more
likely to support legalization. In that case, we would expect the 500 people who supported legalization
of marijuana to have a higher number of women (and a smaller number of men) compared to the first
table. Note we only change the first six boxes (shaded below); the totals must remain the same.

This is an example of a hypothetical two-way table where women were more likely to support
legalization. Only the six boxes shaded in yellow change from the prior example.

Marijuana should be    Men     Women    Total
Legal                  40%     60%      50%
Not Legal              55%     35%      45%
Unsure                 5%      5%       5%
Total                  100%    100%     100%

Marijuana should be    Men     Women    Total
Legal                  200     300      500
Not Legal              275     175      450
Unsure                 25      25       50
Total                  500     500      1000

Now let's see the actual results of this survey and see what is happening:

Actual Poll of 500 men and 500 women adults. Should marijuana be legal for recreational use?

Marijuana should be    Men     Women    Total
Legal                  54%     46%      50%
Not Legal              41%     49%      45%
Unsure                 5%      5%       5%
Total                  100%    100%     100%

Marijuana should be    Men     Women    Total
Legal                  270     230      500
Not Legal              205     245      450
Unsure                 25      25       50
Total                  500     500      1000

In this poll, a higher percentage of men support legalization of marijuana for recreational use compared
to women. Question: Is this evidence strong enough to support the claim that gender and opinion about
marijuana legalization are not independent events? This question can be addressed by conducting a
hypothesis test using the Chi-square Test for Independence model.

8.2.3 Chi-square test for Independence

Are Gender and Opinion about legalization of marijuana for recreational use independent events?
Conduct a hypothesis test with a significance level of 5%.

Chi-square Test for Independence

Model Assumptions
• $O_{ij}$ = Observed in category ij
• $E_{ij} = n p_{ij} = \frac{(\text{Column Total})(\text{Row Total})}{\text{Grand Total}}$
• $E_{ij} \geq 5$ for each ij

Test Statistic
$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, df = (r-1)(c-1)
r = number of row categories
c = number of column categories
n = sample size

Research Hypotheses: Ho: Gender and Opinion about legalization of marijuana for recreational use are
independent events.
Ha: Gender and Opinion about legalization of marijuana for recreational use are
dependent events.

Statistical Model: Chi-square Test of Independence

Results

Rows: Opinion about Marijuana
Columns: gender

1st value = Observed
2nd Value = Expected
3rd Value = Contribution to Chi-square

              men      women    All

Legal         270      230      500
              250      250
              1.600    1.600

Not Legal     205      245      450
              225      225
              1.778    1.778

Unsure        25       25       50
              25       25
              0.000    0.000

All           500      500      1000


Important Assumption: The Expected Value of Each Category needs to be greater than or equal to 5. In
this example, the lowest expected value is 25 (Unsure, for both men and women) so the assumption is met.

Test Statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, df = (3-1)(2-1) = 2

Decision Rule (Critical Value Method): Reject Ho if χ2>5.991 (α=.05, 2df)

χ 2 = 1.600 + 1.600 + 1.778 + 1.778 = 6.756

Since the Test Statistic exceeds the critical value, the decision is to Reject Ho. Under the p-value
method, Ho is also rejected since the p-value = P(χ2>6.756) = 0.034 which is less than the Significance
Level α of 5%.

Conclusion: Gender and Opinion about legalization of marijuana for recreational use are dependent
events. Men are more likely to support legalization of marijuana for recreational use.
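
The test statistic and p-value for this example can be reproduced with a short Python sketch. It is an illustration only and does not replace the Minitab output shown above. Note the loud assumption in the last step: the shortcut P(χ2 > x) = exp(-x/2) is exact only when df = 2, which happens to be the case for this 3 by 2 table.

```python
import math

# Observed counts from the actual poll. Rows: Legal, Not Legal, Unsure; columns: men, women.
observed = [[270, 230], [205, 245], [25, 25]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

stat = 0.0
smallest_expected = float("inf")
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total   # expected count under independence
        smallest_expected = min(smallest_expected, e)
        stat += (o - e) ** 2 / e                          # contribution to chi-square

df = (len(row_totals) - 1) * (len(col_totals) - 1)
p_value = math.exp(-stat / 2)   # closed form that holds only because df = 2

print(f"smallest expected count = {smallest_expected}")
print(f"chi-square = {stat:.3f}, df = {df}, p-value = {p_value:.3f}")
```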

8.3 One Factor Analysis of Variance (ANOVA)

In Section 7 we used statistical inference to compare two population means under a variety of models.
These models can be expanded to compare more than two populations using a technique called Analysis
of Variance, or ANOVA for short. There are many ANOVA models, but we limit our study to one of them,
the One Factor ANOVA model, also known as One Way ANOVA.

8.3.1 Comparing means from more than two Independent Populations

Suppose we wanted to compare the means of more than two (k) independent populations and want to
test the null hypothesis 𝑯𝒐: 𝝁𝟏 = 𝝁𝟐 = ⋯ = 𝝁𝒌 . If we can assume all population variances are equal,
we can expand the pooled variance t-test for two populations to one factor ANOVA for k populations.

8.3.2 The logic of ANOVA - How comparing variances tests for a difference in means.

It may seem strange to use a test of “variances” to compare means, but this graph demonstrates the logic
of the test. If the null hypothesis 𝐻𝑜: 𝜇1 = 𝜇2 = 𝜇3 is true, then each population would have the same
distribution and the variance of the combined data would be approximately the same as the variance
within each population. However, if the Null Hypothesis is false, then the difference between centers
would cause the combined data to have an increased variance.
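
A tiny numeric sketch can make this logic concrete. The data here are hypothetical and chosen only for illustration: three groups share the same within-group spread, first with identical centers and then with centers pulled apart.

```python
from statistics import pvariance

base = [-1, 0, 1]   # hypothetical within-group deviations, identical for every group

def combined_variance(centers):
    """Variance of the pooled data when each group is `base` shifted to its center."""
    combined = [center + x for center in centers for x in base]
    return pvariance(combined)

within = pvariance(base)                    # spread inside any one group
same = combined_variance([10, 10, 10])      # Ho true: all centers equal
shifted = combined_variance([10, 12, 14])   # Ho false: centers differ

print(f"within-group variance       = {within:.3f}")
print(f"combined, equal centers     = {same:.3f}")
print(f"combined, different centers = {shifted:.3f}")
```

With equal centers the combined variance matches the within-group variance; separating the centers inflates it, which is the signal ANOVA looks for.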

8.4 The One Factor ANOVA model

In ANOVA, we calculate the variance two different ways: The mean square factor (MSF), also known as
mean square between, measures the variability of the means between groups, while the mean square error
(MSE), also known as mean square within, measures the variability within the populations. Under the
null hypothesis, the ratio MSF/MSE should be close to 1 and has an F distribution.

One Factor ANOVA model to compare the means of k independent populations

Model Assumptions
• The populations being sampled are normally distributed.
• The populations have equal standard deviations.
• The samples are randomly selected and are independent.

Test Statistic
$F = \frac{MS_{Factor}}{MS_{Error}}$
$df_{num} = k - 1$
$df_{den} = n - k$

8.5 Understanding the ANOVA table

When running Analysis of Variance, the data is usually organized into a special ANOVA table, especially
when using computer software.

Source of Variation    Sum of Squares (SS)    Degrees of freedom (df)    Mean Square (MS)               F
Factor (Between)       SSFactor               k-1                        MSFactor = SSFactor/(k-1)      F = MSFactor/MSError
Error (Within)         SSError                n-k                        MSError = SSError/(n-k)
Total                  SSTotal                n-1

Sum of Squares: The total variability of the numeric data being compared is broken into the variability
between groups (SSFactor) and the variability within groups (SSError). These formulas are the most tedious
part of the calculation. Tc represents the sum of the data in each population and nc represents the
sample size of each population. These formulas represent the numerator of the variance formula.

$SS_{Total} = \Sigma X^2 - \frac{(\Sigma X)^2}{n}$
$SS_{Factor} = \Sigma\left(\frac{T_c^2}{n_c}\right) - \frac{(\Sigma X)^2}{n}$
$SS_{Error} = SS_{Total} - SS_{Factor}$

Degrees of freedom: The total degrees of freedom is also partitioned into the Factor and Error
components.

Mean Square: This represents calculation of the variance by dividing Sum of Squares by the appropriate
degrees of freedom.

F: This is the test statistic for ANOVA: the ratio of two sample variances (mean squares) that are both
estimating the same population value has an F distribution. Computer software will then calculate the p-
value to be used in testing the Null Hypothesis that all populations have the same mean.

Example

Party Pizza specializes in meals for students. Hsieh Li, President, recently developed a new tofu pizza.

Before making it a part of the regular menu she decides to test it in several of her restaurants. She
would like to know if there is a difference in the mean number of tofu pizzas sold per day at the
Cupertino, San Jose, and Santa Clara pizzerias. Data will be collected for five days at each location.

At the .05 significance level can Hsieh Li conclude that there is a difference in the mean number of tofu
pizzas sold per day at the three pizzerias?

Example - Design

Research Hypotheses: Ho: µ1=µ2 =µ3 (Mean sales same at all restaurants)
Ha: At least one µi is different (Mean sales not the same at all restaurants)

We will assume the population variances are equal, $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$, so the model will be One Factor
ANOVA. This model is appropriate if the distribution of the sample means is approximately Normal
from the Central Limit Theorem.

Type I error would be to reject the Null Hypothesis and claim mean sales are different, when they
actually are the same. The test will be run at a level of significance (α) of 5%.
The test statistic from the table will be $F = MS_{Factor}/MS_{Error}$. The degrees of freedom for the numerator
will be 3-1=2 and the degrees of freedom for the denominator will be 13-3=10. (The total sample size
turned out to be only 13, not 15 as planned.)

Critical Value for F at α of 5% with dfnum=2 and dfden=10 is 4.10. Reject Ho if F >4.10. We will also run this
test using the p-value method with statistical software, such as Minitab.

Example - Data/Results

          Cupertino    San Jose    Santa Clara    Total
          13           10          18
          12           12          16
          14           13          17
          12           11          17
                                   17
T         51           46          85             182
n         4            4           5              13
Means     12.75        11.5        17             14
ΣX^2      653          534         1447           2634

$SS_{Total} = 2634 - \frac{182^2}{13} = 86$
$SS_{Factor} = 2624.25 - \frac{182^2}{13} = 76.25$
$SS_{Error} = 86 - 76.25 = 9.75$

$F = 38.125 / 0.975 = 39.10$, which is more than the critical value of 4.10, so Reject Ho.

Also from the Minitab output, p-value = 0.000 < 0.05 which also supports rejecting Ho.

Example – Conclusion

There is a difference in the mean number of tofu pizzas sold at the three locations.
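
The sums of squares for this example can be verified with a short Python sketch that follows the Tc and nc formulas from the ANOVA table section. This is a hand-calculation check, not the Minitab procedure; the function name one_factor_anova and the dictionary keys are hypothetical names chosen for this illustration.

```python
def one_factor_anova(groups):
    """Build the one factor ANOVA quantities from a list of samples."""
    n = sum(len(g) for g in groups)                 # total sample size
    k = len(groups)                                 # number of populations
    correction = sum(sum(g) for g in groups) ** 2 / n          # (sum of X)^2 / n
    ss_total = sum(x * x for g in groups for x in g) - correction
    ss_factor = sum(sum(g) ** 2 / len(g) for g in groups) - correction   # sum of Tc^2/nc
    ss_error = ss_total - ss_factor
    ms_factor = ss_factor / (k - 1)
    ms_error = ss_error / (n - k)
    return {
        "ss_total": ss_total, "ss_factor": ss_factor, "ss_error": ss_error,
        "df_num": k - 1, "df_den": n - k,
        "ms_factor": ms_factor, "ms_error": ms_error, "F": ms_factor / ms_error,
    }

cupertino = [13, 12, 14, 12]
san_jose = [10, 12, 13, 11]
santa_clara = [18, 16, 17, 17, 17]

result = one_factor_anova([cupertino, san_jose, santa_clara])
for name, value in result.items():
    print(f"{name:>9} = {value:.3f}")
```

The printed F can then be compared with the critical value of 4.10 from the design step.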

8.6 Post-hoc Analysis – Tukey’s Honestly Significant Difference (HSD) Test 15.

When the Null Hypothesis is rejected in one factor ANOVA, the conclusion is that not all means are the
same. This however leads to an obvious question: Which particular means are different? Seeking further
information after the results of a test is called post-hoc analysis.

8.6.1 The problem of multiple tests


One attempt to answer this question is to conduct multiple pairwise independent samples t-tests and
determine which ones are significant. We would compare µ1 to µ2, µ1 to µ3, µ2 to µ3, µ1 to µ4, etc. There
is a major flaw in this methodology in that each test would have a significance level of α, so the
probability of making at least one Type I error across the set of tests would be significantly more
than the desired α. Furthermore, these pairwise tests would NOT be
mutually independent. There were several statisticians who designed tests that effectively dealt with
this problem of determining an "honest" significance level of a set of tests; we will cover the one
developed by John Tukey, the Honestly Significant Difference (HSD) test.
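
To get a feel for the size of the problem, the sketch below counts the pairwise tests for k populations and reports two rough guides to the overall Type I error. Both are assumptions added for illustration and are not part of the Tukey method: the first figure pretends the tests are independent (which, as noted above, they are not), and the second is the simple upper bound of (number of tests)(α), which holds whether or not the tests are independent. The function name familywise_error is hypothetical.

```python
from math import comb

def familywise_error(k, alpha):
    """Return (number of pairwise tests, rate if tests were independent, upper bound)."""
    tests = comb(k, 2)                          # every pair of the k populations
    if_independent = 1 - (1 - alpha) ** tests   # illustration only: assumes independence
    upper_bound = min(1.0, tests * alpha)       # valid without assuming independence
    return tests, if_independent, upper_bound

for k in (3, 4, 5):
    tests, if_independent, upper_bound = familywise_error(k, 0.05)
    print(f"k = {k}: {tests} tests, about {if_independent:.3f} if independent, "
          f"at most {upper_bound:.2f}")
```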

8.6.2 The Tukey HSD test

Tests: $H_o: \mu_i = \mu_j$   $H_a: \mu_i \neq \mu_j$, where the subscripts i and j represent two different populations

Overall significance level of α. This means that all pairwise tests can be run at the same time with an
overall significance level of α.

Test Statistic: $HSD = q \sqrt{\frac{MSE}{n_c}}$
q = value from studentized range table
MSE = Mean Square Error from ANOVA table
$n_c$ = number of replicates per treatment. An adjustment is made for unbalanced designs.

Decision: Reject Ho if $|\bar{X}_i - \bar{X}_j| >$ HSD critical value

Computer software, such as Megastat, will calculate the critical values and test statistics for this series
of tests.

Example

Let us return to the Tofu pizza example where we rejected the Null Hypothesis and supported the claim
that there was a difference in means among the three restaurants.

In reviewing the graph of the sample means, it appears that Santa Clara has a much higher number of
sales than Cupertino and San Jose. There will be three pairwise post-hoc tests to run.

Example - Design

Ho: µ1 = µ2   Ha: µ1 ≠ µ2      Ho: µ1 = µ3   Ha: µ1 ≠ µ3      Ho: µ2 = µ3   Ha: µ2 ≠ µ3
These three tests will be conducted with an overall significance level of α = 5%.
The model will be the Tukey HSD test.

The Minitab approach for the decision rule will be to reject Ho for each pair that does not share a
common group.

Example - Data/Results/Conclusion

Refer to the Minitab output. Santa Clara is in group A while Cupertino and San Jose are in Group B.

Santa Clara has a significantly higher mean number of tofu pizzas sold compared to both San Jose and
Cupertino. There is no significant difference in mean sales between San Jose and Cupertino.
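
For readers who want to see the comparisons by hand, here is a rough sketch. Two items in it are assumptions and not taken from the Minitab output: the value q = 3.88 is read from a standard studentized range table (α = .05, 3 groups, 10 error degrees of freedom) and should be verified against your own table, and the adjustment for the unbalanced design is the common Tukey-Kramer form, which reduces to the HSD formula above when the sample sizes are equal. The names Q_CRIT, MS_ERROR and hsd_margin are hypothetical.

```python
import math
from itertools import combinations

Q_CRIT = 3.88      # assumed table value: studentized range, alpha = .05, 3 groups, 10 error df
MS_ERROR = 0.975   # Mean Square Error from the ANOVA example (9.75 / 10)

# (sample mean, sample size) for each restaurant, from the example data
groups = {"Cupertino": (12.75, 4), "San Jose": (11.5, 4), "Santa Clara": (17.0, 5)}

def hsd_margin(n_i, n_j):
    """Tukey-Kramer margin; equals q * sqrt(MSE / n) when n_i == n_j == n."""
    return Q_CRIT * math.sqrt(MS_ERROR / 2 * (1 / n_i + 1 / n_j))

significant = {}
for (a, (mean_a, n_a)), (b, (mean_b, n_b)) in combinations(groups.items(), 2):
    diff = abs(mean_a - mean_b)
    margin = hsd_margin(n_a, n_b)
    significant[(a, b)] = diff > margin
    verdict = "Reject Ho" if diff > margin else "Fail to reject Ho"
    print(f"{a} vs {b}: |difference| = {diff:.2f}, HSD = {margin:.2f}, {verdict}")
```

Under these assumptions the three decisions agree with the grouping described above.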

8.7 Factorial Design – an insight to other ANOVA procedures

A different way of looking at this model is considering a single population with one numeric and one
categorical variable being sampled. The numeric variable is called the response (tofu pizzas sold) and the
categorical variable is the factor (location of restaurant). The possible responses to the factor are called
the levels (Cupertino, San Jose and Santa Clara). The number of observations per level are called the
replicates (n1=4, n2=4, n3=5 in our example). If the replicates are equal, the design is balanced. (Our
example is not balanced.)

By thinking of the model in this way, it is easy to extend the concept to the multi-factor ANOVA models
that are prevalent in the research you will encounter in future studies.

9. Glossary of Statistical Terms used in Inference

Alpha (α) – see Level of Significance

Alternative Hypothesis (Ha)


A statement about the value of a population parameter that is assumed to be true if the Null Hypothesis
is rejected during testing.

Analysis of Variance (ANOVA)


A group of statistical tests used to determine if the mean of a numeric variable (the Response) is
affected by one or more categorical variables (Factors).

Beta (β)
The probability, set by design, of failing to reject the Null Hypothesis when it is actually false. Beta is
calculated for specific possible values of the Alternative Hypothesis.

Central Limit Theorem


A powerful theorem that allows us to understand the distribution of the sample mean, $\bar{X}$. If X1, X2, …, Xn
is a random sample from a probability distribution with mean = $\mu$ and standard deviation = $\sigma$ and the
sample size is “sufficiently large”, then $\bar{X}$ will have an approximately Normal Distribution with the same
mean and a standard deviation of $\sigma/\sqrt{n}$ (also known as the Standard Error). Because of this theorem,
most statistical inference is conducted using a sampling distribution from the Normal Family.

Chi-square Distribution (χ2)


A family of continuous random variables (based on degrees of freedom) with a probability density
function that is from the Normal Family of probability distributions. The Chi-square distribution is non-
negative and skewed to the right and has many uses in statistical inference such as inference about a
population variance, goodness-of-fit tests and test of independence for categorical data.

Confidence Interval
An Interval estimate that estimates a population parameter from a random sample using a
predetermined probability called the level of confidence.

Confidence Level – see Level of Confidence

Critical value(s)
The dividing point(s) between the region where the Null Hypothesis is rejected and the region where it is
not rejected. The critical value determines the decision rule.

Decision Rule
The procedure that determines what values of the result of an experiment will cause the Null Hypothesis
to be rejected. There are two methods that are equivalent decision rules:
1. If the test statistic lies in the Rejection Region, Reject Ho. (Critical Value method)
2. If the p-value < α, Reject Ho. (p-value method)

Dependent Sampling
A method of sampling where 2 or more variables are related to each other (paired or matched).
Examples would be the “Before and After” type models using the Matched Pairs t-test.

Effect Size: The “practical difference” between a population parameter under the Null Hypothesis and a
selected value of the population parameter under the Alternative Hypothesis.

Empirical Rule (Also known as the 68-95-99.7 Rule)


A rule used to interpret standard deviation for data that is approximately bell-shaped. The rule says
about 68% of the data is within one standard deviation of the mean, 95% of the data is within two
standard deviations of the mean, and about 99.7% of the data is within three standard deviations of the
mean.
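
The three percentages in the Empirical Rule can be checked against the Normal Distribution itself. The sketch below is an illustration that assumes exactly Normal data and uses the error function from Python's math module; the function name within_k_sd is hypothetical.

```python
import math

def within_k_sd(k):
    """P(|Z| < k) for a Standard Normal random variable Z."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {100 * within_k_sd(k):.2f}%")
```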

Estimation
An inference process that attempts to predict the values of population parameters based on sample
statistics.

F Distribution
A family of continuous random variables (based on 2 different degrees of freedom for numerator and
denominator) with a probability density function that is from the Normal Family of probability
distributions. The F distribution is non-negative and skewed to the right and has many uses in statistical
inference such as inference about comparing population variances, ANOVA, and regression.

Factor
In ANOVA, the categorical variable(s) that break the numeric response variable into multiple populations
or treatments.

Hypothesis
A statement about the value of a population parameter developed for the purpose of testing.

Hypothesis Testing
A procedure, based on sample evidence and probability theory, used to determine whether the
hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be
rejected.

Independent Sampling
A method of sampling where 2 or more variables are not related to each other. Examples would be the
“Treatment and Control” type models using the independent samples t-test.

Inference – see Statistical Inference

Interval Estimate
A range of values based on sample data that is used to estimate a population parameter.

Level
In ANOVA, a possible value that a categorical variable factor could be. For example, if the factor was
shirt color, levels would be blue, red, yellow, etc.

Level of Confidence
The probability, usually expressed as a percentage, that a Confidence Interval will contain the true
population parameter that is being estimated.

Level of Significance (α)


The maximum probability, set by design, of rejecting the Null Hypothesis when it is actually true
(maximum probability of making Type I error).

Margin of Error
The distance in a symmetric Confidence Interval between the Point Estimator and an endpoint of the
interval. For example, a confidence interval for $\mu$ may be expressed as $\bar{X}$ ± Margin of Error.

Model Assumptions

Criteria which must be satisfied to appropriately use a chosen statistical model. For example, a student’s
t statistic used for testing a population mean vs. a hypothesized value requires random sampling and
that the sample mean has an approximately Normal Distribution.

Normal Distribution
Often called the “bell-shaped” curve, the Normal Distribution is a continuous random variable which has
Probability Density Function $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$. The special case where $\mu = 0$ and
$\sigma = 1$ is called the Standard Normal Distribution and designated by Z.

Normal Family of Probability Distributions


The Standard Normal Distribution (Z) plus other Probability Distributions that are functions of
independent random variables with Standard Normal Distribution. Examples include the t, the F and the
Chi-square distributions.

Null Hypothesis (Ho)


A statement about the value of a population parameter that is assumed to be true for the purpose of
testing.

Outlier
A data point that is far removed from the other entries in the data set.

p-value
The probability, assuming that the Null Hypothesis is true, of getting a value of the test statistic at least
as extreme as the computed value for the test.

Parameter
A fixed numerical value that describes a characteristic of a population.

Point Estimate
A single sample statistic that is used to estimate a population parameter. For example, $\bar{X}$ is a point
estimator for $\mu$.

Population
The set of all possible members, objects or measurements of the phenomena being studied.

Power (or Statistical Power)


The probability, set by design, of rejecting the Null Hypothesis when it is actually false. Power is
calculated for specific possible values of the Alternative Hypothesis and is the complement of Beta (β).

Probability Distribution Function (PDF)

A function that assigns a probability to all possible values of a random variable. In the case of a
continuous random variable (like the Normal Distribution), the PDF refers to the area to the left of a
designated value under a Probability Density Function.

Random Sample
A sample where the values are equally likely to be selected and mutually independent of each other.

Random Variable
A numerical value that is determined by an experiment with a probability distribution function.

Replicate
In ANOVA, the sample size for a specific level of factor. If the replicates are the same for each level, the
design is balanced.

Rejection Region
Region(s) of the Statistical Model which contain the values of the Test Statistic where the Null
Hypothesis will be rejected. The area of the Rejection Region = α.

Response
In ANOVA, the numeric variable that is being tested under different treatments or populations.

Sample
A subset of the population.

Sample Mean
a) The arithmetic average of a data set.
b) A random variable that has an approximately Normal Distribution if the sample size is sufficiently
large.

Significance Level – see Level of Significance

Standard Deviation
The square root of the variance; it measures the spread of the data as distance from the mean. The units of
the standard deviation are the same units as the data.

Standard Normal Distribution – see Normal Distribution

Statistic
A value that is calculated from sample data only that is used to describe the data. Examples of statistics
are the sample mean, sample standard deviation, range, sample median and the interquartile range.
Since statistics depend on the sample, they are also random variables.

Statistical Inference
The process of estimating or testing hypotheses of population parameters using statistics from a random
sample.

Statistical Model
A mathematical model that describes the behavior of the data being tested.

Student’s t distribution (or t distribution)


A family of continuous random variables (based on degrees of freedom) with a probability density
function that is from the Normal Family of Probability Distributions. The t distribution is used for
statistical inference of the population mean when the population standard deviation is unknown.

Test Statistic
A value, determined from sample information, used to determine whether or not to reject the Null
Hypothesis.

Type I Error
Rejecting the Null Hypothesis when it is actually true.

Type II Error
Failing to reject the Null Hypothesis when it is actually false.

Variance
A measure of the mean squared deviation of the data from the mean. The units of the variance are the
square of the units of the data.

Z-score
A measure of relative standing that shows the distance in standard deviations a particular data point is
above or below the mean.

10. Flash Animations

I have designed four interactive Flash animations that will provide the student with deeper insight into the
major concepts of inference and hypothesis testing. These animations are on my website
http://nebula2.deanza.edu/~mo/ .

Central Limit Theorem (Section 4.3)


Using die rolling with progressively increasing sample sizes, this
animation shows the three main properties of the Central Limit
Theorem.

Inference Process (Section 5.1)


This animation walks a student through the logic of the statistical
inference and is presented just before confidence intervals and
hypothesis testing.

Confidence Intervals (Section 5.3.1)


This animation compares hypothesis testing to an unusual method of
playing darts and compares it to a practical example from the 2008
presidential election.

Statistical Power in Hypothesis Testing (Section 6.7)


This animation explains power, Type I and Type II error conceptually,
and demonstrates the effect of changing model assumptions.

11. PowerPoint Slides


I have developed PowerPoint Slides that follow the material presented in the course. This material is
presented online as a slideshow, and as note pages that can be downloaded, at
http://nebula2.deanza.edu/~mo/.

Section 1:
Descriptive Statistics

Section 2:
Probability

Section 3:
Discrete Random Variables

Section 4:
Continuous Random Variables and the Central Limit Theorem (Partially covered in this text)

Section 5:
Point Estimation and Confidence Intervals (Covered in this text)

Section 6:
One Population Hypothesis Testing (Covered in this text)

Section 7:
Two Population Inference (Covered in this text)

Section 8:
Chi-square and ANOVA Tests (Partially covered in this text)

Section 9:
Correlation and Regression

12. Notes and Sources


1
Talk of the Nation, National Public Radio Archives, http://www.npr.org/
2
John Cimbaro, Fish Anatomy,
http://www.fws.gov/midwest/lacrossefishhealthcenter/PhotoAlbum.html
3
Chen Zheng-Long, Chinese Koi Fish, http://www.orientaloutpost.com/proddetail.php?prod=czl-kf135-1
4
Richard Christian Looijen, Holism and Reductionism in Biology and Ecology: The Mutual Dependence of
Higher and Lower Level Research Programmes, Springer, 2000

5
The Poems of John Godfrey Saxe (Highgate Edition), Boston: Houghton, Mifflin and Company, 1881
6
Donna Young, American Society of Health System Pharmacists, April 6, 2007,
http://www.ashp.org/import/News/HealthSystemPharmacyNews/newsarticle.aspx?id=2517
7
The Lancet, news release, June 29, 2009,
http://www.nlm.nih.gov/medlineplus/news/fullstory_86206.html
8
Ronald Walpole & Raymond Meyers & Keying Ye, Probability and Statistics for Engineers and Scientists.
Pearson Education, 2002, 7th edition.
9
Taleb, Nicholas, The Black Swan: The Impact of the Highly Improbable, Penguin, 2007.
10
Food and Drug Administration, FDA Consumer Magazine , Jan/Feb 2003
11
Mark Blumenthal, Is Polling as we Know it Doomed?, The National Journal Online,
http://www.nationaljournal.com/njonline/mp_20090810_1804.php, August 10, 2009
12
Russ Lenth, Java Applets for Power and Sample Size, University of Iowa ,
http://www.stat.uiowa.edu/~rlenth/Power/ , 2009
13
J. B. Orris, MegaStat for Excel, Version 10.1, Butler University, 2007
14
Shlomo S. Sawilowsky, Fermat, Schubert, Einstein, and Behrens-Fisher: The Probable Difference
Between Two Means When σ₁² ≠ σ₂², Journal of Modern Applied Statistical Methods, Vol. 1, No 2, Fall
2002
15
Lowry, Richard. One Way ANOVA – Independent Samples. Vassar.edu, 2011

Additional reference used but not specifically cited:

Dean Fearn, Elliot Nebenzahl, Maurice Geraghty, Student Guide for Elementary Business Statistics,
Kendall/Hunt, 2003
Math 10 – Part 1 Slides

Math 10
Part 1
Data and Descriptive Statistics
© Maurice Geraghty 2015

Introduction
• Green Sheet – Homework 0
• Projects
• Computer Lab – S44
• Minitab
• Website
• http://nebula2.deanza.edu/~mo
• Tutor Lab - S43
• Drop in or assigned tutors – get form from lab.
• Group Tutoring
• Other Questions
1 2

Descriptive Statistics
• Organizing, summarizing and displaying data
• Graphs
• Charts
• Measures of Center
• Measures of Spread
• Measures of Relative Standing

Problem Solving
• The Role of Probability
• Modeling
• Simulation
• Verification

3 4

Inferential Statistics
• Population – the set of all measurements of interest to the sample collector
• Sample – a subset of measurements selected from the population
• Inference – A conclusion about the population based on the sample
• Reliability – Measure the strength of the Inference

Raw Data – Apple Monthly Adjusted Stock Price: 12/1999 to 12/2014

5 6


Apple – Adjusted Stock Price 15 Years

Crime Rate
„ In the last 18 years, has violent crime:
„ Increased?

„ Stayed about the Same?

„ Decreased?

7 8

Perception – Gallup Poll Reality (Source: US Justice Department)

9 10

Line Graph - Crime and Lead

Pie Chart - What do you think of your College roommate?

11 12


Bar Chart - Health Care Distorting the truth with Statistics

13 14

Nuclear, Oil and Coal Energy
Deaths per terawatt-hour produced
source: thebigfuture.com

Should Police wear Body Cameras?

15 16

Increase in Debt since 1999

Most Popular Websites for College Students in 2007

17 18


Decline of MySpace

19 20

Types of Data
• Qualitative
  • Non-numeric
  • Always categorical
• Quantitative
  • Numeric
  • Categorical numbers are actually qualitative
  • Continuous or discrete

Levels of Measurement
• Nominal: Names or labels only
  • Example: What city do you live in?
• Ordinal: Data can be ranked, but no quantifiable difference.
  • Example: Ratings Excellent, Good, Fair, Poor
• Interval: Data can be ranked with quantifiable differences, but no true zero.
  • Example: Temperature
• Ratio: Data can be ranked with quantifiable differences and there is a true zero.
  • Example: Age
21 22

Examples of Data
• Distance from De Anza College
• Number of Grandparents still alive
• Eye Color
• Amount you spend on food each week.
• Number of Facebook “Friends”
• Zip Code
• City you live in.
• Year of Birth
• How to prepare Steak? (rare, medium, well-done)
• Do you own an SUV?

Data Collection
• Personal – individual interviews
• Phone – voice and automated
• Impersonal Survey – Internet or Mail
• Direct Observation – measurements
• Scientific Studies – control for lurking variables
• Observational Studies – difficult to establish a cause and effect relationship.

23 24


Sampling
• Random Sampling
  • Each member of the population has the same chance of being sampled.
• Systematic Sampling
  • The sample is selected by taking every kth member of the population.
• Stratified Sampling
  • The population is broken into more homogeneous subgroups (strata) and a random sample is taken from each stratum.
• Cluster Sampling
  • Divide population into smaller clusters, randomly select some clusters and sample each member of the selected clusters.
• Convenience Sampling
  • Self-selected and non-scientific methods which are prone to extreme bias.

Graphical Methods
• Stem and Leaf Chart
• Grouped data
• Pie Chart
• Histogram
• Ogive
• Grouping data
• Example

25 26

Daily Minutes spent on the Internet by 30 students

102  104   85   67  101
 71  116  107   99   82
103   97  105  103   95
105   99   86   87  100
109  108  118   87  125
124  112  122   78   92

Stem and Leaf Graph

 6 | 7
 7 | 18
 8 | 25677
 9 | 25799
10 | 01233455789
11 | 268
12 | 245
27 28

Back-to-back Example
• Passenger loading times for two airlines

First airline:  11, 14, 16, 17, 19, 21, 22, 23, 24, 24, 24, 26, 31, 32, 38, 39
Second airline: 8, 11, 13, 14, 15, 16, 16, 18, 19, 19, 21, 21, 22, 24, 26, 31

Back to Back Example

First airline | Stem | Second airline
              |  0   |
              |  0   | 8
           14 |  1   | 134
          679 |  1   | 566899
       123444 |  2   | 1124
            6 |  2   | 6
           12 |  3   | 1
           89 |  3   |


Grouping Data
• Choose the number of groups
  • between 5 and 10 is best
• Interval Width = (Range+1)/(Number of Groups)
  • Round up to a convenient value
• Start with lowest value and create the groups.
• Example – for 5 categories
  Interval Width = (58+1)/5 = 12 (rounded up)

Grouping Data

Class Interval   Frequency   Relative Frequency   Cumulative Relative Frequency
66.5-78.5            3             .100                    .100
78.5-90.5            5             .167                    .267
90.5-102.5           8             .266                    .533
102.5-114.5          9             .300                    .833
114.5-126.5          5             .167                   1.000
Total               30            1.000
31 32

Histogram – Graph of Frequency


or Relative Frequency Dot Plot – Graph of Frequency

33 34

Ogive – Graph of Cumulative Relative Frequency

[Figure: ogive with Cumulative Percent (0.0 to 100.0) on the vertical axis and values from 60 to 130 on the horizontal axis]

Measures of Central Tendency
• Mean
  • Arithmetic Average: $\bar{X} = \frac{\sum X_i}{n}$
• Median
  • “Middle” Value after ranking data
  • Not affected by “outliers”
• Mode
  • Most Occurring Value
  • Useful for non-numeric data

35 36


Example

2 2 5 9 12
Circle the Average
a) 2
b) 5
c) 6

Example – 5 Recent Home Sales
• $500,000
• $600,000
• $600,000
• $700,000
• $2,600,000

37 38

Positively Skewed Data Set Negatively Skewed Data Set


Mean > Median Mean < Median

39 40

Symmetric Data Set

Mean = Median

Measures of Variability
• Range
• Variance
• Standard Deviation
• Interquartile Range (percentiles)

41 42


Range

Max(Xi) – Min(Xi)
125 – 67 = 58

Sample Variance

$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$

$s^2 = \frac{\sum x_i^2 - (\sum x_i)^2 / n}{n-1}$

43 44

Sample Standard Deviation

$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$

Variance and Standard Deviation

  Xi     Xi − X̄    (Xi − X̄)²
   2      -4         16
   2      -4         16
   5      -1          1
   9       3          9
  12       6         36
  30       0         78   (totals)

$s^2 = \frac{78}{4} = 19.5$

$s = \sqrt{19.5} \approx 4.42$
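The table above can be checked with a short script. This is only a sketch: the function names are made up for illustration, and it computes the sample variance of the data 2, 2, 5, 9, 12 both from the definition and from the shortcut formula.

```python
import math

def sample_variance(data):
    """Sample variance from the definition: sum of squared deviations over n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def sample_variance_shortcut(data):
    """Sample variance from the shortcut formula using sum(x^2) and (sum x)^2 / n."""
    n = len(data)
    return (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

data = [2, 2, 5, 9, 12]
s2 = sample_variance(data)
s = math.sqrt(s2)
print(s2, sample_variance_shortcut(data), round(s, 2))
```

Both formulas should agree; the shortcut form is convenient when only the column totals are available.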

45 46

Interpreting the Standard Deviation
• Chebyshev’s Rule
  • At least $100\,(1 - 1/k^2)\%$ of any data set must be within k standard deviations of the mean (for k > 1).
• Empirical Rule (68-95-99.7 rule)
  • Bell shaped data
  • 68% within 1 standard deviation of mean
  • 95% within 2 standard deviations of mean
  • 99.7% within 3 standard deviations of mean

Empirical Rule

47 48


Measures of Relative Standing
• Z-score
• Percentile
• Quartiles
• Box Plots

Z-score
• The number of Standard Deviations from the Mean
• Z>0, Xi is greater than mean
• Z<0, Xi is less than mean

$Z = \frac{X_i - \bar{X}}{s}$
49 50

Percentile Rank

Formula for ungrouped data
• The location is (n+1)p (interpolated or rounded)
• n = sample size
• p = percentile

Quartiles
• 25th percentile is 1st quartile
• 50th percentile is median
• 75th percentile is 3rd quartile
• 75th percentile – 25th percentile is called the Interquartile Range which represents the “middle 50%”

51 52

IQR example

n+1 = 31
.25 x 31 = 7.75, location 8 = 87 ← 1st Quartile
.75 x 31 = 23.25, location 23 = 108 ← 3rd Quartile
Interquartile Range (IQR) = 108 – 87 = 21

Box Plots
• A box plot is a graphical display, based on quartiles, that helps to picture a set of data.
• Five pieces of data are needed to construct a box plot:
  • Minimum Value
  • First Quartile
  • Median
  • Third Quartile
  • Maximum Value.
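The quartile locations in the IQR example can be reproduced as follows. This is a sketch that assumes the "rounded" variant of the (n+1)p location rule and uses the 30 daily Internet minutes listed earlier; the helper name is hypothetical.

```python
# Daily minutes spent on the Internet by 30 students (from the earlier slide).
minutes = [102, 104, 85, 67, 101, 71, 116, 107, 99, 82,
           103, 97, 105, 103, 95, 105, 99, 86, 87, 100,
           109, 108, 118, 87, 125, 124, 112, 122, 78, 92]

def percentile_by_location(data, p):
    """Value at the rounded location (n + 1) * p in the ranked data (1-based)."""
    ordered = sorted(data)
    location = round((len(ordered) + 1) * p)
    return ordered[location - 1]

q1 = percentile_by_location(minutes, 0.25)  # location 7.75 rounds to 8
q3 = percentile_by_location(minutes, 0.75)  # location 23.25 rounds to 23
iqr = q3 - q1
print(q1, q3, iqr)
```

Statistical packages often interpolate between neighboring values instead of rounding, so their quartiles may differ slightly from these.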

53 54


Box Plot Outliers


• An outlier is a data point that is far
removed from the other entries in the
data set.
„ Outliers could be
„ Mistakes made in recording data
„ Data that don’t belong in population
„ True rare events

55 56

Outliers have a dramatic effect on some statistics
• Example quarterly home sales for 10 realtors:
  2 2 3 4 5 5 6 6 7 50

            with outlier   without outlier
Mean            9.00            4.44
Median          5.00            5.00
Std Dev        14.51            1.81
IQR             3.00            3.50

Using Box Plot to find outliers
• The “box” is the region between the 1st and 3rd quartiles.
• Possible outliers are more than 1.5 IQR’s from the box (inner fence)
• Probable outliers are more than 3 IQR’s from the box (outer fence)
• In the box plot below, the dotted lines represent the “fences” that are 1.5 and 3 IQR’s from the box. See how the data point 50 is well outside the outer fence and therefore an almost certain outlier.

[Figure: box plot of the home sales data on an axis from 0 to 60]

57 58

Using Z-score to detect outliers
• Calculate the mean and standard deviation without the suspected outlier.
• Calculate the Z-score of the suspected outlier.
• If the Z-score is more than 3 or less than -3, that data point is a probable outlier.

$Z = \frac{50 - 4.4}{1.81} = 25.2$

Outliers – what to do
• Remove or not remove, there is no clear answer.
• For some populations, outliers don’t dramatically change the overall statistical analysis. Example: the tallest person in the world will not dramatically change the mean height of 10000 people.
• However, for some populations, a single outlier will have a dramatic effect on statistical analysis (called “Black Swan” by Nicholas Taleb) and inferential statistics may be invalid in analyzing these populations. Example: the richest person in the world will dramatically change the mean wealth of 10000 people.
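The Z-score check for the suspected outlier in the home sales data can be sketched as below. The variable names are illustrative only; the slide rounds the mean to 4.4, which is why it reports 25.2 while the unrounded calculation gives a value just under that.

```python
import statistics

sales = [2, 2, 3, 4, 5, 5, 6, 6, 7, 50]
suspect = 50

# Mean and sample standard deviation computed WITHOUT the suspected outlier.
rest = [x for x in sales if x != suspect]
mean_rest = statistics.mean(rest)
sd_rest = statistics.stdev(rest)

z = (suspect - mean_rest) / sd_rest
is_probable_outlier = abs(z) > 3
print(round(mean_rest, 2), round(sd_rest, 2), round(z, 1), is_probable_outlier)
```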

59 60


Bivariate Data
• Ordered numeric pairs (X,Y)
• Both values are numeric
• Paired by a common characteristic
• Graph as Scatterplot

Example of Bivariate Data
• Housing Data
  • X = Square Footage
  • Y = Price

61 62

Example of Scatterplot

[Figure: scatterplot “Housing Prices and Square Footage”, Price against Size]

Another Example

[Figure: scatterplot “Housing Prices and Square Footage - San Jose Only”, Price against Size]

63 64

Correlation Analysis
• Correlation Analysis: A group of statistical techniques used to measure the strength of the relationship (correlation) between two variables.
• Scatter Diagram: A chart that portrays the relationship between the two variables of interest.
• Dependent Variable: The variable that is being predicted or estimated. “Effect”
• Independent Variable: The variable that provides the basis for estimation. It is the predictor variable. “Cause?” (Maybe!)

The Coefficient of Correlation, r
• The Coefficient of Correlation (r) is a measure of the strength of the relationship between two variables.
  • It requires interval or ratio-scaled data (variables).
  • It can range from -1 to 1.
  • Values of -1 or 1 indicate perfect and strong correlation.
  • Values close to 0 indicate weak correlation.
  • Negative values indicate an inverse relationship and positive values indicate a direct relationship.
65 66


Perfect Positive Correlation

[Figure: scatterplot of Y against X, both axes from 0 to 10]

Perfect Negative Correlation

[Figure: scatterplot of Y against X, both axes from 0 to 10]
67 68

Zero Correlation

[Figure: scatterplot of Y against X, both axes from 0 to 10]

Strong Positive Correlation

[Figure: scatterplot of Y against X, both axes from 0 to 10]
69 70

Weak Negative Correlation

[Figure: scatterplot of Y against X, both axes from 0 to 10]

Causation
• Correlation does not necessarily imply causation.
• There are 4 possibilities if X and Y are correlated:
  1. X causes Y
  2. Y causes X
  3. X and Y are caused by something else.
  4. Confounding - The effects of X and Y are hopelessly mixed up with other variables.
71 72


Causation - Examples
• Cities with more police per capita have more crime per capita.
• As ice cream sales go up, shark attacks go up.
• People with a cold who take a cough medicine feel better after some rest.

Formula for correlation coefficient r

$r = \frac{SSXY}{\sqrt{SSX \cdot SSY}}$

$SSX = \sum X^2 - \frac{1}{n}\left(\sum X\right)^2$

$SSY = \sum Y^2 - \frac{1}{n}\left(\sum Y\right)^2$

$SSXY = \sum XY - \frac{1}{n}\left(\sum X \cdot \sum Y\right)$

73 74

Example
• X = Average Annual Rainfall (Inches)
• Y = Average Sale of Sunglasses/1000
• Make a Scatter Diagram
• Find the correlation coefficient

X   10   15   20   30   40
Y   40   35   25   25   15

75 76

Example continued

[Figure: scatter diagram of sunglasses sales per 1000 (0 to 60) against rainfall (0 to 50)]

Example continued

    X     Y     X²     Y²     XY
   10    40    100   1600    400
   15    35    225   1225    525
   20    25    400    625    500
   30    25    900    625    750
   40    15   1600    225    600
  115   140   3225   4300   2775   (totals)

• SSX = 3225 - 115²/5 = 580
• SSY = 4300 - 140²/5 = 380
• SSXY = 2775 - (115)(140)/5 = -445
77 78


Example continued

$r = \frac{SSXY}{\sqrt{SSX \cdot SSY}} = \frac{-445}{\sqrt{580 \cdot 380}} = -0.9479$

• Strong negative correlation
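The rainfall and sunglasses example can be verified with the SSX, SSY and SSXY formulas. The script below is a minimal sketch with illustrative variable names; it follows the sum-of-squares formulas from the slides rather than calling a statistics library.

```python
import math

x = [10, 15, 20, 30, 40]   # average annual rainfall (inches)
y = [40, 35, 25, 25, 15]   # average sale of sunglasses per 1000
n = len(x)

ssx = sum(v * v for v in x) - sum(x) ** 2 / n
ssy = sum(v * v for v in y) - sum(y) ** 2 / n
ssxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n

r = ssxy / math.sqrt(ssx * ssy)
print(ssx, ssy, ssxy, round(r, 4))
```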

79



Part 2 Probability

Math 10
Part 2
Probability
© Maurice Geraghty 2015

Probability
• Classical probability
  • Based on mathematical formulas
• Empirical probability
  • Based on the relative frequencies of historical data.
• Subjective probability
  • “one-shot” educated guess.

1 2

Examples of Probability
• What is the probability of rolling a four on a 6-sided die?
• What percentage of De Anza students live in Cupertino?
• What is the chance that the Golden State Warriors will repeat as NBA champions in 2016?

Classical Probability
• Event
  • A result of an experiment
• Outcome
  • A result of the experiment that cannot be broken down into smaller events
• Sample Space
  • The set of all possible outcomes
• Probability Event Occurs
  • # of elements in Event / # Elements in Sample Space
• Example – flip two coins, find the probability of exactly 1 head.
  • {HH, HT, TH, TT}
  • P(1 head) = 2/4 = .5

3 4

Empirical Probability
• Historical Data
• Relative Frequencies
• Example: What is the chance someone rates their community as good or better?
  0.51 + 0.32 = 0.83

[Figure: bar chart “National: Rate Your community”, Percentage of Sample by Rating, with bars of 51, 32, 13, 3 and 1 percent]

Rule of Complement
• Complement of an event
  • The event does not occur
  • A’ is the complement of A
• P(A) + P(A’) = 1
• P(A) = 1 – P(A’)

[Figure: Venn diagram showing A and its complement A’]

5 6


Additive Rule
• The UNION of two events A and B is that either A or B occur (or both). (All colored parts)
• The INTERSECTION of two events A and B is that both A and B will occur. (Purple Part only)
• Additive Rule:
  P(A or B) = P(A) + P(B) – P(A and B)

[Figure: Venn diagram of overlapping events A and B]

Example
• In a group of students, 40% are taking Math, 20% are taking History.
• 10% of students are taking both Math and History.
• Find the Probability of a Student taking either Math or History or both.
• P(M or H) = 40% + 20% - 10% = 50%

7 8

Mutually Exclusive
• Mutually Exclusive
  • Both cannot occur
• If A and B are mutually exclusive, then
  • P(A or B) = P(A) + P(B)
• Example roll a die
  • A: Roll 2 or less  B: Roll 5 or more
  • P(A)=2/6  P(B)=2/6
  • P(A or B) = P(A) + P(B) = 4/6

Conditional Probability
• The probability of an event occurring GIVEN another event has already occurred.
• P(A|B) = P(A and B) / P(B)
• Example: Of all cell phone users in the US, 15% have a smart phone with AT&T. 25% of all cell phone users use AT&T. Given a selected cell phone user has AT&T, find the probability the user also has a smart phone.
  • A=AT&T subscriber  B=Smart Phone User
  • P(A and B) = .15  P(A)=.25
  • P(B|A) = .15/.25 = .60

9 10

Contingency Tables
• Two data items can be displayed in a contingency table.
• Example: auto accident during year and DUI of driver.

            Accident   No Accident   Total
DUI            70          130         200
Non-DUI        30          770         800
Total         100          900        1000

Contingency Tables

Given the Driver is DUI, find the Probability of an Accident.
A=Accident  D=DUI
P(A and D) = .07  P(D) = .2
P(A|D) = .07/.2 = .35

11 12


Multiplicative Rule
• P(A and B) = P(A) x P(B|A)
• P(A and B) = P(B) x P(A|B)
• Example: A box contains 4 green balls and 3 red balls. Two balls are drawn. Find the probability of choosing two red balls.
  • A=Red Ball on 1st draw  B=Red Ball on 2nd Draw
  • P(A)=3/7  P(B|A)=2/6
  • P(A and B) = (3/7)(2/6) = 1/7

Independence
• If A is not dependent on B, then they are INDEPENDENT events, and the following statements are true:
  • P(A|B)=P(A)
  • P(B|A)=P(B)
  • P(A and B) = P(A) x P(B)

13 14

Example

            Accident   No Accident   Total
DUI            70          130         200
Non-DUI        30          770         800
Total         100          900        1000

A: Accident  D: DUI Driver
P(A) = .10  P(A|D) = .35 (70/200)
Therefore A and D are DEPENDENT events as P(A) < P(A|D)

Example

             Accident   No Accident   Total
US Car          60          540         600
Import Car      40          360         400
Total          100          900        1000

A: Accident  U: US Car
P(A) = .10  P(A|U) = .10 (60/600)
Therefore A and U are INDEPENDENT events as P(A) = P(A|U)
Also P(A and U) = P(A)xP(U) = (.1)(.6) = .06
15 16

Random Sample
• A random sample is where each member of the population has an equally likely chance of being chosen, and each member of the sample is INDEPENDENT of all other sampled data.

Tree Diagram method
• Alternative Method of showing probability
• Example: Flip Three Coins
• Example: A Circuit has three switches. If at least two of the switches function, the Circuit will succeed. Each switch has a 10% failure rate if all are operating, and a 20% failure rate if one switch has already failed. Find the probability the circuit will succeed.

17 18


Circuit Problem

[Tree diagram: A, B, C mean the switch works; A’, B’, C’ mean it fails]
A (.9), B (.9): .81
A (.9), B’ (.1), C (.8): .072
A (.9), B’ (.1), C’ (.2): .018
A’ (.1), B (.8), C (.8): .064
A’ (.1), B (.8), C’ (.2): .016
A’ (.1), B’ (.2): .02

Pr(Good) = .81+.072+.064 = .946

Switching the Conditionality
• Often there are questions where you desire to change the conditionality from one variable to the other variable
• First construct a tree diagram.
• Second, move the information to a Contingency Table
• From the Contingency table it is easy to calculate all conditional probabilities.
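The circuit problem's tree can be checked by multiplying along each successful branch. This sketch assumes the reading of the tree in which a switch fails with probability 0.10 while all earlier switches work and 0.20 once one has failed; the variable names are illustrative.

```python
p_fail_normal = 0.10      # failure rate while no switch has failed yet
p_fail_degraded = 0.20    # failure rate after one switch has already failed

# Successful paths: at least two of the three switches work.
path_a_b = (1 - p_fail_normal) * (1 - p_fail_normal)                  # A and B work
path_a_c = (1 - p_fail_normal) * p_fail_normal * (1 - p_fail_degraded)  # A works, B fails, C works
path_b_c = p_fail_normal * (1 - p_fail_degraded) * (1 - p_fail_degraded)  # A fails, B and C work

p_success = path_a_b + path_a_c + path_b_c
print(round(path_a_b, 3), round(path_a_c, 3), round(path_b_c, 3), round(p_success, 3))
```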
19 20

Example
• 10% of prisoners in a Canadian prison are HIV positive.
• A test will correctly detect HIV 95% of the time, but will incorrectly “detect” HIV in non-infected prisoners 15% of the time (false positive).
• If a randomly selected prisoner tests positive, find the probability the prisoner is HIV+

Example

A=Prisoner is HIV+
B=Test is Positive for HIV

[Tree diagram]
A (.1), B (.95): .095
A (.1), B’ (.05): .005
A’ (.9), B (.15): .135
A’ (.9), B’ (.85): .765
21 22

Example

               HIV+ (A)   HIV- (A’)   Total
Test+ (B)        .095       .135       .230
Test- (B’)       .005       .765       .770
Total            .100       .900      1.000

$P(A \mid B) = \frac{.095}{.230} \approx .413$
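The switch of conditionality in the HIV test example amounts to the calculation below. This is a sketch with a hypothetical helper name; it builds the same joint probabilities as the contingency table and then conditions on a positive test.

```python
def probability_infected_given_positive(prior, detection_rate, false_positive_rate):
    """P(A|B) from P(A), P(B|A) and P(B|A') using the joint probabilities."""
    true_positive = prior * detection_rate              # P(A and B)
    false_positive = (1 - prior) * false_positive_rate  # P(A' and B)
    return true_positive / (true_positive + false_positive)

p = probability_infected_given_positive(prior=0.10,
                                        detection_rate=0.95,
                                        false_positive_rate=0.15)
print(round(p, 3))
```

Even with a 95% detection rate, fewer than half of the positive tests come from infected prisoners, because the uninfected group is nine times larger.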
23



Math 10 Part 3 – Discrete Random Variables

Math 10
Part 3
Discrete Random Variables
© Maurice Geraghty 2013

Random Variable
• The value of the variable depends on an experiment, observation or measurement.
• The result is not known in advance.
• For the purposes of this class, the variable will be numeric.

1 2

Random Variables
• Discrete – Data that you Count
  • Defects on an assembly line
  • Reported Sick days
  • RM 7.0 earthquakes on San Andreas Fault
• Continuous – Data that you Measure
  • Temperature
  • Height
  • Time

Discrete Random Variable
• List Sample Space
• Assign probabilities P(x) to each event x
• Use “relative frequencies”
• Must follow two rules
  • P(x) ≥ 0
  • ΣP(x) = 1
• P(x) is called a Probability Distribution Function or pdf for short.
3 4

Probability Distribution Example
• Students are asked 4 questions and the number of correct answers is determined.
• Assign probabilities to each event.

  x    P(x)
  0    .1
  1    .1
  2    .2
  3    .4
  4

Probability Distribution Example
• Students are asked 4 questions and the number of correct answers is determined.
• Assign probabilities to each event.

  x    P(x)
  0    .1
  1    .1
  2    .2
  3    .4
  4    .2

5 6


Mean and Variance of Discrete Random Variables
• Population mean μ is the expected value of x
  μ = Σ[ (x) P(x) ]
• Population variance σ² is the expected value of (x-μ)²
  σ² = Σ[ (x-μ)² P(x) ]

Example of Mean and Variance

  x      P(x)    xP(x)    (x-μ)²P(x)
  0      0.1     0.0      .625
  1      0.1     0.1      .225
  2      0.2     0.4      .050
  3      0.4     1.2      .100
  4      0.2     0.8      .450
  Total  1.0     2.5=μ    1.450=σ²
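The mean and variance table can be reproduced directly from the two summation formulas. This is a small sketch; the dictionary layout is just one convenient way to hold the distribution.

```python
# Probability distribution for the number of correct answers (x: P(x)).
distribution = {0: 0.1, 1: 0.1, 2: 0.2, 3: 0.4, 4: 0.2}

# The probabilities must be non-negative and sum to 1.
total_probability = sum(distribution.values())

mu = sum(x * p for x, p in distribution.items())
variance = sum((x - mu) ** 2 * p for x, p in distribution.items())
print(round(total_probability, 3), round(mu, 3), round(variance, 3))
```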

7 8

Bernoulli Distribution
• Experiment is one trial
• 2 possible outcomes (Success, Failure)
• p=probability of success
• q=probability of failure
• X=number of successes (1 or 0)
• Also known as Indicator Variable

Mean and Variance of Bernoulli

  x      P(x)     xP(x)    (x-μ)²P(x)
  0      (1-p)    0.0      p²(1-p)
  1      p        p        p(1-p)²
  Total  1.0      p=μ      p(1-p)=σ²

• μ = p
• σ² = p(1-p) = pq

9 10

Binomial Distribution
• n identical trials
• Two possible outcomes (success/failure)
• Probability of success in a single trial is p
• Trials are mutually independent
• X is the number of successes
• Note: X is a sum of n independent identically distributed Bernoulli distributions

Binomial Distribution
• n independent Bernoulli trials
• Mean and Variance of Binomial Distribution is just sample size times mean and variance of Bernoulli Distribution

$p(x) = {}_nC_x \, p^x (1-p)^{n-x}$

$\mu = E(X) = np$

$\sigma^2 = Var(X) = np(1-p)$

11 12


Binomial Examples
• The number of defective parts in a fixed sample.
• The number of adults in a sample who support the war in Iraq.
• The number of correct answers if you guess on a multiple choice test.

Binomial Example
• 90% of intake valves manufactured are good (not defective). A sample of 10 is selected.
• Find the probability of exactly 8 good valves being chosen.
• Find the probability of 9 or more good valves being chosen.
• Find the probability of 8 or fewer good valves being chosen.

13 14

Using Technology

Use Minitab or Excel to make a table of Binomial Probabilities.
P(X=8) = .19371
P(X≤8) = .26390
P(X≥9) = 1 - P(X≤8) = .73610

Poisson Distribution
• Occurrences per time period (rate)
• Rate (μ) is constant
• No limit on occurrences over time period

$P(x) = \frac{e^{-\mu} \mu^x}{x!}$

Mean $= \mu$, $\sigma = \sqrt{\mu}$
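The binomial probabilities quoted under Using Technology for the intake valve example (n = 10, p = 0.9) can also be computed without Minitab or Excel. The sketch below uses only the standard library; the function name is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.9
p_exactly_8 = binomial_pmf(8, n, p)
p_at_most_8 = sum(binomial_pmf(x, n, p) for x in range(0, 9))
p_at_least_9 = 1 - p_at_most_8
print(round(p_exactly_8, 5), round(p_at_most_8, 5), round(p_at_least_9, 5))
```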

15 16

Examples of Poisson
• Text messages in the next hour
• Earthquakes on a fault
• Customers at a restaurant
• Flaws in sheet metal produced
• Lotto winners
Note: A binomial distribution with a large n and small p is approximately Poisson with μ ≈ np.

Poisson Example
• Earthquakes of Richter magnitude 3 or greater occur on a certain fault at a rate of twice every year.
• Find the probability of at least one earthquake of RM 3 or greater in the next year.

$P(X > 0) = 1 - P(0) = 1 - \frac{e^{-2} 2^0}{0!} = 1 - e^{-2} \approx .8647$
17 18


Poisson Example (cont)
• Earthquakes of Richter magnitude 3 or greater occur on a certain fault at a rate of twice every year.
• Find the probability of exactly 6 earthquakes of RM 3 or greater in the next 2 years.

$\mu = 2(2) = 4$

$P(X = 6) = \frac{e^{-4} 4^6}{6!} \approx .1042$
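Both earthquake calculations follow from the Poisson formula. The sketch below uses a hypothetical helper and rescales the rate to the two-year period for the second question.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu ** x / factorial(x)

rate_per_year = 2

# At least one earthquake in the next year.
p_at_least_one = 1 - poisson_pmf(0, rate_per_year)

# Exactly 6 earthquakes in the next 2 years: the mean scales with the period.
mu_two_years = rate_per_year * 2
p_exactly_six = poisson_pmf(6, mu_two_years)
print(round(p_at_least_one, 4), round(p_exactly_six, 4))
```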

19



Math 10 Part 4 Slides Continuous Random Variables

Math 10
Part 4 Slides
Continuous Random Variables and the Central Limit Theorem
© Maurice Geraghty, 2015

Continuous Distributions
• “Uncountable” Number of possibilities
• Probability of a point makes no sense
• Probability is measured over intervals
• Comparable to Relative Frequency Histogram – Find Area under curve.

1 2

Discrete vs Continuous

Discrete                                       Continuous
• Countable                                    • Uncountable
• Discrete Points                              • Continuous Intervals
• p(x) is probability distribution function   • f(x) is probability density function
• p(x) ≥ 0                                     • f(x) ≥ 0
• Σp(x) = 1                                    • Total Area under curve = 1

Continuous Random Variable
• f(x) is a density function
• P(X<x) is a distribution function.
• P(a<X<b) = area under function between a and b

3 4

Exponential distribution
• Waiting time
• “Memoryless”
• $f(x) = \frac{1}{\mu} e^{-x/\mu}$
• $P(x > a) = e^{-a/\mu}$
• Mean $= \mu$, $\sigma^2 = \mu^2$
• $P(x > a+b \mid x > b) = e^{-a/\mu}$

Examples of Exponential Distribution
• Time until…
  • a circuit will fail
  • the next RM 7 Earthquake
  • the next customer calls
  • An oil refinery accident
  • you buy a winning lotto ticket

5 6


Relationship between Poisson and Exponential Distributions
• If occurrences follow a Poisson Process with mean = μ, then the waiting time for the next occurrence has Exponential distribution with mean = 1/μ.
• Example: If accidents occur at a plant at a constant rate of 3 per month, then the expected waiting time for the next accident is 1/3 month.

Exponential Example

The life of a digital display of a calculator has exponential distribution with μ=500 hours.

(a) Find the chance the display will last at least 600 hours.
$P(x > 600) = e^{-600/500} = e^{-1.2} = .3012$

(b) Assuming it has already lasted 500 hours, find the chance the display will last an additional 600 hours.
$P(x > 1100 \mid x > 500) = P(x > 600) = .3012$
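The calculator display example illustrates the memoryless property, which can be confirmed numerically. This sketch uses a hypothetical `survival` helper for P(x > a) = e^(−a/μ).

```python
from math import exp

def survival(a, mu):
    """P(X > a) for an exponential random variable with mean mu."""
    return exp(-a / mu)

mu = 500  # mean life of the display in hours

p_at_least_600 = survival(600, mu)

# Conditional probability of lasting 1100 hours given it has already lasted 500.
p_additional_600 = survival(1100, mu) / survival(500, mu)
print(round(p_at_least_600, 4), round(p_additional_600, 4))
```

The two values agree because the hours already survived do not change the distribution of the remaining life.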

7 8

Exponential Example

The life of a digital display of a calculator has exponential distribution with μ=500 hours.

(a) Find the median of the distribution
$P(x > med) = e^{-med/500} = 0.5$
$med = -\ln(.5) \times 500 = 346.57$

pth Percentile $= -\ln(1-p)\,\mu$

Uniform Distribution
• Rectangular distribution
• Example: Random number generator

$f(x) = \frac{1}{b-a}, \quad a \le x \le b$

$\mu = E(X) = \frac{b+a}{2}$

$\sigma^2 = Var(X) = \frac{(b-a)^2}{12}$
9 10

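Uniform Distribution - Probability

[Figure: rectangle over the interval from a to b, with the region between c and d shaded]

$P(c < X < d) = \frac{d-c}{b-a}$

Uniform Distribution - Percentile

[Figure: rectangle over the interval from a to b, with Area = p to the left of Xp]

Formula to find the pth percentile Xp:

$X_p = a + p(b-a)$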

11 12


Uniform Example 1
• Find mean, variance, P(X<3) and 70th percentile for a uniform distribution from 1 to 11.

$\mu = \frac{1 + 11}{2} = 6 \qquad \sigma^2 = \frac{(11-1)^2}{12} = 8.33$

$P(X < 3) = \frac{3-1}{11-1} = 0.2$

$X_{70} = 1 + 0.7(11-1) = 8$

Uniform Example 2
• A tea lover orders 1000 grams of Tie Guan Yin loose leaf when his supply gets to 50 grams.
• The amount of tea currently in stock follows a uniform random variable.
• Determine this model
• Find the mean and variance
• Find the probability of at least 700 grams in stock.
• Find the 80th percentile
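The calculations in Uniform Example 1 (a uniform distribution from 1 to 11) can be checked with the formulas for the mean, variance, interval probability and percentile. The function names below are illustrative, and the sketch assumes the interval (c, d) lies inside (a, b). Note that P(X < 3) = (3 − 1)/(11 − 1) = 0.2.

```python
def uniform_mean(a, b):
    return (a + b) / 2

def uniform_variance(a, b):
    return (b - a) ** 2 / 12

def uniform_probability(c, d, a, b):
    """P(c < X < d) for a uniform distribution on (a, b), assuming a <= c <= d <= b."""
    return (d - c) / (b - a)

def uniform_percentile(p, a, b):
    """Value X_p with proportion p of the area to its left."""
    return a + p * (b - a)

a, b = 1, 11
mean = uniform_mean(a, b)
variance = uniform_variance(a, b)
p_less_than_3 = uniform_probability(a, 3, a, b)
x70 = uniform_percentile(0.70, a, b)
print(mean, round(variance, 2), p_less_than_3, round(x70, 2))
```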
13 14

Uniform Example 3
• A bus arrives at a stop every 20 minutes.
• Find the probability of waiting more than 15 minutes for the bus after arriving randomly at the bus stop.
• If you have already waited 5 minutes, find the probability of waiting an additional 10 minutes or more. (Hint: recalculate parameters a and b)

Normal Distribution
• The normal curve is bell-shaped
• The mean, median, and mode of the distribution are equal and located at the peak.
• The normal distribution is symmetrical about its mean. Half the area under the curve is above the peak, and the other half is below it.
• The normal probability distribution is asymptotic - the curve gets closer and closer to the x-axis but never actually touches it.

$f(x) = \frac{e^{-\frac{1}{2\sigma^2}(x-\mu)^2}}{\sigma\sqrt{2\pi}}$

15 16

The Standard Normal Probability Distribution
• A normal distribution with a mean of 0 and a standard deviation of 1 is called the standard normal distribution.
• Z value: The distance between a selected value, designated x, and the population mean μ, divided by the population standard deviation, σ

$Z = \frac{X - \mu}{\sigma}$

Areas Under the Normal Curve – Empirical Rule
• About 68 percent of the area under the normal curve is within one standard deviation of the mean. μ ± 1σ
• About 95 percent is within two standard deviations of the mean. μ ± 2σ
• 99.7 percent is within three standard deviations of the mean. μ ± 3σ

17 18


EXAMPLE
• The daily water usage per person in a town is normally distributed with a mean of 20 gallons and a standard deviation of 5 gallons.
• About 68% of the daily water usage per person in New Providence lies between what two values?
• μ ± 1σ = 20 ± 1(5). That is, about 68% of the daily water usage will lie between 15 and 25 gallons.

Normal Distribution – probability problem procedure
• Given: Interval in terms of X
• Convert to Z by $Z = \frac{X - \mu}{\sigma}$
• Look up probability in table.

19 20

EXAMPLE
• The daily water usage per person in a town is normally distributed with a mean of 20 gallons and a standard deviation of 5 gallons.
• What is the probability that a person from the town selected at random will use less than 18 gallons per day?
• The associated Z value is Z=(18-20)/5=-0.40.
• Thus, P(X<18)=P(Z<-0.40)=.3446

EXAMPLE continued
• The daily water usage per person in a town is normally distributed with a mean of 20 gallons and a standard deviation of 5 gallons.
• What proportion of the people uses between 18 and 24 gallons?
• The Z value associated with X=18 is Z=-0.40 and with X=24, Z=(24-20)/5=0.80.
• Thus, P(18<X<24)=P(-0.40<Z<0.80)=.7881-.3446=.4435

21 22

EXAMPLE continued
• The daily water usage per person in a town is normally distributed with a mean of 20 gallons and a standard deviation of 5 gallons.
• What percentage of the population uses more than 26.2 gallons?
• The Z value associated with X=26.2 is Z=(26.2-20)/5=1.24.
• Thus P(X>26.2)=P(Z>1.24)=1-.8925=.1075

Normal Distribution – percentile problem procedure
• Given: probability or percentile desired.
• Look up Z value in table that corresponds to probability.
• Convert to X by the formula: X = μ + Zσ

EXAMPLE
• The daily water usage per person in a town is normally distributed with a mean of 20 gallons and a standard deviation of 5 gallons. A special tax is going to be charged on the top 5% of water users.
• Find the value of daily water usage that generates the special tax.
• The Z value associated with the 95th percentile is 1.645.
• X=20 + 5(1.645) = 28.2 gallons per day

EXAMPLE
• Professor Kurv has determined that the final averages in his statistics course are normally distributed with a mean of 77.1 and a standard deviation of 11.2.
• He decides to assign his grades for his current course such that the top 15% of the students receive an A.
• What is the lowest average a student can receive to earn an A?
• The cutoff for the top 15% is the 85th percentile. Find k such that P(X<k)=.85.
• The corresponding Z value is 1.04. Thus we have X=77.1+(1.04)(11.2), or X=88.75
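The percentile procedure (look up Z, then convert with X = μ + Zσ) can be sketched in code as follows. This assumes Python's standard-library `statistics.NormalDist`; the helper name `percentile_value` is hypothetical and used only for illustration.

```python
from statistics import NormalDist

def percentile_value(mu, sigma, p):
    """Return x such that P(X < x) = p for a normal X, via X = mu + Z*sigma."""
    z = NormalDist().inv_cdf(p)  # Z value for the desired percentile
    return mu + z * sigma

tax_cutoff = percentile_value(20, 5, 0.95)         # top 5% of water users
grade_cutoff = percentile_value(77.1, 11.2, 0.85)  # top 15% of students

print(f"Water usage cutoff: {tax_cutoff:.1f} gallons per day")
print(f"Lowest A average:   {grade_cutoff:.1f}")
```

Because the code uses an unrounded Z value (about 1.036 rather than the table's 1.04), the grade cutoff comes out slightly below the 88.75 shown on the slide.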

EXAMPLE
• The amount of tip the servers in an exclusive restaurant receive per shift is normally distributed with a mean of $80 and a standard deviation of $10.
• Shelli feels she has provided poor service if her total tip for the shift is less than $65.
• What percentage of the time will she feel like she provided poor service?
• Let X be the amount of tip. The Z value associated with X=65 is Z=(65-80)/10=-1.5.
• Thus P(X<65)=P(Z<-1.5)=.0668.

Distribution of Sample Mean
• Random Sample: X1, X2, X3, …, Xn
  • Each Xi is a Random Variable from the same population
  • All Xi’s are Mutually Independent
• X̄ is a function of Random Variables, so X̄ is itself a Random Variable.
  • In other words, the Sample Mean can change if the values of the Random Sample change.
• What is the Probability Distribution of X̄?

Example – Roll 1 Die
[Figure: Probability Distribution of Sample Mean, 1 Die Roll. Horizontal axis: sample mean, 1 to 6; vertical axis: probability.]

Example – Roll 2 Dice
[Figure: Probability Distribution of Sample Mean, 2 Die Rolls. Horizontal axis: sample mean, 1 to 6 in steps of 0.5; vertical axis: probability.]

Example – Roll 10 Dice
[Figure: Probability Distribution of Sample Mean, 10 Die Rolls. Horizontal axis: sample mean, 1 to 6; vertical axis: probability.]

Example – Roll 30 Dice
[Figure: Probability Distribution of Sample Mean, 30 Die Rolls. Horizontal axis: sample mean, 1 to 6; vertical axis: probability.]
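The dice charts above can be reproduced numerically. The sketch below is one possible way to do it, not the method used to draw the slides: it builds the exact distribution of the sum of n fair dice by repeated convolution and then summarizes the sample mean. The function names are hypothetical.

```python
from fractions import Fraction

def dice_sum_distribution(n):
    """Exact distribution of the sum of n fair six-sided dice, as {sum: probability}."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        new = {}
        for total, prob in dist.items():
            for face in range(1, 7):
                new[total + face] = new.get(total + face, Fraction(0)) + prob / 6
        dist = new
    return dist

def sample_mean_summary(n):
    """Mean and standard deviation of the sample mean of n dice."""
    dist = dice_sum_distribution(n)
    mean = sum((s / n) * float(p) for s, p in dist.items())
    var = sum(((s / n) - mean) ** 2 * float(p) for s, p in dist.items())
    return mean, var ** 0.5

for n in (1, 2, 10, 30):
    mean, sd = sample_mean_summary(n)
    print(f"n={n:2d}: mean of sample mean = {mean:.3f}, standard deviation = {sd:.3f}")
```

The mean of the sample mean stays at 3.5 for every n, while its standard deviation shrinks in proportion to 1/√n, which is the pattern the four charts illustrate.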

Example - Poisson
[Figure: Poisson probability distribution p(x), with labels μ=1.25 and μ=12.5. Horizontal axis: x; vertical axis: p(x), 0.00 to 0.40.]

Central Limit Theorem – Part 1
• IF a Random Sample of any size is taken from a population with a Normal Distribution with mean = μ and standard deviation = σ
• THEN the distribution of the sample mean has a Normal Distribution with:

$\mu_{\bar{X}} = \mu \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

[Figure: dot plot of sample means X̄, centered at μ.]
Central Limit Theorem – Part 2
• IF a random sample of sufficiently large size is taken from a population with any Distribution with mean = μ and standard deviation = σ
• THEN the distribution of the sample mean has approximately a Normal Distribution with:

$\mu_{\bar{X}} = \mu \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

[Figure: dot plot of sample means X̄, centered at μ.]

Central Limit Theorem
3 important results for the distribution of X̄
• Mean Stays the same: $\mu_{\bar{X}} = \mu$
• Standard Deviation Gets Smaller: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
• If n is sufficiently large, X̄ has approximately a Normal Distribution
Example
The mean height of American men (ages 20-29) is μ = 69.2 inches. If a random sample of 60 men in this age group is selected, what is the probability the mean height for the sample is greater than 70 inches? Assume σ = 2.9 inches.

$P(\bar{X} > 70) = P\left(Z > \frac{70 - 69.2}{2.9/\sqrt{60}}\right) = P(Z > 2.14) = 0.0162$

Example (cont)
• Population: μ = 69.2, σ = 2.9
• Sample mean: $\mu_{\bar{X}} = 69.2$, $\sigma_{\bar{X}} = \frac{2.9}{\sqrt{60}} = 0.3744$
[Figure: population distribution centered at 69.2, with a dot plot of sample means below it.]

Example – Central Limit Theorem
The waiting time until receiving a text message follows an exponential distribution with an expected waiting time of 1.5 minutes. Find the probability that the mean waiting time for 50 text messages exceeds 1.6 minutes.

μ = 1.5, σ = 1.5, n = 50
Use Normal Distribution (n>30)

$P(\bar{X} > 1.6) = P\left(Z > \frac{1.6 - 1.5}{1.5/\sqrt{50}}\right) = P(Z > 0.47) = 0.3192$
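Both sample-mean examples above (men's heights and text-message waiting times) follow the same recipe: replace σ by the standard error σ/√n and use the Normal Distribution. A minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_exceeds(value, mu, sigma, n):
    """P(sample mean > value), using a normal model for the sample mean."""
    standard_error = sigma / sqrt(n)  # standard deviation of the sample mean
    return 1 - NormalDist(mu, standard_error).cdf(value)

p_height = prob_sample_mean_exceeds(70, mu=69.2, sigma=2.9, n=60)
p_wait = prob_sample_mean_exceeds(1.6, mu=1.5, sigma=1.5, n=50)

print(f"P(mean height > 70) = {p_height:.4f}")
print(f"P(mean wait > 1.6)  = {p_wait:.4f}")
```

The results differ from the slide values only in the third or fourth decimal place, because the slides round Z to two decimals before using the table.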



Math 10 – Part 5 – Confidence Intervals

Math 10
Part 5 Slides
Confidence Intervals
© Maurice Geraghty, 2009

Inference Process
[Figure: Inference Process diagram.]

Inferential Statistics
• Population Parameters
  • Mean = μ
  • Standard Deviation = σ
• Sample Statistics
  • Mean = X̄
  • Standard Deviation = s

Inferential Statistics
• Estimation
  • Using sample data to estimate population parameters.
  • Example: Public opinion polls
• Hypothesis Testing
  • Using sample data to make decisions or claims about population
  • Example: A drug effectively treats a disease

Estimation of μ
X̄ is an unbiased point estimator of μ.

Example: The number of defective items produced by a machine was recorded for five randomly selected hours during a 40-hour work week. The observed numbers of defectives were 12, 4, 7, 14, and 10. So the sample mean is 9.4.

Thus a point estimate for μ, the hourly mean number of defectives, is 9.4.
Confidence Intervals
• An Interval Estimate states the range within which a population parameter “probably” lies.
• The interval within which a population parameter is expected to occur is called a Confidence Interval.
• The distance from the center of the confidence interval to the endpoint is called the “Margin of Error”.
• The three confidence intervals that are used extensively are the 90%, 95% and 99%.

Confidence Intervals
• A 95% confidence interval means that about 95% of the similarly constructed intervals will contain the parameter being estimated, or 95% of the sample means for a specified sample size will lie within 1.96 standard deviations of the hypothesized population mean.
• For the 99% confidence interval, 99% of the sample means for a specified sample size will lie within 2.58 standard deviations of the hypothesized population mean.
• For the 90% confidence interval, 90% of the sample means for a specified sample size will lie within 1.645 standard deviations of the hypothesized population mean.

90%, 95% and 99% Confidence Intervals for µ
• The 90%, 95% and 99% confidence intervals for μ are constructed as follows when n ≥ 30
• 90% CI for the population mean is given by $\bar{X} \pm 1.645 \frac{\sigma}{\sqrt{n}}$
• 95% CI for the population mean is given by $\bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}}$
• 99% CI for the population mean is given by $\bar{X} \pm 2.58 \frac{\sigma}{\sqrt{n}}$

Constructing General Confidence Intervals for µ
• In general, a confidence interval for the mean is computed by: $\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$
• This can also be thought of as: Point Estimator ± Margin of Error
The nature of Confidence Intervals
• The Population mean μ is fixed.
• The confidence interval is centered around the sample mean, which is a Random Variable.
• So the Confidence Interval (Random Variable) is like a target trying to hit a fixed dart (μ).

EXAMPLE
• The Dean wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours.
• The point estimate is 24 hours (sample mean).
• What is the 95% confidence interval for the average number of hours worked per week by the students?
EXAMPLE continued
• Using the 95% CI for the population mean, we have 24 ± 1.96(4/7) = 22.88 to 25.12
• The endpoints of the confidence interval are the confidence limits. The lower confidence limit is 22.88 and the upper confidence limit is 25.12.

EXAMPLE continued
• Using the 99% CI for the population mean, we have 24 ± 2.58(4/7) = 22.53 to 25.47
• Compare to the 95% confidence interval. A higher level of confidence means the confidence interval must be wider.
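The Dean's example can be reproduced with a short function that applies Point Estimator ± Margin of Error. This is a sketch under the same assumption the slides make (the standard deviation of 4 hours is used as σ with a Z multiplier); the function name is hypothetical, and the Z value is computed with the standard-library `statistics.NormalDist` rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence):
    """Confidence interval for the mean: sample_mean ± Z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

ci_95 = z_confidence_interval(24, 4, 49, 0.95)
ci_99 = z_confidence_interval(24, 4, 49, 0.99)

print(f"95% CI: {ci_95[0]:.2f} to {ci_95[1]:.2f}")
print(f"99% CI: {ci_99[0]:.2f} to {ci_99[1]:.2f}")
```

Running it for both confidence levels makes the last bullet concrete: the 99% interval is wider than the 95% interval for the same data.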

Selecting a Sample Size
• There are 3 factors that determine the size of a sample, none of which has any direct relationship to the size of the population. They are:
  • The degree of confidence selected.
  • The maximum allowable error.
  • The variation of the population.

Sample Size for the Mean
• A convenient computational formula for determining n is:

$n = \left(\frac{Z\sigma}{E}\right)^2$

where E is the allowable error (margin of error), Z is the z score associated with the degree of confidence selected, and σ is the population standard deviation, estimated for example by the sample standard deviation from a pilot survey.
• σ can be estimated by past data, target sample or range of data.

Normal Family of Distributions: Z, t, χ², F
[Figure: diagram of the Normal family of distributions.]

EXAMPLE
• A consumer group would like to estimate the mean monthly electric bill for a single family house in July. Based on similar studies the standard deviation is estimated to be $20.00. A 99% level of confidence is desired, with an accuracy of ± $5.00. How large a sample is required?

n = [(2.58)(20)/5]² = 106.5024 ≈ 107
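The sample-size formula always rounds up, since a fractional observation is not possible. A minimal sketch of the electric-bill calculation, assuming the standard-library `statistics.NormalDist` for the Z value; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin_of_error, confidence):
    """Smallest n with Z * sigma / sqrt(n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / margin_of_error) ** 2)

n_required = sample_size_for_mean(sigma=20, margin_of_error=5, confidence=0.99)
print(f"Required sample size: {n_required}")
```

The unrounded Z (about 2.576) gives a slightly smaller value before rounding than the table's 2.58, but the required sample size is the same after rounding up.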

Characteristics of Student’s t-Distribution
• The t-distribution has the following properties:
  • It is continuous, bell-shaped, and symmetrical about zero like the z-distribution.
  • There is a family of t-distributions sharing a mean of zero but having different standard deviations based on degrees of freedom.
  • The t-distribution is more spread out and flatter at the center than the z-distribution, but approaches the z-distribution as the sample size gets larger.

[Figure: z-distribution and t-distribution curves compared.]
The degrees of freedom for the t-distribution are df = n - 1.
Confidence Interval for μ (small sample, σ unknown)
The formula uses the t-distribution. A (1-α)100% confidence interval uses the formula shown below:

$\bar{X} \pm t_{\alpha/2} \left(\frac{s}{\sqrt{n}}\right), \qquad df = n - 1$

Example – Confidence Interval
• In a random sample of 13 American adults, the mean waste recycled per person per day was 5.3 pounds and the standard deviation was 2.0 pounds.
• Assume the variable is normally distributed and construct a 95% confidence interval for μ.

Example – Confidence Interval
α/2 = .025
df = 13-1 = 12
t = 2.18

$5.3 \pm 2.18 \frac{2.0}{\sqrt{13}}$

5.3 ± 1.2 = (4.1, 6.5)

Confidence Intervals, Population Proportions
• Point estimate for proportion of successes in population is: $\hat{p} = \frac{X}{n}$
  • X is the number of successes in a sample of size n.
• Standard deviation of p̂ is $\sqrt{\frac{p(1-p)}{n}}$
• Confidence Interval for p (with p under the square root estimated by p̂ in practice): $\hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\frac{p(1-p)}{n}}$
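For the recycling example (n = 13, sample mean 5.3, s = 2.0), the t-based interval can be sketched as below. Python's standard library has no t-distribution quantile function, so this sketch assumes the critical value is read from a t table and passed in as `t_critical`; that parameter and the function name are illustrative.

```python
from math import sqrt

def t_confidence_interval(sample_mean, s, n, t_critical):
    """CI for the mean when sigma is unknown; t_critical comes from a t table with df = n - 1."""
    margin = t_critical * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# t = 2.18 is the table value used in the example (alpha/2 = .025, df = 12).
low, high = t_confidence_interval(5.3, 2.0, 13, t_critical=2.18)
print(f"95% CI: ({low:.1f}, {high:.1f})")
```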

Population Proportion Example
• In a May 2006 AP/Ipsos Poll, 1000 adults were asked if “Over the next six months, do you expect that increases in the price of gasoline will cause financial hardship for you or your family, or not?”
• 700 of those sampled responded yes!
• Find the sample proportion and margin of error for this poll. (This means find a 95% confidence interval.)

Population Proportion Example
• Sample proportion

$\hat{p} = \frac{700}{1000} = .70 = 70\%$

• Margin of Error

$MOE = 1.96 \sqrt{\frac{.70(1 - .70)}{1000}} = .028 = 2.8\%$
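The poll calculation above can be sketched in a few lines. This assumes the standard-library `statistics.NormalDist` for the Z value; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_margin_of_error(successes, n, confidence):
    """Return (sample proportion, margin of error) for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat, z * sqrt(p_hat * (1 - p_hat) / n)

p_hat, moe = proportion_margin_of_error(700, 1000, 0.95)
print(f"Sample proportion: {p_hat:.2f}")
print(f"Margin of error:   {moe:.3f}")
print(f"95% CI: {p_hat - moe:.3f} to {p_hat + moe:.3f}")
```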

Sample Size for the Proportion
• A convenient computational formula for determining n is:

$n = p(1-p)\left(\frac{Z}{E}\right)^2$

• where E is the allowable margin of error, Z is the z-score associated with the degree of confidence selected, and p is the population proportion.
• If p is completely unknown, p can be set equal to ½, which maximizes the value of (p)(1-p) and guarantees the confidence interval will fall within the margin of error.

Example
• In polling, determine the minimum sample size needed to have a margin of error of 3% when p is unknown.

$n = (.5)(1 - .5)\left(\frac{1.96}{.03}\right)^2 = 1067.1$, rounded up to n = 1068

Example
• In polling, determine the minimum sample size needed to have a margin of error of 3% when p is known to be close to 1/4.

$n = (.25)(1 - .25)\left(\frac{1.96}{.03}\right)^2 = 800.3$, rounded up to n = 801
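Both polling examples use the same formula with different planning values for p. A minimal sketch, assuming the standard-library `statistics.NormalDist`; the function name and the default `p=0.5` (the conservative choice described on the slide) are illustrative.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence, p=0.5):
    """Smallest n giving the requested margin of error; p=0.5 is the worst case."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p * (1 - p) * (z / margin_of_error) ** 2)

n_unknown = sample_size_for_proportion(0.03, 0.95)          # p unknown
n_quarter = sample_size_for_proportion(0.03, 0.95, p=0.25)  # p close to 1/4

print(f"p unknown:   n = {n_unknown}")
print(f"p near 0.25: n = {n_quarter}")
```

Prior knowledge that p is far from ½ reduces the required sample size, here by about a quarter.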

Characteristics of the Chi-Square Distribution
• The major characteristics of the chi-square distribution are:
  • It is positively skewed
  • It is non-negative
  • It is based on degrees of freedom
  • When the degrees of freedom change, a new distribution is created

CHI-SQUARE DISTRIBUTION
[Figure: chi-square density curves for df = 3, df = 5, and df = 10. Horizontal axis: χ².]

Inference about Population Variance and Standard Deviation
• s² is an unbiased point estimator for σ²
• s is a point estimator for σ
• Interval estimates and hypothesis testing for both σ² and σ require a new distribution – the χ² (Chi-square)

Distribution of s²
• $\frac{(n-1)s^2}{\sigma^2}$ has a chi-square distribution
• n-1 is degrees of freedom
• s² is sample variance
• σ² is population variance
Confidence interval for σ²
• The confidence interval is NOT symmetric since the chi-square distribution is not symmetric
• We can construct a (1-α)100% confidence interval for σ²:

$\left(\frac{(n-1)s^2}{\chi^2_{\alpha/2}}, \; \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}\right)$

• Take square root of both endpoints to get confidence interval for σ, the population standard deviation.

Example
• In performance measurement of investments, standard deviation is a measure of volatility or risk.
• Twenty monthly returns from a mutual fund show an average monthly return of 1% and a sample standard deviation of 5%
• Find a 95% confidence interval for the monthly standard deviation of the mutual fund.
Example (cont)
• df = n-1 = 19
• 95% CI for σ (square roots of the endpoints of the interval for σ²):

$\left(\sqrt{\frac{(19)5^2}{32.8523}}, \; \sqrt{\frac{(19)5^2}{8.90655}}\right) = (3.8, 7.3)$
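The mutual fund interval can be checked with the sketch below. Python's standard library has no chi-square quantile function, so the two critical values (32.8523 and 8.90655 for df = 19) are taken from the example and passed in as arguments; the function and parameter names are hypothetical.

```python
from math import sqrt

def sigma_confidence_interval(s, n, chi2_upper, chi2_lower):
    """CI for sigma: square roots of (n-1)s^2 / chi2_upper and (n-1)s^2 / chi2_lower."""
    df = n - 1
    low_var = df * s**2 / chi2_upper   # lower endpoint for the variance
    high_var = df * s**2 / chi2_lower  # upper endpoint for the variance
    return sqrt(low_var), sqrt(high_var)

low, high = sigma_confidence_interval(5, 20, chi2_upper=32.8523, chi2_lower=8.90655)
print(f"95% CI for sigma: ({low:.1f}, {high:.1f})")
```

Note that the interval is not centered on s = 5: the upper endpoint is farther from 5 than the lower one, reflecting the skew of the chi-square distribution.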



Math 10 Part 6 – Hypothesis Testing

Math 10
Part 6
Hypothesis Testing
© Maurice Geraghty, 2010

Procedures of Hypotheses Testing
[Figure: diagram of the hypothesis testing procedures.]

Hypotheses Testing – Procedure 1

General Research Question
• Decide on a topic or phenomenon that you want to research.
• Formulate general research questions based on the topic.
• Example:
  • Topic: Health Care Reform
  • Some General Questions:
    • Would a Single Payer Plan be less expensive than Private Insurance?
    • Do HMOs provide the same quality care as PPOs?
    • Would the public support mandated health coverage?

Hypotheses Testing – Procedure 2

Hypothesis Testing Design
• State Your Hypotheses
  • Null Hypothesis
  • Alternative Hypothesis
• Determine Appropriate Model
  • Test Statistic
  • One or Two Tailed
• Determine Decision Criteria
  • α – Significance Level
  • β and Power Analysis

© Maurice Geraghty, 2011 1


What is a Hypothesis?
• Hypothesis: A statement about the value of a population parameter developed for the purpose of testing.
• Examples of hypotheses made about a population parameter are:
  • The mean monthly income for programmers is $9,000.
  • At least twenty percent of all juvenile offenders are caught and sentenced to prison.
  • The standard deviation for an investment portfolio is no more than 10 percent per month.

What is Hypothesis Testing?
• Hypothesis testing: A procedure, based on sample evidence and probability theory, used to determine whether the hypothesis is a reasonable statement and should not be rejected, or is unreasonable and should be rejected.

Definitions
• Null Hypothesis H0: A statement about the value of a population parameter that is assumed to be true for the purpose of testing.
• Alternative Hypothesis Ha: A statement about the value of a population parameter that is assumed to be true if the Null Hypothesis is rejected during testing.

Definitions
• Statistical Model: A mathematical model that describes the behavior of the data being tested.
  • Normal Family = the Standard Normal Distribution (Z) and functions of independent Standard Normal Distributions (e.g., t, χ², F).
  • Most Statistical Models will be from the Normal Family due to the Central Limit Theorem.
• Model Assumptions: Criteria which must be satisfied to appropriately use a chosen Statistical Model.
• Test statistic: A value, determined from sample information, used to determine whether or not to reject the null hypothesis.

Definitions
• Level of Significance: The probability of rejecting the null hypothesis when it is actually true (signified by α).
• Type I Error: Rejecting the null hypothesis when it is actually true.
• Type II Error: Failing to reject the null hypothesis when it is actually false.
Outcomes of Hypothesis Testing

                 Fail to Reject Ho    Reject Ho
Ho is true       Correct Decision     Type I error
Ho is False      Type II error        Correct Decision
Definitions
• Critical value(s): The dividing point(s) between the region where the null hypothesis is rejected and the region where it is not rejected. The critical value determines the decision rule.
• Rejection Region: Region(s) of the Statistical Model which contain the values of the Test Statistic where the Null Hypothesis will be rejected. The area of the Rejection Region = α

One-Tailed Tests of Significance
• A test is one-tailed when the alternate hypothesis, Ha, states a direction, such as:
  • H0: The mean income of females is less than or equal to the mean income of males.
  • Ha: The mean income of females is greater than the mean income of males.
• Equality is part of H0
• Ha determines which tail to test
  • Ha: μ>μ0 means test upper tail.
  • Ha: μ<μ0 means test lower tail.

One-tailed test
$H_0: \mu \le \mu_0$
$H_a: \mu > \mu_0$
$\alpha = .05$
$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

Two-Tailed Tests of Significance
• A test is two-tailed when no direction is specified in the alternate hypothesis Ha, such as:
  • H0: The mean income of females is equal to the mean income of males.
  • Ha: The mean income of females is not equal to the mean income of the males.
• Equality is part of H0
• Ha determines which tail to test
  • Ha: μ≠μ0 means test both tails.
Two-tailed test
$H_0: \mu = \mu_0$
$H_a: \mu \ne \mu_0$
$\alpha = .05, \quad \alpha/2 = .025$
$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

Hypotheses Testing – Procedure 3
Collect and Analyze Experimental Data
• Collect and Verify Data
  • Conduct Experiment
  • Check for Outliers
• Determine Test Statistic and/or p-value
  • Compare to Critical Value
  • Compare to α
• Make a Decision about Ho
  • Reject Ho and support Ha
  • Fail to Reject Ho

Outliers
• An outlier is a data point that is far removed from the other entries in the data set.
• Outliers could be
  • Mistakes made in recording data
  • Data that don’t belong in population
  • True rare events

Outliers have a dramatic effect on some statistics
• Example quarterly home sales for 10 realtors:
  2 2 3 4 5 5 6 6 7 50

            with outlier    without outlier
Mean        9.00            4.44
Median      5.00            5.00
Std Dev     14.51           1.81
IQR         3.00            3.50

Using Box Plot to find outliers
• The “box” is the region between the 1st and 3rd quartiles.
• Possible outliers are more than 1.5 IQR’s from the box (inner fence)
• Probable outliers are more than 3 IQR’s from the box (outer fence)
• In the box plot below, the dotted lines represent the “fences” that are 1.5 and 3 IQR’s from the box. See how the data point 50 is well outside the outer fence and therefore an almost certain outlier.
[Figure: box plot of the home sales data with inner and outer fences.]

Using Z-score to detect outliers
• Calculate the mean and standard deviation without the suspected outlier.
• Calculate the Z-score of the suspected outlier.
• If the Z-score is more than 3 or less than -3, that data point is a probable outlier.

$Z = \frac{50 - 4.4}{1.81} = 25.2$
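The home sales table and the Z-score check can be reproduced with Python's standard-library `statistics` module. This is a sketch for illustration; the IQR row is left out because quartile conventions differ between software packages, and the variable names are not from the course materials.

```python
from statistics import mean, median, stdev

sales = [2, 2, 3, 4, 5, 5, 6, 6, 7, 50]
suspect = 50
rest = [x for x in sales if x != suspect]  # data without the suspected outlier

print(f"{'':8s}{'with outlier':>14s}{'without outlier':>18s}")
print(f"{'Mean':8s}{mean(sales):14.2f}{mean(rest):18.2f}")
print(f"{'Median':8s}{median(sales):14.2f}{median(rest):18.2f}")
print(f"{'Std Dev':8s}{stdev(sales):14.2f}{stdev(rest):18.2f}")

# Z-score of the suspect, using the mean and standard deviation without it.
z_score = (suspect - mean(rest)) / stdev(rest)
print(f"Z-score of {suspect}: {z_score:.1f}")
```

The median barely moves when the outlier is removed, while the mean and standard deviation change substantially, which is the point of the comparison table.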
Outliers – what to do
• Remove or not remove, there is no clear answer.
• For some populations, outliers don’t dramatically change the overall statistical analysis. Example: the tallest person in the world will not dramatically change the mean height of 10000 people.
• However, for some populations, a single outlier will have a dramatic effect on statistical analysis (called “Black Swan” by Nassim Nicholas Taleb) and inferential statistics may be invalid in analyzing these populations. Example: the richest person in the world will dramatically change the mean wealth of 10000 people.

The logic of Hypothesis Testing
• This is a “Proof” by contradiction.
• We assume Ho is true before observing data and design Ha to be the complement of Ho.
• Observe the data (evidence). How unusual are these data under Ho?
• If the data are too unusual, we have “proven” Ho is false: Reject Ho and go with Ha (Strong Statement)
• If the data are not too unusual, we fail to reject Ho. This “proves” nothing and we say data are inconclusive. (Weak Statement)
• We can never “prove” Ho, only “disprove” it.
• “Prove” in statistics means support with (1-α)100% certainty. (Example: if α=.05, then we are 95% certain.)

Test Statistic
• Test Statistic: A value calculated from the Data under the appropriate Statistical Model that can be compared to the Critical Value of the Hypothesis test.
• If the Test Statistic falls in the Rejection Region, Ho is rejected.
• The Test Statistic will also be used to calculate the p-value as will be defined next.

Example - Testing for the Population Mean: Large Sample, Population Standard Deviation Known
• When testing for the population mean from a large sample and the population standard deviation is known, the test statistic is given by:

$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

p-Value in Hypothesis Testing
• p-Value: the probability, assuming that the null hypothesis is true, of getting a value of the test statistic at least as extreme as the computed value for the test.
• If the p-value is smaller than the significance level, H0 is rejected.
• If the p-value is larger than the significance level, H0 is not rejected.

Comparing p-value to α
• Both p-value and α are probabilities.
• The p-value is determined by the data, and is the probability of getting results as extreme as the data assuming H0 is true. Small values make one more likely to reject H0.
• α is determined by design, and is the maximum probability the experimenter is willing to accept of rejecting a true H0.
• Reject H0 if p-value < α for ALL MODELS.

Graphic where decision is to Reject Ho
• Ho: μ = 10
  Ha: μ > 10
• Design: Critical Value is determined by significance level α.
• Data Analysis: p-value is determined by Test Statistic
• Test Statistic falls in Rejection Region.
• p-value (blue) < α (purple)
• Reject Ho.
• Strong statement: Data supports Alternative Hypothesis.
[Figure: sampling distribution with the p-value area in blue and the α area in purple.]

Graphic where decision is Fail to Reject Ho
• Ho: μ = 10
  Ha: μ > 10
• Design: Critical Value is determined by significance level α.
• Data Analysis: p-value is determined by Test Statistic
• Test Statistic falls in Non-rejection Region.
• p-value (blue) > α (purple)
• Fail to Reject Ho.
• Weak statement: Data is inconclusive and does not support Alternative Hypothesis.
[Figure: sampling distribution with the p-value area in blue and the α area in purple.]

Hypotheses Testing – Procedure 4

Conclusions
• Conclusions need to
  • Be consistent with the results of the Hypothesis Test.
  • Use language that is clearly understood in the context of the problem.
  • Limit the inference to the population that was sampled.
  • Report sampling methods that could question the integrity of the random sample assumption.
• Conclusions should address the potential or necessity of further research, sending the process back to the first procedure.

Conclusions need to be consistent with the results of the Hypothesis Test.
• Rejecting Ho requires a strong statement in support of Ha.
• Failing to Reject Ho does NOT support Ho, but requires a weak statement of insufficient evidence to support Ha.
• Example:
  • The researcher wants to support the claim that, on average, students send more than 1000 text messages per month.
  • Ho: μ=1000 Ha: μ>1000
  • Conclusion if Ho is rejected: The mean number of text messages sent by students exceeds 1000.
  • Conclusion if Ho is not rejected: There is insufficient evidence to support the claim that the mean number of text messages sent by students exceeds 1000.

Conclusions need to use language that is clearly understood in the context of the problem.
• Avoid technical or statistical language.
• Refer to the language of the original general question.
• Compare these two conclusions from a test of correlation between square footage and home price.
  • Conclusion 1: By rejecting the Null Hypothesis we are inferring that the Alternative Hypothesis is supported and that there exists a significant correlation between the independent and dependent variables in the original problem comparing home prices to square footage.
  • Conclusion 2: Homes with more square footage generally have higher prices.
[Figure: scatter plot “Housing Prices and Square Footage”. Horizontal axis: Size; vertical axis: Price.]

Conclusions need to limit the inference to the population that was sampled.
• If a survey was taken of a sub-group of population, then the inference applies to the subgroup.
• Example
  • Studies by pharmaceutical companies will only test adult patients, making it difficult to determine effective dosage and side effects for children.
  • “In the absence of data, doctors use their medical judgment to decide on a particular drug and dose for children. ‘Some doctors stay away from drugs, which could deny needed treatment,’ Blumer says. ‘Generally, we take our best guess based on what's been done before.’”
  • “The antibiotic chloramphenicol was widely used in adults to treat infections resistant to penicillin. But many newborn babies died after receiving the drug because their immature livers couldn't break down the antibiotic.”
  source: FDA Consumer Magazine – Jan/Feb 2003

Conclusions need to report sampling methods that could question the integrity of the random sample assumption.
• Be aware of how the sample was obtained. Here are some examples of pitfalls:
  • Telephone polling was found to under-sample young people during the 2008 presidential campaign because of the increase in cell-phone-only households. Since young people were more likely to favor Obama, this caused bias in the polling numbers.
  • Sampling that didn’t occur over the weekend may exclude many full time workers.
  • Self-selected and unverified polls (like ratemyprofessors.com) could contain immeasurable bias.

Conclusions should address the potential or necessity of further research, sending the process back to the first procedure.
• Answers often lead to new questions.
• If changes are recommended in a researcher’s conclusion, then further research is usually needed to analyze the impact and effectiveness of the implemented changes.
• There may have been limitations in the original research project (such as funding resources, sampling techniques, unavailability of data) that warrant a more comprehensive study.
• Example: A math department modifies its curriculum based on performance statistics for an experimental course. The department would want to do further study of student outcomes to assess the effectiveness of the new program.

Procedures of Hypotheses Testing
[Figure: diagram of the hypothesis testing procedures.]

EXAMPLE – General Question
• A food company has a policy that the stated contents of a product match the actual results.
• A General Question might be “Does the stated net weight of a food product match the actual weight?”
• The quality control statistician decides to test the 16 ounce bottle of Soy Sauce.

EXAMPLE – Design Experiment
• A sample of n=36 bottles will be selected hourly and the contents weighed.
• Ho: μ=16 Ha: μ≠16
• The Statistical Model will be the one population test of mean using the Z Test Statistic.
• This model will be appropriate since the sample size ensures the sample mean will have a Normal Distribution (Central Limit Theorem)
• We will choose a significance level of α = 5%

EXAMPLE – Conduct Experiment
• Last hour a sample of 36 bottles had a mean weight of 15.88 ounces.
• From past data, assume the population standard deviation is 0.5 ounces.
• Compute the Test Statistic

$Z = \frac{15.88 - 16}{.5/\sqrt{36}} = -1.44$

• For a two tailed test, the Critical Values are at Z = ±1.96


47 48

© Maurice Geraghty, 2011 8


Math 10 Part 6 – Hypothesis Testing

9-16

Computation of the p-Value


Decision – Critical Value Method
„ This two-tailed test has
two Critical Value and „ One-Tailed Test: p-Value = P{z ≥ absolute value of
Two Rejection Regions the computed test statistic value}
„ The significance level (α)
mustt be
b divided
di id d by
b 2 so
„ Two-Tailed Test: p-Value = 2P{z ≥ absolute value of
that the sum of both
purple areas is 0.05 the computed test statistic value}
„ The Test Statistic does
not fall in the Rejection „ Example: Z= 1.44, and since it was a two-tailed test,
Regions. then p-Value = 2P {z ≥ 1.44} = 0.0749) = .1498.
„ Decision is Since .1498 > .05, do not reject H0.
Fail to Reject Ho.
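The soy sauce test can be carried out end to end in a few lines: compute the Test Statistic, compute the two-tailed p-value, and compare it to α. This is a minimal sketch assuming the standard-library `statistics.NormalDist`; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_two_tailed(sample_mean, mu0, sigma, n):
    """Return (Z test statistic, two-tailed p-value) for Ho: mu = mu0."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails beyond |Z|
    return z, p_value

z, p_value = z_test_two_tailed(15.88, mu0=16, sigma=0.5, n=36)
alpha = 0.05
decision = "Reject Ho" if p_value < alpha else "Fail to Reject Ho"

print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
print(f"Decision: {decision}")
```

The critical value method and the p-value method lead to the same decision: |Z| = 1.44 is inside ±1.96, and the p-value exceeds α.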

Decision – p-value Method
• The p-value for a two-tailed test must include all values (positive and negative) more extreme than the Test Statistic.
• p-value = .1498 which exceeds α = .05
• Decision is Fail to Reject Ho.

Example - Conclusion
• There is insufficient evidence to conclude that the machine that fills 16 ounce soy sauce bottles is operating improperly.
• This conclusion is based on 36 measurements taken during a single hour’s production run.
• We recommend continued monitoring of the machine during different employee shifts to account for the possibility of potential human error.

Statistical Power and Type II error

                 Fail to Reject Ho      Reject Ho
Ho is true       1−α                    α (Type I error)
Ho is False      β (Type II error)      1−β (Power)
53 54


Graph of “Four Outcomes”


Statistical Power (continued)
„ Power is the probability of rejecting a
false Ho, when μ = μa
„ Power depends on:
„ Effect size |μo-μa|
„ Choice of α
„ Sample size
„ Standard deviation
„ Choice of statistical test

55 56

Statistical Power Example
„ Bus brake pads are claimed to last on average at least 60,000 miles and the company wants to test this claim.
„ The bus company considers a “practical” value for purposes of bus safety to be that the pads last at least 58,000 miles.
„ If the standard deviation is 5,000 and the sample size is 50, find the Power of the test when the mean is really 58,000 miles. Assume α = .05

Statistical Power Example
„ Set up the test
„ Ho: μ >= 60,000 miles
„ Ha: μ < 60,000 miles
„ α = 5%
„ Determine the Critical Value
„ Reject Ho if $\bar{X} < 58,837$
„ Calculate β and Power
„ β = 12%
„ Power = 1 – β = 88%
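The brake pad numbers can be reproduced as follows. This is a sketch assuming a left-tailed Z test with known σ, as in the example; `left_tail_power` is an illustrative name, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist

def left_tail_power(mu0, mu_a, sigma, n, alpha):
    """Critical value, beta and power for Ho: mu >= mu0 vs Ha: mu < mu0 (sigma known)."""
    se = sigma / sqrt(n)                                  # standard error of the sample mean
    critical = mu0 + NormalDist().inv_cdf(alpha) * se     # reject Ho when xbar < critical
    power = NormalDist(mu_a, se).cdf(critical)            # P(reject Ho) when the true mean is mu_a
    return critical, 1 - power, power

critical, beta, power = left_tail_power(mu0=60_000, mu_a=58_000,
                                        sigma=5_000, n=50, alpha=0.05)
print(f"Reject Ho if xbar < {critical:,.0f}; beta = {beta:.0%}, power = {power:.0%}")
```

Because the rejection region is in the left tail, β is the area to the right of the critical value under the sampling distribution centered at 58,000.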
57 58

Statistical Power Example New Models, Similar Procedures


„ The procedures outlined for the test of population
mean vs. hypothesized value with known population
standard deviation will apply to other models as well.
„ Examples of some other one population models:
„ Test of population mean vs. hypothesized value, population
standard deviation unknown.
„ Test of population proportion vs. hypothesized value.
„ Test of population standard deviation (or variance) vs.
hypothesized value.

59 60


10-5 10-9

Testing for the Population Mean: Population Standard Deviation Unknown
„ The test statistic for the one sample case is given by:
$t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}}$
„ The degrees of freedom for the test is n−1.
„ The shape of the t distribution is similar to the Z, except the tails are fatter, so the logic of the decision rule is the same.

Decision Rules
„ Like the normal distribution, the logic for one and two tail testing is the same.
„ For a two-tail test using the t-distribution, you will reject the null hypothesis when the value of the test statistic is greater than $t_{df,\alpha/2}$ or if it is less than $-t_{df,\alpha/2}$
„ For a left-tail test using the t-distribution, you will reject the null hypothesis when the value of the test statistic is less than $-t_{df,\alpha}$
„ For a right-tail test using the t-distribution, you will reject the null hypothesis when the value of the test statistic is greater than $t_{df,\alpha}$

61 62

10-6 10-7

Example – one population test of mean, σ unknown
„ Humerus bones from the same species have approximately the same length-to-width ratios. When fossils of humerus bones are discovered, archaeologists can determine the species by examining this ratio. It is known that Species A has a mean ratio of 9.6. A similar Species B has a mean ratio of 9.1 and is often confused with Species A.
„ 21 humerus bones were unearthed in an area that was originally thought to be inhabited by Species A. (Assume all unearthed bones are from the same species.)
„ Design a hypothesis test where the alternative claim would be that the humerus bones were not from Species A.
„ Determine the power of this test if the bones actually came from Species B (assume a standard deviation of 0.7)
„ Conduct the test at a 5% significance level and state overall conclusions.

Example – Designing Test
„ Research Hypotheses
„ Ho: The humerus bones are from Species A
„ Ha: The humerus bones are not from Species A
„ In terms of the population mean
„ Ho: μ = 9.6
„ Ha: μ ≠ 9.6
„ Significance level
„ α =.05
„ Test Statistic (Model)
„ t-test of mean vs. hypothesized value.

63 64

Example - Power Analysis Example – Power Analysis


„ Information needed for Power Calculation
„ μo = 9.6 (Species A)
„ μa = 9.1 (Species B)
„ Effect Size =| μo - μa | = 0.5
„ σ = 0.7 (given)
„ α = .05
„ n = 21 (sample size)
„ Two tailed test

„ Results using online Power Calculator*


„ Power =.8755
„ β = 1 - Power = .1245
„ If humerus bones are from Species B, test has an
87.55% chance of correctly rejecting Ho and
a maximum Type II error of 12.45%
*source: Russ Lenth, University of Iowa – http://www.stat.uiowa.edu/~rlenth/Power/ 65 66


Example – Output of Data Analysis
[Figure: plot of the sample ratios; horizontal axis from 6 to 12]
P-value = .0308
α =.05
Since p-value < α, Ho is rejected and we support Ha.

Example - Conclusions
„ Results:
„ The evidence supports the claim (p-value < .05) that the humerus bones are not from Species A.
„ Sampling Methodology:
„ We are assuming since the bones were unearthed in the same location, they came from the same species.
„ Limitations:
„ A small sample size limited the power of the test, which prevented us from making a more definitive conclusion.
„ Further Research
„ Test if the bones are from Species B or another unknown species.
„ Test to see if bones are the same age to support the sampling methodology.

67 68

9-24 9-25

Tests Concerning Proportion
„ Proportion: A fraction or percentage that indicates the part of the population or sample having a particular trait of interest.
„ The population proportion is denoted by $p$.
„ The sample proportion is denoted by $\hat{p}$ where
$\hat{p} = \dfrac{\text{number of successes in the sample}}{\text{number sampled}}$

Test Statistic for Testing a Single Population Proportion
„ If sample size is sufficiently large, $\hat{p}$ has an approximately normal distribution. This approximation is reasonable if $np(1-p) > 5$
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$
$p$ = population proportion
$\hat{p}$ = sample proportion
69 70

9-26 10-7

Example
„ In the past, 15% of the mail order solicitations for a certain charity resulted in a financial contribution.
„ A new solicitation letter has been drafted and will be sent to a random sample of potential donors.
„ A hypothesis test will be run to determine if the new letter is more effective.
„ Determine the sample size so that:
„ The test can be run at the 5% significance level.
„ If the letter has an 18% success rate (an effect size of 3%), the power of the test will be 95%
„ After determining the sample size, conduct the test.

Example – Designing Test
„ Research Hypotheses
„ Ho: The new letter is not more effective.
„ Ha: The new letter is more effective.
„ In terms of the population proportion
„ Ho: p = 0.15
„ Ha: p > 0.15
„ Significance level
„ α =.05
„ Test Statistic (Model)
„ Z-test of proportion vs. hypothesized value.

71 72


Example - Power Analysis


Example – Power Analysis
„ Information needed for Sample Size Calculation
„ po = 0.15 (current letter)
„ pa = 0.18 (potential new letter)
„ Effect Size =| po - pa | = 0.03
„ Desired Power = 0.95
„ α = .05
„ One tailed test

„ Results using online Power Calculator*


„ Sample size = 1652
„ The charity should send out 1652 new solicitation
letters to potential donors and run the test.
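
The slides obtain the sample size from an online calculator. As a cross-check, here is a sketch that assumes the usual normal-approximation sample size formula for a one-tailed test of a proportion; the formula choice and the name `sample_size_one_tailed` are assumptions, not taken from the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_one_tailed(p0, pa, alpha, power):
    """Smallest n giving the desired power for Ho: p = p0 vs a one-sided alternative at pa."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # critical z for the significance level
    z_beta = NormalDist().inv_cdf(power)        # z matching the desired power
    root_n = (z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(pa * (1 - pa))) / abs(pa - p0)
    return ceil(root_n ** 2)                    # round up to a whole number of letters

n = sample_size_one_tailed(p0=0.15, pa=0.18, alpha=0.05, power=0.95)
print(n)
```

Under these assumptions the result agrees with the 1652 letters reported by the calculator.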

*source: Russ Lenth, University of Iowa – http://www.stat.uiowa.edu/~rlenth/Power/

73 74

9-27

Example – Output of Data Analysis
[Figure: chart of results; 286 Response, 1366 No Response]
„ P-value = .0042
„ α =0.05
„ Since p-value < α, Ho is rejected and we support Ha.

EXAMPLE
Critical Value Alternative Method
„ Critical Value = 1.645 (95th percentile of the Normal Distribution.)
„ H0 is rejected if Z > 1.645
„ Test Statistic:
$Z = \dfrac{\dfrac{286}{1652} - .15}{\sqrt{\dfrac{(.15)(.85)}{1652}}} = 2.63$
„ Since Z = 2.63 > 1.645, H0 is rejected. The new letter is more effective.
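
Both the critical value method and the p-value method for the solicitation letter can be verified in a few lines. This is a sketch using only the standard library; `proportion_z_test` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Right-tailed Z test of a population proportion against the hypothesized value p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses p0, as Ho is assumed true
    p_value = 1 - NormalDist().cdf(z)            # area to the right of z
    return z, p_value

z, p_value = proportion_z_test(successes=286, n=1652, p0=0.15)
critical = NormalDist().inv_cdf(0.95)            # 95th percentile, about 1.645
print(f"Z = {z:.2f}, critical value = {critical:.3f}, p-value = {p_value:.4f}")
```

Either comparison (Z against the critical value, or the p-value against α) leads to the same decision to reject Ho.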

75 76

9-24

Example - Conclusions
„ Results:
„ The evidence supports the claim (p-value < .01) that the new letter is more effective.
„ Sampling Methodology:
„ The 1652 test letters were selected as a random sample from the charity’s mailing list. All letters were sent at the same time period.
„ Limitations:
„ The letters needed to be sent in a specific time period, so we were not able to control for seasonal or economic factors.
„ Further Research
„ Test both solicitation methods over the entire year to eliminate seasonal effects.
„ Send the old letter to another random sample to create a control group.

Test for Variance or Standard Deviation vs. Hypothesized Value
„ We often want to make a claim about the variability, volatility or consistency of a population random variable.
„ Hypothesized values for population variance $\sigma^2$ or standard deviation $\sigma$ are tested with the $\chi^2$ distribution.
„ Examples of Hypotheses:
„ Ho: $\sigma = 10$   Ha: $\sigma \ne 10$
„ Ho: $\sigma^2 = 100$   Ha: $\sigma^2 > 100$
„ The sample variance $s^2$ is used in calculating the Test Statistic.

77 78


Test Statistic uses χ2 distribution
„ $s^2$ is the test statistic for the population variance. Its sampling distribution is a $\chi^2$ distribution with n−1 d.f.
$\chi^2 = \dfrac{(n-1)s^2}{\sigma_o^2}$

Example
„ A state school administrator claims that the standard deviation of test scores for 8th grade students who took a life-science assessment test is less than 30, meaning the results for the class show consistency.
„ An auditor wants to support that claim by analyzing 41 students’ recent test scores, shown here:
[Figure: plot of the 41 test scores]
„ The test will be run at 1% significance level.

79 80

10-7

Example – Designing Test
„ Research Hypotheses
„ Ho: Standard deviation for test scores equals 30.
„ Ha: Standard deviation for test scores is less than 30.
„ In terms of the population variance
„ Ho: $\sigma^2 = 900$
„ Ha: $\sigma^2 < 900$
„ Significance level
„ α =.01
„ Test Statistic (Model)
„ χ2-test of variance vs. hypothesized value.

Example – Output of Data Analysis
[Figure: histogram of the data; vertical axis: Percent]
„ p-value = .0054
„ α =0.01
„ Since p-value < α, Ho is rejected and we support Ha.

81 82

9-27

EXAMPLE
Critical Value Alternative Method
„ Critical Value = 22.164 (1st percentile of the Chi-square Distribution.)
„ H0 is rejected if $\chi^2 < 22.164$
„ Test Statistic:
$\chi^2 = \dfrac{(40)(469.426)}{900} = 20.86$
„ Since $\chi^2 = 20.86 < 22.164$, H0 is rejected. The claim that the standard deviation is under 30 is supported.

Example – Decision Graph
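
The chi-square statistic for the test score example is simple enough to compute directly. In this sketch the critical value 22.164 is taken from the slide rather than computed, since the Python standard library has no chi-square distribution; `chi_square_variance_stat` is an illustrative name.

```python
def chi_square_variance_stat(n, sample_variance, hypothesized_variance):
    """Chi-square test statistic (n - 1) * s^2 / sigma_0^2 with n - 1 degrees of freedom."""
    return (n - 1) * sample_variance / hypothesized_variance

# 41 scores, sample variance 469.426, Ho: sigma^2 = 900 (sigma = 30)
chi_sq = chi_square_variance_stat(n=41, sample_variance=469.426, hypothesized_variance=900)
critical_value = 22.164              # 1st percentile for 40 d.f., as given on the slide
reject = chi_sq < critical_value     # left-tailed test
print(f"chi-square = {chi_sq:.2f}, reject Ho: {reject}")
```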
83 84


Example - Conclusions
„ Results:
„ The evidence supports the claim (pvalue<.01) that the standard deviation
for 8th grade test scores is less than 30.
„ Sampling Methodology:
„ The 41 test scores were the results of the recently administered exam to the 8th grade students.
„ Limitations:
„ Since the exams were for the current class only, there is no assurance that
future classes will achieve similar results.
„ Further Research
„ Compare results to other schools that administered the same exam.
„ Continue to analyze future class exams to see if the claim is holding true.

85



Math 10 – Part 7 – Two Population Inference

Math 10
Part 7
Two Population Inference
© Maurice Geraghty 2014

Comparing two population means
„ Four models
  „ Independent Sampling
    „ Large Sample or known variances
      „ Z - test
    „ The 2 population variances are equal
      „ Pooled variance t-test
    „ The 2 population variances are unequal
      „ t-test for unequal variances
  „ Dependent Sampling
    „ Matched Pairs t-test

1 2

Independent Sampling Dependent sampling

3 4

Difference of Two Population means
„ $\bar{X}_1 - \bar{X}_2$ is a Random Variable
„ $\bar{X}_1 - \bar{X}_2$ is a point estimator for $\mu_1 - \mu_2$
„ The standard deviation is given by the formula
$\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
„ If n1 and n2 are sufficiently large, $\bar{X}_1 - \bar{X}_2$ follows a normal distribution.

Difference between two means – large sample Z test
„ If both n1 and n2 are over 30 and the two populations are independently selected, this test can be run.
„ Test Statistic:
$Z = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

5 6


10-13

Example 1
„ Are larger houses more likely to have pools?
„ The housing data square footage (size) was split into two groups by pool (Y/N).
„ Test the hypothesis that the homes with pools have more square feet than the homes without pools. Let α = .01

EXAMPLE 1 - Design
$H_o: \mu_1 \le \mu_2 \qquad H_a: \mu_1 > \mu_2$
$H_o: \mu_1 - \mu_2 \le 0 \qquad H_a: \mu_1 - \mu_2 > 0$
α=.01
$Z = (\bar{X}_1 - \bar{X}_2) / \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
H0 is rejected if Z>2.326

7 8

EXAMPLE 1 Data
„ Population 1: Size with pool
„ Sample size = 130
„ Sample mean = 26.25
„ Standard Dev = 6.93
„ Population 2: Size without pool
„ Sample size = 95
„ Sample mean = 23.04
„ Standard Dev = 4.55

EXAMPLE 1 DATA
$Z = \dfrac{(26.25 - 23.04) - 0}{\sqrt{\dfrac{6.93^2}{130} + \dfrac{4.55^2}{95}}} = 4.19$
„ Decision: Reject Ho
„ Conclusion: Homes with pools have more mean square footage.
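
The pool example can be checked from the summary statistics alone. This sketch uses the sample standard deviations in place of σ1 and σ2, as the slide does; `two_sample_z` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Large-sample Z statistic for the difference of two independent means."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error

z = two_sample_z(mean1=26.25, sd1=6.93, n1=130, mean2=23.04, sd2=4.55, n2=95)
p_value = 1 - NormalDist().cdf(z)           # right-tailed test (Ha: mu1 > mu2)
critical = NormalDist().inv_cdf(0.99)       # about 2.326 for alpha = .01
print(f"Z = {z:.2f}, critical value = {critical:.3f}, reject Ho: {z > critical}")
```

The p-value from summary statistics is on the order of 0.00001; it may differ slightly from software output computed on the raw data.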

9 10

EXAMPLE 1 p-value method
„ Using Technology
Reject Ho if the p-value < α

EXAMPLE 1 – Results/Decision
                               Sq ft with pool   Sq ft no pool
Mean                           26.25             23.04
Std Dev                        6.93              4.55
Observations                   130               95
Hypothesized Mean Difference   0
Z                              4.19
p-value                        0.0000137

11 12


10-10 10-11

Pooled variance t-test
To conduct this test, three assumptions are required:
„ The populations must be normally or approximately normally distributed (or central limit theorem must apply).
„ The sampling of populations must be independent.
„ The population variances must be equal.

Pooled Sample Variance and Test Statistic
„ Pooled Sample Variance:
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
„ Test Statistic:
$t = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
$df = n_1 + n_2 - 2$

13 14

10-12 10-13

EXAMPLE 2
„ A recent EPA study compared the highway fuel economy of domestic and imported passenger cars.
„ A sample of 12 imported cars revealed a mean of 35.76 mpg with a standard deviation of 3.86.
„ A sample of 15 domestic cars revealed a mean of 33.59 mpg with a standard deviation of 2.16 mpg.
„ At the .05 significance level can the EPA conclude that the mpg is higher on the imported cars? (Let subscript 2 be associated with domestic cars.)

EXAMPLE 2
„ $H_o: \mu_1 \le \mu_2 \qquad H_a: \mu_1 > \mu_2$
„ α=.05
„ $t = (\bar{X}_1 - \bar{X}_2) / (s_p\sqrt{1/n_1 + 1/n_2})$
„ H0 is rejected if t>1.708, df=25
„ t=1.85 H0 is rejected. Imports have a higher mean mpg than domestic cars.
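
The pooled variance calculation for the fuel economy example, sketched from the summary statistics. The critical value 1.708 is copied from the slide; `pooled_t` is an illustrative name.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic and degrees of freedom (assumes equal population variances)."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    t = (mean1 - mean2) / (sqrt(pooled_variance) * sqrt(1 / n1 + 1 / n2))
    return t, df

# Subscript 1 = imported cars, subscript 2 = domestic cars
t, df = pooled_t(mean1=35.76, sd1=3.86, n1=12, mean2=33.59, sd2=2.16, n2=15)
print(f"t = {t:.2f}, df = {df}, reject Ho: {t > 1.708}")
```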

15 16

10-13

t-test when variances are not equal.
„ Test statistic:
$t' = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
„ Degrees of freedom:
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$
„ This test (also known as the Welch-Aspin Test) has less power than the prior test and should only be used when it is clear the population variances are different.

EXAMPLE 2
„ $H_o: \mu_1 \le \mu_2 \qquad H_a: \mu_1 > \mu_2$
„ α=.05
„ t’ test
„ H0 is rejected if t>1.746, df=16
„ t’=1.74 H0 is not rejected. There is insufficient sample evidence to claim a higher mpg on the imported cars.
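
The same fuel economy data run through the unequal variance formulas gives a slightly smaller statistic and fewer degrees of freedom. This sketch rounds the degrees of freedom down, which is a common convention and matches df=16 on the slide; `welch_t` is an illustrative name.

```python
from math import floor, sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variance t statistic with approximate degrees of freedom (rounded down)."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2            # variance of each sample mean
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, floor(df)

t_prime, df = welch_t(mean1=35.76, sd1=3.86, n1=12, mean2=33.59, sd2=2.16, n2=15)
print(f"t' = {t_prime:.2f}, df = {df}, reject Ho: {t_prime > 1.746}")
```

With the critical value 1.746 from the slide, the statistic falls just short of the rejection region, which illustrates the lower power of this test.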

17 18


Megastat Result – Equal Variances


Using Technology
„ Decision Rule: Reject Ho if pvalue<α
„ Megastat: Compare Two Independent Groups
„ Use Equal Variance or Unequal Variance Test
„ Use Original Data or Summarized Data

domestic 29.8 33.3 34.7 37.4 34.4 32.7 30.2 36.2 35.5 34.6 33.2 35.1 33.6 31.3 31.9

import 39.0 35.1 39.1 32.2 35.6 35.5 40.8 34.7 33.2 29.4 42.3 32.2

19 20

10-14

Megastat Result – Unequal Variances Hypothesis Testing - Paired Observations

„ Independent samples are samples that are not


related in any way.
„ Dependent samples are samples that are paired or
related in some fashion.
„ For example, if you wished to buy a car you

would look at the same car at two (or more)


different dealerships and compare the prices.
„ Use the following test when the samples are
dependent:

21 22

10-15 10-16

Hypothesis Testing Involving Paired Observations
$t = \dfrac{\bar{X}_d - \mu_d}{s_d/\sqrt{n}}$
„ where $\bar{X}_d$ is the average of the differences
„ $s_d$ is the standard deviation of the differences
„ n is the number of pairs (differences)

EXAMPLE 3
„ An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz and Avis.
„ A random sample of 15 cities is obtained and the following rental information obtained.
„ At the .05 significance level can the testing agency conclude that there is a difference in the rental charged?
23 24


Example 3 – continued
•Data for Hertz
$\bar{X}_1 = 46.67$
$s_1 = 5.23$
•Data for Avis
$\bar{X}_2 = 44.87$
$s_2 = 5.62$

Example 3 - continued
By taking the difference of each pair, variability (measured by standard deviation) is reduced.
$\bar{X}_d = 1.80$
$s_d = 2.513$
$n = 15$

25 26

10-18

EXAMPLE 3 continued
Megastat Output – Example 3
„ H 0 : μd = 0 H1: μd ≠ 0
„ α=.05
„ Matched pairs t test, df=14
„ H0 is rejected if t<-2.145 or t>2.145
„ $t = 1.80 / (2.513/\sqrt{15}) = 2.77$
„ Reject H0.
„ There is a difference in mean price for
compact cars between Hertz and Avis.
Avis has lower mean prices.
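
The matched pairs statistic for the rental car example can be checked from the summary of the differences. A sketch only; the critical value 2.145 is taken from the slide and `paired_t` is an illustrative name.

```python
from math import sqrt

def paired_t(mean_diff, sd_diff, n_pairs, hypothesized_diff=0.0):
    """Matched-pairs t statistic; degrees of freedom are n_pairs - 1."""
    return (mean_diff - hypothesized_diff) / (sd_diff / sqrt(n_pairs))

t = paired_t(mean_diff=1.80, sd_diff=2.513, n_pairs=15)
reject = abs(t) > 2.145          # two-tailed test, df = 14, alpha = .05
print(f"t = {t:.2f}, reject Ho: {reject}")
```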

27 28

11-3 11-4

Characteristics of F-Distribution
„ There is a “family” of F Distributions.
„ Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.
„ F cannot be negative, and it is a continuous distribution.
„ The F distribution is positively skewed.
„ Its values range from 0 to ∞. As F → ∞ the curve approaches the X-axis.

Test for Equal Variances
„ For the two tail test, the test statistic is given by:
$F = \dfrac{s_i^2}{s_j^2}$
„ $s_i^2$ and $s_j^2$ are the sample variances for the two populations.
„ There are 2 sets of degrees of freedom: $n_i - 1$ for the numerator, $n_j - 1$ for the denominator

29 30


11-6

EXAMPLE 4
„ A stockbroker at a brokerage firm reported that the mean rate of return on a sample of 10 software stocks was 12.6 percent with a standard deviation of 4.9 percent.
„ The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a standard deviation of 3.5 percent.
„ At the .05 significance level, can the broker conclude that there is more variation in the software stocks?

Test Statistic depends on Hypotheses
Hypotheses                                              Test Statistic
$H_o: \sigma_1 \ge \sigma_2$, $H_a: \sigma_1 < \sigma_2$     $F = s_2^2 / s_1^2$, use α table
$H_o: \sigma_1 \le \sigma_2$, $H_a: \sigma_1 > \sigma_2$     $F = s_1^2 / s_2^2$, use α table
$H_o: \sigma_1 = \sigma_2$, $H_a: \sigma_1 \ne \sigma_2$     $F = \max(s_1^2, s_2^2) / \min(s_1^2, s_2^2)$, use α/2 table
31 32

11-7

EXAMPLE 4 continued
„ $H_o: \sigma_1 \le \sigma_2 \qquad H_a: \sigma_1 > \sigma_2$
„ α =.05
„ F-test
„ H0 is rejected if F>3.68, df=(9,7)
„ $F = 4.9^2/3.5^2 = 1.96$ → Fail to Reject H0.
„ There is insufficient evidence to claim more variation in the software stock.

Excel Example
„ Using Megastat – Test for equal variances under two population independent samples test and click the box to test for equality of variances
„ The default p-value is a two-tailed test, so take one-half reported p-value for one-tailed tests
„ Example – Domestic vs Import Data
„ $H_o: \sigma_1 = \sigma_2 \qquad H_a: \sigma_1 \ne \sigma_2$
„ α =.10
„ Reject Ho means use unequal variance t-test
„ FTR Ho means use pooled variance t-test
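
For the software vs. utility stocks in Example 4, the F statistic is just a ratio of sample variances. This sketch takes the critical value 3.68 from the slide; `f_ratio` is an illustrative name.

```python
def f_ratio(sd_numerator, sd_denominator):
    """F statistic as the ratio of two sample variances (numerator sample listed first)."""
    return sd_numerator**2 / sd_denominator**2

# Ha: sigma1 > sigma2, so the software stocks (sample 1) go in the numerator
f_stat = f_ratio(sd_numerator=4.9, sd_denominator=3.5)
df = (10 - 1, 8 - 1)                 # numerator and denominator degrees of freedom
reject = f_stat > 3.68               # critical value for alpha = .05, df = (9, 7)
print(f"F = {f_stat:.2f}, df = {df}, reject Ho: {reject}")
```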

33 34

Excel Output Compare Two Means Flowchart

pvalue <.10, Reject Ho

Use unequal variance t-test


to compare means.

35 36



Math 10 – Part 8 Slides

14-2

Math 10 M Geraghty
Part 8
Chi-square and ANOVA tests
© Maurice Geraghty 2015

Characteristics of the Chi-Square Distribution
„ The major characteristics of the chi-square distribution are:
„ It is positively skewed
„ It is non-negative
„ It is based on degrees of freedom
„ When the degrees of freedom change a new distribution is created

1 2

2-2 14-4
CHI-SQUARE DISTRIBUTION
[Figure: chi-square density curves for df = 3, df = 5 and df = 10; horizontal axis: χ2]

Goodness-of-Fit Test: Equal Expected Frequencies
„ Let $O_i$ and $E_i$ be the observed and expected frequencies respectively for each category.
„ H0: there is no difference between Observed and Expected Frequencies
„ Ha: there is a difference between Observed and Expected Frequencies
„ The test statistic is:
$\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}$
„ The critical value is a chi-square value with (k-1) degrees of freedom, where k is the number of categories
3 4

14-5 14-6

EXAMPLE 1
„ The following data on absenteeism was collected from a manufacturing plant. At the .01 level of significance, test to determine whether there is a difference in the absence rate by day of the week.

Day         Frequency
Monday      95
Tuesday     65
Wednesday   60
Thursday    80
Friday      100

EXAMPLE 1 continued
„ Assume equal expected frequency: (95+65+60+80+100)/5=80

Day     O     E     (O-E)^2/E
Mon     95    80    2.8125
Tues    65    80    2.8125
Wed     60    80    5.0000
Thur    80    80    0.0000
Fri     100   80    5.0000
Total   400   400   15.625
5 6


14-7 14-8

EXAMPLE 1 continued
„ Ho: there is no difference between the observed and the expected frequencies of absences.
„ Ha: there is a difference between the observed and the expected frequencies of absences.
„ Test statistic: chi-square = $\Sigma (O-E)^2/E = 15.625$
„ Decision Rule: reject Ho if test statistic is greater than the critical value of 13.277. (4 df, α=.01)
„ Conclusion: reject Ho and conclude that there is a difference between the observed and expected frequencies of absences.

Goodness-of-Fit Test: Unequal Expected Frequencies
EXAMPLE 2
The U.S. Bureau of the Census (2000) indicated that 54.4% of the population is married, 6.6% widowed, 9.7% divorced (and not re-married), 2.2% separated, and 27.1% single (never been married).
A sample of 500 adults from the San Jose area showed that 270 were married, 22 widowed, 42 divorced, 10 separated, and 156 single.
At the .05 significance level can we conclude that the San Jose area is different from the U.S. as a whole?

7 8

14-9 14-10

EXAMPLE 2 continued

Status      O     E       (O-E)^2/E
Married     270   272     0.015
Widowed     22    33      3.667
Divorced    42    48.5    0.871
Separated   10    11      0.091
Single      156   135.5   3.101
Total       500   500     7.745

EXAMPLE 2 continued
„ Design: Ho: p1=.544 p2=.066 p3=.097 p4=.022 p5=.271
Ha: at least one pi is different
„ α=.05
„ Model: Chi-Square Goodness of Fit, df=4
„ Ho is rejected if χ2 > 9.488
„ Data: χ2 = 7.745, Fail to Reject Ho
„ Conclusion: Insufficient evidence to conclude San Jose is different than the US Average
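
Both goodness-of-fit examples follow the same arithmetic, so one helper covers the equal and unequal expected frequency cases. This is a sketch; the function name `goodness_of_fit` is illustrative and the critical values are the ones quoted on the slides.

```python
def goodness_of_fit(observed, proportions):
    """Chi-square goodness-of-fit statistic; expected counts are total * hypothesized proportion."""
    total = sum(observed)
    expected = [total * p for p in proportions]
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_sq, expected

# Example 1: absences by weekday, equal expected frequencies (critical value 13.277)
absences, _ = goodness_of_fit([95, 65, 60, 80, 100], [0.2] * 5)

# Example 2: San Jose marital status vs. census proportions (critical value 9.488)
marital, expected = goodness_of_fit([270, 22, 42, 10, 156],
                                    [0.544, 0.066, 0.097, 0.022, 0.271])

print(f"Example 1: chi-square = {absences:.3f}, reject Ho: {absences > 13.277}")
print(f"Example 2: chi-square = {marital:.3f}, reject Ho: {marital > 9.488}")
```

The Example 2 total computed this way can differ from the slide in the last digit because the slide adds terms that were already rounded to three decimals.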

9 10

14-15 14-16

Contingency Table Analysis
„ Contingency table analysis is used to test whether two traits or variables are related.
„ Each observation is classified according to two variables.
„ The usual hypothesis testing procedure is used.
„ The degrees of freedom is equal to: (number of rows-1)(number of columns-1).
„ The expected frequency is computed as: Expected Frequency = (row total)(column total)/grand total

EXAMPLE 3
„ In May 2014, Colorado became the first state to legalize the recreational use of marijuana.
„ A poll of 1000 adults classified respondents by gender and their opinion about legalizing marijuana for recreational use.
„ At the .05 level of significance, can we conclude that gender and the opinion about legalizing marijuana for recreational use are dependent events?

11 12


14-17 14-18

EXAMPLE 3 continued EXAMPLE 3 continued

„ Design: Ho: Gender and Opinion are independent.


Ha: Gender and Opinion are dependent.
„ α=.05
„ Model: Chi-Square Test for Independence, df=2
„ Ho is rejected if χ2 > 5.99
„ Data: χ2 = 6.756, Reject Ho
„ Conclusion: Gender and opinion are dependent
variables. Men are more likely to support legalizing
marijuana for recreational use.

13 14

11-3 11-8

Characteristics of F-Distribution
„ There is a “family” of F Distributions.
„ Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.
„ F cannot be negative, and it is a continuous distribution.
„ The F distribution is positively skewed.
„ Its values range from 0 to ∞. As F → ∞ the curve approaches the X-axis.

Underlying Assumptions for ANOVA
„ The F distribution is also used for testing the equality of more than two means using a technique called analysis of variance (ANOVA). ANOVA requires the following conditions:
„ The populations being sampled are normally distributed.
„ The populations have equal standard deviations.
„ The samples are randomly selected and are independent.

15 16

11-9

Analysis of Variance Procedure
„ The Null Hypothesis: the population means are the same.
„ The Alternative Hypothesis: at least one of the means is different.
„ The Test Statistic: F=(between sample variance)/(within sample variance).
„ Decision rule: For a given significance level α, reject the null hypothesis if F (computed) is greater than F (table) with numerator and denominator degrees of freedom.

ANOVA – Null Hypothesis
[Figure: two panels. Ho is true: all means the same. Ho is false: not all means the same.]

17 18


11-10 11-11

ANOVA NOTES
„ If there are k populations being sampled, then the df (numerator) = k-1
„ If there are a total of n sample points, then df (denominator) = n-k
„ The test statistic is computed by: F=[(SSF)/(k-1)]/[(SSE)/(n-k)].
„ SSF represents the factor (between) sum of squares.
„ SSE represents the error (within) sum of squares.
„ Let $T_c$ represent the column totals, $n_c$ represent the number of observations in each column, and ΣX represent the sum of all the observations.
„ These calculations are tedious, so technology is used to generate the ANOVA table.

Formulas for ANOVA
$SS_{Total} = \Sigma X^2 - \dfrac{(\Sigma X)^2}{n}$
$SS_{Factor} = \Sigma\left(\dfrac{T_c^2}{n_c}\right) - \dfrac{(\Sigma X)^2}{n}$
$SS_{Error} = SS_{Total} - SS_{Factor}$
19 20

11-12

ANOVA Table

Source   SS         df    MS        F
Factor   SSFactor   k-1   SSF/dfF   MSF/MSE
Error    SSError    n-k   SSE/dfE
Total    SSTotal    n-1

EXAMPLE 4
„ Party Pizza specializes in meals for students. Hsieh Li, President, recently developed a new tofu pizza.
„ Before making it a part of the regular menu she decides to test it in several of her restaurants. She would like to know if there is a difference in the mean number of tofu pizzas sold per day at the Cupertino, San Jose, and Santa Clara pizzerias for a sample of five days.
„ At the .05 significance level can Hsieh Li conclude that there is a difference in the mean number of tofu pizzas sold per day at the three pizzerias?

21 22

Example 4

        Cupertino   San Jose   Santa Clara   Total
        13          10         18
        12          12         16
        14          13         17
        12          11         17
                               17
T       51          46         85            182
n       4           4          5             13
Means   12.75       11.5       17            14
ΣX^2    653         534        1447          2634

Example 4 continued
$SS_{Total} = 2634 - \dfrac{182^2}{13} = 86$
$SS_{Factor} = 2624.25 - \dfrac{182^2}{13} = 76.25$
$SS_{Error} = 86 - 76.25 = 9.75$
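
The sums of squares for the pizza example can be reproduced from the raw data using the formulas for ANOVA. This is a sketch in plain Python; the dictionary layout and the name `one_way_anova` are illustrative choices, not part of the course software.

```python
def one_way_anova(groups):
    """Return (SSTotal, SSFactor, SSError, F) for a dict of {treatment name: observations}."""
    all_values = [x for values in groups.values() for x in values]
    n, k = len(all_values), len(groups)
    correction = sum(all_values) ** 2 / n                        # (sum of X)^2 / n
    ss_total = sum(x**2 for x in all_values) - correction
    ss_factor = sum(sum(v) ** 2 / len(v) for v in groups.values()) - correction
    ss_error = ss_total - ss_factor
    f_stat = (ss_factor / (k - 1)) / (ss_error / (n - k))        # MSF / MSE
    return ss_total, ss_factor, ss_error, f_stat

pizzas = {
    "Cupertino": [13, 12, 14, 12],
    "San Jose": [10, 12, 13, 11],
    "Santa Clara": [18, 16, 17, 17, 17],
}
ss_total, ss_factor, ss_error, f_stat = one_way_anova(pizzas)
print(f"SSTotal = {ss_total}, SSFactor = {ss_factor}, SSError = {ss_error}, F = {f_stat:.2f}")
```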

23 24


11-14

Example 4 continued
ANOVA TABLE

Source   SS      df   MS       F
Factor   76.25   2    38.125   39.10
Error    9.75    10   0.975
Total    86.00   12

EXAMPLE 4 continued
„ Design: Ho: μ1=μ2=μ3
Ha: Not all the means are the same
„ α=.05
„ Model: One Factor ANOVA
„ H0 is rejected if F>4.10
„ Data: Test statistic: F=[76.25/2]/[9.75/10]=39.1026
„ H0 is rejected.
„ Conclusion: There is a difference in the mean number of pizzas sold at each pizzeria.

25 26

Post Hoc Comparison Test


„ Used for pairwise comparison
„ Designed so the overall significance level is 5%.
„ Use technology.
„ Refer to Tukey Test Material in Supplemental Material.

27 28

Post Hoc Comparison Test Post Hoc Comparison Test

29 30



ANOVA - Tukey’s HSD Test

Application: One-way ANOVA – pair-wise comparison of means.

Requirements: Model is usually balanced, which means that the sample size in each population
should be the same. The samples taken in each population are called replicates. Each population is
called a treatment. (Note: There are methods of approximating this model if the design is not
balanced, but we will not cover them.)

Tests: $H_o: \mu_i = \mu_j \qquad H_a: \mu_i \ne \mu_j$ where the subscripts i and j represent two different populations

Overall significance level of α. This means that all pairwise tests can be run at the same time with an
overall significance level of α.

Test Statistic: $HSD = q\sqrt{\dfrac{MSE}{n_c}}$

q = value from studentized range table.

MSE = Mean Square Error from ANOVA table

nc = number of replicates per treatment

Decision: Reject Ho if $|\bar{X}_i - \bar{X}_j| > HSD$

Note: Minitab will group differences into families by assigning letters. Pairs that do not share a
common letter are significantly different pairs.

Example:
Valencia oranges were tested for juiciness at 4 different orchards. Eight oranges were sampled from
each orchard, and the total ml of juice per 20 gms of orange was calculated:

Orchard A: Orchard B: Orchard C: Orchard D:


11,13,12,14, 10,9,8,10, 13,15,14,11, 9,7,11,9,
9,13,11,9 11,12,7,8 12,10,16,11 9,11,10,8
SS Total =158.469 SS Between=69.594

a. Test for a difference in Orchards using alpha = .05


Perform all the pairwise comparisons using Tukey's Test and an overall risk level of 5%.
One-way ANOVA: Orchard A, Orchard B, Orchard C, Orchard D

Source DF SS MS F P
Factor 3 69.59 23.20 7.31 0.001
Error 28 88.88 3.17
Total 31 158.47

S = 1.782 R-Sq = 43.92% R-Sq(adj) = 37.91%

Individual 95% CIs For Mean Based on Pooled StDev


Level N Mean StDev +---------+---------+---------+---------
Orchard A 8 11.500 1.852 (-------*-------)
Orchard B 8 9.375 1.685 (-------*-------)
Orchard C 8 12.750 2.121 (-------*-------)
Orchard D 8 9.250 1.389 (-------*-------)
+---------+---------+---------+---------
8.0 9.6 11.2 12.8

Pooled StDev = 1.782

Grouping Information Using Tukey Method

N Mean Grouping
Orchard C 8 12.750 A
Orchard A 8 11.500 A B
Orchard B 8 9.375 B
Orchard D 8 9.250 B

Means that do not share a letter are significantly different.

Tukey 95% Simultaneous Confidence Intervals


All Pairwise Comparisons

Individual confidence level = 98.92%

Orchard A subtracted from:

Lower Center Upper +---------+---------+---------+---------


Orchard B -4.556 -2.125 0.306 (-------*-------)
Orchard C -1.181 1.250 3.681 (-------*-------)
Orchard D -4.681 -2.250 0.181 (--------*-------)
+---------+---------+---------+---------
-6.0 -3.0 0.0 3.0

Orchard B subtracted from:

Lower Center Upper +---------+---------+---------+---------


Orchard C 0.944 3.375 5.806 (-------*-------)
Orchard D -2.556 -0.125 2.306 (--------*-------)
+---------+---------+---------+---------
-6.0 -3.0 0.0 3.0

Orchard C subtracted from:

Lower Center Upper +---------+---------+---------+---------


Orchard D -5.931 -3.500 -1.069 (-------*-------)
+---------+---------+---------+---------
-6.0 -3.0 0.0 3.0
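
The HSD decision rule can be applied by hand to the orchard data. In this sketch, q = 3.86 is an assumed table lookup for 4 treatments and 28 error degrees of freedom at α = .05 (it is consistent with the width of the Minitab intervals but does not appear in the handout); all variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

orchards = {
    "A": [11, 13, 12, 14, 9, 13, 11, 9],
    "B": [10, 9, 8, 10, 11, 12, 7, 8],
    "C": [13, 15, 14, 11, 12, 10, 16, 11],
    "D": [9, 7, 11, 9, 9, 11, 10, 8],
}
n_c = 8                                   # replicates per treatment (balanced design)
means = {name: mean(values) for name, values in orchards.items()}

# Mean Square Error: within-treatment sum of squares divided by its degrees of freedom
sse = sum((x - means[name]) ** 2 for name, values in orchards.items() for x in values)
mse = sse / (len(orchards) * n_c - len(orchards))

q = 3.86                                  # ASSUMED studentized range value (k = 4, df = 28)
hsd = q * sqrt(mse / n_c)

# A pair is significantly different when the gap between its means exceeds HSD
significant = [(i, j) for i, j in combinations(sorted(orchards), 2)
               if abs(means[i] - means[j]) > hsd]
print(f"MSE = {mse:.2f}, HSD = {hsd:.2f}, significant pairs: {significant}")
```

The pairs flagged this way match the Minitab grouping: Orchard C differs from Orchards B and D, while Orchard A is not separated from any other orchard.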
Math 10 - Part 9 Slides - Regression

Math 10
Correlation and Regression
Part 9 Slides
© Maurice Geraghty 2015

Mathematical Model
„ You have a small business producing custom t-shirts.
„ Without marketing, your business has revenue (sales) of $1000 per week.
„ Every dollar you spend marketing will increase revenue by 2 dollars.
„ Let variable X represent amount spent on marketing and let variable Y represent revenue per week.
„ Write a mathematical model that relates X to Y

1 2

Mathematical Model - Table Mathematical Model - Scatterplot


X=marketing Y=revenue

$0 $1000

$500 $2000

$1000 $3000

$1500 $4000

$2000 $5000

3 4

Mathematical Model - Linear Mathematical Linear Model


Linear Model                  Example
$Y = \beta_0 + \beta_1 X$     $Y = 1000 + 2X$
Y : Dependent Variable        Y : Revenue
X : Independent Variable      X : Marketing
β0 : Y-intercept              β0 : $1000
β1 : Slope                    β1 : $2 per $1 marketing

5 6


Statistical Model
„ You have a small business producing custom t-shirts.
„ Without marketing, your business has revenue (sales) of $1000 per week.
„ Every dollar you spend marketing will increase revenue by an expected value of 2 dollars.
„ Let variable X represent amount spent on marketing and let variable Y represent revenue per week.
„ Let ε represent the difference between Expected Revenue and Actual Revenue (Residual Error)
„ Write a statistical model that relates X to Y

Statistical Model - Table

X=Marketing   Expected Revenue   Y=Actual Revenue   ε=Residual Error
$0            $1000              $1100              +$100
$500          $2000              $1500              -$500
$1000         $3000              $3500              +$500
$1500         $4000              $3900              -$100
$2000         $5000              $4900              -$100

7 8

Statistical Model - Scatterplot

Statistical Model - Linear

12-15

Statistical Linear Model

Regression Model
Y = β0 + β1X + ε
Y : Dependent Variable
X : Independent Variable
β0 : Y-intercept
β1 : Slope
ε : Normal(0, σ)

Example
Y = 1000 + 2X + ε
Y : Revenue
X : Marketing
β0 : $1000
β1 : $2 per $1 of marketing

Regression Analysis

• Purpose: to determine the regression equation; it is used to predict the value of the dependent variable (Y) based on the independent variable (X).
• Procedure: select a sample from the population and list the paired data for each observation; draw a scatter diagram to give a visual portrayal of the relationship; determine the regression equation.


Simple Linear Regression Model

Y = β0 + β1X + ε
Y : Dependent Variable
X : Independent Variable
β0 : Y-intercept
β1 : Slope
ε : Normal(0, σ)

Estimation of Population Parameters

• From sample data, find statistics that will estimate the 3 population parameters:
  • Slope parameter: b1 will be an estimator for β1
  • Y-intercept parameter: b0 will be an estimator for β0
  • Standard deviation: se will be an estimator for σ

12-16 12-19

Regression Analysis

• The regression equation: Ŷ = b0 + b1X, where:
  • Ŷ is the average predicted value of Y for any X.
  • b0 is the Y-intercept, or the estimated Y value when X = 0.
  • b1 is the slope of the line, or the average change in Ŷ for each change of one unit in X.
• The least squares principle is used to obtain b1 and b0:

\[ SSX = \Sigma X^2 - \tfrac{1}{n}(\Sigma X)^2 \]
\[ SSY = \Sigma Y^2 - \tfrac{1}{n}(\Sigma Y)^2 \]
\[ SSXY = \Sigma XY - \tfrac{1}{n}(\Sigma X \cdot \Sigma Y) \]
\[ b_1 = \frac{SSXY}{SSX} \qquad b_0 = \bar{Y} - b_1 \bar{X} \]

Assumptions Underlying Linear Regression

• For each value of X, there is a group of Y values, and these Y values are normally distributed.
• The means of these normal distributions of Y values all lie on the straight line of regression.
• The standard deviations of these normal distributions are equal.
• The Y values are statistically independent. This means that in the selection of a sample, the Y values chosen for a particular X value do not depend on the Y values for any other X values.

Example

• X = Average Annual Rainfall (Inches)
• Y = Average Sale of Sunglasses/1000
• Make a Scatterplot
• Find the least square line

X    10   15   20   30   40
Y    40   35   25   25   15

Example continued

[Scatterplot: sales of sunglasses per 1000 (vertical axis, 0 to 60) against rainfall (horizontal axis, 0 to 50)]


Example continued

     X     Y     X²     Y²     XY
     10    40    100    1600   400
     15    35    225    1225   525
     20    25    400    625    500
     30    25    900    625    750
     40    15    1600   225    600
Σ    115   140   3225   4300   2775

Example continued

• Find the least square line
• SSX = 580
• SSY = 380
• SSXY = -445
• b1 = -.767
• b0 = 45.647
• Ŷ = 45.647 - .767X
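
The least squares arithmetic for the rainfall example can be reproduced as follows. This is a sketch under the slide formulas for SSX, SSY, SSXY, b1 and b0; the function name `least_squares` is an assumption made for illustration and is not part of the course materials.

```python
def least_squares(xs, ys):
    """Return (SSX, SSY, SSXY, b1, b0) using the sum-of-squares formulas."""
    n = len(xs)
    ssx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ssy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    ssxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    b1 = ssxy / ssx                       # slope
    b0 = sum(ys) / n - b1 * sum(xs) / n   # intercept: Y-bar minus b1 times X-bar
    return ssx, ssy, ssxy, b1, b0

rainfall = [10, 15, 20, 30, 40]   # X, inches
sales = [40, 35, 25, 25, 15]      # Y, sunglasses sold per 1000

ssx, ssy, ssxy, b1, b0 = least_squares(rainfall, sales)
print(f"SSX = {ssx:.0f}, SSY = {ssy:.0f}, SSXY = {ssxy:.0f}")
print(f"Y-hat = {b0:.3f} + ({b1:.3f})X")
```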

12-18

The Standard Error of Estimate

• The standard error of estimate measures the scatter, or dispersion, of the observed values around the line of regression.
• The formulas that are used to compute the standard error:

\[ SSR = b_1 \cdot SSXY \]
\[ SSE = \sum (Y - \hat{Y})^2 = SSY - SSR \]
\[ MSE = \frac{SSE}{n - 2} \]
\[ s_e = \sqrt{MSE} \]

Example continued

• Find SSE and the standard error:
• SSR = 341.422
• SSE = 38.578
• MSE = 12.859
• se = 3.586

X     Y     Ŷ       (Y − Ŷ)²
10    40    37.97   4.104
15    35    34.14   0.743
20    25    30.30   28.108
30    25    22.63   5.620
40    15    14.96   0.002
Total               38.578
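
The standard error calculation can be traced step by step. The sketch below assumes the sums of squares already found for the rainfall example (SSX = 580, SSY = 380, SSXY = -445, n = 5); the variable names are illustrative.

```python
import math

n = 5
ssx, ssy, ssxy = 580.0, 380.0, -445.0   # from the rainfall example

b1 = ssxy / ssx          # slope
ssr = b1 * ssxy          # regression sum of squares
sse = ssy - ssr          # error (residual) sum of squares
mse = sse / (n - 2)      # mean square error
se = math.sqrt(mse)      # standard error of estimate

print(f"SSR = {ssr:.3f}, SSE = {sse:.3f}, MSE = {mse:.3f}, se = {se:.3f}")
```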

12-3 12-4

Correlation Analysis

• Correlation Analysis: A group of statistical techniques used to measure the strength of the relationship (correlation) between two variables.
• Scatter Diagram: A chart that portrays the relationship between the two variables of interest.
• Dependent Variable: The variable that is being predicted or estimated. “Effect”
• Independent Variable: The variable that provides the basis for estimation. It is the predictor variable. “Cause?” (Maybe!)

The Coefficient of Correlation, r

• The Coefficient of Correlation (r) is a measure of the strength of the relationship between two variables.
  • It requires interval or ratio-scaled data (variables).
  • It can range from -1.00 to 1.00.
  • Values of -1.00 or 1.00 indicate perfect and strong correlation.
  • Values close to 0.0 indicate weak correlation.
  • Negative values indicate an inverse relationship and positive values indicate a direct relationship.


12-6 12-5

Perfect Positive Correlation

[Scatterplot: Y against X, both axes 0 to 10]

Perfect Negative Correlation

[Scatterplot: Y against X, both axes 0 to 10]

12-7 12-8

Zero Correlation

[Scatterplot: Y against X, both axes 0 to 10]

Strong Positive Correlation

[Scatterplot: Y against X, both axes 0 to 10]

12-8

Weak Negative Correlation

[Scatterplot: Y against X, both axes 0 to 10]

Causation

• Correlation does not necessarily imply causation.
• There are 4 possibilities if X and Y are correlated:
  1. X causes Y
  2. Y causes X
  3. X and Y are caused by something else.
  4. Confounding - The effects of X and Y are hopelessly mixed up with other variables.


12-10

Causation - Examples

• Cities with more police per capita have more crime per capita.
• As ice cream sales go up, shark attacks go up.
• People with a cold who take a cough medicine feel better after some rest.

r²: Coefficient of Determination

• r² is the proportion of the total variation in the dependent variable Y that is explained or accounted for by the variation in the independent variable X.
• The coefficient of determination is the square of the coefficient of correlation, and ranges from 0 to 1.

12-9

Formulas for r and r²

\[ r = \frac{SSXY}{\sqrt{SSX \cdot SSY}} \qquad r^2 = \frac{SSR}{SSY} \]
\[ SSX = \Sigma X^2 - \tfrac{1}{n}(\Sigma X)^2 \]
\[ SSY = \Sigma Y^2 - \tfrac{1}{n}(\Sigma Y)^2 \]
\[ SSXY = \Sigma XY - \tfrac{1}{n}(\Sigma X \cdot \Sigma Y) \]
\[ SSR = \frac{SSXY^2}{SSX} \]

Example

• X = Average Annual Rainfall (Inches)
• Y = Average Sale of Sunglasses/1000

X    10   15   20   30   40
Y    40   35   25   25   15

Example continued

• Make a Scatter Diagram
• Find r and r²

Example continued

[Scatter diagram: sales of sunglasses per 1000 (vertical axis, 0 to 60) against rainfall (horizontal axis, 0 to 50)]


Example continued

     X     Y     X²     Y²     XY
     10    40    100    1600   400
     15    35    225    1225   525
     20    25    400    625    500
     30    25    900    625    750
     40    15    1600   225    600
Σ    115   140   3225   4300   2775

• SSX = 3225 - 115²/5 = 580
• SSY = 4300 - 140²/5 = 380
• SSXY = 2775 - (115)(140)/5 = -445

Example continued

• r = -445/sqrt(580 x 380) = -.9479
• Strong negative correlation
• r² = .8985
• About 89.85% of the variability of sales is explained by rainfall.
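
The correlation calculation can be verified directly from the three sums of squares. A minimal sketch, assuming SSX = 580, SSY = 380 and SSXY = -445 from the rainfall example, with illustrative variable names:

```python
import math

ssx, ssy, ssxy = 580.0, 380.0, -445.0   # from the rainfall example

r = ssxy / math.sqrt(ssx * ssy)   # coefficient of correlation
r_squared = r ** 2                # coefficient of determination

print(f"r = {r:.4f}")
print(f"r^2 = {r_squared:.4f}")
```

The sign of r matches the sign of SSXY, so a negative SSXY always gives an inverse relationship.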

11-3

Characteristics of F-Distribution

• There is a “family” of F Distributions.
• Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.
• F cannot be negative, and it is a continuous distribution.
• The F distribution is positively skewed.
• Its values range from 0 to ∞. As F → ∞ the curve approaches the X-axis.

Hypothesis Testing in Simple Linear Regression

• The following tests are equivalent:
  • H0: X and Y are uncorrelated; Ha: X and Y are correlated
  • H0: β1 = 0; Ha: β1 ≠ 0
• Both can be tested using ANOVA

ANOVA Table for Simple Linear Regression

Source           SS    df    MS        F
Regression       SSR   1     SSR/dfR   MSR/MSE
Error/Residual   SSE   n-2   SSE/dfE
TOTAL            SSY   n-1

Example continued

• Test the Hypothesis Ho: β1 = 0, α = 5%

Source       SS        df   MS        F        p-value
Regression   341.422   1    341.422   26.551   0.0142
Error        38.578    3    12.859
TOTAL        380.000   4

• Reject Ho: p-value < α
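
The ANOVA table entries follow mechanically from the sums of squares. The sketch below rebuilds the F statistic for the rainfall example; the critical value 10.13 for F with 1 and 3 degrees of freedom at α = 5% is an assumption taken from a standard F table, since the Python standard library has no F distribution, and the p-value itself is not recomputed here.

```python
n = 5
ssx, ssy, ssxy = 580.0, 380.0, -445.0   # from the rainfall example

ssr = ssxy ** 2 / ssx        # regression sum of squares (equals b1 * SSXY)
sse = ssy - ssr              # error sum of squares
df_regression, df_error = 1, n - 2

msr = ssr / df_regression
mse = sse / df_error
f_stat = msr / mse

# Assumed table value: upper 5% point of F(1, 3).
F_CRITICAL = 10.13
reject_h0 = f_stat > F_CRITICAL

print(f"F = {f_stat:.3f}, reject Ho: {reject_h0}")
```

Comparing F to a critical value leads to the same decision as comparing the p-value to α.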


12-20 12-21

Confidence Interval

• The confidence interval for the mean value of Y for a given value of X is given by:

\[ \hat{Y} \pm t \cdot s_e \cdot \sqrt{\frac{1}{n} + \frac{(X - \bar{X})^2}{SSX}} \]

• Degrees of freedom for t = n-2

Prediction Interval

• The prediction interval for an individual value of Y for a given value of X is given by:

\[ \hat{Y} \pm t \cdot s_e \cdot \sqrt{1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{SSX}} \]

• Degrees of freedom for t = n-2

Example continued

• Find a 95% Confidence Interval for Sales of Sunglasses when rainfall = 30 inches.
• Find a 95% Prediction Interval for Sales of Sunglasses when rainfall = 30 inches.

Example continued

• 95% Confidence Interval: 22.63 ± 6.09
• 95% Prediction Interval: 22.63 ± 12.93
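
The two margins of error can be worked out from the interval formulas. This is a sketch, not the course's official solution: it assumes the rainfall example values (n = 5, X-bar = 23, SSX = 580, SSY = 380, SSXY = -445) and a t-table value of 3.182 for 3 degrees of freedom at 95% confidence, which is hard-coded because the Python standard library has no t distribution.

```python
import math

n = 5
x_bar = 115 / n                          # mean rainfall
ssx, ssy, ssxy = 580.0, 380.0, -445.0    # from the rainfall example
b1 = ssxy / ssx
b0 = 140 / n - b1 * x_bar
se = math.sqrt((ssy - b1 * ssxy) / (n - 2))

T_CRITICAL = 3.182   # assumed table value: t with n - 2 = 3 df, 95% two-sided
x_new = 30           # rainfall in inches

y_hat = b0 + b1 * x_new
spread = 1 / n + (x_new - x_bar) ** 2 / ssx

ci_margin = T_CRITICAL * se * math.sqrt(spread)       # mean of Y at x_new
pi_margin = T_CRITICAL * se * math.sqrt(1 + spread)   # individual Y at x_new

print(f"Confidence interval: {y_hat:.2f} +/- {ci_margin:.2f}")
print(f"Prediction interval: {y_hat:.2f} +/- {pi_margin:.2f}")
```

The prediction interval is always wider because it adds the variability of a single observation to the uncertainty in the estimated mean.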

Using Minitab to Run Regression

• Data shown is engine size in cubic inches (X) and MPG (Y) for 20 cars.

x     y     x     y
400   15    104   25
455   14    121   26
113   24    199   21
198   22    360   10
199   18    307   10
200   21    318   11
97    27    400   9
97    26    97    27
110   25    140   28
107   24    400   15

• Select Graphs>Scatterplot with regression line


Using Minitab to Run Regression

• Select Statistics>Regression>Regression, then choose the Response (Y-variable) and model (X-variable).
• Click the results box, and choose the fits and residuals to get all predictions.
• The results at the beginning are the regression equation, the intercept and slope, the standard error of the residuals, and the r².
• Next is the ANOVA table, which tests the significance of the regression model.
• Finally, the residuals show the potential outliers.


How to Choose a Model

a. Categorical or Numeric Data?
b. One, Two or many populations?
c. Test of mean, proportion, standard deviation, or something else?
d. Independent or dependent sampling?
e. Large or small sample size?
f. One or two tailed test?

• One Population Tests

o Numeric Data

1. Z test for population mean vs. hypothesized value (Part 6 slides)
• Test of mean, population standard deviation known
• Ho: μ = 10, Ha: μ ≠ 10;  Ho: μ ≤ 10, Ha: μ > 10;  Ho: μ ≥ 10, Ha: μ < 10
• \( Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \)   Degrees of freedom: Not Applicable

2. t test for population mean vs. hypothesized value (Part 6 slides)
• Test of mean, population standard deviation unknown
• Ho: μ = 10, Ha: μ ≠ 10;  Ho: μ ≤ 10, Ha: μ > 10;  Ho: μ ≥ 10, Ha: μ < 10
• \( t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}} \)   Degrees of freedom = n−1
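
As a quick illustration of test 2, the t statistic can be computed from summary statistics. The sample values below (mean 10.6, s = 1.2, n = 16) are made up for the example and do not come from the course; the function name `one_sample_t` is likewise hypothetical.

```python
import math

def one_sample_t(sample_mean, mu0, s, n):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    t_stat = (sample_mean - mu0) / (s / math.sqrt(n))
    return t_stat, n - 1

# Hypothetical summary statistics, tested against the hypothesized mean of 10.
t_stat, df = one_sample_t(sample_mean=10.6, mu0=10.0, s=1.2, n=16)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The Z test in item 1 has the same form, with the known population σ in place of s and no degrees of freedom.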

3. χ² test for variance vs. hypothesized value (Part 6 slides)
• Test of standard deviation or variance
• Ho: σ = 10, Ha: σ ≠ 10;  Ho: σ ≤ 10, Ha: σ > 10;  Ho: σ ≥ 10, Ha: σ < 10
• \( \chi^2 = \frac{(n - 1)s^2}{\sigma^2} \)   Degrees of freedom = n−1

o Categorical Data

4. Z test for proportion vs. hypothesized value (Part 6 slides)
• Two choices (Yes/No): Test of population proportion
• Ho: p = 0.5, Ha: p ≠ 0.5;  Ho: p ≤ 0.5, Ha: p > 0.5;  Ho: p ≥ 0.5, Ha: p < 0.5
• \( Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}} \)   Degrees of freedom = not applicable

5. χ² Goodness of fit test (Part 8 Slides)
• Multiple choices (k): Test of multiple proportions
• Ho: p1 = 0.4, p2 = 0.1, p3 = 0.5   Ha: At least one pi is different
• \( \chi^2 = \sum \frac{(O - E)^2}{E} \)   Degrees of freedom = k−1
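
The goodness of fit statistic in test 5 can be sketched as below. The hypothesized proportions (0.4, 0.1, 0.5) are the ones used in the summary above; the observed counts are invented purely for illustration.

```python
# Hypothesized proportions from the summary sheet.
hypothesized = [0.4, 0.1, 0.5]

# Hypothetical observed counts for a sample of 100.
observed = [45, 15, 40]
n = sum(observed)

# Expected counts under Ho: E = n * p.
expected = [n * p for p in hypothesized]

# Chi-square statistic: sum of (O - E)^2 / E over the k categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

print(f"chi-square = {chi_square:.3f} with {df} degrees of freedom")
```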



• Two or more Population Tests

o Numeric Data: One scale variable with two or more populations (factor variable)

6. Independent Samples: Z test (Part 7 Slides)
• Comparing 2 Means: Large Sample Size (n1, n2 > 30) or population standard deviation known
• Ho: μ1 = μ2, Ha: μ1 ≠ μ2;  Ho: μ1 ≤ μ2, Ha: μ1 > μ2;  Ho: μ1 ≥ μ2, Ha: μ1 < μ2
• \( Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} \)   Degrees of freedom: Not Applicable

7. Independent Sample t test with equal variances (pooled variance t test) (Part 7 Slides)
• Comparing 2 Means: Not Large Sample Sizes, assume σ1 = σ2
• Ho: μ1 = μ2, Ha: μ1 ≠ μ2;  Ho: μ1 ≤ μ2, Ha: μ1 > μ2;  Ho: μ1 ≥ μ2, Ha: μ1 < μ2
• \( t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{1/n_1 + 1/n_2}} \)   Degrees of freedom = n1+n2−2 (More power)

8. Independent Sample t test with unequal variances (Part 7 Slides)
• Comparing 2 Means: Not Large Sample Sizes, assume σ1 ≠ σ2
• Ho: μ1 = μ2, Ha: μ1 ≠ μ2;  Ho: μ1 ≤ μ2, Ha: μ1 > μ2;  Ho: μ1 ≥ μ2, Ha: μ1 < μ2
• \( t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}} \)   Degrees of freedom < n1+n2−2 (Less power)

9. Dependent Sampling: Matched Pairs (Part 7 Slides)
• Comparing 2 Means: Look at differences of measurements
• Ho: μd = 0, Ha: μd ≠ 0;  Ho: μd ≤ 0, Ha: μd > 0;  Ho: μd ≥ 0, Ha: μd < 0
• \( t = \frac{\bar{X}_d - \mu_d}{s_d / \sqrt{n}} \)   Degrees of freedom = n−1



 
10. F test of Variances (Part 7 Slides)
• Comparing 2 Variances
• Ho: σ1 = σ2, Ha: σ1 ≠ σ2;  Ho: σ1 ≤ σ2, Ha: σ1 > σ2;  Ho: σ1 ≥ σ2, Ha: σ1 < σ2
• \( F = \frac{s_1^2}{s_2^2} \) or \( F = \frac{s_2^2}{s_1^2} \)   Degrees of freedom = n1−1, n2−1 or n2−1, n1−1
• Use this test to help choose between models 7 and 8 above.


11. One Factor Analysis of Variance (Part 8 Slides)
• Comparing 3 or more Means (ANOVA): F test
• Ho: μ1 = μ2 = μ3 = ... = μk   Ha: at least one μi is different
• (ANOVA table) \( F = \frac{MS_{factor}}{MS_{error}} \)   Degrees of freedom = k−1, n−k
• Post Hoc Pairwise comparisons: Tukey’s HSD test (Part 8 Slides)
• Categorical variable is Factor, Numeric Variable is Response

o Categorical Data: Comparing 2 or more variables

12. χ² Test for Independence (Part 8 Slides)
• Test for a relationship between two variables (A and B) in a contingency table
• Ho: A and B are Independent   Ha: A and B are dependent
• \( \chi^2 = \sum \frac{(O - E)^2}{E} \)   Degrees of freedom = (rows−1)(columns−1)
• E = (Row Total)(Column Total)/Grand Total
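
The expected counts and the test statistic for test 12 can be sketched in a few lines. The 2 x 2 table of observed counts below is hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical contingency table: rows are levels of A, columns are levels of B.
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# E = (Row Total)(Column Total) / Grand Total for every cell.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-square = {chi_square:.3f} with {df} degree(s) of freedom")
```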


o Two numeric variables (X, Y): bivariate data

13. Correlation Coefficient (Part 9 Slides)
• Strength of relationship (correlation) between two numeric variables
• Ho: X and Y are not correlated   Ha: X and Y are correlated
• (ANOVA table) \( F = \frac{MS_{regression}}{MS_{error}} \)   Degrees of freedom = 1, n−2


14. Simple Linear Regression: F test (Part 9 Slides)
• Significance and prediction of Linear fit between two variables
• Ho: slope = 0   Ha: slope ≠ 0
• (ANOVA table) \( F = \frac{MS_{regression}}{MS_{error}} \)   Degrees of freedom = 1, n−2

Math 10 ‐ Homework 0 Name:_____________________________
Course Syllabus and Materials

Go to the class website at http://nebula2.deanza.edu/~mo (or simply Google “mo de anza”).


Find the link for Math 10 Sec 28, and then find the Syllabus:

Questions about the Syllabus

1. What are the required materials?

2. How many homework assignments are there?

3. What assignments can be turned in as a group?

4. What is the course policy on late homework and labs?

Questions about the Calendar (on the same page as the Syllabus)

5. What assignments are due in the first two weeks?

6. What are the dates of the midterms?

7. What is the date of the final exam for your section?

8. How many labs are there?

9. What are the deadlines for dropping and withdrawing from the course?
Exploring the Website – Find the link for Math 10 Handouts and open the PDF file for Part 1. I highly recommend you
print this out and bring it to class to take notes.

10. What is the Topic for Part 1?

11. Do you have any comments about improving these handouts?

Frequently Asked Questions – Find the ”FAQ” at the top of the page. If Flash is not working, you can use the menu
sidebar on the home page.

12. Find the question that starts “I need help in this class…” What are three things you can do if you need help?

13. Read the two questions that have to do with cheating and sign and date the following statement:

I have read the course policy on cheating in both the syllabus and the Frequently Asked Questions . I understand and
agree to the terms as outlined in these policies.

___________________________________________________ __________________________
Signature Date

Please write any Comments or Questions about the Course Policies here:
Math 10 ‐ Homework 1

1. Identify the following data by type (categorical, discrete, continuous) and level (nominal, ordinal, interval, ratio)

a. Number of tickets sold at a rock concert.

b. Make of automobile.

c. Age of a fossil.

d. Temperature of a nuclear power plant core reactor.

e. Number of students who transfer to private colleges.

f. Cost per unit at a state University.

g. Letter grade on an English essay.

2. A poll was taken of 150 students at De Anza College. Each student was asked how many hours they work outside of
college. The students were interviewed in the morning between 8 AM and 11 AM on a Thursday. The sample mean
for these 150 students was 9.2 hours.

a. What is the Population?

b. What is the Sample?

c. Does the 9.2 hours represent a statistic or parameter? Explain.

d. Is the sample mean of 9.2 a reasonable estimate of the mean number of hours worked for all students
at De Anza? Explain any possible bias.

3. The box plots represent the results of three exams for 40 students in a Math course.

a. Which exam has the highest median?

b. Which exam has the highest standard deviation?

c. For Exam 2, how does the median compare to the mean?

d. In your own words, compare the exams.


4. The following are average daily commute times (minutes) for residents of two cities.
City A 2 4 4 4 4 5 7 9 13 14 16 16 16 18 19 19
21 21 21 27 30 35 37 38 47 48 50 59 70 72 87 97

City B 29 38 38 40 40 48 48 50 52 52 54 55 56 57 57 58
58 58 59 59 59 62 62 63 66 66 67 69 69 71 75 89

a. Construct a back‐to‐back stem and leaf diagram and interpret the results.

b. Find the quartiles and interquartile range for each group.

c. Calculate the 80th percentile for each group.

d. Construct side‐by‐side box plots and compare the two groups.

e. For each group, determine the z‐score for a commute of 75 minutes. For which group would a 75 minute
commute be more unusual?

5. The February 10, 2009 Nielsen ratings of 20 TV programs shown on commercial television, all starting between 8 PM
and 10 PM, are given below:
2.1 2.3 2.5 2.8 2.8 3.6 4.4
4.5 5.7 7.6 7.6 8.1 8.7 10.0
10.2 10.7 11.8 13.0 13.6 17.3

a. Graph a stem and leaf plot with the tens and ones units making up the stem and the tenths unit being the leaf.

b. Group the data into intervals of width 2, starting the 1st interval at 2 and obtain the frequency of each of the
intervals.

c. Graphically depict the grouped frequency distribution in (b) by a histogram.

d. Obtain the relative frequency, % and cumulative frequency and cumulative relative frequency for the intervals in
(b)

e. Construct an ogive of the data. Estimate the median and quartiles.

f. Obtain the sample mean and the median. Compare the median to the ogive.

g. Do you believe that the data is symmetric, right‐skewed or left skewed?

h. Determine the sample variance and standard deviation.

i. Assuming the data are bell shaped, between what two numbers would you expect to find 68% of the data?
6. The following data represents recovery time for 16 patients (arranged in a table to help you out)

count   Days (X)   X − X̄   (X − X̄)²   Z Score


#1 2
#2 3
#3 4
#4 4
#5 5
#6 5
#7 5
#8 5
#9 5
#10 6
#11 6
#12 7
#13 7
#14 8
#15 8
#16 16
Totals

a. Calculate the sample mean and median

b. Use the table to calculate the variance and standard deviation.

c. Use the range of the data to see if the standard deviation makes sense. (Range should be between 3 and
6 standard deviations)

d. Using the empirical rule between what two numbers should you expect to see 68% of the data? 95% of
the data? 99.7% of the data?

e. Calculate the Z‐score for each observation. Do you think any of these data are outliers?

7. The following data represents the heights (in feet) of 20 almond trees in an orchard.

a. Construct a box plot of the data.

b. Do you think the tree with height of 45 feet is an outlier? Use both methods we covered in class to
justify your answer.
8. Rank the following correlation coefficients from weakest to strongest.

.343, ‐.318, .214, ‐.765, 0, .998, ‐.932, .445

9. If you were trying to think of factors that affect health care costs:

a. Choose a variable you believe would be positively correlated with health care costs.

b. Choose a variable you believe would be negatively correlated with health care costs.

c. Choose a variable you believe would be uncorrelated with health care costs.
Math 10 ‐ Homework 2

1. A student has a 90% chance of getting to class on time on Monday and a 70% chance of
getting to class on time on Tuesday. Assuming these are independent events, determine the
following probabilities:

a. The student is on time both Monday and Tuesday.

b. The student is on time at least once (Monday or Tuesday).

c. The student is late both days.

2. A class has 10 students, 6 females and 4 males. 3 students will be sampled without
replacement for a group presentation.

a. Construct a tree diagram of all possibilities (there will be 8 total branches at the end)

b. Find the following probabilities:


i. All male students in the group presentation.

ii. Exactly 2 female students in the group presentation.

iii. At least 2 female students in the group presentation.

3. 20% of professional cyclists are using a performance enhancing drug. A test for the drug has
been developed that has a 60% chance of correctly detecting the drug (true positive).
However, the test will come out positive in 2% of cyclists who do not use the drug (false
positive).

a. Construct a tree diagram where the first set of branches are cyclists with and without the
drug, and the 2nd set is whether or not they test positive.

b. From the tree diagram create a contingency table.

c. What percentage of cyclists will test positive for the drug?

d. If a cyclist tests positive, what is the probability that the cyclist really used the drug?
4. We wish to determine the morale for a certain company. We give each of the workers a
questionnaire and from their answers we can determine the level of their morale, whether it
is ‘Low’, ‘Medium ‘ or ‘High’; also noted is the ‘worker type’ for each of the workers. For
each worker type, the frequencies corresponding to the different levels of morale are given
below.

WORKER MORALE
Worker Type Low Medium High
Executive 1 14 35
Upper Management 5 30 65
Lower Management 5 40 55
Non‐Management 354 196 450

a. We randomly select 1 worker from this population. What is the probability that the
worker selected

• is an executive?

• is an executive with medium morale?

• is an executive or has medium morale?

• is an executive, given the information that the worker has medium morale.

b. Given the information that the selected worker is an executive, what is the probability
that the worker

• has medium morale?

• has high morale?

c. Are the following events independent or dependent? Explain your answer:

• ‘is an executive’ and ‘has medium morale’

• ‘is an executive’ and ‘has high morale’


Math 10 ‐ Homework 3

Additional Problems:

1. Explain the difference between population parameters and sample statistics. What symbols do we use for the
mean and standard deviation for each of these?

2. Consider the following probability distribution function of the random variable X which represents the number
of people in a group(party) at a restaurant:
X P(X)
1 .10
2 .25
3 .20
4 .20
5 .10
6 .05
7 .05
8 .05

a. Find the population mean of X.

b. Find the population variance and standard deviation of X.

c. Find the probability that the next party will be over 4 people.

d. Find the probability that the next three parties (assuming independence) will each be over 4 people.

3. 10% of all children at a large urban elementary school district have been diagnosed with learning disabilities. 10
children are randomly and independently selected from this school district.

a. Let X = the number of children with learning disabilities in the sample. What type of random variable is this?

b. Find the mean and standard deviation of X.

c. Find the probability that exactly 2 of these selected children have a learning disability.

d. Find the probability that at least 1 of these children has a learning disability.

e. Find the probability that less than 3 of these children have a learning disability.
4. A general statement is made that an error occurs in 10% of all retail transactions. We wish to evaluate the
truthfulness of this figure for a particular retail store, say store A. Twenty transactions of this store are
randomly obtained. Assume that the 10% figure also applies to store A and let X be the number of retail
transactions with errors in the sample.

a. The probability distribution function (pdf) of X is binomial. Identify the parameters n and p.

b. Calculate the expected value of X

c. Calculate the variance of X

d. Find the probability exactly 2 transactions sampled are in error.

e. Find the probability at least 2 transactions sampled are in error.

f. Find the probability that no more than one transaction is in error.

g. Would it be unusual if 5 or more transactions were in error?

5. A newspaper finds a mean of 4 typographical errors per page. Assume the errors follow a Poisson distribution.

a. Let X equal the number of errors on one page. Find the mean and standard deviation of this random
variable.

b. Find the probability that exactly three errors are found on one page.

c. Find the probability that no more than 2 errors are found on one page.

d. Find the probability that no more than 2 errors are found on two pages.

6. Major accidents at a regional refinery occur on the average once every five years. Assume the accidents follow a
Poisson distribution.

a. How many accidents would you expect over 10 years?

b. Find the probability of no accidents in the next 10 years.

c. Find the probability of no accidents in the next 20 years.

7. 20% of the people in a California town consider themselves vegetarians. If 20 people are randomly sampled,
find the probability that:
a. Exactly 3 are vegetarians.
b. At least 3 are vegetarians.
c. At most 3 are vegetarians

8. Cargo ships arrive at a loading dock at a rate of 2 per day. The dock has the capability of handling 3 arrivals per
day. How many days per month (assume 30 days in a month) would you expect the dock to be unable to handle
all arriving ships? (Hint: first find the probability that more than 3 ships arrive and then use that probability to
find the expected number of days in a month too many ships arrive.)
Math 10 ‐ Homework 4

1. A ferry boat leaves the dock once per hour. Your waiting time for the next ferryboat will follow a uniform
distribution from 0 to 60 minutes.

a. Find the mean and variance of this random variable.


b. Find the probability of waiting more than 20 minutes for the next ferry.
c. Find the probability of waiting exactly 20 minutes for the next ferry.
d. Find the probability of waiting between 15 and 35 minutes for the next ferry.
e. Find the conditional probability of waiting at least 10 more minutes after you have already waited 15
minutes.
f. Find the probability of waiting more than 45 minutes for the ferry on 3 consecutive independent days.

2. The cycle times for a truck hauling concrete to a highway construction site are uniformly distributed over the
interval 50 to 70 minutes.
a. Find the mean and variance for cycle times.
b. Find the 5th and 95th percentile of cycle times.
c. Find the interquartile range.
d. Find the probability the cycle time for a randomly selected truck exceeds 62 minutes.
e. If you are given the cycle time exceeds 55 minutes, find the probability the cycle time is between 60
and 65 minutes.

3. The amount of gas in a car’s tank (X) follows a Uniform Distribution where the minimum is zero and the
maximum is 12 gallons.
a. Find the mean and median amount of gas in the tank.
b. Find the variance and standard deviation of gas in the tank.
c. Find the probability that there is more than 3 gallons in the tank.
d. Find the probability that there is between 4 and 6 gallons in the tank.
e. Find the probability that there is exactly 3 gallons in the tank
f. Find the 80th percentile of gas in the tank.

4. A normally distributed population of package weights has a mean of 63.5 g and a standard deviation of 12.2 g.

a. What percentage of this population weighs 66 g or more?


b. What percentage of this population weighs 41 g or less?
c. What percentage of this population weighs between 41 g and 66 g?
d. Find the 60th percentile for distribution of weights.
e. Find the three quartiles and the interquartile range.
f. If you sample 49 packages, find the probability the sample mean is over 66 g. Compare this answer to
part a.

5. Assume the expected waiting time until the next RM (Richter Magnitude) 7.0 or greater earthquake somewhere
in California follows an exponential distribution with μ = 10 years.

a. Find the probability of waiting 10 or more years for the next RM 7.0 or greater earthquake.

b. Determine the median waiting time until the next RM 7.0 or greater earthquake.
6. High Fructose Corn Syrup (HFCS) is a sweetener in food products that is linked to obesity and type II diabetes.
The mean annual consumption in the United States in 2008 of HFCS was 60 lbs with a standard deviation of 20
lbs. Assume the population follows a Normal Distribution.

a. Find the probability a randomly selected American consumes more than 50 lbs of HFCS per year.

b. Find the probability a randomly selected American consumes between 30 and 90 lbs of HFCS per year.

c. Find the 80th percentile of annual consumption of HFCS.

d. In a sample of 40 Americans, how many would you expect to consume more than 50 pounds of HFCS per
year?

e. Between what two numbers would you expect 95% of Americans' annual HFCS consumption to fall?

f. Find the quartiles and Interquartile range for this population.

g. A teenager who loves soda consumes 105 lbs of HFCS per year. Is this result unusual? Use probability to
justify your answer.

7. State in your own words the 3 important parts of the Central Limit Theorem.

8. For women aged 18‐24, systolic blood pressures (in mmHg) are normally distributed with μ=114.8 and σ=13.1.

a. Find the probability a woman aged 18‐24 has systolic blood pressure exceeding 120.

b. If 4 women are randomly selected, find the probability that their mean blood pressure exceeds 120.

c. If 40 women are randomly selected, find the probability that their mean blood pressure exceeds 120.

d. If the pdf for systolic blood pressure did NOT follow a normal distribution, would your answer to part c
change? Explain.
9. The following data represents 20 random samples from a discrete uniform distribution S={1,2,3,4,5,6,7,8,9}.
The sample mean (X̄) was calculated for each group:

a. Consider the sample mean (last column) as a random variable and group the data into the following
categories and make a histogram:

Interval for X̄   Frequency   Rel Freq

(4.05 to 4.50)

(4.55 to 5.00)

(5.05 to 5.50)

(5.55 to 6.00)

Total

b. Describe the shape of the data

c. Calculate the sample mean and standard deviation of these 20 values.

d. For this discrete uniform distribution, μ= 5 and σ = 2.58. Based on the Central Limit Theorem, what
would the mean and standard deviation of the sample mean random variable be? How does this
compare with sample mean and standard deviation results from part c?
Math 10 - Homework 5

1. The average number of years of post secondary education of employees in an industry is 1.5. A company claims
that this average is higher for its employees. A random sample of 16 of its employees has a mean of 2.1 years
of post secondary education with a standard deviation of 0.6 years.

a. Find a 95% confidence interval for the mean number of years of post secondary education for the
company’s employees. How does this compare with the industry value?

b. Find a 95% confidence interval for the standard deviation of the number of years of post secondary education
for the company’s employees.

2. When polling companies report a margin of error, they are referring to a 95% confidence interval. Go to the
website www.pollingreport.com and verify the stated margins of error for 2 polls.

Constructing Confidence Intervals In Exercises 3 and 4 you are given the sample mean and the sample standard
deviation. Assume the random variable is normally distributed and use a t-distribution to construct a 95%
confidence interval for the population mean µ. What is the margin of error of the confidence interval?

3. Repair Costs: Microwaves In a random sample of five microwave ovens, the mean repair cost was $75.00 and
the standard deviation was $12.50.

4. Repair Costs: Computers In a random sample of seven computers, the mean repair cost was $100.00 and the
standard deviation was $42.50.

5. You did some research on repair costs of microwave ovens and found that the standard deviation is σ = $15.
Repeat Exercise 3, using a normal distribution with the appropriate calculations for a standard deviation that is
known. Compare the results.

6. Mini-Soccer Balls A soccer ball manufacturer wants to estimate the mean circumference of mini-soccer balls
within 0.15 inch. Assume that the population of circumferences is normally distributed.

(a) Determine the minimum sample size required to construct a 99% confidence interval for the population
mean. Assume the population standard deviation is 0.20 inch.

(b) Repeat part (a) using a standard deviation of 0.10 inch. Which standard deviation requires a larger sample
size? Explain.

(c) Repeat part (a) using a confidence level of 95%. Which level of confidence requires a larger sample size?
Explain.
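
The sample size reasoning in Exercise 6 can be sketched in code. This is a minimal illustration, assuming the usual known-sigma formula n = (z·σ/E)², rounded up; the function name min_sample_size_mean is made up for this sketch and is not from the textbook.

```python
import math
from statistics import NormalDist


def min_sample_size_mean(sigma, margin, confidence):
    """Smallest n for which z * sigma / sqrt(n) <= margin (sigma assumed known)."""
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Values from Exercise 6: estimate the mean within 0.15 inch
n_a = min_sample_size_mean(sigma=0.20, margin=0.15, confidence=0.99)
n_b = min_sample_size_mean(sigma=0.10, margin=0.15, confidence=0.99)
n_c = min_sample_size_mean(sigma=0.20, margin=0.15, confidence=0.95)
print(n_a, n_b, n_c)
```

Comparing the three results shows the pattern asked about in Exercise 7: a larger standard deviation or a higher confidence level pushes the required sample size up.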

7. If all other quantities remain the same, how does the indicated change affect the minimum sample size
requirement (Increase, Decrease or No Change)?

(a) Increase in the level of confidence

(b) Increase in the error tolerance


(c) Increase in the standard deviation

8. Stressful Travel: In a survey of 3224 U.S. adults, 1515 said flying is the most stressful form of travel. Construct a
95% confidence interval for the proportion of all adults who say flying is the most stressful form of travel.

9. Accidents and Alcohol: A study of 2008 traffic fatalities found that 800 of the fatalities were alcohol related.
Find a 99% confidence interval for the population proportion and explain what it means.

10. Happy at Work? In a survey of 1003 U.S. adults, 662 would be happy spending the rest of their career with their
current employer. Construct a 90% confidence interval for the proportion who would be happy staying with
their current employer. Does this result surprise you?
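
Exercises 8 through 10 each ask for a confidence interval for a population proportion. The sketch below uses the numbers from Exercise 8 and assumes the normal-approximation interval p̂ ± z·sqrt(p̂(1 − p̂)/n); the helper name proportion_ci is hypothetical.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n                                # point estimate
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)      # margin of error
    return p_hat - margin, p_hat + margin, margin


# Exercise 8: 1515 of 3224 adults, 95% confidence
low, high, margin = proportion_ci(1515, 3224, 0.95)
print(f"{low:.4f} to {high:.4f} (margin of error {margin:.4f})")
```

The same function can be reused for Exercises 9 and 10 by changing the counts and the confidence level.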

11. Computer Repairs You wish to estimate, with 95% confidence and within 3.5% of the true population proportion, the
proportion of computers that need repairs or have problems by the time the product is three years old.
a. No preliminary estimate is available. Find the minimum sample size needed.

b. Find the minimum sample size needed, using a prior study that found that 19% of computers needed
repairs or had problems by the time the product was three years old.

c. Compare the results from parts (a) and (b).
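
A possible way to check the sample sizes in Exercise 11, assuming the standard formula n = p(1 − p)(z/E)² rounded up, with p = 0.5 used when no preliminary estimate exists. The function name is an illustration only.

```python
import math
from statistics import NormalDist


def min_sample_size_proportion(margin, confidence, p_prior=0.5):
    """Smallest n so the margin of error for a proportion is at most `margin`.

    p_prior = 0.5 is the most conservative choice when no estimate is available.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(p_prior * (1 - p_prior) * (z / margin) ** 2)


n_no_estimate = min_sample_size_proportion(0.035, 0.95)               # part a
n_with_prior = min_sample_size_proportion(0.035, 0.95, p_prior=0.19)  # part b
print(n_no_estimate, n_with_prior)
```

Because p(1 − p) is largest at p = 0.5, the size found without a prior estimate is never smaller than the size found with one.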

12. Lawn Mower A lawn mower manufacturer is trying to determine the standard deviation of the life of one of its
lawn mower models. To do this, it randomly selects 12 lawn mowers that were sold several years ago and finds
that the sample standard deviation is 3.25 years. Use a 99% level of confidence to find a confidence interval for
the population standard deviation.

13. Monthly Income The monthly incomes of 20 randomly selected individuals who have recently graduated with a
bachelor's degree in social science have a sample standard deviation of $107. Use a 95% level of confidence to
find a confidence interval for the population standard deviation.

14. Read the attached article on the CBS News poll regarding the birth control pill.

a. What would the point estimator be for the proportion of adults who believe the pill has made
women’s lives better?

b. What is the sample size for this study?

c. What is the margin of error for this poll as reported in the article? Assuming a 95% level of
confidence, verify this poll by calculation.
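
One way to approach the verification in part c is sketched here. It assumes the pollster reports a worst-case margin (p = 0.5) at 95% confidence, which is a common convention but is not stated in the article; the sample size of 591 is taken from the article's methodology note.

```python
import math
from statistics import NormalDist

n = 591                                  # sample size reported in the article
z = NormalDist().inv_cdf(0.975)          # two-sided 95% critical value
# Worst-case margin of error: p(1 - p) is largest when p = 0.5
worst_case_margin = z * math.sqrt(0.5 * 0.5 / n)
print(round(100 * worst_case_margin, 1), "percentage points")
```

The result can then be compared with the "plus or minus four percentage points" quoted in the article.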
Poll: Most Say The Pill Improved Women's Lives - CBS News http://www.cbsnews.com/stories/2010/05/07/health/main6468828.shtml?...

May 7, 2010

(CBS) Poll analysis by Jennifer De Pinto.

More than half the public -- including most women -- believes the birth control pill has been one of the most
significant medical developments of the last half century, a new CBS News poll finds.

Most Americans say "the pill" has had an impact on American society and on women’s lives in particular, and
credit it with helping women enter the work force.

The birth control pill was approved by the Food and Drug Administration in 1960. Today, 52 percent
of Americans say it has been one of the most significant medical developments of the last 50 years,
according to the poll, conducted on May 4th and 5th.

Four in five Americans think the birth control pill has had at least some effect on American society
overall, including 41 percent who say it’s had a great deal of impact.

Even more, 54 percent, think the birth control pill has had a great deal of impact on women’s lives in
particular.

The Pill: Women’s Lives Made Better

Most Americans say women’s lives were changed for the better because of the birth control pill. Only a
quarter think it made no difference, and even fewer say the pill made women’s lives worse.

Men (59 percent), women (54 percent), and women who have ever taken the pill (54 percent) say that
women’s lives were improved as a result of the birth control pill.

More specifically, Americans think the birth control pill helped women enter the work force: 57
percent say the pill made it easier for women to have jobs and careers outside the home.

That number rises to 69 percent among Americans age 45 and over -- an age group more likely to have
felt the impact of the pill when it was first developed and put on the market. Among women age 45 and
older that figure is 64 percent.

By contrast, 53 percent of younger Americans say the birth control pill had no effect on the ability of
women to work outside the home.

Among working women, 55 percent say the birth control pill has made it easier for women to enter the
workforce.

Family Life and Attitudes Toward Sex

Roughly half of Americans say the birth control pill has improved American family life, while a third
doesn’t think it has had much effect.

Religion has some impact on these views. Among Catholics, whose church opposes non-natural forms
of birth control, just 38 percent believe the birth control pill has improved American family life. That
figure is 52 percent among Protestants.

Eight in ten Americans think the birth control pill has affected Americans’ attitudes toward sex,
including 51 percent who say it impacted those attitudes a great deal.

The Pill: Safety and Effectiveness

The poll finds public concerns about the safety of the birth control pill have diminished over time.


In 1966, six years after the pill was approved by the FDA, fewer than half of Americans - 43 percent -
told a Gallup Poll that birth control pills could be used safely without danger to a person’s health.

That number has risen to 64 percent today.

Among women, 58 percent now think the birth control pill can be used safely, as do a similar
percentage of women who have ever taken it.

Nearly half of women think the birth control pill is just as safe as other forms of birth control, and
another 20 percent believe the pill is safer. Still, one in five thinks it is less safe. Views are similar
among women who have ever taken birth control pills.

More than eight in 10 Americans (including 82 percent of women) say birth control pills are effective.
In a 1966 Gallup Poll, a smaller number of Americans (though still a 61 percent majority) thought the
birth control pill was effective.

Some medical research has been done on a contraceptive for men similar to that of the birth control
pill. A majority of women do not think most men would take birth control pills if they were available.

In contrast, two-thirds of men think most men would take the pill if it were available.

This poll was conducted among a random sample of 591 adults nationwide, interviewed by telephone
May 4-5, 2010. Phone numbers were dialed from random digit dial samples of both standard
land-line and cell phones. The error due to sampling for results based on the entire sample could be
plus or minus four percentage points. The error for subgroups is higher.

This poll release conforms to the Standards of Disclosure of the National Council on Public Polls.

© MMIX, CBS Interactive, Inc. All Rights Reserved.

Math 10 - Homework 6

Part A

1. What are the two types of hypotheses used in a hypothesis test? How are they related?

2. Describe the two types of error possible in a hypothesis test decision.

True or False?
In Exercises 3-8, determine whether the statement is true or false. If it is false, rewrite it as a true
statement.

3. In a hypothesis test, you assume the alternative hypothesis is true.

4. A statistical hypothesis is a statement about a sample.

5. If you decide to reject the null hypothesis, you can support the alternative hypothesis.

6. The level of significance is the maximum probability you allow for rejecting a null hypothesis when
it is actually true.

7. A large P-value in a test will favor a rejection of the null hypothesis.

8. If you want to support a claim, write it as your null hypothesis.

Stating Hypotheses
In Exercises 9-14, use the given statement to represent a
claim. Write its complement and state which is Ho and which is Ha.

9. p > 0.65

10. µ ≤ 128

11. σ² ≠ 5

12. µ = 1.2

13. p ≥ 0.45

14. σ < 0.21


In Exercises 15-20, think about the context of the claim. Determine whether you want to support or reject the claim.
a. State the null and alternative hypotheses in words.
b. Write the null and alternative hypotheses in appropriate symbols.
c. Describe in words Type I error (the consequence of rejecting a true null hypothesis).
d. Describe in words Type II error (the consequence of failing to reject a false null hypothesis).

15. You represent a chemical company that is being sued for paint damage to automobiles. You want
to support the claim that the mean repair cost per automobile is about $650. How would you
write the null and alternative hypotheses?

16. You are on a research team that is investigating the mean temperature of adult humans. The
commonly accepted claim is that the mean temperature is about 98.6°F. You want to show that
this claim is false. How would you write the null and alternative hypotheses?

17. A light bulb manufacturer claims that the mean life of a certain type of light bulb is at least 750
hours. You are skeptical of this claim and want to refute it.

18. As stated by a company's shipping department, the number of shipping errors per million
shipments has a standard deviation that is less than 3. Can you support this claim?

19. A research organization reports that 33% of the residents in Ann Arbor, Michigan are college
students. You want to reject this claim.

20. The results of a recent study show that the proportion of people in the western United States who
use seat belts when riding in a car or truck is under 84%. You want to support this claim.
PART B – Hypothesis Testing Procedure

21. In your work for a national health organization, you are asked to monitor the amount of sodium in a
certain brand of cereal. You find that a random sample of 82 cereal servings has a mean sodium content
of 232 milligrams with a standard deviation of 10 milligrams. At α = 0.01 , can you conclude that the
mean sodium content per serving of cereal is over 230 milligrams?

(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model (test statistic)

(d) (DESIGN) Determine decision rule (p-value method)

(e) (DATA) Conduct the test and circle your decision

Reject Ho          Fail to Reject Ho

(f) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.
22. A tourist agency in Florida claims the mean daily cost of meals and lodging for a family of four
traveling in Florida is $284. You work for a consumer protection advocate and want to test this claim.
In a random sample of 50 families of four traveling in Florida, the mean daily cost of meals and lodging
is $292 and the standard deviation is $25. At α = 0.05, do you have enough evidence to reject the
agency's claim?

(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model (test statistic)

(d) (DESIGN) Determine decision rule (critical value method)

(e) (DATA) Conduct the test and circle your decision

Reject Ho          Fail to Reject Ho

(f) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.
23. An environmentalist estimates that the mean waste recycled by adults in the United States is more than
1 pound per person per day. You want to test this claim. You find that the mean waste recycled per
person per day for a random sample of 12 adults in the United States is 1.2 pounds and the standard
deviation is 0.3 pound. At α = 0.05, can you support the claim?

(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model (test statistic)

(d) (DESIGN) Determine decision rule (critical value method)

(e) (DATA) Conduct the test and circle your decision

Reject Ho          Fail to Reject Ho

(f) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.
24. A government association claims that 44% of adults in the United States do volunteer work. You
work for a volunteer organization and are asked to test this claim. You find that in a random
sample of 1165 adults, 556 do volunteer work. At α = 0.05, do you have enough evidence to reject
the association's claim?

(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model (test statistic)

(d) (DESIGN) Determine decision rule (p-value method)

(e) (DATA) Conduct the test and circle your decision

Reject Ho          Fail to Reject Ho

(f) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.
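
The design, data and decision steps for Exercise 24 could be traced in code roughly as follows. This is a sketch that assumes a two-sided one-proportion z-test with the normal approximation (Ho: p = 0.44, Ha: p ≠ 0.44); the function name is invented for illustration.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of Ho: p = p0 using the normal approximation."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)      # standard error computed under Ho
    z = (p_hat - p0) / se                  # test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


alpha = 0.05
z, p_value = one_proportion_z_test(556, 1165, 0.44)
# Decision rule (p-value method): reject Ho when the p-value is below alpha
decision = "Reject Ho" if p_value < alpha else "Fail to Reject Ho"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

The written conclusion in part (f) still has to be stated in the context of the volunteer-work claim; the code only produces the decision.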
PART C – total hypothesis testing

25. The geyser Old Faithful in Yellowstone National Park is claimed to erupt for about three minutes on
average. Thirty-six observations of eruptions of Old Faithful were recorded (time in minutes):
1.8 1.98 2.37 3.78 4.3 4.53
1.82 2.03 2.82 3.83 4.3 4.55
1.88 2.05 3.13 3.87 4.43 4.6
1.9 2.13 3.27 3.88 4.43 4.6
1.92 2.3 3.65 4.1 4.47 4.63
1.93 2.35 3.7 4.27 4.47 6.13

Sample mean = 3.394 minutes. Sample standard deviation = 1.168 minutes


Test the hypothesis that the mean length of time for an eruption is 3 minutes answering ALL the
following questions:
A. General Question
a. Why do you think this test is being conducted?
B. Design
a. State the null and alternative hypotheses
b. What is the appropriate test statistic/model?
c. What is the significance level of the test?
d. What is the decision rule?
C. Conduct the test
a. Are there any unusual observations that question the integrity of the data or the
assumptions of the model? (additional problem only)
b. Is the decision to reject or fail to reject Ho?
D. Conclusions - State a one paragraph conclusion that is consistent with the decision using language
that is clearly understood in the context of the problem. Address any potential problems with the
sampling methods and address any further research you would conduct.
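
As a check on the summary statistics for Exercise 25, the sketch below recomputes the sample mean and standard deviation from the 36 listed times and forms the one-sample t statistic for Ho: µ = 3. The critical value (or p-value) for 35 degrees of freedom is not computed here and would come from a t table or software.

```python
import math
from statistics import mean, stdev

# The 36 eruption times (minutes) listed in the problem
eruptions = [
    1.8, 1.98, 2.37, 3.78, 4.3, 4.53,
    1.82, 2.03, 2.82, 3.83, 4.3, 4.55,
    1.88, 2.05, 3.13, 3.87, 4.43, 4.6,
    1.9, 2.13, 3.27, 3.88, 4.43, 4.6,
    1.92, 2.3, 3.65, 4.1, 4.47, 4.63,
    1.93, 2.35, 3.7, 4.27, 4.47, 6.13,
]

n = len(eruptions)
x_bar = mean(eruptions)      # sample mean
s = stdev(eruptions)         # sample standard deviation (n - 1 divisor)
mu_0 = 3.0                   # hypothesized mean under Ho
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
print(f"n = {n}, mean = {x_bar:.3f}, s = {s:.3f}, t = {t_stat:.2f}")
```

Sorting the list also makes it easy to spot unusual observations, such as the largest value, when answering part C(a).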
PART D – Definitions and Power
26. Define the following terms:

a. Parameter
b. Statistic
c. Statistical Inference
d. Hypothesis
e. Hypothesis Testing
f. Null Hypothesis (Ho)
g. Alternative Hypothesis (Ha)
h. Type I Error
i. Type II Error
j. Level of Significance (α)
k. Beta (β)
l. Statistical Model
m. Test Statistic
n. Model Assumptions
o. Critical value(s)
p. Rejection Region
q. p-value
r. Decision Rule
s. Power
t. Effect Size

27. A study claims more than 60% of students text-message frequently. In a poll of 1000 students, 660 students
said they text message frequently. Can you support the study’s claim? Conduct the test with α = 1%

28. 15 iPod users were asked how many songs were on their iPod. Here are the summary statistics of that
study:

X̄ = 650     s = 200

a. Can you support the claim that the mean number of songs on a user’s iPod is different from 500?
Conduct the test with α = 5%.

b. Can you support the claim that the population standard deviation is under 300? Conduct the test
with α= 5% .

29. Consider the design procedure in the test you conducted in Question 28a. Suppose you wanted to conduct
a Power analysis if the population mean under Ha was actually 550. Use the online Power calculator to
answer the following questions.

a. Determine the Power of the test.

b. Determine Beta.

c. Determine the sample size needed if you wanted to conduct the test in Question 28a with 95%
power.
30. The drawing shown diagrams a hypothesis test for population mean design under the Null Hypothesis (top
drawing) and a specific Alternative Hypothesis (bottom drawing). The sample size for the test is 200.

a. State the Null and Alternative Hypotheses

b. What are the values of µ0 and µa in this problem?

c. What is the significance level of the test?

d. What is the Power of the test when the population mean = 4?

e. Determine the probability associated with Type I error.

f. Determine the probability associated with Type II error.

g. Under the Null Hypothesis, what is the probability the sample mean will be over 6?

h. If the significance level was set at 5%, would the power increase, decrease or stay the same?

i. If the test was conducted, and the p-value was .085, would the decision be Reject or Fail to Reject
the Null Hypothesis?

j. If the sample size was changed to 100, would the shaded area on the bottom (Ha) graph
increase, decrease or stay the same?
Math 10 ‐ Homework 7

1. What is the difference between two samples that are dependent and two samples that are
independent? Give an example of two dependent samples and two independent samples.

2. What conditions are necessary in order to use the dependent samples t‐test for the mean of the
difference of two populations?

In Problems 3‐10, classify the two given samples as independent or dependent. Explain your reasoning.

3. Sample 1: The SAT scores for 35 high school students who did not take an SAT preparation course
Sample 2: The SAT scores for 40 high school students who did take an SAT preparation course

4. Sample 1: The SAT scores for 44 high school students
Sample 2: The SAT scores for the same 44 high school students after taking an SAT preparation course

5. Sample 1: The weights of 51 adults
Sample 2: The weights of the same 51 adults after participating in a diet and exercise program for one
month

6. Sample 1: The weights of 40 females
Sample 2: The weights of 40 males

7. Sample 1: The average speed of 23 powerboats using an old hull design
Sample 2: The average speed of 14 powerboats using a new hull design

8. Sample 1: The fuel mileage of 10 cars
Sample 2: The fuel mileage of the same 10 cars using a fuel additive

9. The table shows the braking distances (in feet) for each of four different sets of tires with the car's
anti‐lock braking system (ABS) on and with ABS off. The tests were done on ice with cars traveling at 15
miles per hour.

Tire Set 1 2 3 4
Braking distance with ABS 42 55 43 61
Braking distance without ABS 58 67 59 75

10. The table shows the heart rates (in beats per minute) of five people before exercising and after.

Person 1 2 3 4 5
Heart Rate before Exercising 42 55 43 61 65
Heart Rate after Exercising 58 67 59 75 90
11. In a study testing the effects of an herbal supplement on blood pressure in men, 11 randomly selected
men were given an herbal supplement for 15 weeks. The following measurements are for each subject's
diastolic blood pressure taken before and after the 15‐week treatment period. At α = .10, can you support the
claim that diastolic blood pressure was lowered?
(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model (test statistic)

(d) (DESIGN) Determine decision rule (p-value method)

(e) (DATA) Conduct the test and circle your decision

Reject Ho          Fail to Reject Ho

(f) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.
12. A random sample of 25 waiting times (in minutes) before patients saw a medical professional in a hospital's
minor emergency department had a standard deviation of 0.7 minute. After a new admissions procedure was
implemented, a random sample of 21 waiting times had a standard deviation of 0.5 minute. At α = .10 , can you
support the hospital's claim that the standard deviation of the waiting times has decreased?

(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model (test statistic)

(d) (DESIGN) Determine decision rule (critical value method)

(e) (DATA) Conduct the test and circle your decision

Reject Ho          Fail to Reject Ho

(f) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.
13. An engineer wants to compare the tensile strengths of steel bars that are produced using a conventional
method and an experimental method. (The tensile strength of a metal is a measure of its ability to resist tearing
when pulled lengthwise.) To do so, the engineer randomly selects steel bars that are manufactured using each
method and records the following tensile strengths (in Newtons per square millimeter). At α = .10 , can the
engineer claim that the experimental method produces steel with greater mean tensile strength? Should the
engineer recommend using the experimental method? First use the F test to determine whether or not to use
equal variances in choosing the model.

Experimental 395 389 421 394 407 411 389 402 422 416 402 408 400 386 411 405 389
Conventional 362 352 380 382 413 384 400 378 419 379 384 388 372 383
(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model (test statistic)

(d) (DESIGN) Determine decision rule (p-value method)

(e) (DATA) Conduct the test and circle your decision

Reject Ho          Fail to Reject Ho

(f) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.
Math 10 ‐ Homework 8

1. A bicycle safety organization claims that fatal bicycle accidents are uniformly distributed throughout the
week. The table shows the day of the week on which 911 randomly selected fatal bicycle accidents
occurred. At α = 0.10, can you reject the claim that the distribution is uniform?

(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model. Determine decision rule (critical value method)

(d) (DATA) Conduct the test and circle your decision

Survey       Observe   pi   Expected   ChiSq
Sunday       118
Monday       119
Tuesday      127
Wednesday    137
Thursday     129
Friday       146
Saturday     135
Total        911

Reject Ho          Fail to Reject Ho

(e) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.
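
The pi, Expected and ChiSq columns of the table for Exercise 1 can be filled in with a few lines of code. This sketch assumes the chi-square goodness-of-fit statistic, the sum of (Observed − Expected)²/Expected, with each day expected to have probability 1/7 under the uniform claim. The critical value is hard-coded from a standard chi-square table (right-tail area 0.10, 6 degrees of freedom) rather than computed.

```python
observed = {
    "Sunday": 118, "Monday": 119, "Tuesday": 127, "Wednesday": 137,
    "Thursday": 129, "Friday": 146, "Saturday": 135,
}

total = sum(observed.values())
p_i = 1 / len(observed)            # uniform claim: same probability every day
expected = total * p_i             # same expected count for every day

# Contribution of each day to the chi-square statistic
contributions = {day: (o - expected) ** 2 / expected for day, o in observed.items()}
chi_sq = sum(contributions.values())
df = len(observed) - 1

CRITICAL_VALUE = 10.645            # table value: alpha = 0.10, df = 6 (assumed from a table)
decision = "Reject Ho" if chi_sq > CRITICAL_VALUE else "Fail to Reject Ho"
print(f"expected = {expected:.2f}, chi-square = {chi_sq:.3f}, df = {df}, {decision}")
```

The same pattern applies to Exercise 2, except that the expected proportions there come from the earlier survey instead of being equal.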
2. Results from a survey five years ago asking where coffee drinkers typically drink their first cup of coffee are
shown in the graph. To determine whether this distribution has changed, you randomly select 581 coffee
drinkers and ask each where they typically drink their first cup of coffee. The results are shown in the table.
Can you conclude that there has been a change in the claimed or expected distribution? Use α= 0.05.

(a) (DESIGN) State your Hypothesis

(b) (DESIGN) State Significance Level of the test and explain what it means.

(c) (DESIGN) Determine the statistical model. Determine decision rule (critical value method)

(d) (DATA) Conduct the test and circle your decision

Survey       Observe   pi   Expected   ChiSq
Home         389
Work         110
Commute      55
Rest/Other   27
Total        581

Reject Ho          Fail to Reject Ho

(e) (CONCLUSION) State your overall conclusion in language that is clear, relates to the original
problem and is consistent with your decision.

3. In a recent SurveyUSA poll, 500 American adults were asked if marijuana should be legalized. The results of
the poll were cross tabulated as shown in the contingency tables below. Conduct two tests for
independence to determine if opinion about legalization of marijuana is dependent on gender or age.

Male Female
Should be Legal 123 90
Should Not be Legal 127 160

                      18‐34   35‐54   55+
Should be Legal       95      83      48
Should Not be Legal   65      126     83
4. 1000 American adults were recently polled on their opinion about the effect of the recent stimulus bill on the
economy. The results are shown in the following contingency table, broken down by gender:

         Stimulus will    Stimulus will       Stimulus will
         hurt economy     help the economy    have no effect    TOTAL
Male     150              150                 200               500
Female   100              200                 200               500
TOTAL    250              350                 400               1000

a) Are gender and opinion on the stimulus dependent variables? Test using α =1%.
b) Give a possible explanation for the conclusion you came up with in part a.
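
A sketch of the arithmetic behind part a, assuming the chi-square test of independence: each expected count is (row total × column total)/grand total, and the degrees of freedom are (rows − 1)(columns − 1). The critical value is taken from a standard chi-square table (right-tail area 0.01, 2 degrees of freedom) and is hard-coded as an assumption.

```python
# Observed counts from the table: columns are hurt, help, no effect
table = {
    "Male": [150, 150, 200],
    "Female": [100, 200, 200],
}

rows = list(table.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
grand_total = sum(row_totals)

chi_sq = 0.0
for row, row_total in zip(rows, row_totals):
    for obs, col_total in zip(row, col_totals):
        exp = row_total * col_total / grand_total   # expected count under independence
        chi_sq += (obs - exp) ** 2 / exp

df = (len(rows) - 1) * (len(col_totals) - 1)
CRITICAL_VALUE = 9.210    # table value: alpha = 0.01, df = 2 (assumed from a table)
decision = "Reject Ho" if chi_sq > CRITICAL_VALUE else "Fail to Reject Ho"
print(f"chi-square = {chi_sq:.3f}, df = {df}, {decision}")
```

The two contingency tables in Exercise 3 can be tested the same way by swapping in their counts and the matching table value.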

5. A clinical psychologist completed a study on hyperactivity in children using one‐way ANOVA. The model was
balanced with 5 replicates per treatment. The factor was 3 types of school district (urban, rural and
suburban). Unfortunately, hackers broke into the psychologist’s computer and wiped out all the data. All
that remained was a fragment of the ANOVA table:

Source of    Sum of     Degrees of   Mean      F statistic   Critical Value      Decision
Variation    Squares    Freedom      Square                  of F for α = .05
Factor       7000
Error
Total        9000

Fill in the table and conduct the hypothesis test that compares mean level of hyperactivity in the 3 types of
districts. Explain your results.
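
The missing entries of the ANOVA table in Exercise 5 follow from the design (3 treatments, 5 replicates each) and the two surviving sums of squares. The sketch below assumes the usual one-way ANOVA identities: SS Total = SS Factor + SS Error, df Factor = k − 1, df Error = n − k, and F = MS Factor / MS Error. The critical value is hard-coded from an F table (α = .05, 2 and 12 degrees of freedom).

```python
k = 3                      # number of treatments (urban, rural, suburban)
replicates = 5             # observations per treatment (balanced design)
n_total = k * replicates

ss_factor = 7000.0         # surviving entry in the table
ss_total = 9000.0          # surviving entry in the table
ss_error = ss_total - ss_factor

df_factor = k - 1
df_error = n_total - k
df_total = n_total - 1

ms_factor = ss_factor / df_factor
ms_error = ss_error / df_error
f_stat = ms_factor / ms_error

CRITICAL_F = 3.89          # table value: alpha = .05, df = (2, 12) (assumed from a table)
decision = "Reject Ho" if f_stat > CRITICAL_F else "Fail to Reject Ho"
print(df_factor, df_error, df_total, ms_factor, round(ms_error, 2), round(f_stat, 2), decision)
```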

6. A sociologist was interested in commute time for workers in the Bay Area. She categorized commuters by 4
regions (North Bay, South Bay, East Bay and Peninsula) and designed a balanced model with 8 replicates per
region. Data is round trip commute time in minutes. The results and ANOVA output are shown on the next
page:

a. Test the Null Hypothesis that all regions have the same mean commute time at a significance level
of 5%. State your decision in non‐statistical language.

b. Conduct all pairwise comparisons at an overall significance level of 5%.

c. Explain the results of this experiment as if you were addressing a transportation committee. What
would you recommend?
MINITAB OUTPUT

North South East Pen


13 91 41 17
9 45 30 16
10 28 60 13
13 17 34 26
27 89 47 7
13 36 13 9
9 23 19 21
32 66 36 18

One-way ANOVA: North, South, East, Pen

Source DF SS MS F P
Factor 3 6392 2131 7.14 0.001
Error 28 8356 298
Total 31 14748

S = 17.28 R-Sq = 43.34% R-Sq(adj) = 37.27%

Individual 95% CIs For Mean Based on Pooled StDev
Level N Mean StDev --------+---------+---------+---------+-
North 8 15.75 8.76 (--------*-------)
South 8 49.38 29.22 (-------*-------)
East 8 35.00 14.99 (-------*--------)
Pen 8 15.88 6.20 (--------*-------)
--------+---------+---------+---------+-
15 30 45 60

Pooled StDev = 17.28

Grouping Information Using Tukey Method

N Mean Grouping
South 8 49.38 A
East 8 35.00 A B
Pen 8 15.88 B
North 8 15.75 B

Means that do not share a letter are significantly different.



7. People who are concerned about their health may prefer hot dogs that are low in salt and calories. The
data set contains the calories and sodium content of each of 54 major hot dog brands. The hot dogs
are classified by type: beef, poultry, and meat (mostly pork and beef, but up to 15% poultry meat). Minitab
output is attached for two different hypothesis tests.

A test for a difference in calories due to hot dog type will be performed.
i. Design the test.
ii. Fill in the missing information in the ANOVA table on the next page.
iii. Conduct the test with an overall significance level of 5%, including pairwise comparisons.

One-way ANOVA: Calories versus Type

Source DF SS MS F p-value
Type ______ 17692 ________ ________ 0.000
Error ______ 28067 ________
Total ______ 45759


Grouping Information Using Tukey Method

Type      N    Mean     Grouping
Meat      17   158.71   A
Beef      20   156.85   A
Poultry   17   118.76   B

Means that do not share a letter are significantly different.
Math 10 – Homework 9 Name________________________________

Q1 A manager is concerned that overtime (measured in hours) is contributing to more sickness (measured in sick
days) among the employees. Data records for 10 employees were sampled with the following results:

Partial computer output is attached on the next page.

a) Find the least squares line where Sick Days is dependent on Overtime. Interpret the slope.

b) Test the hypothesis that the regression model is significant (α = .10).

c) Find and interpret r², the coefficient of determination. (Blank Box)

d) Find the estimate of the standard deviation of the residual error. (Blank Box)

e) What would your prediction of sick days be for an employee who works 100 hours overtime?

f) Analyze the residuals and determine which pair of data is the most unusual.

g) Explain why this model would not be appropriate for an employee who works 500 hours overtime.
Regression Analysis

r²           ______     n           10
r            ______     k           1
Std. Error   ______     Dep. Var.   SickDays

ANOVA table
Source SS df MS F p-value
Regression 80.6944 1 80.6944 15.05 .0047
Residual 42.9056 8 5.3632
Total 123.6000 9

Regression output                                                  confidence interval
variables   coefficients   std. error   t (df=8)   p-value   95% lower   95% upper
Intercept   0.5369         1.3207       0.407      .6950     -2.5086     3.5824
Overtime    0.0621         0.0160       3.879      .0047     0.0252      0.0989

Observation   SickDays   Predicted   Residual
1             0.0        1.8         -1.8
2             0.0        2.1         -2.1
3             2.0        2.7         -0.7
4             7.0        3.0         4.0
5             3.0        3.2         -0.2
6             5.0        3.9         1.1
7             4.0        4.7         -0.7
8             11.0       7.5         3.5
9             7.0        8.9         -1.9
10            9.0        10.2        -1.2

[Plot: SickDays (vertical axis, 0 to 12) against Overtime (horizontal axis, 0 to 200)]

Predicted values for: SickDays

                        95% Confidence Intervals   95% Prediction Intervals
Overtime   Predicted    lower      upper           lower      upper          Leverage
100        6.742        4.696      8.788           1.023      12.461         0.147
500        31.564       15.563     47.564          14.696     48.432         8.977
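
The blank boxes in the regression summary can be recovered from the ANOVA table, as this sketch suggests. It assumes the standard relationships r² = SS Regression / SS Total and Std. Error = sqrt(MS Residual), and uses the rounded coefficients printed in the output, so the prediction differs slightly from the 6.742 shown in the table.

```python
import math

# Values copied from the ANOVA table and regression output
ss_regression = 80.6944
ss_total = 123.6000
ms_residual = 5.3632
intercept = 0.5369
slope = 0.0621

r_squared = ss_regression / ss_total     # coefficient of determination
r = math.sqrt(r_squared)                 # positive root because the slope is positive
std_error = math.sqrt(ms_residual)       # estimated std. deviation of the residual error

# Prediction for 100 hours of overtime using the rounded coefficients
predicted_100 = intercept + slope * 100
print(f"r^2 = {r_squared:.4f}, r = {r:.3f}, std error = {std_error:.3f}, "
      f"prediction at 100 hours = {predicted_100:.2f}")
```

The leverage column hints at the answer to part g: 500 hours lies far outside the observed overtime range of roughly 0 to 200 hours, so a prediction there is an extrapolation.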
