Lecture 02 - Review of Statistics - McClave
Statistics
11/9/2020
Statistics
◼ Descriptive Statistics
◼ Inferential Statistics
◼ Designed Experiment
Strict control over the experiment and the units
in the experiment
◼ Observational Study
Observe units in natural settings
No control over behavior of units
◼ Survey
Gallup, Harris and other polls
Nielsen
McClave, Statistics, 11th ed. Chapter 1: Statistics, Data and Statistical Thinking
Statistics
Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
◼ If $x_1 = 1$, $x_2 = 2$, $x_3 = 3$ and $x_4 = 4$, then
$\sum_{i=1}^{4} x_i = 1 + 2 + 3 + 4 = 10$
[Figure: distribution split 50% / 50%]
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
McClave, Statistics, 11th ed. Chapter 2: Methods for Describing Sets of Data
$s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(3 - 2)^2 + (2 - 2)^2 + (1 - 2)^2}{3 - 1}$
$s^2 = \frac{1^2 + 0^2 + 1^2}{2} = \frac{2}{2} = 1$
$s = \sqrt{s^2} = \sqrt{1} = 1$
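The variance and standard deviation calculation for the values 3, 2, 1 can be checked with a short Python sketch. The function names `sample_variance` and `sample_std` are illustrative and not part of the lecture.

```python
import math

def sample_variance(data):
    """Sample variance s^2, using the n - 1 divisor."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

def sample_std(data):
    """Sample standard deviation s = sqrt(s^2)."""
    return math.sqrt(sample_variance(data))

values = [3, 2, 1]  # the three measurements from the worked example
print(sample_variance(values))  # 1.0
print(sample_std(values))       # 1.0
```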
[Figure: Wins by Team at the 2007 MLB All-Star Break (axis from 30 to 55 wins)]
Box Plot
Interquartile Range (IQR) = $Q_U - Q_L$
[Figure: box plot of Wins by Team at the 2007 MLB All-Star Break (axis from 30 to 55 wins)]
Box Plot
[Figure: box plot of Wins by Team at the 2007 MLB All-Star Break (axis from 20 to 110 wins)]
(One team had its total wins for 2006 recorded)
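A recording error like this one shows up as an outlier in a box plot. The sketch below computes the quartiles and the IQR and flags values beyond the fences $Q_L - 1.5\,\mathrm{IQR}$ and $Q_U + 1.5\,\mathrm{IQR}$. The 1.5 multiplier is the common box-plot convention and is an assumption here, as are the win totals, which are made up and are not the real 2007 MLB data. Quartile conventions also differ between textbooks and software, so other tools may report slightly different values.

```python
import statistics

def iqr_outliers(data):
    """Return (Q_L, Q_U, IQR, outliers) using 1.5 * IQR fences."""
    q_l, _, q_u = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q_u - q_l
    lower_fence = q_l - 1.5 * iqr
    upper_fence = q_u + 1.5 * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return q_l, q_u, iqr, outliers

# Hypothetical win totals (NOT the real data); 97 mimics a full-season
# total recorded by mistake.
wins = [34, 36, 38, 39, 40, 42, 43, 44, 45, 47, 49, 52, 97]
q_l, q_u, iqr, outliers = iqr_outliers(wins)
print(q_l, q_u, iqr, outliers)
```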
Basic Graphs
◼ Bar graphs show numbers that are
independent of each other. Example data might
include the number of people who
preferred each of Chinese takeaways, Indian
takeaways and fish and chips.
◼ Pie charts show how a whole is divided
into different parts. You might, for example, want
to show how a budget was spent on
different items in a particular year.
◼ Line graphs show how numbers have
changed over time. They are used when you
have data that are connected, and to show
trends, for example, average night-time
temperature in each month of the year.
Advanced Graphs
Dashboards
Statistics
◼ The mean (expected value) of a discrete random variable x is
$\mu = E(x) = \sum x\,p(x)$.
◼ The variance of a discrete random variable x is
$\sigma^2 = E[(x - \mu)^2] = \sum (x - \mu)^2\,p(x)$.
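Both sums translate directly into code. The sketch below stores a distribution as a dictionary mapping each value x to p(x). The helper names and the coin-toss distribution are illustrative assumptions, not taken from the slides.

```python
def discrete_mean(dist):
    """mu = sum of x * p(x) over all values x."""
    return sum(x * p for x, p in dist.items())

def discrete_variance(dist):
    """sigma^2 = sum of (x - mu)^2 * p(x) over all values x."""
    mu = discrete_mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())

# Hypothetical example: number of heads in two tosses of a fair coin.
heads = {0: 0.25, 1: 0.5, 2: 0.25}
print(discrete_mean(heads))      # 1.0
print(discrete_variance(heads))  # 0.5
```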
On average, bettors lose about a nickel for each dollar they put down on a bet like this.
(These are the best bets for patrons.)
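The slide does not say which bet is meant. As an assumption for illustration only, take a $1 even-money bet on an American roulette wheel, where 18 of the 38 pockets win. The expected gain per dollar is then about minus a nickel, which matches the statement above.

```python
from fractions import Fraction

# ASSUMPTION: $1 even-money bet, 18 winning pockets out of 38
# (American roulette). The slide does not state the game.
p_win = Fraction(18, 38)
p_lose = 1 - p_win

# E(x) = sum of x * p(x): win $1 with p_win, lose $1 with p_lose.
expected_gain = 1 * p_win + (-1) * p_lose
print(expected_gain, float(expected_gain))
```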
$P(x) = \binom{n}{x} p^x q^{n - x}$
McClave, Statistics, 11th ed. Chapter 4: Discrete Random Variables
Mean: $\mu = np$
Variance: $\sigma^2 = npq$
Standard Deviation: $\sigma = \sqrt{npq}$
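A minimal sketch of the binomial probability and its summary measures, with q = 1 - p. The values n = 10 and p = 0.5 are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return math.comb(n, x) * p ** x * q ** (n - x)

n, p = 10, 0.5  # hypothetical parameters
mean = n * p                   # mu = np
variance = n * p * (1 - p)     # sigma^2 = npq
std_dev = math.sqrt(variance)  # sigma = sqrt(npq)

print(binomial_pmf(5, n, p))  # 0.24609375
print(mean, variance, round(std_dev, 3))
```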
$P(x = 5) = \frac{\lambda^x e^{-\lambda}}{x!} = \frac{3^5 e^{-3}}{5!} = .1008$
$P(x = 5) = \frac{\lambda^x e^{-\lambda}}{x!} = \frac{1.5^5 e^{-1.5}}{5!} = .0141$
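The two Poisson calculations (mean 3 and mean 1.5, both at x = 5) can be reproduced with the sketch below. The function name `poisson_pmf` is illustrative.

```python
import math

def poisson_pmf(x, lam):
    """P(x) = lam^x * e^(-lam) / x!"""
    return lam ** x * math.exp(-lam) / math.factorial(x)

print(round(poisson_pmf(5, 3), 4))    # 0.1008
print(round(poisson_pmf(5, 1.5), 4))  # 0.0141
```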
$P(a < x < b) = \frac{b - a}{d - c}$, for $c \le a < b \le d$
Mean: $\mu = \frac{c + d}{2}$
Standard Deviation: $\sigma = \frac{d - c}{\sqrt{12}}$
McClave: Statistics, 11th ed. Chapter 5: Continuous Random Variables
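A small sketch of the uniform distribution formulas. The endpoints c = 0 and d = 10 and the interval from 2 to 5 are hypothetical values chosen for illustration.

```python
import math

def uniform_prob(a, b, c, d):
    """P(a < x < b) for x uniform on [c, d], requiring c <= a < b <= d."""
    if not (c <= a < b <= d):
        raise ValueError("need c <= a < b <= d")
    return (b - a) / (d - c)

def uniform_mean(c, d):
    return (c + d) / 2

def uniform_std(c, d):
    return (d - c) / math.sqrt(12)

# Hypothetical example: x uniform on [0, 10].
print(uniform_prob(2, 5, 0, 10))     # 0.3
print(uniform_mean(0, 10))           # 5.0
print(round(uniform_std(0, 10), 4))  # 2.8868
```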
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}$
µ = the mean of x
σ = the standard deviation of x
π = 3.1416…
e = 2.71828…
$P(-1.00 < z < 0) = .3413$
$P(1 < z < 1.25) = P(0 < z < 1.25) - P(0 < z < 1.00)$
$= .3944 - .3413 = .0531$
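These table lookups can be checked with the standard library's `statistics.NormalDist`. The helper name `prob_between` is illustrative. Because the slide subtracts two table entries rounded to four places, the computed value for the second probability can differ from .0531 in the last digit.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def prob_between(lo, hi):
    """P(lo < z < hi) for a standard normal z."""
    return standard_normal.cdf(hi) - standard_normal.cdf(lo)

print(round(prob_between(-1.00, 0), 4))  # 0.3413
print(prob_between(1.00, 1.25))          # close to the table value .0531
```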
For a normally distributed random variable x, if we know µ and σ,
$z_i = \frac{x_i - \mu}{\sigma}$
So any normally distributed variable can be analyzed with this single distribution.
$P(x > 3100) = P\left(z > \frac{3100 - 3000}{50}\right) = P(z > 2.00)$
$= 1 - P(z < 2.00) = 1 - .5 - P(0 < z < 2.00)$
$= 1 - .5 - .4772 = .0228$
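The same upper-tail probability can be computed two ways: by standardizing to z first, or by asking for the tail of the normal distribution with µ = 3000 and σ = 50 directly. A sketch using the standard library:

```python
from statistics import NormalDist

mu, sigma = 3000, 50  # values read off the standardization above

# Route 1: standardize, then use the standard normal distribution.
z = (3100 - mu) / sigma
p_via_z = 1 - NormalDist().cdf(z)

# Route 2: work with the original normal distribution directly.
p_direct = 1 - NormalDist(mu, sigma).cdf(3100)

print(z, round(p_via_z, 4), round(p_direct, 4))  # 2.0 0.0228 0.0228
```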
Standard Deviation: $\sigma = \theta$
$P(x \ge a) = e^{-a/\theta}$
$P(x \ge 60) = e^{-60/45} = e^{-1.33} = .2645$
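A sketch of the exponential tail probability with a = 60 and mean 45. The slide rounds the exponent to -1.33 before evaluating, so the unrounded result differs slightly in the third decimal place. The function name is illustrative.

```python
import math

def exponential_tail(a, theta):
    """P(x >= a) = e^(-a / theta) for an exponential variable with mean theta."""
    return math.exp(-a / theta)

print(round(exponential_tail(60, 45), 4))  # 0.2636, exponent not rounded
print(round(math.exp(-1.33), 4))           # 0.2645, exponent rounded as on the slide
```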
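Sampling distribution of $\bar{x}$ for samples drawn from the population {1, 2, 3}:
n = 1: samples 1; 2; 3 with means 1, 2, 3; $\mu_{\bar{x}} = (1 + 2 + 3)/3 = 2$; $\sigma_{\bar{x}} = .82$
n = 2: samples (1, 2); (1, 3); (2, 3) with means 1.5, 2, 2.5; $\mu_{\bar{x}} = (1.5 + 2 + 2.5)/3 = 2$; $\sigma_{\bar{x}} = .41$
n = 3 (= N): sample (1, 2, 3) with mean 2; $\mu_{\bar{x}} = 2/1 = 2$; $\sigma_{\bar{x}} = 0$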
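The sampling-distribution values can be reproduced by listing every possible sample. The sketch assumes sampling without replacement with order ignored, which is what the listed samples imply. The function name is illustrative.

```python
from itertools import combinations
from statistics import mean, pstdev

population = [1, 2, 3]

def sampling_distribution(pop, n):
    """All sample means for samples of size n, with their mean and std. dev."""
    means = [mean(sample) for sample in combinations(pop, n)]
    return means, mean(means), pstdev(means)

for n in (1, 2, 3):
    means, mu_xbar, sigma_xbar = sampling_distribution(population, n)
    print(n, means, mu_xbar, round(sigma_xbar, 2))
```

The mean of the sample means stays at 2 for every n, while the standard deviation shrinks to 0 once the sample is the whole population.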
Statistics
$\mu = \bar{x} \pm z\,\sigma_{\bar{x}}$
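A minimal sketch of a z-based confidence interval, assuming the standard error is $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ and that σ is known. The sample mean, σ and n below are hypothetical; the slides give no data at this point.

```python
import math
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Interval xbar +/- z * sigma / sqrt(n) for the population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

# Hypothetical values: xbar = 100, sigma = 15, n = 36.
low, high = z_confidence_interval(100, 15, 36)
print(round(low, 2), round(high, 2))  # 95.1 104.9
```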
Statistics
Confidence Interval
Where on the number line do the data point us? µ?
(No prior idea about the value of the parameter.)
Hypothesis Test
Do the data point us to this particular value? µ0?
(We have a value in mind from the outset.)
Null Hypothesis: H0
• This will be supported unless the data provide evidence that it is false
• The status quo
Alternative Hypothesis: Ha
• This will be supported if the data provide sufficient evidence that it is true
• The research hypothesis
Type I Error: rejecting a true null hypothesis
P(Type I error) = α
Type II Error: failing to reject a false null hypothesis
P(Type II error) = β
When H0 is false, rejecting a false null hypothesis is Correct!
Note: Null hypotheses are either rejected, or else there is insufficient evidence
to reject them. (I.e., we do not accept a null hypothesis; we fail to reject it.)
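One-tailed test:
Ha: µ < µ0 or µ > µ0
Test Statistic: $z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$
Rejection Region: $|z| > z_{\alpha}$
Two-tailed test:
Ha: µ ≠ µ0
Test Statistic: $z = \frac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$
Rejection Region: $|z| > z_{\alpha/2}$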
Do not reject H0
McClave, Statistics, 11th ed. Chapter 8: Inferences Based on a Single Sample: Tests of Hypotheses
Suppose z = 2.12.
P(z > 2.12) = .0170.
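The upper-tail probability for z = 2.12 can be checked with the standard library. The helper name is illustrative. Doubling the value gives the two-tailed counterpart, which is shown only for comparison.

```python
from statistics import NormalDist

def upper_tail_p_value(z):
    """P(Z > z) for a standard normal Z."""
    return 1 - NormalDist().cdf(z)

p_one_tailed = upper_tail_p_value(2.12)
p_two_tailed = 2 * p_one_tailed

print(round(p_one_tailed, 4))  # 0.017
print(round(p_two_tailed, 4))  # 0.034
```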
$P(z \ge z^* \mid H_0)$
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
One-tailed test:
Ha: µ < µ0 or µ > µ0
Test Statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Rejection Region: $|t| > t_{\alpha}$
Two-tailed test:
Ha: µ ≠ µ0
Test Statistic: $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Rejection Region: $|t| > t_{\alpha/2}$
Conditions: 1) A random sample is selected from the target population.
2) The population from which the sample is selected is
approximately normal.
3) The value of $t_{\alpha}$ is based on (n – 1) degrees of freedom
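A sketch of the one-sample t statistic and its degrees of freedom. The data and the hypothesized mean are hypothetical. The critical value $t_{\alpha}$ or $t_{\alpha/2}$ still has to come from a t table (or a statistics package), since the Python standard library does not provide the t distribution.

```python
import math
from statistics import mean, stdev

def one_sample_t(data, mu0):
    """Return (t, degrees of freedom) for testing H0: mu = mu0."""
    n = len(data)
    standard_error = stdev(data) / math.sqrt(n)  # s / sqrt(n)
    t = (mean(data) - mu0) / standard_error
    return t, n - 1

# Hypothetical sample and hypothesized mean.
sample = [1, 2, 3, 4, 5]
t, df = one_sample_t(sample, mu0=2)
print(round(t, 3), df)  # 1.414 4
```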
Statistics
µ1 - µ2: Mean difference; difference in averages
p1 - p2: Difference between proportions, percentages, fractions or rates; compare proportions
σ1²/σ2²: Ratio of variances; difference in variability or spread; compare variation
Single sample: $\hat{\sigma}_{\bar{x}} = \frac{s}{\sqrt{n}}$
Two samples: $\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
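$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)} = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$\approx (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$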
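Test Statistic: $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sigma_{(\bar{x}_1 - \bar{x}_2)}}$
where $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
McClave, Statistics, 11th ed. Chapter 9: Inferences Based on Two Samples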
Test statistic: $z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sigma_{(\bar{x}_1 - \bar{x}_2)}} = \frac{-5.83}{2.08} = -2.799$
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
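$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)} = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$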
Test Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
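Test Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(78 - 82) - 0}{\sqrt{242.5\left(\frac{1}{21} + \frac{1}{21}\right)}} = -.832$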
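The pooled-variance t statistic with sample means 78 and 82, pooled variance 242.5 and 21 observations per sample can be checked with the sketch below. The helper `pooled_variance` uses the standard pooled-variance formula and is included for completeness; the slide only reports the pooled value, so the individual sample variances passed to it in the last line are hypothetical.

```python
import math

def pooled_variance(s1_sq, s2_sq, n1, n2):
    """s_p^2 = ((n1 - 1) s1^2 + (n2 - 1) s2^2) / (n1 + n2 - 2)."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def pooled_t(xbar1, xbar2, sp_sq, n1, n2, d0=0):
    """t = ((xbar1 - xbar2) - D0) / sqrt(s_p^2 (1/n1 + 1/n2))."""
    return ((xbar1 - xbar2) - d0) / math.sqrt(sp_sq * (1 / n1 + 1 / n2))

t = pooled_t(78, 82, 242.5, 21, 21)
print(round(t, 3))  # -0.832

# Hypothetical sample variances, only to show the pooling step.
print(pooled_variance(4, 6, 11, 11))  # 5.0
```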
Test Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(72 - 86) - 0}{\sqrt{\frac{154}{13} + \frac{163}{21}}} = -3.16$
t = −3.16
Degrees of Freedom: $\nu = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{(154/13 + 163/21)^2}{\frac{(154/13)^2}{13 - 1} + \frac{(163/21)^2}{21 - 1}} = 26.15 \approx 26$
$t_{.025,\,df = 26} = 2.056$
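The unequal-variance t statistic and its approximate degrees of freedom can be reproduced from the summary values 72 and 86 (means), 154 and 163 (variances), and 13 and 21 (sample sizes). The function names are illustrative. Computed without intermediate rounding, the degrees of freedom may differ from the slide's 26.15 in the second decimal place; the rounded-down value is 26 either way.

```python
import math

def unequal_variance_t(xbar1, xbar2, s1_sq, s2_sq, n1, n2, d0=0):
    """t = ((xbar1 - xbar2) - D0) / sqrt(s1^2/n1 + s2^2/n2)."""
    standard_error = math.sqrt(s1_sq / n1 + s2_sq / n2)
    return ((xbar1 - xbar2) - d0) / standard_error

def approximate_df(s1_sq, s2_sq, n1, n2):
    """Degrees of freedom formula for the unequal-variance t test."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

t = unequal_variance_t(72, 86, 154, 163, 13, 21)
df = approximate_df(154, 163, 13, 21)
print(round(t, 2), math.floor(df))  # -3.16 26

# Two-tailed decision at alpha = .05 using the table value 2.056.
print(abs(t) > 2.056)  # True
```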
Summary