Basic Review w Sampling - Copy
REVIEW of
BASIC STATISTICAL CONCEPTS
for Preparation to Study Regression
Making Inferences
• Probability is the Language of Statistics
– Flip a Coin
• Population, Sample, Statistics
– Average Height of NYU Students
• Through Sampling, Make Inferences about the Population
Standard Notation
Measure              Sample   Population
Mean                 X̄        μ
Standard Deviation   S        σ
Variance             S²       σ²
Size                 n        N
Variation:
Statistical Analysis is All About Variability
Variance &
Standard Deviation
1. Measures of dispersion
2. Most common measures
3. Consider how data are distributed
4. Show variation about mean (X̄ or μ)
[Figure: data values on an axis marked 4, 6, 8, 10, 12, with the mean X̄ = 8.3 indicated]
Population Variance Formula
$$
\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}
= \frac{(X_1 - \mu)^2 + (X_2 - \mu)^2 + \cdots + (X_N - \mu)^2}{N}
$$
Use N in the denominator for a population variance.
Sample Variance Formula
$$
S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}
= \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}
$$
n - 1 in denominator!
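The difference between the two denominators can be checked numerically. The sketch below is illustrative only: the five data values are hypothetical, chosen so the arithmetic is easy to follow, and the standard-library `statistics` module is used purely as a cross-check.

```python
import math
import statistics

# Hypothetical data, chosen only to keep the arithmetic simple.
data = [4, 6, 8, 10, 12]

n = len(data)
mean = sum(data) / n                              # 8.0
sum_sq_dev = sum((x - mean) ** 2 for x in data)   # 16 + 4 + 0 + 4 + 16 = 40.0

population_variance = sum_sq_dev / n        # divide by N when the data are the whole population
sample_variance = sum_sq_dev / (n - 1)      # divide by n - 1 when the data are a sample

print(population_variance, sample_variance)

# Cross-check against the standard library's implementations.
print(math.isclose(population_variance, statistics.pvariance(data)))
print(math.isclose(sample_variance, statistics.variance(data)))
```

For the same data, the sample variance is always the larger of the two, because the same sum of squared deviations is divided by a smaller number.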
Sample Standard Deviation Formula
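$$
S = \sqrt{S^2}
= \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}
= \sqrt{\frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \cdots + (X_n - \bar{X})^2}{n - 1}}
$$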
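As a rough sketch of the sample standard deviation formula, the code below takes the square root of the sample variance by hand and compares it with `statistics.stdev`. The data values are hypothetical and stand in for any small sample.

```python
import math
import statistics

# Hypothetical sample, used only for illustration.
sample = [4, 6, 8, 10, 12]

n = len(sample)
x_bar = sum(sample) / n

# Sample variance: squared deviations about the sample mean, divided by n - 1.
s_squared = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# Sample standard deviation: square root of the sample variance.
s = math.sqrt(s_squared)

print(round(s, 4))
print(math.isclose(s, statistics.stdev(sample)))
```

Unlike the variance, S is expressed in the same units as the original data, which is why it is the more common measure to report.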
Interpreting Standard Deviation: Empirical Rule
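The empirical rule is usually stated as: for roughly bell-shaped data, about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3. A minimal sketch, assuming an exactly normal distribution and Python's standard-library `statistics.NormalDist` (Python 3.8+), reproduces those percentages. The helper name `within_k_sd` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def within_k_sd(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within_k_sd(k):.4f}")
```

Real data are only approximately normal, so observed proportions will differ somewhat from these values.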
[Figure: three scatter plots against x, showing a positive relationship, a negative relationship, and no relationship]
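Probability Distributions for Continuous Random Variables
Continuous Probability Density Function
3. Properties
$$\int_{\text{all } x} f(x)\,dx = 1 \quad \text{(area under curve)}$$
$$f(x) \ge 0, \quad a \le x \le b$$
[Figure: density curve f(x) plotted against value x, from a to b]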
Continuous Random Variable Probability
$$P(a \le x \le b) = \int_a^b f(x)\,dx$$
Probability Is Area Under Curve!
[Figure: density curve f(x) with the area under the curve between x = a and x = b]
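To make "probability is area under the curve" concrete, the sketch below approximates the integral with the trapezoid rule and compares it with the exact value from the cumulative distribution function. The standard normal density and the limits a = -1, b = 1 are assumptions chosen for illustration, and `area_under_pdf` is a hypothetical helper name.

```python
from statistics import NormalDist

def area_under_pdf(pdf, a, b, steps=10_000):
    """Approximate the integral of pdf from a to b with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (pdf(a) + pdf(b))      # end points count half
    for i in range(1, steps):
        total += pdf(a + i * width)      # interior points count fully
    return total * width

# Assumed example density: the standard normal.
z = NormalDist()

approx = area_under_pdf(z.pdf, -1.0, 1.0)   # numerical area between a and b
exact = z.cdf(1.0) - z.cdf(-1.0)            # same probability from the CDF

print(round(approx, 6), round(exact, 6))

# The total area under the density is (very nearly) 1 over a wide enough range.
print(round(area_under_pdf(z.pdf, -8.0, 8.0), 6))
```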
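P(z > 1.26) = .5000 – .3962 = .1038
[Figure: standardized normal curve (μ = 0, σ = 1); the area between 0 and z = 1.26 is .3962 and the area to the right of the mean is .5000]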
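The Standard Normal Table: P(–2.78 ≤ z ≤ –2.00)
Standardized Normal Distribution
P(–2.78 ≤ z ≤ –2.00) = .4973 – .4772 = .0201
[Figure: standardized normal curve (μ = 0, σ = 1); table areas of .4973 for z = 2.78 and .4772 for z = 2.00, with the area between z = –2.78 and z = –2.00 shaded; shaded area exaggerated]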
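The table lookups for P(z > 1.26) and P(–2.78 ≤ z ≤ –2.00) can be cross-checked in software. This is a sketch using the standard-library `statistics.NormalDist` (Python 3.8+) rather than a printed table; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mu = 0, sigma = 1

# P(z > 1.26): area in the upper tail beyond 1.26.
upper_tail = 1.0 - z.cdf(1.26)

# P(-2.78 <= z <= -2.00): area between the two z values.
interval = z.cdf(-2.00) - z.cdf(-2.78)

print(round(upper_tail, 4))
print(round(interval, 4))
```

The table areas are rounded to four decimal places before subtracting, so the table answer for the interval (.0201) can differ from the software answer in the last digit.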
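Non-standard Normal μ = 5, σ = 10: P(5 < X < 6.2)
$$Z = \frac{X - \mu}{\sigma} = \frac{6.2 - 5}{10} = .12$$
P(5 < X < 6.2) = P(0 < Z < .12) = .0478
[Figure: normal distribution (μ = 5, σ = 10) with the area from 5 to 6.2 shaded, beside the standardized normal distribution (μ = 0, σ = 1) with the area from 0 to .12 shaded; shaded area exaggerated]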
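The standardization step can be sketched in code: convert X to Z, then confirm that the probability is the same whether it is computed on the original scale or the standardized one. The values μ = 5, σ = 10 and the limits 5 and 6.2 come from the example; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 5.0, 10.0

# Standardize the upper limit: Z = (X - mu) / sigma.
z_upper = (6.2 - mu) / sigma

# Probability computed directly on the original (non-standard) scale.
x_dist = NormalDist(mu, sigma)
p_direct = x_dist.cdf(6.2) - x_dist.cdf(5.0)

# Same probability on the standardized scale, P(0 < Z < z_upper).
z_dist = NormalDist()
p_standardized = z_dist.cdf(z_upper) - z_dist.cdf(0.0)

print(round(z_upper, 2))
print(round(p_direct, 4), round(p_standardized, 4))
```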
Sampling Distributions
Common Statistics & Parameters
                      Sample Statistic   Population Parameter
Mean                  X̄                  μ
Standard Deviation    S                  σ
Variance              S²                 σ²
Binomial Proportion   p̂                  p
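Properties of the Sampling Distribution of x̄
Sampling from Normal Populations
• Central Tendency
$$\mu_{\bar{x}} = \mu$$
• Dispersion
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
– Sampling with replacement
[Figure: population distribution with μ = 50 and σ = 10; sampling distributions of x̄ centered at μ_x̄ = 50, with σ_x̄ = 5 for n = 4 and σ_x̄ = 2.5 for n = 16]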
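The relationship between sample size and the spread of the sample mean can be checked by simulation. The sketch below assumes a normal population with μ = 50 and σ = 10, as in the example; the number of trials, the random seed, and the helper names `standard_error` and `simulate_sample_means` are arbitrary choices for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0  # population parameters from the example

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def simulate_sample_means(n, trials=5_000):
    """Draw `trials` samples of size n from the normal population; return their means."""
    return [sum(random.gauss(MU, SIGMA) for _ in range(n)) / n for _ in range(trials)]

for n in (4, 16):
    means = simulate_sample_means(n)
    print(
        f"n={n}: theoretical sd={standard_error(SIGMA, n):.2f}, "
        f"simulated mean={statistics.fmean(means):.2f}, "
        f"simulated sd={statistics.stdev(means):.2f}"
    )
```

The simulated values will not match the theoretical ones exactly, but they should be close: the means cluster around 50, and quadrupling n halves the spread.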
Sampling from
Non-Normal Populations
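• Central Tendency
$$\mu_{\bar{x}} = \mu$$
• Dispersion
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
[Figure: population distribution with μ = 50 and σ = 10; sampling distributions of x̄ centered at μ_x̄ = 50, with σ_x̄ = 5 for n = 4 and σ_x̄ = 1.8 for n = 30]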
Central Limit Theorem
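As sample size gets large enough (n ≥ 30), the sampling distribution becomes almost normal.
$$\mu_{\bar{x}} = \mu, \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
[Figure: sampling distribution of x̄]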
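A small simulation can illustrate the Central Limit Theorem. The sketch below assumes a strongly right-skewed population (exponential with mean 1, chosen only as an example of a non-normal population) and measures how the skewness of the sample means shrinks toward 0, the value for a normal distribution, as n grows. The helper names, seed, and trial count are arbitrary.

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

def skewness(values):
    """Sample skewness: average cubed z-score (0 for a symmetric distribution)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** 3 for v in values) / n

def sample_means(n, trials=5_000):
    """Means of `trials` samples of size n from an exponential population (mean 1)."""
    return [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(trials)]

for n in (1, 4, 30):
    means = sample_means(n)
    print(f"n={n}: skewness of sample means = {skewness(means):.2f}")
```

With n = 1 the "sample means" are just the skewed population values; by n = 30 the distribution of sample means is much closer to symmetric, which is the behavior the theorem describes.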