Sampling Distribution and Central Limit Theorem: Session 2

This document discusses sampling distributions and the central limit theorem. It defines key concepts like populations, samples, random sampling, and how sample means are distributed. The central limit theorem states that as sample size increases, the distribution of sample means approaches a normal distribution, regardless of the shape of the population. This allows inferring properties of populations from sample statistics using normal distribution probabilities.

Uploaded by

Anyone Someone

Session 2

Sampling Distribution and Central Limit Theorem

• Statistical Inference

• Random Sampling

• Distribution of Sample Means

• Central Limit Theorem


Population:
The complete set of N items (people, objects,
transactions, or events) under investigation.

Example: All 1 million households in Hyderabad.

Sample:
The portion (or subset) of the population selected
for analysis.

Example: 1000 selected households in Hyderabad.

There are many different ways to select a sample from a population. The simplest way…

Random Sample:
A sample of size n drawn from N items in such a
way that each of the N items has the same chance
of being selected.
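
A minimal Python sketch of drawing such a sample, assuming the population can be listed as household IDs 0 to N - 1 (a hypothetical sampling frame, not something the slides specify). `random.sample` draws without replacement, so every item has the same chance of being selected; the seed is fixed only to make the run repeatable.

```python
import random

N = 1_000_000   # population size (households, as in the example above)
n = 1_000       # sample size

rng = random.Random(42)             # fixed seed: an assumption, for repeatability only
frame = range(N)                    # hypothetical sampling frame: household IDs 0..N-1
sample_ids = rng.sample(frame, n)   # n distinct IDs, each equally likely to be chosen

print(len(sample_ids), len(set(sample_ids)))  # both counts equal n: no ID is repeated
```

The randomization device here is Python's pseudo-random generator; in practice the hard part is usually obtaining a frame that really lists the whole population.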

Variable:
A numerical characteristic of an item (in the
population or sample).

Example: Annual household income.

Population Parameter:
A numerical measure, computed from a
population, which describes an aspect of the
whole population.

Example: Average annual household income across all 1 million households in Hyderabad.

Sample Statistic:
A numerical measure, computed from a sample,
which describes an aspect of the sample.

Example: Average annual household income across the 1000 selected households.
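
The difference between a parameter and a statistic can be sketched in Python. The incomes below are synthetic (drawn from a lognormal distribution with made-up settings), so the numbers illustrate the idea only and say nothing about real households.

```python
import random
import statistics

rng = random.Random(0)  # arbitrary seed, for repeatability only

# Synthetic population: these incomes are invented, not real data.
population = [rng.lognormvariate(13.0, 0.6) for _ in range(100_000)]

mu = statistics.fmean(population)       # population parameter (usually unknown)
sigma = statistics.pstdev(population)   # population standard deviation

sample = rng.sample(population, 1_000)  # simple random sample of n = 1000
x_bar = statistics.fmean(sample)        # sample statistic, estimates mu
s = statistics.stdev(sample)            # sample standard deviation (n - 1 divisor)

print(f"mu = {mu:,.0f}, x_bar = {x_bar:,.0f}")
print(f"sigma = {sigma:,.0f}, s = {s:,.0f}")
```

Rerunning with a different seed changes the sample statistics but not the parameters, which is the sense in which a statistic varies from sample to sample.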


Selecting the Sampling Frame

• A sampling frame is simply a list of items from which to draw a sample

• Does the sampling frame represent the population?
– e.g. Literary Digest vs. George Gallup polls

• The available list may differ from the desired list
– e.g. we do not have a list of customers who did not buy from a store

• Sometimes, no comprehensive sampling frame exists
– e.g., when forecasting for the future

Typical Pitfalls in Sampling

• Collecting data only from volunteers (voluntary response sample)
– e.g. online reviews (yelp.com, maps.google.com, tripadvisor.com)

• Picking easily available respondents (convenience sample)

– e.g. choosing to survey in In-Orbit mall

• A high rate of non-response (more than 70%)

– e.g. CEO / CIO surveys on some industry trends

Notation

We think of a random variable and its probability distribution as representing a population. In the prior chapters, we used $\mu$ and $\sigma$ to describe parameters of normal and binomial random variables.

To distinguish the average in a sample from the average in a population, we use different notation. Likewise for the standard deviation. We summarize this in a table:

                      Population (parameter)    Sample (statistic)
Mean                  $\mu$                     $\bar{X}$
Standard deviation    $\sigma$                  s

Estimation

In a large population, we will not know the parameters $\mu$ and $\sigma$. We will need to take a sample from the population and then compute $\bar{X}$ and s.

$\bar{X}$ is an estimator of the unknown $\mu$

s is an estimator of the unknown $\sigma$

For larger sample sizes, $\bar{X}$ and s will tend to give better estimates of the parameters.

Simulation:
We can demonstrate this with simulation. We fix known values, $\mu = 100$ and $\sigma = 15$ (as for IQ scores). We sample people one by one and update $\bar{X}$ and s after each new data point. So, if the first three people sampled have IQs of 102, 110, and 97, then $\bar{X}$ will be 102, 106, and 103 in turn. We update s similarly (it is defined once there are at least two data points). As we sample more people, $\bar{X}$ converges to 100 and s converges to 15.
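
The running update can be sketched in Python as follows. The helper name `running_stats` is hypothetical, and drawing IQs from a normal distribution is an assumption made for this illustration; the slides only give the mean and standard deviation.

```python
import random
import statistics

def running_stats(values):
    """Return the cumulative means and cumulative sample SDs of values."""
    means, sds = [], []
    for k in range(1, len(values) + 1):
        seen = values[:k]
        means.append(statistics.fmean(seen))
        # s needs at least two observations; record None before that
        sds.append(statistics.stdev(seen) if k >= 2 else None)
    return means, sds

# The three IQs from the text: the running means are 102, 106, 103.
means, sds = running_stats([102, 110, 97])
print(means)

# One simulated run of 400 people, assuming IQ ~ Normal(100, 15).
rng = random.Random(1)  # arbitrary seed, for repeatability only
iqs = [rng.gauss(100, 15) for _ in range(400)]
sim_means, sim_sds = running_stats(iqs)
print(round(sim_means[-1], 1), round(sim_sds[-1], 1))
```

Recomputing from scratch at every step is wasteful but keeps the sketch short; for 400 observations the cost is negligible.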

Here are 2 possible simulations, each using 400 people. (Be sure to read $\bar{X}$ on the left axis and s on the right axis.)

[Figure: two simulated runs, each titled "Cumulative Mean and Std Deviation". Horizontal axis: Observation Number (0 to 400). Left axis: $\bar{X}$ (60 to 120). Right axis: s (0 to 30).]

Tossing a Single Die

If we toss dice (or flip a coin), we consider the result a sample from a population. Consider a single die, tossed once. We know the probability distribution: if the die is fair, each of the six results has probability 1/6 ≈ 0.167.

[Figure: "Throw of one die". Bar chart of Probability (0.00 to 0.18) against Result (1 to 6).]

Sum of Two Dice

The sum of two dice is a random variable with 11 possible values between 2 and 12. Not all of the 11 results have the same probability:

[Figure: "Sum of 2 Dice". Bar chart of Probability (0.00 to 0.18) against Sum (2 to 12).]

$\bar{X}$ is a random variable:
What is the probability distribution of the Average of the 2 dice rolls instead of the Sum?

$$\bar{X} = \frac{X_1 + X_2}{2} = \frac{\text{Sum}}{2}$$
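
One way to answer this question is to enumerate the outcomes exactly. The sketch below assumes two fair, independent dice, so that all 36 ordered pairs are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely (die 1, die 2) pairs give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

sum_pmf = {total: Fraction(c, 36) for total, c in sorted(counts.items())}
# The average is sum / 2, so it takes half the values with the same probabilities.
avg_pmf = {Fraction(total, 2): p for total, p in sum_pmf.items()}

for avg, p in avg_pmf.items():
    print(f"average = {float(avg):>4}: probability = {p}")
```

The average has the same triangular shape as the sum; only the horizontal scale changes, from 2 to 12 down to 1 to 6.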

Example:
In general, we can throw any number of dice:

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Let’s do a simulation of the sum or average of simultaneous dice rolls. This will enable us to see the probability distribution of $\bar{X}$.
Go to:
http://onlinestatbook.com/stat_sim/sampling_dist/index.html
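
A similar experiment can be sketched offline in Python. This is not the applet linked above, only a rough stand-in that assumes fair dice; `simulate_dice_means` is a hypothetical helper name.

```python
import random
import statistics

def simulate_dice_means(n_dice, n_trials, rng):
    """Simulate n_trials averages, each taken over n_dice fair dice."""
    return [
        statistics.fmean([rng.randint(1, 6) for _ in range(n_dice)])
        for _ in range(n_trials)
    ]

rng = random.Random(2)  # arbitrary seed, for repeatability only
for n_dice in (1, 2, 5, 30):
    means = simulate_dice_means(n_dice, 10_000, rng)
    # The centre stays near 3.5 while the spread shrinks as n_dice grows.
    print(n_dice, round(statistics.fmean(means), 2), round(statistics.stdev(means), 2))
```

Plotting a histogram of `means` for each value of `n_dice` would show the shape moving from flat, to triangular, to bell-shaped.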

The Central Limit Theorem in Pictures


It turns out we have simple formulas to determine the mean and standard deviation of $\bar{X}$ in terms of n and the $\mu$ and $\sigma$ that we computed for a single die:

The mean of $\bar{X}$ is the mean of each individual roll:

$$\mu_{\bar{X}} = \mu$$

The standard deviation of $\bar{X}$ is smaller than the standard deviation of each individual roll:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
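
A short derivation sketch of these two formulas, assuming the rolls $X_1, \dots, X_n$ are independent and share the same mean $\mu$ and standard deviation $\sigma$ (independence is what allows the variances to add):

```latex
E[\bar{X}] = \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{n\mu}{n} = \mu,
\qquad
\operatorname{Var}(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i)
  = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}
\;\Longrightarrow\;
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}.
```

For a single fair die, $\mu = 3.5$ and $\sigma^2 = 35/12$, so $\sigma \approx 1.708$. With n = 2 dice this gives $\sigma_{\bar{X}} \approx 1.708/\sqrt{2} \approx 1.208$.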

In addition to the mean and standard deviation, we can also say something about the probability distribution of $\bar{X}$ for large values of n …

Statement of the Central Limit Theorem (C.L.T.):

$\bar{X}$ behaves more and more like a normally distributed random variable as n increases.

******************************************************************
This is very important because it says we can use the z-table for problems about $\bar{X}$ that start with any distribution, as long as n is large enough.
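
As an illustration of this kind of z-table calculation, the Python sketch below approximates the probability that the average of 30 fair dice exceeds 4. The choice of n = 30 and the threshold 4 are made up for the example; `statistics.NormalDist` plays the role of the z-table.

```python
from math import sqrt
from statistics import NormalDist

mu = 3.5                # mean of one fair die
sigma = sqrt(35 / 12)   # SD of one fair die, about 1.708
n = 30                  # number of dice averaged (illustrative choice)

# CLT approximation: X-bar is roughly Normal(mu, sigma / sqrt(n)).
x_bar_dist = NormalDist(mu, sigma / sqrt(n))

z = (4.0 - mu) / (sigma / sqrt(n))   # the z-score one would look up in a z-table
p = 1 - x_bar_dist.cdf(4.0)          # approximate P(X-bar > 4)

print(round(z, 2), round(p, 3))
```

The result is an approximation: the true distribution of the average of 30 dice is discrete, and the normal curve only stands in for it.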

Note that the mean of $\bar{X}$ stays the same (the dotted line) but the density function gets narrower as n increases. This is also obvious from the formulas:

$$\mu_{\bar{X}} = \mu, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

Note from the picture that for n = 30 the distribution of $\bar{X}$ looks normal (last row). How large n needs to be depends on what kind of distribution we start with. If it is very skewed (asymmetric), then we might need a larger n.

CLT is Valid When…

• Each data point in the sample is independent of the others.

• The sample size is large enough.

• A sample size of 30 is usually considered large enough to make $\bar{X}$ approximately normal, but there are more precise conditions:
– $n > 10 K_3^2$, where $K_3$ is the sample skewness, and
– $n > 10 |K_4|$, where $K_4$ is the sample kurtosis

• Adequate sample size depends on the distribution of the data.

• If the data are quite symmetric and have few outliers, even smaller samples are fine. Otherwise, we need larger samples.
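
The two sample-size conditions can be checked with a few lines of Python. This sketch assumes that $K_3$ and $K_4$ are computed from standardized values and that $K_4$ is the excess kurtosis (the average fourth power minus 3, which is 0 for a normal distribution); the slides do not spell out these definitions. The function name and the simulated data are hypothetical.

```python
import random
import statistics

def clt_sample_size_check(data):
    """Return (K3, K4, n_ok) for the two sample-size conditions."""
    n = len(data)
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)              # n - 1 divisor: an assumption
    z = [(x - mean) / sd for x in data]
    k3 = sum(v ** 3 for v in z) / n          # sample skewness
    k4 = sum(v ** 4 for v in z) / n - 3      # excess kurtosis (assumed definition)
    n_ok = n > 10 * k3 ** 2 and n > 10 * abs(k4)
    return k3, k4, n_ok

rng = random.Random(3)  # arbitrary seed, for repeatability only
symmetric = [rng.gauss(100, 15) for _ in range(30)]   # roughly symmetric data
skewed = [rng.expovariate(1.0) for _ in range(30)]    # right-skewed data
print(clt_sample_size_check(symmetric))
print(clt_sample_size_check(skewed))
```

With only 30 observations the estimates of skewness and kurtosis are themselves noisy, so the check is best read as a rough guide rather than a strict rule.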

Summary of Session 2

What is statistical inference?

• Statistical inference is the process of making probabilistic inferences about
population parameters based on sample statistics

How to (and how not to) choose a sample?

• You want a simple random sample (SRS). To draw one, you need a sampling frame that represents the population and a randomization device

What are sample statistics and their properties?

• Sample statistics are random variables because they vary across samples drawn
from the same population. They can be used as point estimates of the population
parameters

What is the central limit theorem and how is it useful?

• The central limit theorem implies that, no matter what the population distribution is, the sample mean ($\bar{X}$) is approximately normally distributed with mean $\mu$ and standard error $\sigma/\sqrt{n}$ when n is large enough.
