© Scribbr
Understanding Confidence Intervals | Easy Examples & Formulas
Published on August 7, 2020 by Rebecca Bevans. Revised on June 22, 2023.
When you make an estimate in statistics, whether it is a summary statistic or a test statistic,
there is always uncertainty around that estimate because the number is based on a sample of
the population you are studying.
The confidence interval is the range of values that you expect your estimate to fall between
a certain percentage of the time if you run your experiment again or re-sample the population
in the same way.
The confidence level is the percentage of times you expect to reproduce an estimate
between the upper and lower bounds of the confidence interval, and is set by the alpha value.
Table of contents
What exactly is a confidence interval?
Calculating a confidence interval: what you need to know
Confidence interval for the mean of normally-distributed data
Confidence interval for proportions
Confidence interval for non-normally distributed data
Reporting confidence intervals
Caution when using confidence intervals
Other interesting articles
Frequently asked questions about confidence intervals

What exactly is a confidence interval?
A confidence interval is the mean of your estimate plus and minus the variation in that
estimate. This is the range of values you expect your estimate to fall between if you redo your
test, within a certain level of confidence.
Confidence, in statistics, is another way to describe probability. For example, if you construct
a confidence interval with a 95% confidence level, you are confident that 95 out of 100 times
the estimate will fall between the upper and lower values specified by the confidence interval.
Your desired confidence level is usually one minus the alpha (α) value you used in your
statistical test:
$$\text{Confidence level} = 1 - \alpha$$
So if you use an alpha value of p < 0.05 for statistical significance, then your confidence level
would be 1 - 0.05 = 0.95, or 95%.
When do you use confidence intervals?
You can calculate confidence intervals for many kinds of statistical estimates, including:
Proportions
Population means
Differences between population means or proportions
Estimates of variation among groups
These are all point estimates, and don’t give any information about the variation around the
number. Confidence intervals are useful for communicating the variation around a point
estimate.
Example: Variation around an estimate
You survey 100 Brits and 100 Americans about their television-watching habits, and
find that both groups watch an average of 35 hours of television per week.
However, the British people surveyed had a wide variation in the number of hours
watched, while the Americans all watched similar amounts.
Even though both groups have the same point estimate (average number of hours
watched), the British estimate will have a wider confidence interval than the American
estimate because there is more variation in the data.
Calculating a confidence interval: what you need to know
Most statistical programs will include the confidence interval of the estimate when you run a
statistical test.
If you want to calculate a confidence interval on your own, you need to know:
1. The point estimate you are constructing the confidence interval for
2. The critical values for the test statistic
3. The standard deviation of the sample
4. The sample size
Once you know each of these components, you can calculate the confidence interval for your
estimate by plugging them into the confidence interval formula that corresponds to your data.
Point estimate
The point estimate of your confidence interval will be whatever statistical estimate you are
making (e.g., population mean, the difference between population means, proportions,
variation among groups).
Example: Point estimate
In the TV-watching example, the point estimate is the mean number of hours watched:
35.
Finding the critical value
Critical values tell you how many standard deviations away from the mean you need to go in
order to reach the desired confidence level for your confidence interval.
There are three steps to find the critical value.
1. Choose your alpha (α) value.
The alpha value is the probability threshold for statistical significance. The most common
alpha value is 0.05, but 0.1, 0.01, and even 0.001 are sometimes used. It's best to look at
the research papers published in your field to decide which alpha value to use.
2. Decide if you need a one-tailed interval or a two-tailed interval.
You will most likely use a two-tailed interval unless you are doing a one-tailed t test.
For a two-tailed interval, divide your alpha by two to get the alpha value for the upper and
lower tails.
3. Look up the critical value that corresponds with the alpha value.
If your data follows a normal distribution, or if you have a large sample size (n > 30) that is
approximately normally distributed, you can use the z distribution to find your critical values.
For a z statistic, some of the most common values are shown in this table:
Confidence level                        90%     95%     99%
Alpha for one-tailed CI                 0.1     0.05    0.01
Alpha for two-tailed CI (alpha / 2)     0.05    0.025   0.005
z statistic                             1.64    1.96    2.57
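If you would rather compute a critical value than look it up, the three steps above can be sketched in Python. This is a minimal illustration using only the standard library's `statistics.NormalDist`; the function name `z_critical` is a hypothetical helper chosen for this example, not something the article defines.

```python
from statistics import NormalDist


def z_critical(confidence_level, two_tailed=True):
    """Return the z critical value for a confidence level such as 0.95."""
    alpha = 1 - confidence_level
    # For a two-tailed interval, alpha is split evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    # inv_cdf returns the z value that has the requested area to its left.
    return NormalDist().inv_cdf(1 - tail_area)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} two-tailed: z = {z_critical(level):.3f}")
```

The printed values should be close to the tabulated ones; small differences come from how many decimal places a table keeps.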
If you are using a small dataset (n ≤ 30) that is approximately normally distributed, use the t
distribution instead.
The t distribution has a similar bell shape to the z distribution, but with heavier tails that
account for the extra uncertainty in small samples. For the t distribution, you need to know
your degrees of freedom (sample size minus 1).
Check out this set of t tables to find your t statistic. We have included the confidence level
and p values for both one-tailed and two-tailed tests to help you find the t value you need.
For symmetric distributions, like the t distribution and z distribution, the critical value is the
same distance on either side of the mean.
Example: Critical value
In the TV-watching survey, there are more than 30 observations and the data follow an
approximately normal distribution (bell curve), so we can use the z distribution for our
test statistics.
For a two-tailed 95% confidence interval, the alpha value in each tail is 0.025, and the
corresponding critical value is 1.96.
This means that to calculate the upper and lower bounds of the confidence interval, we
can take the mean ± 1.96 standard errors, where the standard error is the standard
deviation divided by the square root of the sample size.
Finding the standard deviation
Most statistical software will have a built-in function to calculate your standard deviation, but
to find it by hand you can first find your sample variance, then take the square root to get the
standard deviation.
1. Find the sample variance
Sample variance is defined as the sum of squared differences from the mean divided by
n − 1, also known as the mean-squared-error (MSE):
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
To find the MSE, subtract your sample mean from each value in the dataset, square the
resulting number, and divide that number by n − 1 (sample size minus 1).
Then add up all of these numbers to get your total sample variance (s²). For larger sample
sets, it's easiest to do this in Excel.
2. Find the standard deviation.
The standard deviation of your estimate (s) is equal to the square root of the sample
variance (s²):
$$s = \sqrt{s^2}$$
Example: Standard deviation
In the television-watching survey, the variance in the GB estimate is 100, while the
variance in the USA estimate is 25. Taking the square root of the variance gives us a
sample standard deviation (s) of:
• 10 for the GB estimate.
• 5 for the USA estimate.
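The two by-hand steps can also be sketched in Python. The viewing hours below are hypothetical numbers made up for illustration (the article reports only the variances, not the raw survey data), and the last lines check the hand calculation against the standard library's `statistics.variance` and `statistics.stdev`.

```python
import math
import statistics

# Hypothetical weekly viewing hours for five respondents (illustration only).
hours = [30, 42, 35, 28, 40]

n = len(hours)
mean = sum(hours) / n

# Step 1: sum the squared differences from the mean and divide by n - 1.
sample_variance = sum((x - mean) ** 2 for x in hours) / (n - 1)

# Step 2: the standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_variance)

# The standard library should agree with the hand calculation.
assert math.isclose(sample_variance, statistics.variance(hours))
assert math.isclose(sample_sd, statistics.stdev(hours))

print(f"variance = {sample_variance:.2f}, standard deviation = {sample_sd:.2f}")
```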
Sample size
The sample size is the number of observations in your data set.
Example: Sample size
In our survey of Americans and Brits, the sample size is 100 for each group.
Confidence interval for the mean of normally-distributed data
Normally-distributed data forms a bell shape when plotted on a graph, with the sample mean
in the middle and the rest of the data distributed fairly evenly on either side of the mean.
The confidence interval for data which follows a normal distribution is:
$$CI = \bar{X} \pm Z^* \frac{\sigma}{\sqrt{n}}$$
Where:
• CI = the confidence interval
• X̄ = the population mean
• Z* = the critical value of the z distribution
• σ = the population standard deviation
• √n = the square root of the sample size
The confidence interval for the t distribution follows the same formula, but replaces the Z*
with the t*.
In real life, you never know the true values for the population (unless you can do a complete
census). Instead, we replace the population values with the values from our sample data, so
the formula becomes:
$$CI = \bar{x} \pm Z^* \frac{s}{\sqrt{n}}$$
Where:
• x̄ = the sample mean
• s = the sample standard deviation
Example: Calculating the confidence interval
In the survey of Americans’ and Brits’ television watching habits, we can use the
sample mean and sample standard deviation in place of the population mean and
population standard deviation, together with the sample size.
To calculate the 95% confidence interval, we can simply plug the values into the
formula.
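For the USA:
$$CI = 35 \pm 1.96 \times \frac{5}{\sqrt{100}} = 35 \pm 0.98$$
So for the USA, the lower and upper bounds of the 95% confidence interval are 34.02
and 35.98.
For GB:
$$CI = 35 \pm 1.96 \times \frac{10}{\sqrt{100}} = 35 \pm 1.96$$
So for GB, the lower and upper bounds of the 95% confidence interval are 33.04
and 36.96.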
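As a check on the arithmetic, the same calculation can be sketched in Python from the summary statistics in the example (mean 35, standard deviations 5 and 10, n = 100). The helper `mean_confidence_interval` is a hypothetical name used only for this illustration, and it assumes a two-tailed interval based on the z distribution.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(sample_mean, sample_sd, n, confidence_level=0.95):
    """Two-tailed z-based confidence interval for a mean from summary statistics."""
    # Critical value: split alpha evenly between the two tails.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Margin of error: critical value times the standard error s / sqrt(n).
    margin = z_star * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


usa = mean_confidence_interval(35, 5, 100)
gb = mean_confidence_interval(35, 10, 100)
print(f"USA: {usa[0]:.2f} to {usa[1]:.2f}")
print(f"GB:  {gb[0]:.2f} to {gb[1]:.2f}")
```

Rounded to two decimals, the bounds match the example: 34.02 to 35.98 for the USA and 33.04 to 36.96 for GB.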
Confidence interval for proportions
The confidence interval for a proportion follows the same pattern as the confidence interval
for means, but in place of the standard deviation you use the square root of the sample
proportion times one minus the proportion:
$$CI = \hat{p} \pm Z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
Where:
• p̂ = the proportion in your sample (e.g. the proportion of respondents who said they
watched any television at all)
• Z* = the critical value of the z distribution
• n = the sample size
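The proportion formula can be sketched the same way. The counts below (90 of 100 respondents) are hypothetical, since the article gives no figures for this example, and `proportion_confidence_interval` is an illustrative helper name. This z-based form is sometimes called the Wald interval and tends to work best when the sample is large and the proportion is not too close to 0 or 1.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence_level=0.95):
    """Two-tailed z-based confidence interval for a sample proportion."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Standard error of a proportion: sqrt(p_hat * (1 - p_hat) / n).
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical figures: 90 of 100 respondents said they watched any television.
lower, upper = proportion_confidence_interval(90, 100)
print(f"95% CI for the proportion: {lower:.3f} to {upper:.3f}")
```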
Confidence interval for non-normally distributed data
To calculate a confidence interval around the mean of data that is not normally distributed,
you have two choices:
1. You can find a distribution that matches the shape of your data and use that distribution
to calculate the confidence interval.
2. You can perform a transformation on your data to make it fit a normal distribution, and
then find the confidence interval for the transformed data.
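The second option might look like the following sketch, which log-transforms some right-skewed values, builds the interval on the log scale, and then reverses the transformation. The data are hypothetical and the z distribution is used for brevity; with a sample this small, the t distribution would normally be the better choice. Note also that after back-transforming from logs, the interval describes the geometric mean rather than the arithmetic mean.

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical right-skewed measurements (illustration only); all must be positive.
values = [1.2, 1.5, 2.1, 2.4, 3.0, 3.8, 5.5, 7.9, 12.0, 20.5]

# Transform the data, then build the interval on the log scale.
logs = [math.log(v) for v in values]
log_mean = statistics.mean(logs)
log_se = statistics.stdev(logs) / math.sqrt(len(logs))
z_star = NormalDist().inv_cdf(0.975)  # two-tailed 95% interval
log_lower = log_mean - z_star * log_se
log_upper = log_mean + z_star * log_se

# Reverse the transformation so the bounds are in the original units.
lower, upper = math.exp(log_lower), math.exp(log_upper)
geometric_mean = math.exp(log_mean)
print(f"geometric mean = {geometric_mean:.2f}, 95% CI = {lower:.2f} to {upper:.2f}")
```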
Performing data transformations is very common in statistics, for example, when data follows
a logarithmic curve but we want to use it alongside linear data. You just have to remember to
do the reverse transformation on your data when you calculate the upper and lower bounds
of the confidence interval.

Reporting confidence intervals
Confidence intervals are sometimes reported in papers, though researchers more often
report the standard deviation of their estimate.
If you are asked to report the confidence interval, you should include the upper and lower
bounds of the confidence interval.
Example: Reporting a confidence interval
"We found that both the US and Great Britain averaged 35 hours of television watched
per week, although there was more variation in the estimate for Great Britain (95% CI
= 33.04, 36.96) than for the US (95% CI = 34.02, 35.98)."
One place that confidence intervals are frequently used is in graphs. When showing the
differences between groups, or plotting a linear regression, researchers will often include the
confidence interval to give a visual representation of the variation around the estimate.
Example: Confidence interval in a graph
You may decide to plot the point estimates of the mean number of hours of television
watched in the USA and Great Britain, with the 95% confidence interval around the
mean.
Caution when using confidence intervals
Confidence intervals are sometimes interpreted as saying that the ‘true value’ of your
estimate lies within the bounds of the confidence interval.
This is not the case. The confidence interval cannot tell you how likely it is that you found the
true value of your statistical estimate because it is based on a sample, not on the whole
population.
The confidence interval only tells you what range of values you can expect to find if you re-do
your sampling or run your experiment again in the exact same way.
The more accurate your sampling plan, or the more realistic your experiment, the greater the
chance that your confidence interval includes the true value of your estimate. But this
accuracy is determined by your research methods, not by the statistics you do after you have
collected the data!
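To get a feel for what re-running the sampling looks like, you can simulate it. The sketch below is an idealized illustration only: it assumes a known, normally distributed population (mean 35, standard deviation 10, both hypothetical) and perfectly random samples, conditions that real studies rarely meet. It repeats the survey many times and counts how often the 95% interval contains the population mean.

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population and study design (assumptions for illustration).
TRUE_MEAN, TRUE_SD, N, RUNS = 35, 10, 100, 1000
z_star = NormalDist().inv_cdf(0.975)  # two-tailed 95% interval

covered = 0
for _ in range(RUNS):
    # Draw one random sample and build its confidence interval.
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.mean(sample)
    margin = z_star * statistics.stdev(sample) / math.sqrt(N)
    if mean - margin <= TRUE_MEAN <= mean + margin:
        covered += 1

coverage = covered / RUNS
print(f"{coverage:.1%} of the intervals contained the population mean")
```

Under these ideal conditions the share should land near 95%. A biased sampling plan would shift the sample means and lower that share, which is why the research methods matter more than the arithmetic.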
Other interesting articles
If you want to know more about statistics, methodology, or research bias, make sure to check
out some of our other articles with explanations and examples.
Statistics
• Normal distribution
• Kurtosis
• Descriptive statistics
• Measures of central tendency
• Correlation coefficient
• p value
Methodology
• Cluster sampling
• Stratified sampling
• Types of interviews
• Case study
• Cohort study
• Thematic analysis
Research bias
• Implicit bias
• Cognitive bias
• Survivorship bias
• Availability heuristic
• Nonresponse bias
• Regression to the mean
Frequently asked questions about confidence intervals
What is the difference between a confidence interval and a confidence level?
How do you calculate a confidence interval?
What is a standard normal distribution?
What are z-scores and t-scores?
What is a critical value?
What does it mean if my confidence interval includes zero?
How do I calculate a confidence interval if my data are not normally distributed?
Cite this Scribbr article
Bevans, R. (2023, June 22). Understanding Confidence Intervals | Easy Examples & Formulas.
Scribbr. Retrieved September 11, 2023, from [Link]
Rebecca Bevans
Rebecca is working on her PhD in soil ecology and spends her free time writing. She's very
happy to be able to nerd out about statistics with all of you.