
Lecture 16 Confidence Interval

this document tells what confidence interval is and how to calculate it

Uploaded by

agam taneja
Understanding Confidence Intervals | Easy Examples & Formulas

Published on August 7, 2020 by Rebecca Bevans. Revised on June 22, 2023.

When you make an estimate in statistics, whether it is a summary statistic or a test statistic, there is always uncertainty around that estimate because the number is based on a sample of the population you are studying.

The confidence interval is the range of values that you expect your estimate to fall between a certain percentage of the time if you run your experiment again or re-sample the population in the same way.

The confidence level is the percentage of times you expect to reproduce an estimate between the upper and lower bounds of the confidence interval, and is set by the alpha value.

Table of contents

1. What exactly is a confidence interval?
2. Calculating a confidence interval: what you need to know
3. Confidence interval for the mean of normally-distributed data
4. Confidence interval for proportions
5. Confidence interval for non-normally distributed data
6. Reporting confidence intervals
7. Caution when using confidence intervals
8. Other interesting articles
9. Frequently asked questions about confidence intervals

What exactly is a confidence interval?

A confidence interval is the mean of your estimate plus and minus the variation in that estimate. This is the range of values you expect your estimate to fall between if you redo your test, within a certain level of confidence.

Confidence, in statistics, is another way to describe probability. For example, if you construct a confidence interval with a 95% confidence level, you are confident that 95 out of 100 times the estimate will fall between the upper and lower values specified by the confidence interval.

Your desired confidence level is usually one minus the alpha (α) value you used in your statistical test:

$$\text{Confidence level} = 1 - \alpha$$

So if you use an alpha value of p < 0.05 for statistical significance, then your confidence level would be 1 − 0.05 = 0.95, or 95%.
When do you use confidence intervals?

You can calculate confidence intervals for many kinds of statistical estimates, including:

* Proportions
* Population means
* Differences between population means or proportions
* Estimates of variation among groups

These are all point estimates, and don't give any information about the variation around the number. Confidence intervals are useful for communicating the variation around a point estimate.

Example: Variation around an estimate
You survey 100 Brits and 100 Americans about their television-watching habits, and find that both groups watch an average of 35 hours of television per week. However, the British people surveyed had a wide variation in the number of hours watched, while the Americans all watched similar amounts. Even though both groups have the same point estimate (average number of hours watched), the British estimate will have a wider confidence interval than the American estimate because there is more variation in the data.

Calculating a confidence interval: what you need to know

Most statistical programs will include the confidence interval of the estimate when you run a statistical test. If you want to calculate a confidence interval on your own, you need to know:

1. The point estimate you are constructing the confidence interval for
2. The critical values for the test statistic
3. The standard deviation of the sample
4. The sample size

Once you know each of these components, you can calculate the confidence interval for your estimate by plugging them into the confidence interval formula that corresponds to your data.
Point estimate

The point estimate of your confidence interval will be whatever statistical estimate you are making (e.g., population mean, the difference between population means, proportions, variation among groups).

Example: Point estimate
In the TV-watching example, the point estimate is the mean number of hours watched: 35.

Finding the critical value

Critical values tell you how many standard deviations away from the mean you need to go in order to reach the desired confidence level for your confidence interval. There are three steps to find the critical value.

1. Choose your alpha (α) value. The alpha value is the probability threshold for statistical significance. The most common alpha value is p = 0.05, but 0.1, 0.01, and even 0.001 are sometimes used. It's best to look at the research papers published in your field to decide which alpha value to use.
2. Decide if you need a one-tailed interval or a two-tailed interval. You will most likely use a two-tailed interval unless you are doing a one-tailed t test. For a two-tailed interval, divide your alpha by two to get the alpha value for the upper and lower tails.
3. Look up the critical value that corresponds with the alpha value.

If your data follows a normal distribution, or if you have a large sample size (n > 30) that is approximately normally distributed, you can use the z distribution to find your critical values. For a z statistic, some of the most common values are shown in this table:

Confidence level                      90%     95%     99%
alpha (α)                             0.1     0.05    0.01
alpha per tail, two-tailed CI (α/2)   0.05    0.025   0.005
z statistic                           1.64    1.96    2.58

If you are using a small dataset (n ≤ 30) that is approximately normally distributed, use the t distribution instead. The t distribution has the same bell shape as the z distribution, but with heavier tails that account for the extra uncertainty in small samples. For the t distribution, you need to know your degrees of freedom (sample size minus 1). Check out this set of t tables to find your t statistic.
We have included the confidence level and p values for both one-tailed and two-tailed tests to help you find the t value you need. For symmetric distributions, like the t distribution and z distribution, the critical value is the same on either side of the mean.

Example: Critical value
In the TV-watching survey, there are more than 30 observations and the data follow an approximately normal distribution (bell curve), so we can use the z distribution for our test statistics. For a two-tailed 95% confidence interval, the alpha value for each tail is 0.025, and the corresponding critical value is 1.96. This means that to calculate the upper and lower bounds of the confidence interval, we can take the mean ± 1.96 standard deviations of the estimate from the mean.

Finding the standard deviation

Most statistical software will have a built-in function to calculate your standard deviation, but to find it by hand you can first find your sample variance, then take the square root to get the standard deviation.

1. Find the sample variance. Sample variance is defined as the sum of squared differences from the mean divided by n − 1, also referred to here as the mean-squared-error (MSE):

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

To find the MSE, subtract your sample mean from each value in the dataset, square the resulting number, and divide that number by n − 1 (sample size minus 1). Then add up all of these numbers to get your total sample variance (s²). For larger sample sets, it's easiest to do this in Excel.

2. Find the standard deviation. The standard deviation of your estimate (s) is equal to the square root of the sample variance/sample error (s²):

$$s = \sqrt{s^2}$$

Example: Standard deviation
In the television-watching survey, the variance in the GB estimate is 100, while the variance in the USA estimate is 25. Taking the square root of the variance gives us a sample standard deviation (s) of:

* 10 for the GB estimate.
* 5 for the USA estimate.

Sample size

The sample size is the number of observations in your data set.
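The critical-value lookup and the standard-deviation steps described above can be sketched in Python. This is a minimal illustration using only the standard library, not part of the original article: the function names `z_critical` and `sample_std` and the `hours` data are hypothetical, and only the z distribution is covered (the standard library has no t distribution).

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence_level: float, two_tailed: bool = True) -> float:
    """Return the z critical value for a given confidence level."""
    alpha = 1.0 - confidence_level
    # For a two-tailed interval, alpha is split between the two tails.
    tail_area = alpha / 2.0 if two_tailed else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)


def sample_std(values) -> float:
    """Return the sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    return sqrt(variance)


# Hypothetical weekly TV hours, made up for illustration only.
hours = [30, 35, 40, 25, 45]

print(round(z_critical(0.95), 2))  # two-tailed 95% critical value
print(round(sample_std(hours), 2))
```

The same standard deviation is available directly as `statistics.stdev`, which also divides by n − 1.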
Example: Sample size
In our survey of Americans and Brits, the sample size is 100 for each group.

Confidence interval for the mean of normally-distributed data

Normally-distributed data forms a bell shape when plotted on a graph, with the sample mean in the middle and the rest of the data distributed fairly evenly on either side of the mean.

The confidence interval for data which follows a standard normal distribution is:

$$CI = \bar{X} \pm Z^* \frac{\sigma}{\sqrt{n}}$$

Where:

* CI = the confidence interval
* $\bar{X}$ = the population mean
* $Z^*$ = the critical value of the z distribution
* $\sigma$ = the population standard deviation
* $\sqrt{n}$ = the square root of the population size

The confidence interval for the t distribution follows the same formula, but replaces the $Z^*$ with the $t^*$.

In real life, you never know the true values for the population (unless you can do a complete census). Instead, we replace the population values with the values from our sample data, so the formula becomes:

$$CI = \bar{x} \pm Z^* \frac{s}{\sqrt{n}}$$

Where:

* $\bar{x}$ = the sample mean
* $s$ = the sample standard deviation

Example: Calculating the confidence interval
In the survey of Americans' and Brits' television watching habits, we can use the sample mean, sample standard deviation, and sample size in place of the population mean, population standard deviation, and population size.

To calculate the 95% confidence interval, we can simply plug the values into the formula.

For the USA:

$$CI = 35 \pm 1.96 \times \frac{5}{\sqrt{100}} = 35 \pm 0.98$$

So for the USA, the lower and upper bounds of the 95% confidence interval are 34.02 and 35.98.

For GB:

$$CI = 35 \pm 1.96 \times \frac{10}{\sqrt{100}} = 35 \pm 1.96$$

So for GB, the lower and upper bounds of the 95% confidence interval are 33.04 and 36.96.

Confidence interval for proportions

The confidence interval for a proportion follows the same pattern as the confidence interval for means, but in place of the standard deviation you use the sample proportion times one minus the proportion:

$$CI = \hat{p} \pm Z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Where:

* $\hat{p}$ = the proportion in your sample (e.g. the proportion of respondents who said they watched any television at all)
* $Z^*$ = the critical value of the z distribution
* $n$ = the sample size

Confidence interval for non-normally distributed data

To calculate a confidence interval around the mean of data that is not normally distributed, you have two choices:

1. You can find a distribution that matches the shape of your data and use that distribution to calculate the confidence interval.
2. You can perform a transformation on your data to make it fit a normal distribution, and then find the confidence interval for the transformed data.

Performing data transformations is very common in statistics, for example, when data follows a logarithmic curve but we want to use it alongside linear data. You just have to remember to do the reverse transformation on your data when you calculate the upper and lower bounds of the confidence interval.

Reporting confidence intervals

Confidence intervals are sometimes reported in papers, though researchers more often report the standard deviation of their estimate.

If you are asked to report the confidence interval, you should include the upper and lower bounds of the confidence interval.

Example: Reporting a confidence interval
"We found that both the US and Great Britain averaged 35 hours of television watched per week, although there was more variation in the estimate for Great Britain (95% CI = 33.04, 36.96) than for the US (95% CI = 34.02, 35.98)."

One place that confidence intervals are frequently used is in graphs. When showing the differences between groups, or plotting a linear regression, researchers will often include the confidence interval to give a visual representation of the variation around the estimate.
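The interval formulas for means and proportions given earlier can be checked numerically. The following is a minimal Python sketch under stated assumptions: it uses the z distribution only, the function names `mean_ci` and `proportion_ci` are hypothetical, and the proportion inputs (0.5 from 100 respondents) are made up for illustration. The mean inputs are the survey figures from the worked example (mean 35, standard deviations 5 and 10, n = 100).

```python
from math import sqrt
from statistics import NormalDist


def _z_star(confidence_level: float) -> float:
    """Two-tailed z critical value for the given confidence level."""
    return NormalDist().inv_cdf(1.0 - (1.0 - confidence_level) / 2.0)


def mean_ci(sample_mean, sample_sd, n, confidence_level=0.95):
    """Interval for a mean: x_bar +/- z* * s / sqrt(n)."""
    margin = _z_star(confidence_level) * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


def proportion_ci(p_hat, n, confidence_level=0.95):
    """Interval for a proportion: p_hat +/- z* * sqrt(p_hat * (1 - p_hat) / n)."""
    margin = _z_star(confidence_level) * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Survey figures from the worked example in the text.
usa = mean_ci(35, 5, 100)
gb = mean_ci(35, 10, 100)
print([round(b, 2) for b in usa])  # bounds for the USA estimate
print([round(b, 2) for b in gb])   # bounds for the GB estimate

# Hypothetical proportion: half of 100 respondents (illustration only).
prop = proportion_ci(0.5, 100)
print([round(b, 3) for b in prop])
```

For small samples, the z critical value would be swapped for a t critical value with n − 1 degrees of freedom, which requires a t table or a third-party statistics library.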
Example: Confidence interval in a graph
You may decide to plot the point estimates of the mean number of hours of television watched in the USA and Great Britain, with the 95% confidence interval around the mean.

Caution when using confidence intervals

Confidence intervals are sometimes interpreted as saying that the 'true value' of your estimate lies within the bounds of the confidence interval.

This is not the case. The confidence interval cannot tell you how likely it is that you found the true value of your statistical estimate because it is based on a sample, not on the whole population.

The confidence interval only tells you what range of values you can expect to find if you re-do your sampling or run your experiment again in the exact same way.

The more accurate your sampling plan, or the more realistic your experiment, the greater the chance that your confidence interval includes the true value of your estimate. But this accuracy is determined by your research methods, not by the statistics you do after you have collected the data!

Other interesting articles

If you want to know more about statistics, methodology, or research bias, make sure to check out some of our other articles with explanations and examples.

Statistics
* Normal distribution
* Kurtosis
* Descriptive statistics
* Measures of central tendency
* Correlation coefficient
* p value

Methodology
* Cluster sampling
* Stratified sampling
* Types of interviews
* Case study
* Cohort study
* Thematic analysis

Research bias
* Implicit bias
* Cognitive bias
* Survivorship bias
* Availability heuristic
* Nonresponse bias
* Regression to the mean

Frequently asked questions about confidence intervals

* What is the difference between a confidence interval and a confidence level?
* How do you calculate a confidence interval?
* What is a standard normal distribution?
* What are z-scores and t-scores?
* What is a critical value?
* What does it mean if my confidence interval includes zero?
* How do I calculate a confidence interval if my data are not normally distributed?

Cite this Scribbr article

Bevans, R. (2023, June 22). Understanding Confidence Intervals | Easy Examples & Formulas. Scribbr. Retrieved September 11, 2023, from [Link]

Rebecca Bevans
Rebecca is working on her PhD in soil ecology and spends her free time writing. She's very happy to be able to nerd out about statistics with all of you.
