Stat Bivariate Data Analysis Project Olympics Fall 2020-1

This document analyzes data on winning times for the men's 110-meter hurdles at the Olympics from 1948 to 2016. It finds a strong linear negative relationship between Olympic year and winning time, with times decreasing by about 0.0138 seconds per year. The linear regression equation derived from the data accurately predicts past winning times and estimates a winning time of 12.83 seconds for the 2020 Olympics.

Uploaded by

api-403963757

Ila Langelotti

110m Men’s Hurdles

Bivariate Data Analysis Olympics Project


1. Winning time (seconds) by Olympic year
Year      1948   1952   1956   1960   1964   1968   1972   1976   1980
Time (s)  13.9   13.7   13.5   13.8   13.6   13.3   13.24  13.30  13.39
Year      1984   1988   1992   1996   2000   2004   2008   2012   2016
Time (s)  13.2   12.98  13.12  12.95  13.00  12.91  12.93  12.92  13.05
2.

3. Direction= negative
Form= Linear
Strength= Fairly Strong
4.
5. The scatterplot suggests that a linear model is appropriate for describing the relationship between Olympic
year and the winning time of the men's 110-meter hurdles because it shows a strong, linear, and negative
relationship. However, the residual plot does not support the linear model as well: the residuals follow a
curved pattern rather than being randomly scattered, and there are more data points below the line, not
equal amounts above and below. So the scatterplot supports a linear model for describing the data, but the
residual plot does not.
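The residual check described in item 5 can be sketched in Python. This is a minimal sketch, assuming the winning times as printed in item 1 and the rounded regression equation reported in item 7. The names `years`, `times`, `predict`, and `residuals` are illustrative and do not come from the original project.

```python
# Winning times (seconds) as printed in the table in item 1.
years = list(range(1948, 2017, 4))
times = [13.9, 13.7, 13.5, 13.8, 13.6, 13.3, 13.24, 13.30, 13.39,
         13.2, 12.98, 13.12, 12.95, 13.00, 12.91, 12.93, 12.92, 13.05]

def predict(year):
    """Predicted winning time from the rounded equation in item 7."""
    return 40.7057 - 0.0138 * year

# Residual = actual time - predicted time, one per Olympic year.
residuals = [t - predict(y) for y, t in zip(years, times)]

# Count how many points fall above and below the regression line.
above = sum(1 for r in residuals if r > 0)
below = sum(1 for r in residuals if r < 0)

for y, r in zip(years, residuals):
    print(y, round(r, 3))
print("above the line:", above, "below the line:", below)
```

A roughly even split of signs with no visible pattern would support the linear model. A lopsided split or long runs of same-sign residuals would suggest that a line is not the best fit.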
6. I think that postponing the 2020 Summer Olympics will not have an effect on my data set because there
simply will not be data for that year. It is not going to skew my data or cause an outlier; there will
just be an open spot for the year 2020.
7. Y=40.7057-0.0138X
Y= predicted gold time (seconds)
X= Olympic year
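A least-squares regression line like the one in item 7 can be fitted directly from the table in item 1. The sketch below is one way to do it in plain Python and is not necessarily the method used in the project (a calculator or spreadsheet would work equally well). The function name `least_squares_line` is hypothetical. The coefficients obtained from the table as printed may not match the reported 40.7057 and -0.0138 exactly.

```python
# Winning times (seconds) as printed in the table in item 1.
years = list(range(1948, 2017, 4))
times = [13.9, 13.7, 13.5, 13.8, 13.6, 13.3, 13.24, 13.30, 13.39,
         13.2, 12.98, 13.12, 12.95, 13.00, 12.91, 12.93, 12.92, 13.05]

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = least_squares_line(years, times)
print(f"Y = {intercept:.4f} + ({slope:.4f})X")
```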
8. The slope is -0.0138. For each additional year, the predicted winning time decreases by 0.0138 seconds.
9. The Y intercept is 40.7057. For Olympic year 0, the predicted winning time is 40.7057 seconds.
However, this prediction is not meaningful because no Olympics were held in year 0, so there is no winning time if no one competes.
10. Y=40.7057-0.0138(2020)
= 12.83 seconds
11. Y=40.7057-0.0138(1992)
=13.22
Residual= actual y-predicted y
=13.12-13.22
=-0.10
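The prediction and residual arithmetic in items 10 and 11 can be checked with a short Python sketch. It assumes the rounded equation from item 7 and uses the 13.12-second winning time, which the table in item 1 lists under 1992. The function names `predict` and `residual` are illustrative.

```python
def predict(year):
    """Predicted winning time (seconds) from the reported equation."""
    return 40.7057 - 0.0138 * year

def residual(year, actual_time):
    """Residual = actual winning time - predicted winning time."""
    return actual_time - predict(year)

# Prediction for the 2020 Games, rounded to two decimals.
print(round(predict(2020), 2))

# Residual for the 13.12 s winning time listed under 1992.
print(round(residual(1992, 13.12), 2))
```

A negative residual means the actual winning time was faster (smaller) than the line predicted.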
12. The correlation coefficient is -0.9185. The relationship is strong, linear, and negative since the
correlation coefficient is close to -1.
13. The coefficient of determination is 0.8436, which is (-0.9185)². In other words, 84.36% of the variation
in winning times can be explained by the linear relationship with the Olympic year.
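The correlation coefficient and the coefficient of determination can also be computed from the table in item 1. The following is a minimal sketch in plain Python with a hypothetical helper named `correlation`. Values computed from the table as printed may differ slightly from the reported -0.9185 and 0.8436.

```python
import math

# Winning times (seconds) as printed in the table in item 1.
years = list(range(1948, 2017, 4))
times = [13.9, 13.7, 13.5, 13.8, 13.6, 13.3, 13.24, 13.30, 13.39,
         13.2, 12.98, 13.12, 12.95, 13.00, 12.91, 12.93, 12.92, 13.05]

def correlation(xs, ys):
    """Pearson correlation coefficient r between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = correlation(years, times)
r_squared = r ** 2  # coefficient of determination for a simple linear fit
print(round(r, 4), round(r_squared, 4))
```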
14. Independent variable: Olympic Year
Mean=1982
Standard deviation=21.354
Dependent variable: Winning Time
Mean=13.27
Standard deviation=.3218
15. b= (-.9185) (.3218/21.354)
= -.01384
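16. a=13.27- (-.01384) (1982)
≈ 40.7009
Yes, the LSRL goes through the (x̄, ȳ) point: the intercept computed from the two means matches the intercept
of the regression equation (40.7057) up to rounding.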
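The calculations in items 15 and 16 follow the formulas b = r(s_y / s_x) and a = ȳ - b·x̄. The sketch below applies them to the reported summary statistics, using the mean year of 1982 that item 16 uses. Variable names are illustrative. Because the code keeps the unrounded slope, its intercept may differ in the third decimal place from the hand calculation.

```python
# Summary statistics as reported in items 12, 14, and 16.
r = -0.9185
mean_x, sd_x = 1982, 21.354    # Olympic year
mean_y, sd_y = 13.27, 0.3218   # winning time (seconds)

# Slope and intercept of the least-squares regression line.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x

# By construction, the line passes through (mean_x, mean_y).
predicted_at_mean = intercept + slope * mean_x

print(round(slope, 5), round(intercept, 4), round(predicted_at_mean, 2))
```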
