Studying Bivariate Data
Math 9E Summative Assessment
Task:
You will find a data set that includes a minimum of 30 points (I recommend somewhere between 30 and
60 data points). You will then use the skills from this unit to complete the graphic organizer below by
analyzing the data and explaining why your conclusions are important to the general population.
Criterion C: communicating
Level 1-2
i. Use limited mathematical language.
ii. Use limited forms of mathematical representation to present information.
iii. Communicate through lines of reasoning that are difficult to understand.
Level 3-4
i. Use some appropriate mathematical language.
ii. Use appropriate forms of mathematical representation to present information adequately.
iii. Communicate through lines of reasoning that are complete.
iv. Adequately organize information using a logical structure.
Level 5-6
i. Usually use appropriate mathematical language.
ii. Usually use appropriate forms of mathematical representation to present information correctly.
iii. Usually move between different forms of mathematical representation.
iv. Communicate through lines of reasoning that are complete and coherent.
v. Present work that is usually organized using a logical structure.
Level 7-8
i. Consistently use appropriate mathematical language.
ii. Use appropriate forms of mathematical representation to consistently present information correctly.
iii. Move effectively between different forms of mathematical representation.
iv. Communicate through lines of reasoning that are complete, coherent and concise.
v. Present work that is consistently organized using a logical structure.
Task Specific Clarifications:
Reminders in order to do well:
● Be sure to use the technical/mathematical language that we practiced in this unit.
● Make sure that your explanations are clear and that they make sense.
● Make sure that you discuss both your graph and your equation to show that you understand both
representations of the data AND how they are related to each other.
Student Justification Teacher Justification
I think I
Criterion D: applying maths in real world contexts
Level 1-2
i. Identify some of the elements of the authentic real-life situation.
ii. Apply mathematical strategies to find a solution to the authentic real-life situation, with limited success.
Level 3-4
i. Identify the relevant elements of the authentic real-life situation.
ii. Select, with some success, adequate mathematical strategies to model the authentic real-life situation.
iii. Apply mathematical strategies to reach a solution to the authentic real-life situation.
iv. Discuss whether the solution makes sense in the context of the authentic real-life situation.
Level 5-6
i. Identify the relevant elements of the authentic real-life situation.
ii. Select adequate mathematical strategies to model the authentic real-life situation.
iii. Apply the selected mathematical strategies to reach a valid solution to the authentic real-life situation.
iv. Explain the degree of accuracy of the solution.
v. Explain whether the solution makes sense in the context of the authentic real-life situation.
Level 7-8
i. Identify the relevant elements of the authentic real-life situation.
ii. Select adequate mathematical strategies to model the authentic real-life situation.
iii. Apply the selected mathematical strategies to reach a correct solution to the authentic real-life situation.
iv. Justify the degree of accuracy of the solution.
v. Justify whether the solution makes sense in the context of the authentic real-life situation.
Task Specific Clarifications:
Reminders in order to do well:
● Make sure that your responses show critical thinking.
● Your responses need to show me how you connect the math part of the unit (statistics) with real
world applications (the “so what” part of the unit).
● Ensure you discuss the accuracy of your answers (trendlines, interpolating, extrapolating).
Student Justification Teacher Justification
I think I should get a 7 on this because I used the technical/mathematical
language that we practiced in this unit to communicate my thoughts.
All of my responses show my critical thinking. Lastly, my responses
show how I can connect the math part of the unit (statistics)
with real-world applications.
Your Work Begins Here
Provide the reference for your data set here.
Data link:
https://www.kaggle.com/mdhrumil/top-5000-youtube-channels-data-from-socialblade
APA citation:
Socialblade. (2018, September 13). Top 5000 youtube channels data from socialblade. Retrieved
December 13, 2018, from https://www.kaggle.com/mdhrumil/top-5000-youtube-channels-data-from-socialblade
Describe your data set. What are you comparing? Why do you think that this will be
interesting or important data to study? (Minimum 4 sentences)
About the data:
Socialblade is a well-known company that mainly focuses on providing data and statistics.
It works with YouTube, Instagram, and many other platforms, recording their data and
turning it into statistical charts. I found a data set, based on Socialblade's data, about the
top 5000 YouTube channels and some basic information about them.
There are 3 variables in this data: video uploads, subscribers, and video views.
I first noticed this data set when I typed the keyword “youtube” into the search bar of a
data app one day, and it showed up at the top of the results list, so I clicked on it. After I
looked at the data closely, I found it interesting because my favorite YouTuber was on the
top 5000 list. That is why I decided to choose this data set.
Identify the two variables that are being compared in this data. Which is the independent
variable and which is the dependent variable? (Minimum 2 sentences)
The variables I am comparing:
As described above, there are 3 different variables in this data. However, to keep things
simple I will compare only 2 of them: video uploads and video views, because these are the
two variables that interest me the most.
My x variable (independent) will be video uploads, and my y variable (dependent) will be
video views. I chose them this way because I think the number of video views a channel
gets depends on how many videos it has uploaded.
Select a Global Context and explain how this data relates to that Global Context. (Minimum
4 sentences)
The global context:
I chose this data set because I had been looking for a data set for Identities and
relationships, and this one is a really interesting data set that fits this global context. It
fits because it is all about how one individual connects to the wider world through the
internet. In more detail: when a video is uploaded to YouTube, we can think of that video as
an individual, because it represents a person by sharing that person’s life, hobbies, music,
etc. Other people watch the video and leave likes and comments, which share their ideas
and some of their experiences through the video. At this point, 2 different individuals are
connected to each other by one video, which means there is a new relationship between 2
individuals who never knew each other before. Video views give you a basic idea of how
many other individuals have connected with you by viewing your life. Video uploads show
how much you want to build relationships with others, and subscribers show how many
people enjoy your life.
Predict what kind of correlation you will find when you graph this data and explain why you
think it will be that type of correlation. (Minimum 3 sentences)
Prediction:
My prediction for this data set is that the more videos a YouTuber uploads, the more video
views the YouTuber will receive. Based on this prediction, the data will show a positive
linear association. I believe there will be a strong association between these two variables
because they are closely connected to each other.
Now use Google Sheets to create a scatterplot of your data. Insert your graph with a
trendline here.
What is the equation of the trendline and the R² value? What does your equation tell you
about the data? What does your R² value tell you about your equation? (Minimum 3
sentences)
Equation:
Y(views)= 145881x(uploads) + 5616336614
This equation comes from the formula for a linear function, which is Y=mx+b. Here m is the
slope of the line, and b is the y-intercept, which is the point where the line crosses the
vertical y-axis.
In this equation the y-intercept is 5616336614, so the line crosses the y-axis at
(0, 5616336614). This means the trendline predicts 5616336614 video views for a channel
with 0 uploads. The slope is 145881, which means that each time a YouTuber posts 1 more
video, the trendline predicts 145881 more video views than before.
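To see what the slope and y-intercept mean in practice, here is a minimal Python sketch of the trendline. The function name predicted_views is made up for this illustration; the slope and intercept are the values from the trendline equation Y(views)= 145881x(uploads) + 5616336614.

```python
# Trendline from the scatterplot: Y = m*x + b
SLOPE = 145881          # m: extra predicted views for each extra upload
INTERCEPT = 5616336614  # b: predicted views when uploads = 0


def predicted_views(uploads):
    """Hypothetical helper: views the trendline predicts for a number of uploads."""
    return SLOPE * uploads + INTERCEPT


# The y-intercept is the prediction at x = 0.
print(predicted_views(0))                          # 5616336614
# One more upload raises the prediction by exactly the slope.
print(predicted_views(101) - predicted_views(100))  # 145881
```

The second line of output does not depend on which two neighboring upload counts are chosen, because a straight line has the same slope everywhere.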
R² value: 0.117
The R² value tells me how strong the association between the x and y variables is. The
closer the R² value is to 1, the stronger the association; when the R² value equals 1, the
association is perfect. In this particular example, the association between video uploads
(x variable) and video views (y variable) is weak, because the R² value is 0.117, which is
not even close to 1. This means my prediction about the strength of the association (details
are in the Prediction section) was wrong: the correct description is a weak positive linear
association.
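One way to read the R² value is to convert it to the correlation coefficient r, since for a straight-line trendline R² is the square of r. The sketch below does this for the reported value 0.117. The helper name is hypothetical, and the sign of r is taken from the positive slope of the trendline.

```python
import math

R_SQUARED = 0.117   # value reported for the trendline
SLOPE = 145881      # the slope is positive, so r is positive


def correlation_from_r_squared(r_squared, slope):
    """Hypothetical helper: for a linear trendline, r = +/- sqrt(R^2), with the sign of the slope."""
    r = math.sqrt(r_squared)
    return r if slope >= 0 else -r


r = correlation_from_r_squared(R_SQUARED, SLOPE)
print(round(r, 3))                # 0.342
print(round(R_SQUARED * 100, 1))  # 11.7 (percent of variation explained)
```

An r of about 0.34 is a weak positive correlation, and an R² of 0.117 means the trendline accounts for only about 11.7% of the variation in video views.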
Based on your trendline, what predictions or inferences can you make about your data?
Provide an expected value (show your work) AND justify your result. (Minimum 4 sentences)
● You should have at least one piece of information that would be considered
interpolating.
● You should have at least one prediction that requires you to extrapolate your data.
● You should explain why both those points are significant.
● Number Predictions:
Interpolating prediction: If a YouTuber uploaded 75000 videos, then the predicted video
views would be 16557411614 views.
● How to get this answer:
Equation:
Y(views)= 145881x(uploads) + 5616336614
Y= (145881 × 75000) + 5616336614 = 16557411614
● Extrapolation prediction:
If a YouTuber wanted 70000000000 video views, then the YouTuber would need to upload
about 441344 videos.
● How to get this answer:
Equation:
Y(views)= 145881x(uploads) + 5616336614
70000000000= (145881 × X) + 5616336614
X= (70000000000 - 5616336614) ÷ 145881 ≈ 441344 videos
● Prediction:
The more videos a YouTuber uploads, the more video views the YouTuber will receive.
● Justify the graph:
1. Cluster - From the graph we know that most of the YouTubers uploaded only around
676 to 4710 videos to their channels. I think this is because, even though many
YouTubers are creative and like to share as many videos as possible, it is just
unrealistic to post many more videos than that: it is hard to produce a good-quality
video, and they need some time to relax too.
2. Outliers - From this graph we know that there is an outlier at (12661, 47548839843).
After checking the data, I found out that the YouTube account for this outlier is
called T-SERIES. T-SERIES is India's largest Music Label & Movie Studio, almost all of
India is a fan of T-SERIES, and India is one of the countries with the largest
populations. Just think: if every Indian watched the videos produced by T-SERIES,
how many video views would that produce in a day? Probably a lot. So I am not
surprised that T-SERIES is an outlier.
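The two number predictions above can be checked with a short Python sketch. The function names are invented for this illustration; the only values taken from the work above are the trendline's slope and intercept and the two inputs (75000 uploads and 70000000000 views).

```python
SLOPE = 145881
INTERCEPT = 5616336614


def views_for_uploads(uploads):
    """Hypothetical helper: plug x into Y = m*x + b."""
    return SLOPE * uploads + INTERCEPT


def uploads_for_views(views):
    """Hypothetical helper: solve Y = m*x + b for x, so x = (Y - b) / m."""
    return (views - INTERCEPT) / SLOPE


# Prediction from a number of uploads
print(views_for_uploads(75000))               # 16557411614
# Prediction from a target number of views, rounded to whole videos
print(round(uploads_for_views(70000000000)))  # 441344
```

Both results match the hand calculations. The second one is rounded, because the exact solution is not a whole number of videos.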
Why might someone be interested in your study? In other words, what makes this
interesting data to study? What information or deeper understanding does it provide? How
could people use the results of your study? (Minimum 4 sentences)
YouTubers would probably love this data set, because they make money from video views:
the more views, the more money the YouTuber gets. If a professional YouTuber looked at
this data, that YouTuber would get an idea of how many videos they should produce to
reach the number of video views they expect, and so earn the amount of money they want.
What could you recommend as a follow-up study? I.e.: How could your conclusions be
used to help people make decisions or to create a new study? (Minimum 4 sentences)
Many YouTubers need video views to make money, so I think the follow-up study should
look at the relationship between video views and how much money YouTubers earn. Then
YouTubers could set a very specific goal for how many video views they need for each video.
This study would still fit the same global context, Identities and relationships, because
all studies related to social media share the same idea of one individual connecting to the
wider world through the internet.