Predictive Numericals 20 Questions

Uploaded by

fwtngwf47h


1. Given the following datasets:

(a) Calculate the mean, median, and mode (if it exists) for the following data.

Data: 78, 85, 92, 67, 70, 88, 92, 81, 85, 79

(b) Determine the missing value x for the following data, given that the mean is 27.

Data: 23, 27, 31, 22, 30, 28, x, 26, 29

(c) Calculate all three measures of central tendency, and identify which measure(s) is/are
affected by the outlier (250).

Data: 42, 45, 47, 44, 43, 46, 44, 250
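The three parts of Question 1 can be checked with Python's standard `statistics` module. This is only a sketch for verifying hand calculations, not a prescribed solution method; the variable names are illustrative.

```python
import statistics

# (a) mean, median, and mode(s); multimode returns every most-frequent value
data_a = [78, 85, 92, 67, 70, 88, 92, 81, 85, 79]
mean_a = statistics.mean(data_a)
median_a = statistics.median(data_a)
modes_a = statistics.multimode(data_a)

# (b) the mean of 9 values is 27, so their total must be 9 * 27
known_b = [23, 27, 31, 22, 30, 28, 26, 29]
x = 27 * (len(known_b) + 1) - sum(known_b)

# (c) compare each measure with and without the outlier 250
data_c = [42, 45, 47, 44, 43, 46, 44, 250]
without_outlier = [v for v in data_c if v != 250]
for name, fn in [("mean", statistics.mean),
                 ("median", statistics.median),
                 ("mode", statistics.mode)]:
    print(name, fn(data_c), fn(without_outlier))
```

Printing each measure with and without 250 shows directly which ones move when the outlier is removed.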

2. The number of books read by students in a semester is as follows:

Number of Books    Frequency

0-2                4
3-5                8
6-8                6
9-11               2

• Estimate the mean number of books read using the midpoint method.
• Determine the modal class.
• Find the median class.
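A short sketch of the midpoint method for the grouped data in Question 2. It assumes each class is represented by the average of its stated limits, and that the median class is the first class whose cumulative frequency reaches half the total; the names are illustrative.

```python
# Class limits and frequencies from the table
classes = [(0, 2), (3, 5), (6, 8), (9, 11)]
freqs = [4, 8, 6, 2]

n = sum(freqs)
midpoints = [(lo + hi) / 2 for lo, hi in classes]

# Midpoint method: weight each class midpoint by its frequency
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / n

# Modal class: the class with the highest frequency
modal_class = classes[freqs.index(max(freqs))]

# Median class: first class whose cumulative frequency reaches n / 2
cumulative = 0
for cls, f in zip(classes, freqs):
    cumulative += f
    if cumulative >= n / 2:
        median_class = cls
        break
```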

3. A company conducted a survey of annual incomes (in thousands) of employees in two departments:

Department A: 35, 40, 38, 50, 100, 120
Department B: 35, 38, 39, 40, 41, 42

• Calculate the mean, median and mode for both departments.

• Which department has more income disparity? Justify your answer using the calculated
measures.

4. The following are the weekly working hours of employees in a company:

Data: 32, 35, 40, 38, 42, 45, 39, 44, 46, 41, 48, 50, 36, 43, 49

• Calculate the first quartile Q1, the second quartile Q2 (median) and the third quartile Q3.
• Find the five-number summary and construct a box plot based on it.
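Quartile conventions differ between textbooks, so the sketch below is one possible check rather than the definitive answer for Question 4. It uses `statistics.quantiles` with its default "exclusive" method, which places the k-th quartile at position k(n + 1)/4 in the sorted data.

```python
import statistics

hours = [32, 35, 40, 38, 42, 45, 39, 44, 46, 41, 48, 50, 36, 43, 49]

# quantiles() sorts the data itself; n=4 returns the three quartile cut points
q1, q2, q3 = statistics.quantiles(hours, n=4)

# Five-number summary: minimum, Q1, median, Q3, maximum
five_number_summary = (min(hours), q1, q2, q3, max(hours))
```

The five values in `five_number_summary` are the ones a box plot draws: the whisker ends, the box edges, and the median line.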

5. Two sets of exam scores are given:

Class A: 55, 60, 65, 70, 75, 80, 85, 90
Class B: 45, 50, 55, 60, 75, 80, 85, 95

• Calculate the quartiles for both classes.

• Which class has a higher interquartile range (IQR)?

6. The following are the yearly expenses (in thousands) of a group of individuals:

Data: 150, 160, 165, 170, 175, 180, 185, 200, 250, 500

• Identify any outliers in the data using the IQR method.
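A sketch of the IQR rule for Question 6, assuming the usual fences at 1.5 × IQR beyond the quartiles and the default "exclusive" quartile method of `statistics.quantiles`. Other quartile conventions give slightly different fences.

```python
import statistics

expenses = [150, 160, 165, 170, 175, 180, 185, 200, 250, 500]

q1, _, q3 = statistics.quantiles(expenses, n=4)
iqr = q3 - q1

# Fences at 1.5 * IQR below Q1 and above Q3
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Any value outside the fences is flagged as an outlier
outliers = [v for v in expenses if v < lower_fence or v > upper_fence]
```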

7. The following table shows the ages of employees in a company:

Age (years)    Frequency

20-25          5
26-30          7
31-35          12
36-40          8
41-45          5

• Estimate the first quartile Q1, the median/second quartile Q2 and the third quartile Q3.

8. Two employees track their monthly sales over the past year:

Employee A: 12, 15, 14, 16, 18, 17, 19, 20, 22, 24, 21, 25

Employee B: 10, 30, 20, 40, 50, 25, 35, 45, 55, 30, 25, 60

• Calculate the variance for both employees’ monthly sales.

• Which employee shows more variability in their sales?
Justify your answer with a proper statistical measure.
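One way to compare the two employees in Question 8 is sketched below. It treats the twelve months as a sample (dividing by n − 1) and uses the coefficient of variation as the relative measure; both choices are assumptions, and the population variance could be used instead.

```python
import statistics

sales_a = [12, 15, 14, 16, 18, 17, 19, 20, 22, 24, 21, 25]
sales_b = [10, 30, 20, 40, 50, 25, 35, 45, 55, 30, 25, 60]

def summarize(sales):
    """Return (sample variance, coefficient of variation) for a list of sales."""
    var = statistics.variance(sales)  # divides by n - 1
    cv = statistics.stdev(sales) / statistics.mean(sales)
    return var, cv

var_a, cv_a = summarize(sales_a)
var_b, cv_b = summarize(sales_b)
```

The coefficient of variation divides the standard deviation by the mean, so the comparison still holds when the two employees sell at different average levels.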

9. Given the following dataset of the sizes (in square feet) and corresponding prices (in thousands)
of 8 houses:

Size (sq ft)    Price (in thousands)

1500            300
1800            360
2100            420
2400            480
2600            520
3000            600
3200            640
3500            700

• Calculate the variance and standard deviation of the house sizes.

• Calculate the population variance and population standard deviation of the house prices.
• Calculate the sample variance and sample standard deviation of the house prices, assuming
this is a sample of a larger population.
• If every house price is increased by 100 thousand, calculate the new variance and standard
deviation. Comment on the effect of adding a constant to all data points on the variance and
standard deviation.
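The population and sample measures for the house prices in Question 9, and the effect of the 100-thousand shift, can be checked with a sketch like the following. The variable names are illustrative.

```python
import statistics

prices = [300, 360, 420, 480, 520, 600, 640, 700]  # in thousands
shifted = [p + 100 for p in prices]

# Population measures divide by n; sample measures divide by n - 1
pop_var, pop_sd = statistics.pvariance(prices), statistics.pstdev(prices)
samp_var, samp_sd = statistics.variance(prices), statistics.stdev(prices)

# Adding a constant shifts every value and the mean equally,
# so the deviations from the mean (and hence the spread) are unchanged
shifted_var, shifted_sd = statistics.pvariance(shifted), statistics.pstdev(shifted)
```

The sample variance equals the population variance multiplied by n/(n − 1), here 8/7.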

10. The following table shows the distribution of marks obtained by students in a test:

Marks (Range)    Frequency

0-20             3
21-40            7
41-60            12
61-80            6
81-100           2

• Estimate the variance and standard deviation of the marks.

11. The following are the exam scores of 20 students:

Data: 45, 50, 55, 60, 62, 63, 65, 68, 70, 72, 74, 75, 78, 80, 82, 83, 85, 88, 90, 95

• Construct a histogram for the data using class intervals of width 10.
• Describe the shape of the distribution (e.g., symmetric, skewed).

12. Given the following data set of incomes (in thousands):

Data: 22, 25, 28, 30, 35, 40, 42, 45, 48, 50

• Calculate the quartiles (Q1, Q2, and Q3).

• Construct a quantile plot using the calculated quantiles.

13. The following are the heights (in cm) of 10 individuals:

Data: 150, 152, 154, 156, 158, 160, 162, 164, 166, 168

• Generate a Q-Q plot to check if the data follows a normal distribution.

• Interpret the Q-Q plot and discuss whether the data appears to be normally distributed.

14. The following table provides data on hours studied and exam scores for 8 students:

Hours Studied    Exam Scores

5                50
6                55
7                60
8                65
9                72
10               74
11               80
12               85

• Create a scatter plot for the data.

• Comment on the relationship between hours studied and exam scores. Is there a positive
or negative correlation?
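The direction and strength of the relationship in Question 14 can also be checked numerically. The sketch below computes the Pearson correlation coefficient from deviations about the means; the helper `pearson_r` is an illustrative name, not part of the question.

```python
import math

hours = [5, 6, 7, 8, 9, 10, 11, 12]
scores = [50, 55, 60, 65, 72, 74, 80, 85]

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations about the means."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(hours, scores)
```

A value of `r` close to +1 corresponds to points lying near an upward-sloping line in the scatter plot.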

15. The following table provides data on the size of houses (in square feet), the number of bedrooms,
and the corresponding house prices (in thousands) for 6 houses:

Size (sq ft)    Number of Bedrooms    Price (in thousands)

1500            3                     300
1800            4                     360
2400            4                     480
3000            5                     600
3500            5                     700
4000            6                     800

Perform the following analyses:

• Calculate the correlation coefficient between the size of houses and their prices. Interpret
the result. Does it indicate positive, negative, or no correlation?
• Calculate the covariance between the size of houses and their prices.
• Calculate the covariance matrix for the variables: Size, Number of Bedrooms, and Price.
Interpret the signs of the covariances.
• Apply standardization to the house prices.
• Apply normalization (Min-Max scaling) to the house prices.
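The analyses in Question 15 can be sketched with the standard library alone. The code assumes sample statistics (dividing by n − 1) for the covariances and for the standard deviation used in standardization; a population version would divide by n instead. The helper `sample_cov` is illustrative.

```python
import statistics

sizes = [1500, 1800, 2400, 3000, 3500, 4000]
bedrooms = [3, 4, 4, 5, 5, 6]
prices = [300, 360, 480, 600, 700, 800]

def sample_cov(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Correlation = covariance scaled by both standard deviations
r_size_price = sample_cov(sizes, prices) / (
    statistics.stdev(sizes) * statistics.stdev(prices)
)

# Covariance matrix in the order Size, Bedrooms, Price
variables = [sizes, bedrooms, prices]
cov_matrix = [[sample_cov(a, b) for b in variables] for a in variables]

# Standardization (z-scores) and min-max scaling of the prices
mu, sd = statistics.mean(prices), statistics.stdev(prices)
z_scores = [(p - mu) / sd for p in prices]
lo, hi = min(prices), max(prices)
min_max = [(p - lo) / (hi - lo) for p in prices]
```

In this table every price is exactly one fifth of the size, which is why the size-price correlation comes out as a perfect positive one.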

16. Consider the following two data points representing the ratings of two users on seven different
movies:

User A: (5, 4, 3, 2, 1, 3, 5)
User B: (1, 2, 3, 4, 5, 2, 4)

• Compute the Manhattan distance, Euclidean distance and Minkowski distance (with h =
3) between the two users’ ratings across all seven movies.
• Discuss how these distance metrics reflect the similarity or dissimilarity between User A
and User B.

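All three distances in Question 16 are instances of the Minkowski distance with a different order h, which the sketch below makes explicit. The function name `minkowski` is illustrative.

```python
user_a = (5, 4, 3, 2, 1, 3, 5)
user_b = (1, 2, 3, 4, 5, 2, 4)

def minkowski(p, q, h):
    """Minkowski distance of order h; h=1 is Manhattan, h=2 is Euclidean."""
    return sum(abs(a - b) ** h for a, b in zip(p, q)) ** (1 / h)

manhattan = minkowski(user_a, user_b, 1)
euclidean = minkowski(user_a, user_b, 2)
minkowski_3 = minkowski(user_a, user_b, 3)
```

As h grows, the largest coordinate differences dominate the sum, so the distance cannot increase with h for a fixed pair of points.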
17. Consider the following two data points representing house prices (in thousands), house sizes (in
square feet), number of bedrooms, and number of bathrooms:

House A: (250, 1800, 3, 2)
House B: (300, 2100, 4, 3)

• Calculate the Euclidean distance between the two houses without any feature scaling.
• Discuss the importance of feature scaling when calculating distance measures in machine
learning and re-calculate the Euclidean distance after scaling the features using min-max
normalization.
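A sketch of the two calculations in Question 17. It assumes the min-max range of each feature is taken from the two houses given, since no other data is supplied; with only two points every scaled feature becomes 0 or 1. The helper `min_max_scale` is illustrative.

```python
import math

house_a = (250, 1800, 3, 2)  # price, size, bedrooms, bathrooms
house_b = (300, 2100, 4, 3)

# Unscaled distance: dominated by the feature with the largest units (size)
raw_distance = math.dist(house_a, house_b)

def min_max_scale(points):
    """Scale each feature to [0, 1] using the min and max over the given points.

    Assumes every feature takes at least two distinct values (hi != lo).
    """
    columns = list(zip(*points))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]

scaled_a, scaled_b = min_max_scale([house_a, house_b])
scaled_distance = math.dist(scaled_a, scaled_b)
```

After scaling, each of the four features contributes equally to the distance, whereas before scaling the size difference of 300 outweighs everything else.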

18. You are given the following data points representing two documents in a text classification task,
with each value representing the frequency of a certain term across six terms:

Document 1: (4, 2, 0, 3, 6, 1)
Document 2: (3, 1, 2, 4, 5, 0)

• Calculate the cosine similarity between the two documents across all six terms.
• Compute the Euclidean distance and discuss how it differs from cosine similarity in
interpreting document similarity.
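The contrast asked for in Question 18 can be explored with a sketch like the one below. The doubled document is a hypothetical addition, used only to show that cosine similarity ignores vector length while Euclidean distance does not.

```python
import math

doc1 = (4, 2, 0, 3, 6, 1)
doc2 = (3, 1, 2, 4, 5, 0)

def cosine_similarity(p, q):
    """Cosine of the angle between two term-frequency vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (math.hypot(*p) * math.hypot(*q))

cos_sim = cosine_similarity(doc1, doc2)
euclidean = math.dist(doc1, doc2)

# Hypothetical: doubling every term count (a document twice as long) leaves
# the cosine similarity unchanged but changes the Euclidean distance
doc1_doubled = tuple(2 * v for v in doc1)
cos_sim_doubled = cosine_similarity(doc1_doubled, doc2)
euclidean_doubled = math.dist(doc1_doubled, doc2)
```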

19. Consider the following binary vectors representing the presence (1) or absence (0) of certain
features for three users in a machine learning dataset:

User A: (1, 0, 1, 0, 1, 0, 1)
User B: (0, 1, 1, 0, 1, 1, 0)
User C: (1, 1, 0, 1, 0, 1, 0)

• Compute the Hamming distance between the following pairs:

– User A and User B
– User A and User C
– User B and User C
• Calculate the Jaccard’s coefficient between the following pairs:
– User A and User B
– User A and User C
– User B and User C
• Compare the results obtained from Hamming distance and Jaccard’s coefficient for the
three pairs of users.
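A sketch of both measures for the three pairs in Question 19. It takes the Jaccard coefficient as the number of positions where both vectors are 1 divided by the number where at least one is 1, so joint absences (0, 0) are ignored; the function names are illustrative.

```python
user_a = (1, 0, 1, 0, 1, 0, 1)
user_b = (0, 1, 1, 0, 1, 1, 0)
user_c = (1, 1, 0, 1, 0, 1, 0)

def hamming(p, q):
    """Number of positions where the two vectors differ."""
    return sum(a != b for a, b in zip(p, q))

def jaccard(p, q):
    """Shared 1s divided by positions where at least one vector has a 1."""
    both = sum(a and b for a, b in zip(p, q))
    either = sum(a or b for a, b in zip(p, q))
    return both / either

pairs = {"A-B": (user_a, user_b), "A-C": (user_a, user_c), "B-C": (user_b, user_c)}
results = {name: (hamming(p, q), jaccard(p, q)) for name, (p, q) in pairs.items()}
```

Hamming distance counts every mismatch, including a 0 against a 1, while the Jaccard coefficient only rewards shared presences, which is why the two can rank pairs differently in general.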

20. The following table provides data on the observed frequency of customers’ preference for three
types of products (A, B, and C) based on their income levels (Low, Medium, and High):

Income Level    Product A    Product B    Product C    Total

Low             20           30           50           100
Medium          40           40           20           100
High            40           20           40           100
Total           100          90           110          300

• Formulate the null hypothesis H0 and alternative hypothesis H1 for testing the independence
between Income Level and Product Preference.
• Calculate the expected frequencies for each cell under the assumption of independence.
• Perform the Chi-square (χ²) test by calculating the χ² statistic:
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
where O is the observed frequency and E is the expected frequency.
• Given a significance level of α = 0.05 and the appropriate degrees of freedom, compare the
calculated χ² statistic with the critical value from the χ² distribution table.
• Interpret the result and conclude whether there is a significant association between Income
Level and Product Preference.
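The steps of Question 20 can be sketched as follows. The critical value 9.488 is an assumption taken from a standard χ² table for 4 degrees of freedom at α = 0.05; it is not computed by the code.

```python
observed = [
    [20, 30, 50],  # Low
    [40, 40, 20],  # Medium
    [40, 20, 40],  # High
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: tabulated critical value for dof = 4 at alpha = 0.05
critical_value = 9.488
reject_h0 = chi2 > critical_value
```

If `reject_h0` is true, the data are inconsistent with independence at the 5% level, so Income Level and Product Preference are judged to be associated.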
