ITM (SLS) BARODA UNIVERSITY, VADODARA
B. Tech. Sem-II (Branch – CSE, IT, School - SOCSET)
DEPARTMENT OF APPLIED MATHEMATICS (SOS)
Subject: Probability and Statistical Modelling for Computer Science (C2710D4)
Tutorial – II
PART-A-THEORY QUESTIONS
(i) Define correlation and state its types.
(ii) What are the different methods to find correlation coefficients?
(iii) Define regression analysis and state its applications.
(iv) Define the coefficient of determination.
PART-B-MULTIPLE CHOICE QUESTIONS
Answer the following multiple-choice questions.
1) In multiple regression, the number of independent variables is at least .....
(a) 0 (b) 2 (c) 4 (d) 1
2) If the relationship between x and y is positive, as variable y decreases, variable x
(a) increases (b) decreases (c) remains same (d) changes linearly
3) In a ‘negative’ relationship
(a) as x increases, y increases (b) as x decreases, y decreases
(c) as x increases, y decreases (d) both (a) and (b)
4) The highest strength of association is reflected by which of the following correlation
coefficients?
(a) -1.0 (b) 0.1 (c) -0.95 (d) 0.85
5) What type of relationship between the two variables is indicated by the sign of r?
(a) direct relation (b) indirect relation
(c) both (a) and (b) (d) none of these
6) If two coefficients of regression are 0.8 and 0.2, then the value of coefficient of correlation
is
(a) 0.16 (b) –0.16 (c) 0.40 (d) –0.40
7) If two regression lines are: x + 3y = 7 and 2x + 5y = 12, then the average values of x and
y are respectively
(a) 2, 1 (b) 1, 2 (c) 2, 3 (d) 2, 4
8) If bxy is negative, then byx is
(a) negative (b) positive (c) zero (d) none of these
9) Two regression lines are perpendicular to each other when
(a) r = 0 (b) r = 1/3 (c) r = –1/2 (d) r = ±1
10) The line of ‘best fit’ to measure the variation of observed values of the dependent variable
in the sample data is:
(a) regression line (b) correlation coefficient
(c) standard error (d) none of these
11) If both dependent and independent variables increase in an estimating equation, then the
coefficient of correlation falls in the range:
(a) –1 ≤ r ≤ 1 (b) 0 ≤ r ≤ 1 (c) –3 ≤ r ≤ 3 (d) none of these
PART-C-NUMERICAL QUESTIONS
Do as directed:
Q.1 Using the data summarized below, find the correlation coefficient between X1 and X2:
Case A B C D E F G H
X1 10 6 9 10 12 13 11 9
X2 9 4 6 9 11 13 8 4
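A hand computation for Q.1 can be cross-checked with a short script. The following is a minimal sketch in Python, assuming Karl Pearson's product-moment coefficient is the intended measure; the function name pearson_r is illustrative and not part of the tutorial.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's r from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Data from Q.1 (cases A to H)
x1 = [10, 6, 9, 10, 12, 13, 11, 9]
x2 = [9, 4, 6, 9, 11, 13, 8, 4]

r = pearson_r(x1, x2)
print(f"r = {r:.4f}")
```

With these data the deviation sums are 43, 32 and 72, so the script should agree with the hand value 43/48.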
Q.2 The article “Objective Measurement of the Stretchability of Mozzarella Cheese” (J. of
Texture Studies, 1992: 185–194) reported on an experiment to investigate how the
behavior of mozzarella cheese varied with temperature. Consider the accompanying
data on temperature and elongation (%) at failure of the cheese. [Note: The researchers were Italian and used
real mozzarella cheese, not the poor cousin widely available in the United States.]
x = temperature and y = elongation (%)
x 59 63 68 72 74 78 83
y 118 182 247 208 197 135 132
Comment on the nature of the relationship between the two variables.
Q.3 A group of newly admitted students wrote programs in the C language. The following data
summarize the number of lines and the number of errors that occurred on compiling each
program:
Lines 78 89 99 60 50 79 68 61
Errors 125 137 156 112 107 136 123 108
Find the correlation between the two variables, using 69 and 112 as the assumed means respectively.
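The assumed-mean (shortcut) method asked for in Q.3 can be checked numerically. The sketch below assumes the usual shortcut formula r = (nΣuv − ΣuΣv) / √[(nΣu² − (Σu)²)(nΣv² − (Σv)²)], where u = x − 69 and v = y − 112; the variable names are illustrative only.

```python
from math import sqrt

# Data from Q.3
lines = [78, 89, 99, 60, 50, 79, 68, 61]
errors = [125, 137, 156, 112, 107, 136, 123, 108]

# Assumed means given in the question
assumed_x, assumed_y = 69, 112

# Deviations from the assumed means
u = [x - assumed_x for x in lines]
v = [y - assumed_y for y in errors]

n = len(u)
sum_u, sum_v = sum(u), sum(v)
sum_uv = sum(a * b for a, b in zip(u, v))
sum_u2 = sum(a * a for a in u)
sum_v2 = sum(b * b for b in v)

# Shortcut formula for Pearson's r using assumed means
numerator = n * sum_uv - sum_u * sum_v
denominator = sqrt((n * sum_u2 - sum_u ** 2) * (n * sum_v2 - sum_v ** 2))
r_assumed = numerator / denominator

print(f"sum u = {sum_u}, sum v = {sum_v}, r = {r_assumed:.4f}")
```

Because r does not depend on the choice of origin, any other pair of assumed means should give the same value.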
Q.4 Seven methods of teaching programming skills to students were selected by
two instructors (A & B). The ranks given to them by the students are as follows:
Rank of A 2 1 5 3 4 7 6
Rank of B 1 3 2 4 7 5 6
Calculate the rank correlation and comment on its value.
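For untied ranks such as those in Q.4, Spearman's formula ρ = 1 − 6Σd² / (n(n² − 1)) applies directly. The following is a minimal sketch, assuming that formula is the intended method; the function name spearman_rho is hypothetical.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for two lists of untied ranks."""
    n = len(rank_a)
    # Sum of squared rank differences
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks from Q.4
rank_a = [2, 1, 5, 3, 4, 7, 6]
rank_b = [1, 3, 2, 4, 7, 5, 6]

rho = spearman_rho(rank_a, rank_b)
print(f"rho = {rho:.4f}")
```

Here Σd² = 28 and n = 7, so the value works out to 0.5.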
Q.5 In an office some data entry operators who were already ranked for their speed are
ranked for their accuracy by their supervisor. The results are as follows:
Operator A B C D E F G H I J
Speed 1 2 3 4 5 6 7 8 9 10
Accuracy 7 9 3 4 1 6 8 2 10 5
Calculate the appropriate coefficient of correlation and comment on the result.
Q.6 A firm conducted an examination in two programming languages, Python and Java, for the
post of software developer. The marks obtained (out of 80) by the candidates are given
below. Compute the rank correlation.
Applicants A B C D E F G H
Marks in Python 15 20 28 12 40 60 20 80
Marks in Java 40 30 50 30 20 10 30 60
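The marks in Q.6 contain ties, so the ranks have to be averaged and a correction added. The sketch below assumes the common textbook approach: rank 1 goes to the highest mark, tied marks share the mean of their positions, and (m³ − m)/12 is added to Σd² for every group of m tied values. All function names are illustrative.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 = largest value; tied values get the mean of their positions."""
    ordered = sorted(values, reverse=True)
    positions = {}
    for pos, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(pos)
    return [sum(positions[v]) / len(positions[v]) for v in values]

def tie_correction(values):
    """Sum of (m^3 - m) / 12 over every group of m tied values."""
    return sum((m ** 3 - m) / 12 for m in Counter(values).values() if m > 1)

def spearman_with_ties(xs, ys):
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    correction = tie_correction(xs) + tie_correction(ys)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

# Marks from Q.6 (applicants A to H)
python_marks = [15, 20, 28, 12, 40, 60, 20, 80]
java_marks = [40, 30, 50, 30, 20, 10, 30, 60]

rho_tied = spearman_with_ties(python_marks, java_marks)
print(f"rho = {rho_tied:.4f}")
```

For these marks Σd² = 81.5 and the correction is 2.5, which makes the coefficient zero. Computing Pearson's r on the averaged ranks is a different convention and can give a slightly different value.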
Q.7 A financial analyst wanted to find out whether expenditure on advertisement influences
sales. For a random sample of 7 listed companies, the following data were recorded in
percentage. Find the rank correlation between the two variables.
Company I II III IV V VI VII
Advertisement Exp. (in %) 4 5 7 8 6 3 5
Sales (in %) 11 9 13 7 13 8 8
Q.8 The following data gives the experience of machine operators and their performance
ratings given by the number of good parts turned out per 100 pieces:
Operators 1 2 3 4 5 6 7 8
Experience (x) 16 12 18 4 3 10 5 12
Performance Ratings (y) 87 88 89 68 78 80 75 83
Calculate the regression line of performance ratings on experience and estimate the
probable performance if an operator has 7 years’ experience.
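The regression of y on x in Q.8 can be sketched as follows, assuming the usual least squares estimates b_yx = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b_yx·x̄. The helper name regression_y_on_x is hypothetical.

```python
def regression_y_on_x(xs, ys):
    """Return (intercept, slope) of the least squares line of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Data from Q.8
experience = [16, 12, 18, 4, 3, 10, 5, 12]
performance = [87, 88, 89, 68, 78, 80, 75, 83]

intercept, slope = regression_y_on_x(experience, performance)

# Estimated performance rating for 7 years of experience
predicted = intercept + slope * 7
print(f"y = {intercept:.4f} + {slope:.4f} x; estimate at x = 7: {predicted:.2f}")
```

The means here are 10 and 81, and the slope is 247/218, so the estimate at x = 7 should come out near 77.6.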
Q.9 The following data relate to the scores obtained by the salesmen of a company in an
intelligence test and their weekly sales (in Rs. 1000’s):
Salesman I II III IV V VI VII VIII IX
Test Score 50 60 50 60 80 50 80 40 70
Weekly Sales 30 60 40 50 60 30 70 50 60
(a) Obtain the regression equation of sales on intelligence test scores of the salesmen.
(b) If the intelligence test score of a salesman is 65, what would be his expected weekly
sales?
Q.10 Distinguish between correlation and regression analysis.
Q.11 Find the equation of the least squares line fitting the following data:
x 1 2 3 4 5
y 2 6 5 3 4
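One way to fit the line y = a + bx in Q.11 is to solve the two normal equations Σy = na + bΣx and Σxy = aΣx + bΣx². The sketch below solves them by Cramer's rule; this is only one possible route and the variable names are illustrative.

```python
# Data from Q.11
x = [1, 2, 3, 4, 5]
y = [2, 6, 5, 3, 4]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_x2 = sum(a * a for a in x)

# Normal equations:
#   sum_y  = n * a     + b * sum_x
#   sum_xy = a * sum_x + b * sum_x2
det = n * sum_x2 - sum_x ** 2
a_hat = (sum_y * sum_x2 - sum_x * sum_xy) / det  # intercept
b_hat = (n * sum_xy - sum_x * sum_y) / det       # slope

print(f"y = {a_hat:.2f} + {b_hat:.2f} x")
```

With Σx = 15, Σy = 20, Σxy = 61 and Σx² = 55, the fitted line is y = 3.7 + 0.1x.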
Q.12 You are given below the following information about advertisement expenditure and
sales:
Adv. Exp. (in Lakhs) Sales (in Lakhs)
Mean 20 120
Standard Deviation 5 25
Correlation coefficient 0.8
(a) Calculate the two regression equations.
(b) Find the likely sales when advertisement expenditure is Rs 25 lakhs.
(c) What should be the advertisement budget if the company wants to attain a sales target of Rs
150 lakhs?
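When only means, standard deviations and r are given, as in Q.12, the regression coefficients follow from b_yx = r·σy/σx and b_xy = r·σx/σy. The sketch below assumes x is advertisement expenditure and y is sales, and that the figures in parts (b) and (c) are in the same units as the table; the function names are hypothetical.

```python
# Summary statistics from Q.12 (x = advertisement expenditure, y = sales)
mean_x, mean_y = 20, 120
sd_x, sd_y = 5, 25
r = 0.8

# Regression coefficients
b_yx = r * sd_y / sd_x  # slope of y on x
b_xy = r * sd_x / sd_y  # slope of x on y

def sales_given_adv(x):
    """Regression of y on x: y - mean_y = b_yx * (x - mean_x)."""
    return mean_y + b_yx * (x - mean_x)

def adv_given_sales(y):
    """Regression of x on y: x - mean_x = b_xy * (y - mean_y)."""
    return mean_x + b_xy * (y - mean_y)

sales_at_25 = sales_given_adv(25)
adv_for_150 = adv_given_sales(150)
print(f"b_yx = {b_yx:.2f}, b_xy = {b_xy:.2f}")
print(f"sales at x = 25: {sales_at_25:.1f}; budget for y = 150: {adv_for_150:.1f}")
```

The two lines reduce to y = 4x + 40 and x = 0.16y + 0.8, giving 140 for part (b) and 24.8 for part (c).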
Q.13 Two random variables have the regression equations:
3x + 2y – 26 = 0 and 6x + y – 31 = 0
(a) Find the mean values of x and y and the coefficient of correlation between x and y.
(b) If the variance of x is 25, then find the standard deviation of y from the data.
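For Q.13, the means are the point where the two lines intersect, and the correct assignment of "y on x" and "x on y" is the one whose regression coefficients have a product not exceeding 1. The following is a minimal sketch under that standard argument, with each line written as a·x + b·y = c; the helper names are illustrative.

```python
from math import sqrt

def solve_2x2(line1, line2):
    """Intersection of a1*x + b1*y = c1 and a2*x + b2*y = c2 (Cramer's rule)."""
    (a1, b1, c1), (a2, b2, c2) = line1, line2
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

def correlation_from_lines(line1, line2):
    """Try both assignments; keep the one with 0 <= b_yx * b_xy <= 1."""
    for y_on_x, x_on_y in ((line1, line2), (line2, line1)):
        b_yx = -y_on_x[0] / y_on_x[1]  # slope when solved for y
        b_xy = -x_on_y[1] / x_on_y[0]  # slope when solved for x
        product = b_yx * b_xy
        if 0 <= product <= 1:
            sign = 1 if b_yx > 0 else -1  # r takes the sign of the coefficients
            return sign * sqrt(product), b_yx, b_xy
    raise ValueError("no valid assignment of the regression lines")

# Lines from Q.13: 3x + 2y = 26 and 6x + y = 31
line_a = (3, 2, 26)
line_b = (6, 1, 31)

mean_x, mean_y = solve_2x2(line_a, line_b)
r, b_yx, b_xy = correlation_from_lines(line_a, line_b)

# Part (b): b_yx = r * sd_y / sd_x, with variance of x = 25
sd_x = sqrt(25)
sd_y = b_yx * sd_x / r

print(f"means: x = {mean_x}, y = {mean_y}; r = {r:.2f}; sd of y = {sd_y:.1f}")
```

Taking 3x + 2y = 26 as the line of y on x gives b_yx = −1.5 and b_xy = −1/6, hence r = −0.5 and a standard deviation of 15 for y; the opposite assignment gives a product of 4 and is rejected.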
Case Study:
Q.1 The personnel manager of an electronic manufacturing company devises a manual test
for job applicants to predict their production rating in the assembly department. In order
to do this, he selects a random sample of 10 applicants. They are given the test and later
assigned a production rating. The results are as follows:
Worker A B C D E F G H I J
Test Score 53 36 88 84 86 64 45 48 39 69
Production Rating 45 43 89 79 84 66 49 48 43 76
Fit a linear least squares regression equation of production rating on test score.
Q.2 Suppose that you are interested in using past expenditure on R&D by a firm to predict
current expenditures on R&D. You got the following data by taking a random sample
of firms, where x is the amount spent on R&D (in lakhs of rupees) 5 years ago and y is
the amount spent on R&D (in lakhs of rupees) in the current year:
x 30 50 20 80 10 20 20 40
y 50 80 30 110 20 20 40 50
(a) Find the regression equation of y on x.
(b) If a firm is chosen randomly and x = 10, can you use the regression to predict the value
of y? Discuss.