
Solution Question Bank Unit-3

The document outlines the syllabus for Mathematics IV (BAS-303) at Rajkumar Goel Institute of Technology, covering topics such as measures of central tendency, skewness, kurtosis, and regression analysis. It includes various statistical problems with solutions related to mean, median, mode, and correlation coefficients. The course aims to provide a foundational understanding of key statistical concepts and their applications.

RAJKUMAR GOEL INSTITUTE OF TECHNOLOGY

GHAZIABAD
Session - (2025-26)
Mathematics - IV (BAS-303)
UNIT-3
Contents: Measures of central tendency, skewness, kurtosis, curve fitting, method of least squares, fitting
of straight lines, fitting of second-degree parabolas, exponential curves, correlation and rank correlation,
regression analysis: regression lines of y on x and x on y, regression coefficients, properties of regression
coefficients, and nonlinear regression.
Course Outcome (CO3): Understand basic statistical concepts such as moments, skewness, kurtosis, curve
fitting, correlation and regression.

Question 1

A cooperative bank has two branches employing 50 and 70 workers respectively. The average salaries
paid by two respective branches are Rs. 360 and Rs. 390 per month. Calculate the mean of the
salaries of all the employees.

Solution:
Note:
To calculate the mean salary of all employees, we use the mean of a composite series. If $\bar{x}_i$ $(i = 1, 2, \ldots, k)$
are the means of $k$ component series of sizes $n_i$ $(i = 1, 2, \ldots, k)$ respectively, then the mean $\bar{x}$ of the
composite series is given by the formula:
\[
\bar{x} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2 + \cdots + n_k \bar{x}_k}{n_1 + n_2 + \cdots + n_k} = \frac{\sum_i n_i \bar{x}_i}{\sum_i n_i}
\]

Let $n_1$ and $n_2$ denote the numbers of workers in the two branches, and let $\bar{x}_1$ and $\bar{x}_2$
denote their respective average salaries (in rupees). Let $\bar{x}$ denote the average salary of all the workers in
the bank.
We are given that:
\[ \bar{x}_1 = 360, \quad \bar{x}_2 = 390, \quad n_1 = 50, \quad n_2 = 70 \]

Mean Salary:
\[ \bar{x} = \frac{50(360) + 70(390)}{50 + 70} = \frac{18000 + 27300}{120} = \frac{45300}{120} = 377.5 \]

Mean Salary = Rs. 377.5 per month

Final Answer: The mean salary of all the employees is Rs. 377.5 per month.
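
The composite-mean formula used above can be checked numerically. The following is a minimal sketch; the function name `combined_mean` is an illustrative choice, not something taken from the question bank.

```python
def combined_mean(sizes, means):
    """Mean of a composite series: sum(n_i * mean_i) / sum(n_i)."""
    if len(sizes) != len(means) or not sizes:
        raise ValueError("sizes and means must be non-empty and of equal length")
    total = sum(sizes)
    return sum(n * m for n, m in zip(sizes, means)) / total


# Two branches with 50 and 70 workers, average salaries Rs. 360 and Rs. 390
mean_salary = combined_mean([50, 70], [360, 390])
print(mean_salary)  # 377.5
```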

Question 2

Find the median of the dataset: 6, 8, 9, 10, 11, 12, 13


Solution:
1. Arrange the numbers in ascending order:

6, 8, 9, 10, 11, 12, 13

2. Count the total number of data points: The total number of data points (n) is 7, which is odd.
3. Find the position of the median: For an odd number of data points, the median is the value at the
position:
\[ \text{Median Position} = \frac{n+1}{2} \]
Substituting n = 7:
\[ \text{Median Position} = \frac{7+1}{2} = 4 \]

4. The 4th number in the dataset is 10.


Final Answer: The median of the dataset is: 10

Question 3

Find the mode of the following marks obtained by 15 students:

4, 6, 5, 7, 9, 8, 10, 4, 7, 6, 5, 8, 7, 7, 9

Solution:
1. Arrange the data and count the frequency of each number:

x 4 5 6 7 8 9 10
Frequency 2 times 2 times 2 times 4 times 2 times 2 times 1 time

2. Identify the mode: The mode is the number that appears the most frequently. Here, 7 appears 4
times, which is more than any other number.
Final Answer: The mode of the given data is: 7
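
The frequency count in step 1 can be reproduced with the standard library. This is a small sketch using `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

marks = [4, 6, 5, 7, 9, 8, 10, 4, 7, 6, 5, 8, 7, 7, 9]

frequency = Counter(marks)                 # value -> number of occurrences
highest = max(frequency.values())          # largest frequency in the data
# Keep every value that reaches the largest frequency (handles multimodal data)
modes = sorted(value for value, count in frequency.items() if count == highest)

print(dict(sorted(frequency.items())))
print(modes)  # [7]
```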

Question 4

Find the arithmetic mean of the following distribution.

x 1 2 3 4 5 6 7
f 5 9 12 17 14 10 6

Solution:
The formula for the arithmetic mean is:
\[ \text{Arithmetic Mean} = \frac{\sum f x}{\sum f} \]

where:
• f is the frequency of each observation.
• x is the value of each observation.
Step 1: Calculate f x for each x

x f fx
1 5 5
2 9 18
3 12 36
4 17 68
5 14 70
6 10 60
7 6 42
Sum Σf = 73 Σfx = 299

Step 2: Compute the Arithmetic Mean

\[ \text{Arithmetic Mean} = \frac{\sum f x}{\sum f} = \frac{299}{73} \approx 4.10 \]

Final Answer: The arithmetic mean is approximately: 4.10

Question 5

In an asymmetrical distribution, the mean is 16 and the median is 20. Calculate the mode of the
distribution.

Solution:
The empirical relationship between the mean, median, and mode is given by:

Mode = 3 × Median − 2 × Mean

Given:
Mean = 16, Median = 20

Substitute the values:


Mode = 3 × 20 − 2 × 16
Mode = 60 − 32 = 28

Final Answer: The mode of the distribution is: 28

Question 6

The first three central moments of a distribution are 0, 15, -31. Find the moment coefficient of
skewness. (2019-20, 2 Marks)

Solution:
Given:
\[ \mu_1 = 0, \quad \mu_2 = 15, \quad \mu_3 = -31 \]
The formula for the moment coefficient of skewness is:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \]
Hence
\[ \gamma_1 = \frac{-31}{\sqrt{15^3}} = \frac{-31}{58.09} \approx -0.534 \]
The moment coefficient of skewness is approximately:

γ1 ≈ −0.534

Question 7

The first two moments of a distribution about the value ‘2’ of the variable are 1, 16. Show that
mean is 3, and variance is 15. (2020-2021 2 marks)

Solution:
Given that
\[ A = 2, \quad \mu'_1 = 1, \quad \mu'_2 = 16 \]
Now the mean is
\[ \bar{x} = A + \mu'_1 = 2 + 1 = 3 \]
and the variance is $\sigma^2 = \mu_2$, where
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 = 16 - (1)^2 = 15 \]
Hence
variance = 15

Question 8

The fourth central moment is µ4 = 48. What must be its standard deviation (σ) in order for the
distribution to be mesokurtic?

Solution:
The kurtosis (β2 ) is given as:

\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{\mu_4}{\sigma^4} \qquad (\mu_2 = \sigma^2) \]
where:
• µ4 is the fourth central moment.
• σ is the standard deviation.

Step 1: Substitute the known values: For a mesokurtic distribution, β2 = 3, and µ4 = 48. Substituting
these into the formula:
\[ 3 = \frac{48}{\sigma^4} \]
Step 2: Solve for $\sigma^4$:
\[ \sigma^4 = \frac{48}{3} = 16 \]
Step 3: Solve for σ: Taking the fourth root (or square root twice) of both sides:
\[ \sigma = \sqrt[4]{16} = \sqrt{4} = 2 \]

Final Answer: The standard deviation (σ) =2

Question 9

Write the normal equations to fit the curve $y = ax^2 + b$ by the method of least squares.

Solution: Fitting of a second-degree parabola

Let
\[ y = ax^2 + b \tag{1} \]
be the equation of best fit to the set of n points $(x_i, y_i)$, $i = 1, 2, \ldots, n$. Using the principle of least squares,
we have to determine the constants a and b so that
\[ E = \sum_{i=1}^{n} (y_i - a x_i^2 - b)^2 \]
is minimum. Equating to zero the partial derivatives of E with respect to a and b separately, we get
\[ \frac{\partial E}{\partial a} = -2 \sum_{i=1}^{n} x_i^2 (y_i - a x_i^2 - b) = 0 \]
\[ \frac{\partial E}{\partial b} = -2 \sum_{i=1}^{n} (y_i - a x_i^2 - b) = 0 \]
which give the normal equations for estimating a and b:
\[ \sum x^2 y = a \sum x^4 + b \sum x^2 \]
\[ \sum y = a \sum x^2 + n b \]
with each summation taken over i from 1 to n for the given set of points $(x_i, y_i)$.

Question 10

Write the formula for Karl Pearson’s correlation coefficient and state the range of the correlation
coefficient.

Solution:
Karl Pearson Correlation Coefficient Formula:
The Karl Pearson correlation coefficient (r) is given by the formula:

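\[ r = \frac{\operatorname{cov}(x, y)}{\sigma_x \sigma_y} \]
or
\[ r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} \]
or
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{n \sum x^2 - (\sum x)^2}\, \sqrt{n \sum y^2 - (\sum y)^2}} \]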

where:
• x and y are the individual data points of the two variables X and Y ,
• x̄ and ȳ are the means of the variables X and Y , respectively.
Range of the Correlation Coefficient:
The value of the correlation coefficient r lies between -1 and +1, inclusive:

−1 ≤ r ≤ 1

• r = 1: perfect positive correlation
• r = −1: perfect negative correlation
• r = 0: no linear correlation

Question 11

If the covariance between variables x and y is 10, and the variances of x and y are 16 and 9
respectively, find the coefficient of correlation.

Solution:
The formula for the coefficient of correlation (r) is:

\[ r = \frac{\operatorname{Cov}(x, y)}{\sigma_x \sigma_y} \]

where:
• Cov(x, y) is the covariance between x and y,
• σx and σy are the standard deviations of x and y, respectively.
Given:
• Cov(x, y) = 10,

• variance $\sigma_x^2 = 16$, so $\sigma_x = \sqrt{16} = 4$,
• variance $\sigma_y^2 = 9$, so $\sigma_y = \sqrt{9} = 3$.
Hence:
\[ r = \frac{10}{4 \times 3} = \frac{10}{12} = \frac{5}{6} \approx 0.833 \]
Hence the coefficient of correlation r ≈ 0.833.

Question 12

The lines of regression of y on x and x on y are respectively:

y =x+5 and 16x − 9y = 94,

find the correlation coefficient.

Solution
Given:
• Line of regression of y on x: y = x + 5
Slope of this line (byx ) = 1.
• Line of regression of x on y: 16x − 9y = 94
Rewrite it as $x = \frac{9}{16} y + \frac{94}{16}$, so the slope $(b_{xy}) = \frac{9}{16}$.

Formula for Correlation Coefficient:
\[ r = \pm \sqrt{b_{yx} \cdot b_{xy}} \]
Substitute the values:
\[ r = \pm \sqrt{1 \cdot \frac{9}{16}} = \pm \sqrt{\frac{9}{16}} = \pm \frac{3}{4} \]

Determining the Sign of r:
The sign of r is the common sign of the two regression coefficients. Since both regression coefficients are positive,
r is positive.
\[ r = \frac{3}{4} \]

Question 13

If the regression coefficients are byx = 0.8 and bxy = 0.2, find the value of the coefficient of correlation
(r).

Solution:
The coefficient of correlation in terms of the regression coefficients:
\[ r = \pm \sqrt{b_{yx} \cdot b_{xy}} \]
Given
\[ b_{yx} = 0.8, \quad b_{xy} = 0.2 \]
We get
\[ r = \pm \sqrt{0.8 \cdot 0.2} = \pm \sqrt{0.16} = \pm 0.4 \]
The Sign of r:
The sign of r is the common sign of the regression coefficients. Since both $b_{yx}$ and $b_{xy}$ are positive, the
correlation coefficient is positive.
r = 0.4

Question 14

If the regression coefficients are byx = 0.8 and bxy = 0.8, find the value of the coefficient of correlation
(r).

Solution:
The formula for the coefficient of correlation is:
\[ r = \pm \sqrt{b_{yx} \cdot b_{xy}} \]
Given:
\[ b_{yx} = 0.8, \quad b_{xy} = 0.8 \]
Substitute these values into the formula and simplify:
\[ r = \pm \sqrt{0.8 \cdot 0.8} = \pm \sqrt{0.64} = \pm 0.8 \]
The Sign of r:
The sign of r is the common sign of the regression coefficients. Since both $b_{yx}$ and $b_{xy}$ are positive, the
correlation coefficient is positive.
r = 0.8

Question 15

What is the relation between the regression coefficients and the coefficient of Correlation?

Relationship Between Regression Coefficients and the Coefficient of Correlation


The relationship between the regression coefficients (byx and bxy ) and the coefficient of correlation (r) is
as follows:

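\[ r = \pm \sqrt{b_{yx} \cdot b_{xy}} \]
that is, $b_{yx} \cdot b_{xy} = r^2 \le 1$, and r takes the common sign of the two regression coefficients.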
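
As a quick numerical check of this relation, the sketch below computes r from two regression coefficients and applies the sign rule used in Questions 12 to 14. The function name `correlation_from_regression` and its validation checks are illustrative assumptions.

```python
import math


def correlation_from_regression(b_yx, b_xy):
    """Return r = ±sqrt(b_yx * b_xy), signed like the regression coefficients."""
    product = b_yx * b_xy
    if product < 0:
        raise ValueError("regression coefficients must have the same sign")
    if product > 1:
        raise ValueError("b_yx * b_xy equals r^2 and cannot exceed 1")
    # copysign attaches the sign of b_yx to the positive square root
    return math.copysign(math.sqrt(product), b_yx)


print(round(correlation_from_regression(0.8, 0.2), 4))   # 0.4
print(round(correlation_from_regression(1, 9 / 16), 4))  # 0.75
```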

Question 16
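Write the normal equations to fit the curve $y = \frac{c_0}{x} + c_1 \sqrt{x}$.

Solution: Fitting the given curve

Let
\[ y = \frac{c_0}{x} + c_1 \sqrt{x} \]
be the equation of best fit to the set of n points $(x_i, y_i)$, $i = 1, 2, \ldots, n$. Using the principle of least squares,
we have to determine the constants $c_0$ and $c_1$ so that
\[ E = \sum \left( y - \frac{c_0}{x} - c_1 \sqrt{x} \right)^2 \]
is minimum. Equating to zero the partial derivatives of E with respect to $c_0$ and $c_1$ separately, we get
\[ \frac{\partial E}{\partial c_0} = -2 \sum \frac{1}{x} \left( y - \frac{c_0}{x} - c_1 \sqrt{x} \right) = 0 \]
\[ \frac{\partial E}{\partial c_1} = -2 \sum \sqrt{x} \left( y - \frac{c_0}{x} - c_1 \sqrt{x} \right) = 0 \]
which give the normal equations for estimating $c_0$ and $c_1$:
\[ \sum \frac{y}{x} = c_0 \sum \frac{1}{x^2} + c_1 \sum \frac{1}{\sqrt{x}} \]
\[ \sum y \sqrt{x} = c_0 \sum \frac{1}{\sqrt{x}} + c_1 \sum x \]
with each summation taken over i from 1 to n for the given set of points $(x_i, y_i)$.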

Question 17

Write the formula for rank correlation in the case of tied rank

Spearman’s Rank Correlation with Tied Ranks


In the case of tied ranks, the formula for Spearman’s rank correlation coefficient (rs ) is:
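\[ r_s = 1 - \frac{6 \left\{ \sum d_i^2 + \frac{1}{12} \sum m_i (m_i^2 - 1) \right\}}{n(n^2 - 1)} \]

Where:
$d_i$ is the difference between the two ranks of the i-th item, n is the number of pairs, and
$m_i$ is the number of times a tied rank is repeated; the correction term is summed over every group of tied ranks in both series.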

Question 18

The first three moments of a distribution, about the value ’2’ of the variable are 1, 16 and -40. Find
the mean and variance of distribution.

Solution
Given moments are about A = 2, µ′1 = 1, µ′2 = 16, µ′3 = −40.

Calculate Mean (x̄): The mean is related to the first moment about 2:
\[ \mu'_1 = \bar{x} - A \implies 1 = \bar{x} - 2 \implies \bar{x} = 2 + 1 = 3 \]
Calculate Variance (σ²): The variance is the second moment about the mean, which is obtained from the
moments about 2:
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 = 16 - 1 = 15 \]
Then
\[ \sigma^2 = \mu_2 = 15 \]

Question 19

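The first three moments of a distribution are 6, 25, -41. Find the moment coefficient of skewness.

Solution
Taking the given values as moments about an arbitrary point: µ′1 = 6, µ′2 = 25, µ′3 = −41.

Central moments:
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 = 25 - 36 = -11 \]
\[ \mu_3 = \mu'_3 - 3\mu'_2 \mu'_1 + 2(\mu'_1)^3 = -41 - 3 \times 25 \times 6 + 2 \times 6^3 = -41 - 450 + 432 = -59 \]
Moment coefficient of skewness:
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} \]
Here $\mu_2 = -11$ is negative, which a variance cannot be, so $\mu_2^{3/2}$ is not real and the coefficient cannot be
evaluated from the figures as printed; the data in the question appear to be inconsistent.
The sign of $\mu_3$ is negative, which would indicate a negatively skewed distribution.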

Question 20

Find the regression coefficient of y on x for the following:

x 1 2 3 4 5
y 2 4 5 4 5

Solution

x 1 2 3 4 5
y 2 4 5 4 5

Table 1: Observed data points


x y xy x²
1 2 2 1
2 4 8 4
3 5 15 9
4 4 16 16
5 5 25 25
Σx = 15 Σy = 20 Σxy = 66 Σx² = 55

Here n = 5, so
\[ b_{yx} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{(5 \times 66) - (15 \times 20)}{5 \times 55 - 15^2} = \frac{30}{50} = 0.6 \]

Question 21

Define the coefficient of skewness based on moments. CO3 K1 2 Marks

Sol. Definition:
The coefficient of skewness based on moments, denoted by γ1 (gamma-one), is defined as
the ratio of the third central moment to the cube of the standard deviation.

Mathematical Formula:
If µ2 is the second central moment (variance) and µ3 is the third central moment, the coefficient of skewness
is given by:
\[ \gamma_1 = \frac{\mu_3}{\sigma^3} = \frac{\mu_3}{(\sqrt{\mu_2})^3} \]

The β1 Notation:
Sometimes the square of this value is used to define skewness:
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} \]
where $\gamma_1 = \pm \sqrt{\beta_1}$, taking the sign of $\mu_3$.

Question 22

What is meant by the regression line of Y on X? CO3 K1 2 Marks

Sol. The Regression Equation:

The regression line of Y on X is the least-squares line of best fit used to estimate Y for a given value of X. Its equation is:
\[ Y - \bar{y} = b_{yx} (X - \bar{x}) \]

Where:
• Y is the dependent (predicted) variable.
• X is the independent (predictor) variable.
• x̄, ȳ are the means of X and Y respectively.
• byx is the Regression Coefficient of Y on X.

Formula for Regression Coefficient (byx ):


\[ b_{yx} = r \frac{\sigma_y}{\sigma_x} \]
where r is the correlation coefficient, σy is the standard deviation of Y, and σx is the standard deviation
of X.

Question Description (7 Marks)

Question 23

Calculate the first four central moments and also comment upon Skewness and Kurtosis from the
following data:

Class Interval Frequency


0 − 10 1
10 − 20 4
20 − 30 3
30 − 40 2

Solution:
Given Data:

Class Interval Frequency(f )


0 − 10 1
10 − 20 4
20 − 30 3
30 − 40 2
Calculate the Mean (x̄)
Class Interval f x fx
0–10 1 5 5
10–20 4 15 60
20–30 3 25 75
30–40 2 35 70
Total 10 210

\[ \bar{x} = \frac{\sum f x}{\sum f} = \frac{210}{10} = 21 \]

Calculation of Central Moments, Skewness, and Kurtosis


Class Interval f x (x − x̄) f(x − x̄) f(x − x̄)² f(x − x̄)³ f(x − x̄)⁴
0–10 1 5 −16 −16 256 −4096 65536
10–20 4 15 −6 −24 144 −864 5184
20–30 3 25 4 12 48 192 768
30–40 2 35 14 28 392 5488 76832
Total 10 0 840 720 148320

Central Moments:
\[ \mu_1 = \frac{\sum f (x - \bar{x})}{\sum f} = \frac{0}{10} = 0 \]
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{840}{10} = 84 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{720}{10} = 72 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{148320}{10} = 14832 \]

Hence
\[ \text{Skewness: } \beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{72^2}{84^3} = \frac{5184}{592704} \approx 0.0087 \]
\[ \text{Kurtosis: } \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{14832}{84^2} = \frac{14832}{7056} \approx 2.102 \]

Skewness: Since µ3 > 0 (with β1 ≈ 0.0087, close to zero), the distribution is slightly positively skewed, meaning the
right tail is longer than the left.
Kurtosis: Since β2 < 3, the distribution is flatter than a normal distribution, that is, it is
platykurtic.
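
The tabular calculation above can be cross-checked with a short script. This is a sketch only: `central_moments` is an illustrative helper, and the class mid-points are used as the x values, as in the solution.

```python
def central_moments(values, freqs):
    """Return (mean, [mu1, mu2, mu3, mu4]) for a frequency distribution."""
    total = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / total
    moments = [
        sum(f * (x - mean) ** r for x, f in zip(values, freqs)) / total
        for r in range(1, 5)
    ]
    return mean, moments


midpoints = [5, 15, 25, 35]   # class marks of 0-10, 10-20, 20-30, 30-40
frequencies = [1, 4, 3, 2]

mean, (mu1, mu2, mu3, mu4) = central_moments(midpoints, frequencies)
beta1 = mu3 ** 2 / mu2 ** 3   # squared moment coefficient of skewness
beta2 = mu4 / mu2 ** 2        # kurtosis (3 for a normal distribution)

print(mean, mu2, mu3, mu4)               # 21.0 84.0 72.0 14832.0
print(round(beta1, 4), round(beta2, 3))  # 0.0087 2.102
```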

Question 24

Calculate the first four central moments about the mean, Skewness, and Kurtosis for the following
data (2021-22):

x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1

Solution:
Given Data:

x 0 1 2 3 4 5 6 7 8
f 1 8 28 56 70 56 28 8 1
The mean is $\bar{x} = \frac{\sum f x}{\sum f}$.

x f fx
0 1 0
1 8 8
2 28 56
3 56 168
4 70 280
5 56 280
6 28 168
7 8 56
8 1 8
Total 256 1024

\[ \bar{x} = \frac{1024}{256} = 4 \]
Now for central moments
x f (x − x̄) (x − x̄)² f(x − x̄)² f(x − x̄)³ f(x − x̄)⁴
0 1 −4 16 16 −64 256
1 8 −3 9 72 −216 648
2 28 −2 4 112 −224 448
3 56 −1 1 56 −56 56
4 70 0 0 0 0 0
5 56 1 1 56 56 56
6 28 2 4 112 224 448
7 8 3 9 72 216 648
8 1 4 16 16 64 256
Total 256 512 0 2816

\[ \mu_1 = \frac{\sum f (x - \bar{x})}{\sum f} = \frac{0}{256} = 0 \]
\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{512}{256} = 2 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{0}{256} = 0 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{2816}{256} = 11 \]
Hence
\[ \text{Skewness: } \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = 0 \]
\[ \text{Kurtosis: } \gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{11}{2^2} - 3 = 2.75 - 3 = -0.25 \]

Question 25

Compute Skewness and Kurtosis, if the first four moments of a frequency distribution about the
value 4 of the variable are 1, 4, 10, and 45.

Solution:
We are given the first four moments about the value A = 4:

µ′1 = 1, µ′2 = 4, µ′3 = 10, µ′4 = 45

Hence

µ2 = µ′2 − (µ′1 )2 = 4 − (1)2 = 4 − 1 = 3


\[ \mu_3 = \mu'_3 - 3\mu'_2 \mu'_1 + 2(\mu'_1)^3 = 10 - 3(4)(1) + 2(1)^3 = 0 \]
\[ \mu_4 = \mu'_4 - 4\mu'_3 \mu'_1 + 6\mu'_2 (\mu'_1)^2 - 3(\mu'_1)^4 = 45 - 4(10)(1) + 6(4)(1)^2 - 3(1)^4 = 26 \]

Skewness (γ1):
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{0}{(3)^{3/2}} = 0 \]
so the distribution is symmetric.
Kurtosis:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{26}{(3)^2} = \frac{26}{9} \approx 2.89 \]

The kurtosis is slightly lower than the normal value of 3, so the distribution is slightly platykurtic and close to normal.

Question 26

The first four moments of a frequency distribution about the value 4 of the variable are -1.5, 17,-30
and 80. Find µ1 , µ2 , µ3 , µ4 about mean. Also find β1 and β2 .

Solution
We are given the first four moments about the value A = 4:

µ′1 = −1.5, µ′2 = 17, µ′3 = −30, µ′4 = 80

The formulae for central moments (µr ) in terms of moments about A (µ′r ) are:

µ1 = 0
µ2 = µ′2 − (µ′1 )2
µ3 = µ′3 − 3µ′2 µ′1 + 2(µ′1 )3 ,
µ4 = µ′4 − 4µ′3 µ′1 + 6µ′2 (µ′1 )2 − 3(µ′1 )4

Step 1: Calculate Central Moments


1. Second Central Moment (µ2 ):

µ2 = µ′2 − (µ′1 )2
= 17 − (−1.5)2
= 17 − 2.25 = 14.75

2. Third Central Moment (µ3):
\[
\begin{aligned}
\mu_3 &= \mu'_3 - 3\mu'_2 \mu'_1 + 2(\mu'_1)^3 \\
&= -30 - 3(17)(-1.5) + 2(-1.5)^3 \\
&= -30 + 76.5 + 2(-3.375) \\
&= -30 + 76.5 - 6.75 = 39.75
\end{aligned}
\]

3. Fourth Central Moment (µ4):
\[
\begin{aligned}
\mu_4 &= \mu'_4 - 4\mu'_3 \mu'_1 + 6\mu'_2 (\mu'_1)^2 - 3(\mu'_1)^4 \\
&= 80 - 4(-30)(-1.5) + 6(17)(-1.5)^2 - 3(-1.5)^4 \\
&= 80 - 180 + 6(17)(2.25) - 3(5.0625) \\
&= 80 - 180 + 229.5 - 15.1875 \\
&= 114.3125
\end{aligned}
\]

Step 2: Skewness and Kurtosis

Skewness (β1):
\[ \beta_1 = \frac{\mu_3^2}{\mu_2^3} \]
Substitute µ3 = 39.75 and µ2 = 14.75:
\[ \beta_1 = \frac{39.75^2}{14.75^3} \approx 0.492 \]
Kurtosis (β2):
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} \]
Substitute µ4 = 114.3125 and µ2 = 14.75:
\[ \beta_2 = \frac{114.3125}{(14.75)^2} = \frac{114.3125}{217.5625} \approx 0.525 \]
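
The conversion from moments about an arbitrary point A to central moments, used in Questions 25 to 28, can be sketched as a small function. The name `central_from_raw` is an illustrative assumption; the formulas are the ones stated in the solution above.

```python
def central_from_raw(m1, m2, m3, m4):
    """Convert the first four moments about a point A into central moments."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


# Moments about A = 4 from this question
mu2, mu3, mu4 = central_from_raw(-1.5, 17, -30, 80)
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2

print(mu2, mu3, mu4)                     # 14.75 39.75 114.3125
print(round(beta1, 3), round(beta2, 3))  # 0.492 0.525
```

The same function reproduces the central moments of Question 25 when called as `central_from_raw(1, 4, 10, 45)`.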

Question 27

The first four moments of a frequency distribution about the value 2 of the variable are 2, 20, 40
and 50 respectively. Comment upon the skewness and kurtosis of the distribution.

Solution
Analysis of Skewness and Kurtosis
The first four moments about A = 2 are given as:
µ′1 = 2, µ′2 = 20, µ′3 = 40, µ′4 = 50.

µ1 = 0.

Central Moments
The central moments are calculated using the formula:
\[ \mu_r = \sum_{k=0}^{r} (-1)^k \binom{r}{k} \mu'_{r-k} (\mu'_1)^k, \qquad \mu'_0 = 1. \]

Second Central Moment (µ2):
\[ \mu_2 = \mu'_2 - (\mu'_1)^2 = 20 - 2^2 = 20 - 4 = 16. \]

Third Central Moment (µ3):
\[ \mu_3 = \mu'_3 - 3\mu'_2 \mu'_1 + 2(\mu'_1)^3 = 40 - 3(20)(2) + 2(2)^3 = 40 - 120 + 16 = -64. \]

Fourth Central Moment (µ4):
\[ \mu_4 = \mu'_4 - 4\mu'_3 \mu'_1 + 6\mu'_2 (\mu'_1)^2 - 3(\mu'_1)^4 = 50 - 4(40)(2) + 6(20)(2)^2 - 3(2)^4 = 50 - 320 + 480 - 48 = 162. \]

Skewness (γ1)
Skewness is calculated as:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{-64}{(16)^{3/2}} = \frac{-64}{64} = -1. \]
Interpretation: Since γ1 is negative, the distribution is negatively skewed.

Kurtosis (β2)
Kurtosis is calculated as:
\[ \beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{162}{16^2} = \frac{162}{256} \approx 0.6328. \]
Excess kurtosis is:
\[ \gamma_2 = \beta_2 - 3 = 0.6328 - 3 = -2.367. \]
Interpretation: The negative excess kurtosis indicates that the distribution is platykurtic (flatter than a
normal distribution).

Question 28

The first four moments of a frequency distribution about the value 5 of the variable are 1, 2.5, 5.5
and 16 respectively. Calculate the four central moments, the moments about the origin, and the coefficient of skewness.

Solution:
Given:

µ′1 = 1,
µ′2 = 2.5,
µ′3 = 5.5,
µ′4 = 16.

The value of A = 5. The mean is given by:
\[ \bar{x} = A + \mu'_1 = 5 + 1 = 6. \]

Step 1: Central Moments
The central moments µr are related to the moments about A (µ′r) as follows:
\[ \mu_1 = 0, \]
\[ \mu_2 = \mu'_2 - (\mu'_1)^2, \]
\[ \mu_3 = \mu'_3 - 3\mu'_2 \mu'_1 + 2(\mu'_1)^3, \]
\[ \mu_4 = \mu'_4 - 4\mu'_3 \mu'_1 + 6\mu'_2 (\mu'_1)^2 - 3(\mu'_1)^4. \]

Substitute the values:
\[ \mu_2 = 2.5 - 1^2 = 1.5, \]
\[ \mu_3 = 5.5 - 3(2.5)(1) + 2(1)^3 = 5.5 - 7.5 + 2 = 0, \]
\[ \mu_4 = 16 - 4(5.5)(1) + 6(2.5)(1)^2 - 3(1)^4 = 16 - 22 + 15 - 3 = 6. \]

Thus, the central moments are:
\[ \mu_1 = 0, \quad \mu_2 = 1.5, \quad \mu_3 = 0, \quad \mu_4 = 6. \]

Step 2: Moments About the Origin
The moments about the origin, written here as $\nu_r$ to distinguish them from the moments about A, are related to the
central moments and the mean $\bar{x}$ as follows:
\[ \nu_1 = \bar{x} = 6, \]
\[ \nu_2 = \mu_2 + \bar{x}^2 = 1.5 + 6^2 = 37.5, \]
\[ \nu_3 = \mu_3 + 3\mu_2 \bar{x} + \bar{x}^3 = 0 + 3(1.5)(6) + 6^3 = 243, \]
\[ \nu_4 = \mu_4 + 4\mu_3 \bar{x} + 6\mu_2 \bar{x}^2 + \bar{x}^4 = 6 + 4(0)(6) + 6(1.5)(6^2) + 6^4 = 1626. \]

Thus, the moments about the origin are:
\[ \nu_1 = 6, \quad \nu_2 = 37.5, \quad \nu_3 = 243, \quad \nu_4 = 1626. \]

Step 3: Coefficient of Skewness
The coefficient of skewness γ1 is given by:
\[ \gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{0}{(1.5)^{3/2}} = 0. \]

Thus, the distribution is symmetric.

Question 29

Determine the Skewness and Kurtosis for the following data:

Marks 10-20 20-30 30-40 40-50 50-60


No. of students 18 20 30 22 10

Solution
We are given the following frequency distribution:

Class Interval 10-20 20-30 30-40 40-50 50-60


No. of students 18 20 30 22 10

Step 1: Calculate the Mean. The mean x̄ is calculated as:
\[ \bar{x} = \frac{\sum f x}{\sum f} \]
We first calculate f x:
Class Interval f Mid Point(x) f x
10 − 20 18 15 270
20 − 30 20 25 500
30 − 40 30 35 1050
40 − 50 22 45 990
50 − 60 10 55 550
T otal 100 3360

Thus, the mean is:
\[ \bar{x} = \frac{3360}{100} = 33.6 \]
Step 2: Calculate Moments

Class Interval x f (x − x̄) f(x − x̄) f(x − x̄)² f(x − x̄)³ f(x − x̄)⁴
10–20 15 18 −18.6 −334.8 6227.28 −115827.408 2154389.789
20–30 25 20 −8.6 −172 1479.20 −12721.12 109401.632
30–40 35 30 1.4 42 58.80 82.32 115.248
40–50 45 22 11.4 250.8 2859.12 32593.968 371571.235
50–60 55 10 21.4 214 4579.60 98003.44 2097273.616
Total 100 0 15204 2131.2 4732751.52

\[ \mu_2 = \frac{\sum f (x - \bar{x})^2}{\sum f} = \frac{15204}{100} = 152.04 \]
\[ \mu_3 = \frac{\sum f (x - \bar{x})^3}{\sum f} = \frac{2131.2}{100} = 21.312 \]
\[ \mu_4 = \frac{\sum f (x - \bar{x})^4}{\sum f} = \frac{4732751.52}{100} \approx 47327.52 \]
Hence, Skewness
\[ \gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = \frac{21.312}{(152.04)^{3/2}} \approx 0.011 \]
Kurtosis
\[ \gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{47327.52}{(152.04)^2} - 3 \approx 2.047 - 3 = -0.953 \]

Question 30

Find the coefficient of correlation from the following points of observation (1,3),(2,2),(3,5),(4,4),(5,6).

Solution:
To find the coefficient of correlation r for the given points of observation, we use the Pearson correlation
coefficient formula:
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \]

Given Points of Observation:


(1, 3), (2, 2), (3, 5), (4, 4), (5, 6)

x y xy x2 y2
1 3 3 1 9
2 2 4 4 4
3 5 15 9 25
4 4 16 16 16
5 6 30 25 36
Σx = 15 Σy = 20 Σxy = 68 Σx² = 55 Σy² = 90

n = 5 (number of points)

Now, substitute the values into the formula for the Pearson correlation coefficient:
\[
\begin{aligned}
r &= \frac{5 \times 68 - (15 \times 20)}{\sqrt{[5 \times 55 - (15)^2][5 \times 90 - (20)^2]}} \\
&= \frac{340 - 300}{\sqrt{[275 - 225][450 - 400]}} \\
&= \frac{40}{\sqrt{[50][50]}} = \frac{40}{\sqrt{2500}} = \frac{40}{50} = 0.8
\end{aligned}
\]
Answer: The coefficient of correlation r is 0.8. This indicates a strong positive correlation between X and Y .
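
The computational form of the Pearson formula used above translates directly into code. The sketch below uses only the standard library; `pearson_r` is an illustrative name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation via the sums n*Sxy - Sx*Sy over the root of the two spreads."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator


points = [(1, 3), (2, 2), (3, 5), (4, 4), (5, 6)]
xs, ys = zip(*points)
r = pearson_r(xs, ys)
print(r)  # 0.8
```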

Question 31

A random sample of 5 college students is selected and their grades in Mathematics and Statistics
are found to be:

Students 1 2 3 4 5
Mathematics 85 60 73 40 90
Statistics 93 75 65 50 80
Calculate the rank correlation coefficient.

Solution:
Spearman’s Rank Correlation Coefficient
The formula for the rank correlation coefficient ρ is:
\[ \rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \]
where n is the number of data points (in this case, n = 5) and $d_i$ is the difference between the ranks of
corresponding values of Mathematics and Statistics for each student.
Arrange the X (Mathematics) and Y (Statistics) series in descending order and rank them starting from 1:

X-Series: 90 85 73 60 40
Rank: 1 2 3 4 5
Y-Series: 93 80 75 65 50
Rank: 1 2 3 4 5
Calculate the Differences in Ranks and Square Them Now, we calculate di = RankX − RankY and d2i :

Student RankX RankY di = RankX − RankY d2i


1 2 1 1 1
2 4 3 1 1
3 3 4 −1 1
4 5 5 0 0
5 1 2 −1 1
Total Σd_i² = 4

Now, substitute into the formula:
\[ \rho = 1 - \frac{6 \times 4}{5(5^2 - 1)} = 1 - \frac{24}{5(25 - 1)} = 1 - \frac{24}{5 \times 24} = 1 - \frac{24}{120} = 1 - 0.2 = 0.8 \]

The rank correlation coefficient is ρ = 0.8.
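
The ranking and substitution steps above can be sketched in a few lines. The helper names are illustrative, and the ranking helper assumes there are no tied scores (the tied-rank correction of Question 17 is not applied here).

```python
def ranks_descending(scores):
    """Rank 1 for the highest score; assumes all scores are distinct."""
    order = sorted(scores, reverse=True)
    return [order.index(s) + 1 for s in scores]


def spearman_rho(xs, ys):
    """Spearman's rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)) for untied data."""
    n = len(xs)
    rank_x, rank_y = ranks_descending(xs), ranks_descending(ys)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


mathematics = [85, 60, 73, 40, 90]
statistics_marks = [93, 75, 65, 50, 80]

print(ranks_descending(mathematics))       # [2, 4, 3, 5, 1]
print(ranks_descending(statistics_marks))  # [1, 3, 4, 5, 2]
rho = spearman_rho(mathematics, statistics_marks)
print(rho)                                 # 0.8
```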

Question 32

Calculate the coefficient of correlation for the following heights (in inches) of fathers (X) and their
sons (Y ):
Father’s Height (X) 65 66 67 67 68 69 70 72
Son’s Height (Y) 67 68 65 68 72 72 69 71

Solution:
The Pearson correlation coefficient r is given by the formula:
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \]

We are given data for 8 father-son pairs, so n = 8.

X Y X2 Y2 XY
65 67 4225 4489 4355
66 68 4356 4624 4488
67 65 4489 4225 4355
67 68 4489 4624 4556
68 72 4624 5184 4896
69 72 4761 5184 4968
70 69 4900 4761 4830
72 71 5184 5041 5112
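Σx = 544 Σy = 552 Σx² = 37028 Σy² = 38132 Σxy = 37560

Now, substitute the values into the formula:
\[ r = \frac{8 \times 37560 - 544 \times 552}{\sqrt{[8 \times 37028 - (544)^2][8 \times 38132 - (552)^2]}} = \frac{192}{\sqrt{288 \times 352}} \approx 0.603 \]
SHORT-CUT METHOD

X Y U = X − 68 V = Y − 69 U² V² UV
65 67 −3 −2 9 4 6
66 68 −2 −1 4 1 2
67 65 −1 −4 1 16 4
67 68 −1 −1 1 1 1
68 72 0 3 0 9 0
69 72 1 3 1 9 3
70 69 2 0 4 0 0
72 71 4 2 16 4 8
ΣX = 544 ΣY = 552 ΣU = 0 ΣV = 0 ΣU² = 36 ΣV² = 44 ΣUV = 24

\[ \bar{U} = \frac{1}{n} \sum U = 0, \qquad \bar{V} = \frac{1}{n} \sum V = 0 \]
\[ \operatorname{Cov}(U, V) = \frac{1}{n} \sum UV - \bar{U}\bar{V} = \frac{1}{8} \times 24 = 3 \]
\[ \sigma_U^2 = \frac{1}{n} \sum U^2 - \bar{U}^2 = \frac{1}{8} \times 36 = 4.5 \]
\[ \sigma_V^2 = \frac{1}{n} \sum V^2 - \bar{V}^2 = \frac{1}{8} \times 44 = 5.5 \]
Since the correlation coefficient is unaffected by a change of origin,
\[ r = \frac{\operatorname{Cov}(U, V)}{\sigma_U \sigma_V} = \frac{3}{\sqrt{4.5 \times 5.5}} \approx 0.603 \]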

Question 33

Fit a parabolic curve of second degree to the following data:

x 0 1 2 3 4
y 1 1.8 1.3 2.5 6.3

Solution: The equation of the curve is $y = a + bx + cx^2$. The normal equations are:
\[ \sum y = n a + b \sum x + c \sum x^2 \]
\[ \sum xy = a \sum x + b \sum x^2 + c \sum x^3 \]
\[ \sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4 \]

x y x2 x3 x4 xy x2 y
0 1 0 0 0 0 0
1 1.8 1 1 1 1.8 1.8
2 1.3 4 8 16 2.6 5.2
3 2.5 9 27 81 7.5 22.5
4 6.3 16 64 256 25.2 100.8
Σx = 10 Σy = 12.9 Σx² = 30 Σx³ = 100 Σx⁴ = 354 Σxy = 37.1 Σx²y = 130.3

Substituting these sums into the normal equations:
12.9 = 5a + 10b + 30c,
37.1 = 10a + 30b + 100c,
130.3 = 30a + 100b + 354c.
Solving these equations, we get:

a = 1.42, b = −1.07, c = 0.55.

Thus the required equation of the second degree parabola is

y = 1.42 − 1.07x + 0.55x2
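
The three normal equations can also be built and solved programmatically. The sketch below is self-contained and uses a small Gaussian elimination routine; both `solve_linear` and `fit_parabola` are illustrative helpers, not part of the question bank.

```python
def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):   # back substitution
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_parabola(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    s = [sum(x ** p for x in xs) for p in range(5)]   # s[p] = sum of x^p, s[0] = n
    t = [sum(y * x ** p for x, y in zip(xs, ys)) for p in range(3)]
    matrix = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    return solve_linear(matrix, t)


a, b, c = fit_parabola([0, 1, 2, 3, 4], [1, 1.8, 1.3, 2.5, 6.3])
print(round(a, 2), round(b, 2), round(c, 2))  # 1.42 -1.07 0.55
```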

Question 34

Use the method of least squares to find the curve $y = ab^x$ that best fits the following data:

X 2 3 4 5 6
Y 8.3 15.4 33.1 65.2 127.4

Solution: We assume the equation is of the form $y = ab^x$. Taking the common (base-10) logarithm of both sides:
\[ \log y = \log(ab^x) = \log a + x \log b \]
Let $Y = \log y$, $A = \log a$, and $B = \log b$, so the equation becomes:
\[ Y = A + Bx \]

Now, we apply the method of least squares to the transformed equation:


The normal equations are:

\[ \sum Y = nA + B \sum x \]
\[ \sum xY = A \sum x + B \sum x^2 \]
We compute the required sums:

x y Y xY x2
2 8.3 0.9191 1.8382 4
3 15.4 1.1875 3.5626 9
4 33.1 1.5198 6.0793 16
5 65.2 1.8142 9.0712 25
6 127.4 2.1052 12.6310 36
Σx = 20 ΣY = 7.5458 ΣxY = 33.1823 Σx² = 90

We solve the system of equations:

7.5458 = 5A + 20B
33.1823 = 20A + 90B

Solving this, we find
A = 0.3095, B = 0.2999.
Thus,
\[ a = 10^A = 2.0395, \qquad b = 10^B = 1.9948 \]
Therefore, the best-fitting curve is:
\[ y = 2.0395\,(1.9948)^x \]
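
The log-linear fit above can be reproduced in code. This sketch fits a straight line to (x, log10 y) and converts the coefficients back; `fit_exponential` is an illustrative name, and small differences from the hand calculation in the fourth decimal place come from rounding the logarithms in the table.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on log10(y) = log10(a) + x*log10(b)."""
    n = len(xs)
    logs = [math.log10(y) for y in ys]
    sx, sy = sum(xs), sum(logs)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * ly for x, ly in zip(xs, logs))
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # B = log10(b)
    intercept = (sy - slope * sx) / n                    # A = log10(a)
    return 10 ** intercept, 10 ** slope


a, b = fit_exponential([2, 3, 4, 5, 6], [8.3, 15.4, 33.1, 65.2, 127.4])
print(round(a, 2), round(b, 2))
```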

Question 35

Use the method of least squares to find the curve $y = ab^x$ that best fits the following data:

x 2 3 4 5 6
y 144 172.8 207.4 248.8 298.5

Solution: We assume the equation is of the form $y = ab^x$. Taking the common (base-10) logarithm of both sides:
\[ \log y = \log(ab^x) = \log a + x \log b \]
Let $Y = \log y$, $A = \log a$, and $B = \log b$, so the equation becomes:
\[ Y = A + Bx \]

Now, we apply the method of least squares to the transformed equation:


The normal equations are:

\[ \sum Y = nA + B \sum x \]
\[ \sum xY = A \sum x + B \sum x^2 \]

x y Y = log y xY x²
2 144 2.1584 4.3167 4
3 172.8 2.2375 6.7126 9
4 207.4 2.3168 9.2672 16
5 248.8 2.3959 11.9793 25
6 298.5 2.4749 14.8497 36
Σx = 20 ΣY = 11.5835 ΣxY = 47.1255 Σx² = 90

We solve the system of equations:

11.5835 = 5A + 20B
47.1255 = 20A + 90B

Solving this, we find A = 2.0001 and B = 0.0791.
Thus, $a = 10^A \approx 100.02$ and $b = 10^B \approx 1.2$.
Therefore, the best-fitting curve is:
\[ y = 100.02\,(1.2)^x \]

Problem 14:
Use the method of least squares to fit the curve $y = ax^2 + bx$ to the following data:

x 1 2 3 4 5 6 7 8
y 1 1.2 1.8 2.5 3.6 4.7 6.6 9.1

Solution:
The normal equations of $y = ax^2 + bx$ are:
\[ \sum x^2 y = a \sum x^4 + b \sum x^3 \tag{1} \]
\[ \sum xy = a \sum x^3 + b \sum x^2 \tag{2} \]

x y x2 x2 y x3 x4 xy
1 1 1 1 1 1 1
2 1.2 4 4.8 8 16 2.4
3 1.8 9 16.2 27 81 5.4
4 2.5 16 40 64 256 10
5 3.6 25 90 125 625 18
6 4.7 36 169.2 216 1296 28.2
7 6.6 49 323.4 343 2401 46.2
8 9.1 64 582.4 512 4096 72.8
Σx² = 204 Σx²y = 1227 Σx³ = 1296 Σx⁴ = 8772 Σxy = 184

After substituting the sums, the system of equations is:

1227 = 8772a + 1296b
184 = 1296a + 204b

We solve this system and find a ≈ 0.108 and b ≈ 0.217.

Thus, the best-fitting curve is:
\[ y = 0.217x + 0.108x^2 \]

Question 36

Find the exponential curve of the form $K = PV^{\gamma}$ for the following data using the method of least
squares:

V 50 100 150 200
P 135 48 26 17

Solution: We assume the curve is of the form $K = PV^{\gamma}$. Taking the common (base-10) logarithm of both sides:
\[ \log K = \log(PV^{\gamma}) = \log P + \gamma \log V \]
Let $Y = \log P$, $A = \log K$, $B = \gamma$, and $X = \log V$. The equation becomes:
\[ A = Y + BX \implies Y = A - BX \]
We apply the method of least squares to this linear equation. The normal equations are:
\[ \sum Y = nA - B \sum X \]
\[ \sum XY = A \sum X - B \sum X^2 \]

V P X = logV Y = logP XY X2
50 135 1.698 2.13 3.616 2.883
100 48 2 1.681 3.362 4
150 26 2.176 1.414 3.077 4.735
200 17 2.301 1.23 2.830 5.295
ΣX = 8.175 ΣY = 6.455 ΣXY = 12.884 ΣX² = 16.911

We substitute these values into the normal equations:

6.455 = 4A − 8.175B
12.884 = 8.175A − 16.911B

Solving this system of equations we get A = 4.713 and B = 1.516, so that
$K = 10^A = 51641.6$ and γ = B = 1.516.
Thus, the fitted curve is
\[ PV^{1.516} = 51641.6 \]

Question 37
Using the method of least squares, fit the curve $y = c_0 x + \frac{c_1}{\sqrt{x}}$ to the following data:

x 0.2 0.3 0.5 1 2


y 16 14 11 6 3

Solution:
The equation of the curve is $y = c_0 x + \frac{c_1}{\sqrt{x}}$. The normal equations are:
\[ \sum xy = c_0 \sum x^2 + c_1 \sum \sqrt{x} \]
\[ \sum \frac{y}{\sqrt{x}} = c_0 \sum \sqrt{x} + c_1 \sum \frac{1}{x} \]

x y √x x² xy y/√x 1/x
0.2 16 0.447 0.04 3.2 35.777 5
0.3 14 0.548 0.09 4.2 25.560 3.333
0.5 11 0.707 0.25 5.5 15.556 2
1 6 1 1 6 6 1
2 3 1.414 4 6 2.121 0.5
Σ√x = 4.116 Σx² = 5.38 Σxy = 24.9 Σ(y/√x) = 85.015 Σ(1/x) = 11.833

Substituting the values:

24.9 = 5.38 c0 + 4.116 c1,
85.015 = 4.116 c0 + 11.833 c1.

Solving the equations gives:

c1 = 7.60, c0 = −1.18.

The fitted curve is:
\[ y = \frac{7.60}{\sqrt{x}} - 1.18x \]

Question 38

Using the method of least squares, fit the curve f (x) = a + bx + cx2 to the following data:

x 1 2 3 4 5
f (x) 1 1.2 1.8 2.5 3.6

Solution: The equation of the curve is $y = a + bx + cx^2$. The normal equations are:
\[ \sum y = n a + b \sum x + c \sum x^2 \]
\[ \sum xy = a \sum x + b \sum x^2 + c \sum x^3 \]
\[ \sum x^2 y = a \sum x^2 + b \sum x^3 + c \sum x^4 \]

x y x² x³ x⁴ xy x²y
1 1 1 1 1 1 1
2 1.2 4 8 16 2.4 4.8
3 1.8 9 27 81 5.4 16.2
4 2.5 16 64 256 10 40
5 3.6 25 125 625 18 90
Σx = 15 Σy = 10.1 Σx² = 55 Σx³ = 225 Σx⁴ = 979 Σxy = 36.8 Σx²y = 152

Substituting these sums into the normal equations:
10.1 = 5a + 15b + 55c,
36.8 = 15a + 55b + 225c,
152 = 55a + 225b + 979c.
Solving these equations, we get:

a = 1.02, b ≈ −0.164, c ≈ 0.136.

Thus the required equation of the second degree parabola is
\[ y = 1.02 - 0.164x + 0.136x^2 \]

Question 39

Using the method of least squares, fit a curve of the form:
\[ y = ae^{bx} \]
to the following data:

x 1 2 3 4 5
y 1 1.2 1.8 2.5 3.6

Solution
Taking the natural logarithm of both sides:
\[ y = ae^{bx} \implies \ln y = \ln a + bx \ln e = \ln a + bx \quad (\because \ln e = 1) \]
Let Y = ln y and A = ln a, so:
\[ Y = A + bx. \]
The normal equations are:

\[ \sum Y = nA + b \sum x \]
\[ \sum xY = A \sum x + b \sum x^2 \]

x y Y x2 xY
1 1 0.000 1 0.000
2 1.2 0.182 4 0.365
3 1.8 0.588 9 1.763
4 2.5 0.916 16 3.665
5 3.6 1.280 25 6.404
Σx = 15 ΣY = 2.967 Σx² = 55 ΣxY = 12.197

Using these sums, the normal equations become:

5A + 15b = 2.967,
15A + 55b = 12.197.

Solving the equations, we get:
A = −0.3953, b = 0.3296.

Converting A to a:
\[ a = e^{A} = 0.6735. \]

The fitted curve is:
\[ y = 0.6735\,e^{0.3296x}. \]

Question 40

If 4x − 5y + 33 = 0 and 20x − 9y = 107 are the lines of regression of y on x and of x on y
respectively, find the mean values of x and y, the coefficient of correlation, and the standard deviation
of y if the variance of x is 9.

Solution:
(i) Since both the lines of regression pass through the point (X̄, Ȳ ), we have

4X̄ − 5Ȳ = −33,


20X̄ − 9Ȳ = 107.

Solving, we get X̄ = 13, Ȳ = 17


(ii) Let

4X − 5Y + 33 = 0 be line of regression Y on X
and 20X − 9Y = 107 be line of regression X on Y

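These equations can be put in the form:

$$Y = \frac{4}{5}X + \frac{33}{5}, \qquad X = \frac{9}{20}Y + \frac{107}{20}$$

∴ $b_{YX}$ = regression coefficient of Y on X = $\frac{4}{5}$
and $b_{XY}$ = regression coefficient of X on Y = $\frac{9}{20}$.

Hence $$r^2 = b_{YX} \cdot b_{XY} = \frac{4}{5} \cdot \frac{9}{20} = \frac{9}{25}$$

∴ $r = \pm\frac{3}{5} = \pm 0.6$

But since both the regression coefficients are positive, we take

r = +0.6

(iii) We have $b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies \frac{4}{5} = \frac{3}{5} \times \frac{\sigma_Y}{3}$ $(\because \sigma_X^2 = 9 \implies \sigma_X = 3)$
Hence $\sigma_Y = 4$.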
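The three parts of this solution follow a fixed recipe, sketched below with exact fractions. The function name `analyse` and the `(p, q, k)` encoding of a line $px + qy = k$ are assumptions made for illustration.

```python
from fractions import Fraction
import math


def analyse(line_yx, line_xy, var_x):
    """Each line is (p, q, k), meaning p*x + q*y = k.

    line_yx is assumed to be the regression of y on x, line_xy that of x on y.
    """
    (p1, q1, k1), (p2, q2, k2) = line_yx, line_xy
    det = Fraction(p1 * q2 - p2 * q1)
    x_bar = (k1 * q2 - k2 * q1) / det          # Cramer's rule for the means
    y_bar = (p1 * k2 - p2 * k1) / det
    b_yx = Fraction(-p1, q1)                   # slope after solving for y
    b_xy = Fraction(-q2, p2)                   # slope after solving for x
    # r takes the common sign of the two regression coefficients.
    r = math.copysign(math.sqrt(float(b_yx * b_xy)), float(b_yx))
    sigma_y = float(b_yx) * math.sqrt(var_x) / r
    return x_bar, y_bar, r, sigma_y


# 4x - 5y = -33 (y on x) and 20x - 9y = 107 (x on y), variance of x = 9.
x_bar, y_bar, r, sigma_y = analyse((4, -5, -33), (20, -9, 107), var_x=9)
print(x_bar, y_bar, r, sigma_y)
```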

Question 41

Problem Statement
In a partially destroyed laboratory record of an analysis of correlation data, the following results are
legible:
Variance of x: $\sigma_x^2 = 9$.
The regression equations are:

8x − 10y + 66 = 0 and 40x − 18y = 214.

Calculate the following:


(a) The mean values of x and y.
(b) The standard deviation of y.
(c) The coefficient of correlation between x and y.

Solution
(i) Since both the lines of regression pass through the point (X̄, Ȳ ), we have

8X̄ − 10Ȳ = −66,


40X̄ − 18Ȳ = 214.

Solving, we get X̄ = 13, Ȳ = 17


(ii) Let

8X − 10Y + 66 = 0 be line of regression Y on X


and 40X − 18Y = 214 be line of regression X on Y

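These equations can be put in the form:

$$Y = \frac{8}{10}X + \frac{66}{10}, \qquad X = \frac{18}{40}Y + \frac{214}{40}$$

∴ $b_{YX}$ = regression coefficient of Y on X = $\frac{8}{10} = \frac{4}{5}$
and $b_{XY}$ = regression coefficient of X on Y = $\frac{18}{40} = \frac{9}{20}$.

Hence $$r^2 = b_{YX} \cdot b_{XY} = \frac{4}{5} \cdot \frac{9}{20} = \frac{9}{25}$$

∴ $r = \pm\frac{3}{5} = \pm 0.6$

But since both the regression coefficients are positive, we take

r = +0.6

(iii) We have $b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies \frac{4}{5} = \frac{3}{5} \times \frac{\sigma_Y}{3}$ $(\because \sigma_X^2 = 9 \implies \sigma_X = 3)$
Hence $\sigma_Y = 4$.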
Question 42

Two lines of regression are given by 5x − 2y = 52 and 3x − 8y = 12, and $\sigma_x^2 = 12$.


Calculate
(a) The mean value of x and y,
(b) The variance of y,
(c) The coefficient of correlation between x and y.

Solution:
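Rearrange the Regression Equations
If 5x − 2y = 52 were the line of y on x and 3x − 8y = 12 the line of x on y, the regression coefficients would be $\frac{5}{2}$ and $\frac{8}{3}$, whose product $\frac{20}{3}$ exceeds 1. Since $b_{yx} \cdot b_{xy} = r^2 \le 1$, the roles must be the other way round:
$$5x - 2y = 52 \;\Rightarrow\; x = \frac{2}{5}y + \frac{52}{5} \quad \text{(x on y)}$$
and
$$3x - 8y = 12 \;\Rightarrow\; y = \frac{3}{8}x - \frac{3}{2} \quad \text{(y on x)}$$

Calculate Mean Values of x and y

Both regression lines pass through $(\bar{x}, \bar{y})$, so we substitute $x = \bar{x}$ and $y = \bar{y}$ in the two equations and solve them simultaneously. Multiplying the first equation by 4 and subtracting the second gives $17\bar{x} = 196$.
Thus, the mean values are:
$$\bar{x} = \frac{196}{17} \approx 11.529, \qquad \bar{y} = \frac{48}{17} \approx 2.824$$

Calculate the Coefficient of Correlation

The formula for the correlation coefficient r is:
$$r = \pm\sqrt{b_{xy} \times b_{yx}},$$

where $b_{xy}$ is the regression coefficient of x on y, $b_{yx}$ is the regression coefficient of y on x, and r takes the common sign of the two coefficients.
From the regression equations: $b_{xy} = \frac{2}{5}$, $b_{yx} = \frac{3}{8}$.
Thus:
$$r = \sqrt{\frac{2}{5} \times \frac{3}{8}} = \sqrt{\frac{3}{20}} \approx 0.3873$$

Both coefficients are positive, so the coefficient of correlation is $r \approx +0.387$.

Calculate the Variance of y
We are given that the variance of x is $\sigma_x^2 = 12$. Since $b_{yx} = r\,\frac{\sigma_y}{\sigma_x}$ and $b_{xy} = r\,\frac{\sigma_x}{\sigma_y}$, dividing the first by the second gives:
$$\frac{b_{yx}}{b_{xy}} = \frac{\sigma_y^2}{\sigma_x^2}$$

Substituting the known values:
$$\sigma_y^2 = 12 \times \frac{3/8}{2/5} = 12 \times \frac{15}{16} = 11.25$$

Thus, the variance of y is 11.25, and $\sigma_y \approx 3.354$.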
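Because $b_{yx} \cdot b_{xy} = r^2$ can never exceed 1, that condition can be used to decide which of two given lines is the regression of y on x. The sketch below, with the hypothetical helper `regression_summary`, tries both assignments and keeps the consistent one.

```python
import math


def regression_summary(line1, line2, var_x):
    """Lines are (p, q, k) for p*x + q*y = k.

    Try both role assignments and keep the one whose coefficients
    satisfy 0 <= b_yx * b_xy <= 1.
    """
    for yx, xy in ((line1, line2), (line2, line1)):
        b_yx = -yx[0] / yx[1]      # solve the y-on-x line for y
        b_xy = -xy[1] / xy[0]      # solve the x-on-y line for x
        if 0 <= b_yx * b_xy <= 1:
            r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
            var_y = var_x * b_yx / b_xy   # b_yx / b_xy = var_y / var_x
            return b_yx, b_xy, r, var_y
    raise ValueError("no consistent assignment of the two lines")


# 5x - 2y = 52 and 3x - 8y = 12, with variance of x equal to 12.
b_yx, b_xy, r, var_y = regression_summary((5, -2, 52), (3, -8, 12), 12)
print(b_yx, b_xy, round(r, 4), var_y)
```

A computed value of $|r|$ greater than 1 is always a sign that the two lines have been assigned the wrong roles.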

Question 43

The following table gives the age (x) in years of cars and annual maintenance cost (y) in hundred
rupees.

x 1 3 5 7 9
y 15 18 21 23 22
Calculate the maintenance cost for a 4-year-old car after finding the regression equation.

Solution
The regression equation is of the form:
y = a + bx

x   y    $xy$   $x^2$   $y^2$
1   15   15     1       225
3   18   54     9       324
5   21   105    25      441
7   23   161    49      529
9   22   198    81      484

$\sum x = 25$, $\sum y = 99$, $\sum xy = 533$, $\sum x^2 = 165$, $\sum y^2 = 2003$

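Calculate $\bar{x}$ and $\bar{y}$
$$\bar{x} = \frac{\sum x}{n} = \frac{25}{5} = 5$$
$$\bar{y} = \frac{\sum y}{n} = \frac{99}{5} = 19.8$$

Calculate $b_{yx}$ and $b_{xy}$

$$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{5(533) - (25)(99)}{5(165) - (25)^2} = \frac{190}{200} = 0.95$$
$$b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} = \frac{5(533) - (25)(99)}{5(2003) - (99)^2} = \frac{190}{214} \approx 0.888$$

Regression Equation y on x
$$y - \bar{y} = b_{yx}(x - \bar{x}) \implies y - 19.8 = 0.95(x - 5)$$
$$y = 0.95x + 15.05$$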

Predict Maintenance Cost for a 4-Year-Old Car (x = 4)
y = 15.05 + 0.95(4) = 18.85 (hundred rupees).
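The fit and the prediction can be checked with a short script. This is an illustrative sketch; `line_y_on_x` is a hypothetical helper implementing the raw-sum formula for $b_{yx}$ used above.

```python
def line_y_on_x(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    slope = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    intercept = sy / n - slope * sx / n     # line passes through the means
    return intercept, slope


age = [1, 3, 5, 7, 9]
cost = [15, 18, 21, 23, 22]                 # in hundred rupees
a, b = line_y_on_x(age, cost)
predicted = a + b * 4                       # cost for a 4-year-old car
print(round(a, 2), round(b, 2), round(predicted, 2))
```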

Question 44

From the following data, determine the equations of the line of regression of y on x and x on y:

x 6 2 10 4 8
y 9 11 5 8 7

Solution
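The regression equation of y on x is:
$$y - \bar{y} = b_{yx}(x - \bar{x}), \quad \text{where} \quad b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$$

The regression equation of x on y is:
$$x - \bar{x} = b_{xy}(y - \bar{y}), \quad \text{where} \quad b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2}$$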

Step 1: Calculate Required Sums

x    y    $xy$   $x^2$   $y^2$
6    9    54     36      81
2    11   22     4       121
10   5    50     100     25
4    8    32     16      64
8    7    56     64      49

$\sum x = 30$, $\sum y = 40$, $\sum xy = 214$, $\sum x^2 = 220$, $\sum y^2 = 340$

Calculate Means
$$\bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6.0, \qquad \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8.0$$

Calculate Regression Coefficients

$$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{5 \times 214 - 30 \times 40}{5 \times 220 - 30^2} = \frac{-130}{200} = -0.65$$
$$b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} = \frac{5 \times 214 - 30 \times 40}{5 \times 340 - 40^2} = \frac{-130}{100} = -1.30$$

Write Regression Equations
Regression of y on x:
y − 8.00 = −0.65(x − 6.00)
y = −0.65x + 11.9
Regression of x on y:
x − 6.00 = −1.30(y − 8.00)

x = −1.3y + 16.4
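Both lines can be obtained together from deviations about the means, which is algebraically equivalent to the raw-sum formulas used in the solution. The helper `regression_lines` below is a hypothetical name used for this sketch.

```python
def regression_lines(xs, ys):
    """Return ((b_yx, c_yx), (b_xy, c_xy)) for the lines
    y = b_yx * x + c_yx   and   x = b_xy * y + c_xy."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b_yx, b_xy = sxy / sxx, sxy / syy
    return (b_yx, y_bar - b_yx * x_bar), (b_xy, x_bar - b_xy * y_bar)


(b_yx, c_yx), (b_xy, c_xy) = regression_lines([6, 2, 10, 4, 8], [9, 11, 5, 8, 7])
print(b_yx, c_yx, b_xy, c_xy)
```

Both slopes share the sign of $\sum (x - \bar{x})(y - \bar{y})$, which is negative for this data, so both fitted lines slope downward.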

Question 45

Fit a parabolic curve of regression of y on x to the following data:

x 1.0 1.5 2.0 2.5 3.0 3.5 4.0


y 1.1 1.3 1.6 2.0 2.7 3.4 4.1

Solution
The parabolic regression curve is of the form:

$$y = a + bx + cx^2$$

The normal equations for fitting a parabola are:

$$\sum y = na + b\sum x + c\sum x^2$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$$

x     y     $xy$    $x^2$    $x^2 y$   $x^3$     $x^4$
1     1.1   1.1     1        1.1       1         1
1.5   1.3   1.95    2.25     2.925     3.375     5.0625
2     1.6   3.2     4        6.4       8         16
2.5   2     5       6.25     12.5      15.625    39.0625
3     2.7   8.1     9        24.3      27        81
3.5   3.4   11.9    12.25    41.65     42.875    150.0625
4     4.1   16.4    16       65.6      64        256

$\sum x = 17.5$, $\sum y = 16.2$, $\sum xy = 47.65$, $\sum x^2 = 50.75$, $\sum x^2 y = 154.475$, $\sum x^3 = 161.875$, $\sum x^4 = 548.1875$

Substitute Sums and Solve Normal Equations

16.2 = 7a + 17.5b + 50.75c
47.65 = 17.5a + 50.75b + 161.875c
154.475 = 50.75a + 161.875b + 548.1875c
Solving these simultaneous equations gives a = 1.036, b = −0.193, and c = 0.243.

Final Parabolic Equation
Substituting these values into the equation:

$$y = 1.036 - 0.193x + 0.243x^2$$
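The 3×3 system of normal equations can be solved by Cramer's rule. This is a sketch using the sums from the table; `det3` and `cramer3` are hypothetical helpers, and any other elimination method would serve equally well.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def cramer3(m, rhs):
    """Solve m x = rhs for a 3x3 system using Cramer's rule."""
    d = det3(m)
    solution = []
    for col in range(3):
        mk = [row[:] for row in m]
        for r in range(3):
            mk[r][col] = rhs[r]    # replace one column by the right-hand side
        solution.append(det3(mk) / d)
    return solution


# Coefficient matrix and right-hand side built from the column sums (n = 7).
normal = [[7, 17.5, 50.75],
          [17.5, 50.75, 161.875],
          [50.75, 161.875, 548.1875]]
a, b, c = cramer3(normal, [16.2, 47.65, 154.475])
print(round(a, 3), round(b, 3), round(c, 3))
```

A quick sanity check on any result is to evaluate the fitted parabola at x = 4 and compare it with the observed value 4.1.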

Question 46

Find the multiple regression equation of X1 on X2 and X3 from the data given below:

X1 3 5 6 8 12 10
X2 10 10 5 7 5 2
X3 20 25 15 16 15 2

Solution
The multiple regression equation is given by:

$$X_1 = a + bX_2 + cX_3$$

To determine a, b, and c, we use the normal equations:

$$\sum X_1 = na + b\sum X_2 + c\sum X_3$$
$$\sum X_1 X_2 = a\sum X_2 + b\sum X_2^2 + c\sum X_2 X_3$$
$$\sum X_1 X_3 = a\sum X_3 + b\sum X_2 X_3 + c\sum X_3^2$$

$X_1$   $X_2$   $X_3$   $X_1 X_2$   $X_2^2$   $X_2 X_3$   $X_1 X_3$   $X_3^2$
3       10      20      30          100       200         60          400
5       10      25      50          100       250         125         625
6       5       15      30          25        75          90          225
8       7       16      56          49        112         128         256
12      5       15      60          25        75          180         225
10      2       2       20          4         4           20          4

$\sum X_1 = 44$, $\sum X_2 = 39$, $\sum X_3 = 93$, $\sum X_1 X_2 = 246$, $\sum X_2^2 = 303$, $\sum X_2 X_3 = 716$, $\sum X_1 X_3 = 603$, $\sum X_3^2 = 1735$

Solve Normal Equations

Substitute the calculated sums into the normal equations and solve for a, b, and c.

44 = 6a + 39b + 93c

246 = 39a + 303b + 716c

603 = 93a + 716b + 1735c

Write Final Equation
Substitute the values a = 12.360, b = −1.398, and c = 0.262 into the regression equation:

$$X_1 = 12.360 - 1.398X_2 + 0.262X_3$$
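One way to verify these coefficients is to work with deviations from the means, which reduces the three normal equations to a 2×2 system in b and c. The sketch below assumes that reformulation; `fit_two_predictors` is a hypothetical helper name.

```python
def fit_two_predictors(x1, x2, x3):
    """Least-squares fit of x1 = a + b*x2 + c*x3 using deviations from means."""
    n = len(x1)
    m1, m2, m3 = sum(x1) / n, sum(x2) / n, sum(x3) / n

    def s(u, mu, v, mv):
        # Sum of products of deviations of u and v about their means.
        return sum((p - mu) * (q - mv) for p, q in zip(u, v))

    s22, s33, s23 = s(x2, m2, x2, m2), s(x3, m3, x3, m3), s(x2, m2, x3, m3)
    s12, s13 = s(x1, m1, x2, m2), s(x1, m1, x3, m3)
    det = s22 * s33 - s23 ** 2
    b = (s12 * s33 - s13 * s23) / det
    c = (s13 * s22 - s12 * s23) / det
    a = m1 - b * m2 - c * m3           # the fitted plane passes through the means
    return a, b, c


a, b, c = fit_two_predictors([3, 5, 6, 8, 12, 10],
                             [10, 10, 5, 7, 5, 2],
                             [20, 25, 15, 16, 15, 2])
print(round(a, 3), round(b, 3), round(c, 3))
```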

Question 47

For the data given, determine the lines of regression:

x 2 4 6 8 10
y 5 7 9 8 11

Solution
x    y    $xy$   $x^2$   $y^2$
2    5    10     4       25
4    7    28     16      49
6    9    54     36      81
8    8    64     64      64
10   11   110    100     121

$\sum x = 30$, $\sum y = 40$, $\sum xy = 266$, $\sum x^2 = 220$, $\sum y^2 = 340$
Calculate Means
$$\bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6.0, \qquad \bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8.0$$

Regression of y on x:
The regression equation of y on x is $y - \bar{y} = b_{yx}(x - \bar{x})$, where:
$$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{5 \times 266 - 30 \times 40}{5 \times 220 - 30^2} = \frac{130}{200} = 0.65$$

Regression of x on y:
The regression equation of x on y is $x - \bar{x} = b_{xy}(y - \bar{y})$, where:
$$b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} = \frac{5 \times 266 - 30 \times 40}{5 \times 340 - (40)^2} = \frac{130}{100} = 1.3$$

Step 3: Write the Regression Equations
1. Regression of y on x:

y − 8 = 0.65(x − 6) = 0.65x − 3.9

y = 0.65x + 4.1

2. Regression of x on y:

x − 6 = 1.3(y − 8) = 1.3y − 10.4

x = 1.3y − 4.4

Question 48

If 3x + 2y = 26 and 6x + y = 31 are the two lines of regression, find (i) the mean values of x and y, (ii) the coefficient of correlation between x and y, and (iii) the variance of y if the variance of x is 9. (2024-25)
7 Marks

Solution:
(i) Since both the lines of regression pass through the point (X̄, Ȳ ), we have

3X̄ + 2Ȳ = 26,


6X̄ + Ȳ = 31.

Solving, we get X̄ = 4, Ȳ = 7
(ii) Let

3X + 2Y = 26 be line of regression Y on X
and 6X + Y = 31 be line of regression X on Y

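These equations can be put in the form:

$$Y = \frac{26}{2} - \frac{3}{2}X, \qquad X = \frac{31}{6} - \frac{1}{6}Y$$

∴ $b_{YX}$ = regression coefficient of Y on X = $-\frac{3}{2}$
and $b_{XY}$ = regression coefficient of X on Y = $-\frac{1}{6}$.

Hence $$r^2 = b_{YX} \cdot b_{XY} = \left(-\frac{3}{2}\right) \cdot \left(-\frac{1}{6}\right) = \frac{1}{4}$$

∴ $r = \pm\frac{1}{2} = \pm 0.5$

But since both the regression coefficients are negative, we take

r = −0.5

(iii) We have $b_{YX} = r \cdot \frac{\sigma_Y}{\sigma_X} \implies -\frac{3}{2} = -\frac{1}{2} \times \frac{\sigma_Y}{3}$ $(\because \sigma_X^2 = 9 \implies \sigma_X = 3)$
Hence $\sigma_Y = 9$, so the variance of y is $\sigma_Y^2 = 81$.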

Question 49

Fit the curve $y = ae^{bx}$ to the following data:


x 2 4 6 8 10
y 4.077 11.084 30.128 81.897 222.62
(2024-25) 7 marks

Taking the natural logarithm of both sides:

$$y = ae^{bx} \implies \ln y = \ln a + bx \ln e = \ln a + bx \quad (\because \ln e = 1)$$

Let $Y = \ln y$ and $A = \ln a$, so that
$$Y = A + bx.$$
The normal equations are:

$$\sum Y = nA + b\sum x$$
$$\sum xY = A\sum x + b\sum x^2$$

x    y        $Y = \ln y$   $x^2$   $xY$
2    4.077    1.4054        4       2.8107
4    11.084   2.4055        16      9.6220
6    30.128   3.4055        36      20.4327
8    81.897   4.4055        64      35.2437
10   222.62   5.4055        100     54.0547

$\sum x = 30$, $\sum Y = 17.0272$, $\sum x^2 = 220$, $\sum xY = 122.1638$

Substituting these sums, the normal equations become:

5A + 30b = 17.0272,
30A + 220b = 122.1638.

Solving these equations, we get:
A = 0.41, b = 0.50.

Hence
$a = e^A = 1.50$.

The required curve is:
$$y = 1.5e^{0.5x}$$

Question 50

Calculate all four moments about mean and also Skewness and Kurtosis.

Marks 0 − 10 10 − 20 20 − 30 30 − 40 40 − 50 50 − 60 60 − 70
No of Students 1 6 10 15 11 7 10

(2024-25) 7 marks

Solution
We are given the following frequency distribution:

Marks 0 − 10 10 − 20 20 − 30 30 − 40 40 − 50 50 − 60 60 − 70
No of Students 1 6 10 15 11 7 10

The mean $\bar{x}$ is calculated as:

$$\bar{x} = \frac{\sum fx}{\sum f}$$

We first calculate $fx$:
Class Interval   f    Mid Point (x)   fx
0 − 10           1    5               5
10 − 20          6    15              90
20 − 30          10   25              250
30 − 40          15   35              525
40 − 50          11   45              495
50 − 60          7    55              385
60 − 70          10   65              650
Total            60                   2400

Thus, the mean is:

$$\bar{x} = \frac{2400}{60} = 40$$
Step 2: Calculate Moments

Class Interval   x    f    $x - \bar{x}$   $f(x - \bar{x})$   $f(x - \bar{x})^2$   $f(x - \bar{x})^3$   $f(x - \bar{x})^4$
0 − 10           5    1    −35             −35                1225                 −42875               1500625
10 − 20          15   6    −25             −150               3750                 −93750               2343750
20 − 30          25   10   −15             −150               2250                 −33750               506250
30 − 40          35   15   −5              −75                375                  −1875                9375
40 − 50          45   11   5               55                 275                  1375                 6875
50 − 60          55   7    15              105                1575                 23625                354375
60 − 70          65   10   25              250                6250                 156250               3906250
Total                 60                   0                  15700                9000                 8627500

$$\mu_1 = \frac{\sum f(x - \bar{x})}{\sum f} = \frac{0}{60} = 0$$
$$\mu_2 = \frac{\sum f(x - \bar{x})^2}{\sum f} = \frac{15700}{60} = 261.67$$
$$\mu_3 = \frac{\sum f(x - \bar{x})^3}{\sum f} = \frac{9000}{60} = 150$$
$$\mu_4 = \frac{\sum f(x - \bar{x})^4}{\sum f} = \frac{8627500}{60} = 143791.67$$

Hence, Skewness
$$\gamma_1 = \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} = \frac{150}{(261.67)^{3/2}} = 0.035$$
Kurtosis
$$\gamma_2 = \beta_2 - 3 = \frac{\mu_4}{\mu_2^2} - 3 = \frac{143791.67}{(261.67)^2} - 3 = -0.90$$
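The moment table can be reproduced programmatically. This is a minimal sketch, assuming class mid-points as representative values; `central_moments` is a hypothetical helper.

```python
def central_moments(mids, freqs):
    """Return the mean and the first four central moments of grouped data."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(mids, freqs)) / n
    mu = [sum(f * (x - mean) ** r for x, f in zip(mids, freqs)) / n
          for r in range(1, 5)]
    return mean, mu


mean, (mu1, mu2, mu3, mu4) = central_moments(
    [5, 15, 25, 35, 45, 55, 65], [1, 6, 10, 15, 11, 7, 10])
gamma1 = mu3 / mu2 ** 1.5          # coefficient of skewness
gamma2 = mu4 / mu2 ** 2 - 3        # excess kurtosis, beta_2 - 3
print(mean, mu1, round(mu2, 2), mu3, round(mu4, 2))
print(round(gamma1, 3), round(gamma2, 2))
```

A small positive $\gamma_1$ and a negative $\gamma_2$ indicate a nearly symmetric, platykurtic distribution.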

Question 51

Given moments about working mean 28.5: $\mu'_1 = 0.294$, $\mu'_2 = 7.144$, $\mu'_3 = 42.409$, $\mu'_4 = 454.98$. Find the central moments and analyze skewness and kurtosis.

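Solution: Central moments:

$$\mu_2 = \mu'_2 - (\mu'_1)^2 = 7.144 - (0.294)^2 = 7.057$$
$$\mu_3 = \mu'_3 - 3\mu'_2\mu'_1 + 2(\mu'_1)^3 = 42.409 - 3(7.144)(0.294) + 2(0.294)^3 = 36.159$$
$$\mu_4 = \mu'_4 - 4\mu'_3\mu'_1 + 6\mu'_2(\mu'_1)^2 - 3(\mu'_1)^4 = 454.98 - 4(42.409)(0.294) + 6(7.144)(0.294)^2 - 3(0.294)^4 = 408.79$$

Shape measures:
$$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(36.159)^2}{(7.057)^3} \approx 3.72 \quad \text{(positive skew, since } \mu_3 > 0\text{)}$$
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{408.79}{(7.057)^2} \approx 8.21 \quad \text{(leptokurtic, since } \beta_2 > 3\text{)}$$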
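The conversion from moments about a working mean to central moments is mechanical, as the sketch below illustrates. The function name `central_from_raw` is an assumption for illustration.

```python
def central_from_raw(m1, m2, m3, m4):
    """Convert moments about an arbitrary origin into central moments."""
    mu2 = m2 - m1 ** 2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1 ** 2 - 3 * m1 ** 4
    return mu2, mu3, mu4


mu2, mu3, mu4 = central_from_raw(0.294, 7.144, 42.409, 454.98)
beta1 = mu3 ** 2 / mu2 ** 3
beta2 = mu4 / mu2 ** 2
mean = 28.5 + 0.294                # working mean plus first raw moment
print(round(mu2, 3), round(mu3, 3), round(mu4, 2))
print(round(beta1, 2), round(beta2, 2), mean)
```

The last line also recovers the actual mean, which equals the working mean plus $\mu'_1$.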

Question 52

Obtain a relation of the form $y = ab^x$ for the following data by the method of least squares:
x 2 3 4 5 6
y 8.3 15.4 33.1 65.2 127.4

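Sol. The curve to be fitted is $y = ab^x$. Taking logarithms to base 10 of both sides gives $\log_{10} y = \log_{10} a + x\log_{10} b$,
or
$$Y = A + Bx,$$
where
$A = \log_{10} a$, $B = \log_{10} b$ and $Y = \log_{10} y$.

∴ The normal equations are
$$\sum Y = 5A + B\sum x$$
and
$$\sum xY = A\sum x + B\sum x^2.$$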
x   y       $Y = \log_{10} y$   $x^2$   $xY$
2   8.3     0.9191              4       1.8382
3   15.4    1.1872              9       3.5616
4   33.1    1.5198              16      6.0792
5   65.2    1.8142              25      9.0710
6   127.4   2.1052              36      12.6312

$\sum x = 20$, $\sum Y = 7.5455$, $\sum x^2 = 90$, $\sum xY = 33.1812$
Substituting the above values, we get

7.5455 = 5A + 20B and 33.1812 = 20A + 90B.

On solving,
A = 0.31 and B = 0.3, so that

a = antilog A = 2.04 and b = antilog B = 1.995.

Hence the required curve is

$$y = 2.04(1.995)^x$$

Question 53

For 10 observations on price (x) and supply (y), the following data were obtained (in appropriate units):
$$\sum x = 130, \quad \sum x^2 = 2288, \quad \sum y = 220, \quad \sum y^2 = 5506, \quad \sum xy = 3467$$
Obtain the two lines of regression and estimate the supply when the price is 16 units.

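Sol. Here, n = 10, $\bar{x} = \frac{\sum x}{n} = 13$ and $\bar{y} = \frac{\sum y}{n} = 22$.
Regression coefficient of y on x is

$$b_{yx} = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} = \frac{(10 \times 3467) - (130 \times 220)}{(10 \times 2288) - (130)^2} = 1.015$$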

∴ Regression line of y on x is

y − ȳ = byx (x − x̄)

y − 22 = 1.015(x − 13)

⇒ y = 1.015x + 8.805

Regression coefficient of x on y is

$$b_{xy} = \frac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2} = \frac{(10 \times 3467) - (130 \times 220)}{(10 \times 5506) - (220)^2} = 0.9114$$

Regression line of x on y is

x − x̄ = bxy (y − ȳ)

x − 13 = 0.9114(y − 22)

x = 0.9114y − 7.0508

Since we are to estimate supply (y) when price (x) is given therefore we are to use regression line of y on
x here.
When x = 16 units,

y = 1.015(16) + 8.805 = 25.045 units
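When only the summary sums are available, both coefficients follow directly from them. The sketch below uses the hypothetical helper `lines_from_sums` and carries full precision, so the estimate may differ from the hand value in the third decimal place.

```python
def lines_from_sums(n, sx, sy, sxx, syy, sxy):
    """Means and both regression coefficients from summary sums."""
    x_bar, y_bar = sx / n, sy / n
    num = n * sxy - sx * sy
    b_yx = num / (n * sxx - sx ** 2)
    b_xy = num / (n * syy - sy ** 2)
    return x_bar, y_bar, b_yx, b_xy


x_bar, y_bar, b_yx, b_xy = lines_from_sums(10, 130, 220, 2288, 5506, 3467)
# Estimating y from x uses the regression line of y on x.
supply_at_16 = y_bar + b_yx * (16 - x_bar)
print(round(b_yx, 4), round(b_xy, 4), round(supply_at_16, 3))
```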

Q2(c) The following data relate to variables X and Y:


X 2 4 6 8 10
Y 5 9 13 17 21
Find the correlation coefficient and obtain the regression lines of Y on X and X on Y. CO3 K3 Marks
Sol. Given Data:
• X : {2, 4, 6, 8, 10}
• Y : {5, 9, 13, 17, 21}
• n=5

1. Computation Table
X    Y    $X^2$   $Y^2$   $XY$
2    5    4       25      10
4    9    16      81      36
6    13   36      169     78
8    17   64      289     136
10   21   100     441     210

$\sum X = 30$, $\sum Y = 65$, $\sum X^2 = 220$, $\sum Y^2 = 1005$, $\sum XY = 470$
Means:
$$\bar{X} = \frac{30}{5} = 6, \qquad \bar{Y} = \frac{65}{5} = 13$$

2. Correlation Coefficient (r)


Using the formula:
$$r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}$$
$$r = \frac{5(470) - (30)(65)}{\sqrt{[5(220) - 30^2][5(1005) - 65^2]}} = \frac{2350 - 1950}{\sqrt{[1100 - 900][5025 - 4225]}} = \frac{400}{\sqrt{200 \times 800}} = \frac{400}{400} = 1$$
Result: There is a perfect positive correlation (r = 1).

3. Regression Line of Y on X
The regression coefficient $b_{yx}$ is:
$$b_{yx} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{400}{200} = 2$$

Equation: $Y - \bar{Y} = b_{yx}(X - \bar{X})$

$$Y - 13 = 2(X - 6) \implies Y = 2X + 1$$

4. Regression Line of X on Y
The regression coefficient $b_{xy}$ is:
$$b_{xy} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum Y^2 - (\sum Y)^2} = \frac{400}{800} = 0.5$$

Equation: $X - \bar{X} = b_{xy}(Y - \bar{Y})$

$$X - 6 = 0.5(Y - 13) \implies X = 0.5Y - 0.5$$
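The correlation coefficient can be confirmed with a direct implementation of the raw-sum formula. The helper name `pearson_r` is an assumption for this sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / math.sqrt(
        (n * sxx - sx ** 2) * (n * syy - sy ** 2))


r = pearson_r([2, 4, 6, 8, 10], [5, 9, 13, 17, 21])
print(r)
```

With r = 1 the two regression lines coincide: Y = 2X + 1 and X = 0.5Y − 0.5 are the same line, consistent with $b_{yx} \cdot b_{xy} = 2 \times 0.5 = 1 = r^2$.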

Q5(a) The following data give the frequency distribution of a variable X:

X   10   20   30   40   50
f   3    7    12   8    4

Find the first four moments about the mean and hence determine the coefficient of skewness and kurtosis.
Sol. Given Frequency Distribution:
• X (Variable): {10, 20, 30, 40, 50}
• f (Frequency): {3, 7, 12, 8, 4}

Step 1: Calculation of Mean (X̄)


$$N = \sum f = 3 + 7 + 12 + 8 + 4 = 34$$
$$\sum fX = (10 \times 3) + (20 \times 7) + (30 \times 12) + (40 \times 8) + (50 \times 4) = 30 + 140 + 360 + 320 + 200 = 1050$$
$$\text{Mean } \bar{X} = \frac{\sum fX}{N} = \frac{1050}{34} \approx 30.882$$

Step 2: Computation Table (Direct Deviations d = X − 30.882)


X       f    d         $fd$      $fd^2$    $fd^3$     $fd^4$
10      3    −20.882   −62.646   1308.17   −27317.0   570433.6
20      7    −10.882   −76.174   828.92    −9020.3    98158.9
30      12   −0.882    −10.584   9.34      −8.2       7.2
40      8    9.118     72.944    665.10    6064.4     55295.2
50      4    19.118    76.472    1461.99   27950.3    534354.7
Total   34             ≈ 0       4273.52   −2330.8    1258249.6

Step 3: Calculation of Central Moments ($\mu_r$)
By the direct method, the rth central moment is $\mu_r = \frac{\sum f(X - \bar{X})^r}{N}$:
• First Moment: $\mu_1 = \frac{\sum fd}{N} = 0$
• Second Moment (Variance): $\mu_2 = \frac{\sum fd^2}{N} = \frac{4273.52}{34} \approx 125.692$
• Third Moment: $\mu_3 = \frac{\sum fd^3}{N} = \frac{-2330.8}{34} \approx -68.553$
• Fourth Moment: $\mu_4 = \frac{\sum fd^4}{N} = \frac{1258249.6}{34} \approx 37007.34$

Step 4: Skewness and Kurtosis

Coefficient of Skewness ($\gamma_1$):
$$\gamma_1 = \frac{\mu_3}{\mu_2^{3/2}} = \frac{-68.553}{(125.692)^{1.5}} \approx -0.0487$$

Coefficient of Kurtosis ($\beta_2$):
$$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{37007.34}{(125.692)^2} \approx 2.342$$

Q5(b) Fit a parabola of the form
$$y = a + bx + cx^2$$
to the data
X   0   1   2   3    4
Y   1   2   5   10   17
CO3 K3 7 Marks
Sol. The equation of the parabola is $y = a + bx + cx^2$. The normal equations are:
$$\sum Y = na + b\sum X + c\sum X^2$$
$$\sum XY = a\sum X + b\sum X^2 + c\sum X^3$$
$$\sum X^2 Y = a\sum X^2 + b\sum X^3 + c\sum X^4$$

X   Y    $X^2$   $X^3$   $X^4$   $XY$   $X^2 Y$
0   1    0       0       0       0      0
1   2    1       1       1       2      2
2   5    4       8       16      10     20
3   10   9       27      81      30     90
4   17   16      64      256     68     272

Values from Data Table (n = 5)

$\sum X = 10$, $\sum Y = 35$, $\sum X^2 = 30$, $\sum X^3 = 100$, $\sum X^4 = 354$, $\sum XY = 110$, $\sum X^2 Y = 384$

System of Equations
1) 35 = 5a + 10b + 30c
2) 110 = 10a + 30b + 100c
3) 384 = 30a + 100b + 354c
Solving this system, we obtain:
a = 1, b = 0, c=1

Final Equation
The fitted parabola is:
$$y = 1 + x^2$$
