
Moments, Skewness, and Kurtosis Analysis

This document discusses moments and distributions. It provides examples calculating the first four moments about means, obtaining central moments from non-central moments, and commenting on skewness and kurtosis based on the coefficients of skewness and kurtosis. Several exercises are included asking to find mean, median, standard deviation from frequency distributions and to calculate mean deviation from median age from an age distribution table.

Uploaded by

Mina Kale

ENGINEERING MATHEMATICS-III (Comp. Engg. and IT Group)
STATISTICS, CORRELATION AND REGRESSION
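Ex. 12: The first four moments of a distribution about the working mean 30.2 are 0.255, 6.222, 30.211 and 400.25. Calculate the first four moments about the mean. Also evaluate $\beta_1$, $\beta_2$ and comment upon the skewness and kurtosis of the distribution.
(Dec. 2005, 2006, May 2010, Nov. 2019)
Sol.: The first four moments about the arbitrary origin A = 30.2 are
$\mu_1' = 0.255,\ \mu_2' = 6.222,\ \mu_3' = 30.211,\ \mu_4' = 400.25$

$\mu_1' = \frac{\sum f_i (x_i - 30.2)}{N} = \bar{x} - 30.2 = 0.255$

$\bar{x} = 30.455$

$\mu_2 = \mu_2' - (\mu_1')^2 = 6.222 - (0.255)^2 = 6.15698$

$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = 30.211 - 3(6.222)(0.255) + 2(0.255)^3$
$= 30.211 - 4.75983 + 0.03316275$
$\mu_3 = 25.48433$

$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4$
$= 400.25 - 4(30.211)(0.255) + 6(6.222)(0.255)^2 - 3(0.255)^4$
$= 400.25 - 30.81522 + 2.42751 - 0.01268$
$\mu_4 = 371.8496$

$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(25.48433)^2}{(6.15698)^3} = 2.78255$

$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{371.8496}{(6.15698)^2} = 9.8092$

$\gamma_1 = \sqrt{\beta_1} = \sqrt{2.78255} = 1.6681$
which indicates considerable positive skewness of the distribution.
$\gamma_2 = \beta_2 - 3 = 9.8092 - 3 = 6.8092$
which shows that the distribution is leptokurtic.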
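The conversion used in this example, from moments about a working mean to central moments and then to the shape coefficients, can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text; the function names `central_from_raw` and `shape_coefficients` are assumptions chosen for readability.

```python
import math

def central_from_raw(m1, m2, m3, m4):
    """Convert moments about an arbitrary origin into central moments (mu2, mu3, mu4)."""
    mu2 = m2 - m1**2
    mu3 = m3 - 3 * m2 * m1 + 2 * m1**3
    mu4 = m4 - 4 * m3 * m1 + 6 * m2 * m1**2 - 3 * m1**4
    return mu2, mu3, mu4

def shape_coefficients(mu2, mu3, mu4):
    """Return beta1, beta2, gamma1, gamma2 from central moments."""
    beta1 = mu3**2 / mu2**3
    beta2 = mu4 / mu2**2
    gamma1 = math.copysign(math.sqrt(beta1), mu3)  # gamma1 takes the sign of mu3
    gamma2 = beta2 - 3
    return beta1, beta2, gamma1, gamma2

# Moments about the working mean A = 30.2, as given in the example.
A = 30.2
raw = (0.255, 6.222, 30.211, 400.25)
mean = A + raw[0]
mu2, mu3, mu4 = central_from_raw(*raw)
beta1, beta2, gamma1, gamma2 = shape_coefficients(mu2, mu3, mu4)

print(f"mean={mean:.3f} mu2={mu2:.5f} mu3={mu3:.5f} mu4={mu4:.4f}")
print(f"beta1={beta1:.5f} beta2={beta2:.5f} gamma1={gamma1:.4f} gamma2={gamma2:.4f}")
```

A positive `gamma1` indicates positive skewness, and a positive `gamma2` (that is, `beta2` above 3) indicates a leptokurtic distribution.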
Ex. 13: The first four moments of a distribution about the value 5 are 2, 20, 40 and 50. From the given information obtain the first four central moments, mean, standard deviation and coefficients of skewness and kurtosis. (Dec. 2007; May 2015, 2019)

Sol.: $A = 5,\ \mu_1' = 2,\ \mu_2' = 20,\ \mu_3' = 40$ and $\mu_4' = 50$.

On the basis of the given information we can calculate the various central moments, mean, standard deviation and coefficients of skewness and kurtosis.
The first moment about zero gives the mean of the distribution.

Mean $\bar{x} = A + \mu_1' = 5 + 2 = 7$

Now we calculate the central moments.

$\mu_2 = \mu_2' - (\mu_1')^2 = 20 - (2)^2 = 16$

$\mu_3 = \mu_3' - 3\mu_2'\mu_1' + 2(\mu_1')^3 = 40 - 3(2)(20) + 2(2)^3 = 40 - 120 + 16 = -64$

$\mu_4 = \mu_4' - 4\mu_3'\mu_1' + 6\mu_2'(\mu_1')^2 - 3(\mu_1')^4 = 50 - 4(2)(40) + 6(2)^2(20) - 3(2)^4 = 50 - 320 + 480 - 48 = 162$

The second central moment gives the value of the variance.
Variance $= \mu_2 = 16$
Standard deviation $= \sqrt{\mu_2} = \sqrt{16} = 4$

Coefficient of skewness is given by,

$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(-64)^2}{(16)^3} = 1$

Since $\mu_3$ is negative, the distribution is negatively skewed.
Coefficient of kurtosis is given by,

$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{162}{(16)^2} = 0.63$

Since the value of $\beta_2$ is less than 3, the distribution is platykurtic.
Ex. 14: The first four central moments of a distribution are 0, 2.5, 0.7 and 18.75. Comment on the skewness and kurtosis of the distribution.
(May 208
Sol.: Testing of Skewness: $\mu_1 = 0,\ \mu_2 = 2.5,\ \mu_3 = 0.7$ and $\mu_4 = 18.75$.

Coefficient of skewness is given by,

$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(0.7)^2}{(2.5)^3} = 0.0314$

Since $\mu_3$ is positive, the distribution is slightly positively skewed.

Testing of Kurtosis: Coefficient of kurtosis is given by

$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{18.75}{(2.5)^2} = 3$

Since $\beta_2$ is exactly three, the distribution is mesokurtic.

EXERCISE 5.1
1. Find the Arithmetic Mean, Median and Standard Deviation for the following frequency distribution.

x :   5    9   12   15   20   24   30   35   42   49
f :   3    6    8    8    9   10    8    7    6    2

Ans. $\bar{x} = 22.9851$, M = 20, $\sigma = 11.35$
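The stated answer can be checked with a short Python sketch. The scan shows only nine x values against ten frequencies; the sketch assumes the missing eighth value is 35, the value for which the mean reproduces the printed 22.9851. The helper name `weighted_median` is illustrative only.

```python
import math

# Assumption: the eighth x value (35) is not legible in the source;
# it is inferred from the stated mean of 22.9851.
x = [5, 9, 12, 15, 20, 24, 30, 35, 42, 49]
f = [3, 6, 8, 8, 9, 10, 8, 7, 6, 2]

n = sum(f)
mean = sum(fi * xi for xi, fi in zip(x, f)) / n
variance = sum(fi * xi**2 for xi, fi in zip(x, f)) / n - mean**2
sd = math.sqrt(variance)

def weighted_median(values, freqs):
    """Return the value at which the cumulative frequency first reaches (N + 1) / 2."""
    target = (sum(freqs) + 1) / 2
    cumulative = 0
    for value, freq in zip(values, freqs):
        cumulative += freq
        if cumulative >= target:
            return value

median = weighted_median(x, f)
print(n, round(mean, 4), median, round(sd, 2))
```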

2. Age distribution of 150 life insurance policy-holders is as follows:

Age as on Nearest Birthday    Number
15 - 19.5                     10
20 - 24.5                     20
25 - 29.5                     14
30 - 34.5                     30
35 - 39.5                     32
40 - 44.5                     14
45 - 49.5                     15
50 - 54.5                     10
55 - 59.5                     842

Calculate the mean deviation from the median age.

Ans. M.D. =

n = 8 (Total number of points)

Substituting in (1) and (2) of (5.6.2), after replacing a by m and b by c, we get

$140m + 28c = 140$ ... (1)
$28m + 8c = 16$ ... (2)

or
$5m + c = 5$
$7m + 2c = 4$

Solving (1) and (2) we get m = 2, c = -5.
Hence the equation of the straight line is
$y = 2x - 5$
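Ex. 2: Fit a parabola of the form $y = ax^2 + bx + c$ to the following data using least square criteria.

x :    1    2    3    4    5    6    7
y :   -5   -2    5   16   31   50   73

Sol.: Let X = x - 4.

x     y     X = x-4   X²    X³    X⁴    Xy    X²y
1    -5    -3         9    -27    81    15    -45
2    -2    -2         4     -8    16     4     -8
3     5    -1         1     -1     1    -5      5
4    16     0         0      0     0     0      0
5    31     1         1      1     1    31     31
6    50     2         4      8    16   100    200
7    73     3         9     27    81   219    657

$\sum y = 168,\ \sum X = 0,\ \sum X^2 = 28,\ \sum X^3 = 0,\ \sum X^4 = 196,\ \sum Xy = 364,\ \sum X^2 y = 840$

n = 7.
Substituting in equations (5), (6) and (7) of (5.6.3),

$28a + 0 \cdot b + 7c = 168$ ... (1)
$0 \cdot a + 28b + 0 \cdot c = 364$ ... (2)
$196a + 0 \cdot b + 28c = 840$ ... (3)

(1), (2), (3) can be written as

$4a + 0 \cdot b + c = 24$ ... (4)
$0 \cdot a + b + 0 \cdot c = 13$ ... (5)
$7a + 0 \cdot b + c = 30$ ... (6)

From (5), b = 13, and from (4) and (6) we get

a = 2, c = 16

Equation of the parabola in terms of the variable X is

$y = 2X^2 + 13X + 16$

Putting X = x - 4,
$y = 2(x-4)^2 + 13(x-4) + 16$
$y = 2x^2 - 3x - 4$

is the required fit for the data.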
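The normal equations for a least-squares parabola can be set up and solved mechanically. The following is a minimal pure-Python sketch under stated assumptions: the helper names `solve_linear` and `fit_parabola` are illustrative, and the y values for x = 1, 2, 3 are reconstructed from the sums that survive in the worked solution (Σy = 168, ΣXy = 364, ΣX²y = 840).

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_parabola(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c via the three normal equations."""
    n = len(xs)
    S = lambda p: sum(x**p for x in xs)                       # sum of x^p
    T = lambda p: sum((x**p) * y for x, y in zip(xs, ys))     # sum of x^p * y
    A = [[S(2), S(1), n],
         [S(3), S(2), S(1)],
         [S(4), S(3), S(2)]]
    rhs = [T(0), T(1), T(2)]
    return solve_linear(A, rhs)

x = [1, 2, 3, 4, 5, 6, 7]
y = [-5, -2, 5, 16, 31, 50, 73]      # first three values reconstructed from the sums
X = [xi - 4 for xi in x]             # change of origin, as in the worked solution
a, b, c = fit_parabola(X, y)
print(a, b, c)                       # coefficients of y = a*X^2 + b*X + c
```

Shifting the origin to the middle x value makes the odd power sums vanish, which is why the hand calculation decouples b from a and c.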


Ex. 3: A simply supported beam carries a concentrated load P (kg) at its middle point. Corresponding to various values of P, the maximum deflection y (cm) is tabulated as:

P :   100    120    140    160    180    200
y :   0.90   1.10   1.20   1.40   1.60   1.70

Find a law of the form y = aP + b by using least square criteria.

Sol.: Preparing the table as

P      y      X = P-140   X²      Xy
100    0.90   -40         1600    -36
120    1.10   -20          400    -22
140    1.20     0            0      0
160    1.40    20          400     28
180    1.60    40         1600     64
200    1.70    60         3600    102

$\sum y = 7.9,\ \sum X = 60,\ \sum X^2 = 7600,\ \sum Xy = 136$
n = 6 (No. of points)
From (1) and (2) of (5.6.2),
$60a + 6b = 7.9$ ... (1)
$7600a + 60b = 136$ ... (2)
Solving (1) and (2) we get,
a = 0.008143
b = 1.2352
$y = 0.008143X + 1.2352$
but X = P - 140, so
$y = 0.008143(P - 140) + 1.2352$
$y = 0.008143P + 0.09518$

is the required result.
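The two normal equations for a straight line have a closed-form solution, which the sketch below applies to the beam data. It is an illustrative Python example rather than the book's own procedure; `fit_line` is an assumed helper name.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b from the two normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

P = [100, 120, 140, 160, 180, 200]
y = [0.90, 1.10, 1.20, 1.40, 1.60, 1.70]

# Fit directly in P, and again in the shifted variable X = P - 140.
a, b = fit_line(P, y)
a_shifted, b_shifted = fit_line([p - 140 for p in P], y)

print(f"y = {a:.6f} P + {b:.5f}")
print(f"y = {a_shifted:.6f} X + {b_shifted:.4f}   (X = P - 140)")
```

The slope is unaffected by the change of origin; only the intercept changes, by a times the shift.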


Ex. 4: Values of x and y are tabulated as under:

x :   1     1.5    2.0    2.5
y :   25    56.2   100    156

Find the law of the form $x = ay^n$ to satisfy the given data.

Sol.: Taking logarithms, we get,

$\log x = \log a + n \log y$

which can be written as,

$X = nY + c$

where X = log x, Y = log y and c = log a.

x      y      X = log x   Y = log y   Y²       XY
1.0    25     0.0         1.3979      1.9541   0.0
1.5    56.2   0.1761      1.7497      3.0615   0.3081
2.0    100    0.301       2.0         4.0      0.602
2.5    156    0.3979      2.1931      4.8097   0.8726
Total         0.875       7.3407      13.8253  1.7827

Substituting in (1) and (2) of (5.6.1), where x is replaced by Y, y by X, a by n and b by log a = c, and where n in (1) of (5.6.1) is 4 (No. of points), we get

$7.3407n + 4c = 0.875$ ... (1)
$13.8253n + 7.3407c = 1.7827$ ... (2)
Solving (1) and (2) we get,
n = 0.5, c = log a = -0.6988375, a = 0.2

Hence the required law of the form $x = ay^n$ is $x = 0.2\,y^{0.5}$.
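The log transformation turns the power law into a straight-line fit, which the following Python sketch carries out. The function name `fit_power_law` is an assumption for illustration; base-10 logarithms are used to match the tabulated values.

```python
import math

def fit_power_law(xs, ys):
    """Fit x = a * y**n by least squares on X = log10(x), Y = log10(y)."""
    X = [math.log10(x) for x in xs]
    Y = [math.log10(y) for y in ys]
    m = len(xs)
    sX, sY = sum(X), sum(Y)
    sYY = sum(v * v for v in Y)
    sXY = sum(u * v for u, v in zip(X, Y))
    n_exp = (m * sXY - sX * sY) / (m * sYY - sY * sY)  # slope of X on Y
    c = (sX - n_exp * sY) / m                          # intercept c = log10(a)
    return 10**c, n_exp

x = [1.0, 1.5, 2.0, 2.5]
y = [25, 56.2, 100, 156]
a, n_exp = fit_power_law(x, y)
print(f"x = {a:.3f} * y^{n_exp:.3f}")
```

Note that the fit minimises squared error in log x rather than in x itself, which is the usual trade-off of this linearisation.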


Sol.: Let X: Quantity exported, Y: Quantity imported. Preparing the table as follows, the calculations can be made simple.

x      y      x²     y²     xy
10     12     100    144    120
11     14     121    196    154
14     15     196    225    210
14     16     196    256    224
20     21     400    441    420
22     26     484    676    572
16     21     256    441    336
12     15     144    225    180
15     16     225    256    240
13     14     169    196    182
Total  147    170    2291   3056   2638

Here, n = 10, hence $\bar{x} = \frac{\sum x}{n} = \frac{147}{10} = 14.7$ and $\bar{y} = \frac{\sum y}{n} = \frac{170}{10} = 17$

$r = \frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{(\sum x^2 - n\bar{x}^2)(\sum y^2 - n\bar{y}^2)}}$

$= \frac{2638 - 10 \times 14.7 \times 17}{\sqrt{(2291 - 10 \times 14.7^2)(3056 - 10 \times 17^2)}}$

$= \frac{139}{\sqrt{130.1 \times 166}} = 0.9458$
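The same computation can be written as a small Python function. This is a minimal sketch; `pearson_r` and the list names are illustrative, and the data are the ten (exported, imported) pairs from the table in this solution.

```python
import math

def pearson_r(xs, ys):
    """Karl Pearson's correlation coefficient from raw paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * mx * my
    sxx = sum(x * x for x in xs) - n * mx * mx
    syy = sum(y * y for y in ys) - n * my * my
    return sxy / math.sqrt(sxx * syy)

exported = [10, 11, 14, 14, 20, 22, 16, 12, 15, 13]
imported = [12, 14, 15, 16, 21, 26, 21, 15, 16, 14]
r = pearson_r(exported, imported)
print(round(r, 4))
```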
Ex. 2: Calculate the correlation coefficient for the following weights (in kg) of husband (x) and wife (y). (Dec. 2012)

x :   65   66   67   67   68   69   70   72
y :   55   58   72   55   66   71   70   50

Sol.:

x      y      x²      y²      xy
65     55     4225    3025    3575
66     58     4356    3364    3828
67     72     4489    5184    4824
67     55     4489    3025    3685
68     66     4624    4356    4488
69     71     4761    5041    4899
70     70     4900    4900    4900
72     50     5184    2500    3600
Total  544    497     37028   31395   33799

$\bar{x} = \frac{544}{8} = 68, \quad \bar{y} = \frac{497}{8} = 62.125$

Correlation coefficient between x and y is given by

$r(x, y) = \frac{\text{Cov}(x, y)}{\sigma_x \sigma_y} = \frac{\frac{1}{n}\sum xy - \bar{x}\bar{y}}{\sqrt{\frac{1}{n}\sum x^2 - \bar{x}^2}\ \sqrt{\frac{1}{n}\sum y^2 - \bar{y}^2}}$

$= \frac{\frac{1}{8}(33799) - 68(62.125)}{\sqrt{\frac{1}{8}(37028) - (68)^2}\ \sqrt{\frac{1}{8}(31395) - (62.125)^2}}$

$= \frac{4224.875 - 4224.5}{\sqrt{(4628.5 - 4624)(3924.375 - 3859.516)}}$

$= \frac{0.375}{\sqrt{4.5 \times 64.859}} = \frac{0.375}{\sqrt{291.867}} = \frac{0.375}{17.084}$

$r(x, y) = 0.022$
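Ex. 3: From a group of 10 students, marks obtained by each in papers of Mathematics and Applied Mechanics are given as:

x Marks in Maths      :   23   28   42   17   26   35   29   37   16   46
y Marks in App. Mech. :   25   22   38   21   27   39   24   32   18   44

Calculate Karl Pearson's coefficient of correlation.
Sol.: The data is tabulated as

x      y      u = x-35   v = y-39   u²     v²     uv
16     18     -19        -21        361    441    399
17     21     -18        -18        324    324    324
23     25     -12        -14        144    196    168
26     27      -9        -12         81    144    108
28     22      -7        -17         49    289    119
29     24      -6        -15         36    225     90
35     39       0          0          0      0      0
37     32       2         -7          4     49    -14
42     38       7         -1         49      1     -7
46     44      11          5        121     25     55
Total         $\sum u = -51$   $\sum v = -100$   $\sum u^2 = 1169$   $\sum v^2 = 1694$   $\sum uv = 1242$

$\bar{u} = \frac{-51}{10} = -5.1, \quad \bar{u}^2 = 26.01$
$\bar{v} = \frac{-100}{10} = -10, \quad \bar{v}^2 = 100$

$\text{cov}(u, v) = \frac{1}{n}\sum u_i v_i - \bar{u}\bar{v} = \frac{1}{10}(1242) - 51 = 73.2$

$\sigma_u^2 = \frac{1}{n}\sum u^2 - \bar{u}^2 = \frac{1}{10}(1169) - 26.01 = 90.89$
$\sigma_u = \sqrt{90.89} = 9.534$

$\sigma_v^2 = \frac{1}{n}\sum v^2 - \bar{v}^2 = \frac{1}{10}(1694) - 100 = 69.4$
$\sigma_v = \sqrt{69.4} = 8.33$

$r(x, y) = r(u, v) = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v} = \frac{73.2}{9.534 \times 8.33} = 0.9217$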
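The step r(x, y) = r(u, v) relies on the correlation coefficient being unchanged by a shift of origin. The sketch below illustrates this numerically in Python. The function name `corr_with_shift` is an assumption, and the pairing of the marks follows the (x, y) rows of the worked table, which reproduce the totals Σu = -51 and Σv = -100.

```python
import math

def corr_with_shift(xs, ys, a, b):
    """Correlation computed from u = x - a, v = y - b via cov(u, v) / (sd_u * sd_v)."""
    u = [x - a for x in xs]
    v = [y - b for y in ys]
    n = len(u)
    ubar, vbar = sum(u) / n, sum(v) / n
    cov = sum(ui * vi for ui, vi in zip(u, v)) / n - ubar * vbar
    sd_u = math.sqrt(sum(ui * ui for ui in u) / n - ubar**2)
    sd_v = math.sqrt(sum(vi * vi for vi in v) / n - vbar**2)
    return cov / (sd_u * sd_v)

maths = [23, 28, 42, 17, 26, 35, 29, 37, 16, 46]
mech = [25, 22, 38, 21, 27, 39, 24, 32, 18, 44]

r_shifted = corr_with_shift(maths, mech, 35, 39)  # origins used in the solution
r_raw = corr_with_shift(maths, mech, 0, 0)        # no shift at all
print(round(r_shifted, 4), round(r_raw, 4))
```

Both calls return the same value up to rounding, which is why a convenient working origin can be chosen freely to keep the hand arithmetic small.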

Ex. 4: Compute the correlation coefficient between supply and price of a commodity using the following data.

Supply :   152   158   169   182   160   166   182
Price  :   198   178   167   152   180   170   162

Sol.: Let x = Supply, u = x - 150, y = Price, v = y - 160.

x      y      u      v      u²      v²      uv
152    198     2     38        4    1444      76
158    178     8     18       64     324     144
169    167    19      7      361      49     133
182    152    32     -8     1024      64    -256
160    180    10     20      100     400     200
166    170    16     10      256     100     160
182    162    32      2     1024       4      64
Total         119    87     2833    2385     521

Here n = 7, $\sum u = 119,\ \sum v = 87,\ \sum u^2 = 2833,\ \sum v^2 = 2385,\ \sum uv = 521$

$\bar{u} = 17, \quad \bar{v} = 12.4286$

$r = \frac{\sum uv - n\bar{u}\bar{v}}{\sqrt{(\sum u^2 - n\bar{u}^2)(\sum v^2 - n\bar{v}^2)}}$

$= \frac{521 - 7 \times 17 \times 12.4286}{\sqrt{(2833 - 7 \times 17^2)(2385 - 7 \times 12.4286^2)}}$

$= \frac{-958}{\sqrt{810 \times 1303.7142}} = \frac{-958}{1027.6227}$

$= -0.9322$

Interpretation: There is high negative correlation between supply and price.


Ex. 5: Obtain the correlation coefficient between population density (per square mile) and death rate (per thousand persons) from data related to 5 cities. (Dec. 2010, 2017; May 2010)

Population Density :   200   500   400   700   800
Death Rate         :   12    18    16    21    10

Sol.: Let x = Population density and y = Death rate.

Let u = x - a and v = y - b, with a = 500 and b = 15.

x      y      u = x-500   v = y-15   u²       v²     uv
200    12     -300        -3         90000     9      900
500    18        0         3             0     9        0
400    16     -100         1         10000     1     -100
700    21      200         6         40000    36     1200
800    10      300        -5         90000    25    -1500
Total          100         2        230000    80      500

Here, n = 5, $\sum u = 100,\ \sum v = 2,\ \sum u^2 = 230000,\ \sum v^2 = 80,\ \sum uv = 500$

$\bar{u} = \frac{100}{5} = 20, \quad \bar{v} = \frac{2}{5} = 0.4$

$r(u, v) = \frac{\sum uv - n\bar{u}\bar{v}}{\sqrt{\sum u^2 - n\bar{u}^2}\ \sqrt{\sum v^2 - n\bar{v}^2}}$

$= \frac{500 - 5(20)(0.4)}{\sqrt{230000 - 5(20)^2}\ \sqrt{80 - 5(0.4)^2}}$

$= \frac{460}{\sqrt{228000}\ \sqrt{79.2}} = \frac{460}{4249.42} = 0.1082$
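Ex. 6: Calculate the coefficient of correlation for the following distribution. (Dec. 2006)

Sol.: Tabulating the data as

x      y      f      u = x-19   v = y-21   fu     fv      fu²     fv²     fuv
5      7      6      -14        -14        -84    -84     1176    1176    1176
9      9      9      -10        -12        -90    -108     900    1296    1080
15     14     13      -4         -7        -52    -91      208     637     364
19     21     20       0          0          0      0        0       0       0
24     23     16       5          2         80     32      400      64     160
28     29     11       9          8         99     88      891     704     792
32     30     7       13          9         91     63     1183     567     819
Total         $\sum f = 82$   $\sum fu = 44$   $\sum fv = -100$   $\sum fu^2 = 4758$   $\sum fv^2 = 4444$   $\sum fuv = 4391$

$\bar{u} = \frac{44}{82} = 0.5366, \quad \bar{u}^2 = 0.288$

$\bar{v} = \frac{-100}{82} = -1.2195, \quad \bar{v}^2 = 1.4872$

$\text{cov}(u, v) = \frac{1}{N}\sum f u v - \bar{u}\bar{v} = \frac{4391}{82} - (0.5366)(-1.2195) = 53.5488 + 0.6544 = 54.2032$

$\sigma_u^2 = \frac{1}{N}\sum f u^2 - \bar{u}^2 = \frac{4758}{82} - 0.288 = 57.7364$

$\sigma_v^2 = \frac{1}{N}\sum f v^2 - \bar{v}^2 = \frac{4444}{82} - 1.4872 = 52.708$

$\sigma_u = 7.598$

$\sigma_v = 7.26$

$r(u, v) = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v} = \frac{54.2032}{7.598 \times 7.26} = 0.9826$

Coefficient of correlation r(x, y) = 0.9826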
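Ex. 7: Find the correlation coefficient between X and Y given that n = 25, $\sum x = 75,\ \sum y = 100,\ \sum x^2 = 250,\ \sum y^2 = 500,\ \sum xy = 325$.

Sol.: Here $\bar{x} = \frac{75}{25} = 3, \quad \bar{y} = \frac{100}{25} = 4$

$r = \frac{\sum xy - n\bar{x}\bar{y}}{\sqrt{(\sum x^2 - n\bar{x}^2)(\sum y^2 - n\bar{y}^2)}}$

$= \frac{325 - 25 \times 3 \times 4}{\sqrt{(250 - 25 \times 9)(500 - 25 \times 16)}} = \frac{25}{\sqrt{25 \times 100}} = \frac{25}{50} = 0.5$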
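When only the summary totals are given, as in this example, the correlation coefficient follows directly from them. A minimal Python sketch, with `r_from_sums` as an assumed helper name:

```python
import math

def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Correlation coefficient computed from summary totals only."""
    mean_x, mean_y = sum_x / n, sum_y / n
    numerator = sum_xy - n * mean_x * mean_y
    denominator = math.sqrt((sum_x2 - n * mean_x**2) * (sum_y2 - n * mean_y**2))
    return numerator / denominator

# Totals from Ex. 7: n = 25, sum x = 75, sum y = 100,
# sum x^2 = 250, sum y^2 = 500, sum xy = 325.
r = r_from_sums(25, 75, 100, 250, 500, 325)
print(r)
```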
$x - \bar{x} = r\frac{\sigma_x}{\sigma_y}(y - \bar{y})$ or $x - \bar{x} = b_{xy}(y - \bar{y})$ ... (11)

The coefficient $b_{yx}$ involved in equation (10) is known as the regression coefficient of y on x, and the coefficient $b_{xy}$ involved in equation (11) is known as the regression coefficient of x on y.

Remark 1: For obtaining (10) and (11) we have to calculate r = r(x, y), the correlation coefficient, which can also be determined using the change of origin and scale property.
Thus, $r = r(x, y) = \frac{\text{cov}(x, y)}{\sigma_x \sigma_y} = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v} = r(u, v)$

If $u = \frac{x - a}{h}$, $v = \frac{y - b}{k}$ then $\sigma_x = h\sigma_u$, $\sigma_y = k\sigma_v$

and $\bar{x} = a + h\bar{u}$, and $\bar{y} = b + k\bar{v}$
In particular, if u = x - a, v = y - b then h = k = 1, and $\sigma_x = \sigma_u$ and $\sigma_y = \sigma_v$

and $\bar{x} = a + \bar{u}$, $\bar{y} = b + \bar{v}$
These results help us to determine (10) and (11).
Remark 2: Correlation coefficient and regression coefficients have the same algebraic sign. If r > 0, then $b_{yx} > 0$ and $b_{xy} > 0$. If r < 0, then $b_{yx} < 0$ and $b_{xy} < 0$.
Remark 3: Since $b_{yx} \times b_{xy} = r^2$, the correlation coefficient is $r = \pm\sqrt{b_{yx}\, b_{xy}}$, i.e. the geometric mean of the regression coefficients. Choose the positive square root if the regression coefficients are positive, otherwise the negative one.

Remark 4: The acute angle $\theta$ between the regression lines is given by,

$\tan\theta = \frac{1 - r^2}{|r|} \cdot \frac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$

Remark 5: The point of intersection of the two regression lines is $(\bar{x}, \bar{y})$.
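The angle between the two regression lines in Remark 4 can be evaluated numerically in two equivalent ways. The Python sketch below assumes the standard result tan θ = ((1 - r²)/|r|) · σxσy/(σx² + σy²) and, as a cross-check, the slope form using slopes b_yx and 1/b_xy. The function names and the sample values r = 0.6, σx = 3, σy = 4 are illustrative assumptions.

```python
import math

def angle_from_r(r, sd_x, sd_y):
    """Acute angle (degrees) between regression lines from r and the standard deviations."""
    tan_theta = (1 - r * r) / abs(r) * (sd_x * sd_y) / (sd_x**2 + sd_y**2)
    return math.degrees(math.atan(tan_theta))

def angle_from_coefficients(byx, bxy):
    """Same angle from the regression coefficients.

    In the (x, y) plane the line of y on x has slope byx and the
    line of x on y has slope 1 / bxy.
    """
    m1, m2 = byx, 1 / bxy
    tan_theta = abs((m2 - m1) / (1 + m1 * m2))
    return math.degrees(math.atan(tan_theta))

r, sd_x, sd_y = 0.6, 3.0, 4.0
byx = r * sd_y / sd_x   # regression coefficient of y on x
bxy = r * sd_x / sd_y   # regression coefficient of x on y

theta_a = angle_from_r(r, sd_x, sd_y)
theta_b = angle_from_coefficients(byx, bxy)
print(round(theta_a, 2), round(theta_b, 2))
```

As r approaches ±1 the angle shrinks to zero and the two lines coincide; for r = 0 the first form is undefined, since the lines are then perpendicular.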
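ILLUSTRATIONS
Ex. 1: Obtain regression lines for the following data: (Dec. 2012, 2016)

x :   6    2    10   4    8
y :   9    11   5    8    7

Sol.: To find the regression lines we require to calculate the regression coefficients $b_{xy}$ and $b_{yx}$. These coefficients depend upon $\sum x$, $\sum y$, $\sum x^2$, $\sum y^2$ and $\sum xy$. So we prepare the following table and simplify the calculations.

x      y      x²     y²     xy
6      9      36     81     54
2      11     4      121    22
10     5      100    25     50
4      8      16     64     32
8      7      64     49     56
$\sum x = 30$   $\sum y = 40$   $\sum x^2 = 220$   $\sum y^2 = 340$   $\sum xy = 214$

No. of observations = n = 5

$\bar{x} = \frac{\sum x}{n} = \frac{30}{5} = 6$ and $\bar{y} = \frac{\sum y}{n} = \frac{40}{5} = 8$

$\sigma_x^2 = \frac{\sum x^2}{n} - (\bar{x})^2 = \frac{220}{5} - (6)^2 = 44 - 36 = 8$

$\sigma_y^2 = \frac{\sum y^2}{n} - (\bar{y})^2 = \frac{340}{5} - (8)^2 = 68 - 64 = 4$

$\text{Cov}(x, y) = \frac{\sum xy}{n} - \bar{x}\bar{y} = \frac{214}{5} - 6 \times 8$
$\text{Cov}(x, y) = 42.8 - 48 = -5.2$

$b_{yx} = \frac{\text{Cov}(x, y)}{\sigma_x^2} = \frac{-5.2}{8} = -0.65$

$b_{xy} = \frac{\text{Cov}(x, y)}{\sigma_y^2} = \frac{-5.2}{4} = -1.3$
Regression line of Y on X is

$y - \bar{y} = b_{yx}(x - \bar{x})$
$y - 8 = -0.65(x - 6)$

$y = -0.65x + 3.9 + 8$

$y = -0.65x + 11.9$

Regression line of X on Y is

$x - \bar{x} = b_{xy}(y - \bar{y})$

$x - 6 = -1.3(y - 8)$

$x - 6 = -1.3y + 10.4$

$x = -1.3y + 10.4 + 6$

$x = -1.3y + 16.4$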
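Both regression lines follow from the means, the two variances and the covariance, so the whole calculation fits in one short function. The Python sketch below is illustrative; `regression_lines` and the returned dictionary keys are assumed names.

```python
def regression_lines(xs, ys):
    """Return both regression coefficients and the (slope, intercept) of each line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum(x * x for x in xs) / n - mean_x**2
    var_y = sum(y * y for y in ys) / n - mean_y**2
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    byx = cov / var_x   # regression coefficient of y on x
    bxy = cov / var_y   # regression coefficient of x on y
    return {
        "byx": byx,
        "bxy": bxy,
        "y_on_x": (byx, mean_y - byx * mean_x),  # y = byx * x + intercept
        "x_on_y": (bxy, mean_x - bxy * mean_y),  # x = bxy * y + intercept
    }

x = [6, 2, 10, 4, 8]
y = [9, 11, 5, 8, 7]
lines = regression_lines(x, y)
slope_y, intercept_y = lines["y_on_x"]
slope_x, intercept_x = lines["x_on_y"]
print(f"y = {slope_y:.2f} x + {intercept_y:.1f}")
print(f"x = {slope_x:.1f} y + {intercept_x:.1f}")
```

A quick sanity check on any such result is that the product of the two coefficients equals r² and therefore cannot exceed 1.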
Ex. 2:Obtain regression lines for the following data:
2 3 5 7 9 10 12
2
15
8
10 12 14 15
16
Estimate (i) Y when X = 6 and (ii) X when Y = 20.
Ex. 3: Find the lines of regression for the following data

x :   10   14   19   26   30   34   39
y :   12   16   18   26   29   35   38

and estimate y for x = 14.5 and x for y = 29.5. (May 2
Sol.: Tabulating the data as:

x      y      u = x-26   v = y-26   u²     v²     uv
10     12     -16        -14        256    196    224
14     16     -12        -10        144    100    120
19     18      -7         -8         49     64     56
26     26       0          0          0      0      0
30     29       4          3         16      9     12
34     35       8          9         64     81     72
39     38      13         12        169    144    156
Total         $\sum u = -10$   $\sum v = -8$   $\sum u^2 = 698$   $\sum v^2 = 594$   $\sum uv = 640$

Here n = 7, $\bar{u} = \frac{-10}{7} = -1.429$, $\bar{v} = \frac{-8}{7} = -1.143$
$\bar{u}^2 = 2.042$, $\bar{v}^2 = 1.306$

$\text{cov}(u, v) = \frac{1}{n}\sum uv - \bar{u}\bar{v} = \frac{1}{7}(640) - (1.429)(1.143) = 89.795$

$\sigma_u^2 = \frac{1}{n}\sum u^2 - \bar{u}^2 = \frac{1}{7}(698) - 2.042 = 97.672$
$\sigma_u = 9.883$

$\sigma_v^2 = \frac{1}{n}\sum v^2 - \bar{v}^2 = \frac{1}{7}(594) - 1.306 = 83.551$

$\sigma_v = 9.14$

$r = r(x, y) = r(u, v) = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v} = \frac{89.795}{9.883 \times 9.14} = \frac{89.795}{90.33062} = 0.9941$

$b_{yx} = r\frac{\sigma_y}{\sigma_x} = 0.9941 \times \frac{9.14}{9.883} = 0.9194$

$b_{xy} = r\frac{\sigma_x}{\sigma_y} = 0.9941 \times \frac{9.883}{9.14} = 1.0749$

$\bar{x} = a + \bar{u} = 26 - 1.429 = 24.571$

$\bar{y} = b + \bar{v} = 26 - 1.143 = 24.857$
Regression line of y on x is given by equation (10)

$y - 24.857 = 0.9194(x - 24.571)$ ... (i)

Regression line of x on y is given by equation (11)

$x - 24.571 = 1.0749(y - 24.857)$ ... (ii)

To estimate y for x = 14.5, put x = 14.5 in (i):
$y = 24.857 + 0.9194(14.5 - 24.571) = 15.5977$

The estimate of x for y = 29.5 is obtained from (ii):
$x = 24.571 + 1.0749(29.5 - 24.857) = 29.56176$
Ex. 4: The table below gives the respective heights x and y of a sample of 10 fathers and their sons.

(i) Find the regression line of y on x.

(ii) Find the regression line of x on y.

(iii) Estimate son's height if father's height is 65 inches.

(iv) Estimate father's height if son's height is 60 inches.

(v) Compute the correlation coefficient between x and y.

(vi) Find the angle between the regression lines.

Height of Father x (inches) :   65   63   67   64   68   62   70   66   68   67
Height of Son y (inches)    :   68   66   68   65   69   66   68   65   71   67

Sol.: Let u = x - 62, v = y - 65. We prepare the table to simplify the computations.

x      y      u      v      u²     v²     uv
65     68     3      3       9      9      9
63     66     1      1       1      1      1
67     68     5      3      25      9     15
64     65     2      0       4      0      0
68     69     6      4      36     16     24
62     66     0      1       0      1      0
70     68     8      3      64      9     24
66     65     4      0      16      0      0
68     71     6      6      36     36     36
67     67     5      2      25      4     10
Total         40     23     216     85    119

n = Number of pairs = 10

$\bar{u} = \frac{40}{10} = 4, \quad \sigma_u^2 = \frac{216}{10} - (4)^2 = 5.6$

$\bar{v} = \frac{23}{10} = 2.3, \quad \sigma_v^2 = \frac{85}{10} - (2.3)^2 = 3.21$
$\text{Cov}(u, v) = \frac{119}{10} - 4 \times 2.3 = 2.7$

$b_{xy} = b_{uv} = \frac{2.7}{3.21} = 0.8411$, and $b_{yx} = b_{vu} = \frac{2.7}{5.6} = 0.4821$
$\bar{x} = \bar{u} + 62 = 66, \quad \bar{y} = \bar{v} + 65 = 67.3$

(i) Regression line of y on x is $Y - \bar{Y} = b_{yx}(X - \bar{X})$

$Y - 67.3 = 0.4821(X - 66)$
$Y = 0.4821X + 35.4814$
$\sigma_u^2 = \frac{\sum u^2}{n} - (\bar{u})^2 = \frac{180}{10} - (2)^2 = 18 - 4 = 14$
$\sigma_v^2 = \frac{\sum v^2}{n} - (\bar{v})^2 = \frac{488}{10} - (3)^2 = 48.8 - 9 = 39.8$
$\sigma_u = 3.742$ and $\sigma_v = 6.309$
Standard deviation is invariant to the change of origin.
$\sigma_x = 3.742$ and $\sigma_y = 6.309$
$\sigma_x^2 = 14$ and $\sigma_y^2 = 39.8$

$\text{Cov}(u, v) = \frac{\sum uv}{n} - \bar{u}\bar{v} = \frac{-33}{10} - 6$

$\text{Cov}(u, v) = -9.3$

Covariance is invariant to the change of origin.
$\text{Cov}(x, y) = \text{Cov}(u, v) = -9.3$
We have to find the regression equation of y on x. It is given by

$y - \bar{y} = b_{yx}(x - \bar{x})$

$b_{yx} = \frac{\text{Cov}(x, y)}{\sigma_x^2} = \frac{-9.3}{14} = -0.664$

Regression equation becomes,
$y - 38 = -0.664(x - 32)$
$y = -0.664x + 21.248 + 38$
$y = -0.664x + 59.248$

Now, we have to estimate marks in Statistics if marks in Economics are 30, i.e. we have to find the value of y when x = 30.

Substituting x = 30 in the above equation, we get

$y = -0.664 \times 30 + 59.248$

$y = 39.328$

Marks in Statistics are 39.328, i.e. approximately 39.


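Ex. 8: Determine the regression line for price, given the supply, hence estimate price when supply is 180 units, from the following information: x = supply, y = price, n = 7, $\sum(x - 150) = 119$, $\sum(y - 160) = 84$, $\sum(x - 150)^2 = 2835$, $\sum(y - 160)^2 = 2387$, $\sum(x - 150)(y - 160) = 525$. Also, find the correlation coefficient between price and supply. (Dec. 2018)
Sol.: Let u = x - 150, v = y - 160

$\bar{u} = \frac{119}{7} = 17, \quad \bar{v} = \frac{84}{7} = 12$
$\sigma_u^2 = \frac{1}{7}(2835) - (17)^2 = 405 - 289 = 116$

$\sigma_v^2 = \frac{1}{7}(2387) - (12)^2 = 341 - 144 = 197$

$\text{cov}(x, y) = \text{cov}(u, v) = \frac{1}{7}(525) - 17 \times 12 = -129$

$\bar{x} = 150 + \bar{u} = 167, \quad \bar{y} = 160 + \bar{v} = 172$

$b_{yx} = b_{vu} = \frac{-129}{116} = -1.1121$
Equation of the regression line of y on x is
$y - \bar{y} = b_{yx}(x - \bar{x})$
$y - 172 = (-1.1121)(x - 167)$

For supply x = 180, the estimated price is $y = 172 - 1.1121(180 - 167) = 157.54$.

Correlation coefficient r is obtained as

$r = \frac{\text{cov}(u, v)}{\sigma_u \sigma_v} = \frac{-129}{\sqrt{116 \times 197}} = -0.8534$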
Substituting values of $\bar{x}$ and $\bar{y}$, we get

$9(2) + (-3) = 18 - 3 = 15$
$4(2) + (-3) = 8 - 3 = 5$
Thus, the regression lines are,
$9x + y = 15$ and $4x + y = 5$

Let $9x + y = 15$ be the regression line of x on y, so it can be written as $x = \frac{15}{9} - \frac{1}{9}y$.

$b_{xy} = -\frac{1}{9} = -0.11$

Let $4x + y = 5$ be the regression line of y on x. So it can be written as $y = 5 - 4x$.
$b_{yx} = -4$
Correlation coefficient between x and y is given as,
$r = \pm\sqrt{b_{yx}\, b_{xy}} = \pm\sqrt{(-4) \times (-0.11)} = \pm\sqrt{0.44} = \pm 0.663$
Since both the regression coefficients are negative, we take r = -0.663.
Ex. 9: The regression equations are $8x - 10y + 66 = 0$ and $40x - 18y = 214$. The value of variance of x is 9. Find:

(1) The mean values of x and y,
(2) The correlation between x and y, and
(3) The standard deviation of y. (Nov. 2015, May 2019)

Sol.: (1) Since both the regression lines pass through the point $(\bar{x}, \bar{y})$, we have
$8\bar{x} - 10\bar{y} + 66 = 0$ and $40\bar{x} - 18\bar{y} = 214$

Solving these two equations, we get

$\bar{x} = 13$ and $\bar{y} = 17$
(2) Let $8x - 10y + 66 = 0$ be the line of regression of y on x and $40x - 18y = 214$ be the line of regression of x on y.
These equations can be written in the form
$y = \frac{8}{10}x + \frac{66}{10}$ and $x = \frac{18}{40}y + \frac{214}{40}$

$y = 0.8x + 6.6$ and $x = 0.45y + 5.35$

$b_{yx}$ = Regression coefficient of y on x = 0.8
and $b_{xy}$ = Regression coefficient of x on y = 0.45

Correlation coefficient between x and y is given by

$r = \pm\sqrt{b_{xy} \times b_{yx}} = \pm\sqrt{0.45 \times 0.8} = \pm 0.6$

But since both the regression coefficients are positive, we take r = +0.6

(3) Variance of x = 9, i.e. $\sigma_x^2 = 9$

$\therefore \sigma_x = 3$

We have,
$b_{yx} = r\frac{\sigma_y}{\sigma_x}$
$0.8 = 0.6 \times \frac{\sigma_y}{3}$

$\sigma_y = 4$
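The three parts of this example (means from the intersection, r from the two coefficients, and σy from b_yx) can be reproduced with a short Python sketch. It assumes each line is written as a·x + b·y = c and that the first line is taken as the regression of y on x, as in the solution; the helper names are illustrative.

```python
import math

def intersection(line1, line2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    a1, b1, c1 = line1
    a2, b2, c2 = line2
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# 8x - 10y + 66 = 0  ->  8x - 10y = -66   (taken as y on x)
# 40x - 18y = 214                          (taken as x on y)
y_on_x = (8, -10, -66)
x_on_y = (40, -18, 214)

mean_x, mean_y = intersection(y_on_x, x_on_y)

byx = -y_on_x[0] / y_on_x[1]   # slope dy/dx of the first line
bxy = -x_on_y[1] / x_on_y[0]   # slope dx/dy of the second line

# The labelling of the lines is only valid if byx * bxy <= 1.
assert byx * bxy <= 1
r = math.copysign(math.sqrt(byx * bxy), byx)  # r takes the sign of the coefficients

sd_x = math.sqrt(9)            # variance of x is given as 9
sd_y = byx * sd_x / r          # from byx = r * sd_y / sd_x
print(mean_x, mean_y, round(r, 4), round(sd_y, 4))
```

If the product of the two coefficients came out above 1, the roles of the two lines would have to be swapped.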

Common questions


To solve a system of equations derived from logarithmic transformations of geometric data, you convert the data into logarithmic forms and substitute these into linear equations. For example, if you have x = a*y^n, taking the logarithm gives you log(x) = log(a) + n*log(y). This can be linearized as X = nY + c, where X = log(x), Y = log(y), and c is a constant derived from log(a). You can then use the method of least squares to solve the transformed linear system by setting up equations from the sum of products of transformed variables and solving them to find n and a .

To estimate the height of a son given the height of the father, use the regression line of son's height on father's height. This regression line is given by y = byx*X + a, where byx is the regression coefficient of son’s height on father’s height, X is the father's height, and a is the intercept. By substituting the father's height into this equation, you can predict the son's height. This method leverages the linear relationship established between the two variables to make estimations based on the regression analysis outcome .

High positive correlation in an engineering context signifies a strong direct relationship between two variables, whereby increases in one variable are associated with increases in the other. This relationship allows engineers to make reliable forecasts and aids in decision-making, as predictions based on highly correlated variables tend to be more accurate. For instance, knowing the positive correlation between the pressure and temperature in a closed system can help in designing safer operational parameters. However, while it improves predictive accuracy and informs strategic decisions, one must consider causality and external factors before implementing such decisions .

A negative correlation coefficient indicates an inverse relationship between two variables, meaning as one variable increases, the other decreases, and vice versa. The strength of this relationship is determined by the absolute value of the correlation coefficient, where values closer to -1 indicate a strong negative relationship. In data analysis, recognizing a negative correlation can help identify and predict trends, assess the interdependence between variables, and inform decision-making processes. However, it is crucial to consider external factors that may influence the variables, as correlation does not imply causation .

To derive the equation of a regression line for predicting a dependent variable (y) from an independent variable (x) using covariance, calculate the regression coefficient (b) as the covariance of the variables divided by the variance of the independent variable (b = Cov(x,y) / Var(x)). The regression line equation is then y = b*x + a, where a is the y-intercept calculated as the mean of y minus b times the mean of x (a = ȳ - b*x̄). This equation represents the line of best fit that minimizes the sum of squared deviations of points from the line .

To calculate the correlation coefficient between two variables using their respective u (deviation of x from mean or assumed mean) and v (deviation of y from mean or assumed mean) values, you calculate the covariance and standard deviations of the u and v series. The covariance is computed as the sum of products of u and v divided by n, the number of data points. The standard deviation for each is calculated as the square root of the sum of squares of deviations divided by n. Finally, the correlation coefficient is r = covariance of (u, v) / (standard deviation of u * standard deviation of v).

Steps involved in preparing data for a correlation coefficient calculation between supply and price include: assigning a base value to both data sets to calculate deviations (let supply deviation u = X - base, and price deviation v = Y - base); finding the sum of products of deviations (Σuv), sum of squares (Σu² and Σv²); calculating individual sums (Σu and Σv); then using these values in the correlation formula: r = (Σuv - n*mean(u)*mean(v)) / sqrt((Σu² - n*mean(u)²) * (Σv² - n*mean(v)²)). This setup facilitates accurate calculation of the correlation coefficient .

To verify regression coefficients for a dataset of height measurements, compute the means of both variables, then the covariance and the two variances. The coefficients are byx = Cov(x, y) / Var(x) and bxy = Cov(x, y) / Var(y). Useful checks: both coefficients carry the sign of the covariance, their product equals r² and so cannot exceed 1, and both regression lines pass through the point of means (x̄, ȳ).

The two regression lines intersect at the point (x̄, ȳ), the means of the two variables, because both lines pass through it. Solving the two regression equations simultaneously therefore gives the means directly, which is how x̄ and ȳ are recovered when only the regression equations are known.

The angle θ between two regression lines can be determined from their regression coefficients bxy and byx. In the (x, y) plane the line of y on x has slope byx and the line of x on y has slope 1/bxy, so the tangent of the acute angle is |(1 - byx * bxy) / (byx + bxy)|. This follows from the general rule that the tangent of the angle between two lines is the difference of the slopes divided by one plus the product of the slopes.
