
WINSEM2023-24 BMAT202L TH VL2023240502271 2024-02-22 Reference-Material-I

Uploaded by

mpkvarun69
Regression

Definition: Regression is the measure of the average relationship between two or more variables in terms of the original units of the data.

Regression Equation: The functional relationship of a dependent variable with one or more independent variables is called a regression equation. It is also called a prediction equation or estimating equation.

Note: The independent variable in regression analysis is called the "predictor" or "regressor", and the dependent variable is called the "regressed" variable.

Types of Regression:
- If there are only two variables under consideration, the regression is called simple regression.
- If there are more than two variables under consideration, the regression is called multiple regression.
- If there are more than two variables under consideration, and only the relation between two of them is established after excluding the effect of the remaining variables, the regression is called partial regression.
- If the relationship between x and y is non-linear, the regression is called curvilinear regression.

Guidelines for using regression lines:
1) Use a regression line to predict values only when there is a significant correlation.
2) Do not use it when there is no significant correlation.
3) Stay within the range of the data. For example, if the data range from 10 to 60, do not predict a value for 400.

Regression Equations (Linear Fit)
- Linear regression equation of y on x
- Linear regression equation of x on y

Equation of the Regression Line of Y on X

The regression line of Y on X is the best-fitting straight line for the observed pairs of values $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, based on the assumption that x is the independent variable and y is the dependent variable.

Let the equation of the regression line of Y on X be assumed as

$$y = ax + b. \tag{1}$$

By the principle of least squares, the normal equations which give the values of a and b
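are

$$\sum y_i = a \sum x_i + nb \tag{2}$$

and

$$\sum x_i y_i = a \sum x_i^2 + b \sum x_i. \tag{3}$$

Dividing equation (2) by n, we get

$$\bar{y} = a\bar{x} + b, \tag{4}$$

where $\bar{x} = E(X)$ and $\bar{y} = E(Y)$. Subtracting (4) from (1) gives the required equation as

$$y - \bar{y} = a(x - \bar{x}). \tag{5}$$

Eliminating b between equations (2) and (3), we get

$$a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}$$

or

$$a = \frac{p_{XY}}{\sigma_X^2}, \tag{6}$$

where $p_{XY} = \frac{1}{n}\sum x_i y_i - \bar{x}\bar{y}$ is the covariance of X and Y, and $\sigma_X^2 = \frac{1}{n}\sum x_i^2 - \bar{x}^2$ is the variance of X.

Using (6) in (5), we get the equation of the regression line of Y on X as

$$y - \bar{y} = \frac{p_{XY}}{\sigma_X^2}(x - \bar{x}) \tag{7}$$

or, since $p_{XY} = r_{XY}\,\sigma_X \sigma_Y$,

$$y - \bar{y} = r_{XY}\frac{\sigma_Y}{\sigma_X}(x - \bar{x}). \tag{8}$$

In a similar manner, assuming the equation of the regression line of X on Y as $x = ay + b$ and using the normal equations $\sum x_i = a \sum y_i + nb$ and $\sum x_i y_i = a \sum y_i^2 + b \sum y_i$, we get the equation of the regression line of X on Y as

$$x - \bar{x} = \frac{p_{XY}}{\sigma_Y^2}(y - \bar{y}) \tag{9}$$

or

$$x - \bar{x} = r_{XY}\frac{\sigma_X}{\sigma_Y}(y - \bar{y}). \tag{10}$$

Notes:

1. $\frac{p_{XY}}{\sigma_X^2} = r_{XY}\frac{\sigma_Y}{\sigma_X}$ is called the regression coefficient of Y on X and is denoted by $b_1$ or $b_{YX}$. Similarly, $\frac{p_{XY}}{\sigma_Y^2} = r_{XY}\frac{\sigma_X}{\sigma_Y}$ is called the regression coefficient of X on Y and is denoted by $b_2$ or $b_{XY}$.

2. Clearly $b_1 b_2 = r_{XY}^2$, i.e., $r_{XY}$ is the geometric mean of $b_1$ and $b_2$: $r_{XY} = \pm\sqrt{b_1 b_2}$. The sign of $r_{XY}$ is the same as that of $b_1$ or $b_2$, because $b_1 = r_{XY}\frac{\sigma_Y}{\sigma_X}$ and $b_2 = r_{XY}\frac{\sigma_X}{\sigma_Y}$ have the same sign as $r_{XY}$ (since $\sigma_X$ and $\sigma_Y$ are positive). Also, $\frac{b_1}{b_2} = \frac{\sigma_Y^2}{\sigma_X^2}$.

3. When there is perfect linear correlation between X and Y, viz., when $r_{XY} = \pm 1$, the two regression lines coincide.

4. The point of intersection of the two regression lines is clearly the point whose co-ordinates are $(\bar{x}, \bar{y})$.

5. When there is no linear correlation between X and Y, viz., when $r_{XY} = 0$, the equations of the regression lines become $y = \bar{y}$ and $x = \bar{x}$, which are at right angles.

Problem 1: For the following data, find the regression line of y on x.

x | 1 | 2 | 3 | 4 | 5 | 8 | 10
y | 9 | 8 | 10 | 12 | 14 | 16 | 15

Solution 1: $\bar{x} = \frac{\sum x_i}{n} = \frac{33}{7} = 4.714$ and $\bar{y} = \frac{\sum y_i}{n} = \frac{84}{7} = 12$.

x | y | xy | x²
1 | 9 | 9 | 1
2 | 8 | 16 | 4
3 | 10 | 30 | 9
4 | 12 | 48 | 16
5 | 14 | 70 | 25
8 | 16 | 128 | 64
10 | 15 | 150 | 100
Total: 33 | 84 | 451 | 219

$$b_{yx} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2} = \frac{451 - 396}{219 - 155.571} = 0.867$$

The regression equation of y on x is

$$y - \bar{y} = b_{yx}(x - \bar{x}) \Rightarrow y - 12 = 0.867(x - 4.714) \Rightarrow y = 0.867x + 7.913.$$

Problem 2: From the following data, fit the two regression equations by finding the actual means of x and y, i.e., by the actual mean method.

x | 1 | 2 | 3 | 4 | 5 | 6 | 7
y | 2 | 4 | 7 | 6 | 5 | 6 | 5

Solution 2: $\bar{x} = \frac{\sum x_i}{n} = \frac{28}{7} = 4$ and $\bar{y} = \frac{\sum y_i}{n} = \frac{35}{7} = 5$.

x | y | X = x - x̄ | Y = y - ȳ | X² | Y² | XY
1 | 2 | -3 | -3 | 9 | 9 | 9
2 | 4 | -2 | -1 | 4 | 1 | 2
3 | 7 | -1 | 2 | 1 | 4 | -2
4 | 6 | 0 | 1 | 0 | 1 | 0
5 | 5 | 1 | 0 | 1 | 0 | 0
6 | 6 | 2 | 1 | 4 | 1 | 2
7 | 5 | 3 | 0 | 9 | 0 | 0
Total: 28 | 35 | 0 | 0 | 28 | 16 | 11

$$b_{yx} = \frac{\sum XY}{\sum X^2} = \frac{11}{28} = 0.3929, \qquad b_{xy} = \frac{\sum XY}{\sum Y^2} = \frac{11}{16} = 0.6875.$$

The regression equation of y on x is

$$y - \bar{y} = b_{yx}(x - \bar{x}) \Rightarrow y - 5 = 0.3929(x - 4) \Rightarrow y = 0.393x + 3.429.$$

The regression equation of x on y is

$$x - \bar{x} = b_{xy}(y - \bar{y}) \Rightarrow x - 4 = 0.6875(y - 5) \Rightarrow x = 0.688y + 0.56.$$

Problem 3: From the following results, obtain the two regression equations, and estimate the yield of crops when the rainfall is 29 cm and the rainfall when the yield is 600 kg.

 | Y (yield in kg) | X (rainfall in cm)
Mean | 508.4 | 26.7
S.D. | 36.8 | 4.6

The coefficient of correlation between yield and rainfall is 0.52.

Solution 3: We have $\bar{x} = 26.7$, $\bar{y} = 508.4$, $\sigma_x = 4.6$, $\sigma_y = 36.8$ and $r = 0.52$. Now,

$$b_{yx} = r\frac{\sigma_y}{\sigma_x} = 0.52 \times \frac{36.8}{4.6} = 4.16 \quad\text{and}\quad b_{xy} = r\frac{\sigma_x}{\sigma_y} = 0.52 \times \frac{4.6}{36.8} = 0.065.$$

The required regression equations are

$$y = 4.16x + 397.328 \quad\text{and}\quad x = 0.065y - 6.346.$$

When x = 29 cm, we have y = (4.16 × 29) + 397.328 = 517.968 kg.
When y = 600 kg, we have x = (0.065 × 600) - 6.346 = 32.654 cm.

That is, when the rainfall is 29 cm, the estimated yield of the crop is 517.968 kg, and when the yield is 600 kg, the estimated rainfall is 32.654 cm.

Multiple Linear Regression

If the number of independent variables in a regression model is more than one, the model is called a multiple regression model. Many real-world applications demand the use of multiple regression models.

Regression model with two independent variables using normal equations:

Suppose the number of independent variables is two. Then

$$Y = b_0 + b_1 X_1 + b_2 X_2.$$

The normal equations are

$$\sum Y = n b_0 + b_1 \sum X_1 + b_2 \sum X_2$$
$$\sum Y X_1 = b_0 \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2$$
$$\sum Y X_2 = b_0 \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2$$

Problem 1: The annual sales revenue (in crores of rupees) of a product as a function of sales force (number of salesmen) and annual advertising expenditure (in lakhs of rupees) for the past 10 years are summarized in the following table.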
Annual sales revenue, Y | 20 | 23 | 25 | 27 | 21 | 29 | 22 | 24 | 27 | 35
Sales force, X1 | 8 | 13 | 8 | 18 | 23 | 16 | 10 | 12 | 14 | 20
Annual advertising expenditure, X2 | 28 | 23 | 38 | 16 | 20 | 28 | 23 | 30 | 26 | 32

Let the regression model be

$$Y = b_0 + b_1 X_1 + b_2 X_2.$$

Y | X1 | X2 | X1² | X2² | X1X2 | YX1 | YX2
20 | 8 | 28 | 64 | 784 | 224 | 160 | 560
23 | 13 | 23 | 169 | 529 | 299 | 299 | 529
25 | 8 | 38 | 64 | 1444 | 304 | 200 | 950
27 | 18 | 16 | 324 | 256 | 288 | 486 | 432
21 | 23 | 20 | 529 | 400 | 460 | 483 | 420
29 | 16 | 28 | 256 | 784 | 448 | 464 | 812
22 | 10 | 23 | 100 | 529 | 230 | 220 | 506
24 | 12 | 30 | 144 | 900 | 360 | 288 | 720
27 | 14 | 26 | 196 | 676 | 364 | 378 | 702
35 | 20 | 32 | 400 | 1024 | 640 | 700 | 1120
Total: 253 | 142 | 264 | 2246 | 7326 | 3617 | 3678 | 6751

Substituting the required values in the normal equations, we get the following simultaneous equations:

$$253 = 10 b_0 + 142 b_1 + 264 b_2$$
$$3678 = 142 b_0 + 2246 b_1 + 3617 b_2$$
$$6751 = 264 b_0 + 3617 b_1 + 7326 b_2$$

In matrix form,

$$\begin{bmatrix} 10 & 142 & 264 \\ 142 & 2246 & 3617 \\ 264 & 3617 & 7326 \end{bmatrix} \begin{bmatrix} b_0 \\ b_1 \\ b_2 \end{bmatrix} = \begin{bmatrix} 253 \\ 3678 \\ 6751 \end{bmatrix}.$$

The solution to the above set of simultaneous equations is $b_0 = 5.1483$, $b_1 = 0.6190$ and $b_2 = 0.4304$.

Therefore, the regression model is

$$Y = 5.1483 + 0.6190 X_1 + 0.4304 X_2.$$
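The normal equations of the sales revenue problem can also be assembled and solved numerically. The following is a minimal sketch, assuming NumPy is available; the function name `fit_normal_equations` and the variable names are illustrative and not part of the notes. It builds a design matrix A with a leading column of ones, so that AᵀA holds the sums on the right-hand side of the normal equations and Aᵀy holds the sums on the left-hand side.

```python
import numpy as np

def fit_normal_equations(y, *predictors):
    """Solve the least-squares normal equations (A^T A) b = A^T y.

    A has a leading column of ones for the intercept, followed by one
    column per predictor, so the result is b = [b0, b1, ..., bk].
    (Hypothetical helper written for illustration.)
    """
    y = np.asarray(y, dtype=float)
    columns = [np.ones_like(y)] + [np.asarray(p, dtype=float) for p in predictors]
    A = np.column_stack(columns)
    lhs = A.T @ A   # entries: n, sum X1, sum X2, sum X1^2, sum X1*X2, ...
    rhs = A.T @ y   # entries: sum Y, sum Y*X1, sum Y*X2
    return lhs, rhs, np.linalg.solve(lhs, rhs)

# Data from the sales revenue problem
sales = [20, 23, 25, 27, 21, 29, 22, 24, 27, 35]        # Y
sales_force = [8, 13, 8, 18, 23, 16, 10, 12, 14, 20]    # X1
advertising = [28, 23, 38, 16, 20, 28, 23, 30, 26, 32]  # X2

lhs, rhs, coeffs = fit_normal_equations(sales, sales_force, advertising)
print(lhs)                  # coefficient matrix of the normal equations
print(rhs)                  # right-hand side: sum Y, sum Y*X1, sum Y*X2
print(np.round(coeffs, 4))  # b0, b1, b2
```

Solving the 3 × 3 system directly mirrors the hand calculation. For larger or ill-conditioned problems, a routine such as `np.linalg.lstsq` applied to A itself is usually preferred, but that is a different technique from the one shown in these notes.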
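Returning to simple regression, the actual mean method of Problem 2 (x = 1, ..., 7 and y = 2, 4, 7, 6, 5, 6, 5) can be checked in a few lines. This is a sketch under the assumption that NumPy is available; `regression_lines` and the dictionary keys are hypothetical names chosen for illustration. It computes both regression coefficients from deviations about the means and also the correlation coefficient, so the property $b_{YX}\, b_{XY} = r_{XY}^2$ can be confirmed on the data.

```python
import numpy as np

def regression_lines(x, y):
    """Return slope and intercept of both regression lines, plus r.

    Uses deviations from the actual means (the actual mean method).
    (Hypothetical helper written for illustration.)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    dx, dy = x - x_bar, y - y_bar      # deviations X and Y from the means
    sxy = (dx * dy).sum()              # sum of XY
    sxx = (dx ** 2).sum()              # sum of X^2
    syy = (dy ** 2).sum()              # sum of Y^2
    b_yx = sxy / sxx                   # regression coefficient of y on x
    b_xy = sxy / syy                   # regression coefficient of x on y
    return {
        "b_yx": b_yx,
        "a_yx": y_bar - b_yx * x_bar,  # intercept of the line y = a_yx + b_yx * x
        "b_xy": b_xy,
        "a_xy": x_bar - b_xy * y_bar,  # intercept of the line x = a_xy + b_xy * y
        "r": sxy / np.sqrt(sxx * syy),
    }

result = regression_lines([1, 2, 3, 4, 5, 6, 7], [2, 4, 7, 6, 5, 6, 5])
for name, value in result.items():
    print(f"{name} = {value:.4f}")

# The product of the two regression coefficients equals r squared.
print(np.isclose(result["b_yx"] * result["b_xy"], result["r"] ** 2))
```

The same function can be applied to the data of the first simple regression problem by passing its x and y lists instead.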
