
Brief Lecture Notes On Simple Linear Regression Regression Analysis

This document provides an overview of simple linear regression analysis. It defines key concepts such as the regression model, dependent and independent variables, and the regression line. It presents the formula for the least squares regression line as ŷ = a + bx. It provides an example of calculating the least squares regression line using income and food expenditure data from seven households. It also briefly introduces multiple linear regression, which uses two or more independent variables to model the relationship with a dependent variable.

Uploaded by

Maliha Farzana
Copyright © All Rights Reserved

Lecture 4

Brief lecture notes on simple linear regression

Regression analysis
Regression analysis develops an equation that expresses the relationship between
two variables and estimates the value of the dependent variable Y for a selected
value of the independent variable X. The technique used to develop the equation
of the straight line and make these predictions is called regression analysis.
In short, regression is a statistical method for estimating (or predicting) the
unknown values of one variable (Y) for specified values of the other variable
(X).

Definition
A regression model is a mathematical equation that describes the relationship
between two or more variables. A simple regression model includes only two
variables: one independent and one dependent. The dependent variable is the
one being explained, and the independent variable is the one used to explain the
variation in the dependent variable.

A (simple) regression model that gives a straight-line relationship between two
variables is called a linear regression model.

Regression lines or regression equations


In the regression model y = A + Bx + ε, A is called the y-intercept or constant
term, B is the slope, and ε is the random error term. The dependent and
independent variables are y and x, respectively.
In the model ŷ = a + bx, a and b, which are calculated using sample data, are
called the estimates of A and B.


In the equation y = A + Bx, y is the dependent variable, A is the constant term
or y-intercept, B is the slope, and x is the independent variable.

Figure_R1 Relationship between food expenditure and income.
(a) Linear relationship. (b) Nonlinear relationship.
[Figure: both panels plot food expenditure (vertical axis) against income
(horizontal axis).]

Figure_R2 Plotting a linear equation.
[Figure: the line y = 50 + 5x, with x on the horizontal axis and y on the
vertical axis, passing through the points (x = 0, y = 50) and (x = 10, y = 100).]
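
The two points marked in Figure_R2 can be reproduced by evaluating the line directly. The following is a minimal sketch; the function name `line` and its default arguments are illustrative choices, not part of the notes.

```python
def line(x, intercept=50.0, slope=5.0):
    """Evaluate y = intercept + slope * x (defaults match y = 50 + 5x)."""
    return intercept + slope * x


# The two x values marked in Figure_R2.
points = [(x, line(x)) for x in (0, 10)]
print(points)
```

Any two points are enough to draw a straight line, which is why the figure marks only x = 0 and x = 10.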

The Least Squares Line

For the least squares regression line

ŷ = a + bx,

b = SS_xy / SS_xx   and   a = ȳ − bx̄

where

SS_xy = Σxy − (Σx)(Σy) / n   and   SS_xx = Σx² − (Σx)² / n


and SS stands for “sum of squares”. The least squares regression line
ŷ = a + bx is also called the regression of y on x.
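
The formulas for a and b translate almost line by line into code. The sketch below is one possible way to write them in plain Python; the function name `least_squares_line` and the small data set in the usage lines are made up for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y-hat = a + b*x using the SS formulas."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # SS_xy = sum(xy) - (sum x)(sum y)/n,  SS_xx = sum(x^2) - (sum x)^2/n
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    if ss_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = ss_xy / ss_xx                  # slope
    a = sum_y / n - b * (sum_x / n)    # intercept: y-bar minus b times x-bar
    return a, b


# Hypothetical data lying exactly on y = 1 + 2x.
a, b = least_squares_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a, b)
```

Because the illustrative points lie exactly on a straight line, the fitted intercept and slope recover that line.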

Example
Find the least squares regression line for the data on incomes and food
expenditure on the seven households given in the Table_R1, Use income as an
independent variable and food expenditure as a dependent variable.

Table_R1 Incomes (in hundreds of dollars) and Food Expenditures of Seven Households

Income    Food Expenditure
35        9
49        15
21        7
39        11
15        5
28        8
25        9

Solution

Table_R2

Income      Food Expenditure
x           y           xy           x²
35          9           315          1225
49          15          735          2401
21          7           147          441
39          11          429          1521
15          5           75           225
28          8           224          784
25          9           225          625
Σx = 212    Σy = 64     Σxy = 2150   Σx² = 7222

Σx = 212, Σy = 64

x̄ = Σx / n = 212 / 7 = 30.2857
ȳ = Σy / n = 64 / 7 = 9.1429

SS_xy = Σxy − (Σx)(Σy) / n = 2150 − (212)(64) / 7 = 211.7143
SS_xx = Σx² − (Σx)² / n = 7222 − (212)² / 7 = 801.4286

b = SS_xy / SS_xx = 211.7143 / 801.4286 = .2642
a = ȳ − bx̄ = 9.1429 − (.2642)(30.2857) = 1.1414

Thus,
ŷ = 1.1414 + .2642x

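
As a check on the arithmetic in this example, the short Python sketch below recomputes SS_xy, SS_xx, b and a from the seven households in Table_R1. The variable names are illustrative. One caveat: the code carries full precision throughout, so its intercept comes out near 1.1422, whereas the hand calculation gives 1.1414 because it rounds b to .2642 before computing a.

```python
# Data from Table_R1 (units as given in the table).
income = [35, 49, 21, 39, 15, 28, 25]   # x
food = [9, 15, 7, 11, 5, 8, 9]          # y

n = len(income)
sum_x, sum_y = sum(income), sum(food)
sum_xy = sum(x * y for x, y in zip(income, food))
sum_x2 = sum(x * x for x in income)

ss_xy = sum_xy - sum_x * sum_y / n      # sum(xy) - (sum x)(sum y)/n
ss_xx = sum_x2 - sum_x ** 2 / n         # sum(x^2) - (sum x)^2/n

b = ss_xy / ss_xx                       # slope
a = sum_y / n - b * (sum_x / n)         # intercept, using the unrounded b

print(f"SS_xy = {ss_xy:.4f}, SS_xx = {ss_xx:.4f}")
print(f"b = {b:.4f}, a = {a:.4f}")
```

The small difference in the intercept is purely a rounding effect; both versions describe the same fitted line to two decimal places in the slope.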
Multiple Regression:
The simple linear regression model is written as:

y = A + Bx + ε

This model includes one independent variable, which is denoted by x, and one
dependent variable, which is denoted by y, and the term represented by ε in the
model is called the random error.
Usually a dependent variable is affected by more than one independent variable.
When we include two or more independent variables in a regression model, it is
called a multiple regression model. Remember, whether it is a simple or
multiple regression model, it always includes one and only one dependent
variable.
A multiple regression model with y as a dependent variable and x₁, x₂, ..., xₖ
as independent variables is written as:

y = A + B₁x₁ + B₂x₂ + ... + Bₖxₖ + ε

where A represents the constant term, B₁, B₂, ..., Bₖ are the regression
coefficients of the independent variables x₁, x₂, ..., xₖ, respectively, and ε
represents the random error term. This model contains k independent variables
x₁, x₂, ..., xₖ.

Comparison Chart: Correlation and Regression

Basis for comparison: Meaning
  Correlation: Correlation is a statistical measure which determines
  co-relationship or association of two variables.
  Regression: Regression describes how an independent variable is numerically
  related to the dependent variable.

Basis for comparison: Usage
  Correlation: To represent linear relationship between two variables.
  Regression: To fit a best line and estimate one variable on the basis of
  another variable.

Basis for comparison: Dependent and independent variables
  Correlation: No difference.
  Regression: Both variables are different.

Basis for comparison: Indicates
  Correlation: Correlation coefficient indicates the extent to which two
  variables move together.
  Regression: Regression indicates the impact of a unit change in the known
  variable (x) on the estimated variable (y).

Basis for comparison: Objective
  Correlation: To find a numerical value expressing the relationship between
  variables.
  Regression: To estimate values of random variable on the basis of the values
  of fixed variable.

Example:
A researcher wanted to find out the effect of driving experience and the number of driving
violations on auto insurance premiums. A random sample of 12 drivers insured with the same
company and having similar auto insurance policies was selected from a large city. Table-A
lists the monthly auto insurance premiums (in dollars) paid by these drivers, their driving
experiences (in years), and the numbers of driving violations committed by them during the
past three years. Find the regression equation of monthly premiums paid by drivers on the
driving experiences and the numbers of driving violations.
Table-A:
Monthly Premium ($)    Driving Experience (years)    # of Driving Violations (past 3 years)
148                    5                             2
76                     14                            0
100                    6                             1
126                    10                            3
194                    4                             6
110                    8                             2
114                    11                            3
86                     16                            1
198                    3                             5
92                     9                             1
70                     19                            0
120                    13                            3
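
The notes do not show how the coefficients of a multiple regression are obtained. One standard approach, sketched below in plain Python, is to solve the least squares normal equations (XᵀX)β = Xᵀy, where X has a leading column of ones for the constant term. The helper names `solve_linear_system` and `fit_multiple_regression` are hypothetical, and in practice statistical software would normally be used instead. For the Table-A data this sketch gives approximately ŷ = 110.28 − 2.7473x₁ + 16.1061x₂, with x₁ the driving experience and x₂ the number of violations.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * beta = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_multiple_regression(predictors, y):
    """Least squares estimates [a, b1, ..., bk] for y-hat = a + b1*x1 + ... + bk*xk."""
    # Each row of the design matrix is [1, x1_i, ..., xk_i].
    rows = [[1.0] + [float(col[i]) for col in predictors] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Data from Table-A.
premium = [148, 76, 100, 126, 194, 110, 114, 86, 198, 92, 70, 120]
experience = [5, 14, 6, 10, 4, 8, 11, 16, 3, 9, 19, 13]
violations = [2, 0, 1, 3, 6, 2, 3, 1, 5, 1, 0, 3]

a, b1, b2 = fit_multiple_regression([experience, violations], premium)
print(f"premium-hat = {a:.4f} + ({b1:.4f})(experience) + ({b2:.4f})(violations)")
```

Read as in the simple case, the fitted coefficients suggest that, in this sample, each extra year of experience goes with a lower monthly premium and each extra violation with a higher one, holding the other variable fixed.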

Assignment on lecture 5

1. The Bradford Electric Illuminating Company is studying the relationship
between kilowatt-hours (thousands) and number of rooms in a private
single-family residence. A random sample of 10 homes yielded the following.
Number of Rooms               12   9  14   6  10   8  10  10   5   7
Kilowatt-hours (thousands)     9   7  10   5   8   6   8  10   4   7
(i) Determine the regression equation.
(ii) Determine the number of kilowatt-hours, in thousands, for a six-room
house.
2. Mr. James McWhinney, president of Daniel-James Financial Services,
believes there is a relationship between the number of client contacts and
the dollar amount of sales. To document this assertion, Mr. McWhinney
gathered the following sample information. The X column indicates the
number of client contacts last month, and the Y column shows the value of
sales ($ thousands) last month for each client sampled.
Number of Contacts (X)     14  12  20  16  46  23  48  50   55   50
Sales ($ thousands) (Y)    24  14  28  30  80  30  90  85  120  110
(i) Determine the regression equation.
(ii) Determine the estimated sales if 40 contacts are made.
