
IBM ICE (Innovation Centre for Education)

Welcome to:
Multiple regression and model building

9.1
Unit objectives

After completing this unit, you should be able to:

• Understand the concept of multiple regression
• Learn the significance of ordinary least squares
• Gain insight into regression model building
• Gain conceptual clarity on interpreting regression coefficients
• Learn the concepts of standardized coefficients and categorical variables
• Know how to validate a multiple regression model
• Get a brief idea of R-squared and adjusted R-squared
• Have a clear understanding of the t-test and F-test
Introduction
• Multiple regression is an extension of simple linear regression.

• We consider the problem of regression when a study variable depends on more than one
explanatory (or independent) variable; this is called the multiple linear regression model.

• It is used to predict the value of a variable based on the values of two or more other
variables.

• For example, we may use multiple regression to understand whether examination
performance can be predicted from revision time, test anxiety, lecture attendance and
gender.
Ordinary least squares estimation for multiple linear regression

• OLS allows us to estimate the relation between a dependent variable and a set of explanatory
variables.

• The dependent variable is an interval variable and can, in principle, take any real value
between −∞ and +∞.

• The multiple linear regression model assumes a linear relationship between a dependent
variable yi and a set of explanatory variables x'i = (xi0, xi1, ..., xiK). Each xik is also called an
independent variable, a covariate or a regressor. The first regressor xi0 = 1 is a constant
unless otherwise specified.

• Consider a sample of N observations i = 1, ..., N. Every single observation i follows

  – yi = x'iβ + ui

• where β is a (K + 1)-dimensional column vector of parameters, x'i is a (K + 1)-dimensional
row vector and ui is a scalar called the error term.
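• As a concrete illustration, the OLS estimate β̂ = (X'X)⁻¹X'y can be computed directly. The
following is a minimal sketch in Python with NumPy; the data values are invented purely for
demonstration.

```python
import numpy as np

# Illustrative data: N = 5 observations, a constant plus K = 2 regressors.
# The numbers are made up for demonstration only.
X = np.array([[1.0, 2.0, 3.0],
              [1.0, 1.0, 5.0],
              [1.0, 4.0, 2.0],
              [1.0, 3.0, 4.0],
              [1.0, 5.0, 1.0]])   # first column is the constant x_i0 = 1
y = np.array([7.0, 9.0, 8.0, 10.0, 6.0])

# OLS estimate: beta_hat solves (X'X) beta = X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Residuals u_hat = y - X beta_hat
residuals = y - X @ beta_hat
print("beta_hat:", beta_hat)
print("residuals:", residuals)
```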
Multiple linear regression model building

• Let y denote the dependent variable that is linearly related to k independent (or explanatory)
variables X1, X2, ..., Xk through the parameters β1, β2, ..., βk, and we write

  y = β1X1 + β2X2 + ... + βkXk + ε    ... (1)

• We note that the jth regression coefficient βj represents the expected change in y per unit
change in the jth independent variable Xj. Assuming E(ε) = 0,

  βj = ∂E(y)/∂Xj

• A model is said to be linear when it is linear in its parameters. In such a case, ∂E(y)/∂βj
should not depend on any β's.
Partial correlation and regression model building

• Partial correlation is a measure of the strength and direction of a linear relationship between
two continuous variables whilst controlling for the effect of one or more other continuous
variables (also known as covariates or control variables).

• Suppose we want to find the correlation between Y and X controlling for W. This is called the
partial correlation.

• We need to ensure that no variance predictable from W enters the relationship between Y and
X. In z-score form, we can predict both X and Y from W, then subtract those predictions,
leaving only the information in X and Y that is independent of W, as follows:

  zXP = rXW zW    (1)
  zYP = rYW zW    (2)

where zXP and zYP are the predicted z-scores for X and Y respectively.

• Subtracting these predicted scores we get

  zX(res) = zX − rXW zW    (3)
  zY(res) = zY − rYW zW    (4)

with variance (1 − rXW²) and (1 − rYW²) respectively, where zX(res) and zY(res) are the
residual information in X and Y controlling for W.
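• The residual-based computation above can be sketched directly. Here is a minimal Python
(NumPy) example; the data for X, Y and W are synthetic and the variable names simply mirror
equations (1)-(4).

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=200)
X = 0.5 * W + rng.normal(size=200)   # X partly driven by W
Y = 0.7 * W + rng.normal(size=200)   # Y partly driven by W

# z-scores of each variable
zX = (X - X.mean()) / X.std()
zY = (Y - Y.mean()) / Y.std()
zW = (W - W.mean()) / W.std()

# Predict X and Y from W in z-score form, then subtract the predictions
r_XW = np.corrcoef(X, W)[0, 1]
r_YW = np.corrcoef(Y, W)[0, 1]
zX_res = zX - r_XW * zW   # residual X information, controlling W
zY_res = zY - r_YW * zW   # residual Y information, controlling W

# Partial correlation of X and Y controlling W: correlate the residuals
r_XY_given_W = np.corrcoef(zX_res, zY_res)[0, 1]
print("partial correlation:", r_XY_given_W)
```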
Multiple linear regression model

• A multiple linear regression model with k predictor variables X1, X2, ..., Xk and a response Y
can be written as
  – y = β0 + β1x1 + β2x2 + · · · + βkxk + ε

• More complex models may include higher powers of one or more predictor variables, e.g.,
  – y = β0 + β1x + β2x² + ε    ... (1)

• Or interaction effects of two or more variables:
  – y = β0 + β1x1 + β2x2 + β12x1x2 + ε    ... (2)

• Equations (1) and (2) are examples of such extended models; a fitting sketch follows below.
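• In practice, a squared term or an interaction term is just an extra column of the design
matrix. The following is a minimal Python (NumPy) sketch of fitting model (2) on invented
data; the true coefficient values are chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(0, 10, size=100)
x2 = rng.uniform(0, 10, size=100)
# Synthetic response following y = b0 + b1*x1 + b2*x2 + b12*x1*x2 + e
y = 2.0 + 1.5 * x1 - 0.5 * x2 + 0.3 * x1 * x2 + rng.normal(size=100)

# Design matrix: constant, x1, x2 and the interaction column x1*x2
X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated [b0, b1, b2, b12]:", beta_hat)
```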
Multiple linear regression coefficients - partial regression coefficients

• Linear regression is one of the most popular statistical techniques.

• A linear regression model with two predictor variables can be expressed with the following
equation: Y = B0 + B1*X1 + B2*X2 + e.

• The variables in the model are:
  – Y, the response variable.
  – X1, the first predictor variable.
  – X2, the second predictor variable and
  – e, the residual error, which is an unmeasured variable.

• The parameters in the model are:
  – B0, the Y-intercept.
  – B1, the first regression coefficient and
  – B2, the second regression coefficient.

• Interpreting the intercept:
  – B0, the Y-intercept, can be interpreted as the value we would predict for Y if both X1 = 0 and X2 = 0.

• Interpreting the partial regression coefficients:
  – B1 is the expected change in Y per unit change in X1, holding X2 constant; B2 is interpreted analogously.
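• A short sketch (Python/NumPy, invented data) of fitting Y = B0 + B1*X1 + B2*X2 + e and
reading off the parameters; the true values 10, 2 and −1 are arbitrary choices for
demonstration.

```python
import numpy as np

rng = np.random.default_rng(2)
X1 = rng.normal(5, 2, size=50)
X2 = rng.normal(3, 1, size=50)
Y = 10.0 + 2.0 * X1 - 1.0 * X2 + rng.normal(size=50)

# Fit by least squares with an explicit intercept column
X = np.column_stack([np.ones_like(X1), X1, X2])
B0, B1, B2 = np.linalg.lstsq(X, Y, rcond=None)[0]

# B0: predicted Y when X1 = 0 and X2 = 0
# B1: expected change in Y per unit change in X1, holding X2 fixed
# B2: expected change in Y per unit change in X2, holding X1 fixed
print(f"B0 = {B0:.2f}, B1 = {B1:.2f}, B2 = {B2:.2f}")
```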
Standardized regression coefficients

• The value of a slope in a multiple regression problem depends on the units in which the
corresponding predictor xj is measured.

• Scaling is necessary to overcome this problem of comparison.

• Unit normal scaling: subtract the sample mean and divide by the sample standard deviation,
for both the predictor variables and the response:

  zij = (xij − x̄j) / sj,    yi* = (yi − ȳ) / sy

  – where sj is the estimated sample standard deviation of predictor xj and sy is the
estimated sample standard deviation of the response.

• Using these new standardized variables, our regression model becomes:

  – yi* = b1zi1 + b2zi2 + · · · + bkzik + εi,    i = 1, . . . , n.

• The least squares estimator b̂ = (Z'Z)⁻¹Z'y* is the standardized coefficient estimate; a short
sketch follows below.
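• A minimal Python (NumPy) sketch of unit normal scaling, using synthetic predictors that are
deliberately placed on very different scales so the effect of standardizing is visible.

```python
import numpy as np

rng = np.random.default_rng(3)
# Two predictors on very different scales (factors 10.0 and 0.1 are arbitrary)
X = rng.normal(size=(100, 2)) * [10.0, 0.1]
y = X @ np.array([0.2, 30.0]) + rng.normal(size=100)

# Unit normal scaling: z-score each predictor and the response
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # s_j = sample std of predictor j
y_star = (y - y.mean()) / y.std(ddof=1)            # s_y = sample std of response

# Standardized coefficient estimate: b_hat = (Z'Z)^{-1} Z'y*
b_hat = np.linalg.solve(Z.T @ Z, Z.T @ y_star)
print("standardized coefficients:", b_hat)   # now directly comparable across predictors
```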


Missing data

• Missing data causes problems because multiple regression procedures require that every
case have a score on every variable that is used in the analysis.

• The most common ways of dealing with missing data are:
  – Pairwise deletion, listwise deletion, deletion of variables and coding of missingness.

• If data are missing randomly, then it may be appropriate to estimate each bivariate
correlation on the basis of all cases that have data on the two variables.
  – Pairwise deletion of missing data.

• A second procedure is to delete an entire case if information is missing on any one of the
variables that is used in the analysis.
  – Listwise deletion.

• A third procedure is simply to delete a variable that has substantial missing data.
  – Deletion of variables.
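• A hedged illustration with pandas of three of these strategies (coding of missingness is
omitted here); the small data frame and its NaN gaps are invented for demonstration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "y":  [1.0, 2.0, np.nan, 4.0, 5.0],
    "x1": [2.0, np.nan, 6.0, 8.0, 10.0],
    "x2": [1.0, 1.5, 2.0, np.nan, 3.0],
})

# Listwise deletion: drop any case with a missing value on any variable
listwise = df.dropna()

# Deletion of a variable: drop a column with substantial missing data
dropped_var = df.drop(columns=["x2"])

# Pairwise deletion: pandas' corr() uses, for each pair of variables,
# all cases that have data on both of those variables
pairwise_corr = df.corr()
print(listwise, dropped_var, pairwise_corr, sep="\n\n")
```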
Validation of multiple regression model

• The validation process can involve analyzing the goodness of fit of the regression, analyzing
whether the regression residuals are random, and checking whether the model's predictive
performance deteriorates substantially when applied to data that were not used in model
estimation.

• One measure of goodness of fit is R² (the coefficient of determination), which, in ordinary
least squares with an intercept, ranges between 0 and 1.

• Numerical methods also play an important role in model validation. For example, the lack-of-fit
test for assessing the correctness of the functional part of the model can aid in interpreting
a borderline residual plot.

• Cross-validation is the process of assessing how the results of a statistical analysis will
generalize to an independent data set.

• A development in medical statistics is the use of out-of-sample cross-validation techniques in
meta-analysis. It forms the basis of the validation statistic, Vn, which is used to test the
statistical validity of meta-analysis summary estimates. Essentially it measures a type of
normalized prediction error, and its distribution is a linear combination of χ² variables with
1 degree of freedom each.
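• A minimal cross-validation sketch using scikit-learn's cross_val_score on synthetic data;
this is one common implementation choice, not a tool prescribed by the unit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X = rng.normal(size=(60, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=60)

# 5-fold cross-validation: fit on 4 folds, score (R^2) on the held-out fold
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print("per-fold R^2:", scores)
print("mean out-of-sample R^2:", scores.mean())
```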
Coefficient of multiple determination (R-squared)

• R-squared is a goodness-of-fit measure for linear regression models.

• It indicates the percentage of the variance in the dependent variable that the independent
variables explain collectively.

• R-squared measures the strength of the relationship between the model and the dependent
variable on a convenient 0-100% scale.

• Residuals are the distances between the observed values and the fitted values; in terms of
them, R² = 1 − SSres/SStot, where SSres is the sum of squared residuals and SStot is the
total sum of squares about the mean.
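• A short Python (NumPy) sketch computing R² from the residuals of a simple fit; the data are
synthetic.

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, size=40)
y = 3.0 + 2.0 * x + rng.normal(size=40)

# Fit a line by least squares and form the residuals
X = np.column_stack([np.ones_like(x), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
residuals = y - X @ beta_hat

# R^2 = 1 - SS_res / SS_tot
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
print(f"R-squared: {r_squared:.3f}")   # on a 0-1 (i.e., 0-100%) scale
```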
Adjusted R-squared

• The adjusted R-squared compares the explanatory power of regression models that contain
different numbers of predictors.

• The adjusted R-squared is a modified version of R-squared that has been adjusted for the
number of predictors in the model.

• Multiple R-squared is the proportion of Y variance that can be explained by the linear model
using the X variables in the sample data, but it over-estimates that proportion in the population.

• Consider, for example, sample R² = 0.60 based on k = 7 predictor variables in a sample of
N = 15 cases. An estimate of the proportion of Y variance that can be accounted for by the X
variables in the population is called shrunken R-squared or adjusted R-squared. It can be
calculated with the following formula:

  Shrunken R² = R̃² = 1 − (1 − R²) · (N − 1)/(N − k − 1) = 1 − (1 − 0.6) · (14/7) = 0.20
Statistical significance: t-Test

• A t-test is a type of inferential statistic used to determine if there is a significant difference
between the means of two groups, which may be related in certain features.

• A t-test looks at the t-statistic, the t-distribution values and the degrees of freedom to
determine the probability of difference between two sets of data.

• Mathematically, the t-test takes a sample from each of the two sets and establishes the
problem statement by assuming a null hypothesis that the two means are equal.

• For a large sample size, statisticians use a z-test. Other testing options include the chi-square
test and the F-test.
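• In the regression setting, the same idea is applied per coefficient: each individual coefficient
is t-tested against the null hypothesis that it equals zero. A hedged sketch using statsmodels
on synthetic data (where the second predictor truly has a zero coefficient):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
X = rng.normal(size=(80, 2))
y = 1.0 + 2.5 * X[:, 0] + 0.0 * X[:, 1] + rng.normal(size=80)

# t-test for each coefficient: H0 is that the coefficient equals zero
results = sm.OLS(y, sm.add_constant(X)).fit()
print(results.tvalues)   # t-statistics for intercept, x1, x2
print(results.pvalues)   # corresponding two-sided p-values
```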
Checkpoint (1 of 2)

Multiple choice questions:

1. Multiple linear regression (MLR) is a __________ type of statistical analysis.


a) Univariate
b) Bivariate
c) Multivariate
d) None of these
2. The following types of data can be used in MLR (choose all that apply)
a) Interval or higher dependent variable (DV)
b) Interval or higher independent variables (IVs)
c) Dichotomous IVs
d) Interval or lower independent variables
3. A LR analysis produces the equation Y = -3.2X + 7. This indicates that:
a) A 1 unit increase in X results in a 3.2 unit decrease in Y.
b) A 1 unit decrease in X results in a 3.2 unit decrease in Y.
c) A 1 unit increase in X results in a 3.2 unit increase in Y.
d) None of these
Checkpoint solutions (1 of 2)

Multiple choice questions:

1. Multiple linear regression (MLR) is a __________ type of statistical analysis.
   Answer: c) Multivariate
2. The following types of data can be used in MLR (choose all that apply)
   Answers: a) Interval or higher dependent variable (DV), b) Interval or higher independent
   variables (IVs), and c) Dichotomous IVs
3. A LR analysis produces the equation Y = -3.2X + 7. This indicates that:
   Answer: a) A 1 unit increase in X results in a 3.2 unit decrease in Y.
Checkpoint (2 of 2)

Fill in the blanks:

1. The main purpose of linear regression (LR) is to explain ____.
2. In MLR, the square of the multiple correlation coefficient, or R², is called the _____.
3. _______ is a modified version of R-squared.
4. OLS stands for _______.

True or False:

1. The major conceptual limitation of all regression techniques is that one can only ascertain
relationships, but never be sure about the underlying causal mechanism. True/False
2. In MLR, a residual is the difference between the predicted Y and actual Y values.
True/False
3. Multiple regression is not an extension of simple linear regression. True/False
Checkpoint solutions (2 of 2)

Fill in the blanks:

1. The main purpose of linear regression (LR) is to explain one variable in terms of another.
2. In MLR, the square of the multiple correlation coefficient, or R², is called the coefficient of
determination.
3. Adjusted R-squared is a modified version of R-squared.
4. OLS stands for Ordinary Least Squares.

True or False:

1. The major conceptual limitation of all regression techniques is that one can only ascertain
relationships, but never be sure about the underlying causal mechanism. True
2. In MLR, a residual is the difference between the predicted Y and actual Y values. True
3. Multiple regression is not an extension of simple linear regression. False
Question bank

Two marks questions:

1. What is multiple linear regression?
2. What is meant by dependent and independent variables?
3. What is the notion of partial correlation?
4. Define the validation process in multiple linear regression.

Four marks questions:

1. Give an insight into standardized regression coefficients.
2. What are categorical variables? How are the regression coefficients of categorical variables
interpreted?
3. What is the response variable? What are the explanatory variables?
4. What is adjusted R-squared? Give its mathematical representation.

Eight marks questions:

1. Describe the coefficient of multiple determination (R-squared).
2. Describe the statistical significance of individual variables in multiple linear regression: the t-test.
Unit summary

Having completed this unit, you should be able to:

• Understand the concept of multiple regression
• Learn the significance of ordinary least squares
• Gain insight into regression model building
• Gain conceptual clarity on interpreting regression coefficients
• Learn the concepts of standardized coefficients and categorical variables
• Know how to validate a multiple regression model
• Get a brief idea of R-squared and adjusted R-squared
• Have a clear understanding of the t-test and F-test
