NPTEL Syllabus
Regression Analysis - Web course
COURSE OUTLINE
Simple linear regression, multiple linear regression, model adequacy checking,
transformations and weighting to correct model inadequacies, diagnostics for leverage and
influence. Polynomial regression models, orthogonal polynomials, dummy variables,
variable selection and model building, multicollinearity. Nonlinear regression. Generalized
linear models, autocorrelation, measurement errors, calibration problem, bootstrapping.
COURSE DETAIL
Lectures and Topics
1. Introduction: the need for statistical analysis, Straight-line relationship between two variables. SIMPLE LINEAR REGRESSION: fitting a straight line by least squares.
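As an illustration of the opening lecture, a minimal sketch of the closed-form least squares fit of a straight line (the data here are hypothetical, chosen only to make the arithmetic visible):

```python
import numpy as np

# Hypothetical data: y roughly linear in x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form least squares estimates:
#   b1 = Sxy / Sxx,   b0 = ybar - b1 * xbar
xbar, ybar = x.mean(), y.mean()
Sxx = np.sum((x - xbar) ** 2)
Sxy = np.sum((x - xbar) * (y - ybar))
b1 = Sxy / Sxx
b0 = ybar - b1 * xbar

# A useful property of the least squares fit: residuals sum to zero.
residuals = y - (b0 + b1 * x)
```

The zero-sum residual property checked at the end is one of the "useful properties of the least squares fit" covered in lecture 2.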
2. Useful properties of the least squares fit, Statistical properties of least squares estimators, Analysis of Variance (ANOVA).
3. Confidence intervals and tests for β0 and β1, F-test for significance of regression, The correlation between X and Y.
4. Interval estimation of the mean response, Prediction of new observations, Coefficient of determination.
5. MULTIPLE LINEAR REGRESSION: Estimation of model parameters, Properties of least squares estimators.
6. Hypothesis testing in multiple linear regression: Analysis of variance, Test for significance of regression, Tests on individual regression coefficients.
7. Extra sum of squares method and tests for several parameters being zero, The extra sum of squares principle, Two alternative forms of the extra SS, Two predictor variables: example.
8. Multiple regression, special topics: Testing a general linear hypothesis.
9. Confidence intervals in multiple regression: Confidence intervals on the regression coefficients, Confidence interval estimation of the mean response, Prediction of new observations.
10. EVALUATING THE PERFORMANCE OF A REGRESSION MODEL, Residual analysis: Methods for scaling residuals, Standardized residuals, Studentized residuals, PRESS residuals.
11. Residual plots: Normal probability plot, Plot of predicted response (ŷ) against observed response (y), Plot of residuals (e_i) against fitted values (ŷ), Partial residual plots.
12. Serial correlation in residuals, The Durbin-Watson test for a certain type of serial correlation.
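The Durbin-Watson statistic of lecture 12 has a simple computational form, d = Σ(e_t − e_{t−1})² / Σe_t², which can be sketched directly (the residual vector below is a made-up example):

```python
import numpy as np

def durbin_watson(e):
    """Durbin-Watson statistic d = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2.
    d is near 2 when residuals show no first-order serial correlation,
    near 0 for strong positive and near 4 for strong negative correlation."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Alternating residuals: strong negative serial correlation, so d approaches 4.
d = durbin_watson([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
```

Significance is then judged against the tabulated lower and upper bounds d_L and d_U, as covered in the lecture.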
13. Examining runs in the time-sequence plot of residuals: Runs test.
14. More on checking fitted models, The hat matrix H and the various types of residuals, Variance-covariance matrix of e, Extra sum of squares attributable to e_i.
15. DIAGNOSTICS FOR LEVERAGE AND INFLUENCE, Detection of influential observations: Cook's D, DFFITS and DFBETAS.
16. POLYNOMIAL REGRESSION MODELS, Polynomial models in one variable: Example.
Pre-requisites:
1. Probability & Statistics
2. Statistical Inference
Additional Reading:
1. Weisberg, S. (1985), Applied Linear Regression (2nd ed.), New York: Wiley.
2. Chatterjee, S., Hadi, A., and Price, B. (2000), Regression Analysis by Example, New York: Wiley.
Hyperlinks:
Chen, X., Ender, P., Mitchell, M., and Wells, C. (2003), Regression with Stata, from [Link]
Coordinators:
Dr. Soumen Maity
Department of Mathematics, IIT Kharagpur
17. Piecewise polynomial fitting (splines), Example: piecewise linear regression.
18. Orthogonal polynomial regression.
19. Models containing functions of the predictors, including polynomial models, Worked examples of second-order surface fitting for k = 3 and k = 2 predictor variables.
20. TRANSFORMATIONS AND WEIGHTING TO CORRECT MODEL INADEQUACIES: Variance-stabilizing transformations, Transformations to linearize the model.
21. Analytical methods for selecting a transformation, Transformations on y: the Box-Cox method, Transformations on the regressor variables.
22. Generalized least squares and weighted least squares, An example of weighted least squares, A numerical example of weighted least squares.
23. DUMMY VARIABLES: Dummy variables to separate blocks of data with different intercepts, same model.
24. Interaction terms involving dummy variables, Dummy variables for segmented models.
25. SELECTING THE BEST REGRESSION EQUATION: All possible regressions and best subset regression.
26. Forward selection, Stepwise selection, Backward elimination, Significance levels for selection procedures.
27. MULTICOLLINEARITY: Sources of multicollinearity, Effects of multicollinearity.
28. Multicollinearity diagnostics: Examination of the correlation matrix, Variance inflation factors, Eigensystem analysis of X'X.
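The variance inflation factor of lecture 28 is VIF_j = 1/(1 − R_j²), where R_j² comes from regressing the j-th regressor on all the others. A minimal sketch (the data below are simulated, with x2 deliberately made nearly collinear with x1):

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X.
    Columns are centered internally, so no explicit intercept is needed."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        Z = np.delete(X, j, axis=1)
        # R^2 from regressing column j on the remaining columns.
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - (resid @ resid) / (y @ y)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)   # nearly collinear with x1
x3 = rng.normal(size=200)               # unrelated to x1 and x2
V = vif(np.column_stack([x1, x2, x3]))
```

The collinear pair produces large VIFs while the independent regressor stays near 1, which is exactly what the diagnostic is meant to expose.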
29. Methods for dealing with multicollinearity: Collecting additional data, Removing variables from the model, Collapsing variables.
30. Ridge regression: Basic form of ridge regression, In what circumstances is ridge regression absolutely the correct way to proceed?
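The basic form of the ridge estimator in lecture 30 is β̂(k) = (X'X + kI)⁻¹X'y. A short sketch on simulated data, assuming (as is conventional) that the regressors are standardized and the response is centered:

```python
import numpy as np

def ridge(X, y, k):
    """Basic ridge estimator: beta_hat(k) = (X'X + k I)^(-1) X'y.
    Assumes columns of X are standardized and y is centered."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Simulated data with known coefficients [2, -1, 0.5].
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=50)
y = y - y.mean()

b_ols   = ridge(X, y, 0.0)    # k = 0 recovers ordinary least squares
b_ridge = ridge(X, y, 50.0)   # k > 0 shrinks coefficients toward zero
```

Tracing the coefficients as k grows (the ridge trace) is the usual device for choosing the shrinkage parameter.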
31. GENERALIZED LINEAR MODELS (GLM): The exponential family of distributions, Examples.
32. Logistic regression models: Models with a binary response variable, Estimating the parameters in a logistic regression model, Interpretation of the parameters in a logistic regression model, Hypothesis tests on model parameters.
33. Generalized linear models (GLM): Link functions and linear predictors, Parameter estimation and inference in the GLM.
34. AN INTRODUCTION TO NONLINEAR ESTIMATION: Linear regression models, Nonlinear regression models, Least squares for nonlinear models.
35. Estimating the parameters of a nonlinear system, An example.
36. Robust regression: Least absolute deviations regression (L1 regression), M-estimators, Steel employment example.
37. Least median of squares regression, Robust regression with ranked residuals.
38. EFFECT OF MEASUREMENT ERRORS IN REGRESSORS: Simple linear regression, The Berkson model.
39. INVERSE ESTIMATION: The calibration problem.
40. Resampling procedures (BOOTSTRAPPING): Resampling procedures for regression models, Example: straight-line fit.
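The closing lecture's straight-line example can be sketched with the case (pairs) bootstrap: resample the (x_i, y_i) pairs with replacement, refit each time, and read off a standard error for the slope. The data and replication count below are illustrative choices, not from the course:

```python
import numpy as np

def fit_line(x, y):
    """Least squares intercept and slope for a straight-line fit."""
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    return y.mean() - b1 * x.mean(), b1

# Simulated data from y = 1 + 2x + noise.
rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 30)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=30)

# Case bootstrap: resample (x_i, y_i) pairs with replacement, refit each time.
B = 1000
slopes = np.empty(B)
for b in range(B):
    idx = rng.integers(0, len(x), size=len(x))
    _, slopes[b] = fit_line(x[idx], y[idx])

se_slope = slopes.std(ddof=1)   # bootstrap standard error of the slope
```

Resampling residuals rather than cases is the other standard variant covered under resampling procedures for regression models.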
References:
1. Draper, N. R., and Smith, H. (1998), Applied Regression Analysis (3rd ed.), New York:
Wiley.
2. Montgomery, D. C., Peck, E. A., and Vining, G. (2001), Introduction to Linear
Regression Analysis (3rd ed.), Hoboken, NJ: Wiley.
A joint venture by IISc and IITs, funded by MHRD, Govt of India
[Link]