Linear and Non-Linear Models
A model or relationship is termed "linear" if it is linear in its
parameters, and "nonlinear" if it is nonlinear in its parameters.
Linear model: if all partial derivatives of Y with respect to
each of the parameters β1, β2, ..., βk are independent of the
parameters, the model is called a linear model.
Nonlinear model: if any of the partial derivatives of Y with respect
to β1, β2, ..., βk depends on the parameters, the model is
called a nonlinear model.
Gülhayat GÖLBAŞI ŞİMŞEK
Regression Analysis 1 1 / 25
Example:
Y = β1 X1² + β2 √X2 + β3 log(X3) + ε
∂Y/∂β1 = X1²
∂Y/∂β2 = √X2
∂Y/∂β3 = log(X3)
The model is linear in parameters and intrinsically linear in
variables.
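Because none of the partial derivatives involves the parameters, the model can be fitted by ordinary least squares after transforming the regressors. A minimal sketch assuming NumPy; all data and parameter values below are illustrative assumptions, not part of the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
X1 = rng.uniform(1, 5, n)
X2 = rng.uniform(1, 5, n)
X3 = rng.uniform(1, 5, n)
beta = np.array([2.0, -1.0, 0.5])          # illustrative true parameters
eps = rng.normal(0, 0.1, n)

# Y = b1*X1^2 + b2*sqrt(X2) + b3*log(X3) + eps : linear in b1, b2, b3
Y = beta[0] * X1**2 + beta[1] * np.sqrt(X2) + beta[2] * np.log(X3) + eps

# Transform the regressors, then apply ordinary least squares
Z = np.column_stack([X1**2, np.sqrt(X2), np.log(X3)])
beta_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
print(beta_hat)                            # approximately [2.0, -1.0, 0.5]
```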
Example:
Y = β1² X1 + β2 X2 + β3 log(X3) + ε
∂Y/∂β1 = 2β1 X1
The partial derivative depends on β1, so the model is not linear in
parameters.
Examples:
Y = β0 + β1 X + ε → linear model
Y = β0 + β1 X² + ε → linear model, and it is intrinsically linear in
variables.
Y = β0 + X^β1 + ε → nonlinear model
Y = β0 e^(−β1 X) ε → nonlinear model, but it can be linearized
using variable transformations:
ln(Y) = ln(β0) − β1 X + ln(ε)
Y* = ln(Y)
β0* = ln(β0)
ε* = ln(ε)
Y* = β0* − β1 X + ε* → intrinsically linear model (linearizable model)
If a regression model is intrinsically linear in both the parameters
and the regressors, it is said to be intrinsically linear.
Example
Y = β0 X^β1 ε
Note: this intrinsically linear model is also termed a "log-linear"
model.
Using the transformations
Y* = ln(Y)
β0* = ln(β0)
X* = ln(X)
ε* = ln(ε)
we get Y* = β0* + β1 X* + ε*, which is a linear model.
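The log transformation can be carried out numerically. A sketch assuming NumPy, with illustrative values for β0 and β1 and a multiplicative error whose logarithm is normal (all values are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
X = rng.uniform(1, 10, n)
beta0, beta1 = 3.0, 1.5                    # illustrative true values
eps = np.exp(rng.normal(0, 0.05, n))       # multiplicative error, ln(eps) ~ Normal

Y = beta0 * X**beta1 * eps                 # Y = b0 * X^b1 * eps

# ln(Y) = ln(b0) + b1*ln(X) + ln(eps): linear in ln(b0) and b1
A = np.column_stack([np.ones(n), np.log(X)])
coef, *_ = np.linalg.lstsq(A, np.log(Y), rcond=None)
b0_hat = np.exp(coef[0])                   # back-transform the intercept
b1_hat = coef[1]
```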
Steps in Regression Analysis
Regression analysis is a tool for determining the values of the
parameters given the data Y and X1, X2, ..., Xk.
It is assumed that the regression model generates the data, so the
data at hand are consistent with the proposed model.
Steps in the Regression Analysis:
• Statement of the problem under consideration
• Choice of relevant variables
• Collection of data on relevant variables
• Specification of model
In general we prefer a linear model, if possible; nonlinear
models are converted to linear models through transformation
when they are linearizable. Using prior knowledge and scatterplots,
we specify the mathematical (functional) form of the
relationship.
• Choice of an estimation method for fitting the data
(which estimation method?)
• OLS (Ordinary Least Squares)
• ML (Maximum Likelihood)
• WLS (Weighted Least Squares)
• GLS (Generalized Least Squares)
• NLS (Nonlinear Least Squares)
• ...
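To see why the choice of estimation method matters, the sketch below (assuming NumPy; all data and values are illustrative assumptions) fits the same heteroscedastic data by OLS and by WLS, where WLS weights each observation by the reciprocal of its error variance:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
X = rng.uniform(1, 10, n)
# Heteroscedastic errors: the sd grows with X (violates constant variance)
Y = 1.0 + 0.5 * X + rng.normal(0, 0.2 * X)

A = np.column_stack([np.ones(n), X])

# OLS: unweighted least squares
ols, *_ = np.linalg.lstsq(A, Y, rcond=None)

# WLS: weight each observation by 1/Var(eps_i); equivalent to OLS
# after scaling each row by sqrt(weight)
w = 1.0 / (0.2 * X) ** 2
sw = np.sqrt(w)
wls, *_ = np.linalg.lstsq(A * sw[:, None], Y * sw, rcond=None)
```

Both estimators are unbiased here, but WLS has smaller variance when the error variances are unequal.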
• Fitting of the model
Yi = β0 + β1 Xi + εi (population regression model)
Using the sample data, the fitted sample regression model is
Ŷ = β̂0 + β̂1 X
Ŷ: fitted value
β̂0, β̂1: estimated parameters (parameter estimates)
• Model validation
(We should check the model assumptions.)
• Use of the fitted model for prediction and reasoning. No
cause-and-effect pattern is necessarily implied by the regression
model.
X --(β1)--> Y <-- ε
Y = β0 + β1 X + ε
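For the simple model, the estimates β̂0 and β̂1 have closed-form least-squares solutions. A minimal sketch assuming NumPy, with illustrative data (the parameter values are assumptions for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X = rng.uniform(0, 10, n)
Y = 4.0 + 1.2 * X + rng.normal(0, 0.5, n)  # illustrative b0 = 4, b1 = 1.2

# Closed-form OLS for simple regression:
# b1_hat = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
# b0_hat = ybar - b1_hat * xbar
b1_hat = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0_hat = Y.mean() - b1_hat * X.mean()
Y_fit = b0_hat + b1_hat * X                # fitted values
```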
Classification of the Regression Models
1-) In terms of number of independent variables
• Simple Regression
-1 independent variable
Y = β0 + β1 X + ε
• Multiple Regression
-More than 1 independent variable
Y = β0 + β1 X1 + ... + βk Xk + ε
2-) In terms of number of dependent variables
• UNIVARIATE
1 dependent variable
Y = β0 + β1 X1 + β2 X2 + ε
• MULTIVARIATE
More than 1 dependent variable
Y1 = β0 + β1 X1 + ε
Y2 = γ0 + γ1 X1 + δ
3-) In terms of the measurement type of the independent variables
One quantitative dependent variable and categorical
independent variable(s)
ANOVA (Analysis of Variance)
·All regressors are categorical (qualitative)
ANCOVA (Analysis of Covariance)
·At least one regressor is quantitative and others are qualitative.
More than one quantitative dependent variable and categorical
independent variable(s)
MANOVA (Multivariate Analysis of Variance)
·ANOVA model with more than 1 dependent variable
MANCOVA (Multivariate Analysis of Covariance)
·ANCOVA model with more than 1 dependent variable
4-) In terms of measurement type of dependent variable
• a) Binary Y variable (two discrete outcomes)
-LOGISTIC REGRESSION (Logit model)
-TOBIT REGRESSION (Tobit model)
-PROBIT REGRESSION (Probit model)
• b) Ordinal Y variable
-ORDINAL LOGISTIC REGRESSION (Ordinal Logit model)
• c) Polytomous Y variable
-MULTINOMIAL LOGISTIC REGRESSION (multinomial logit
model)
• d) Poisson distributed Y variable
-POISSON REGRESSION
• e) Negative binomial distributed Y variable
-NEGATIVE BINOMIAL REGRESSION
• f).......
Cases a)–c) are related to the measurement type of Y; cases
d)–f) are related to the distribution of Y.
5-) In terms of linearity of the model
• -Linear Models
• -Linearizable Models (linear after transformation)
• -Non-Linear Models
Population Regression Model
Yi | Xi = β0 + β1 Xi + εi,   i = 1, 2, ..., n
Y: dependent variable
X: independent variable
β0: intercept
β1: slope coefficient
ε: error term
E(Yi | Xi) = E(β0 + β1 Xi + εi) = β0 + β1 Xi + E(εi)
Assuming E(εi) = 0 and that X is not a random variable (its values
are fixed, determined in advance):
E(Yi | Xi) = β0 + β1 Xi
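The role of the assumption E(εi) = 0 can be illustrated by simulation: averaging many realizations of Y at a fixed X recovers the conditional mean β0 + β1 X. A sketch with assumed illustrative parameter values:

```python
import numpy as np

rng = np.random.default_rng(3)
beta0, beta1 = 2.0, 0.8                    # assumed population parameters
x = 5.0                                    # fixed (nonstochastic) X value

# Many draws of Y at the same X; averaging out eps recovers E(Y|X)
eps = rng.normal(0, 1.0, 100_000)          # E(eps) = 0
Y = beta0 + beta1 * x + eps
mean_Y = Y.mean()                          # close to beta0 + beta1*x = 6.0
```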
The regression model is the conditional mean of Yi given Xi:
E(Yi | Xi) = β0 + β1 Xi
E(Y | X1) = β0 + β1 X1 = µ1
E(Y | X2) = β0 + β1 X2 = µ2
...
E(Y | Xn) = β0 + β1 Xn = µn
E(Y | X = 0) = β0
β0 is the "intercept", which is interpretable only if there are data
near X = 0.
β1 = tan α = dY/dX is the "slope" coefficient. Under the linearity
hypothesis, each of the n means E(Yi | Xi) lies on the population
regression line.
Example:
Suppose the population regression model for the lot size problem is
Yi = 9.5 + 2.1 Xi + εi. Then E(Y) = 9.5 + 2.1 X, which is the
population regression line shown in the figure. Suppose that in the
ith trial a lot of Xi = 45 units is produced and the actual number of
man-hours is Yi = 108. In that case the error term value is εi = +4,
for we have E(Yi) = 9.5 + 2.1(45) = 104 and Yi = 104 + 4 = 108. The
figure displays the probability distribution of Y when X = 45 and
indicates from where in this distribution the observation Yi = 108
came. Note again that the error term εi is simply the deviation of
Yi from its mean value E(Yi).
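The arithmetic in the example can be checked directly; all numbers below are taken from the example itself:

```python
beta0, beta1 = 9.5, 2.1                    # population parameters from the example
X_i, Y_i = 45, 108                         # ith trial: lot size and observed man-hours

mean_Y = beta0 + beta1 * X_i               # E(Y_i) = 9.5 + 2.1 * 45 = 104
eps_i = Y_i - mean_Y                       # error term: deviation from the mean, +4
```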
Assumptions of Regression Model
Specific assumptions concerning the probability distribution of ε :
Yi = β0 + β1 Xi + εi i=1,2,...,n
• A1) εi is normally distributed for all i
(Because one may expect small values of εi to occur more
frequently than large ones)
• A2) εi has zero mean or E(εi ) = 0 for all i
with A1 and A2, εi ∼ N(0, σ 2 )
• A3) The errors εi are homoscedastic:
Var(εi) = E[εi − E(εi)]² = E(εi²) = σε² = Var(ε), constant for
all i. Each distribution f(εi) has the same constant variance
σε², whose value is in general unknown.
Homoscedasticity = constant error variance for all i.
• A4) Non-autocorrelation (no serial correlation) assumption:
Cov(εi, εj) = E[(εi − E(εi))(εj − E(εj))] = E(εi εj) = 0, i ≠ j.
Successive εi values are uncorrelated and the error terms are
independent, so there is no covariance (or correlation) between
εi and εj, i ≠ j.
• A5) X and ε are independent, so Cov(Xi, εj) = 0 for all i and j;
in particular, X and ε are uncorrelated.
• A6) X is not a random variable; it is nonrandom (nonstochastic),
with finite variance. (The Xi values are fixed, i.e. controllable,
in repeated sampling from the same population.)
If A2–A6 hold, the resulting model is the "weak" classical
linear regression model.
If A1 is also included, the "strong" case emerges. Assumption
A1 allows us to conduct hypothesis tests and to construct
confidence intervals for β0, β1, ..., βk, σ², E(Y | X) = β0 + β1 X,
Y0 | X0, ....
E(Yi | Xi) = µi = β0 + β1 Xi
Var(Yi | Xi) = Var(β0 + β1 Xi + εi) = Var(εi) = Var(ε) = σε² = σ²
From A1–A6, concerning the probability distribution of Y:
• E(Yi) = E(β0 + β1 Xi + εi) = E(Yi | Xi) + E(εi) = E(Yi | Xi) =
β0 + β1 Xi
• Var(Yi) = E[Yi − E(Yi)]² = E(β0 + β1 Xi + εi − β0 − β1 Xi)² =
E(εi²) = σ², constant for all i, whereas Cov(Yi, Yj) = 0, i ≠ j.
• The random variable Yi is normally distributed because εi is
normally distributed, i = 1, ..., n:
E(Yi) = β0 + β1 Xi, Var(Yi) = σ², and εi ∼ N(0, σ²), so
Yi ∼ N(β0 + β1 Xi, σ²).
• The random variables Yi are independent because the εi are
independent:
Cov(εi, εj) = 0 implies Cov(Yi, Yj) = 0, i ≠ j.
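These distributional properties can be illustrated by simulation, with assumed illustrative values for β0, β1, and σ:

```python
import numpy as np

rng = np.random.default_rng(4)
beta0, beta1, sigma = 1.0, 2.0, 0.5        # assumed illustrative values
n = 200_000

# Independent normal errors at two fixed X values
e1 = rng.normal(0, sigma, n)
e2 = rng.normal(0, sigma, n)
Y1 = beta0 + beta1 * 3.0 + e1              # E(Y1) = 7.0, Var(Y1) = sigma^2
Y2 = beta0 + beta1 * 5.0 + e2              # E(Y2) = 11.0

var_Y1 = Y1.var()                          # close to sigma^2 = 0.25
cov_Y = np.cov(Y1, Y2)[0, 1]               # close to 0: Y1, Y2 uncorrelated
```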
Self-study
Read pp. 1–51 of Neter, Wasserman, and Kutner, Applied Linear
Regression Models, Irwin, 1983.