4_c_logistic regression

The document provides an overview of logistic regression, a statistical method used for modeling binary dependent variables using a logistic function. It contrasts logistic regression with traditional regression analysis, highlighting that logistic regression predicts probabilities rather than values. The document also includes examples and illustrations to explain the concepts of maximum likelihood estimation and log-odds in the context of logistic regression.

Logistic Regression

Introduction

Regression analysis and logistic regression


Concept of logistic regression
Types of logistic regression
Regression Analysis and Logistic Regression

Regression analysis
Based on the principle of least squares estimation (LSE)
The parameters are chosen to minimize the sum of squared errors (SSE), i.e., the error in prediction
If the errors are normally distributed with constant variance, LSE estimates the parameters accurately; the model is the best possible, with the smallest standard errors
Applicable when the dependent variable follows a normal distribution
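The least-squares principle above can be sketched in a few lines. This is a minimal sketch with hypothetical data chosen to lie near the line y = 2x; the closed-form slope and intercept are exactly the values that minimize the SSE.

```python
# Minimal least-squares sketch (hypothetical data near y = 2x).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

# Closed-form LSE: the slope and intercept that minimize the SSE.
b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
      / sum((x - mx) ** 2 for x in xs))
b0 = my - b1 * mx

# Sum of squared errors at the fitted parameters.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
```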

Logistic regression analysis


Used when the dependent variable does not follow a normal distribution
The dependent variable may take two, three, or a few more discrete values
Regression Analysis and Logistic Regression

Logistic regression
Uses maximum likelihood estimation (MLE), which gives better results in this setting
In MLE, the likelihood is the probability of the observed data set given a set of proposed values for the parameters
The principle of MLE is to estimate the parameters by choosing the values that make this likelihood as large as possible
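As a minimal illustration of the MLE principle, consider hypothetical data of 7 successes in 10 Bernoulli trials; a grid search over candidate parameter values picks the one with the largest likelihood, which coincides with the sample proportion.

```python
import math

# Hypothetical data: 7 successes in 10 Bernoulli trials.
k, n = 7, 10

def log_likelihood(p):
    # log of L(p) = p^k * (1 - p)^(n - k), the probability of the data given p
    return k * math.log(p) + (n - k) * math.log(1 - p)

# Choose the candidate parameter value with the largest likelihood.
candidates = [i / 100 for i in range(1, 100)]
p_hat = max(candidates, key=log_likelihood)
# p_hat coincides with the sample proportion k / n = 0.7
```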

Note

Regression analysis predicts the value of a dependent variable
Logistic regression predicts the probability of a given value of a dependent variable
Both estimate their respective model parameters
Regression Analysis and Logistic Regression

[Figure: a regression model and a logistic regression model]

An Example

Hours (xi): 0.50 0.75 1.00 1.25 1.50 1.75 1.75 2.00 2.25 2.50 2.75 3.00 3.25 3.50 4.00 4.25 4.50 4.75 5.00 5.50
Pass (yi):  0    0    0    0    0    0    1    0    1    0    1    0    1    0    1    1    1    1    1    1

[Figure: probability of passing vs. hours, with the two categories (pass = 1, fail = 0) marked]
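A minimal sketch of fitting the logistic model to the hours/pass data above by maximizing the likelihood with plain gradient ascent; real implementations use Newton's method / IRLS, but the fitted parameters come out the same.

```python
import math

# Hours studied (x) and pass/fail (y) from the table above.
hours = [0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50,
         2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50]
passed = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1]

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def fit(x, y, lr=0.02, iters=30000):
    # Gradient ascent on the log-likelihood of the logistic model.
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        # Gradient of the log-likelihood w.r.t. b0 and b1.
        g0 = sum(yi - sigmoid(b0 + b1 * xi) for xi, yi in zip(x, y))
        g1 = sum((yi - sigmoid(b0 + b1 * xi)) * xi for xi, yi in zip(x, y))
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

b0, b1 = fit(hours, passed)
prob_pass = lambda h: sigmoid(b0 + b1 * h)
```

The MLE for this data set is roughly b0 ≈ −4.08, b1 ≈ 1.50, so a student who studied 2 hours has a predicted pass probability well under 0.5, and one who studied 4 hours well over it.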
Concept of Logistic Regression

Introduction

Developed and popularized primarily by Joseph Berkson (1944), who coined the term logit.

Logistic regression is a statistical method that uses a logistic function to model a binary dependent variable, although many more complex extensions exist.

Logistic regression (or logit regression) estimates the parameters of a logistic model.

A binary logistic model has a dependent variable with two possible values, such as Pass or Fail, Happy or Sad, etc.

It is represented by an indicator variable, where the two values are labeled '1' and '0'.
Concept of Logistic Regression

What is logistic regression?

The logistic function takes the following form:

p(x1, ..., xm) = px = e^(β0 + β1x1 + ··· + βmxm) / (1 + e^(β0 + β1x1 + ··· + βmxm))

Writing t for the log-odds,

t = ln( px / (1 − px) ),

the same function can be written as

px = e^t / (1 + e^t)

Binary regression: the case when p has two outcomes, success and failure
The model defines the odds, the ratio of the probability of success to the probability of failure:

odds = px / (1 − px) = e^(β0 + β1x1 + ··· + βmxm)

The logarithm of the odds (called the logit) is

t = ln(odds) = β0 + β1x1 + ··· + βmxm
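The two forms above are inverses of each other, which is easy to check numerically:

```python
import math

def logistic(t):
    # p = e^t / (1 + e^t), written in the equivalent 1 / (1 + e^-t) form
    return 1.0 / (1.0 + math.exp(-t))

def logit(p):
    # t = ln(p / (1 - p)), the log-odds
    return math.log(p / (1.0 - p))

t = 1.25
p = logistic(t)               # a probability in (0, 1)
assert abs(logit(p) - t) < 1e-12

# The odds at this p equal e^t.
odds = p / (1.0 - p)
assert abs(odds - math.exp(t)) < 1e-12
```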
An Illustration

A sample is collected to examine the effect of a toxic substance on tumors. Each subject is examined for the concentration of the toxic substance in the body ("Conc", the independent variable) and for the presence (1) or absence (0) of tumors. The number of subjects at each concentration (N) and the number having tumors ("Tumor") are shown in the table, along with the odds and ln(odds), where t = ln( p / (1 − p) ).

Conc   N    Tumor   Odds     ln(odds)
 0.0   50    2      0.0417   −3.18
 2.1   54    5      0.1020   −2.28
 5.4   46    5      0.1220   −2.10
 8.0   51   10      0.2439   −1.41
15.0   50   40      4.0000   +1.39
19.5   52   42      4.2000   +1.44

Here we find a linear relation between t (ln(odds)) and x (Conc):

t = β0 + β1x

Thus, the logit is a linear function of the concentration.
For these data, the parameters work out to β0 = −3.204 and β1 = 0.2628.
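One simple way to see where such values come from is to fit a straight line to the (Conc, ln(odds)) points by least squares. This is only a sketch: the slide's β0 = −3.204 and β1 = 0.2628 presumably come from fitting the logistic model itself by maximum likelihood, so the least-squares line lands nearby but not exactly on those values.

```python
import math

# Counts from the table above.
conc   = [0.0, 2.1, 5.4, 8.0, 15.0, 19.5]
n_subj = [50, 54, 46, 51, 50, 52]
tumor  = [2, 5, 5, 10, 40, 42]

# Log-odds of tumor at each concentration: ln(tumor / (N - tumor)).
ln_odds = [math.log(t / (n - t)) for n, t in zip(n_subj, tumor)]

# Ordinary least-squares line through (conc, ln_odds).
m = len(conc)
mx, my = sum(conc) / m, sum(ln_odds) / m
b1 = (sum((x - mx) * (y - my) for x, y in zip(conc, ln_odds))
      / sum((x - mx) ** 2 for x in conc))
b0 = my - b1 * mx
# b0, b1 land near the slide's values of -3.204 and 0.2628
```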

For example, a subject exposed to a concentration of 10 has an estimated probability of tumor

t = −3.204 + 0.2628 × 10 = −0.576

p10 = e^t / (1 + e^t) ≈ 0.36
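Plugging the fitted parameters into the logistic function gives a direct check of this arithmetic:

```python
import math

b0, b1 = -3.204, 0.2628          # fitted parameters from the illustration
t = b0 + b1 * 10                 # logit at concentration 10
p10 = math.exp(t) / (1 + math.exp(t))
# p10 comes out close to 0.36
```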
Concept of Logistic Regression

Logit in Logistic Regression


The log-odds (the logarithm of the odds) is a linear combination of one or more independent variables ("predictors").
Each independent variable can be a continuous variable (any real value).
The predicted probability varies between 0 (certainly false) and 1 (certainly true).
