Logistic Regression
► Key Points:
► Logistic regression predicts a categorical dependent variable; therefore, the
outcome must be a categorical or discrete value.
► The outcome can be Yes or No, 0 or 1, True or False, etc., but instead of giving the
exact values 0 and 1, it gives probabilistic values that lie between 0 and 1.
► In Logistic regression, instead of fitting a regression line, we fit an “S” shaped
logistic function, which predicts two maximum values (0 or 1).
Logistic Function – Sigmoid Function
► The sigmoid function is a mathematical function used to map the predicted values
to probabilities.
► It maps any real value to a value between 0 and 1. Since the output of
logistic regression must lie between 0 and 1 and cannot go beyond these
limits, the function forms an “S”-shaped curve.
► The S-form curve is called the Sigmoid function or the logistic function.
► In logistic regression, we use a threshold value to decide between the classes
0 and 1: values above the threshold tend to 1, and values below the threshold
tend to 0.
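As a sketch, the mapping from a raw score to a probability and then to a class label can be illustrated in Python (the 0.5 threshold is a common default, not the only possible choice):

```python
import math

def sigmoid(z):
    """Map any real value z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Probabilities and thresholded labels for a few raw scores
for z in (-4.0, 0.0, 4.0):
    p = sigmoid(z)
    label = 1 if p >= 0.5 else 0   # threshold at 0.5
    print(z, round(p, 3), label)
```

Large negative scores map to probabilities near 0, large positive scores to probabilities near 1, which is the “S”-shaped behavior described above.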
Types of Logistic Regression
On the basis of the categories, Logistic Regression can be classified into three types:
► Binomial: In binomial logistic regression, there can be only two possible values of
the dependent variable, such as 0 or 1, Pass or Fail, etc.
► Multinomial: In multinomial logistic regression, there can be 3 or more possible
unordered values of the dependent variable, such as “cat”, “dog”, or “sheep”.
► Ordinal: In ordinal logistic regression, there can be 3 or more possible ordered
values of the dependent variable, such as “Low”, “Medium”, or “High”.
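For the multinomial case, the sigmoid generalizes to the softmax function, which turns one raw score per class into a probability distribution. A minimal sketch with hypothetical scores for the classes “cat”, “dog”, and “sheep”:

```python
import math

def softmax(scores):
    """Turn a list of raw class scores into probabilities summing to 1."""
    exps = [math.exp(s - max(scores)) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for "cat", "dog", "sheep"
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])  # the largest score gets the largest probability
```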
Assumptions of Logistic Regression
We will explore the assumptions of logistic regression, as understanding them is important to
ensure that the model is applied appropriately. The assumptions include:
► Independent observations: Each observation is independent of the others, meaning there is no
correlation between observations.
► Binary dependent variable: The dependent variable is assumed to be binary or dichotomous,
meaning it can take only two values. For more than two categories, the softmax function is
used.
► Linear relationship between independent variables and log odds: The relationship between the
independent variables and the log odds of the dependent variable should be linear.
► No outliers: There should be no outliers in the dataset.
► Large sample size: The sample size should be sufficiently large.
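The linearity-in-log-odds assumption can be made concrete: with hypothetical coefficients b0 and b1, the log odds log(p / (1 − p)) of the predicted probability is exactly the linear term b0 + b1·x, as this sketch shows:

```python
import math

def log_odds(p):
    """Logit function: the inverse of the sigmoid."""
    return math.log(p / (1.0 - p))

b0, b1 = -1.0, 2.0               # hypothetical coefficients
for x in (0.0, 1.0, 2.0):
    z = b0 + b1 * x              # linear in x
    p = 1.0 / (1.0 + math.exp(-z))
    # recovering the log odds from p gives back the linear term z
    print(x, round(p, 3), round(log_odds(p), 3))
```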
How does Logistic Regression work?
► The logistic regression model transforms the continuous output of the linear
regression function into a categorical output using a sigmoid function, which maps
any real-valued combination of the independent variables to a value between 0 and 1.
This function is known as the logistic function.
Sigmoid Function
► We now apply the sigmoid function to the input z to obtain a probability
between 0 and 1, i.e. the predicted y.
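Putting the pieces together, here is a minimal training sketch on toy one-dimensional data, using stochastic gradient descent on the log loss (the data, learning rate, and epoch count are all illustrative choices, not prescriptions):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy 1-D data: small x -> class 0, large x -> class 1
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 1, 1]

w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):                 # stochastic gradient descent on log loss
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)        # predicted probability for this sample
        grad = p - y                  # derivative of log loss w.r.t. z
        w -= lr * grad * x
        b -= lr * grad

print(round(sigmoid(w * 0.5 + b), 3))  # probability near 0 for a class-0 input
print(round(sigmoid(w * 4.0 + b), 3))  # probability near 1 for a class-1 input
```

The weights are updated so that the sigmoid of the linear combination w·x + b matches the observed labels, which is what maximum likelihood estimation does for this model.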
How to Evaluate a Logistic Regression Model?
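As a sketch, common evaluation metrics for a binary classifier can be computed from the confusion-matrix counts; the labels and predictions below are hypothetical:

```python
# Hypothetical true labels and model predictions (after thresholding at 0.5)
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Confusion-matrix counts
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy  = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)
f1        = 2 * precision * recall / (precision + recall)
print(accuracy, precision, recall, f1)  # each is 0.75 for this toy example
```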
Differences Between Linear and Logistic Regression
► Linear regression is used to predict a continuous dependent variable from a given set of
independent variables; logistic regression is used to predict a categorical dependent variable.
► Linear regression is used for solving regression problems; logistic regression is used for
solving classification problems.
► In linear regression we predict the values of continuous variables; in logistic regression we
predict the values of categorical variables.
► Linear regression uses the least squares estimation method; logistic regression uses the
maximum likelihood estimation method.
► The output of linear regression must be a continuous value, such as price or age; the output
of logistic regression must be a categorical value, such as 0 or 1, Yes or No.
► Linear regression requires a linear relationship between the dependent and independent
variables; logistic regression does not.
► In linear regression there may be collinearity between the independent variables; in logistic
regression there should be little to no collinearity between the independent variables.
Example
► Plot of the sigmoid (“S”-shaped) curve.