Logistic Regression

Logistic regression is a supervised machine learning algorithm used for binary classification, predicting the probability of an instance belonging to a class using a sigmoid function. It can be categorized into binomial, multinomial, and ordinal types, and relies on assumptions such as independent observations and a large sample size. The model transforms linear regression outputs into categorical values, making it suitable for classification problems rather than regression.

Uploaded by

Shehzad Aman

Machine Learning

INSTRUCTOR : MUHAMMAD HASAAN MUJTABA


EMAIL: [email protected]
Logistic Regression

► Logistic regression is a supervised machine learning algorithm used for
classification tasks, where the goal is to predict the probability that an instance
belongs to a given class. It is a statistical method that analyzes the relationship
between a dependent variable and one or more independent variables.
What is Logistic Regression?

► Logistic regression is used for binary classification. It applies the sigmoid
function to a linear combination of the independent variables and produces a
probability value between 0 and 1.
► For example, with two classes, Class 0 and Class 1: if the value of the logistic
function for an input is greater than 0.5 (the threshold value), the input is assigned
to Class 1; otherwise it is assigned to Class 0. It is referred to as regression because
it is an extension of linear regression, but it is mainly used for classification problems.
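The threshold rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify` and the sample probabilities are assumptions made for the example and do not come from the slides.

```python
def classify(probability, threshold=0.5):
    """Return 1 (Class 1) if the probability exceeds the threshold, else 0 (Class 0)."""
    return 1 if probability > threshold else 0


# Hypothetical probabilities, as if produced by the logistic function.
labels = [classify(p) for p in (0.12, 0.5, 0.87)]
print(labels)  # [0, 0, 1]
```

A probability exactly equal to the threshold falls into Class 0 here, following the "greater than 0.5" wording above. The threshold is a parameter so that it can be moved away from 0.5 when needed.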
What is Logistic Regression?

► Key Points:
► Logistic regression predicts the output of a categorical dependent variable.
Therefore, the outcome must be a categorical or discrete value.
► It can be Yes or No, 0 or 1, True or False, etc., but instead of giving the exact
value 0 or 1, it gives probabilistic values that lie between 0 and 1.
► In logistic regression, instead of fitting a regression line, we fit an “S”-shaped
logistic function whose output is bounded by the two limiting values 0 and 1.
Logistic Function – Sigmoid Function

► The sigmoid function is a mathematical function used to map the predicted values
to probabilities.
► It maps any real value to a value between 0 and 1. The output of logistic
regression must stay between 0 and 1 and cannot go beyond these limits, so it
forms an “S”-shaped curve.
► The S-shaped curve is called the sigmoid function or the logistic function.
► In logistic regression, we use a threshold value to decide between the classes 0
and 1: values above the threshold are assigned to 1, and values below the
threshold are assigned to 0.
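For reference, the standard definition of the sigmoid function is written out below. The symbol z for the input is an assumed notation. The limits show why the curve never leaves the range between 0 and 1.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
\sigma(0) = \tfrac{1}{2}, \qquad
\lim_{z \to -\infty} \sigma(z) = 0, \qquad
\lim_{z \to +\infty} \sigma(z) = 1
```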
Types of Logistic Regression

On the basis of the categories of the dependent variable, logistic regression can be classified into three types:
► Binomial: In binomial logistic regression, the dependent variable can take only two
possible values, such as 0 or 1, Pass or Fail, etc.
► Multinomial: In multinomial logistic regression, the dependent variable can take 3 or
more possible unordered values, such as “cat”, “dog”, or “sheep”.
► Ordinal: In ordinal logistic regression, the dependent variable can take 3 or more
possible ordered values, such as “Low”, “Medium”, or “High”.
Assumptions of Logistic Regression
Understanding the assumptions of logistic regression is important to ensure that the model is applied
appropriately. The assumptions include:
► Independent observations: Each observation is independent of the others, meaning the observations are
not correlated with one another (for example, they are not repeated measurements of the same subject).
► Binary dependent variable: The dependent variable is assumed to be binary or dichotomous, meaning it
can take only two values. For more than two categories, the softmax function is used.
► Linear relationship between independent variables and log odds: The relationship between the
independent variables and the log odds of the dependent variable should be linear.
► No outliers: There should be no extreme outliers in the dataset.
► Large sample size: The sample size should be sufficiently large.
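The linearity assumption can be stated as an equation. In the sketch below the notation is assumed, not taken from the slides: p is the probability of Class 1, x_1, ..., x_n are the independent variables, w_1, ..., w_n are their weights and b is the intercept.

```latex
\log \frac{p}{1 - p} = b + w_1 x_1 + w_2 x_2 + \dots + w_n x_n
\quad \Longleftrightarrow \quad
p = \frac{1}{1 + e^{-(b + w_1 x_1 + \dots + w_n x_n)}}
```

The left-hand side is the log odds. It is this quantity, not the probability itself, that is assumed to be linear in the independent variables.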
How does Logistic Regression work?

► The logistic regression model transforms the continuous output of the linear
regression function into a categorical output using a sigmoid function, which maps
any real-valued input (the weighted sum of the independent variables) to a value
between 0 and 1. This function is known as the logistic function.
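A minimal Python sketch of this two-step transformation follows, assuming two independent variables. The weights, bias, and input values are hypothetical numbers chosen only for illustration, and the names `sigmoid` and `predict_proba` are not from the slides.

```python
import math


def sigmoid(z):
    """Logistic function; adequate for moderate z (very negative z would overflow exp)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Step 1: linear regression output z. Step 2: squash z into (0, 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical parameters and input (assumptions, not fitted values).
weights = [0.8, -0.4]
bias = -1.0
p = predict_proba([2.0, 1.0], weights, bias)  # z = -1.0 + 1.6 - 0.4 = 0.2
print(round(p, 4))
```

The probability p can then be turned into a class label by comparing it with the threshold value.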
Sigmoid Function

► Now we apply the sigmoid function to the input z, the linear combination of the
independent variables, to obtain a probability between 0 and 1, i.e. the predicted y.
How to Evaluate a Logistic Regression Model?
Differences Between Linear and Logistic Regression

Linear Regression | Logistic Regression
Predicts a continuous dependent variable from a given set of independent variables. | Predicts a categorical dependent variable from a given set of independent variables.
Used for solving regression problems. | Used for solving classification problems.
Predicts the value of continuous variables. | Predicts the values of categorical variables.
Finds the best-fit line. | Finds the S-curve.
Parameters are estimated by the least squares method. | Parameters are estimated by the maximum likelihood method.
The output must be a continuous value, such as price, age, etc. | The output must be a categorical value, such as 0 or 1, Yes or No, etc.
Requires a linear relationship between the dependent and independent variables. | Does not require a linear relationship between the dependent and independent variables; linearity is assumed only between the independent variables and the log odds.
There may be collinearity between the independent variables. | There should be little to no collinearity between the independent variables.
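The difference between the two estimation methods can be made concrete by writing out the objective each one optimizes. The notation is assumed for this sketch: y_i is the observed value or label, x_i the input, and w and b the parameters.

```latex
% Prediction of the logistic model
\hat{y}_i = \sigma(w^\top x_i + b), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}

% Linear regression: least squares minimizes the sum of squared errors
\min_{w,\,b} \; \sum_{i=1}^{n} \bigl( y_i - (w^\top x_i + b) \bigr)^2

% Logistic regression: maximum likelihood maximizes the log-likelihood
\max_{w,\,b} \; \sum_{i=1}^{n} \bigl[ \, y_i \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \, \bigr]
```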
Example

► 1. How many hours must a student study to pass the exam?
► 2. How many hours did a student study to achieve 95%?
Example

Hours of Study | Pass (1) / Fail (0)
29 | 0
15 | 0
33 | 1
28 | 1
39 | 1
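As an illustration, a logistic regression model can be fitted to the five rows of the table above. The sketch below uses plain gradient descent on the log-likelihood, which is one possible fitting method and not necessarily the one used in the lecture. The learning rate, the number of steps, and the standardization of the hours are assumptions made for the example, and five observations are far too few for a reliable model.

```python
import math

# Data from the table above.
hours = [29, 15, 33, 28, 39]
passed = [0, 0, 1, 1, 1]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Standardize the hours so that gradient descent behaves well (an assumption).
mean = sum(hours) / len(hours)
std = math.sqrt(sum((h - mean) ** 2 for h in hours) / len(hours))
xs = [(h - mean) / std for h in hours]


def fit(xs, ys, lr=0.1, steps=5000):
    """Gradient descent on the average negative log-likelihood."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


w, b = fit(xs, passed)


def predict_proba(h):
    """Estimated probability of passing after studying h hours."""
    return sigmoid(w * (h - mean) / std + b)


for h in hours:
    print(h, round(predict_proba(h), 3))
```

The fitted weight is positive, so the estimated probability of passing rises with the hours studied. Because the students with 28 and 29 hours have opposite outcomes, the classes overlap and the probabilities near that range stay well away from 0 and 1.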
Example

► Sigmoid
