MACHINE LEARNING Presentation Logistic Regression

Logistic regression is a machine learning algorithm used for classification problems. It predicts the probability of an output belonging to a class by fitting data to a logistic curve. The algorithm works by assigning probabilities to outcomes using the sigmoid function and creating a decision boundary to separate classes based on these probabilities.

Uploaded by

Courtney Kudra Dzere


MACHINE LEARNING

LOGISTIC REGRESSION
Brendon Chamunorwa m223138
Emmanuel Chingosho m224134
Rudo Nyamarambwe m222242
Tadiwa Masocha m223530
WHAT IS LOGISTIC REGRESSION
Logistic regression is a supervised machine learning algorithm mainly used for
classification tasks, where the goal is to predict the probability that an instance belongs
to a given class.
Although it is used for classification, it is called regression because it takes the output
of a linear regression function as input and applies a sigmoid function to estimate the
probability of the given class.
The difference between the two is that linear regression outputs a continuous value that
can be anything, while logistic regression outputs the probability that an instance belongs
to a given class.
TERMINOLOGIES INVOLVED IN LOGISTIC REGRESSION:
Independent variables: The input characteristics or predictor factors used to predict the
dependent variable.
Dependent variable: The target variable in a logistic regression model, which we are
trying to predict.
Logistic function: The formula used to represent how the independent and dependent
variables relate to one another. The logistic function transforms a linear combination of
the input variables into a probability value between 0 and 1, which represents the
likelihood of the dependent variable being 1.

Odds: The ratio of the probability of something occurring to the probability of it not
occurring. This is different from probability, which is the ratio of something occurring to
everything that could possibly occur.
Log-odds: The log-odds, also known as the logit, is the natural logarithm of the
odds. In logistic regression, the log-odds of the dependent variable are modeled as a linear
combination of the independent variables and the intercept.
Coefficient: The logistic regression model’s estimated parameters, which show how the
independent and dependent variables relate to one another.
Intercept: A constant term in the logistic regression model, which represents the log-odds
when all independent variables are equal to zero.
Maximum likelihood estimation: The method used to estimate the coefficients of the
logistic regression model by maximizing the likelihood of observing the data given the
model.
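To make the relationship between probability, odds, and log-odds concrete, here is a minimal Python sketch. The function names are illustrative choices, not part of the presentation, and the probability 0.8 is an arbitrary example value.

```python
import math

def odds(p: float) -> float:
    """Odds of an event with probability p (requires 0 <= p < 1)."""
    return p / (1.0 - p)

def log_odds(p: float) -> float:
    """Natural log of the odds, i.e. the logit of p (requires 0 < p < 1)."""
    return math.log(odds(p))

def probability_from_log_odds(z: float) -> float:
    """Invert the logit: recover the probability from log-odds z."""
    return 1.0 / (1.0 + math.exp(-z))

p = 0.8  # arbitrary example probability
print(odds(p))                                  # about 4: four times as likely to occur as not
print(log_odds(p))                              # about 1.386
print(probability_from_log_odds(log_odds(p)))   # about 0.8 again
```

A probability of 0.5 corresponds to odds of 1 and log-odds of 0, which is why a log-odds of zero marks the point where both classes are equally likely.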
CHARACTERISTICS OF LOGISTIC REGRESSION
It is used for predicting a categorical dependent variable from a given set of
independent variables.
Because logistic regression predicts a categorical dependent variable, the outcome
must be a categorical or discrete value.
The outcome can be Yes or No, 0 or 1, True or False, etc., but instead of giving the exact
value as 0 or 1, the model gives probabilistic values that lie between 0 and 1.
Logistic regression is similar to linear regression except in how it is used.
Linear regression is used for solving regression problems, whereas
logistic regression is used for solving classification problems.

In logistic regression, instead of fitting a regression line, we fit an “S”-shaped logistic
function whose output is bounded by two limiting values (0 and 1).
The curve from the logistic function indicates the likelihood of something, such as whether
cells are cancerous or not, or whether a mouse is obese or not based on its weight.
Logistic regression is a significant machine learning algorithm because it can provide
probabilities and classify new data using both continuous and discrete data.
Logistic regression can classify observations using different types of data, and its
coefficients help identify the variables that are most effective for the classification.
LOGISTIC FUNCTION OR SIGMOID FUNCTION
The sigmoid function is a mathematical function used to map the predicted values to
probabilities.
It maps any real value into another value within the range of 0 to 1.
The output of logistic regression must be between 0 and 1 and cannot go beyond this
limit, so it forms a curve like the “S” form.
The S-form curve is called the sigmoid function or the logistic function.
In logistic regression, we use the concept of a threshold value to turn the probability
into a class of either 0 or 1: values above the threshold are assigned to class 1, and
values below the threshold are assigned to class 0.
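The mapping and the threshold rule described above can be sketched as follows. This is only an illustration: the names sigmoid and classify, the default threshold of 0.5, and the sample inputs are assumptions made for the example.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use an equivalent form that avoids overflow in exp().
    ez = math.exp(z)
    return ez / (1.0 + ez)

def classify(z: float, threshold: float = 0.5) -> int:
    """Return class 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if sigmoid(z) >= threshold else 0

# Large negative inputs give probabilities near 0, large positive inputs near 1.
for z in (-6.0, -1.0, 0.0, 1.0, 6.0):
    print(z, round(sigmoid(z), 4), classify(z))
```

In this sketch an input exactly at the threshold is assigned to class 1; that tie-breaking rule is a design choice rather than something the method prescribes.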
TYPES OF LOGISTIC REGRESSION:
1. Binomial: In binomial logistic regression, the dependent variable can have only two
possible types, such as 0 or 1, Pass or Fail, etc.
2. Multinomial: In multinomial logistic regression, there can be 3 or more possible
unordered types of the dependent variable, such as “cat”, “dog”, or “sheep”.
3. Ordinal: In ordinal logistic regression, there can be 3 or more possible ordered
types of the dependent variable, such as “Low”, “Medium”, or “High”.
HOW DOES LOGISTIC REGRESSION WORK?
Machine learning generally involves predicting a quantitative outcome or a
qualitative class. The former is commonly referred to as a regression problem. In the
scenario of linear regression, the input is a continuous variable, and the prediction is
a numerical value. When predicting a qualitative outcome (class), the task is
considered a classification problem. Examples of classification problems include
predicting what products a user will buy or if a target user will click on an online
advertisement.
Not all algorithms fit cleanly into this simple dichotomy, though, and logistic
regression is a notable example.

Logistic regression is part of the regression family as it involves predicting outcomes based
on quantitative relationships between variables. Like linear regression, it accepts both
continuous and discrete variables as input, but its output is qualitative: it predicts a
discrete class such as “Yes/No” or “Customer/Non-customer”.

In practice, the logistic regression algorithm analyzes relationships between variables. It
assigns probabilities to discrete outcomes using the sigmoid function, which converts
numerical results into a probability between 0 and 1. The observed outcome itself is either 0
or 1, depending on whether the event happens or not. For binary predictions, you can divide
the population into two groups with a cut-off of 0.5. Everything above 0.5 is considered to
belong to group A, and everything below is considered to belong to group B.
The cut-off corresponds to a decision boundary, a hyperplane in the space of input variables
that separates the two categories (as far as possible). The class of future data points can
then be predicted from the side of the decision boundary on which they fall.
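The workflow described in this section can be sketched for a single input variable as follows. This is a minimal illustration under stated assumptions: the data (hours studied versus pass or fail) is made up, the names fit and predict are hypothetical, and plain gradient descent on the average log loss is used as one simple way to approximate the maximum likelihood estimate. Library implementations typically use more sophisticated solvers.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit(xs, ys, lr=0.1, epochs=20000):
    """Fit weight w and intercept b by gradient descent on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y   # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Hypothetical data: hours studied and whether the student passed (1) or failed (0).
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit(hours, passed)
boundary = -b / w   # input value at which the predicted probability is exactly 0.5
print("w =", round(w, 3), "b =", round(b, 3), "boundary =", round(boundary, 3))

def predict(x, threshold=0.5):
    """Assign class 1 when the predicted probability reaches the threshold."""
    return 1 if sigmoid(w * x + b) >= threshold else 0

print(predict(2.0), predict(7.0))
```

With one input variable the decision boundary is a single point on the axis; with several input variables the same condition, a linear combination equal to zero, defines a hyperplane.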
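LOGISTIC REGRESSION EQUATION
The odds are the ratio of something occurring to something not occurring. This is
different from probability, which is the ratio of something occurring to
everything that could possibly occur. Writing p(x) for the probability that the dependent
variable is 1 given the input x, the odds are

$$\text{odds} = \frac{p(x)}{1 - p(x)}$$

Applying the natural log to the odds, and modeling the result as a linear combination of
the independent variables x_1, ..., x_n with coefficients w_1, ..., w_n and intercept b,
the log-odds are

$$\log\left(\frac{p(x)}{1 - p(x)}\right) = z = w_1 x_1 + \dots + w_n x_n + b$$

Solving for p(x), the final logistic regression equation is:

$$p(x) = \frac{1}{1 + e^{-z}} = \frac{1}{1 + e^{-(w_1 x_1 + \dots + w_n x_n + b)}}$$
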
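As a worked illustration with made-up numbers, assume a single input variable with coefficient w = 0.8 and intercept b = -4 (both hypothetical), and an input x = 6. Taking the log-odds to be z = wx + b and the probability to be p = 1 / (1 + e^{-z}), the calculation runs:

$$z = wx + b = 0.8 \cdot 6 - 4 = 0.8$$

$$\text{odds} = e^{z} = e^{0.8} \approx 2.23$$

$$p = \frac{\text{odds}}{1 + \text{odds}} = \frac{1}{1 + e^{-0.8}} \approx 0.69$$

So under these assumed values the event is a little more than twice as likely to occur as not, which corresponds to a probability of about 69%.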
ASSUMPTIONS FOR LOGISTIC REGRESSION
Independent observations: Each observation is independent of the others, meaning
one observation does not influence another.
Binary dependent variables: The dependent variable is assumed to be binary or
dichotomous, meaning it can take only two values. For more than two
categories, the softmax function is used.
Linear relationship between independent variables and log-odds: The relationship
between the independent variables and the log-odds of the dependent variable should
be linear.
No outliers: There should be no extreme outliers in the dataset.
Large sample size: The sample size should be sufficiently large.
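For the case of more than two categories mentioned above, the softmax function plays the role that the sigmoid plays in the binary case. The following is a small sketch; the three scores are hypothetical values standing in for the linear outputs of a model with three classes.

```python
import math

def softmax(scores):
    """Turn a list of real-valued class scores into probabilities that sum to 1."""
    m = max(scores)                          # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical linear scores for three classes, e.g. "cat", "dog", "sheep".
probs = softmax([2.0, 1.0, 0.1])
print([round(p, 3) for p in probs])
print(probs.index(max(probs)))   # index of the most probable class
```

With only two classes, softmax reduces to the sigmoid applied to the difference between the two scores.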
STEPS IN APPLYING LOGISTIC REGRESSION MODELING:
1. Define the problem: Identify the dependent variable and independent variables and determine if the
problem is a binary classification problem.
2. Data preparation: Clean and preprocess the data, and make sure the data is suitable for logistic regression
modeling.
3. Exploratory Data Analysis (EDA): Visualize the relationships between the dependent and independent
variables, and identify any outliers or anomalies in the data.
4. Feature Selection: Choose the independent variables that have a significant relationship with the
dependent variable, and remove any redundant or irrelevant features.
5. Model Building: Train the logistic regression model on the selected independent variables and estimate
the coefficients of the model.
6. Model Evaluation: Evaluate the performance of the logistic regression model using appropriate metrics
such as accuracy, precision, recall, F1-score, or AUC-ROC.
7. Model improvement: Based on the results of the evaluation, fine-tune the model by adjusting the
independent variables, adding new features, or using regularization techniques to reduce overfitting.
8. Model Deployment: Deploy the logistic regression model in a real-world scenario and make predictions
on new data.
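The metrics named in the model evaluation step can be computed directly from true and predicted labels. The sketch below is illustrative only: the function name metrics and the two label lists are made up for the example, and a real project would more likely rely on an established library.

```python
def metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # true negatives
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical true labels and model predictions for eight observations.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(metrics(y_true, y_pred))
```

In this example there is one false negative and one false positive, so accuracy, precision, recall, and F1 all come out to 0.75.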
LOGISTIC REGRESSION MODEL THRESHOLDING
Logistic regression becomes a classification technique only when a decision
threshold is brought into the picture. Setting the threshold value is a very
important aspect of logistic regression and depends on the classification
problem itself.

The choice of threshold value is mainly affected by the values of
precision and recall. Ideally, we want both precision and recall to be 1, but this
is seldom the case.

In the case of a precision-recall tradeoff, we use the following arguments to decide
upon the threshold:

1. Low Precision/High Recall: In applications where we want to reduce the number of
false negatives without necessarily reducing the number of false positives, we choose a
decision value that gives a high value of recall, even at the cost of low precision. For
example, in a cancer diagnosis application, we do not want any affected patient to be
classified as not affected, and we give less heed to whether a patient is wrongfully
diagnosed with cancer. This is because the absence of cancer can be confirmed by further
medical tests, but the presence of the disease cannot be detected in an already rejected
candidate.

2. High Precision/Low Recall: In applications where we want to reduce the number of
false positives without necessarily reducing the number of false negatives, we choose a
decision value that gives a high value of precision, even at the cost of low recall. For
example, if we are classifying whether customers will react positively or negatively to a
personalized advertisement, we want to be absolutely sure that the customer will react
positively to the advertisement because otherwise, a negative reaction can cause a loss
of potential sales from the customer.
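The effect of the threshold on precision and recall can be seen by sweeping it over a set of predicted probabilities. The labels and probabilities below are invented for illustration, and precision_recall is a hypothetical helper name.

```python
def precision_recall(y_true, probs, threshold):
    """Precision and recall when probabilities >= threshold are labeled 1."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    # Precision is reported as 0.0 here when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical true labels and predicted probabilities from a fitted model.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.10, 0.30, 0.35, 0.45, 0.55, 0.70, 0.75, 0.90]

for threshold in (0.3, 0.5, 0.8):
    precision, recall = precision_recall(y_true, probs, threshold)
    print(threshold, round(precision, 3), round(recall, 3))
```

On this data a threshold of 0.3 catches every positive case but with lower precision, while a threshold of 0.8 makes no false positive predictions but misses most positive cases, which mirrors the two scenarios listed above.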
CONCLUSION
Logistic regression is a supervised learning method that helps to predict events that have a
binary outcome, such as whether a person will successfully pass a driving test. In order to
make predictions in this scenario, you need data from past test results. The model takes this
data and predicts the likelihood that a given person will pass the test in the future. The
main idea behind logistic regression is to use a model based on the probability of an
outcome occurring.
