Logistic Regression
Common Applications of Linear Regression:
1. Predicting continuous outcomes: Regression is used to predict continuous outcomes,
such as stock prices, temperatures, or energy consumption.
2. Analysing relationships: Regression is used to analyse the relationships between
variables, such as the relationship between advertising spend and sales.
3. Identifying trends: Regression is used to identify trends and patterns in data, such as
identifying seasonal trends in sales data.
Limitation of Linear Regression
• It assumes a linear relationship between the dependent and independent
variables. This assumption does not always hold.
Example: The relationship between consumer spending and income is not always
linear. Beyond a certain point, an increase in income might not produce a proportional
increase in spending.
Question: What is logistic regression?
Logistic regression is a type of regression analysis used to predict the outcome of a
categorical dependent variable based on one or more predictor variables. The dependent
variable is typically binary, meaning it has only two possible outcomes (e.g., 0/1,
yes/no, pass/fail, success/failure etc.).
The general form of a logistic regression equation is:
$$p = \frac{1}{1 + e^{-(b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_k x_k)}}$$
where $p$ is the predicted probability of the outcome, $x_1, x_2, x_3, \dots, x_k$ are the
independent variables (predictors), $b_0$ is the intercept, and $b_1, b_2, b_3, \dots, b_k$ are
the regression coefficients.
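The general form above can be evaluated directly. The following is a minimal Python sketch, not a fitted model: the function name `predict_probability` and the numeric values are hypothetical and chosen only to illustrate the arithmetic.

```python
import math

def predict_probability(intercept, coefficients, features):
    """Return p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bk*xk)))."""
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have the same length")
    # Linear combination b0 + b1*x1 + ... + bk*xk
    z = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values: b0 = -1, b1 = 0.5, b2 = 0.25, x1 = 2, x2 = 4,
# so the linear combination is -1 + 1 + 1 = 1.
p = predict_probability(-1.0, [0.5, 0.25], [2.0, 4.0])
print(round(p, 4))  # 0.7311
```

Whatever the coefficient values, the result always lies strictly between 0 and 1, which is why it can be read as a probability.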
Example: Credit Risk Assessment
Goal: Predict the likelihood of loan default based on credit score, income, and
employment history.
Input variables: Credit score, income, employment history.
Output variable: Loan default (yes/no).
Logistic regression equation:
$$p = \frac{1}{1 + e^{-(b_0 + b_1(\text{CreditScore}) + b_2(\text{Income}) + b_3(\text{EmploymentHistory}))}}$$
Example: Medical Diagnosis
Goal: Predict the presence or absence of a disease based on symptoms and medical
history.
Input variables: Symptoms (fever, headache, etc.), medical history (family history,
previous illnesses, etc.).
Output variable: Disease presence (yes/no).
Logistic regression equation:
$$p(\text{disease}) = \frac{1}{1 + e^{-(b_0 + b_1(\text{Symptoms}) + b_2(\text{MedicalHistory}))}}$$
Example: Spam Detection
Goal: Classify emails as spam or not spam based on email content and metadata.
Input variables: Email content (keywords, phrases, etc.), metadata (sender, recipient,
etc.).
Output variable: Spam classification (yes/no).
Logistic regression equation:
$$p(\text{spam}) = \frac{1}{1 + e^{-(b_0 + b_1(\text{EmailContent}) + b_2(\text{Metadata}))}}$$
Assumptions of Logistic Regression:
1. Binary outcome: The dependent variable should be binary.
2. Independence: Each observation should be independent of the others.
3. Linearity: The relationship between the input variables and the log-odds of the
outcome should be linear.
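The linearity assumption follows from rearranging the logistic regression equation. A short derivation, where $z$ is shorthand introduced here for the linear combination of predictors:

$$
\begin{aligned}
p &= \frac{1}{1 + e^{-z}}, \qquad z = b_0 + b_1 x_1 + \dots + b_k x_k \\
1 - p &= \frac{e^{-z}}{1 + e^{-z}} \\
\frac{p}{1 - p} &= e^{z} \\
\ln\!\left(\frac{p}{1 - p}\right) &= z = b_0 + b_1 x_1 + \dots + b_k x_k
\end{aligned}
$$

The left-hand side of the last line is the log-odds (logit) of $p$, so the model is linear in the predictors on the log-odds scale, not on the probability scale.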
Common Applications:
1. Credit risk assessment: Predicting the likelihood of loan default.
2. Medical diagnosis: Predicting the presence or absence of a disease.
3. Customer churn prediction: Predicting the likelihood of customer attrition.
4. Spam detection: Classifying emails as spam or not spam.
Limitations of Logistic Regression
It assumes a linear relationship between the independent variables and the log-odds
(logit) of the dependent variable. This can be restrictive if the actual relationship is non-linear.
Example: The effectiveness of a treatment might not linearly correlate with the
patient's age or the dosage.
Question: Explain the difference between linear regression and logistic
regression.
Sigmoid function:
$$\sigma(x) = \frac{1}{1 + e^{-x}} \quad \text{or} \quad \sigma(\operatorname{logit} p) = \frac{1}{1 + e^{-\operatorname{logit} p}}$$
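The sigmoid and the logit undo each other: applying the sigmoid to logit(p) returns p. A minimal Python sketch of that round trip, where the function names `sigmoid` and `logit` are illustrative choices rather than names from these notes:

```python
import math

def sigmoid(x):
    """Map any real number x to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Return the log-odds ln(p / (1 - p)) for a probability 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))

# Round trip: sigmoid(logit(p)) recovers p up to floating-point error.
for p in (0.1, 0.5, 0.9):
    recovered = sigmoid(logit(p))
    print(p, math.isclose(recovered, p))
```

A probability of 0.5 corresponds to a logit of 0, so positive logits mean p is above 0.5 and negative logits mean p is below 0.5.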
Properties of the Sigmoid Function:
1. S-shaped curve: The sigmoid function has an S-shaped curve. The output rises slowly for
large negative inputs, most steeply around x = 0, and then levels off for large positive inputs.
2. Asymptotes: The sigmoid function has horizontal asymptotes at 0 and 1, meaning that the output
approaches 0 as the input approaches negative infinity and 1 as it approaches positive infinity.
3. Continuous and differentiable: The sigmoid function is continuous and differentiable,
making it easy to optimize using gradient-based methods.
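These properties can be checked numerically. The sketch below relies on the standard identity that the derivative of the sigmoid is σ(x)(1 − σ(x)), which is not stated in the notes above, and compares it with a central-difference estimate. The function names and test points are illustrative.

```python
import math

def sigmoid(x):
    # Fine for moderate x; math.exp overflows for very large negative x.
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    """Closed-form derivative: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# Asymptotes: the output gets close to 0 and 1 at the extremes.
low, high = sigmoid(-30.0), sigmoid(30.0)
print(low < 1e-12, high > 1.0 - 1e-12)

# Differentiability: the closed form agrees with a central difference.
x, h = 0.5, 1e-6
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)
print(math.isclose(numeric, sigmoid_derivative(x), rel_tol=1e-6))

# The slope is largest at x = 0, where it equals 0.25.
print(sigmoid_derivative(0.0))
```

Because the derivative can be computed from the output alone, gradient-based methods need no extra exponential evaluations once σ(x) is known.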
Problems:
1. How many years will it take for a bacterial population to grow to 9000 grams, assuming its growth
follows the model $f(t) = \frac{10000}{1 + e^{-0.12(t - 20)}}$?
2. Given that the probability p of a certain event occurring is 0.7, calculate the logit of p.
3. A survey shows that the probability p of purchasing a particular product is 0.8. Compute the logit of
p.
5. A logistic regression model has the following equation for the logit of p: $\operatorname{logit}(p) = -1.5 + 0.8x$.
Find the predicted probability when x = 3.
6. The odds of winning a game are 3:1. What are the probability p and $\operatorname{logit}(p)$?
7. A logistic regression model outputs the logit of p for a customer making a purchase as -0.847. Classify
the customer as "likely to purchase" if $p \ge 0.5$, otherwise as "unlikely to purchase".
8. A logistic regression model has the following equation for the logit of p:
$\operatorname{logit}(p) = 2.5 - 1.2x_1 + 0.4x_2$. If $x_1 = 2$ and $x_2 = 3$, find the predicted probability p.
10. The logit of a probability p is given as -1.5. Find the corresponding probability using the sigmoid
function.
11. In a binary classification model, the sigmoid function output is 0.65. If the classification threshold is
0.5, determine the predicted class.
12. If the output of the sigmoid function is 0.9, determine the corresponding input x that produces
this output. Find the derivative of the sigmoid function when x = 1.
13. In a neural network, the weighted input is $x = -0.7$. Calculate the sigmoid output and its
derivative.
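Hand-computed answers to several of these problems can be checked with a few lines of Python. The sketch below covers problems 1, 5, 6, and 7 only. The helper names are illustrative, and the `>=` comparison in problem 7 is an assumption about the threshold rule (a strict `>` gives the same result for this input).

```python
import math

def sigmoid(x):
    """Map a logit (log-odds) value to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    """Map a probability strictly between 0 and 1 to its log-odds."""
    return math.log(p / (1.0 - p))

# Problem 1: solve 9000 = 10000 / (1 + exp(-0.12 * (t - 20))) for t.
# Dividing by 10000 and applying the logit gives 0.12 * (t - 20) = logit(0.9).
t = 20 + logit(9000 / 10000) / 0.12

# Problem 5: logit(p) = -1.5 + 0.8 * x evaluated at x = 3.
p5 = sigmoid(-1.5 + 0.8 * 3)

# Problem 6: odds of 3:1 mean p / (1 - p) = 3, so p = 3 / (3 + 1).
p6 = 3 / (3 + 1)
logit6 = logit(p6)

# Problem 7: convert the logit to a probability, then apply the threshold.
# Assumption: the rule is "p >= 0.5".
p7 = sigmoid(-0.847)
label7 = "likely to purchase" if p7 >= 0.5 else "unlikely to purchase"

print(f"Problem 1: t = {t:.2f} years")
print(f"Problem 5: p = {p5:.4f}")
print(f"Problem 6: p = {p6:.2f}, logit(p) = {logit6:.4f}")
print(f"Problem 7: p = {p7:.4f} -> {label7}")
```

The remaining problems follow the same two patterns: use `logit` to go from a probability to log-odds, and `sigmoid` to go from log-odds to a probability.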