UNIT 2 - Linear & Logistic Regression

Simple Linear Regression is a statistical method that models the relationship between a dependent variable and a single independent variable, with the aims of explaining that relationship and forecasting new observations. It assumes a linear relationship, little or no multicollinearity, homoscedasticity, normally distributed error terms, and no autocorrelation. Logistic Regression, on the other hand, is used for classification tasks to predict probabilities of categorical outcomes, fitting an 'S'-shaped logistic function instead of a straight regression line.

Simple Linear Regression in Machine Learning

Simple Linear Regression is a type of regression algorithm that models the relationship between a dependent variable and a single independent variable. The relationship shown by a Simple Linear Regression model is linear (a sloped straight line), hence the name Simple Linear Regression. The key point in Simple Linear Regression is that the dependent variable must be a continuous/real value. The independent variable, however, can be continuous or categorical.

The Simple Linear Regression algorithm has two main objectives:

➢ Modelling the relationship between the two variables, such as the relationship between income and expenditure, or between experience and salary.
➢ Forecasting new observations, such as forecasting the weather from the temperature, or a company's revenue from its investments.

Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε

Here,

y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error

The values of the x and y variables form the training dataset used to fit the Linear Regression model.
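As a minimal sketch (assuming NumPy; the sample data reuses the weekly-sales figures worked through later in this unit), the coefficients a0 and a1 can be estimated with the closed-form least-squares formulas:

import numpy as np

# Sample data (same x and y as the weekly-sales worked example below)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.8, 2.6, 3.2, 3.8])

# Least-squares estimates:
# a1 = sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
# a0 = mean(y) - a1 * mean(x)
a1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
a0 = y.mean() - a1 * x.mean()

print(f"y = {a0:.2f} + {a1:.2f}x")  # prints: y = 0.54 + 0.66x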
Linear Regression Line
A straight line showing the relationship between the dependent and independent variables is called a regression line. A regression line can show two types of relationship:

• Positive Linear Relationship: If the dependent variable increases on the Y-axis as the independent variable increases on the X-axis, the relationship is termed a positive linear relationship.

• Negative Linear Relationship: If the dependent variable decreases on the Y-axis as the independent variable increases on the X-axis, the relationship is called a negative linear relationship.
Assumptions of Linear Regression
Below are some important assumptions of Linear Regression. These are formal checks to perform while building a Linear Regression model, and they help ensure the best possible result from the given dataset.

o Linear relationship between the features and target:
Linear regression assumes a linear relationship between the dependent and independent variables.

o Little or no multicollinearity between the features:
Multicollinearity means high correlation between the independent variables. Due to multicollinearity, it may be difficult to find the true relationship between the predictors and the target variable; in other words, it is difficult to determine which predictor variables are affecting the target variable and which are not. So, the model assumes either little or no multicollinearity between the features or independent variables.

o Homoscedasticity assumption:
Homoscedasticity means that the variance of the error term is the same across all values of the independent variables. With homoscedasticity, there should be no clear pattern in the distribution of points in a residual scatter plot.

o Normal distribution of error terms:
Linear regression assumes that the error terms follow a normal distribution. If the error terms are not normally distributed, confidence intervals become either too wide or too narrow, which may cause difficulties in estimating the coefficients reliably. This can be checked using a q-q plot: if the plot shows a straight line without large deviations, the errors are approximately normally distributed (a sketch of such checks follows this list).

o No autocorrelation:
The linear regression model assumes no autocorrelation in the error terms. Any correlation in the error terms will drastically reduce the accuracy of the model. Autocorrelation usually occurs when there is a dependency between residual errors.
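As a minimal illustration of two of these checks (a sketch with synthetic data, assuming NumPy, SciPy, and Matplotlib; the feature matrix and residuals here are made-up stand-ins for a real model's inputs and errors):

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))        # two hypothetical features
residuals = rng.normal(size=100)     # stand-in for a fitted model's residuals

# Multicollinearity check: correlation between the two features (should be low)
print(np.corrcoef(X[:, 0], X[:, 1])[0, 1])

# Normality check: q-q plot of residuals against the normal distribution
stats.probplot(residuals, dist="norm", plot=plt)
plt.show()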
Finding the best fit line:
When working with linear regression, our main goal is to find the best fit line, meaning that the error between the predicted values and the actual values should be minimized. The best fit line will have the least error.
Different values for the weights or coefficients of the line (a0, a1) give different regression lines, so we need to calculate the best values for a0 and a1 to find the best fit line. To calculate these, we use a cost function.

Cost function
o Different values for the weights or coefficients of the line (a0, a1) give different regression lines, and the cost function is used to estimate the values of the coefficients for the best fit line.
o The cost function optimizes the regression coefficients or weights and measures how well a linear regression model is performing.
o We can use the cost function to measure the accuracy of the mapping function, which maps the input variable to the output variable. This mapping function is also known as the hypothesis function. A small sketch of one such cost function follows this list.
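The notes do not name a specific cost function, so the following is a minimal sketch assuming the commonly used Mean Squared Error (MSE) and NumPy:

import numpy as np

def mse_cost(a0: float, a1: float, x: np.ndarray, y: np.ndarray) -> float:
    """Mean Squared Error of the hypothesis y_hat = a0 + a1 * x."""
    y_hat = a0 + a1 * x          # hypothesis function
    return float(np.mean((y - y_hat) ** 2))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.8, 2.6, 3.2, 3.8])
print(mse_cost(0.54, 0.66, x, y))   # cost at the least-squares coefficients
print(mse_cost(0.0, 1.0, x, y))     # a worse line gives a higher cost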
Worked Example
Given data (weeks, sales in thousands):
x: 1, 2, 3, 4, 5
y: 1.2, 1.8, 2.6, 3.2, 3.8

1. Compute means
x̄ = (1 + 2 + 3 + 4 + 5) / 5 = 15 / 5 = 3
ȳ = (1.2 + 1.8 + 2.6 + 3.2 + 3.8) / 5 = 12.6 / 5 = 2.52

2. Compute slope b and intercept a
Use b = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)², a = ȳ − b·x̄.

xi    yi    xi − x̄   yi − ȳ   (xi − x̄)(yi − ȳ)   (xi − x̄)²
1     1.2   −2       −1.32    2.64               4
2     1.8   −1       −0.72    0.72               1
3     2.6    0        0.08    0.00               0
4     3.2    1        0.68    0.68               1
5     3.8    2        1.28    2.56               4
Sum                           6.60               10

So b = 6.60 / 10 = 0.66
and a = ȳ − b·x̄ = 2.52 − 0.66 × 3 = 2.52 − 1.98 = 0.54

3. Regression line
ŷ = a + b·x = 0.54 + 0.66x

4. Predictions
• For week x = 7: ŷ = 0.54 + 0.66 × 7 = 0.54 + 4.62 = 5.16 (thousands) → 5,160 units
• For week x = 12: ŷ = 0.54 + 0.66 × 12 = 0.54 + 7.92 = 8.46 (thousands) → 8,460 units

Answer: predicted sales are 5.16k for week 7 and 8.46k for week 12.
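As a quick cross-check of the hand computation (a sketch assuming NumPy; np.polyfit performs a least-squares fit of a degree-1 polynomial and returns the slope first, then the intercept):

import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([1.2, 1.8, 2.6, 3.2, 3.8])

b, a = np.polyfit(x, y, 1)   # slope ≈ 0.66, intercept ≈ 0.54
print(a, b)                  # matches the hand computation
print(a + b * 7)             # ≈ 5.16
print(a + b * 12)            # ≈ 8.46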


Linear Regression Problem
QUESTION 3: Given the data points:
x: 1, 2, 3, 4, 5
y: 1.2, 1.9, 3.2, 3.8, 5.1
1. Fit a simple linear regression model y = a + bx. Find the slope b and intercept a.
2. Compute the predicted ŷ for each x and the residuals.
3. Compute R² (the coefficient of determination).
4. Predict y when x = 6.

1) Useful sums and means

Number of observations: n = 5.
Σx = 1 + 2 + 3 + 4 + 5 = 15
Σy = 1.2 + 1.9 + 3.2 + 3.8 + 5.1 = 15.2
x̄ = Σx / n = 15 / 5 = 3.0
ȳ = Σy / n = 15.2 / 5 = 3.04
Σx² = 1² + 2² + 3² + 4² + 5² = 55
Σxy = 1·1.2 + 2·1.9 + 3·3.2 + 4·3.8 + 5·5.1 = 55.3

Compute the centered sums:
SSxx = Σ(x − x̄)² = 10.0
(Check: SSxx = Σx² − n·x̄² = 55 − 5·3² = 55 − 45 = 10)
SSxy = Σ(x − x̄)(y − ȳ) = 9.7
(Check: SSxy = Σxy − n·x̄·ȳ = 55.3 − 5·3·3.04 = 55.3 − 45.6 = 9.7)

2) Regression coefficients

Slope: b = SSxy / SSxx = 9.7 / 10.0 = 0.97
Intercept: a = ȳ − b·x̄ = 3.04 − 0.97·3.0 = 3.04 − 2.91 = 0.13

So the fitted line is: ŷ = 0.13 + 0.97x
3) Predicted values and residuals
Compute ŷi = a + b·xi:
• For x = 1: ŷ = 0.13 + 0.97·1 = 1.10. Residual e = y − ŷ = 1.20 − 1.10 = 0.10.
• For x = 2: ŷ = 0.13 + 0.97·2 = 2.07. Residual = 1.90 − 2.07 = −0.17.
• For x = 3: ŷ = 0.13 + 0.97·3 = 3.04. Residual = 3.20 − 3.04 = 0.16.
• For x = 4: ŷ = 0.13 + 0.97·4 = 4.01. Residual = 3.80 − 4.01 = −0.21.
• For x = 5: ŷ = 0.13 + 0.97·5 = 4.98. Residual = 5.10 − 4.98 = 0.12.
(Residuals sum to ≈ 0, as expected.)

4) Coefficient of determination R²
SSE = Σe² = 0.10² + (−0.17)² + 0.16² + (−0.21)² + 0.12² ≈ 0.123
SST = Σ(y − ȳ)² ≈ 9.532
R² = 1 − SSE/SST = 1 − 0.123/9.532 ≈ 0.987
So about 98.7% of the variation in y is explained by the fitted line.

5) Predict at x = 6
ŷ6 = 0.13 + 0.97·6 = 0.13 + 5.82 = 5.95
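The same results can be reproduced in code (a sketch assuming NumPy; np.polyfit returns the slope and intercept of the least-squares line):

import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

b, a = np.polyfit(x, y, 1)               # slope ≈ 0.97, intercept ≈ 0.13
y_hat = a + b * x
residuals = y - y_hat

ss_res = np.sum(residuals ** 2)          # SSE ≈ 0.123
ss_tot = np.sum((y - y.mean()) ** 2)     # SST ≈ 9.532
r2 = 1 - ss_res / ss_tot                 # R² ≈ 0.987

print(f"b={b:.2f}, a={a:.2f}, R^2={r2:.3f}")
print(a + b * 6)                         # prediction at x = 6 ≈ 5.95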


LOGISTIC REGRESSION
• Logistic regression is a type of machine learning model used for classification tasks, where the goal is to predict whether something belongs to one of two categories, like yes/no, true/false, or spam/not spam.
• Instead of predicting a continuous number as in linear regression, logistic regression predicts the probability that an input belongs to a particular class (like the probability that an email is spam). The result is a value between 0 and 1.
What is Logistic Regression?
Logistic regression is a statistical method used for building machine learning models where the dependent variable is dichotomous, i.e. binary. Logistic regression is used to describe data and the relationship between one dependent variable and one or more independent variables. The independent variables can be nominal, ordinal, or of interval type.
The name "logistic regression" is derived from the logistic function that it uses. The logistic function is also known as the sigmoid function, and its value lies between zero and one.
As an example, a logistic function can model the probability of a vehicle breaking down, depending on how many years it has been since it was last serviced: the longer since the last service, the higher the predicted probability of a breakdown.
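A minimal sketch of the logistic (sigmoid) function, applied to the vehicle example (assuming NumPy; the coefficients w0 and w1 are made up for illustration):

import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Made-up coefficients for the vehicle example: z = w0 + w1 * years_since_service
w0, w1 = -4.0, 1.0
years = np.array([0, 2, 4, 6, 8])
print(sigmoid(w0 + w1 * years))  # breakdown probability rises with years since service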
Logistic Regression in Machine Learning
1. Logistic regression is one of the most popular Machine Learning algorithms, which comes under the Supervised Learning technique. It is used for predicting a categorical dependent variable from a given set of independent variables.
2. Logistic regression predicts the output of a categorical dependent variable. Therefore, the outcome must be a categorical or discrete value. It can be Yes or No, 0 or 1, True or False, etc., but instead of giving the exact values 0 and 1, it gives probabilistic values which lie between 0 and 1.
3. Logistic Regression is very similar to Linear Regression except in how they are used. Linear Regression is used for solving regression problems, whereas Logistic Regression is used for solving classification problems.
4. In Logistic regression, instead of fitting a regression line, we fit an "S"-shaped logistic function (the sigmoid function), which is bounded by the two limiting values 0 and 1.
5. The curve from the logistic function indicates the likelihood of something, such as whether cells are cancerous or not, or whether a mouse is obese or not based on its weight.
6. Logistic Regression is a significant machine learning algorithm because it can provide probabilities and classify new data using both continuous and discrete datasets.
7. Logistic Regression can be used to classify observations using different types of data and can easily determine the most effective variables for the classification.
Types of Logistic Regression
On the basis of the categories, Logistic Regression can be classified into three types:
o Binomial: In binomial Logistic Regression, there can be only two possible types of the dependent variable, such as 0 or 1, Pass or Fail, etc.
o Multinomial: In multinomial Logistic Regression, there can be 3 or more possible unordered types of the dependent variable, such as "cat", "dog", or "sheep" (a short code sketch follows this list).
o Ordinal: In ordinal Logistic Regression, there can be 3 or more possible ordered types of the dependent variable, such as "low", "medium", or "high".
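As a minimal sketch of the multinomial case (assuming scikit-learn and NumPy; the features and labels are made up for illustration), scikit-learn's LogisticRegression handles a 3-class target directly:

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical 3-class data: labels 0="cat", 1="dog", 2="sheep"; features made up
X = np.array([[1.0, 0.2], [1.2, 0.1], [0.2, 1.1], [0.3, 0.9], [2.0, 2.1], [2.2, 1.9]])
y = np.array([0, 0, 1, 1, 2, 2])

clf = LogisticRegression()   # scikit-learn fits a multinomial model for 3+ classes
clf.fit(X, y)

print(clf.predict([[1.1, 0.15]]))        # predicted class label
print(clf.predict_proba([[1.1, 0.15]]))  # one probability per class, summing to 1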

Assumptions of the Logistic Regression Algorithm

• In a binary logistic regression, the dependent variable must be binary.
• For a binary regression, factor level 1 of the dependent variable should represent the desired outcome.
• Only meaningful variables should be included.
• The independent variables should be independent of each other; that is, the model should have little or no multicollinearity.
• The independent variables are linearly related to the log odds.
• Logistic regression requires fairly large sample sizes.
How Does the Logistic Regression Algorithm Work?

Consider the following example: an organization wants to determine an employee's salary increase based on their performance.

For this purpose, a linear regression algorithm will help them decide. Plotting a regression line, with the employee's performance as the independent variable and the salary increase as the dependent variable, makes the task straightforward.

Now, what if the organization wants to know whether an employee will get a promotion or not based on their performance? A straight regression line is not suitable in this case, so we clip the line at zero and one and convert it into a sigmoid curve (S-curve).

Based on a threshold value, the organization can then decide whether an employee will get the promotion or not.
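As a minimal illustration of that thresholding step (the cutoff logic and example probabilities are made up, not tied to any specific dataset):

def classify(probability: float, threshold: float = 0.5) -> int:
    """Turn a predicted probability into a 0/1 decision using a cutoff threshold."""
    return 1 if probability >= threshold else 0

print(classify(0.82))   # 1, e.g. "gets the promotion"
print(classify(0.31))   # 0, e.g. "does not get the promotion"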

To understand logistic regression, let's go over the odds of success.

Odds (θ) = probability of the event happening / probability of the event not happening

θ = p / (1 − p)

The values of odds range from zero to ∞, while the values of probability lie between zero and one.
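As a minimal sketch (assuming NumPy; the probabilities are made up), the odds and the log-odds (the logit, which logistic regression models as a linear function of the inputs, per the assumption listed above) can be computed as:

import numpy as np

def odds(p: float) -> float:
    """Odds of success: P(event) / P(no event)."""
    return p / (1.0 - p)

for p in (0.1, 0.5, 0.9):
    # The log-odds (logit) is the quantity logistic regression models linearly.
    print(f"p={p}: odds={odds(p):.3f}, log-odds={np.log(odds(p)):.3f}")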
Use case of Logistic Regression
Logistic regression can be used to predict if a student will pass or fail an
exam based on the number of hours they spent studying. The dependent
variable is "pass" or "fail", which are represented by the values 1 and 0,
respectively.
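Below is a minimal sketch of this use case (assuming scikit-learn and NumPy; the study hours and pass/fail labels are invented for illustration):

import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical data: hours studied vs. pass (1) / fail (0), made up for illustration
hours = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0], [4.5], [5.0]])
passed = np.array([0, 0, 0, 0, 1, 0, 1, 1, 1, 1])

model = LogisticRegression()
model.fit(hours, passed)

# Probability of passing for a student who studied 2.75 hours
print(model.predict_proba([[2.75]])[0, 1])
# Hard 0/1 prediction using the default 0.5 threshold
print(model.predict([[2.75]]))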
Advantages of Logistic Regression:
• Easy to Use: Simple to understand and explain.
• Good for Yes/No Questions: Well suited to predicting things like "Will it rain?" or "Will I buy this?"
• Gives Probabilities: Shows how likely something is to happen.
• Handles an S-shaped Relationship: The predicted probability follows a curve rather than a straight line.
• Better with Outliers: Less affected by extreme values than some other methods.

Disadvantages of Logistic Regression:
• Needs Lots of Data: Works best with a good amount of information.
• Assumes a Simple Relationship: It assumes a specific (linear-in-the-log-odds) way the factors relate to the outcome.
• Only for Two Options: Mainly used for two choices, not for multiple options.
• Struggles with Complex Patterns: Might not capture very complicated relationships well.
• Sensitive to Similar Variables: If the factors are too alike (multicollinear), it can cause problems.
Difference Between Linear Regression and Logistic Regression

In short: Linear Regression is used for solving regression problems, predicting a continuous output, whereas Logistic Regression is used for solving classification problems, predicting the probability of a categorical outcome.