Understanding Regression Analysis in ML

Regression analysis is a statistical method used to model the relationship between dependent and independent variables, predicting continuous values like sales or temperature. It is a supervised learning technique that helps identify correlations and is essential for forecasting and understanding causal relationships. Various types of regression, including linear, logistic, polynomial, and others, are utilized in machine learning to analyze data and make predictions.

Uploaded by

micheleldavis248

Regression Analysis in Machine Learning

Regression analysis is a statistical method for modeling the relationship between a dependent (target) variable and one or more independent (predictor) variables. More specifically, regression analysis helps us understand how the value of the dependent variable changes with one independent variable when the other independent variables are held fixed. It predicts continuous/real values such as temperature, age, salary, price, etc.

We can understand the concept of regression analysis using the following example:

Example: Suppose there is a marketing company A that runs various advertisements every year and earns sales from them. The company has records of its advertising spend over the last 5 years and the corresponding sales.

Now, the company wants to spend $200 on advertising in 2019 and wants to predict the sales for that year. To solve this type of prediction problem in machine learning, we need regression analysis.

Regression is a supervised learning technique that helps find the correlation between variables and enables us to predict a continuous output variable from one or more predictor variables. It is mainly used for prediction, forecasting, time series modeling, and determining cause-and-effect relationships between variables.

In regression, we fit a line or curve to the given datapoints, and the machine learning model uses this fit to make predictions about the data. In simple words, "Regression finds a line or curve through the datapoints on the target-predictor graph such that the vertical distance between the datapoints and the regression line is minimum." The distance between the datapoints and the line indicates whether the model has captured a strong relationship or not.

Some examples of regression are:

o Prediction of rain using temperature and other factors
o Determining market trends
o Prediction of road accidents due to rash driving

Terminologies Related to Regression Analysis:

o Dependent Variable: The main factor in regression analysis that we want to predict or understand is called the dependent variable. It is also called the target variable.
o Independent Variable: The factors that affect the dependent variable, or that are used to predict its values, are called independent variables, also called predictors.
o Outliers: An outlier is an observation with a very low or very high value compared with the other observed values. Outliers can distort the result, so they should be identified and handled carefully.
o Multicollinearity: If the independent variables are highly correlated with each other, the condition is called multicollinearity. It is undesirable in the dataset because it makes it difficult to rank the variables by their effect on the target.
o Underfitting and Overfitting: If our algorithm works well with the training dataset but not with the test dataset, the problem is called overfitting. If our algorithm does not perform well even with the training dataset, the problem is called underfitting.
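
One rough way to spot the multicollinearity described above is to measure the correlation between two predictors. The sketch below computes the Pearson correlation coefficient in plain Python. The variable names (`tv_spend`, `radio_spend`) and their values are hypothetical and serve only as an illustration; they are not data from this document.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical predictors that move almost in lockstep
tv_spend = [10, 20, 30, 40, 50]
radio_spend = [11, 19, 32, 41, 49]

r = pearson(tv_spend, radio_spend)
print(f"correlation between predictors: {r:.3f}")
```

A value of `r` close to 1 or -1 suggests the two predictors carry nearly the same information. What counts as "too high" is a judgment call that depends on the problem.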

Why do we use Regression Analysis?

As mentioned above, regression analysis helps in the prediction of a continuous variable. Many real-world scenarios require predictions about the future, such as weather conditions, sales, and marketing trends. For such cases we need a technique that can make accurate predictions, and regression analysis, a statistical method used in machine learning and data science, serves this purpose. Below are some other reasons for using regression analysis:

o Regression estimates the relationship between the target and the independent variables.
o It is used to find trends in data.
o It helps to predict real/continuous values.
o By performing regression, we can estimate which factors are most important, which are least important, and how each factor affects the target.

Types of Regression

Various types of regression are used in data science and machine learning. Each type matters in different scenarios, but at the core, all regression methods analyze the effect of the independent variables on the dependent variable. The important types of regression discussed here are:

o Linear Regression
o Logistic Regression
o Polynomial Regression
o Support Vector Regression
o Decision Tree Regression
o Random Forest Regression
o Ridge Regression
o Lasso Regression

Linear Regression:

o Linear regression is a statistical regression method used for predictive analysis.
o It is one of the simplest regression algorithms and shows the relationship between continuous variables.
o It is used for solving regression problems in machine learning.
o Linear regression models a linear relationship between the independent variable (X-axis) and the dependent variable (Y-axis), hence the name linear regression.
o If there is only one input variable (x), it is called simple linear regression. If there is more than one input variable, it is called multiple linear regression.
o The relationship between variables in a linear regression model can be illustrated by predicting the salary of an employee from years of experience.
o Below is the mathematical equation for linear regression:

$$Y = aX + b$$

Here, Y is the dependent (target) variable, X is the independent (predictor) variable, and a and b are the linear coefficients (slope and intercept).
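
To make the equation concrete, here is a minimal sketch of fitting a and b by ordinary least squares for a single input variable, using only the Python standard library. The salary figures are hypothetical values chosen to lie exactly on a line; they are not data from this document.

```python
def fit_simple_linear(xs, ys):
    """Return (a, b) for y = a*x + b that minimise the sum of squared vertical distances."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)  # assumes the x values are not all equal
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b

# Hypothetical data: years of experience vs. salary (in thousands)
years = [1, 2, 3, 4, 5]
salary = [35, 40, 45, 50, 55]

a, b = fit_simple_linear(years, salary)
predicted = a * 6 + b  # predicted salary for 6 years of experience
print(a, b, predicted)  # 5.0 30.0 60.0
```

With several input variables the same idea applies, but the coefficients are usually found with matrix methods or a library rather than these closed-form sums.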

Some popular applications of linear regression are:

o Analyzing trends and sales estimates
o Salary forecasting
o Real estate prediction
o Arriving at ETAs in traffic

Logistic Regression:

o Logistic regression is another supervised learning algorithm, used to solve classification problems. In classification problems, the dependent variable is binary or discrete, such as 0 or 1.
o The logistic regression algorithm works with categorical target variables such as 0 or 1, Yes or No, True or False, Spam or Not spam, etc.
o It is a predictive analysis algorithm based on the concept of probability.
o Logistic regression is a type of regression, but it differs from linear regression in how it is used.
o Logistic regression uses the sigmoid function (also called the logistic function), which maps any real-valued input to a value between 0 and 1. This function is used to model the data in logistic regression and can be represented as:

$$f(x) = \frac{1}{1 + e^{-x}}$$

o f(x) = output, between 0 and 1
o x = input to the function
o e = base of the natural logarithm

When we provide the input values (data) to the function, it produces an S-shaped curve.

o It uses the concept of a threshold: values above the threshold are rounded up to 1, and values below the threshold are rounded down to 0.
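
The sigmoid function and the thresholding step can be sketched in a few lines of Python. The default threshold of 0.5 and the choice to map a value exactly at the threshold to class 1 are assumptions made for this illustration.

```python
import math

def sigmoid(x):
    """Logistic function 1 / (1 + e^-x), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # for negative x, use the equivalent form e^x / (1 + e^x)
    return e / (1.0 + e)

def classify(x, threshold=0.5):
    """Map the sigmoid output to class 1 or class 0 using a threshold (assumed 0.5)."""
    return 1 if sigmoid(x) >= threshold else 0

for value in (-4, -1, 0, 1, 4):
    print(value, round(sigmoid(value), 3), classify(value))
```

In a fitted logistic regression model the input `x` would be a weighted sum of the features. Here it is passed in directly to keep the sketch small.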

There are three types of logistic regression:

o Binary (0/1, pass/fail)
o Multinomial (cats, dogs, lions)
o Ordinal (low, medium, high)

Polynomial Regression:

o Polynomial regression is a type of regression that models a non-linear dataset using a linear model.
o It is similar to multiple linear regression, but it fits a non-linear curve between the value of x and the corresponding conditional values of y.
o Suppose a dataset consists of datapoints arranged in a non-linear fashion. Linear regression will not fit such datapoints well, so to cover them we need polynomial regression.
o In polynomial regression, the original features are transformed into polynomial features of a given degree and then modeled using a linear model, which means the datapoints are fitted with a polynomial curve.
o The equation for polynomial regression is derived from the linear regression equation: the linear regression equation $Y = b_0 + b_1 x$ is extended to the polynomial regression equation $Y = b_0 + b_1 x + b_2 x^2 + b_3 x^3 + \dots + b_n x^n$.
o Here Y is the predicted/target output, $b_0, b_1, \dots, b_n$ are the regression coefficients, and x is the independent/input variable.
o The model is still linear because it is linear in the coefficients, even though it contains quadratic and higher-degree terms in x.

Note: This differs from multiple linear regression in that polynomial regression uses a single variable raised to several different degrees, rather than several variables each of the same degree.
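
The "transform the features, then fit a linear model" idea can be sketched without any libraries. The example below expands x into the features 1, x, x^2, ..., and solves the least-squares normal equations by Gaussian elimination. The helper names and the data (generated from y = 1 + 2x + 3x^2) are hypothetical and chosen only for illustration.

```python
def polynomial_features(x, degree):
    """Expand a single value x into [1, x, x^2, ..., x^degree]."""
    return [x ** k for k in range(degree + 1)]

def solve_linear_system(a, b):
    """Solve a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (m[i][n] - s) / m[i][i]
    return w

def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients [b0, b1, ..., bn] of a degree-n polynomial."""
    rows = [polynomial_features(x, degree) for x in xs]
    p = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated from y = 1 + 2x + 3x^2
xs = [-2, -1, 0, 1, 2]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]

coeffs = fit_polynomial(xs, ys, degree=2)
print([round(c, 6) for c in coeffs])
```

Note that the fitting step is ordinary linear least squares. Only the feature expansion is non-linear in x, which is why the model still counts as linear.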

Support Vector Regression:

Support Vector Machine is a supervised learning algorithm that can be used for regression as well as classification problems. When it is used for regression problems, it is termed Support Vector Regression (SVR).

Support Vector Regression is a regression algorithm that works for continuous variables. Below are some keywords used in Support Vector Regression:

o Kernel: A function used to map lower-dimensional data into higher-dimensional data.
o Hyperplane: In a general SVM, it is the separation line between two classes, but in SVR, it is the line that helps predict the continuous variable and covers most of the datapoints.
o Boundary line: Boundary lines are the two lines on either side of the hyperplane, which create a margin for the datapoints.
o Support vectors: Support vectors are the datapoints that lie on or outside the boundary lines and therefore determine the position of the hyperplane.

In SVR, we try to determine a hyperplane and margin such that the maximum number of datapoints fall within that margin. The main goal of SVR is to cover as many datapoints as possible within the boundary lines, with the hyperplane (best-fit line) passing as close as possible to the datapoints. In the accompanying figure, the blue line is the hyperplane and the other two lines are the boundary lines.
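
The role of the boundary lines can be illustrated with a small check of which datapoints fall inside the margin around a given line. This is only a sketch of the margin idea, not an SVR training procedure. The margin half-width `epsilon`, the line parameters, and the data are all hypothetical values introduced for this example.

```python
def inside_margin(xs, ys, slope, intercept, epsilon):
    """For each point, report whether it lies between the two boundary lines
    y = slope*x + intercept + epsilon and y = slope*x + intercept - epsilon."""
    return [abs(y - (slope * x + intercept)) <= epsilon for x, y in zip(xs, ys)]

# Hypothetical datapoints and a hypothetical hyperplane y = 2x with margin 0.2
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.5, 8.0]

flags = inside_margin(xs, ys, slope=2.0, intercept=0.0, epsilon=0.2)
covered = sum(flags)
print(flags, covered)  # [True, True, False, True] 3
```

The point at x = 3 falls outside the margin, so it is one of the points that would influence where the hyperplane ends up.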

Decision Tree Regression:

o Decision Tree is a supervised learning algorithm that can be used for solving both classification and regression problems.
o It can solve problems for both categorical and numerical data.
o Decision Tree regression builds a tree-like structure in which each internal node represents a "test" on an attribute, each branch represents the result of the test, and each leaf node represents the final decision or result.
o A decision tree is constructed starting from the root node/parent node (the dataset), which splits into left and right child nodes (subsets of the dataset). These child nodes are further divided into their own child nodes and thereby become the parent nodes of those nodes.

The accompanying figure shows an example of a decision tree in which the model tries to predict a person's choice between a sports car and a luxury car.
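
The splitting step described above can be sketched for the simplest possible regression tree: a root node with a single test and two leaves. The sketch assumes the split is chosen to minimise the total squared error and that each leaf predicts the mean of its targets. The text above does not name a split criterion, so that choice and the data are assumptions.

```python
def squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) for the single best split.
    Assumes xs contains at least two distinct values."""
    best = None
    candidates = sorted(set(xs))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2  # test: is x <= threshold?
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        error = squared_error(left) + squared_error(right)
        if best is None or error < best[0]:
            best = (error, threshold, sum(left) / len(left), sum(right) / len(right))
    return best[1:]

def predict(x, threshold, left_mean, right_mean):
    """Follow the branch chosen by the test and return that leaf's value."""
    return left_mean if x <= threshold else right_mean

# Hypothetical data with two clearly separated groups
xs = [1, 2, 3, 10, 11, 12]
ys = [5, 6, 7, 20, 21, 22]

threshold, left_mean, right_mean = best_split(xs, ys)
print(threshold, left_mean, right_mean)  # 6.5 6.0 21.0
```

A full tree repeats the same search inside each child node until a stopping rule, such as a maximum depth, is reached.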

Random Forest Regression:

o Random forest is one of the most powerful supervised learning algorithms and is capable of performing regression as well as classification tasks.
o Random Forest regression is an ensemble learning method that combines multiple decision trees and predicts the final output based on the average of the trees' outputs. The combined decision trees are called base models, and the ensemble can be represented more formally as:

$$g(x) = f_0(x) + f_1(x) + f_2(x) + \dots$$

where each $f_i$ is a base model (a single decision tree); for regression, the combined prediction is the average of these outputs.

o Random forest uses the Bagging (Bootstrap Aggregation) technique of ensemble learning, in which the aggregated decision trees run in parallel and do not interact with each other.
o With the help of Random Forest regression, we can reduce overfitting in the model by training each tree on a random subset of the dataset.
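
Bagging can be sketched by training several small trees on bootstrap samples and averaging their predictions. To stay short, the sketch uses one-split trees on a single feature, so it shows only the bootstrap-and-average part of a random forest, not the random feature selection that full implementations typically add. All names, the tree count, the seed, and the data are hypothetical.

```python
import random

def fit_stump(xs, ys):
    """Fit a one-split regression tree and return it as a prediction function."""
    mean_all = sum(ys) / len(ys)
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    if best is None:  # every x in this sample is identical: predict the mean
        return lambda x: mean_all
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def fit_bagged_trees(xs, ys, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    n = len(xs)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        trees.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict(trees, x):
    """Average the outputs of the base models."""
    return sum(tree(x) for tree in trees) / len(trees)

# Hypothetical data
xs = [1, 2, 3, 10, 11, 12]
ys = [5, 6, 7, 20, 21, 22]

trees = fit_bagged_trees(xs, ys)
prediction = predict(trees, 2)
print(round(prediction, 2))
```

Because each tree is built independently of the others, the loop in `fit_bagged_trees` could run in parallel, which matches the description of bagging above.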

Ridge Regression:

o Ridge regression is one of the most robust versions of linear regression, in which a small amount of bias is introduced so that we can get better long-term predictions.
o The amount of bias added to the model is known as the ridge regression penalty. We compute this penalty term by multiplying lambda by the squared weight of each individual feature.
o The equation for ridge regression will be:

$$\min_{w} \; \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} w_j^2$$

where $y_i$ is the observed target, $\hat{y}_i$ is the model's prediction, $w_j$ is the weight of feature j, and $\lambda$ controls the strength of the penalty.

o A general linear or polynomial regression can give unstable coefficient estimates if there is high collinearity between the independent variables, so ridge regression can be used to solve such problems.
o Ridge regression is a regularization technique used to reduce the complexity of the model. It is also called L2 regularization.
o It helps to solve problems where we have more parameters than samples.
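
For a single feature, the effect of the ridge penalty can be seen in closed form. The sketch below assumes the intercept is left unpenalised and only the slope is shrunk, which is a common convention but an assumption here. The parameter name `lam` and the data are hypothetical.

```python
def fit_ridge_1d(xs, ys, lam):
    """Slope w and intercept b minimising sum((y - w*x - b)**2) + lam * w**2.
    Assumes sxx + lam is not zero."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / (sxx + lam)      # lam = 0 gives the ordinary least-squares slope
    b = mean_y - w * mean_x    # intercept is not penalised in this sketch
    return w, b

# Hypothetical data lying exactly on y = 5x + 30
xs = [1, 2, 3, 4, 5]
ys = [35, 40, 45, 50, 55]

w_plain, b_plain = fit_ridge_1d(xs, ys, lam=0.0)
w_ridge, b_ridge = fit_ridge_1d(xs, ys, lam=10.0)
print(w_plain, w_ridge)  # 5.0 2.5
```

Increasing `lam` pulls the slope toward zero, which is the bias the penalty introduces in exchange for a less sensitive model.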

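Lasso Regression:

o Lasso regression is another regularization technique used to reduce the complexity of the model.
o It is similar to ridge regression, except that the penalty term contains the absolute values of the weights instead of their squares.
o Because it uses absolute values, it can shrink a slope exactly to 0, whereas ridge regression can only shrink it close to 0.
o It is also called L1 regularization. The equation for lasso regression will be:

$$\min_{w} \; \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \lambda \sum_{j=1}^{p} |w_j|$$

where $y_i$ is the observed target, $\hat{y}_i$ is the model's prediction, $w_j$ is the weight of feature j, and $\lambda$ controls the strength of the penalty.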
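
The difference between shrinking "exactly to 0" and "close to 0" can be checked for a single feature, where both penalised slopes have closed forms. The sketch assumes an unpenalised intercept and the objective sum of squared errors plus `lam` times the penalty. The function names, `lam` values, and data are hypothetical.

```python
def centered_sums(xs, ys):
    """Return (sxy, sxx) computed around the means of xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy, sxx

def lasso_slope(xs, ys, lam):
    """Slope minimising sum((y - w*x - b)**2) + lam * |w| (soft thresholding)."""
    sxy, sxx = centered_sums(xs, ys)
    shrunk = max(abs(sxy) - lam / 2, 0.0)  # becomes exactly 0 once lam is large enough
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * shrunk / sxx

def ridge_slope(xs, ys, lam):
    """Slope minimising sum((y - w*x - b)**2) + lam * w**2, for comparison."""
    sxy, sxx = centered_sums(xs, ys)
    return sxy / (sxx + lam)

# Hypothetical data lying exactly on y = 5x + 30
xs = [1, 2, 3, 4, 5]
ys = [35, 40, 45, 50, 55]

for lam in (0.0, 20.0, 200.0):
    print(lam, lasso_slope(xs, ys, lam), round(ridge_slope(xs, ys, lam), 4))
```

At the largest `lam` the lasso slope is exactly 0.0 while the ridge slope is small but still positive, which is why lasso is often used to drop features entirely.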
