Linear Regression
• A supervised machine learning algorithm.
• Predicted output is continuous, e.g., estimating sales, cost, etc.
• Let x denote the set of input variables and y denote the
output variable.
• As y is obtained from x, x is also called a set of Attributes or
Features used to determine y.
• Collect n data points, (xi , yi ), i = 1, 2, · · · n – Training data.
• Choose a linear function, y = f (x) and estimate coefficients
of f using n training points – Learning.
• That means, f , which gives the relationship between x and y ,
is learnt from data.
Linear Regression (contd.)
• After learning f , new points can be predicted by, y = f (x).
• Types of linear regression:
Simple Regression – Based on a single input variable.
Multivariate Regression – Based on multiple input variables.
• Simple Regression: Choose y = f (x) = w0 + w1 x, and learn
coefficients or weights, w0 and w1 using n training data
points, (xi , yi ), i = 1, 2, · · · n.
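Simple regression can also be solved directly: setting the derivatives of the squared error to zero gives the standard closed-form least-squares formulas for w0 and w1. A minimal sketch, with made-up illustrative data:

```python
# Simple linear regression y = w0 + w1*x, fit by the closed-form
# least-squares formulas (illustrative sketch; data is made up).
n = 5
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]   # exactly y = 2x, so w0 = 0, w1 = 2

x_mean = sum(xs) / n
y_mean = sum(ys) / n
# w1 = covariance(x, y) / variance(x); w0 = y_mean - w1 * x_mean
w1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / \
     sum((x - x_mean) ** 2 for x in xs)
w0 = y_mean - w1 * x_mean

print(w0, w1)   # recovers w0 = 0.0, w1 = 2.0
```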
Linear Regression (contd.)
• Multivariate Regression:
Choose y = f (x1 , x2 , · · · xk ) = w0 + w1 x1 + w2 x2 + · · · + wk xk .
(a) Learn weights, w0 , w1 , · · · wk , using n training data
points, (xi1 , xi2 , · · · xik , yi ), i = 1, 2, · · · n.
(b) Input data is a matrix of size n × (k + 1), where each row
i denotes a k-dimensional input, (xi1, xi2, · · · xik), and its
output, yi.
(c) In compact form, data can be represented by: (xi , yi ),
i = 1, 2, · · · n, where each xi is a k-dimensional vector.
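The multivariate form above can be sketched as a small prediction function; the weight values below are arbitrary for illustration, not learned ones:

```python
# Multivariate regression prediction: f(x) = w0 + w1*x1 + ... + wk*xk.
def predict(w, x):
    """w = [w0, w1, ..., wk]; x = [x1, ..., xk], a k-dimensional input."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

w = [1.0, 2.0, 3.0]      # w0 = 1, w1 = 2, w2 = 3 (so k = 2)
x = [4.0, 5.0]           # one 2-dimensional input point
print(predict(w, x))     # 1 + 2*4 + 3*5 = 24.0
```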
An example
• Consider the following dataset for a problem related to
computer hardware.
Estimate the CPU relative performance (output variable
denoted by CRP) based on the input attributes: Vendor
name, Model name, Machine cycle time, Minimum main
memory in KB, Maximum main memory in KB, Cache
memory in KB, Minimum and maximum channels.
• Standard datasets for different ML problems are available and
maintained in the UCI repository:
[Link]
An example (contd.)
• Some instances (rows) from the UCI dataset for the computer
hardware problem are shown in the table below.
Vendor     Model     MCT  MinMain  MaxMain  CacheMem  MinCh  MaxCh  CRP
honeywell  dps:8/52  140     2000    32000        32      1     54  141
honeywell  dps:8/62  140     2000    32000        32      1     54  189
ibm        3033:s     57     4000    16000         1      6     12  132
ibm        3033:u     57     4000    24000        64     12     16  237
hp         3000/88    75     3000     8000         8      3     48   64
hp         3000/iii  175      256     2000         0      3     24   22
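For regression, the non-numeric Vendor and Model columns are typically dropped, leaving numeric feature vectors. A sketch encoding a few of the rows above:

```python
# A few rows of the hardware dataset as numeric feature vectors;
# the last value in each row is the output, CRP.
rows = [
    # MCT, MinMain, MaxMain, CacheMem, MinCh, MaxCh, CRP
    [140, 2000, 32000, 32, 1, 54, 141],
    [140, 2000, 32000, 32, 1, 54, 189],
    [57,  4000, 16000,  1, 6, 12, 132],
]
X = [r[:-1] for r in rows]   # inputs: k = 6 attributes per point
y = [r[-1] for r in rows]    # outputs: CRP values
print(len(X), len(X[0]))     # 3 points, 6 features each
```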
Learning algorithm
• Learn coefficients or weights by minimizing an error function.
• Error function: Mean Squared Error (MSE) between actual
and predicted values over n training points.
MSE = J(w0, w1, · · · , wk) = J(W)
    = (1/n) Σ_{i=1}^{n} (yi − f(xi))²
    = (1/n) Σ_{i=1}^{n} (yi − w0 − w1 xi1 − w2 xi2 − · · · − wk xik)²
Note that each xi is a k-dimensional vector and W is the
weight vector, (w0 , w1 , · · · , wk ).
• Here, yi and f (xi ) are the actual and predicted output values
for xi , respectively.
• Compute the weights, w0 , w1 , · · · , wk , in such a manner that
MSE is minimized.
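The MSE defined above translates directly into code. A minimal sketch, with a tiny made-up dataset:

```python
# MSE between actual outputs y_i and predictions f(x_i), as defined above.
def predict(w, x):
    # f(x) = w0 + w1*x1 + ... + wk*xk
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def mse(w, X, y):
    n = len(X)
    return sum((yi - predict(w, xi)) ** 2 for xi, yi in zip(X, y)) / n

X = [[1.0], [2.0], [3.0]]
y = [2.0, 4.0, 6.0]           # y = 2x exactly
print(mse([0.0, 2.0], X, y))  # perfect weights give MSE = 0.0
print(mse([0.0, 1.0], X, y))  # errors 1, 2, 3 -> (1 + 4 + 9)/3
```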
Learning algorithm (contd.)
• Analytical solution (based on differentiation) is
computationally expensive, especially in higher dimensions.
• As such, Gradient descent algorithm is used to find the
solution iteratively.
Gradient descent method
• Gradient descent is an optimization algorithm used to
minimize some function by iteratively moving in the direction
of steepest descent.
• Steepest descent is defined by the negative of the gradient.
• For the error function J(W), the gradient is denoted by ∇J(W):
∇J(W) = (∂J(W)/∂w0, ∂J(W)/∂w1, · · · , ∂J(W)/∂wk).
• For example,
∂J(W)/∂wj = (1/n) Σ_{i=1}^{n} (−xij) · 2 (yi − w0 − w1 xi1 − w2 xi2 − · · · − wk xik)
• Each component of the above gradient vector gives the rate of
change of J(W) with respect to the corresponding weight, wj.
Gradient descent method (contd.)
• Each weight is updated by taking a step (scaled by η) in the
opposite (negative) direction of the error gradient.
For each j,
δwj = −η ∂J(W)/∂wj,
wj = wj + δwj.
Here, η is a learning-rate parameter that controls how far to move
in the direction of the negative error gradient.
Vector representation: W = W + δW , where
δW = (δw0 , δw1 , · · · , δwk ).
That means, move in the direction of negative gradient
towards the minimum of the error function.
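A single update step, in the vector form W = W + δW, can be sketched as follows (the gradient values here are arbitrary illustrative numbers):

```python
# One gradient-descent update: w_j <- w_j - eta * dJ/dw_j, applied
# component-wise to the whole weight vector.
def step(w, grad, eta):
    return [wj - eta * gj for wj, gj in zip(w, grad)]

w = [0.0, 0.0]
grad = [-4.0, -6.0]          # example gradient values (not computed here)
eta = 0.1                    # learning rate
print(step(w, grad, eta))    # roughly [0.4, 0.6]
```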
Gradient descent method (contd.)
• The above weight-update process can be repeated for
several iterations until the minimum point of the error
function is reached. Each such iteration is called an Epoch.
• Initially, random values are assigned to the weights.
• This updating of the weights over a number of epochs to obtain
optimal weights (corresponding to the minimum of the error
function) is called Training.
Gradient descent method (contd.)
• A one-dimensional example:
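A one-dimensional case can be sketched directly: minimize J(w) = (w − 3)², whose gradient is 2(w − 3) and whose minimum lies at w = 3.

```python
# One-dimensional gradient descent on J(w) = (w - 3)^2.
# Gradient: dJ/dw = 2*(w - 3); minimum at w = 3.
w = 0.0        # starting point
eta = 0.1      # learning rate
for _ in range(100):
    w -= eta * 2.0 * (w - 3.0)   # step against the gradient
print(round(w, 6))               # converges to 3.0
```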
Training algorithm
• Initialize each wj, j = 0, 1, · · · , k, to some random value.
• For one or more epochs, or until some minimum error
threshold (say, ϵ < 0.001) is reached, do the following:
For each j = 0, 1, · · · , k,
(i) δwj = −η ∂J(W)/∂wj
(ii) wj = wj + δwj
• The training process can be monitored by plotting MSE against
the training iterations/epochs.
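The whole training algorithm can be sketched for the simple-regression case (k = 1), recording the MSE at each epoch so the training process can be monitored. The data and hyperparameters below are illustrative choices:

```python
# Full training loop for simple regression, y = w0 + w1*x,
# with random weight initialization and per-epoch MSE tracking.
import random

def predict(w, x):
    return w[0] + w[1] * x

def mse(w, xs, ys):
    return sum((y - predict(w, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]        # generated from y = 1 + 2x

random.seed(0)
w = [random.uniform(-1, 1), random.uniform(-1, 1)]   # random init
eta = 0.05                                           # learning rate
history = []
n = len(xs)
for epoch in range(500):
    # partial derivatives of the MSE w.r.t. w0 and w1
    g0 = sum(-2.0 * (y - predict(w, x)) for x, y in zip(xs, ys)) / n
    g1 = sum(-2.0 * x * (y - predict(w, x)) for x, y in zip(xs, ys)) / n
    w = [w[0] - eta * g0, w[1] - eta * g1]           # update step
    history.append(mse(w, xs, ys))                   # monitor training

print(w)              # approaches [1.0, 2.0]
print(history[-1])    # MSE near 0 after training
```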
Training algorithm (contd.)
• After the weights are learnt from the data by the training process,
the function y = f(x1, x2, · · · , xk) can be used to predict the
output, y, for any new input, (x1, x2, · · · , xk) – Testing.
How to choose Learning rate (η)
• Learning rate parameter, η, controls the rate or speed at
which the weights (model) are learnt in the training process.
• A high learning rate covers more distance at each step, but
risks overshooting the minimum point.
• A low learning rate is more precise but time-consuming, since
more gradient-update steps are needed to reach the minimum.
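Both effects can be seen on the one-dimensional function J(w) = w², whose gradient is 2w and whose minimum is at w = 0 (the η values below are illustrative):

```python
# Effect of the learning rate eta on 1-D gradient descent for
# J(w) = w^2 (gradient 2w, minimum at w = 0), starting from w = 1.
def run(eta, steps=20, w=1.0):
    for _ in range(steps):
        w -= eta * 2.0 * w
    return w

print(run(0.05))   # small eta: steady but slow progress toward 0
print(run(0.45))   # moderate eta: much closer to 0 after 20 steps
print(run(1.05))   # eta too large: |w| grows, overshooting the minimum
```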
How to choose Learning rate (η) (contd.)
• In the figure below, weight (θ) update steps on the MSE
curve, J(θ), are illustrated for different values of η.
Thank You