This document provides a guide on linear regression using gradient descent, explaining how gradient descent is a key optimization strategy in machine learning. It describes the intuition and mathematical principles behind gradient descent, along with different types such as batch, stochastic, and mini-batch gradient descent. Furthermore, it outlines the process for fitting a line to data points using linear regression, including error calculation and parameter updates.
Gradient Descent
Gradient Descent is the most popular optimization
strategy used in machine learning and deep learning
right now.
It can be combined with almost every algorithm,
yet it is easy to understand. So, anyone planning
to go on the journey of machine learning should
understand it.
Intuition
It is simply used to find the values of the
parameters at which the given cost function
reaches its nearest minimum.
"A gradient isthe ratio which relates the input and output of a
function. How small changes drives changes in the output of the
function."
Suppose we have a function f(x) = x². Then the derivative of the
function, f'(x), is 2x. This means that if x changes by 1 unit, f(x)
changes by approximately 2x; at x = 1, for example, that is a change
of about 2 × 1 = 2.
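As a quick sanity check, here is a minimal plain-Python sketch that approximates the derivative numerically (the function names and test points are ours, for illustration only):

def f(x):
    return x ** 2

def numerical_derivative(f, x, h=1e-6):
    # Central-difference approximation of f'(x)
    return (f(x + h) - f(x - h)) / (2 * h)

print(numerical_derivative(f, 1.0))  # ~2.0, matching f'(1) = 2*1
print(numerical_derivative(f, 3.0))  # ~6.0, matching f'(3) = 2*3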
1. A blindfolded person starts at the top of a hill.
2. Checks for the steepest downward direction
at that point.
3. Takes a step in that direction.
4. Checks again for the steepest downward
direction.
5. Repeats until the slope/gradient is
acceptably small or the ground is flat.
The math behind it
The equation below shows how it's done:

x(next) = x(current) − γ ∇f(x(current))

'x(next)' is the new position of the person, 'x(current)' is the
current position, the subtraction means we move against the gradient,
'γ' (gamma) is the step size, and '∇f(x)' (nabla f) is the gradient
showing the steepest direction.
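To make the rule concrete, here is a minimal sketch that applies it to the earlier function f(x) = x² (the starting point, step size, and iteration count are illustrative choices, not values from the slides):

def grad_f(x):
    return 2 * x  # ∇f(x) for f(x) = x²

x_current = 5.0  # the "person's" starting position on the hill
gamma = 0.1      # step size

for _ in range(50):
    x_next = x_current - gamma * grad_f(x_current)  # step against the gradient
    x_current = x_next

print(x_current)  # very close to 0.0, the minimum of f(x) = x²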
Let's take another example: the linear
regression technique in machine learning.
We have to find the optimal 'w' and 'b'
for the cost function J(w, b), i.e., the
values at which J is minimum. Below is an
illustration of a convex cost function:
w and b are represented on the horizontal
axes, while J(w, b) is represented on the
vertical axis.
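One common choice for J(w, b) in linear regression is the mean squared error; the slides do not name a specific cost, so here is a minimal sketch under that assumption (the data points are invented for illustration):

def J(w, b, xs, ys):
    # Mean squared error of the line y = w*x + b over the data
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0]
ys = [3.0, 5.0, 7.0]        # points exactly on y = 2x + 1
print(J(2.0, 1.0, xs, ys))  # 0.0 at the optimal (w, b)
print(J(0.0, 0.0, xs, ys))  # larger cost away from the optimum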
Learning rate
The size of the steps taken to reach the optimal point decides the
rate of gradient descent. This step size is referred to as the
'learning rate'.
➔ Too big
the steps bounce back and forth across the convex function and
may never reach the local minimum.
➔ Too small
gradient descent will eventually reach the local minimum,
but it will take too much time.
➔ Just right
gradient descent reaches the local minimum in a reasonable
number of steps (see the sketch below).
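Here is a minimal sketch comparing the three cases on f(x) = x² (the specific step sizes are illustrative choices):

def run(gamma, steps=20, x=5.0):
    for _ in range(steps):
        x = x - gamma * 2 * x  # gradient step, since f'(x) = 2x
    return x

print(run(1.1))    # too big: the iterates bounce and grow, never settling
print(run(0.001))  # too small: still far from the minimum after 20 steps
print(run(0.1))    # just right: close to the minimum at x = 0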
Gradient Descent types
● Batch Gradient Descent
A.k.a. vanilla gradient descent. Calculates the error for each
example, but the model is updated only after a full epoch
(one pass over all examples).
● Stochastic Gradient Descent
SGD, unlike vanilla, updates the model for each example as it
iterates. These frequent updates can be computationally more
expensive.
● Mini-Batch Gradient Descent
A combination of the concepts of both SGD and batch gradient
descent (see the sketch after this list).
○ Splits the data into batches and then performs an update
per batch, balancing the efficiency of batch gradient
descent against the robustness of SGD.
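Here is a minimal plain-Python sketch of the mini-batch variant (the batch size, learning rate, epoch count, and data are illustrative choices). Batch and stochastic gradient descent fall out as the special cases batch_size = len(data) and batch_size = 1:

import random

def minibatch_gd(xs, ys, lr=0.05, batch_size=2, epochs=500):
    m, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        random.shuffle(data)
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            n = len(batch)
            # Gradients of mean squared error over this batch only
            grad_m = sum(2 * (m * x + b - y) * x for x, y in batch) / n
            grad_b = sum(2 * (m * x + b - y) for x, y in batch) / n
            m -= lr * grad_m
            b -= lr * grad_b
    return m, b

print(minibatch_gd([1, 2, 3, 4], [3, 5, 7, 9]))  # converges toward m = 2, b = 1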
Linear Regression.
Just give me the code: GradientDescentDemo
Y = mX + b
1. Our goal is to fit the best line to the given
points.
2. Start with random m and b.
3. Calculate the error between the predicted Y
and the true Y.
4. Adjust m and b with gradient descent.
5. Repeat until a satisfactory result is achieved
(a sketch of the full loop follows below).
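The slides reference a GradientDescentDemo without including the code itself, so here is a minimal sketch of what such a demo might look like, following steps 1-5 above (the data, learning rate, and iteration count are illustrative choices):

def fit_line(xs, ys, lr=0.05, iterations=1000):
    m, b = 0.0, 0.0  # step 2: start with arbitrary m and b
    n = len(xs)
    for _ in range(iterations):
        # step 3: error between predicted Y and true Y
        errors = [(m * x + b) - y for x, y in zip(xs, ys)]
        # step 4: adjust m and b along the negative gradient of the MSE
        grad_m = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(2 * e for e in errors) / n
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b  # step 5: stop after a fixed number of iterations

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.1, 6.9, 9.1, 10.9]  # noisy points near y = 2x + 1
m, b = fit_line(xs, ys)
print(f"m = {m:.2f}, b = {b:.2f}")  # roughly m ≈ 2, b ≈ 1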