
Stochastic Gradient Descent Overview

Stochastic search methods, also known as stochastic optimization methods, utilize randomness to solve complex optimization problems and are effective in various fields like machine learning and engineering. These methods often employ heuristic search techniques and include algorithms inspired by biological processes, such as Genetic Algorithms and Simulated Annealing. Stochastic Gradient Descent (SGD) is a specific example that improves computational efficiency by using a single randomly selected sample for each iteration instead of the entire dataset.

Stochastic search methods

- Also known as stochastic optimization methods, these are a class of optimization techniques that use randomness or randomness-inspired mechanisms to find solutions to optimization problems.
- They are particularly useful when dealing with complex and noisy optimization problems where traditional deterministic approaches may be less effective.
- They are widely used in various fields, including machine learning, engineering, operations research, and economics.

Stochastic Optimization Algorithms


The use of randomness in these algorithms means the techniques are often referred to as
“heuristic search”: they follow a rough rule-of-thumb procedure that may or may not find
the optima, rather than a precise procedure.
- Many stochastic algorithms are inspired by a biological or natural process and may be
referred to as “metaheuristics”, higher-order procedures that guide an underlying search
of the objective function.
- They are also referred to as “black box” optimization algorithms, because they only need
to evaluate the objective function, not to inspect its gradient or internal structure.
There are many stochastic optimization algorithms. Some examples of stochastic
optimization algorithms include:
1. Iterated Local Search
2. Stochastic Hill Climbing
3. Stochastic Gradient Descent
4. Tabu Search
5. Greedy Randomized Adaptive Search Procedure
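
As a concrete illustration of this rule-of-thumb style of search, below is a minimal sketch of Stochastic Hill Climbing (item 2 above) in Python. The objective function, step size, and iteration count are illustrative assumptions, not values from these notes.

import random

def objective(x):
    # Illustrative objective: a simple quadratic with its minimum at x = 3.
    return (x - 3.0) ** 2

def stochastic_hill_climbing(start, step_size=0.1, n_iterations=1000):
    """Keep the current solution; move to a random neighbor only if it is better."""
    current = start
    current_eval = objective(current)
    for _ in range(n_iterations):
        # Propose a random neighbor of the current solution.
        candidate = current + random.uniform(-step_size, step_size)
        candidate_eval = objective(candidate)
        # Accept the candidate only if it improves (reduces) the objective.
        if candidate_eval < current_eval:
            current, current_eval = candidate, candidate_eval
    return current, current_eval

best, best_eval = stochastic_hill_climbing(start=random.uniform(-10, 10))
print(f"best x = {best:.4f}, objective = {best_eval:.6f}")
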

Some examples of stochastic optimization algorithms that are inspired by biological or
physical processes include:
1. Simulated Annealing
2. Evolution Strategies
3. Genetic Algorithm
4. Differential Evolution
5. Particle Swarm Optimization
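
For these nature-inspired methods, a minimal sketch of Simulated Annealing (item 1 above) is shown below; the objective function, temperature schedule, and step size are illustrative assumptions. Unlike plain hill climbing, it sometimes accepts a worse candidate, with a probability that shrinks as the temperature falls.

import math
import random

def objective(x):
    # Illustrative objective to be minimized.
    return (x - 3.0) ** 2

def simulated_annealing(start, step_size=0.5, n_iterations=1000, initial_temp=10.0):
    current = start
    current_eval = objective(current)
    for i in range(1, n_iterations + 1):
        # Propose a random neighbor of the current solution.
        candidate = current + random.uniform(-step_size, step_size)
        candidate_eval = objective(candidate)
        # Temperature decreases over time, so worse moves become less likely.
        temperature = initial_temp / i
        delta = candidate_eval - current_eval
        # Always accept improvements; accept worse moves with probability exp(-delta / T).
        if delta < 0 or random.random() < math.exp(-delta / temperature):
            current, current_eval = candidate, candidate_eval
    return current, current_eval

best, best_eval = simulated_annealing(start=random.uniform(-10, 10))
print(f"best x = {best:.4f}, objective = {best_eval:.6f}")
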
Stochastic Gradient Descent (SGD):
The word “stochastic” refers to a system or process that involves randomness. Hence, in
Stochastic Gradient Descent, a small number of samples are selected randomly instead of the
whole dataset for each iteration.
- In Gradient Descent, the term “batch” denotes the number of samples from the dataset
that is used to calculate the gradient in each iteration.
- In typical Gradient Descent optimization, such as Batch Gradient Descent, the batch is
taken to be the whole dataset. Using the whole dataset helps reach the minima in a less
noisy and less random manner, but this becomes a problem when the dataset gets large.
Suppose you have a million samples in your dataset. With typical Gradient Descent, all one
million samples must be used to complete a single iteration, and this has to be done for
every iteration until the minima are reached, which makes the computation very expensive.
Stochastic Gradient Descent solves this problem: it uses only a single sample, i.e., a batch
size of one, for each iteration. The dataset is randomly shuffled and a sample is selected to
perform each iteration.
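
To make the batch-size-of-one update concrete, here is a minimal sketch of SGD for a simple linear regression model; the synthetic data, learning rate, and number of epochs are illustrative assumptions rather than values from these notes.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative synthetic data: y = 2x + 1 plus a little noise.
X = rng.uniform(-1, 1, size=1000)
y = 2.0 * X + 1.0 + rng.normal(scale=0.1, size=1000)

w, b = 0.0, 0.0          # model parameters for y_hat = w * x + b
learning_rate = 0.1

for epoch in range(5):
    # Shuffle the dataset, then perform one update per randomly ordered sample.
    order = rng.permutation(len(X))
    for i in order:
        x_i, y_i = X[i], y[i]
        error = (w * x_i + b) - y_i
        # Gradient of the squared error for this single sample (batch size of one).
        grad_w = 2.0 * error * x_i
        grad_b = 2.0 * error
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

print(f"learned w = {w:.3f}, b = {b:.3f}")  # should be close to 2 and 1

Each update touches only one sample, so an iteration is cheap even for very large datasets; the trade-off is that the path to the minima is noisier than with Batch Gradient Descent.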

Examples of stochastic processes
Stochastic processes are widely used as mathematical models of systems and phenomena
that appear to vary in a random manner. Examples include the growth of a bacterial
population, an electrical current fluctuating due to thermal noise, or the movement of a gas
molecule.
Advantages of stochastic modeling
Stochastic models are particularly useful in forecasting, in which the actuary produces
estimates of results in future years, not just a current-year valuation.
