UNIT III Part-2
Code:U18CST7002
Presented by: Nivetha R
Department: CSE
Contents
Learning Rate:
Batch gradient descent (BGD) computes the error for every point in the
training set and updates the model only after evaluating all training
examples; one such complete pass over the data is known as a training epoch.
In simple words, every single update requires summing the gradient over all
examples, as sketched below.
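A minimal sketch of batch gradient descent for linear regression, assuming a small synthetic dataset; the learning rate, epoch count, and data values are illustrative choices, not part of the original notes.

# Batch gradient descent: one update per epoch, using ALL training examples.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # 100 examples, 2 features (assumed data)
true_w = np.array([2.0, -3.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.zeros(2)
lr = 0.1                                 # learning rate (assumed value)
for epoch in range(50):
    grad = X.T @ (X @ w - y) / len(y)    # gradient averaged over the full training set
    w -= lr * grad                       # single update after seeing every example
print(w)                                 # approaches [2, -3]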
Stochastic gradient descent (SGD) is a type of gradient descent that processes
one training example per iteration: within each training epoch it updates the
model parameters after every individual example in the dataset. Because only
one training example is needed at a time, it is easy to hold in allocated
memory. However, it loses some computational efficiency compared to batch
gradient descent, since the frequent per-example updates cannot be computed
over the whole set at once. Further, because each update is based on a single
example, the gradient estimate is noisy.
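A minimal sketch of stochastic gradient descent on the same kind of synthetic linear-regression data; the per-example update and per-epoch shuffling shown here are standard practice, while the learning rate and epoch count are illustrative assumptions.

# Stochastic gradient descent: one update per training example.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # assumed synthetic data
true_w = np.array([2.0, -3.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.zeros(2)
lr = 0.01                                # learning rate (assumed value)
for epoch in range(20):
    order = rng.permutation(len(y))      # shuffle examples each epoch
    for i in order:
        grad = (X[i] @ w - y[i]) * X[i]  # gradient from a SINGLE example -> noisy
        w -= lr * grad                   # frequent, per-example updates
print(w)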
Mini-batch gradient descent is the combination of batch gradient descent and
stochastic gradient descent. It divides the training dataset into small
batches and then performs one update per batch. Splitting the training data
into smaller batches strikes a balance between the computational efficiency of
batch gradient descent and the speed of stochastic gradient descent, giving a
variant with high computational efficiency and a less noisy gradient (a sketch
follows the list below).
• It is computationally efficient.
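A minimal sketch of mini-batch gradient descent on the same synthetic data; the batch size of 16, learning rate, and epoch count are illustrative assumptions.

# Mini-batch gradient descent: split the data into small batches, one update per batch.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # assumed synthetic data
true_w = np.array([2.0, -3.0])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.zeros(2)
lr = 0.05                                # learning rate (assumed value)
batch_size = 16                          # batch size (assumed value)
for epoch in range(30):
    order = rng.permutation(len(y))      # shuffle, then walk through in batches
    for start in range(0, len(y), batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        grad = Xb.T @ (Xb @ w - yb) / len(yb)   # gradient averaged over one mini-batch
        w -= lr * grad
print(w)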