Module - 2 Ver 1.4
Regularization: Definition
1. Bias is the difference between the predicted value and the expected/true value.
2. The model makes certain assumptions about the data to make the target function simple,
but those assumptions may not always be correct.
3. A high bias model makes more assumptions about the target function.
4. High bias can cause an algorithm to miss the correct relationship between features and the
target output (underfitting).
5. The bias error is the error due to wrong/inaccurate assumptions that the learning
algorithm makes during training.
6. Zero bias may sound good because the model fits the training data perfectly, but it
means the model has learned too much from the training data; this is called overfitting,
and the model will not be able to do a good job on new/test data.
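A minimal sketch contrasting a high-bias and a high-variance fit on the same noisy data (the data, polynomial degrees, and seed below are illustrative choices, not from the slides):

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

# Degree-1 fit: a strong straight-line assumption -> high bias, underfits.
underfit = np.polyval(np.polyfit(x, y, deg=1), x)
# Degree-15 fit: almost no assumption -> near-zero training error, overfits.
overfit = np.polyval(np.polyfit(x, y, deg=15), x)

print("train MSE, degree 1 :", np.mean((y - underfit) ** 2))
print("train MSE, degree 15:", np.mean((y - overfit) ** 2))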
Variance and errors
• [Figure: underfit vs. overfit models]
Balanced fit
Improving the performance of model - 1
• The L2 norm is the most common norm function in machine learning. Its
definition is the same as the Euclidean distance formula between the endpoint
of the vector and the origin: ||x||_2 = sqrt(x_1^2 + x_2^2 + ... + x_n^2)
• The commonly used L1 norm is simply the sum of the absolute values of the
elements of the vector: ||x||_1 = |x_1| + |x_2| + ... + |x_n|
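A quick NumPy check of the two formulas (illustrative only, not part of the slides):

import numpy as np

x = np.array([3.0, -4.0])
l2 = np.sqrt(np.sum(x ** 2))      # Euclidean distance from the origin: 5.0
l1 = np.sum(np.abs(x))            # sum of absolute values: 7.0
print(l2, np.linalg.norm(x, 2))   # both print 5.0
print(l1, np.linalg.norm(x, 1))   # both print 7.0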
Solution to the Generalized Lagrange Function
Insight into the effect of the constraint
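The two headings above refer to the constrained view of a weight penalty. A standard reconstruction (my sketch, not copied from the slides), in LaTeX form:

\min_{w} J(w) \quad \text{subject to} \quad \|w\|_2^2 \le k
\qquad \Longrightarrow \qquad
\mathcal{L}(w, \lambda) = J(w) + \lambda \left( \|w\|_2^2 - k \right), \quad \lambda \ge 0

For a fixed multiplier lambda, minimizing this Lagrangian over w is the same as minimizing the weight-decay objective J(w) + lambda * ||w||_2^2; a larger lambda corresponds to a smaller constraint radius k, which is the effect of the constraint the second heading points to.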
• Weight decay
• Weight decay is a regularization technique that adds a small penalty,
usually the squared L2 norm of the weights (all the weights of the model), to the
loss function (a minimal NumPy sketch follows these bullets):
loss = loss + weight decay parameter * squared L2 norm of the weights
Loss = MSE(y_hat, y) + wd * sum(w^2)
• Regularization also helps in stopping the iterations when the slope of the
likelihood equals the weight-decay coefficient.
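A minimal NumPy sketch of the penalized loss above for a linear model (the function and parameter names, e.g. wd, are my own illustrative choices):

import numpy as np

def weight_decay_loss(w, X, y, wd=1e-2):
    # Loss = MSE(y_hat, y) + wd * sum(w^2), as on the slide
    y_hat = X @ w
    return np.mean((y_hat - y) ** 2) + wd * np.sum(w ** 2)

def weight_decay_grad(w, X, y, wd=1e-2):
    # The penalty adds 2*wd*w to the gradient, shrinking every weight
    # toward zero at each update (hence the name "weight decay").
    y_hat = X @ w
    return 2.0 * X.T @ (y_hat - y) / len(y) + 2.0 * wd * w

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=50)
w = np.zeros(3)
for _ in range(500):                      # plain gradient descent
    w -= 0.05 * weight_decay_grad(w, X, y)
print(weight_decay_loss(w, X, y), w)      # small loss, slightly shrunken weights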
Regularization in linear algebra problems
• Most datasets have some mistakes in the labels, so maximizing the predicted
probability of a possibly wrong label can be harmful.
• To prevent this, the label noise is modelled explicitly in the targets.
• Label smoothing is a mechanism to regularize a model with a softmax
output (a minimal sketch follows).
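A minimal sketch of label smoothing for a K-class softmax classifier (the eps value and function name are illustrative, not from the slides):

import numpy as np

def smooth_labels(y_onehot, eps=0.1):
    # Hard 0/1 targets become eps/K and 1 - eps + eps/K, so a softmax
    # output is never pushed all the way to exactly 0 or 1.
    k = y_onehot.shape[-1]
    return (1.0 - eps) * y_onehot + eps / k

y = np.eye(3)[[0, 2]]          # two hard one-hot labels, K = 3 classes
print(smooth_labels(y))        # rows like [0.933, 0.033, 0.033]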
Questions?
NEXT CLASS:
Semi-Supervised Learning, Multi-Task Learning