Assignment 9: Model Selection and Optimization

Arnav Govindu
[email protected]

Model Selection
The goal of model selection is to identify a hypothesis or model that explains the data while capturing the complexity involved. One has to explore different models and variations within a model family, such as decision trees, neural networks, etc.

Hypothesis Space
This refers to the set of all possible models. For example, in polynomial regression, the degree of the polynomial directly determines the model's complexity.
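As a minimal sketch of this idea (assuming scikit-learn, with synthetic data chosen purely for illustration), each polynomial degree defines a different hypothesis class:

```python
# Each degree is one hypothesis class; higher degree = more complex models.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=50)

for degree in (1, 3, 9):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X, y)
    print(degree, model.score(X, y))  # training fit improves with complexity
```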

Underfitting and Overfitting
Underfitting is when the model is too simple to capture the full complexity of the patterns in the data, resulting in poor performance.
Overfitting is when the model is too complex and fits noise and irrelevant detail in the training data. This too results in poor performance.

Cross-Validation
This is a crucial technique for avoiding overfitting. The dataset is split into k subsets (folds). The model is trained on k-1 folds and validated on the remaining one. This process repeats k times, using a different fold each time, which yields a more reliable estimate of the model's error. Leave-one-out cross-validation (LOO-CV) is a special case of k-fold where each instance is used as a validation set once.
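A minimal sketch of k-fold cross-validation, assuming scikit-learn and using a built-in dataset and classifier purely for illustration:

```python
# Split the data into k folds and score the model on each held-out fold.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, KFold
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
cv = KFold(n_splits=5, shuffle=True, random_state=0)  # k = 5 folds
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=cv)
print(scores.mean(), scores.std())  # average validation accuracy across folds
```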

Bias-Variance Tradeoff
This is a key concept in model selection: the balance between bias and variance determines the model's ability to generalize.
Bias is the error introduced by approximating a real-world problem with a simplified model.
Variance is the error introduced by model complexity and sensitivity to the training data. High variance can cause overfitting.
A good model minimizes both bias and variance, while making sure generalization is optimized.
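For reference, the tradeoff is commonly summarized by the standard decomposition of expected squared error (a textbook identity, not taken verbatim from this chapter):

```latex
% Expected squared error at a point x decomposes into three terms:
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```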

Regularization
Regularization introduces a penalty for complex models. The penalty strength is controlled by a hyperparameter (λ).

L2 Regularization
This adds a penalty proportional to the squared values of the model parameters.

L1 Regularization
This adds a penalty proportional to the absolute values of the parameters, which tends to drive some of them to exactly zero.

Elastic Net
This is a combination of the L1 and L2 penalties.
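A minimal sketch of the three penalties, assuming scikit-learn, where alpha plays the role of the penalty strength λ:

```python
# Ridge (L2), Lasso (L1), and ElasticNet (a weighted mix of both).
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

X, y = make_regression(n_samples=100, n_features=20, noise=5.0, random_state=0)

for model in (Ridge(alpha=1.0), Lasso(alpha=1.0),
              ElasticNet(alpha=1.0, l1_ratio=0.5)):
    model.fit(X, y)
    n_zero = (model.coef_ == 0).sum()  # L1 tends to zero out coefficients
    print(type(model).__name__, n_zero)
```
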
Hyperparameter Tuning
Hyperparameters are external configurations of a model, such as the depth of a decision tree or the regularization strength. They are not learned during training and must be optimized separately.
Grid Search
A brute-force method where all combinations of hyperparameters are tested exhaustively. It is thorough but inefficient.
Random Search
Instead of trying every combination, random values for the hyperparameters are sampled and tested. This is often more efficient.
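A minimal sketch contrasting the two strategies, assuming scikit-learn; the model and parameter grid are illustrative choices:

```python
# Grid search tries every combination; random search samples n_iter of them.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
params = {"max_depth": [2, 4, 8, None], "n_estimators": [50, 100, 200]}

grid = GridSearchCV(RandomForestClassifier(random_state=0), params, cv=5)
grid.fit(X, y)  # evaluates all 12 combinations

rand = RandomizedSearchCV(RandomForestClassifier(random_state=0), params,
                          n_iter=5, cv=5, random_state=0)
rand.fit(X, y)  # evaluates only 5 sampled combinations
print(grid.best_params_, rand.best_params_)
```
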
Bayesian Optimization
This method takes a probabilistic approach: it builds a model mapping hyperparameters to the objective and uses it to choose the next set of hyperparameters to evaluate. Of the methods mentioned, it is the most efficient.
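A minimal sketch of this idea using Optuna (an assumed library choice; several libraries implement Bayesian-style optimization), whose default sampler models the objective in order to propose promising trials:

```python
# Optuna is assumed here for illustration; its default TPE sampler
# uses past trial results to pick the next hyperparameters to try.
import optuna
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

def objective(trial):
    # Search C on a log scale; the sampler focuses on promising regions.
    C = trial.suggest_float("C", 1e-3, 1e3, log=True)
    return cross_val_score(SVC(C=C), X, y, cv=5).mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=25)
print(study.best_params)
```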

Learning Curve
Learning curves are a useful tool for checking whether a model is underfitting or overfitting. If both training and validation errors are high, the model is likely underfitting. If the training error is low but the validation error is high, the model is likely overfitting. The chapter also discusses how increasing the size of the model initially reduces training error but eventually increases validation error as overfitting starts to occur.
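A minimal sketch of computing learning-curve data, assuming scikit-learn; the dataset and estimator are illustrative:

```python
# Compare training and validation error as the training set grows.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
sizes, train_scores, val_scores = learning_curve(
    DecisionTreeClassifier(random_state=0), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    # High train and val error -> underfitting; low train, high val -> overfitting.
    print(n, round(1 - tr, 3), round(1 - va, 3))
```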

Practical Considerations
This section also offers practical advice on selecting models in real-world scenarios.
> Start with simple models and gradually increase complexity.
> Use regularization to keep models from overfitting, especially on high-dimensional datasets.
> Automated hyperparameter tuning methods, like Bayesian optimization, are highly efficient at finding good solutions without trial and error.
> Use a separate test set for final model evaluation to avoid biased estimates of performance.
