Sklearn | Model Hyper-parameters Tuning
Last Updated :
24 Apr, 2025
Hyperparameter tuning is the process of finding the optimal values for the hyperparameters of a machine-learning model. Hyperparameters are parameters that control the behaviour of the model but are not learned during training. Hyperparameter tuning is an important step in developing machine learning models because it can significantly improve the model's performance on new data. However, hyperparameter tuning can be a time-consuming and challenging task. Scikit-learn provides several tools that can help you tune the hyperparameters of your machine-learning models. In this guide, we will provide a comprehensive overview of hyperparameter tuning in Scikit-learn.
What are hyperparameters?
Hyperparameters are parameters that control the behaviour of a machine-learning model but are not learned during training. Some common examples of hyperparameters include:
- Regularization strength: This parameter controls how strongly the model is penalized for complexity; stronger regularization reduces overfitting.
- Number of trees: This parameter controls the number of trees in a random forest model.
- Learning rate: This parameter controls how quickly the model learns during training.
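In scikit-learn, the hyperparameters listed above are simply arguments you pass to an estimator's constructor; they are fixed before training rather than learned from data. A minimal sketch (the specific values are illustrative, not recommendations):
Python3
# Hyperparameters are set when the estimator is created, not learned during fit()
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Regularization strength: in LogisticRegression, C is the inverse of the regularization strength
log_reg = LogisticRegression(C=0.5)

# Number of trees: n_estimators controls how many trees the random forest builds
forest = RandomForestClassifier(n_estimators=200)

# Learning rate: controls how large a step each boosting iteration takes
booster = GradientBoostingClassifier(learning_rate=0.05)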
Why is hyperparameter tuning important?
Tuning hyperparameters is important because it can substantially improve a model's performance on new data. A poorly tuned model may have high bias (it underfits and misses patterns in the training data) or high variance (it overfits the training data); in either case it will not generalize well. A well-tuned model balances bias and variance, so it generalizes well to new data and makes accurate predictions.
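One common way to see this bias-variance trade-off in practice is a validation curve, which tracks training and cross-validation scores as a single hyperparameter varies. Below is a minimal sketch using scikit-learn's validation_curve; the SVC model and the range of C values are purely illustrative.
Python3
from sklearn.datasets import load_iris
from sklearn.model_selection import validation_curve
from sklearn.svm import SVC
import numpy as np

X, y = load_iris(return_X_y=True)
param_range = np.logspace(-3, 3, 7)   # candidate values for the regularization parameter C

# Training vs. cross-validation accuracy for each value of C
train_scores, val_scores = validation_curve(
    SVC(), X, y, param_name="C", param_range=param_range, cv=5)

for C, tr, va in zip(param_range, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"C={C:<8g} train={tr:.3f}  validation={va:.3f}")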
How to tune hyperparameters in Scikit-learn:
Scikit-Learn provides a variety of tools to help you tune the hyperparameters of your machine-learning models. A popular method is to use grid search.
GridSearchCV: Grid search is a brute-force method that iterates through all possible combinations of hyperparameter values. You can implement grid search in scikit-learn using the GridSearchCV class. The GridSearchCV class takes a machine learning model and a hyperparameter search space as input. A hyperparameter search space is a dictionary that defines the candidate values for each hyperparameter. Each combination is evaluated on held-out validation folds (cross-validation), and the combination that achieves the best score is selected as the optimal model.
Another popular way to tune hyperparameters is to use random search.
Random Search: Compared to grid search, random search is a cheaper method because it evaluates only a random sample of hyperparameter values. You can implement random search in scikit-learn using the RandomizedSearchCV class. The RandomizedSearchCV class takes a machine learning model and a hyperparameter distribution as input. A hyperparameter distribution is a dictionary that defines, for each hyperparameter, either a list of candidate values or a distribution to sample from. RandomizedSearchCV then trains and evaluates the model on randomly sampled hyperparameter combinations.
Each sampled combination is evaluated on held-out validation folds, and the combination that achieves the best performance is selected as the best model.
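The two search spaces differ only in how candidate values are specified: GridSearchCV expects explicit lists, while RandomizedSearchCV also accepts scipy.stats distributions to sample from. A minimal sketch of the two dictionaries, using SVC parameter names as in the worked examples later in this article (loguniform requires SciPy 1.4 or newer; the values shown are illustrative):
Python3
from scipy.stats import loguniform

# Grid search: every candidate value must be listed explicitly
param_grid = {
    'C': [0.1, 1, 10],
    'kernel': ['linear', 'rbf'],
}

# Random search: continuous distributions can be sampled instead of enumerated
param_dist = {
    'C': loguniform(1e-2, 1e2),   # sampled log-uniformly between 0.01 and 100
    'kernel': ['linear', 'rbf'],  # lists are sampled uniformly
}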
Advanced hyperparameter tuning techniques
In addition to grid search and random search, there are several other advanced hyperparameter tuning techniques that you can use in Scikit-learn. These techniques include:
- Bayesian optimization: Bayesian optimization is a sequential model-based optimization technique that can be used to search for the optimal hyperparameter values efficiently.
- Hyperband: Hyperband is a resource-efficient algorithm for hyperparameter tuning based on successive halving (scikit-learn's closest built-in analogue is sketched after this list).
- Tree-structured Parzen estimator (TPE): TPE is a sequential model-based optimization technique that models which hyperparameter values tend to produce good results and samples new candidates from the promising regions; the "tree-structured" part refers to the shape of the search space, not to tree-based models.
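Scikit-learn itself does not implement Bayesian optimization or TPE (external libraries such as scikit-optimize, Optuna and hyperopt do), but it does ship successive-halving searches that are close in spirit to Hyperband. A hedged sketch, assuming a scikit-learn version where the halving searches still sit behind the experimental import; the estimator and parameter ranges are illustrative:
Python3
# Successive halving: many candidates get a small budget, only the best survive to larger budgets
from sklearn.experimental import enable_halving_search_cv  # noqa: F401, enables the import below
from sklearn.model_selection import HalvingRandomSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
param_dist = {'n_estimators': [50, 100, 200], 'max_depth': [3, 5, None]}

halving_search = HalvingRandomSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_dist,
    factor=3,          # only the best 1/factor of the candidates survive each round
    random_state=0,
)
halving_search.fit(X, y)
print(halving_search.best_params_)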
Drawbacks of GridSearchCV:
- Computationally expensive: GridSearchCV evaluates every combination of hyperparameters in the grid, so it can become very expensive, especially when the search space is large or the dataset is big (a quick calculation after this list shows how fast the number of fits grows).
- Exhaustive search: GridSearchCV performs an exhaustive search over the parameter grid. It evaluates every combination, even ones that are unlikely to improve performance, which wastes computation.
- Not effective for large search spaces: With a large search space or many hyperparameters, GridSearchCV scales poorly because the number of combinations grows exponentially with the number of hyperparameters.
- Limited exploration: GridSearchCV only tries the values explicitly listed in the grid. Unlike random search, it introduces no randomness into the search, so promising values that fall between the grid points are never explored.
- Scalability issues: GridSearchCV may not work well with slow-to-train algorithms or very large datasets; an exhaustive search can become impractical on big data.
- No adaptive search: GridSearchCV does not update its search based on the results of previous evaluations. It does not learn from the performance of earlier hyperparameter combinations and may waste time in unpromising regions of the search space.
- Limited parallelization: GridSearchCV can be parallelized to some extent (for example via n_jobs), but the sheer number of fits can still exceed what is practical on multi-core processors or distributed computing environments.
- Does not solve model selection: GridSearchCV only tunes hyperparameters; it does not by itself choose between different models or algorithms. Model selection often involves comparing different types of machine learning models, which GridSearchCV does not handle on its own.
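To make the first drawback concrete, here is a quick back-of-the-envelope count of the model fits GridSearchCV performs for the SVC grid used later in this article (3 x 3 x 4 = 36 combinations, 5 cross-validation folds):
Python3
# Number of fits = product of the grid sizes x number of CV folds
from math import prod

param_grid = {
    'C': [0.1, 1, 10],                    # 3 values
    'kernel': ['linear', 'rbf', 'poly'],  # 3 values
    'gamma': [0.1, 1, 'scale', 'auto'],   # 4 values
}
cv = 5

n_combinations = prod(len(values) for values in param_grid.values())
print("candidate combinations:", n_combinations)   # 36
print("total model fits:", n_combinations * cv)    # 180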
SVC Algorithm
GridSearchCV
Python3
# Import necessary libraries
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42)
# Define the parameter grid to search over
param_grid = {
'C': [0.1, 1, 10],
'kernel': ['linear', 'rbf', 'poly'],
'gamma': [0.1, 1, 'scale', 'auto'],
}
# Create an SVM classifier
svm = SVC()
# Create a GridSearchCV object
grid_search = GridSearchCV(
estimator=svm, param_grid=param_grid, cv=5, n_jobs=-1)
# Fit the GridSearchCV object to the training data
grid_search.fit(X_train, y_train)
# Print the best hyperparameters and corresponding accuracy score
print("Best Hyperparameters: ", grid_search.best_params_)
print("Best Accuracy Score: {:.2f}%".format(grid_search.best_score_ * 100))
# Evaluate the model on the test set
best_svm = grid_search.best_estimator_
test_accuracy = best_svm.score(X_test, y_test)
print("Test Accuracy: {:.2f}%".format(test_accuracy * 100))
Output:
Best Hyperparameters: {'C': 0.1, 'gamma': 0.1, 'kernel': 'poly'}
Best Accuracy Score: 95.83%
Test Accuracy: 100.00%
- The output will display the best hyperparameters found during the grid search and the corresponding cross-validation accuracy score.
- It will also show the accuracy of the best model on the test set.
- The code is essentially performing hyperparameter optimization to find the best SVM model for the Iris dataset, and it reports the performance of the best model on unseen data.
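Beyond best_params_ and best_score_, GridSearchCV stores the full table of results in its cv_results_ attribute. A short sketch of how you might inspect it, assuming pandas is installed and the grid_search object above has already been fitted:
Python3
import pandas as pd

# cv_results_ is a dict of arrays; a DataFrame makes it easy to sort and inspect
results = pd.DataFrame(grid_search.cv_results_)
print(results[['params', 'mean_test_score', 'std_test_score', 'rank_test_score']]
      .sort_values('rank_test_score')
      .head())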
Random search
Python3
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from scipy.stats import uniform, expon
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.2, random_state=42)
# Define the parameter grid for Grid Search
param_grid = {
'C': [0.1, 1, 10],
'kernel': ['linear', 'rbf', 'poly'],
'gamma': [0.1, 1, 'scale', 'auto'],
}
# Define the parameter distributions for Random Search
param_dist = {
'C': uniform(0.1, 10),
'kernel': ['linear', 'rbf', 'poly'],
'gamma': expon(scale=1),
}
# Create an SVM classifier
svm = SVC()
# Create a GridSearchCV object
grid_search = GridSearchCV(
estimator=svm, param_grid=param_grid, cv=5, n_jobs=-1)
# Create a RandomizedSearchCV object
random_search = RandomizedSearchCV(
estimator=svm, param_distributions=param_dist, n_iter=50, cv=5, n_jobs=-1)
# Fit the GridSearchCV object to the training data
grid_search.fit(X_train, y_train)
# Fit the RandomizedSearchCV object to the training data
random_search.fit(X_train, y_train)
# Print the best hyperparameters and corresponding accuracy score for Grid Search
print("Grid Search - Best Hyperparameters: ", grid_search.best_params_)
print("Grid Search - Best Accuracy Score: {:.2f}%".format(grid_search.best_score_ * 100))
# Print the best hyperparameters and corresponding accuracy score for Random Search
print("Random Search - Best Hyperparameters: ", random_search.best_params_)
print("Random Search - Best Accuracy Score: {:.2f}%".format(random_search.best_score_ * 100))
# Evaluate the best models on the test set
best_svm_grid = grid_search.best_estimator_
best_svm_random = random_search.best_estimator_
test_accuracy_grid = best_svm_grid.score(X_test, y_test)
test_accuracy_random = best_svm_random.score(X_test, y_test)
print("Test Accuracy (Grid Search): {:.2f}%".format(test_accuracy_grid * 100))
print("Test Accuracy (Random Search): {:.2f}%".format(test_accuracy_random * 100))
Output:
Grid Search - Best Hyperparameters: {'C': 0.1, 'gamma': 0.1, 'kernel': 'poly'}
Grid Search - Best Accuracy Score: 95.83%
Random Search - Best Hyperparameters: {'C': 3.900736564361965, 'gamma': 0.4094567581571069, 'kernel': 'linear'}
Random Search - Best Accuracy Score: 96.67%
Test Accuracy (Grid Search): 100.00%
Test Accuracy (Random Search): 96.67%
The output will display the best hyperparameters found during grid search and random search, along with their corresponding cross-validation accuracy scores.
It will also show the accuracy of the best models found by both methods on the test set.
You can compare the performance of grid search and random search in finding the best hyperparameters for the SVM classifier.
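Note that with the default refit=True, both search objects refit the best estimator on the full training set, so they can be used directly for prediction without refitting by hand. A small sketch, continuing from the fitted objects above:
Python3
# The fitted search object delegates predict/score to its best_estimator_
y_pred = random_search.predict(X_test)
print("Accuracy of the refit best model:", (y_pred == y_test).mean())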
XGBoost algorithm
GridSearchCV
Python3
import xgboost as xgb
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn import datasets
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define the hyperparameters and their search ranges
param_grid = {
'n_estimators': [100, 200, 300],
'learning_rate': [0.01, 0.1, 0.2],
'max_depth': [3, 4, 5],
'min_child_weight': [1, 3, 5],
'subsample': [0.8, 0.9, 1.0],
'colsample_bytree': [0.8, 0.9, 1.0]
}
# Create an XGBoost model
xgb_model = xgb.XGBClassifier()
# Perform GridSearchCV
grid_search = GridSearchCV(xgb_model, param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)
# Get the best hyperparameters
best_params = grid_search.best_params_
# Fit the model with the best hyperparameters on the training data
best_model = grid_search.best_estimator_
best_model.fit(X_train, y_train)
# Evaluate the best model on the test set
accuracy = best_model.score(X_test, y_test)
print(f"Best Hyperparameters: {best_params}")
print(f"Accuracy on test set: {accuracy:.2f}")
Output:
Best Hyperparameters: {'colsample_bytree': 1.0, 'learning_rate': 0.01, 'max_depth': 3, 'min_child_weight': 1, 'n_estimators': 200, 'subsample': 1.0}
Accuracy on test set: 1.00
In this output:
- The best hyperparameters found by the grid search are listed.
- The accuracy on the test set is also reported, indicating how well the best model performs on unseen data.
- The goal of this code is to find the best hyperparameters for an XGBoost classifier and evaluate its performance on the test set
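Once the best XGBoost model has been selected, it can be inspected like any other fitted XGBClassifier. For example, a short sketch of reading its feature importances, continuing from the code above:
Python3
# Feature importances of the tuned model, paired with the Iris feature names
for name, importance in zip(iris.feature_names, best_model.feature_importances_):
    print(f"{name}: {importance:.3f}")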
Random search
Python3
import xgboost as xgb
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn import datasets
# Load the Iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Define the hyperparameter search space
param_dist = {
'n_estimators': [100, 200, 300, 400, 500],
'learning_rate': [0.01, 0.1, 0.2, 0.3, 0.4],
'max_depth': [3, 4, 5, 6, 7, 8, 9, 10],
'min_child_weight': [1, 3, 5, 7, 9],
'subsample': [0.8, 0.9, 1.0],
'colsample_bytree': [0.6, 0.7, 0.8, 0.9, 1.0],
'gamma': [0, 0.1, 0.2, 0.3, 0.4],
'lambda': [0, 0.1, 0.2, 0.3, 0.4]
}
# Create an XGBoost model
xgb_model = xgb.XGBClassifier()
# Perform RandomizedSearchCV
random_search = RandomizedSearchCV(xgb_model, param_distributions=param_dist, n_iter=100, cv=5, scoring='accuracy', random_state=42)
random_search.fit(X_train, y_train)
# Get the best hyperparameters
best_params = random_search.best_params_
# Fit the model with the best hyperparameters on the training data
best_model = random_search.best_estimator_
best_model.fit(X_train, y_train)
# Evaluate the best model on the test set
accuracy = best_model.score(X_test, y_test)
print(f"Best Hyperparameters: {best_params}")
print(f"Accuracy on test set: {accuracy:.2f}")
Output:
Best Hyperparameters: {'subsample': 0.8, 'n_estimators': 200, 'min_child_weight': 1, 'max_depth': 7, 'learning_rate': 0.01, 'lambda': 0.3, 'gamma': 0.3, 'colsample_bytree': 0.9}
Accuracy on test set: 1.00
In this output:
- The best hyperparameters found by the random search are listed.
- The accuracy on the test set is also reported, indicating how well the best model performs on unseen data.
- Randomized search is a more efficient way to explore hyperparameter space compared to grid search, especially when there are a large number of hyperparameters to consider.
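The search space above uses explicit lists, but RandomizedSearchCV also accepts scipy.stats distributions, which lets integer and continuous hyperparameters be sampled from ranges rather than from a fixed set of values. A hedged sketch of an alternative param_dist; the variable name and ranges are illustrative:
Python3
from scipy.stats import randint, uniform

# Distributions are sampled afresh for each of the n_iter candidates
param_dist_alt = {
    'n_estimators': randint(100, 500),      # integers in [100, 500)
    'learning_rate': uniform(0.01, 0.39),   # floats in [0.01, 0.40)
    'max_depth': randint(3, 11),            # integers in [3, 11)
    'subsample': uniform(0.8, 0.2),         # floats in [0.8, 1.0)
    'colsample_bytree': uniform(0.6, 0.4),  # floats in [0.6, 1.0)
}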
Logistic regression algorithm
GridSearchCV
Python3
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
import warnings
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Scale the data using StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Define the hyperparameters and their search ranges
param_grid = {
'C': [0.001, 0.01, 0.1, 1, 10, 100],
'penalty': ['l2'], # Only 'l2' penalty is compatible with 'lbfgs' solver
'solver': ['liblinear', 'lbfgs']
}
# Create a Logistic Regression model
logistic_regression = LogisticRegression(max_iter=1000)
# Perform GridSearchCV with warnings filtered
with warnings.catch_warnings():
    warnings.filterwarnings("ignore", category=UserWarning)
    grid_search = GridSearchCV(logistic_regression, param_grid, cv=5, scoring='accuracy')
    grid_search.fit(X_train_scaled, y_train)
# Get the best hyperparameters
best_params = grid_search.best_params_
# Fit the model with the best hyperparameters on the training data
best_model = grid_search.best_estimator_
best_model.fit(X_train_scaled, y_train)
# Evaluate the best model on the test set
accuracy = best_model.score(X_test_scaled, y_test)
print(f"Best Hyperparameters: {best_params}")
print(f"Accuracy on test set: {accuracy:.2f}")
Output:
Best Hyperparameters: {'C': 1, 'penalty': 'l2', 'solver': 'lbfgs'}
Accuracy on test set: 1.00
In this code:
- The best hyperparameters are reported, including 'C', 'penalty', and 'solver'.
- The accuracy on the test set indicates how well the logistic regression model with the best hyperparameters performs on unseen data. In this run, it achieves an accuracy of 1.00 (100%).
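One refinement worth knowing about: the code above scales the training data before cross-validation, so the scaler sees every fold's validation portion. Wrapping the scaler and the model in a Pipeline keeps the scaling inside each fold. A sketch of that pattern, reusing the unscaled X_train and y_train from above (the variable names are illustrative); note the step-prefixed parameter names:
Python3
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

pipe = Pipeline([
    ('scaler', StandardScaler()),
    ('clf', LogisticRegression(max_iter=1000)),
])

# Parameters of pipeline steps are addressed as <step name>__<parameter>
param_grid = {
    'clf__C': [0.001, 0.01, 0.1, 1, 10, 100],
    'clf__solver': ['liblinear', 'lbfgs'],
}

grid_search_pipe = GridSearchCV(pipe, param_grid, cv=5, scoring='accuracy')
grid_search_pipe.fit(X_train, y_train)   # unscaled data; scaling happens inside each CV fold
print(grid_search_pipe.best_params_)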
Random search
Python3
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler
import numpy as np
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Scale the data using StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
# Define the hyperparameter search space
param_dist = {
'C': np.logspace(-4, 4, 100), # Range of C values in logarithmic scale
'penalty': ['l2'], # Only 'l2' penalty is compatible with 'lbfgs' solver
'solver': ['lbfgs'] # Use only 'lbfgs' solver
}
# Create a Logistic Regression model
logistic_regression = LogisticRegression(max_iter=1000)
# Perform RandomizedSearchCV with error_score='raise'
random_search = RandomizedSearchCV(logistic_regression, param_distributions=param_dist, n_iter=100, cv=5, scoring='accuracy', random_state=42, error_score='raise')
random_search.fit(X_train_scaled, y_train)
# Get the best hyperparameters
best_params = random_search.best_params_
# Fit the model with the best hyperparameters on the training data
best_model = random_search.best_estimator_
best_model.fit(X_train_scaled, y_train)
# Evaluate the best model on the test set
accuracy = best_model.score(X_test_scaled, y_test)
print(f"Best Hyperparameters: {best_params}")
print(f"Accuracy on test set: {accuracy:.2f}")
Output:
Best Hyperparameters: {'solver': 'lbfgs', 'penalty': 'l2', 'C': 0.6280291441834259}
Accuracy on test set: 1.00
In this code:
- The best hyperparameters are reported, including 'C', 'penalty', and 'solver'.
- The accuracy on the test set indicates how well the logistic regression model with the best hyperparameters performs on unseen data. In this run, it achieves an accuracy of 1.00 (100%).
Conclusion
Hyperparameter tuning is an essential step in developing machine learning models. Tuning hyperparameters can significantly improve a model's performance on new data, and Scikit-learn provides several tools to help you tune the hyperparameters of your machine learning models.