ML06 Classical Techniques

Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

3
k-Nearest Neighbors
• Find a predefined number of training samples (k) closest in distance to
the new point and predict the label from them: regression or
classification.
• The number of samples can be a user-defined constant (k-nearest
neighbor learning) or vary based on the local density of points (radius-
based neighbor learning).
• The distance can be any metric measure: standard Euclidean distance
is the most common choice.
• Reference: https://scikit-learn.org/stable/modules/neighbors.html

4
Nearest Neighbors Classification
class sklearn.neighbors.KNeighborsClassifier(n_neighbors=5,
weights='uniform', … )

• weights can be 'uniform' (all points in each neighborhood are weighted equally) or 'distance' (points are weighted by the inverse of their distance).
• Example:
from sklearn.neighbors import KNeighborsClassifier
knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train, y_train)
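
The example above assumes X_train and y_train already exist; a self-contained sketch (the iris dataset and the train/test split are illustrative assumptions, not part of the slide):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)          # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

knn_clf = KNeighborsClassifier(n_neighbors=5, weights="distance")  # inverse-distance weighting
knn_clf.fit(X_train, y_train)
print(knn_clf.score(X_test, y_test))       # mean accuracy on the test set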

5
Nearest Neighbors Regression
class sklearn.neighbors.KNeighborsRegressor(n_neighbors=5,
weights='uniform', … )

• The label assigned to a query point is computed based on the mean of the labels of its nearest neighbors.
• Example:
from sklearn.neighbors import KNeighborsRegressor
model = KNeighborsRegressor(n_neighbors=3)
model.fit(X, y)
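
The example above leaves X and y undefined; a self-contained sketch with assumed one-dimensional toy data:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.RandomState(42)            # illustrative data
X = np.sort(5 * rng.rand(40, 1), axis=0)
y = np.sin(X).ravel()

model = KNeighborsRegressor(n_neighbors=3)
model.fit(X, y)
print(model.predict([[2.5]]))              # mean of the 3 nearest neighbors' targets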

6
Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

7
Support Vector Machine (SVM)
• Very powerful and versatile Machine Learning model, capable of
performing linear or nonlinear classification, regression, and outlier
detection.
• Well suited for classification of complex but small- or medium-sized
datasets.
• SVM gives large margin classification.

8
Linear SVM Classification
• The decision boundary is fully determined by the instances located on the edge of the street (the margin). These instances are called the support vectors.
• SVMs are sensitive to the feature scales.

9
Soft Margin Classification
• Hard margin classification cannot handle linearly inseparable classes
and is sensitive to outliers.

• Soft margin classification finds a balance between keeping the margin as large as possible and limiting the margin violations.
10
Soft Margin Classification
• You can control this trade-off (how many margin violations are allowed) with the C hyperparameter.

• If your SVM model is overfitting, you can try regularizing it by reducing C.
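
A minimal sketch of the effect of C (the dataset and the C values below are illustrative assumptions, anticipating the Iris example a few slides ahead):

from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = load_iris(as_frame=True)
X = iris.data[["petal length (cm)", "petal width (cm)"]].values
y = (iris.target == 2)  # Iris virginica

# Smaller C = wider margin but more violations (stronger regularization);
# larger C = fewer violations but a narrower margin (risk of overfitting).
for C in (0.01, 1, 100):
    svm_clf = make_pipeline(StandardScaler(), LinearSVC(C=C, random_state=42))
    svm_clf.fit(X, y)
    print(C, svm_clf.score(X, y))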

11
Iris Dataset

• A famous dataset that contains the sepal and petal length and width of 150 iris flowers of three different species: Setosa, Versicolor, and Virginica.

12
SVM Classification Example
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = load_iris(as_frame=True)
X = iris.data[["petal length (cm)", "petal width (cm)"]].values
y = (iris.target == 2)  # Iris virginica

svm_clf = make_pipeline(StandardScaler(),
                        LinearSVC(C=1, random_state=42))
svm_clf.fit(X, y)

>>> X_new = [[5.5, 1.7], [5.0, 1.5]]
>>> svm_clf.predict(X_new)
array([ True, False])
>>> svm_clf.decision_function(X_new)
array([ 0.66163411, -0.22036063])
13
Nonlinear SVM Classification
• The SVC class supports nonlinear classification through the kernel hyperparameter (e.g. kernel="poly").
• With a polynomial kernel, the coef0 hyperparameter controls how much the model is influenced by high-degree polynomials versus low-degree ones.
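
A hedged sketch of a polynomial-kernel classifier (the make_moons dataset and the specific hyperparameter values are illustrative assumptions, not taken from the slide):

from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=100, noise=0.15, random_state=42)  # illustrative data

# degree sets the polynomial degree; coef0 balances high- vs. low-degree terms.
poly_kernel_svm_clf = make_pipeline(StandardScaler(),
                                    SVC(kernel="poly", degree=3, coef0=1, C=5))
poly_kernel_svm_clf.fit(X, y)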

14
Gaussian Radial Basis Function

• The Gaussian RBF can be used to compute similarity features (x2 and x3) for a one-dimensional dataset, using two landmarks at x1 = –2 and x1 = 1.
• The transformed dataset becomes linearly separable.
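
For reference, the Gaussian RBF similarity function used to build these features (a standard definition, not shown on the slide) is:

\phi_{\gamma}(\mathbf{x}, \boldsymbol{\ell}) = \exp\left(-\gamma \,\lVert \mathbf{x} - \boldsymbol{\ell} \rVert^{2}\right)

It is bell-shaped: equal to 1 at the landmark ℓ and decaying toward 0 with distance, with gamma controlling how fast it decays.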

15
Gaussian RBF Kernel
• Popular with SVMs for solving nonlinear problems.

• The similarity-features approach transforms a training set with m instances and n features into m instances and m features (one landmark per instance).
• Both gamma and C act as regularization hyperparameters: smaller values regularize more (reduce them if the model is overfitting, increase them if it is underfitting).
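
A hedged sketch (again with make_moons as an assumed dataset) of the kernel trick version, which obtains a similar result without actually creating the similarity features:

from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=100, noise=0.15, random_state=42)  # illustrative data

# Large gamma -> narrower bell curves -> more irregular decision boundary.
rbf_kernel_svm_clf = make_pipeline(StandardScaler(),
                                   SVC(kernel="rbf", gamma=5, C=0.001))
rbf_kernel_svm_clf.fit(X, y)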

16
Gaussian RBF Kernel

17
Linear SVM Regression
• SVM regression fits as many instances as possible on the street (within the margin) while limiting margin violations. The width of the street is controlled by the hyperparameter ϵ.
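
A minimal sketch (the linear toy data is an assumption; LinearSVR is scikit-learn's linear SVM regressor):

import numpy as np
from sklearn.svm import LinearSVR

rng = np.random.RandomState(42)            # illustrative linear data
X = 2 * rng.rand(50, 1)
y = (4 + 3 * X + rng.randn(50, 1)).ravel()

# epsilon sets the width of the street; instances inside it do not affect the fit.
svm_reg = LinearSVR(epsilon=0.5, random_state=42)
svm_reg.fit(X, y)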

18
Nonlinear SVM Regression
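
The slide's figure is not reproduced here; a hedged sketch of kernelized SVM regression on assumed quadratic toy data:

import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(42)            # illustrative quadratic data
X = 2 * rng.rand(50, 1) - 1
y = (0.2 + 0.1 * X + 0.5 * X ** 2 + rng.randn(50, 1) / 10).ravel()

# The SVR class supports the same kernels as SVC (poly, rbf, ...).
svm_poly_reg = SVR(kernel="poly", degree=2, C=100, epsilon=0.1)
svm_poly_reg.fit(X, y)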

19
SVM Conclusion

• LinearSVC has a training time complexity of about O(m × n).

• The (kernelized) SVC class has a training time complexity between O(m² × n) and O(m³ × n), so it becomes very slow as the number of instances m grows.

• This algorithm is therefore best suited for complex but small or medium-sized training sets. It does, however, scale well with the number of features.

20
Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

21
Decision Trees
• Decision Trees are versatile Machine Learning algorithms that can
perform both classification and regression tasks, and even
multioutput tasks.
• They are very powerful algorithms, capable of fitting complex
datasets.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
iris = load_iris(as_frame=True)
X_iris = iris.data[["petal length (cm)", "petal width (cm)"]].values
y_iris = iris.target
tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X_iris, y_iris)
22
Visualizing a Decision Tree

23
Visualizing a Decision Tree
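
The original slides show the fitted tree as images. As a sketch, a similar figure can be produced with sklearn.tree.plot_tree (the plotting approach here is an assumption, not necessarily what the slides used):

import matplotlib.pyplot as plt
from sklearn.tree import plot_tree

# tree_clf and iris come from the previous slide.
plt.figure(figsize=(10, 6))
plot_tree(tree_clf,
          feature_names=["petal length (cm)", "petal width (cm)"],
          class_names=iris.target_names,
          filled=True)
plt.show()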

24
Regularization Hyperparameters
• Increase min_* or decrease max_*: max_depth=None,
min_samples_split=2, min_samples_leaf=1,
min_weight_fraction_leaf=0.0, max_features=None,
max_leaf_nodes=None
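
A brief sketch of the effect (make_moons and min_samples_leaf=5 are illustrative assumptions):

from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=150, noise=0.2, random_state=42)  # illustrative data

tree_clf1 = DecisionTreeClassifier(random_state=42)                      # unrestricted: likely to overfit
tree_clf2 = DecisionTreeClassifier(min_samples_leaf=5, random_state=42)  # regularized
tree_clf1.fit(X, y)
tree_clf2.fit(X, y)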

25
Decision Trees Regression
from sklearn.tree import DecisionTreeRegressor
tree_reg = DecisionTreeRegressor(max_depth=2)
tree_reg.fit(X, y)
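
The slide does not define X and y; a self-contained sketch with assumed noisy quadratic data:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)            # illustrative data
X = rng.rand(200, 1) - 0.5
y = X[:, 0] ** 2 + 0.025 * rng.randn(200)

tree_reg = DecisionTreeRegressor(max_depth=2, random_state=42)
tree_reg.fit(X, y)
print(tree_reg.predict([[0.1]]))  # prediction = mean target value of the matching leaf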

26
Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

27
Ensemble Learning and Random Forests

• A group of predictors is called an ensemble.


• You can train a group of Decision Tree classifiers, each on a different
random subset of the training set.
• To make predictions, obtain the predictions of all individual trees, then predict the class that gets the most votes (hard voting),
• or predict the class with the highest average class probability (soft voting).
• Such an ensemble of Decision Trees is called a Random Forest.

28
Voting Classifiers
• If each classifier is a weak learner (meaning it does only slightly
better than random guessing), the ensemble can be a strong learner
(achieving high accuracy).

29
Scikit-Learn Voting Classifier 1/2

30
Scikit-Learn Voting Classifier 2/2
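
The original slides show this example as screenshots; a hedged reconstruction of a typical scikit-learn VotingClassifier (the chosen estimators and the moons dataset are assumptions):

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),
    ],
    voting="soft",  # "hard" = majority vote on predicted classes
)
voting_clf.fit(X_train, y_train)
print(voting_clf.score(X_test, y_test))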

31
Bagging and Pasting
• Use the same training algorithm for every predictor but train them
on different random subsets of the training set.
• When sampling is performed with replacement, this method is called
bagging (short for bootstrap aggregating).
• When sampling is performed without replacement, it is called
pasting.
• The aggregation function is the most frequent prediction (hard
voting) for classification, highest probability (soft voting), or the
average for regression.

32
Bagging Demonstration

33
Bagging and Pasting

Set n_jobs=-1 to use all available CPU cores.
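
The slide's code is shown as an image; a hedged reconstruction of a typical BaggingClassifier setup (the dataset and hyperparameter values are assumptions):

from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)  # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 500 trees, each trained on 100 instances sampled with replacement (bagging);
# bootstrap=False would give pasting; n_jobs=-1 uses all available CPU cores.
bag_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=500,
                            max_samples=100, bootstrap=True,
                            n_jobs=-1, random_state=42)
bag_clf.fit(X_train, y_train)
print(bag_clf.score(X_test, y_test))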

34
Random Forests
• An ensemble of Decision Trees, generally trained via the bagging method, typically with max_samples set to the size of the training set; at each node, the best split is searched for among a random subset of the features.

• Roughly equivalent to a BaggingClassifier of Decision Trees that each consider a random sample of features at every split (about √n features by default for classification).
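
A minimal sketch (the moons dataset and hyperparameter values are illustrative assumptions):

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)  # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

rnd_clf = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16,
                                 n_jobs=-1, random_state=42)
rnd_clf.fit(X_train, y_train)
print(rnd_clf.score(X_test, y_test))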

35
Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

36
Exercises

1. Train and fine-tune an SVM regressor on the California housing dataset. You can use the original dataset rather than the tweaked version we used in Chapter 2, which you can load using sklearn.datasets.fetch_california_housing(). The targets represent hundreds of thousands of dollars. Since there are over 20,000 instances, SVMs can be slow, so for hyperparameter tuning you should use far fewer instances (e.g., 2,000) to test many more hyperparameter combinations. What is your best model's RMSE?
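
One possible approach, sketched with illustrative hyperparameter ranges (the exact search space and search strategy are assumptions):

import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import fetch_california_housing
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

housing = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(
    housing.data, housing.target, random_state=42)

pipeline = make_pipeline(StandardScaler(), SVR())
param_distribs = {"svr__C": loguniform(0.1, 100),      # illustrative ranges
                  "svr__gamma": loguniform(0.001, 1)}
search = RandomizedSearchCV(pipeline, param_distribs, n_iter=20, cv=3,
                            scoring="neg_root_mean_squared_error", random_state=42)
search.fit(X_train[:2000], y_train[:2000])   # tune on a subset to keep SVR fast

final_model = search.best_estimator_.fit(X_train, y_train)  # refit on the full training set
rmse = np.sqrt(mean_squared_error(y_test, final_model.predict(X_test)))
print(search.best_params_, rmse)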

37
Exercises
2. Train and fine-tune a Decision Tree for the moons dataset.
a) Generate a moons dataset using make_moons(n_samples=10000,
noise=0.4).
b) Split it into a training set and a test set using train_test_split().
c) Use grid search with cross-validation (with the help of the GridSearchCV class)
to find good hyperparameter values for a DecisionTreeClassifier. Hint: try
various values for max_leaf_nodes.
d) Train it on the full training set using these hyperparameters, and measure your model's performance on the test set. You should get roughly 85% to 87% accuracy. A possible sketch follows below.
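
A possible sketch (the max_leaf_nodes grid is an illustrative assumption):

from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=10000, noise=0.4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

params = {"max_leaf_nodes": [2, 5, 10, 20, 50, 100]}   # illustrative grid
grid_search = GridSearchCV(DecisionTreeClassifier(random_state=42), params, cv=3)
grid_search.fit(X_train, y_train)   # refits the best model on the full training set

print(grid_search.best_params_)
print(grid_search.score(X_test, y_test))   # roughly 0.85 to 0.87 expected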

38
Exercises

3. Load the MNIST dataset and split it into a training set and a test set
(take the first 60,000 instances for training, and the remaining
10,000 for testing). Train a random forest classifier on the dataset
and time how long it takes, then evaluate the resulting model on
the test set.
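
A possible sketch (loading MNIST via fetch_openml is an assumption about the data source):

import time
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier

mnist = fetch_openml("mnist_784", as_frame=False)
X_train, y_train = mnist.data[:60000], mnist.target[:60000]
X_test, y_test = mnist.data[60000:], mnist.target[60000:]

rnd_clf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=42)
t0 = time.time()
rnd_clf.fit(X_train, y_train)
print(f"Training took {time.time() - t0:.1f} s")
print(rnd_clf.score(X_test, y_test))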

39
Summary

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

40
