ML06 Classical Techniques

Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

3
k-Nearest Neighbors
• Find a predefined number of training samples (k) closest in distance to
the new point and predict the label from them: regression or
classification.
• The number of samples can be a user-defined constant (k-nearest
neighbor learning) or vary based on the local density of points (radius-
based neighbor learning).
• The distance can be any metric measure: standard Euclidean distance
is the most common choice.
• Reference: https://scikit-learn.org/stable/modules/neighbors.html

4
Nearest Neighbors Classification
class sklearn.neighbors.KNeighborsClassifier(n_neighbors=5,
weights='uniform', … )

• weights can be 'uniform' (all points in each neighborhood are weighted equally) or 'distance' (points are weighted by the inverse of their distance).
• Example:
from sklearn.neighbors import KNeighborsClassifier
knn_clf = KNeighborsClassifier()
knn_clf.fit(X_train, y_train)
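
The example above assumes X_train and y_train already exist; a self-contained sketch (the iris dataset and the train/test split are illustrative assumptions, not part of the slide):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)          # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

knn_clf = KNeighborsClassifier(n_neighbors=5, weights="distance")  # inverse-distance weighting
knn_clf.fit(X_train, y_train)
print(knn_clf.score(X_test, y_test))       # mean accuracy on the test set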

5
Nearest Neighbors Regression
class sklearn.neighbors.KNeighborsRegressor(n_neighbors=5,
weights='uniform', … )

• The label assigned to a query point is computed based on the mean of the labels of its nearest neighbors.
• Example:
from sklearn.neighbors import KNeighborsRegressor
model = KNeighborsRegressor(n_neighbors=3)
model.fit(X, y)
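
The example above leaves X and y undefined; a self-contained sketch with assumed one-dimensional toy data:

import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.RandomState(42)            # illustrative data
X = np.sort(5 * rng.rand(40, 1), axis=0)
y = np.sin(X).ravel()

model = KNeighborsRegressor(n_neighbors=3)
model.fit(X, y)
print(model.predict([[2.5]]))              # mean of the 3 nearest neighbors' targets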

6
Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

7
Support Vector Machine (SVM)
• Very powerful and versatile Machine Learning model, capable of
performing linear or nonlinear classification, regression, and outlier
detection.
• Well suited for classification of complex but small- or medium-sized
datasets.
• SVM gives large margin classification.

8
Linear SVM Classification
• The decision boundary is fully determined by the instances located on the edge of the street (the margin). These instances are called the support vectors.
• SVMs are sensitive to the feature scales.

9
Soft Margin Classification
• Hard margin classification cannot handle linearly inseparable classes
and is sensitive to outliers.

• Soft margin classification finds a balance between keeping the margin as large as possible and limiting the margin violations.
10
Soft Margin Classification
• You can control this trade-off (how many margin violations are allowed) with the C hyperparameter.

• If your SVM model is overfitting, you can try regularizing it by reducing C.
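
A minimal sketch of the effect of C (the dataset and the C values below are illustrative assumptions, anticipating the Iris example a few slides ahead):

from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = load_iris(as_frame=True)
X = iris.data[["petal length (cm)", "petal width (cm)"]].values
y = (iris.target == 2)  # Iris virginica

# Smaller C = wider margin but more violations (stronger regularization);
# larger C = fewer violations but a narrower margin (risk of overfitting).
for C in (0.01, 1, 100):
    svm_clf = make_pipeline(StandardScaler(), LinearSVC(C=C, random_state=42))
    svm_clf.fit(X, y)
    print(C, svm_clf.score(X, y))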

11
Iris Dataset

• A famous dataset that contains the sepal and petal length and width of 150 iris flowers of three different species: Setosa, Versicolor, and Virginica.

12
SVM Classification Example
from sklearn.datasets import load_iris
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

iris = load_iris(as_frame=True)
X = iris.data[["petal length (cm)", "petal width (cm)"]].values
y = (iris.target == 2)  # Iris virginica

svm_clf = make_pipeline(StandardScaler(),
                        LinearSVC(C=1, random_state=42))
svm_clf.fit(X, y)

>>> X_new = [[5.5, 1.7], [5.0, 1.5]]
>>> svm_clf.predict(X_new)
array([ True, False])
>>> svm_clf.decision_function(X_new)
array([ 0.66163411, -0.22036063])
13
Nonlinear SVM Classification
• The SVC class supports nonlinear classification through the kernel hyperparameter (e.g. kernel="poly").
• With a polynomial kernel, the coef0 hyperparameter controls how much the model is influenced by high-degree polynomials versus low-degree ones.
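
A hedged sketch of a polynomial-kernel classifier (the make_moons dataset and the specific hyperparameter values are illustrative assumptions, not taken from the slide):

from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=100, noise=0.15, random_state=42)  # illustrative data

# degree sets the polynomial degree; coef0 balances high- vs. low-degree terms.
poly_kernel_svm_clf = make_pipeline(StandardScaler(),
                                    SVC(kernel="poly", degree=3, coef0=1, C=5))
poly_kernel_svm_clf.fit(X, y)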

14
Gaussian Radial Basis Function

• The Gaussian RBF can be used to compute similarity features (x2 and x3) for a one-dimensional dataset, using two landmarks at x1 = –2 and x1 = 1.
• The transformed dataset becomes linearly separable.
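
For reference, the Gaussian RBF similarity function used to build these features (a standard definition, not shown on the slide) is:

\phi_{\gamma}(\mathbf{x}, \boldsymbol{\ell}) = \exp\left(-\gamma \,\lVert \mathbf{x} - \boldsymbol{\ell} \rVert^{2}\right)

It is bell-shaped: equal to 1 at the landmark ℓ and decaying toward 0 with distance, with gamma controlling how fast it decays.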

15
Gaussian RBF Kernel
• Popular with SVMs for solving nonlinear problems.

• The similarity-features approach transforms a training set with m instances and n features into m instances and m features (one landmark per instance).
• Both gamma and C act as regularization hyperparameters: smaller values regularize more (reduce them if the model is overfitting, increase them if it is underfitting).
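
A hedged sketch (again with make_moons as an assumed dataset) of the kernel trick version, which obtains a similar result without actually creating the similarity features:

from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_moons(n_samples=100, noise=0.15, random_state=42)  # illustrative data

# Large gamma -> narrower bell curves -> more irregular decision boundary.
rbf_kernel_svm_clf = make_pipeline(StandardScaler(),
                                   SVC(kernel="rbf", gamma=5, C=0.001))
rbf_kernel_svm_clf.fit(X, y)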

16
Gaussian RBF Kernel

17
Linear SVM Regression
• SVM regression fits as many instances as possible on the street (within the margin) while limiting margin violations. The width of the street is controlled by the hyperparameter ϵ.
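
A minimal sketch (the linear toy data is an assumption; LinearSVR is scikit-learn's linear SVM regressor):

import numpy as np
from sklearn.svm import LinearSVR

rng = np.random.RandomState(42)            # illustrative linear data
X = 2 * rng.rand(50, 1)
y = (4 + 3 * X + rng.randn(50, 1)).ravel()

# epsilon sets the width of the street; instances inside it do not affect the fit.
svm_reg = LinearSVR(epsilon=0.5, random_state=42)
svm_reg.fit(X, y)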

18
Nonlinear SVM Regression
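
The slide's figure is not reproduced here; a hedged sketch of kernelized SVM regression on assumed quadratic toy data:

import numpy as np
from sklearn.svm import SVR

rng = np.random.RandomState(42)            # illustrative quadratic data
X = 2 * rng.rand(50, 1) - 1
y = (0.2 + 0.1 * X + 0.5 * X ** 2 + rng.randn(50, 1) / 10).ravel()

# The SVR class supports the same kernels as SVC (poly, rbf, ...).
svm_poly_reg = SVR(kernel="poly", degree=2, C=100, epsilon=0.1)
svm_poly_reg.fit(X, y)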

19
SVM Conclusion

• LinearSVC has a training time complexity of about O(m × n).

• The (kernelized) SVC class has a training time complexity between O(m² × n) and O(m³ × n), so it becomes very slow as the number of instances m grows.

• This algorithm is therefore best suited for complex but small or medium-sized training sets. It does, however, scale well with the number of features.

20
Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

21
Decision Trees
• Decision Trees are versatile Machine Learning algorithms that can
perform both classification and regression tasks, and even
multioutput tasks.
• They are very powerful algorithms, capable of fitting complex
datasets.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
iris = load_iris(as_frame=True)
X_iris = iris.data[["petal length (cm)", "petal width (cm)"]].values
y_iris = iris.target
tree_clf = DecisionTreeClassifier(max_depth=2, random_state=42)
tree_clf.fit(X_iris, y_iris)
22
Visualizing a Decision Tree

23
Visualizing a Decision Tree
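
The original slides show the fitted tree as images. As a sketch, a similar figure can be produced with sklearn.tree.plot_tree (the plotting approach here is an assumption, not necessarily what the slides used):

import matplotlib.pyplot as plt
from sklearn.tree import plot_tree

# tree_clf and iris come from the previous slide.
plt.figure(figsize=(10, 6))
plot_tree(tree_clf,
          feature_names=["petal length (cm)", "petal width (cm)"],
          class_names=iris.target_names,
          filled=True)
plt.show()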

24
Regularization Hyperparameters
• Increase min_* or decrease max_*: max_depth=None,
min_samples_split=2, min_samples_leaf=1,
min_weight_fraction_leaf=0.0, max_features=None,
max_leaf_nodes=None
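
A brief sketch of the effect (make_moons and min_samples_leaf=5 are illustrative assumptions):

from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=150, noise=0.2, random_state=42)  # illustrative data

tree_clf1 = DecisionTreeClassifier(random_state=42)                      # unrestricted: likely to overfit
tree_clf2 = DecisionTreeClassifier(min_samples_leaf=5, random_state=42)  # regularized
tree_clf1.fit(X, y)
tree_clf2.fit(X, y)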

25
Decision Trees Regression
from sklearn.tree import DecisionTreeRegressor
tree_reg = DecisionTreeRegressor(max_depth=2)
tree_reg.fit(X, y)
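
The slide does not define X and y; a self-contained sketch with assumed noisy quadratic data:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.RandomState(42)            # illustrative data
X = rng.rand(200, 1) - 0.5
y = X[:, 0] ** 2 + 0.025 * rng.randn(200)

tree_reg = DecisionTreeRegressor(max_depth=2, random_state=42)
tree_reg.fit(X, y)
print(tree_reg.predict([[0.1]]))  # prediction = mean target value of the matching leaf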

26
Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

27
Ensemble Learning and Random Forests

• A group of predictors is called an ensemble.


• You can train a group of Decision Tree classifiers, each on a different
random subset of the training set.
• To make predictions, obtain the predictions of all individual trees, then predict the class that gets the most votes (hard voting),
• or predict the class with the highest average class probability (soft voting).
• Such an ensemble of Decision Trees is called a Random Forest.

28
Voting Classifiers
• If each classifier is a weak learner (meaning it does only slightly
better than random guessing), the ensemble can be a strong learner
(achieving high accuracy).

29
Scikit-Learn Voting Classifier 1/2

30
Scikit-Learn Voting Classifier 2/2
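
The original slides show this example as screenshots; a hedged reconstruction of a typical scikit-learn VotingClassifier (the chosen estimators and the moons dataset are assumptions):

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(random_state=42)),
        ("rf", RandomForestClassifier(random_state=42)),
        ("svc", SVC(probability=True, random_state=42)),
    ],
    voting="soft",  # "hard" = majority vote on predicted classes
)
voting_clf.fit(X_train, y_train)
print(voting_clf.score(X_test, y_test))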

31
Bagging and Pasting
• Use the same training algorithm for every predictor but train them
on different random subsets of the training set.
• When sampling is performed with replacement, this method is called
bagging (short for bootstrap aggregating).
• When sampling is performed without replacement, it is called
pasting.
• The aggregation function is the most frequent prediction (hard
voting) for classification, highest probability (soft voting), or the
average for regression.

32
Bagging Demonstration

33
Bagging and Pasting

Set n_jobs=-1 to use all available CPU cores.
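
The slide's code is shown as an image; a hedged reconstruction of a typical BaggingClassifier setup (the dataset and hyperparameter values are assumptions):

from sklearn.datasets import make_moons
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)  # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# 500 trees, each trained on 100 instances sampled with replacement (bagging);
# bootstrap=False would give pasting; n_jobs=-1 uses all available CPU cores.
bag_clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=500,
                            max_samples=100, bootstrap=True,
                            n_jobs=-1, random_state=42)
bag_clf.fit(X_train, y_train)
print(bag_clf.score(X_test, y_test))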

34
Random Forests
• An ensemble of Decision Trees, generally trained via the bagging method, typically with max_samples set to the size of the training set; at each node, the best split is searched for among a random subset of the features.

• Roughly equivalent to a BaggingClassifier of Decision Trees that each consider a random sample of features at every split (about √n features by default for classification).
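
A minimal sketch (the moons dataset and hyperparameter values are illustrative assumptions):

from sklearn.datasets import make_moons
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_moons(n_samples=500, noise=0.30, random_state=42)  # illustrative data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

rnd_clf = RandomForestClassifier(n_estimators=500, max_leaf_nodes=16,
                                 n_jobs=-1, random_state=42)
rnd_clf.fit(X_train, y_train)
print(rnd_clf.score(X_test, y_test))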

35
Outline

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

36
Exercises

1. Train and fine-tune an SVM regressor on the California housing dataset. You can use the original dataset rather than the tweaked version we used in Chapter 2, which you can load using sklearn.datasets.fetch_california_housing(). The targets represent hundreds of thousands of dollars. Since there are over 20,000 instances, SVMs can be slow, so for hyperparameter tuning you should use far fewer instances (e.g., 2,000) to test many more hyperparameter combinations. What is your best model's RMSE?
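
One possible approach, sketched with illustrative hyperparameter ranges (the exact search space and search strategy are assumptions):

import numpy as np
from scipy.stats import loguniform
from sklearn.datasets import fetch_california_housing
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

housing = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(
    housing.data, housing.target, random_state=42)

pipeline = make_pipeline(StandardScaler(), SVR())
param_distribs = {"svr__C": loguniform(0.1, 100),      # illustrative ranges
                  "svr__gamma": loguniform(0.001, 1)}
search = RandomizedSearchCV(pipeline, param_distribs, n_iter=20, cv=3,
                            scoring="neg_root_mean_squared_error", random_state=42)
search.fit(X_train[:2000], y_train[:2000])   # tune on a subset to keep SVR fast

final_model = search.best_estimator_.fit(X_train, y_train)  # refit on the full training set
rmse = np.sqrt(mean_squared_error(y_test, final_model.predict(X_test)))
print(search.best_params_, rmse)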

37
Exercises
2. Train and fine-tune a Decision Tree for the moons dataset.
a) Generate a moons dataset using make_moons(n_samples=10000,
noise=0.4).
b) Split it into a training set and a test set using train_test_split().
c) Use grid search with cross-validation (with the help of the GridSearchCV class)
to find good hyperparameter values for a DecisionTreeClassifier. Hint: try
various values for max_leaf_nodes.
d) Train it on the full training set using these hyperparameters, and measure your model's performance on the test set. You should get roughly 85% to 87% accuracy. A possible sketch follows below.
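
A possible sketch (the max_leaf_nodes grid is an illustrative assumption):

from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=10000, noise=0.4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

params = {"max_leaf_nodes": [2, 5, 10, 20, 50, 100]}   # illustrative grid
grid_search = GridSearchCV(DecisionTreeClassifier(random_state=42), params, cv=3)
grid_search.fit(X_train, y_train)   # refits the best model on the full training set

print(grid_search.best_params_)
print(grid_search.score(X_test, y_test))   # roughly 0.85 to 0.87 expected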

38
Exercises

3. Load the MNIST dataset and split it into a training set and a test set
(take the first 60,000 instances for training, and the remaining
10,000 for testing). Train a random forest classifier on the dataset
and time how long it takes, then evaluate the resulting model on
the test set.
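
A possible sketch (loading MNIST via fetch_openml is an assumption about the data source):

import time
from sklearn.datasets import fetch_openml
from sklearn.ensemble import RandomForestClassifier

mnist = fetch_openml("mnist_784", as_frame=False)
X_train, y_train = mnist.data[:60000], mnist.target[:60000]
X_test, y_test = mnist.data[60000:], mnist.target[60000:]

rnd_clf = RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=42)
t0 = time.time()
rnd_clf.fit(X_train, y_train)
print(f"Training took {time.time() - t0:.1f} s")
print(rnd_clf.score(X_test, y_test))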

39
Summary

1. k-Nearest Neighbors
2. Support Vector Machines
3. Decision Trees
4. Ensemble Learning and Random Forests
5. Exercises

40
