
INT247

Machine Learning Foundations


Lecture #5.0
Normalization and Feature Scaling

© LPU :: INT247 Machine Learning Foundations


Feature Scaling
• Used to standardize the range of the independent variables (features) of the data.
• Datasets contain features that vary in magnitude, units and range. For example:
– Gold_weight measured in grams (g).
– Iron_weight measured in kilograms (kg).
• Without scaling, measures such as Euclidean distance are dominated by the features with the largest magnitudes, so raw feature values are not directly comparable (see the sketch below).
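
A minimal NumPy sketch (the weights and value ranges are illustrative assumptions, not from the slides) of how the raw Euclidean distance is dominated by the gram-scale feature until both features are scaled:

```python
import numpy as np

# two hypothetical samples: [Gold_weight in g, Iron_weight in kg]
a = np.array([100.0, 3.0])
b = np.array([500.0, 5.0])

print(np.linalg.norm(a - b))       # ~400.005: almost entirely the gram-scale feature

# min-max scale both features to [0, 1] using assumed known ranges
mins, maxs = np.array([0.0, 0.0]), np.array([1000.0, 10.0])
a_s = (a - mins) / (maxs - mins)   # [0.1, 0.3]
b_s = (b - mins) / (maxs - mins)   # [0.5, 0.5]
print(np.linalg.norm(a_s - b_s))   # ~0.447: both features now contribute
```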
Techniques of Feature Scaling
• Standardisation
• Mean Normalization
• Min-Max Scaling
• Unit Vector

Standardisation

x' = (x − μ) / σ

• This redistributes the features so that they have mean = 0 and standard deviation = 1.
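
A tiny NumPy sketch of the formula above (the data values are an assumption for illustration); note that NumPy's std() uses the population standard deviation by default:

```python
import numpy as np

x = np.array([10.0, 20.0, 30.0, 40.0])   # hypothetical feature values
z = (x - x.mean()) / x.std()             # x' = (x - mu) / sigma
print(z.mean(), z.std())                 # ~0.0 and 1.0
```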

Normalisation

x' = (x − x_min) / (x_max − x_min)

• Min-max normalisation rescales the feature values to the range [0, 1].
Exercise
Consider the following dataset:
X
0.0
1.0
2.0
3.0
4.0
5.0

Perform standardisation and normalisation on the dataset.

Solution
The normalised and standardised values for the dataset are:
X Normalized Standardized
0.0 0.0 -1.336306
1.0 0.2 -0.801784
2.0 0.4 -0.267261
3.0 0.6 0.267261
4.0 0.8 0.801784
5.0 1.0 1.336306
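
The table can be reproduced with a short pandas sketch; note that the Standardized column matches the sample standard deviation (ddof = 1), which is pandas' default:

```python
import pandas as pd

x = pd.Series([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])

normalized = (x - x.min()) / (x.max() - x.min())   # 0.0, 0.2, ..., 1.0
standardized = (x - x.mean()) / x.std()            # std() uses ddof=1 by default
print(pd.DataFrame({"X": x, "Normalized": normalized, "Standardized": standardized}))
# Standardized: -1.336306, -0.801784, -0.267261, 0.267261, 0.801784, 1.336306
```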

Over-fitting
• The model performs much better on the training dataset than on the test dataset.
• The model fits its parameters too closely to the particular observations in the training dataset.
• As a result, it does not generalize well to new, unseen data.

Reduce Generalization Errors
• Collect more training data.
• Introduce a penalty for complexity via regularization.
• Choose a simpler model with fewer parameters.
• Reduce the dimensionality of the data.

Sparse Solution With L1 Regularization

• L1 regularization yields sparse feature vectors.
• Sparsity is useful if the dataset is high-dimensional with many irrelevant features.
• The L1 penalty is the sum of the absolute values of the weight coefficients (see the sketch below).
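
A small scikit-learn sketch (the synthetic dataset and the regularization strength C are illustrative assumptions) showing that an L1 penalty drives many weights to exactly zero:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# synthetic high-dimensional data: only 5 of 30 features are informative
X, y = make_classification(n_samples=500, n_features=30, n_informative=5,
                           n_redundant=0, random_state=0)
X = StandardScaler().fit_transform(X)

l1 = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
print("zero weights:", int(np.sum(l1.coef_ == 0)), "of", l1.coef_.size)
```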

Sequential Feature Selection Algorithms
• Family of greedy search algorithms.
• Reduce an initial d-dimensional feature space to a k-dimensional feature subspace, where k < d.
• Automatically select the subset of features that is most relevant to the problem.

Sequential Forward Selection (SFS) Algo.
SFS is the simplest greedy search algorithm.
– Starting from the empty set, sequentially add the feature x+ that maximizes J(Yk + x+) when combined with the features Yk that have already been selected.
1. Start with the empty set Y0 = {}
2. Select the next best feature x+ = argmax J(Yk + x), where x ∉ Yk
3. Update Yk+1 = Yk + x+; k = k + 1
4. Go to step 2 (stop when the desired number of features has been selected)
Sequential Forward Selection (SFS) Algo.
• SFS performs best when the optimal feature subset is small.
• The search space is typically drawn as an ellipse to emphasize that there are fewer candidate states near the empty and full feature sets.

Example
• Run SFS to completion for the following objective function:
  J(X) = -2·x1·x2 + 3·x1 + 5·x2 - 2·x1·x2·x3 + 7·x3 + 4·x4 - 2·x1·x2·x3·x4
  where the xk are indicator variables that show whether the k-th feature has been selected (xk = 1) or not (xk = 0).
• Step 1: J(x1) = 3, J(x2) = 5, J(x3) = 7, J(x4) = 4 → x3 gives the maximum.
• Step 2: J(x3, x1) = 10, J(x3, x2) = 12, J(x3, x4) = 11 → {x3, x2} gives the maximum.
• Step 3: J(x3, x2, x1) = 11, J(x3, x2, x4) = 16 → {x3, x2, x4} gives the maximum.
• Step 4: J(x3, x2, x4, x1) = 13 (all features selected; see the sketch below).
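
The worked example can be reproduced with a short Python sketch of greedy forward selection (the objective J is the one given above):

```python
def J(selected):
    """Objective from the slide; `selected` is a set of feature indices 1..4."""
    x1, x2, x3, x4 = (1 if i in selected else 0 for i in (1, 2, 3, 4))
    return (-2*x1*x2 + 3*x1 + 5*x2 - 2*x1*x2*x3
            + 7*x3 + 4*x4 - 2*x1*x2*x3*x4)

selected, remaining = set(), {1, 2, 3, 4}
while remaining:
    best = max(remaining, key=lambda f: J(selected | {f}))   # greedy SFS step
    selected.add(best)
    remaining.discard(best)
    print(f"add x{best} -> J = {J(selected)}")
# prints: add x3 -> J = 7, add x2 -> J = 12, add x4 -> J = 16, add x1 -> J = 13
```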

Sequential Backward Selection (SBS) Algo.
Aims to reduce the dimensionality of the initial feature subspace.
1. Initialize the algorithm with k = d, where d is the dimensionality of the full feature space Xd.
2. Determine the feature x− that maximizes the criterion: x− = argmax J(Xk − x), where x ∈ Xk.
3. Remove the feature x− from the feature set: Xk−1 = Xk − x−; k = k − 1.
4. Terminate if k equals the desired number of features; otherwise, go to step 2 (see the sketch below).
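
A corresponding sketch of SBS, reusing for illustration the objective J defined in the SFS sketch above (the target size k = 2 is an assumption, not taken from the slides):

```python
# assumes J(selected) from the SFS sketch above is already defined
selected = {1, 2, 3, 4}           # start with the full feature set (k = d)
k_target = 2                      # desired number of features (assumption)
while len(selected) > k_target:
    # remove the feature whose removal leaves the highest criterion value
    worst = max(selected, key=lambda f: J(selected - {f}))
    selected.discard(worst)
    print(f"remove x{worst} -> J = {J(selected)}")
# prints: remove x1 -> J = 16, remove x4 -> J = 12
```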

Sequential Backward Selection (SBS)
• SBS works best when the optimal feature subset is
large, since SBS spends most of its time visiting
large subsets.
• The main limitation of SBS is its inability to re-
evaluate the usefulness of a feature after it has
been discarded.

Bidirectional Search (BDS)
BDS is a parallel implementation of SFS and SBS.
– SFS is performed from the empty set.
– SBS is performed from the full set.
– To guarantee that SFS and SBS converge to the same solution:
  • Features already selected by SFS are not removed by SBS.
  • Features already removed by SBS are not selected by SFS.

Bidirectional Search (BDS)
1. Start SFS with YF = {}
2. Start SBS with YB = X (the full feature set)
3. Select the best feature: x+ = argmax J(YF + x), where x ∉ YF and x ∈ YB; update YF = YF + x+
4. Remove the worst feature: x− = argmax J(YB − x), where x ∈ YB and x ∉ YF; update YB = YB − x−
5. Go to step 3; stop when YF = YB (see the sketch below)
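
A minimal Python sketch of these steps (the criterion function J and the feature set are placeholders; the set expressions YB - YF enforce the consistency rules above):

```python
def bds(features, J):
    """Bidirectional search: grow YF with SFS steps, shrink YB with SBS steps."""
    YF = set()              # forward set, grown by the SFS half
    YB = set(features)      # backward set, shrunk by the SBS half
    while YF != YB:
        # SFS step: only features still kept by SBS may be added
        best = max(YB - YF, key=lambda f: J(YF | {f}))
        YF.add(best)
        if YF == YB:
            break
        # SBS step: features already chosen by SFS may not be removed
        worst = max(YB - YF, key=lambda f: J(YB - {f}))
        YB.discard(worst)
    return YF

# e.g. bds({1, 2, 3, 4}, J) with the J from the SFS sketch returns {2, 3}
```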

Selecting Features Using Random Forests
There are two different methods for feature selection:
– Mean decrease impurity
– Mean decrease accuracy

Mean Decrease Impurity
• Impurity: the measure used to choose the optimal split condition (e.g., Gini impurity or entropy).
• During training, the algorithm records how much each feature decreases the weighted impurity in a tree.
• For a forest, the impurity decrease from each feature is averaged over all trees, and the features are ranked according to this measure (see the sketch below).
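
A scikit-learn sketch (the synthetic dataset is an assumption): the feature_importances_ attribute of a random forest is exactly this impurity-based ranking:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# mean decrease in impurity, averaged over all trees, ranked high to low
for idx in forest.feature_importances_.argsort()[::-1]:
    print(f"feature {idx}: importance = {forest.feature_importances_[idx]:.3f}")
```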

Mean Decrease Impurity
• Feature selection based on impurity reduction is biased
towards preferring variables with more categories.
• When the dataset has two or more correlated features,
any of these correlated features can be used as the
predictor.

Mean Decrease Accuracy
• Measures the impact of each feature on the accuracy of the model.
• Permute the values of each feature and measure how much the permutation decreases the accuracy of the model.
• Permuting unimportant variables has little or no effect on model accuracy.
• Permuting important variables significantly decreases the accuracy (see the sketch below).
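
This corresponds to permutation importance; a scikit-learn sketch (the dataset and forest are illustrative assumptions) measuring the drop in held-out accuracy when each feature is shuffled:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# permute each feature on the test split and measure the accuracy drop
result = permutation_importance(forest, X_te, y_te, n_repeats=10, random_state=0)
for idx in result.importances_mean.argsort()[::-1]:
    print(f"feature {idx}: mean accuracy drop = {result.importances_mean[idx]:.3f}")
```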
