Feature Selection
Ms. Gayaksha Kandolkar
Assistant Professor (on Contract)
Department of Computer Engineering,
Padre Conceicao College of Engineering
● Feature selection can also be used to speed up classification while ensuring
that the classification accuracy remains optimal. Feature selection ensures the
following:
1. Reduction in cost of pattern classification and design of the classifier
Dimensionality reduction, i.e., using a limited feature set simplifies both the
representation of patterns and the complexity of the classifiers. Consequently,
the resulting classifier will be faster and use less memory.
2. Improvement of classification accuracy
Exhaustive Search
● The most straightforward approach to the problem of feature selection is to
search through all the feature sub-sets and find the best sub-set.
● In this method, all combinations of features are tried out and the criterion function
J calculated.
● The combination of features which gives the highest value of J is the set of features
selected.
● If the patterns consist of d features, and a sub-set of size m features with the
smallest classification error is to be found, this entails searching all C(d, m)
possible sub-sets of size m and selecting the sub-set with the highest value of
the criterion function J(·), where J = (1 − Pe) and Pe is the probability of
classification error.
This method is suitable when only a few features need to be removed.
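The exhaustive procedure above can be sketched as follows. This is a minimal sketch: the toy criterion at the end is hypothetical and merely stands in for a real J = (1 − Pe) computed by training and evaluating a classifier on each sub-set.

```python
from itertools import combinations

def exhaustive_select(features, m, J):
    """Try every size-m sub-set of `features` and keep the one maximising J.
    `J` is the criterion function, e.g. J = 1 - Pe from a classifier."""
    best_subset, best_score = None, float("-inf")
    for subset in combinations(features, m):   # all C(d, m) sub-sets
        score = J(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score

# Hypothetical toy criterion: prefer sub-sets with the smallest index sum,
# standing in for a real evaluation of classification accuracy.
toy_J = lambda s: -sum(s)
print(exhaustive_select(range(5), 2, toy_J))   # -> ((0, 1), -1)
```

Because every one of the C(d, m) sub-sets is evaluated, the cost grows combinatorially with d, which is why the method suits cases where only a few features are removed.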
Artificial Neural Networks
● A multilayer feed-forward network with a back-propagation learning algorithm is
used in this method. The approach considered here is to take a larger than
necessary network and then remove unnecessary nodes.
● Pruning is carried out by eliminating the least salient nodes. It is based on the idea
of iteratively eliminating units and adjusting the remaining weights in such a way
that the network performance does not become worse over the entire training set.
● The pruning of nodes corresponds to removing the corresponding features
from the feature set. The saliency of a node is defined as the sum of the
increase in error over all the training patterns caused by the removal of
that node.
● The node pruning based feature selection first trains a network and then
removes the least salient node.
● The reduced network is trained again, followed by the removal of the least
salient node. This procedure is repeated to get the least classification
error.
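The iterative pruning loop described above can be sketched as follows. This is only a sketch: the `error` callable is a placeholder that is assumed to retrain the reduced network and return its error over the entire training set, and the toy error function in the test is hypothetical.

```python
def saliency(features, feat, train, error):
    """Increase in training-set error caused by removing node `feat`.
    `error(feature_subset, train)` is an assumed retrain-and-evaluate hook."""
    base = error(features, train)
    reduced = [f for f in features if f != feat]
    return error(reduced, train) - base

def prune_features(features, train, error, target_size):
    """Iteratively remove the least salient feature node, retraining
    (inside `error`) after each removal, until target_size features remain."""
    feats = list(features)
    while len(feats) > target_size:
        least = min(feats, key=lambda f: saliency(feats, f, train, error))
        feats.remove(least)
    return feats
```

Each pass removes exactly one node, mirroring the train, prune, retrain cycle described above.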
Evaluation of Classifiers
The various parameters of the classifier that need to be taken into account are:
1. Accuracy of the classifier: The main aim of using a classifier is to correctly classify
unknown patterns.
2. Design time and classification time: Design time is the time taken to build the classifier
from the training data while classification time is the time taken to classify a pattern using
the designed classifier.
3. Space required: If an abstraction of the training set is carried out, the space required will
be less. If no abstraction is carried out and the entire training data is required for
classification, the space requirement is high.
4. Explanation ability: If the reason for the classifier in choosing the class of a
pattern is clear to the user, then its explanation ability is good. For instance, in
the decision tree classifier, following the path from the root of the tree to the
leaf node for the values of the features in the pattern will give the class of the
pattern.
Similarly, the user understands why a rule based system chooses a particular
class for a pattern. On the other hand, the neural network classifier has a
trained neural network and it is not clear to the user what the network is
doing.
5. Noise tolerance: This refers to the ability of the classifier to handle
outliers and mislabeled training patterns.
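The first criterion, classification accuracy, is simply the fraction of patterns assigned to their correct class. A minimal sketch (the toy classifier is hypothetical):

```python
def accuracy(classifier, patterns, labels):
    """Fraction of patterns the classifier assigns to the correct class."""
    correct = sum(1 for x, y in zip(patterns, labels) if classifier(x) == y)
    return correct / len(labels)

# Hypothetical toy classifier: predicts class 1 when the first feature > 0.
clf = lambda x: 1 if x[0] > 0 else 0
score = accuracy(clf, [(1,), (-2,), (3,)], [1, 0, 0])   # 2 of 3 correct
```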
To estimate how good a classifier is, an estimate can be made using the
training set itself. This is known as the resubstitution estimate.
It assumes that the training data is a good representative of the data.
Sometimes, a part of the training data is used as a measure of the performance
of the classifier. Usually the training set is divided into smaller subsets. One of
the subsets is used for training while the other is used for validation. The
different methods of validation are as follows:
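The basic idea common to these methods, holding out part of the training set for validation, can be sketched as follows. The 80/20 split fraction and the fixed seed are illustrative assumptions, not prescribed values.

```python
import random

def holdout_split(data, train_frac=0.8, seed=0):
    """Partition the training data into a training part and a validation
    part; the classifier is built on the first and evaluated on the second."""
    rng = random.Random(seed)            # fixed seed for reproducibility
    indices = list(range(len(data)))
    rng.shuffle(indices)
    cut = int(train_frac * len(data))
    train = [data[i] for i in indices[:cut]]
    valid = [data[i] for i in indices[cut:]]
    return train, valid
```

Unlike the resubstitution estimate, the validation patterns here are never seen during training, giving a less optimistic measure of performance.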