SE-6104 Data Mining and Analytics: Lecture # 13 Advanced Classification
Chapter 9
Advanced Classification
Techniques to Improve Classification Accuracy
Ensemble Methods: Increasing the Accuracy
• Ensemble methods
• Use a combination of models to increase accuracy
• Combine a series of k learned models, M1, M2, …, Mk, with
the aim of creating an improved model M*
• Popular ensemble methods
• Bagging: averaging the prediction over a collection of
classifiers
• Boosting: weighted vote with a collection of classifiers
• Ensemble: combining a set of heterogeneous classifiers
Bagging: Bootstrap Aggregation
• Analogy: Diagnosis based on multiple doctors’ majority vote
• Training
• Given a set D of d tuples, at each iteration i, a training set Di of d tuples is
sampled with replacement from D (i.e., bootstrap)
• A classifier model Mi is learned for each training set Di
• Classification: classify an unknown sample X
• Each classifier Mi returns its class prediction
• The bagged classifier M* counts the votes and assigns the class with the
most votes to X
• Prediction: can be applied to the prediction of continuous values by taking
the average value of each prediction for a given test tuple
• Accuracy
• Often significantly better than a single classifier derived from D
• For noisy data: not considerably worse, and more robust
• Proven to improve accuracy for prediction
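As a concrete illustration of the training and voting steps above, here is a minimal sketch of bagging, assuming scikit-learn and NumPy are available and that class labels are encoded as non-negative integers; the decision-tree base learner, k = 10 rounds, and the synthetic data are illustrative choices, not part of the lecture.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    def bagging_predict(X, y, X_test, k=10, seed=0):
        rng = np.random.default_rng(seed)
        d = len(X)
        votes = []
        for _ in range(k):
            idx = rng.integers(0, d, size=d)         # bootstrap: sample d tuples with replacement
            M_i = DecisionTreeClassifier().fit(X[idx], y[idx])
            votes.append(M_i.predict(X_test))        # each classifier M_i returns its prediction
        votes = np.array(votes)
        # M* assigns to each test tuple the class with the most votes
        return np.array([np.bincount(col).argmax() for col in votes.T])

    X, y = make_classification(n_samples=300, random_state=0)
    print(bagging_predict(X, y, X[:5]))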
Boosting
• Analogy: Consult several doctors, based on a combination of
weighted diagnoses—weight assigned based on the previous
diagnosis accuracy
• How does boosting work?
• Weights are assigned to each training tuple
• A series of k classifiers is iteratively learned
• After a classifier Mi is learned, the weights are updated to allow
the subsequent classifier, Mi+1, to pay more attention to the
training tuples that were misclassified by Mi
• The final M* combines the votes of each individual classifier,
where the weight of each classifier's vote is a function of its
accuracy
• The boosting algorithm can be extended for numeric prediction
• Compared with bagging: boosting tends to achieve greater accuracy,
but it also risks overfitting the model to misclassified data
AdaBoost (Freund and Schapire, 1997)
• Given a set of d class-labeled tuples, (X1, y1), …, (Xd, yd)
• Initially, all the weights of tuples are set the same (1/d)
• Generate k classifiers in k rounds. At round i,
• Tuples from D are sampled (with replacement) to form a training set Di of
the same size
• Each tuple’s chance of being selected is based on its weight
• A classification model Mi is derived from Di (e.g., using the Gini index)
• Its error rate is calculated using Di as a test set (i.e., the total error
associated with the classifier)
• If a tuple is misclassified, its weight is increased; otherwise it is decreased
• Error rate: err(Xj) is the misclassification error of tuple Xj. The error rate
of classifier Mi is the sum of the weights of the misclassified tuples:
$error(M_i) = \sum_{j} w_j \cdot err(X_j)$
• Amount of say: the weight of classifier Mi’s vote is
$\frac{1}{2} \log \frac{1 - error(M_i)}{error(M_i)}$
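Below is a hedged sketch of these AdaBoost update rules, assuming binary labels in {-1, +1} and scikit-learn decision stumps as base classifiers. For brevity it passes the tuple weights to the learner via sample_weight instead of resampling Di with replacement, a common equivalent; all names and the choice of k are illustrative.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.tree import DecisionTreeClassifier

    def adaboost_fit(X, y, k=10):
        d = len(X)
        w = np.full(d, 1.0 / d)                # initially every tuple weight is 1/d
        models, alphas = [], []
        for _ in range(k):
            M_i = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
            miss = M_i.predict(X) != y
            error = w[miss].sum()              # error(M_i): sum of weights of misclassified tuples
            if error == 0 or error >= 0.5:     # stop if the stump is perfect or no better than chance
                break
            alpha = 0.5 * np.log((1 - error) / error)    # amount of say of M_i
            w *= np.exp(np.where(miss, alpha, -alpha))   # raise weights of misses, lower the rest
            w /= w.sum()                                 # renormalize to a distribution
            models.append(M_i)
            alphas.append(alpha)
        return models, alphas

    def adaboost_predict(models, alphas, X):
        # M*: weighted vote, each classifier's vote weighted by its amount of say
        return np.sign(sum(a * M.predict(X) for a, M in zip(alphas, models)))

    X, y = make_classification(n_samples=300, random_state=0)
    y = 2 * y - 1                              # recode labels from {0, 1} to {-1, +1}
    models, alphas = adaboost_fit(X, y)
    print(adaboost_predict(models, alphas, X[:5]))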
Random Forest
• Random Forest:
( Breiman 2001)
• Each classifier in the ensemble is a decision tree classifier and is
generated using a random selection of attributes at each node to
determine the split
• During classification, each tree votes and the most popular class is
returned
Method to construct Random Forest:
• Forest-RI (random input selection): Randomly select, at each node, F
attributes as candidates for the split at the node.
• The CART methodology is used to grow the trees to maximum size
• Comparable in accuracy to Adaboost, but more robust to errors and
outliers
• Insensitive to the number of attributes selected for consideration at each
split, and faster than bagging or boosting
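A short sketch of the Forest-RI idea using scikit-learn's RandomForestClassifier, which randomly selects a subset of attributes as split candidates at each node; the parameter values and synthetic data below are illustrative assumptions.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    rf = RandomForestClassifier(
        n_estimators=100,      # number of decision trees in the ensemble
        max_features="sqrt",   # F: attributes randomly considered as split candidates at each node
        random_state=0,
    )
    rf.fit(X, y)
    print(rf.predict(X[:5]))   # each tree votes; the most popular class is returned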
Classification of Class-Imbalanced Data Sets
• Class-imbalance problem: Rare positive example but numerous
negative ones, e.g., medical diagnosis, fraud, oil-spill, fault, etc.
• Traditional methods assume a balanced distribution of classes and
equal error costs: not suitable for class-imbalanced data
• Typical methods for improving the classification accuracy of class-imbalanced
data in 2-class classification:
• Oversampling: re-sample data from the positive class
• Under-sampling: randomly eliminate tuples from negative class
• Threshold-moving: moves the decision threshold, t, so that the
rare class tuples are easier to classify, and hence, less chance of
costly false negative errors
• Ensemble techniques: Ensemble multiple classifiers introduced
above
• The class-imbalance problem remains difficult on multiclass tasks
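As a hedged illustration of threshold-moving for a 2-class imbalanced problem, the sketch below lowers the decision threshold t so that rare-class (positive) tuples are easier to classify; the classifier, the 95/5 class split, and t = 0.2 are illustrative assumptions.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Synthetic imbalanced data: ~95% negative tuples, ~5% rare positive tuples
    X, y = make_classification(n_samples=2000, weights=[0.95, 0.05], random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    t = 0.2                                 # moved decision threshold (default would be 0.5)
    p_pos = clf.predict_proba(X)[:, 1]      # estimated probability of the rare positive class
    y_pred = (p_pos >= t).astype(int)       # predict positive more readily: fewer costly false negatives
    print("predicted positives:", y_pred.sum(), " actual positives:", y.sum())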
Classification by Backpropagation
(CH # 9.2)
• Brain
• A marvelous piece of architecture and design.
• In association with a nervous system, it controls the life patterns,
communications, interactions, growth and development of hundreds of
millions of life forms.
• There are about 10^10 to 10^14 nerve cells (called neurons) in an adult
human brain.
• Neurons are highly connected with each other. Each nerve cell is
connected to hundreds of thousands of other nerve cells.
• Passage of information between neurons is slow (in comparison to
transistors in an IC). It takes place in the form of electrochemical
signals between two neurons in milliseconds.
• Energy consumption per neuron is low (approximately 10^-6 watts).
[Figure: photomicrograph of biological neurons ("Look more like some spots of ink… aren't they!"), with the cell body, dendrites, synapse, and axons from other neurons labeled.]
[Figure: "Biological Neurons In Action" — network diagrams: a single-layer network with input units X1…Xn connected by weights w_ij to output units Y1…Ym, and a multilayer network with input units X1…Xn, hidden units Z1…Zp, output units Y1…Ym, weights w_ij (input to hidden) and v_jk (hidden to output), and bias units.]
Supervised Training
• Training is accomplished by presenting a sequence of
training vectors or patterns, each with an associated
target output vector.
• The weights are then adjusted according to a learning
algorithm.
• During training, the network develops an associative
memory. It can then recall a stored pattern when it is
given an input vector that is sufficiently similar to a vector
it has learned.
Unsupervised Training
• A sequence of input vectors is provided, but no target
vectors are specified in this case.
• The net modifies its weights and biases, so that the most
similar input vectors are assigned to the same output unit.
Classification by Backpropagation
• Backpropagation: A neural network learning algorithm
• Started by psychologists and neurobiologists to develop and test
computational analogues of neurons
• A neural network: A set of connected input/output units where
each connection has a weight associated with it
• During the learning phase, the network learns by adjusting the
weights so as to be able to predict the correct class label of the
input tuples
• Also referred to as connectionist learning due to the connections
between units
Neural Network as a Classifier
• Weakness
• Long training time
• Require a number of parameters typically best determined empirically,
e.g., the network topology or "structure"
• Poor interpretability: Difficult to interpret the symbolic meaning behind
the learned weights and of "hidden units" in the network
• Strength
• High tolerance to noisy data
• Ability to classify untrained patterns
• Well-suited for continuous-valued inputs and outputs
• Successful on a wide array of real-world data
• Algorithms are inherently parallel
• Techniques have recently been developed for the extraction of rules from
trained neural networks
A Neuron (= a perceptron)
[Figure: inputs x0, x1, …, xn (input vector x) with weights w0, w1, …, wn (weight vector w) feed a weighted sum Σ, which is passed through an activation function f with bias -θ_j to produce the output y.]
• For example, with the sign activation function the output is
$y = \mathrm{sign}\left(\sum_{i=0}^{n} w_i x_i - \theta_j\right)$
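A minimal sketch of the perceptron output above: the weighted sum of the inputs minus the bias θ, passed through the sign activation; the numbers are made up for illustration.

    import numpy as np

    def perceptron_output(x, w, theta):
        # y = sign( sum_i w_i * x_i - theta )
        return np.sign(np.dot(w, x) - theta)

    x = np.array([1.0, 0.5, -0.3])   # input vector x
    w = np.array([0.2, -0.4, 0.7])   # weight vector w
    print(perceptron_output(x, w, theta=0.1))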
A Multi-Layer Feed-Forward Neural Network
• Given the net input Ij to unit j, the output Oj of unit j is computed as
$O_j = \frac{1}{1 + e^{-I_j}}$
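A one-unit illustration of the formula above; the net input is computed here as $I_j = \sum_i w_{ij} O_i + \theta_j$ (the standard weighted sum plus bias), and all values are made up.

    import numpy as np

    O_prev = np.array([1.0, 0.5, 0.2])    # outputs of the previous layer feeding unit j
    w_j = np.array([0.3, -0.2, 0.4])      # weights w_ij on the connections into unit j
    theta_j = 0.1                         # bias of unit j

    I_j = np.dot(w_j, O_prev) + theta_j   # net input to unit j
    O_j = 1.0 / (1.0 + np.exp(-I_j))      # logistic (sigmoid) output of unit j
    print(I_j, O_j)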
• Backpropagate the error: The error is propagated backward by updating the
weights and biases to reflect the error of the network’s prediction. For a unit j in
the output layer, the error Errj is computed by
$Err_j = O_j (1 - O_j)(T_j - O_j)$
where Oj is the actual output of unit j, and Tj is the known target value of the
given training tuple.
• The error of a hidden layer unit j is
$Err_j = O_j (1 - O_j) \sum_{k} Err_k \, w_{jk}$
where wjk is the weight of the connection from unit j to a unit k in the next
higher layer, and Errk is the error of unit k.
• Weights are updated by the following equations, where $\Delta w_{ij}$ is the change in
weight $w_{ij}$ and $l$ is the learning rate:
$\Delta w_{ij} = (l)\, Err_j \, O_i, \qquad w_{ij} = w_{ij} + \Delta w_{ij}$
• Biases are updated by the following equations, where $\Delta\theta_j$ is the change in
bias $\theta_j$:
$\Delta\theta_j = (l)\, Err_j, \qquad \theta_j = \theta_j + \Delta\theta_j$
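The sketch below applies the error and update equations above for one training tuple on a tiny 2-3-1 network with sigmoid units; the learning rate l = 0.9, the random initialization, and the input values are illustrative assumptions, not values from the lecture.

    import numpy as np

    rng = np.random.default_rng(0)
    sigmoid = lambda I: 1.0 / (1.0 + np.exp(-I))

    x = np.array([0.6, 0.1]); target = np.array([1.0])      # one training tuple (inputs, class)
    W1 = rng.uniform(-0.5, 0.5, (2, 3)); b1 = np.zeros(3)   # input -> hidden weights and biases
    W2 = rng.uniform(-0.5, 0.5, (3, 1)); b2 = np.zeros(1)   # hidden -> output weights and biases
    l = 0.9                                                 # learning rate

    # Propagate the inputs forward: I_j = sum_i w_ij O_i + theta_j, then O_j = sigmoid(I_j)
    O_hid = sigmoid(x @ W1 + b1)
    O_out = sigmoid(O_hid @ W2 + b2)

    # Backpropagate the error
    Err_out = O_out * (1 - O_out) * (target - O_out)        # output-layer error Err_j
    Err_hid = O_hid * (1 - O_hid) * (W2 @ Err_out)          # hidden-layer error: sum_k Err_k w_jk

    # Update weights and biases: w_ij += (l) Err_j O_i, theta_j += (l) Err_j
    W2 += l * np.outer(O_hid, Err_out); b2 += l * Err_out
    W1 += l * np.outer(x, Err_hid);     b1 += l * Err_hid
    print(O_out)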
How a Multi-Layer Neural Network Works
• The inputs to the network correspond to the attributes measured for each
training tuple
• Inputs are fed simultaneously into the units making up the input layer
• They are then weighted and fed simultaneously to a hidden layer
• The number of hidden layers is arbitrary, although usually only one
• The weighted outputs of the last hidden layer are input to units making up
the output layer, which emits the network's prediction
• The network is feed-forward in that none of the weights cycles back to an
input unit or to an output unit of a previous layer
Defining a Network Topology
• First decide the network topology: # of units in the input layer, #
of hidden layers (if > 1), # of units in each hidden layer, and # of
units in the output layer
• Normalize the input values for each attribute measured in the training
tuples to [0.0–1.0]
• For a discrete-valued attribute, one input unit per domain value, each
initialized to 0 (see the sketch after this list)
• Output: for classification with more than two classes, one output unit
per class is used
• If a trained network's accuracy is unacceptable, repeat the training
process with a different network topology or a different set of initial
weights
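A hedged sketch of the input preparation described above, using pandas (an assumption; any tool works): the continuous attribute is min-max scaled to [0.0, 1.0], and the discrete attribute gets one 0/1 input unit per domain value; the column names and values are made up.

    import pandas as pd

    df = pd.DataFrame({"age": [25, 40, 58], "income": ["low", "high", "medium"]})

    # Normalize the continuous attribute to [0.0, 1.0]
    df["age"] = (df["age"] - df["age"].min()) / (df["age"].max() - df["age"].min())

    # One input unit per domain value of the discrete attribute, each 0 or 1
    df = pd.get_dummies(df, columns=["income"])
    print(df)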
Backpropagation
• Iteratively process a set of training tuples & compare the network's prediction
with the actual known target value
• For each training tuple, the weights are modified to minimize the mean
squared error between the network's prediction and the actual target value
• Modifications are made in the “backwards” direction: from the output layer,
through each hidden layer down to the first hidden layer, hence
“backpropagation”
• Steps
• Initialize weights (to small random #s) and biases in the network
• Propagate the inputs forward (by applying activation function)
• Backpropagate the error (by updating weights and biases)
• Terminating condition (when error is very small, etc.)
Example to discuss
Terminating condition:
• Training stops when
• All weight changes $\Delta w_{ij}$ in the previous epoch are so small as to be below
some specified threshold, or
• The percentage of tuples misclassified in the previous epoch is
below some threshold, or
• A prespecified number of epochs has expired.
• In practice, several hundreds of thousands of epochs may be
required before the weights will converge.
Backpropagation and Interpretability
• Efficiency of backpropagation: Each epoch (one iteration through the training
set) takes O(|D| * w) time, with |D| tuples and w weights, but the number of
epochs can be exponential in n, the number of inputs, in the worst case
• Rule extraction from networks: network pruning
• Simplify the network structure by removing weighted links that have the
least effect on the trained network
• The set of input and activation values are studied to derive rules
describing the relationship between the input and hidden unit layers
• Sensitivity analysis: measure the impact that a given input variable has on a
network output. The knowledge gained from this analysis can be represented
in rules