Debre Tabor University
Gafat Institute of Technology
Department of Computer Science
Introduction to Data Mining & Warehousing
For 4th Year IT Computer Science students
Instructor: Habtu Hailu (PhD)
November, 24
Chapter 04
Classification and Prediction
What is classification? What is prediction?
Issues regarding classification and prediction
Classification by decision tree induction
Bayesian Classification
Classification by Backpropagation
Prediction
Classification accuracy
Summary
Classification vs. Prediction
Classification
predicts categorical class labels (discrete or nominal)
classifies data (constructs a model) based on the training
set and the values (class labels) in a classifying attribute
and uses it in classifying new data
Prediction
models continuous-valued functions, i.e., predicts
unknown or missing values
Typical applications
Credit approval
Target marketing
Medical diagnosis
Fraud detection
Classification—A Two-Step Process
Model construction: describing a set of predetermined classes
Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
The set of tuples used for model construction is the training set
The model is represented as classification rules, decision trees, or mathematical formulae
Model usage: for classifying future or unknown objects
Estimate the accuracy of the model
The known label of each test sample is compared with the classified result from the model
The accuracy rate is the percentage of test set samples that are correctly classified by the model
The test set is independent of the training set; otherwise over-fitting will occur
If the accuracy is acceptable, use the model to classify data tuples whose class labels are not known
Process (1): Model Construction

Training data are fed to the classification algorithm, which outputs the classifier (model).

    NAME  RANK            YEARS  TENURED
    Mike  Assistant Prof  3      no
    Mary  Assistant Prof  7      yes
    Bill  Professor       2      yes
    Jim   Associate Prof  7      yes
    Dave  Assistant Prof  6      no
    Anne  Associate Prof  3      no

Learned model (as a classification rule):

    IF rank = ‘professor’ OR years > 6
    THEN tenured = ‘yes’
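The learned rule above can be sketched as a tiny Python classifier (the function name and value spellings are illustrative assumptions, not from the slides):

```python
# A minimal sketch of the model produced in step 1: the single
# classification rule learned from the tenure training data.

def predict_tenured(rank, years):
    """IF rank = 'professor' OR years > 6 THEN tenured = 'yes'."""
    if rank == "Professor" or years > 6:
        return "yes"
    return "no"

# The rule reproduces the training labels:
print(predict_tenured("Professor", 2))       # yes (Bill)
print(predict_tenured("Assistant Prof", 7))  # yes (Mary)
print(predict_tenured("Assistant Prof", 6))  # no  (Dave)
```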
Process (2): Using the Model in Prediction

The classifier is first run against testing data (independent of the training set) to estimate accuracy, then applied to unseen data.

    Testing data:
    NAME     RANK            YEARS  TENURED
    Tom      Assistant Prof  2      no
    Merlisa  Associate Prof  7      no
    George   Professor       5      yes
    Joseph   Assistant Prof  7      yes

    Unseen data: (Jeff, Professor, 4)
    Tenured? The learned rule predicts ‘yes’ (rank = ‘professor’).
Supervised vs. Unsupervised
Learning
Supervised learning (classification)
Supervision: The training data (observations,
measurements, etc.) are accompanied by labels
indicating the class of the observations
New data is classified based on the training set
Unsupervised learning (clustering)
The class labels of training data are unknown
Given a set of measurements, observations, etc.
with the aim of establishing the existence of
classes or clusters in the data
Issues: Data Preparation
Data cleaning
Preprocess data in order to reduce noise and
handle missing values
Relevance analysis (feature selection)
Remove the irrelevant or redundant attributes
Data transformation
Generalize and/or normalize data
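As a concrete illustration of the transformation step, a minimal min-max normalization sketch (the attribute values below are made up):

```python
# Minimal sketch of min-max normalization to [0.0, 1.0], a common
# data-transformation step before classification.

def min_max_normalize(values):
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = [25, 35, 45, 55]
print(min_max_normalize(ages))  # [0.0, 0.333..., 0.666..., 1.0]
```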
Issues: Evaluating Classification
Methods
Accuracy
classifier accuracy: predicting the class label
predictor accuracy: estimating the value of the predicted attribute
Speed
time to construct the model (training time)
time to use the model (classification/prediction time)
Robustness: handling noise and missing values
Scalability: efficiency in disk-resident databases
Interpretability
understanding and insight provided by the model
Other measures, e.g., goodness of rules, such as decision tree
size or compactness of classification rules
Classification by Decision Tree
Induction
Decision tree
A flow-chart-like tree structure
Internal node denotes a test on an attribute
Branch represents an outcome of the test
Leaf nodes represent class labels or class distribution
Decision tree generation consists of two phases
Tree construction
At start, all the training examples are at the root
Partition examples recursively based on selected
attributes
Tree pruning
Identify and remove branches that reflect noise or
outliers
Use of decision tree: Classifying an unknown sample
Test the attribute values of the sample against the
decision tree
Decision Tree Induction: Training
Dataset
age income student credit_rating buys_computer
<=30 high no fair no
<=30 high no excellent no
31…40 high no fair yes
>40 medium no fair yes
>40 low yes fair yes
>40 low yes excellent no
31…40 low yes excellent yes
<=30 medium no fair no
<=30 low yes fair yes
>40 medium yes fair yes
<=30 medium yes excellent yes
31…40 medium no excellent yes
31…40 high yes fair yes
>40 medium no excellent no
This follows an example from Quinlan’s ID3 (Playing Tennis).
Output: A Decision Tree for “buys_computer”

    age?
    |-- <=30: student?
    |         |-- no:  no
    |         |-- yes: yes
    |-- 31..40: yes
    |-- >40: credit rating?
              |-- excellent: no
              |-- fair:      yes
Algorithm for Decision Tree
Induction
Basic algorithm (a greedy algorithm)
Tree is constructed in a top-down recursive divide-and-
conquer manner
At start, all the training examples are at the root
Attributes are categorical (if continuous-valued, they are
discretized in advance)
Examples are partitioned recursively based on selected
attributes
Test attributes are selected on the basis of a heuristic or
statistical measure (e.g., information gain)
Conditions for stopping partitioning
All samples for a given node belong to the same class
There are no remaining attributes for further partitioning
– majority voting is employed for classifying the leaf
There are no samples left
Attribute Selection Measure: Information Gain (ID3/C4.5)

Select the attribute with the highest information gain.
Let p_i be the probability that an arbitrary tuple in D belongs to class C_i, estimated by |C_i,D| / |D|.

Expected information (entropy) needed to classify a tuple in D:

    Info(D) = - Σ_{i=1..m} p_i log2(p_i)

Information needed (after using A to split D into v partitions) to classify D:

    Info_A(D) = Σ_{j=1..v} (|D_j| / |D|) × I(D_j)

Information gained by branching on attribute A:

    Gain(A) = Info(D) - Info_A(D)
Attribute Selection: Information Gain

Using the training data above:
Class P: buys_computer = “yes” (9 tuples)
Class N: buys_computer = “no” (5 tuples)

    Info(D) = I(9,5) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940

Partitioning on age:

    age      p_i  n_i  I(p_i, n_i)
    <=30     2    3    0.971
    31…40    4    0    0
    >40      3    2    0.971

    Info_age(D) = (5/14) I(2,3) + (4/14) I(4,0) + (5/14) I(3,2) = 0.694

The term (5/14) I(2,3) means “age <=30” has 5 out of 14 samples, with 2 yes’es and 3 no’s. Hence

    Gain(age) = Info(D) - Info_age(D) = 0.246

Similarly,

    Gain(income) = 0.029
    Gain(student) = 0.151
    Gain(credit_rating) = 0.048
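The gain computation can be reproduced with a short script (a sketch, not part of the original slides) that recomputes entropy and information gain for each attribute of the buys_computer data:

```python
import math
from collections import Counter

# Rows: (age, income, student, credit_rating, buys_computer),
# transcribed from the training table above.
data = [
    ("<=30",  "high",   "no",  "fair",      "no"),
    ("<=30",  "high",   "no",  "excellent", "no"),
    ("31..40","high",   "no",  "fair",      "yes"),
    (">40",   "medium", "no",  "fair",      "yes"),
    (">40",   "low",    "yes", "fair",      "yes"),
    (">40",   "low",    "yes", "excellent", "no"),
    ("31..40","low",    "yes", "excellent", "yes"),
    ("<=30",  "medium", "no",  "fair",      "no"),
    ("<=30",  "low",    "yes", "fair",      "yes"),
    (">40",   "medium", "yes", "fair",      "yes"),
    ("<=30",  "medium", "yes", "excellent", "yes"),
    ("31..40","medium", "no",  "excellent", "yes"),
    ("31..40","high",   "yes", "fair",      "yes"),
    (">40",   "medium", "no",  "excellent", "no"),
]
ATTRS = ["age", "income", "student", "credit_rating"]

def entropy(labels):
    """Info(D) = -sum p_i log2(p_i) over the class distribution."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, attr):
    """Gain(A) = Info(D) - Info_A(D) for the attribute at index attr."""
    labels = [r[-1] for r in rows]
    parts = {}
    for r in rows:
        parts.setdefault(r[attr], []).append(r[-1])
    info_a = sum(len(p) / len(rows) * entropy(p) for p in parts.values())
    return entropy(labels) - info_a

for i, a in enumerate(ATTRS):
    print(f"Gain({a}) = {gain(data, i):.4f}")
# Gain(age) = 0.2467, Gain(income) = 0.0292,
# Gain(student) = 0.1518, Gain(credit_rating) = 0.0481
```

The exact values are 0.2467, 0.0292, 0.1518, and 0.0481; the slide's 0.246 and 0.151 are the same quantities truncated rather than rounded. Since age has the highest gain, it becomes the root test of the tree shown earlier.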
Classification by
Backpropagation
Backpropagation: A neural network learning
algorithm
Started by psychologists and neurobiologists to
develop and test computational analogues of
neurons
A neural network: A set of connected input/output
units where each connection has a weight
associated with it
During the learning phase, the network learns
by adjusting the weights so as to be able to
predict the correct class label of the input tuples
Neural Network as a
Classifier
Weakness
Long training time
Require a number of parameters that are typically best determined empirically, e.g., the network topology or “structure”
Poor interpretability: difficult to interpret the symbolic meaning behind the learned weights and of “hidden units” in the network
Strength
High tolerance to noisy data
Ability to classify untrained patterns
Well-suited for continuous-valued inputs and outputs
Successful on a wide array of real-world data
Algorithms are inherently parallel
Techniques have recently been developed for the
extraction of rules from trained neural networks
A Neuron (= a Perceptron)

The n-dimensional input vector x is mapped into the output variable y by means of the scalar product with the weight vector w and a nonlinear activation function f. With weighted sum, bias μ_k, and sign activation:

    y = sign( Σ_{i=0..n} w_i x_i - μ_k )
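As a sketch, the unit above in Python. The weights and threshold below are hand-picked assumptions (chosen so the unit computes logical AND); none of these values come from the slides:

```python
# Sketch of the perceptron above: weighted sum of the inputs, minus
# a bias mu, passed through a sign activation.

def perceptron(x, w, mu):
    """y = sign(sum_i w_i * x_i - mu)."""
    s = sum(wi * xi for wi, xi in zip(w, x)) - mu
    return 1 if s >= 0 else -1

print(perceptron([1, 1], [1.0, 1.0], 1.5))  # 1  (both inputs on)
print(perceptron([1, 0], [1.0, 1.0], 1.5))  # -1
print(perceptron([0, 0], [1.0, 1.0], 1.5))  # -1
```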
A Multi-Layer Feed-Forward Neural Network

The input vector X enters the input layer, passes through weighted connections to the hidden layer, and on to the output layer, which produces the output vector. For a unit j with inputs O_i from the previous layer, weights w_ij, bias θ_j, target output T_j, and learning rate l:

    I_j = Σ_i w_ij O_i + θ_j                  (net input)
    O_j = 1 / (1 + e^(-I_j))                  (sigmoid output)
    Err_j = O_j (1 - O_j) (T_j - O_j)         (error at an output unit)
    Err_j = O_j (1 - O_j) Σ_k Err_k w_jk      (error at a hidden unit)
    w_ij = w_ij + (l) Err_j O_i               (weight update)
    θ_j = θ_j + (l) Err_j                     (bias update)
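A minimal numeric sketch of one backpropagation update for a tiny 2-1-1 network, following the equations above (all weights, inputs, and the learning rate are made-up values):

```python
import math

def sigmoid(I):
    return 1.0 / (1.0 + math.exp(-I))

# Made-up initial weights/biases: one hidden unit h, one output unit o
x = [1.0, 0.0]                       # input vector
w_h, theta_h = [0.5, -0.3], 0.1      # weights/bias into hidden unit
w_o, theta_o = 0.4, -0.2             # weight/bias into output unit
l, target = 0.9, 1.0                 # learning rate, desired output T

# Forward pass: I_j = sum_i w_ij * O_i + theta_j; O_j = sigmoid(I_j)
O_h = sigmoid(x[0] * w_h[0] + x[1] * w_h[1] + theta_h)
O_o = sigmoid(O_h * w_o + theta_o)

# Backward pass: errors per the equations above
err_o = O_o * (1 - O_o) * (target - O_o)   # output-layer error
err_h = O_h * (1 - O_h) * (err_o * w_o)    # hidden-layer error

# Updates: w_ij += l * Err_j * O_i;  theta_j += l * Err_j
w_o += l * err_o * O_h
theta_o += l * err_o
w_h[0] += l * err_h * x[0]
w_h[1] += l * err_h * x[1]
theta_h += l * err_h

# After the update, the prediction has moved toward the target:
new_O_o = sigmoid(sigmoid(x[0]*w_h[0] + x[1]*w_h[1] + theta_h) * w_o + theta_o)
print(O_o, "->", new_O_o)
```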
How Does a Multi-Layer Neural Network Work?
The inputs to the network correspond to the attributes
measured for each training tuple
Inputs are fed simultaneously into the units making up the
input layer
They are then weighted and fed simultaneously to a
hidden layer
The number of hidden layers is arbitrary, although usually
only one
The weighted outputs of the last hidden layer are input to
units making up the output layer, which emits the
network's prediction
The network is feed-forward in that none of the weights
cycles back to an input unit or to an output unit of a
previous layer
Defining a Network Topology

First decide the network topology: the number of units in the input layer, the number of hidden layers (if > 1), the number of units in each hidden layer, and the number of units in the output layer
Normalize the input values of each attribute measured in the training tuples to [0.0, 1.0]
For a discrete attribute, one input unit per domain value, each initialized to 0
For classification with more than two classes, one output unit per class is used
If a trained network's accuracy is unacceptable, repeat the training process with a different network topology or a different set of initial weights
What Is Prediction?
(Numerical) prediction is similar to classification
construct a model
use model to predict continuous or ordered value for a
given input
Prediction is different from classification
Classification predicts a categorical class label
Prediction models continuous-valued functions
Major method for prediction: regression
model the relationship between one or more independent
or predictor variables and a dependent or response
variable
Regression analysis
Linear and multiple regression
Non-linear regression
Other regression methods: generalized linear model, Poisson regression, log-linear models, regression trees
Linear Regression
Linear regression: involves a response variable y and a single
predictor variable x
y = w0 + w1 x
where w0 (y-intercept) and w1 (slope) are regression
coefficients
Multiple linear regression: involves more than one predictor
variable
Training data is of the form (X1, y1), (X2, y2),…, (X|D|, y|D|)
Ex. For 2-D data, we may have: y = w0 + w1 x1+ w2 x2
Solvable by an extension of the least-squares method, or by using statistical software such as SAS or S-Plus
Many nonlinear functions can be transformed into the
above
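The single-predictor least-squares fit can be sketched in a few lines (the data points below are made up so that the fit is exact):

```python
# Sketch of fitting y = w0 + w1*x by the least-squares method.

def fit_line(xs, ys):
    """Least-squares estimates for intercept w0 and slope w1."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    w1 = num / den
    w0 = my - w1 * mx
    return w0, w1

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # exactly y = 1 + 2x
w0, w1 = fit_line(xs, ys)
print(w0, w1)  # 1.0 2.0
```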
Nonlinear Regression

Some nonlinear models can be modeled by a polynomial function
A polynomial regression model can be transformed into a linear regression model. For example,

    y = w0 + w1 x + w2 x^2 + w3 x^3

is convertible to linear form with the new variables x2 = x^2 and x3 = x^3:

    y = w0 + w1 x + w2 x2 + w3 x3

Other functions, such as the power function, can also be transformed to a linear model
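A small sketch of the substitution itself (the coefficients are made up): treating x2 = x^2 and x3 = x^3 as new variables leaves the model's predictions unchanged, which is why ordinary multiple linear regression applies:

```python
# The cubic model and its "linearized" form compute the same value;
# only the view of the variables changes. Coefficients are made up.

def cubic(x, w):                   # y = w0 + w1*x + w2*x^2 + w3*x^3
    return w[0] + w[1]*x + w[2]*x**2 + w[3]*x**3

def linear_in_new_vars(x, w):      # same model, linear in (x, x2, x3)
    x2, x3 = x**2, x**3
    return w[0] + w[1]*x + w[2]*x2 + w[3]*x3

w = [1.0, -2.0, 0.5, 0.25]
print(cubic(2.0, w), linear_in_new_vars(2.0, w))  # 1.0 1.0
```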
Classifier Accuracy Measures

Confusion matrix (actual class vs. predicted class):

                 predicted C1     predicted C2
    actual C1    true positive    false negative
    actual C2    false positive   true negative

Example (buys_computer):

    classes                buys_computer = yes   buys_computer = no   total   recognition (%)
    buys_computer = yes    6954                  46                   7000    99.34
    buys_computer = no     412                   2588                 3000    86.27
    total                  7366                  2634                 10000   95.42

Accuracy of a classifier M, acc(M): percentage of test set tuples that are correctly classified by the model M
Error rate (misclassification rate) of M = 1 - acc(M)
Given m classes, CM_{i,j}, an entry in a confusion matrix, indicates the number of tuples in class i that are labeled by the classifier as class j
Alternative accuracy measures (e.g., for cancer diagnosis):

    sensitivity = t-pos / pos                 /* true positive recognition rate */
    specificity = t-neg / neg                 /* true negative recognition rate */
    precision   = t-pos / (t-pos + f-pos)
    accuracy    = sensitivity × pos/(pos + neg) + specificity × neg/(pos + neg)

This model can also be used for cost-benefit analysis
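The measures above can be checked directly against the buys_computer confusion matrix (a small sketch; the variable names are mine):

```python
# Accuracy measures computed from the buys_computer confusion matrix:
# tp = 6954, fn = 46, fp = 412, tn = 2588.

tp, fn, fp, tn = 6954, 46, 412, 2588
pos, neg = tp + fn, fp + tn          # 7000 actual yes, 3000 actual no

accuracy    = (tp + tn) / (pos + neg)
error_rate  = 1 - accuracy
sensitivity = tp / pos               # true positive recognition rate
specificity = tn / neg               # true negative recognition rate
precision   = tp / (tp + fp)

print(f"accuracy={accuracy:.4f} error_rate={error_rate:.4f}")
print(f"sensitivity={sensitivity:.4f} specificity={specificity:.4f} "
      f"precision={precision:.4f}")
# accuracy=0.9542 sensitivity=0.9934 specificity=0.8627 precision=0.9441
```

Note that accuracy also satisfies the weighted form given above: sensitivity × pos/(pos+neg) + specificity × neg/(pos+neg) = 0.9934 × 0.7 + 0.8627 × 0.3 ≈ 0.9542.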