
Data Science

Lecture # 22
Decision Tree
• Let's look at the decision tree model, a popular method used for classification
• By the end of this lecture, you should be able to:
• Explain how a decision tree is used for classification
• Describe the process of constructing a decision tree for classification
• Interpret how a decision tree arrives at a classification decision

Note: All Images are taken from edx.org



Decision Tree Overview
• The idea behind a decision tree is to split the data into subsets where each subset belongs to only one class
• This is accomplished by dividing the input space into pure regions
• i.e. regions with samples from only one class
• With real data, completely pure subsets may not be possible, so we divide the data into subsets that are as pure as possible
• A decision tree makes its classification decisions based on these decision boundaries



Classification Using Decision Tree
• The root and internal nodes have test conditions
• Each leaf node has a class label associated with it
• A decision is made by traversing the decision tree
• At each node, the answer to the test condition determines which branch to traverse
• When a leaf node is reached, the class label at that leaf determines the decision (a minimal traversal sketch follows below)
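To make the traversal concrete, here is a minimal sketch in Python. The nested-dictionary encoding, the feature names (income, debt), the thresholds, and the labels are all hypothetical choices for illustration; they are not taken from the slides.

```python
# A hand-built decision tree, encoded as nested dictionaries.
# Internal nodes hold a test condition; leaf nodes hold a class label.
# (Feature names, thresholds, and labels here are hypothetical.)
tree = {
    "feature": "income", "threshold": 50_000,
    "left":  {"label": "not likely to repay"},        # income <= 50,000
    "right": {                                        # income > 50,000
        "feature": "debt", "threshold": 20_000,
        "left":  {"label": "likely to repay"},        # debt <= 20,000
        "right": {"label": "not likely to repay"},    # debt > 20,000
    },
}

def classify(node, sample):
    """Traverse from the root, following the branch chosen by each test
    condition, until a leaf is reached; the leaf's label is the decision."""
    while "label" not in node:
        branch = "right" if sample[node["feature"]] > node["threshold"] else "left"
        node = node[branch]
    return node["label"]

print(classify(tree, {"income": 80_000, "debt": 5_000}))   # -> likely to repay
print(classify(tree, {"income": 30_000, "debt": 5_000}))   # -> not likely to repay
```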



Classification Using Decision Tree
• The depth of a node is the number of edges from the root to that node
• The depth of the root node is zero
• The depth of a tree is the number of edges on the longest path from the root to a leaf
• The size of a tree is the number of nodes in the tree (a short sketch of both computations follows below)
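Using the same hypothetical dictionary encoding as the traversal sketch above, depth and size can be computed recursively; a minimal sketch:

```python
def tree_depth(node):
    """Depth of a tree: number of edges on the longest root-to-leaf path
    (a tree consisting of a single leaf has depth 0)."""
    if "label" in node:          # leaf node
        return 0
    return 1 + max(tree_depth(node["left"]), tree_depth(node["right"]))

def tree_size(node):
    """Size of a tree: total number of nodes (internal nodes plus leaves)."""
    if "label" in node:
        return 1
    return 1 + tree_size(node["left"]) + tree_size(node["right"])

# For the example tree in the previous sketch: depth 2, size 5.
```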



Example Decision Tree

• This decision tree is used to classify an animal as a mammal or not a mammal


Constructing a Decision Tree
• Constructing a decision tree consists of the following steps:
• Start with all samples at a node
• i.e. start with all samples at the root node
• Additional nodes are added as the data is split into subsets
• Partition the samples based on the input to create the purest subsets
• i.e. each subset contains as many samples as possible belonging to just one class
• Repeat to partition the data into successively purer subsets
• Continue this process until the stopping criteria are satisfied
• An algorithm for constructing a decision tree model is called an induction algorithm (a minimal sketch follows below)
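Below is a small, self-contained sketch of one possible greedy induction algorithm in plain Python. It is not the exact algorithm behind the slides; it simply illustrates the steps above by choosing the split with the lowest weighted Gini impurity, recursing on each subset, and stopping when a node is pure, too small, or too deep. The helper names and the tiny dataset are invented.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a node: 1 - sum over classes of p_k^2."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(X, y):
    """Test every variable and every candidate threshold; return the split
    (weighted impurity, feature index, threshold) with the lowest impurity."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left  = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] >  t]
            if not left or not right:
                continue   # skip splits that leave one side empty
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(X, y, depth=0, max_depth=3, min_samples=2):
    """Greedy induction: split on the best condition, then recurse."""
    split = best_split(X, y)
    # Stopping criteria: pure node, too few samples, max depth, or no usable split.
    if gini(y) == 0.0 or len(y) < min_samples or depth == max_depth or split is None:
        return {"label": Counter(y).most_common(1)[0][0]}   # majority-class leaf
    _, f, t = split
    left  = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] >  t]
    return {
        "feature": f,
        "threshold": t,
        "left":  build_tree([X[i] for i in left],  [y[i] for i in left],  depth + 1, max_depth, min_samples),
        "right": build_tree([X[i] for i in right], [y[i] for i in right], depth + 1, max_depth, min_samples),
    }

# Tiny made-up dataset: columns are (income, debt); labels are repay / default.
X = [(20, 5), (30, 4), (60, 1), (80, 2), (75, 9), (90, 8)]
y = ["default", "default", "repay", "repay", "default", "default"]
print(build_tree(X, y))
```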


Greedy Approach
• At each split, the induction algorithm considers only the best way to split the particular portion of the data
• This is referred to as a greedy approach



How to Determine Best Split?
• Again, the goal is to partition the data into subsets that are as pure as possible
• In this example, the partition on the right produces more homogeneous subsets, since they contain more samples belonging to a single class



Impurity Measure
• Therefore, we need a way to measure the purity of a split
• The impurity measure of a node specifies how mixed the resulting subsets are
• We want the split that minimizes the impurity measure
• Besides the Gini index, other impurity measures include entropy and the misclassification rate (a short comparison is sketched below)
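As a quick illustration, the sketch below evaluates the Gini index, entropy, and misclassification rate for a single node; the class counts are made up.

```python
from math import log2

def impurities(labels):
    """Return (Gini index, entropy, misclassification rate) for one node's labels."""
    n = len(labels)
    probs = [labels.count(c) / n for c in set(labels)]
    gini = 1.0 - sum(p ** 2 for p in probs)
    entropy = -sum(p * log2(p) for p in probs if p > 0)
    misclassification = 1.0 - max(probs)
    return gini, entropy, misclassification

# A hypothetical mixed node with 6 samples of class A and 2 of class B.
node = ["A"] * 6 + ["B"] * 2
print(impurities(node))   # -> roughly (0.375, 0.811, 0.25); all three are 0 for a pure node
```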



What Variable to Split On?
• The other factor in determining the best way to partition a node is which variable to split on
• The decision tree will test all variables to determine the best way to split the node, using a purity measure such as the Gini index to compare the various possibilities (a small comparison is sketched below)
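To illustrate how a purity measure is used to compare candidates, the sketch below scores one hypothetical split on variable A against one on variable B using weighted Gini impurity; the label counts are invented.

```python
def gini(labels):
    """Gini impurity of one child node."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def weighted_gini(left, right):
    """Impurity of a split = size-weighted average of the children's Gini."""
    n = len(left) + len(right)
    return (len(left) * gini(left) + len(right) * gini(right)) / n

# Hypothetical node with 8 samples: compare splitting on variable A vs variable B.
split_on_A = weighted_gini(["red"] * 4, ["blue"] * 3 + ["red"])       # nearly pure children
split_on_B = weighted_gini(["red", "blue"] * 2, ["red", "blue"] * 2)  # mixed children
print(split_on_A, split_on_B)   # the split on A has lower impurity, so it would be chosen
```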



When to Stop Splitting a Node?
• Recall that the tree induction algorithm repeatedly splits nodes to obtain more and more homogeneous subsets
• So when does this process of building subsets stop? Common stopping criteria (sketched with scikit-learn below) include:
• All (or x% of) samples have the same class label
• The number of samples in the node reaches a minimum value
• The change in the impurity measure is smaller than a threshold
• The maximum tree depth is reached
• Others… (not discussed here)
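In practice these criteria show up as hyperparameters of library implementations. For example, scikit-learn's DecisionTreeClassifier (assuming scikit-learn is available) exposes the following; the specific values below are arbitrary.

```python
from sklearn.tree import DecisionTreeClassifier

# Each hyperparameter corresponds to one of the stopping criteria above.
clf = DecisionTreeClassifier(
    criterion="gini",            # impurity measure used to compare splits
    max_depth=5,                 # stop when the maximum tree depth is reached
    min_samples_split=10,        # do not split nodes with fewer samples than this
    min_samples_leaf=4,          # each leaf must keep at least this many samples
    min_impurity_decrease=0.01,  # do not split when the impurity improvement is too small
)
# clf.fit(X_train, y_train) would then grow the tree under these constraints.
```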



Tree Induction Example: Split 1

• Let's say we want to classify loan applicants as being likely to repay a loan, or not likely to repay a loan, based on their income and amount of debt they have



Tree Induction Example: Split 1
• Building a decision tree for this classification problem could proceed as follows
• Consider the input space of this problem, as shown in the left figure
• One way to split this dataset into more homogeneous subsets is to consider the decision boundary where income equals t1
• To the right of this decision boundary are mostly red samples
• The subsets are not completely homogeneous, but this is the best way to split the original dataset based on the variable income


Tree Induction Example: Split 2
• Income > t1 is represented at the root node
• This is the condition used to split the original dataset
• Samples with income > t1 are placed in the right subset and samples with income < t1 in the left subset
• Because the right subset is almost pure, it is now labeled as RED
Tree Induction Example: Split 2
• RED means loan applicants likely to repay the loan
• The second step, then, is to determine how to split the region outlined in red
• The best way to split this data is specified by the second decision boundary, where debt equals t2
• This is represented in the decision tree on the right by adding a node with the condition debt > t2
• This region contains all blue samples, meaning that the loan applicant is not likely to repay the loan (a scikit-learn sketch of this example follows below)
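Here is a minimal scikit-learn sketch of this two-split story. The income/debt values and labels are invented (and chosen so that, like the lecture's figure, the learned tree happens to split on income first and then on debt); the learned thresholds play the roles of t1 and t2 but are not the lecture's values.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Invented loan data: columns are (income, debt) in thousands of dollars.
X = [(90, 10), (85, 55), (70, 40), (95, 25),          # high income, repaid
     (30, 50), (25, 45), (40, 60), (45, 55),          # low income, high debt, defaulted
     (35, 10), (20, 15), (38, 18)]                    # low income, low debt, mixed
y = ["repay", "repay", "repay", "repay",
     "default", "default", "default", "default",
     "repay", "default", "default"]

clf = DecisionTreeClassifier(criterion="gini", random_state=0)
clf.fit(X, y)

# Print the learned test conditions: the root splits on income (playing the role of t1),
# and the low-income branch then splits on debt (playing the role of t2).
print(export_text(clf, feature_names=["income", "debt"]))
```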



Decision Boundaries
• The final decision tree implements the decision boundaries shown as dashed lines in the left diagram
• The label for each region is determined by the label of the majority of the samples
• These labels are reflected in the leaf nodes of the decision tree shown on the right



Decision Boundaries
• Notice that the decision boundaries are parallel to the axes; such boundaries are referred to as rectilinear
• The boundaries are rectilinear because each split considers only a single variable
• Some algorithms can consider more than one variable per split
• However, each such split has to consider combinations of variables
• Such induction algorithms are more computationally intensive



Decision Tree for Classification
• There are a few important things to note about the decision tree classifier
• The resulting tree is often simple and easy to understand
• Induction is computationally inexpensive, so training a decision tree for classification can be relatively fast
• The greedy approach does not guarantee the best solution
• Decision boundaries are rectilinear, which means the classifier may not be able to solve complicated classification problems that require complex decision boundaries
• Discuss Week 7 notebooks (an end-to-end example is sketched below)
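To wrap up, a minimal end-to-end sketch of training and evaluating a decision tree classifier with scikit-learn; the iris dataset is just a convenient stand-in, not part of the lecture.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Load a small benchmark dataset and hold out a test set.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Induction is fast; limiting the depth keeps the tree simple and easy to interpret.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Classify the held-out samples by traversing the learned tree.
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```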

