Understanding Decision Trees in ML

Uploaded by

shristii365

Decision Tree
•A decision tree is a supervised learning technique that can be used for both classification and regression
problems, although it is mostly used for classification. It is a tree-structured model in which internal
nodes represent tests on the features of a dataset, branches represent the decision rules, and each leaf node
represents an outcome.
•A decision tree has two types of nodes: decision nodes and leaf nodes. Decision nodes are used to
make a decision and have multiple branches, whereas leaf nodes are the outputs of those decisions and do not
contain any further branches.
•The decisions, or tests, are performed on the basis of the features of the given dataset.
•It is a graphical representation of all the possible solutions to a problem or decision under the given
conditions.
•It is called a decision tree because, like a tree, it starts at a root node, which expands into further branches
and forms a tree-like structure.
•One common algorithm for building a tree is CART, which stands for Classification and Regression Tree
algorithm.
•A decision tree simply asks a question and, based on the answer (Yes/No), further splits into subtrees.

•ID3 (Iterative Dichotomiser 3) is a specific algorithm developed to create decision trees. It uses a particular
approach (Information Gain) to determine how to split the data at each node.

•The diagram below shows the general structure of a decision tree:

Structure of a Decision Tree
•Root Node: The root node is where the decision tree starts. It represents the entire dataset, which is then
divided into two or more homogeneous sets.
•Leaf Node: Leaf nodes are the final output nodes; the tree cannot be split further once a leaf node is reached.
•Splitting: Splitting is the process of dividing a decision node or the root node into sub-nodes according to the given
conditions.
•Branch/Sub Tree: A subsection of the tree formed by splitting a node.
•Pruning: Pruning is the process of removing unwanted branches from the tree.
•Parent/Child node: A node that is split into sub-nodes is called the parent of those sub-nodes, and the sub-nodes are its
child nodes. The root node is the parent of the nodes directly below it.
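The terms above can be made concrete with a small data structure. The following is a minimal sketch, not a definitive implementation: the `Node` class, the `predict` helper, and the "outlook"/"windy" features are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    """One node of a decision tree (hypothetical minimal structure)."""
    feature: Optional[str] = None      # feature tested at a decision node
    children: Dict[str, "Node"] = field(default_factory=dict)  # branch value -> child node
    outcome: Optional[str] = None      # set only on leaf nodes

    def is_leaf(self) -> bool:
        # A leaf node has no further branches.
        return not self.children


def predict(node: Node, sample: Dict[str, str]) -> str:
    """Walk from the root to a leaf, following the branch that matches the sample."""
    while not node.is_leaf():
        node = node.children[sample[node.feature]]
    return node.outcome


# Illustrative tree: the root tests "outlook"; its "sunny" child is itself a decision node.
root = Node(
    feature="outlook",
    children={
        "sunny": Node(
            feature="windy",
            children={"yes": Node(outcome="stay in"), "no": Node(outcome="play")},
        ),
        "rainy": Node(outcome="stay in"),
    },
)

print(predict(root, {"outlook": "sunny", "windy": "no"}))  # play
print(predict(root, {"outlook": "rainy", "windy": "no"}))  # stay in
```

Here `root` is the parent of the "sunny" and "rainy" nodes, and the "sunny" node together with its two leaves forms a branch (sub tree). Pruning would correspond to replacing such a branch with a single leaf.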
When implementing a decision tree, the main issue is how to select the best attribute for the root node and for the sub-nodes.
This is solved with a technique called an attribute selection measure (ASM), which lets us
select the best attribute for each node of the tree. There are two popular techniques for ASM:
•Information Gain
•Gini Index or Gini impurity

1. Information Gain:
•Information gain is the change in entropy after a dataset is segmented on an attribute.
•It measures how much information a feature provides about the class.
•We split nodes and build the decision tree according to the value of information gain.
•A decision tree algorithm tries to maximize information gain, so the attribute with the highest information
gain is split on first. It can be calculated using the formula below:

$$\text{Information Gain} = \text{Entropy}(S) - \sum_{v} \frac{|S_v|}{|S|}\,\text{Entropy}(S_v)$$

where $S_v$ is the subset of $S$ for which the attribute takes the value $v$, so the sum is the weighted average entropy after the split.

Entropy: Entropy is a metric that measures the impurity of a given attribute. It specifies the randomness in the data. Entropy can be calculated
as:

$$\text{Entropy}(S) = -P(\text{yes})\log_2 P(\text{yes}) - P(\text{no})\log_2 P(\text{no})$$

Where,
•S = the collection of all samples
•P(yes) = probability of yes
•P(no) = probability of no
Entropy is a measure of the uncertainty of a random variable;
it characterizes the impurity of an arbitrary collection of examples.
The higher the entropy, the greater the information content.
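To illustrate entropy and information gain as described above, here is a minimal sketch in plain Python. The function names and the four-row dataset with attributes `a1` and `a2` are assumptions made for this example; they are not taken from the original notes.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Entropy of the parent minus the weighted average entropy of the
    subsets obtained by splitting on `feature`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    weighted = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted


# Hypothetical toy data: a1 separates the classes perfectly, a2 not at all.
rows = [
    {"a1": "T", "a2": "T"},
    {"a1": "T", "a2": "F"},
    {"a1": "F", "a2": "T"},
    {"a1": "F", "a2": "F"},
]
labels = ["yes", "yes", "no", "no"]

print(entropy(labels))                       # 1.0
print(information_gain(rows, labels, "a1"))  # 1.0
print(information_gain(rows, labels, "a2"))  # 0.0
```

With two equally likely classes the entropy is at its maximum of 1 bit. Splitting on `a1` removes all of that uncertainty, so a tree builder that maximizes information gain would choose `a1` first.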
2. Gini Index:
•The Gini index is a measure of impurity (or purity) used when building a decision tree with the
CART (Classification and Regression Tree) algorithm.
•An attribute with a low Gini index should be preferred over one with a high Gini index.
•CART creates only binary splits, and it uses the Gini index to choose them.
•The Gini index can be calculated using the formula below:

$$\text{Gini Index} = 1 - \sum_{j} P_j^2$$

where $P_j$ is the proportion of samples that belong to class $j$.
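As a hedged illustration of the Gini index, the sketch below computes the impurity of a set of labels and the weighted impurity of a binary split. The helper names `gini` and `gini_of_split` and the example labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def gini_of_split(left, right):
    """Weighted average Gini impurity of the two sides of a binary split."""
    total = len(left) + len(right)
    return len(left) / total * gini(left) + len(right) / total * gini(right)


print(gini(["yes", "yes", "no", "no"]))                # 0.5
print(gini_of_split(["yes", "yes"], ["no", "no"]))     # 0.0
print(gini_of_split(["yes", "no"], ["yes", "no"]))     # 0.5
```

The second split produces two pure subsets and therefore the lowest possible Gini index, so it would be preferred over the third, which leaves both sides as mixed as the parent.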
Decision Tree Algorithm
1. What is the entropy of this collection of training examples with respect to the target function classification?
2. What is the information gain of a1 and a2 relative to these training examples?
3. Draw the decision tree for the given dataset.
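The three questions above follow the steps of an ID3-style procedure: compute the entropy, compute the information gain of each attribute, then split on the best attribute and repeat. The sketch below shows one possible way to code this. The six training examples are invented for illustration and are not the dataset the questions refer to; `build_tree` and its tuple-based tree format are likewise assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature):
    """Parent entropy minus the weighted entropy after splitting on `feature`."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    weighted = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted


def build_tree(rows, labels, features):
    """Return a leaf label, or a (feature, {value: subtree}) pair."""
    if len(set(labels)) == 1:      # pure node: stop and emit a leaf
        return labels[0]
    if not features:               # nothing left to test: majority vote
        return Counter(labels).most_common(1)[0][0]
    # Choose the attribute with the highest information gain.
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        branches[value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return (best, branches)


# Hypothetical training examples (NOT the dataset from the exercise).
rows = [
    {"a1": "T", "a2": "T"},
    {"a1": "T", "a2": "T"},
    {"a1": "T", "a2": "F"},
    {"a1": "F", "a2": "F"},
    {"a1": "F", "a2": "T"},
    {"a1": "T", "a2": "F"},
]
labels = ["+", "+", "-", "-", "-", "-"]

tree = build_tree(rows, labels, ["a1", "a2"])
print(tree)
```

On this toy data `a2` has the higher information gain, so it becomes the root; the `a2 = T` branch is then split on `a1`, while the `a2 = F` branch is already pure and becomes a leaf.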
Advantages of Decision Trees:
•Interpretability: Decision trees are easy to understand and visualize.
•No Need for Feature Scaling: They work well with both numerical and categorical data without the need for
normalization.
•Handle Non-linear Relationships: They can capture complex relationships between features and the target
variable.
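One way to see why feature scaling is not needed: a threshold split depends only on the ordering of the feature values, and scaling preserves that ordering. The sketch below checks this on a tiny example; the `ages` and `bought` data and the `best_threshold_partition` helper are hypothetical, and the Gini index is used here only as one possible split criterion.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_threshold_partition(values, labels):
    """Try a threshold between each pair of adjacent sorted values and return
    the set of sample indices sent left by the split with the lowest weighted Gini."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_score, best_left = None, None
    for cut in range(1, len(order)):
        if values[order[cut - 1]] == values[order[cut]]:
            continue  # no threshold can separate equal values
        left, right = order[:cut], order[cut:]
        score = (
            len(left) * gini([labels[i] for i in left])
            + len(right) * gini([labels[i] for i in right])
        ) / len(order)
        if best_score is None or score < best_score:
            best_score, best_left = score, set(left)
    return best_left


ages = [22, 25, 47, 52, 46, 56]                 # hypothetical raw feature values
bought = ["no", "no", "yes", "yes", "yes", "yes"]
scaled = [(a - 22) / (56 - 22) for a in ages]   # min-max scaling to [0, 1]

# The same samples end up on the left side before and after scaling.
print(best_threshold_partition(ages, bought) == best_threshold_partition(scaled, bought))  # True
```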

Disadvantages of Decision Trees:

•Overfitting: Decision trees can create overly complex models that do not generalize well to new data.
•Instability: Small changes in the data can result in a completely different tree structure.
•Bias towards Dominant Classes: If one class dominates the dataset, the tree may be biased towards that class.
