Decision Trees (DT), Random Forest (RF), and XGBoost (XGB)
Machine Learning Chapter
Decision Tree
• A decision tree is a versatile non-
parametric algorithm used for both
classification and regression tasks. It
features a hierarchical structure with a
root node, branches, internal nodes,
and leaf nodes. This tree-like model is
employed in decision support systems,
depicting decisions and outcomes
based on conditional control
statements. Its straightforward
structure makes it easy to understand,
and it finds applications in diverse
areas for tasks such as classification
and regression, using feature-based
splits to guide predictions from the
root to the leaves.
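To make this concrete, below is a minimal sketch (assuming scikit-learn is available) that fits a small decision tree classifier on the Iris dataset and prints its hierarchy of feature-based splits from the root down to the leaves; the dataset and parameter choices are illustrative, not part of the original slides.
```python
# Minimal sketch: fit a shallow decision tree and print its learned splits.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)

# max_depth=3 keeps the hierarchical structure small enough to read.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# export_text renders the root node, internal decision nodes, and leaf nodes
# as nested conditions on the features.
print(export_text(tree, feature_names=load_iris().feature_names))
```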
Decision Tree Terminologies
• Root Node: The initial node at the beginning of a decision tree, where the entire
population or dataset starts dividing based on various features or conditions.
Decision Tree Terminologies
• Decision Nodes: Nodes resulting from the splitting of root nodes are known as
decision nodes. These nodes represent intermediate decisions or conditions within
the tree.
Decision Tree Terminologies
• Leaf Nodes: Nodes where further splitting is not possible, often indicating the final
classification or outcome. Leaf nodes are also referred to as terminal nodes.
Decision Tree Terminologies
• Sub-Tree: Similar to a subsection of a graph being called a sub-graph, a sub-section
of a decision tree is referred to as a sub-tree. It represents a specific portion of the
decision tree.
Decision Tree Terminologies
• Pruning: The process of removing or cutting down specific nodes in a decision tree
to prevent overfitting and simplify the model.
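As a hedged illustration of pruning (not from the slides), the sketch below uses scikit-learn's cost-complexity pruning parameter ccp_alpha to cut back a fully grown tree; larger values remove more nodes and yield a simpler, less overfit model. The dataset and the alpha value are illustrative.
```python
# Illustrative sketch: post-pruning a decision tree via cost-complexity pruning.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

unpruned = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(ccp_alpha=0.01, random_state=0).fit(X_train, y_train)

print("nodes before pruning:", unpruned.tree_.node_count)
print("nodes after pruning: ", pruned.tree_.node_count)
print("test accuracy (pruned):", pruned.score(X_test, y_test))
```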
Decision Tree Terminologies
• Parent and Child Node: In a decision tree, a node that is divided into sub-nodes is known as a parent
node, and the sub-nodes emerging from it are referred to as child nodes. The parent node represents a
decision or condition, while the child nodes represent the potential outcomes or further decisions based
on that condition.
Entropy
• Entropy is a measure of the uncertainty, or disorder, in our dataset.
• The formula for Entropy is shown below:
E(S) = −p₊ log₂(p₊) − p₋ log₂(p₋)
Here,
• p₊ is the probability of the positive class
• p₋ is the probability of the negative class
• S is the subset of the training examples
Decision Trees use Entropy!
• Entropy measures the impurity of a node. Impurity is the degree of randomness in the data: a pure node contains examples of a single class (every example is “yes” or every example is “no”) and has zero entropy, while a node split evenly between the classes has maximum entropy.
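As a small sketch of the entropy formula above (assuming NumPy is installed), the hypothetical entropy() helper below computes E(S) for a binary “yes”/“no” label column.
```python
# Small sketch of the entropy formula; entropy() is illustrative, not a library function.
import numpy as np

def entropy(labels):
    """E(S) = -p+ log2(p+) - p- log2(p-), generalized to any number of classes."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

print(entropy(["yes", "yes", "no", "no"]))    # 1.0 -> maximally impure node
print(entropy(["yes", "yes", "yes", "yes"]))  # 0.0 -> pure node
print(entropy(["yes", "yes", "yes", "no"]))   # ~0.81
```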
Information Gain
• Information gain measures the reduction of uncertainty given some feature and it is
also a deciding factor for which attribute should be selected as a decision node or
root node.
• It is simply the entropy of the full dataset minus the entropy of the dataset given some feature:
IG = E(Y) − E(Y | X)
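Continuing the sketch, the hypothetical information_gain() helper below computes IG = E(Y) − E(Y | X) for one categorical feature; the tiny “outlook”/“play” toy data is purely illustrative.
```python
# Sketch of IG = E(Y) - E(Y | X); entropy() and information_gain() are
# hypothetical helpers written for this example only.
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(feature, labels):
    feature, labels = np.asarray(feature), np.asarray(labels)
    # E(Y | X): entropy of the labels within each feature value, weighted by frequency.
    conditional = sum(
        (feature == v).mean() * entropy(labels[feature == v])
        for v in np.unique(feature)
    )
    return entropy(labels) - conditional

# Toy data: the feature with the largest information gain becomes the split node.
outlook = ["sunny", "sunny", "rain", "rain", "overcast", "overcast"]
play    = ["no",    "no",    "yes",  "no",   "yes",      "yes"]
print(information_gain(outlook, play))  # ~0.67
```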
Random Forest
• The Random Forest algorithm's widespread popularity stems from its user-friendly nature and adaptability, enabling it to tackle both classification and regression problems effectively. The algorithm's strength lies in its ability to handle complex datasets and mitigate overfitting, making it a valuable tool for various predictive tasks in machine learning.
Random Forest Understanding
• Let’s use a real-life analogy to understand this concept. A student named X wants to choose a course after college, and he is confused about the choice given his skill set. So he decides to consult various people, such as his cousins, teachers, parents, degree students, and working professionals. He asks them varied questions, such as why he should choose a particular course, the job opportunities it offers, the course fee, and so on. Finally, after consulting various people, he decides to take the course suggested by most of them.
Ensemble Learning Technique
Ensemble simply means combining multiple models: a collection of models is used to make predictions rather than an individual model.
• Ensemble learning uses two types of methods:
• Bagging: creates different training subsets from the sample training data with replacement; the final output is based on majority voting. Example: Random Forest.
• Boosting: combines weak learners into strong learners by building models sequentially, such that the final model has the highest accuracy. Examples: AdaBoost, XGBoost.
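As a hedged comparison of the two methods, the sketch below (assuming scikit-learn is installed) cross-validates a bagging ensemble and a boosting ensemble of decision trees; the dataset and hyperparameters are illustrative.
```python
# Bagging vs. boosting: BaggingClassifier resamples the training data with
# replacement, while AdaBoostClassifier fits weak learners sequentially.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=50, random_state=0)
boosting = AdaBoostClassifier(n_estimators=50, random_state=0)

print("bagging :", cross_val_score(bagging, X, y, cv=5).mean())
print("boosting:", cross_val_score(boosting, X, y, cv=5).mean())
```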
Random Forest Algorithm
• Step 1: In the Random Forest model, a subset of data points and a subset of features are selected for constructing each decision tree. Simply put, n random records and m features are taken from a data set having k records.
• Step 2: Individual decision trees are
constructed for each sample.
• Step 3: Each decision tree will generate an
output.
• Step 4: The final output is obtained by majority voting (for classification) or averaging (for regression), as sketched below.
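The sketch below (assuming scikit-learn is installed) maps these four steps onto RandomForestClassifier; for regression, RandomForestRegressor would average the trees' outputs instead of voting. The dataset and hyperparameters are illustrative.
```python
# Minimal Random Forest sketch following the four steps above.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,      # number of individual decision trees (Step 2)
    max_features="sqrt",   # m features drawn at random for each split (Step 1)
    bootstrap=True,        # n records sampled with replacement (Step 1)
    random_state=0,
)
forest.fit(X_train, y_train)                            # each tree produces an output (Step 3)
print("test accuracy:", forest.score(X_test, y_test))   # majority vote over trees (Step 4)
```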
Difference Between Decision Tree
and Random Forest
Decision Tree
• Normally suffers from overfitting if it is allowed to grow without any control.
• A single decision tree is faster in computation.
• When a data set with features is taken as input, it formulates a single set of rules to make predictions.
Random Forest
• Built from random subsets of the data; the final output is based on averaging or majority voting, so the problem of overfitting is taken care of.
• Comparatively slower in computation.
• Randomly selects observations and features, builds multiple decision trees, and averages (or votes on) their results; it does not rely on a single fixed set of rules.
XGBoost Algorithm
• XGBoost, a potent algorithm, excels in scalability, facilitating swift learning through parallel and distributed computing while ensuring efficient memory utilization. CERN recognized it as the optimal approach for classifying signals from the Large Hadron Collider: faced with the challenge of processing 3 petabytes of data annually, XGBoost emerged as the most effective and robust solution, adept at distinguishing extremely rare signals from background noise in complex physical processes.
Gradient Boosting
• Gradient Boosting, including algorithms like XGBoost, has proven to be a powerful and versatile machine learning technique with several advantages:
1. High predictive accuracy: it builds a strong predictive model by combining the predictions of multiple weak learners (typically decision trees).
2. Handling nonlinear relationships: it is capable of capturing complex, nonlinear relationships in the data, making it suitable for a wide range of applications.
3. Flexibility: it can be used for both regression and classification problems, making it a versatile choice for different types of tasks.
Further advantages include feature importance estimates, robustness to overfitting, parallelization, and ensemble learning.
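As a minimal sketch of gradient boosting in practice (assuming the xgboost package is installed), the example below trains an XGBoost classifier, i.e. a sequentially built ensemble of shallow decision trees; the dataset and hyperparameters are illustrative.
```python
# Gradient boosting sketch using XGBoost's scikit-learn-style API.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = XGBClassifier(
    n_estimators=200,     # number of sequentially added weak learners
    max_depth=3,          # each weak learner is a shallow tree
    learning_rate=0.1,    # shrinks each tree's contribution
    random_state=0,
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```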
Unique Features of XGBoost
• Regularization: XGBoost can penalize complex models through both L1 and L2 regularization, which helps prevent overfitting.
• Handling sparse data: missing values or preprocessing steps like one-hot encoding make data sparse. XGBoost incorporates a sparsity-aware split-finding algorithm to handle different sparsity patterns in the data.
• Weighted quantile sketch: most existing tree-based algorithms can only find split points when the data points have equal weights (using a quantile sketch algorithm); XGBoost's weighted quantile sketch handles weighted data as well.
• Out-of-core computing: this feature optimizes available disk space and maximizes its usage when handling huge datasets that do not fit into memory.
• Cache awareness: fetching gradient statistics by row index requires non-contiguous memory access, so XGBoost has been designed to make optimal use of the hardware cache.
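To illustrate two of these features (an illustrative sketch, assuming the xgboost package is installed), the example below sets the L1/L2 regularization parameters reg_alpha and reg_lambda and trains directly on data containing NaN values, which the sparsity-aware split finding handles without imputation; the dataset and parameter values are illustrative.
```python
# Regularization and native missing-value handling in XGBoost.
import numpy as np
from sklearn.datasets import load_breast_cancer
from xgboost import XGBClassifier

X, y = load_breast_cancer(return_X_y=True)

# Randomly blank out 10% of the entries to simulate sparse / missing data.
rng = np.random.default_rng(0)
X = X.copy()
X[rng.random(X.shape) < 0.10] = np.nan

model = XGBClassifier(
    reg_alpha=0.5,    # L1 penalty on leaf weights
    reg_lambda=1.0,   # L2 penalty on leaf weights
    missing=np.nan,   # value treated as "missing" by the sparsity-aware splits
    random_state=0,
)
model.fit(X, y)       # no imputation step is needed
print("training accuracy:", model.score(X, y))
```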
Session Finished
Thank You!
MACHINFY EDUCATION TEAM