
Decision Tree and Random Forest

Presenters: Atul Jaguri, Ayush Kukreja, Daivik Mohan
Table of Contents:
1. Mathematical formulation of the algorithm.
1.1 Its significance in today's technology-driven world.
1.2 Real-time applications.
2. Challenges and ethical considerations in data collection and usage.
3. Real-world applications and future trends.
4. Evaluation metrics.
5. Model deployment.
6. Problem solving using the given algorithm.
7. References.
Introduction
Decision Tree:
• Input: Training dataset D = {(X1, y1), (X2, y2), …, (XN, yN)}
• Algorithm: Recursive partitioning based on features, using a splitting criterion (e.g., Gini impurity or entropy) at each node.
• Prediction: Traverse the tree to a leaf node to obtain the class prediction.

Random Forest (Ensemble of Decision Trees):
• Train multiple decision trees on random subsets of the data and features.
• Aggregate their predictions (classification: majority vote; regression: average), as in the sketch below.
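A minimal sketch of this formulation with scikit-learn (the library cited in the references); the iris dataset and all parameter values here are illustrative assumptions, not from the slides:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

# Illustrative dataset; any (X, y) classification data works the same way
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Single decision tree: recursive partitioning with Gini impurity (the default)
tree = DecisionTreeClassifier(criterion="gini", random_state=0).fit(X_train, y_train)

# Random forest: many trees on bootstrap samples and random feature subsets,
# with predictions aggregated by majority vote
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

print("Tree accuracy:  ", tree.score(X_test, y_test))
print("Forest accuracy:", forest.score(X_test, y_test))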
Definition
A decision tree is a supervised machine learning algorithm used for both
classification and regression tasks. It works by recursively partitioning the
dataset into subsets based on the most significant attribute at each node of the
tree. The goal is to create a tree that makes accurate predictions on unseen data.

Whom to loan?
• Not a student
• 45 years old
• Medium income
• Fair credit record
→ Yes

• Student
• 27 years old
• Low income
• Excellent credit record
→ No

(A hand-coded sketch of a tree consistent with these two cases follows below.)
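The prediction for each applicant is the leaf reached by walking the tree on that applicant's attributes. A hypothetical traversal, written as plain Python (the split order below is an assumption; the slide only shows the two cases above):

def loan_decision(student, age, income, credit):
    # Hypothetical tree consistent with the slide's two examples only;
    # age and income would be further split points in a fuller tree.
    if student:
        return "No"
    # Non-students: approve when the credit record is at least fair
    return "Yes" if credit in ("fair", "good", "excellent") else "No"

print(loan_decision(student=False, age=45, income="medium", credit="fair"))   # Yes
print(loan_decision(student=True, age=27, income="low", credit="excellent"))  # No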
Decision Tree Learning
Entropy
• Entropy measures the degree of randomness in data.

• For a set of samples S with C classes:

      H(S) = − Σ_{i=1..C} p_i log2(p_i)

  where p_i is the proportion of elements of class i.

• Lower entropy implies greater predictability!
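A small sketch of this computation in Python (numpy used for convenience):

import numpy as np
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels, in bits
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()              # p_i: proportion of each class
    return float(-(p * np.log2(p)).sum())

print(entropy(["yes"] * 9 + ["no"] * 5))   # ~0.940 bits: a mixed set
print(entropy(["yes"] * 14))               # 0.0: a pure set is fully predictable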


Information Gain
• The information gain of an attribute a is the expected reduction
in entropy due to splitting on values of a:

where is the subset of for which
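Continuing the sketch above (this reuses the entropy helper just defined; the dict-based sample format is an assumption for illustration):

def information_gain(samples, labels, attribute):
    # IG(S, a): entropy of S minus the weighted entropy of the subsets S_v
    total, n = entropy(labels), len(labels)
    remainder = 0.0
    for v in {s[attribute] for s in samples}:
        subset = [y for s, y in zip(samples, labels) if s[attribute] == v]
        remainder += len(subset) / n * entropy(subset)
    return total - remainder

samples = [{"outlook": "sunny"}, {"outlook": "sunny"}, {"outlook": "rain"}, {"outlook": "rain"}]
labels = ["no", "no", "yes", "yes"]
print(information_gain(samples, labels, "outlook"))   # 1.0: this split is perfect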


Gini Impurity
• Gini impurity measures how often a randomly chosen example would be incorrectly labeled if it were labeled at random according to the label distribution.

• For a set of samples S with C classes:

      G(S) = Σ_{i=1..C} p_i (1 − p_i) = 1 − Σ_{i=1..C} p_i²

  where p_i is the proportion of elements of class i.

• Can be used as an alternative to entropy for selecting attributes!
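The same sketch for Gini impurity (reusing the numpy and Counter imports from the entropy example):

def gini(labels):
    # Gini impurity: 1 - sum of squared class proportions
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(1.0 - (p ** 2).sum())

print(gini(["yes"] * 9 + ["no"] * 5))   # ~0.459: a mixed set
print(gini(["yes"] * 14))               # 0.0: a pure set has no impurity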


Random Forests
• Random Forests: instead of building a single decision tree and using it to make predictions, build many slightly different trees and combine their predictions.
• We have a single data set, so how do we obtain slightly different trees?
1. Bagging (Bootstrap Aggregating): take random subsets of data points from the training set to create N smaller data sets, and fit a decision tree on each subset.
2. Random Subspace Method (also known as Feature Bagging): fit N different decision trees by constraining each one to operate on a random subset of features.
(A code sketch of both ideas follows the diagrams below.)
Bagging at training time: [diagram: the training set is resampled with replacement into N subsets, and one tree is fit per subset]

Bagging at inference time: [diagram: a test sample is run through every tree and the votes are combined, e.g. a 75%-confidence majority vote]

Random Forests: [diagram: Tree 1, Tree 2, …, Tree N aggregated into a single random forest]
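A minimal sketch of both ideas built from scikit-learn trees (dataset, tree count, and max_features are illustrative assumptions):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)
n_trees = 25

trees = []
for _ in range(n_trees):
    # Bagging: a bootstrap sample (data points drawn with replacement)
    idx = rng.integers(0, len(X), size=len(X))
    # Random subspace: each tree considers a random subset of features at every split
    t = DecisionTreeClassifier(max_features="sqrt", random_state=int(rng.integers(1_000_000)))
    trees.append(t.fit(X[idx], y[idx]))

# Inference: majority vote across all the trees
votes = np.stack([t.predict(X) for t in trees])
majority = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("Vote accuracy on the training set:", (majority == y).mean())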
Significance in Today's Technology-Driven World:
• Versatility: Decision trees can handle both classification and regression tasks.
• Interpretability: Easy to understand and interpret.
• Ensemble Power: Random Forests improve accuracy and generalization.
• Applications: Widely used in finance, healthcare, marketing, and more.

Real-Time Applications:
• Fraud Detection: Identify unusual patterns in real-time transactions.
• Health Monitoring: Predict patient conditions based on real-time data.
• Online Retail: Personalized recommendations for users.
Challenges:
1. Overfitting: Decision trees, especially deep ones, are prone to overfitting, capturing noise in the training data rather than the underlying patterns.
   • Impact: Reduced generalization performance on new, unseen data.
2. Sensitivity to Small Variations: Small changes in the training data can lead to the generation of significantly different decision trees.
   • Impact: Lack of stability and consistency in the model's predictions.
3. Bias in Data: Decision trees can perpetuate and amplify biases present in the training data.
   • Impact: Unfair or discriminatory predictions, reinforcing societal biases.
4. Lack of Robustness to Outliers: Decision trees can be sensitive to outliers, leading to skewed decision boundaries.
   • Impact: Outliers may disproportionately influence model predictions.

Ethical Considerations:
1. Transparency: Ensuring transparency in how decision trees make predictions.
   • Action: Providing explanations for model decisions, especially in critical applications like healthcare or finance.
2. Fairness: Addressing and mitigating biases in the training data to promote fair and unbiased predictions.
   • Action: Regularly auditing and updating training data to correct biases.
3. Privacy Preservation: Safeguarding individuals' privacy in the training and deployment of decision trees.
   • Action: Implementing data anonymization and encryption protocols to protect sensitive information.
4. Accountability: Establishing accountability for the outcomes of decision tree models.
   • Action: Clearly defining responsibility for model development, monitoring, and addressing any negative consequences.
Real-World Applications and Future Trends:
• Healthcare: Predicting diseases and personalized treatment.
• Finance: Credit scoring, fraud detection.
• Marketing: Customer segmentation, recommendation systems.

Applications:
• Cybersecurity: Decision trees and random forests are used for anomaly detection and identifying patterns indicative of cyber threats.
• Environmental Monitoring: Decision trees can be employed for analyzing environmental data, predicting climate patterns, and assessing the impact of human activities on ecosystems.

Future Trends:
• Explainable AI: Enhancing interpretability.
• Automated Machine Learning (AutoML): Streamlining model development.
• Federated Learning: Training decision tree models across decentralized devices or servers without exchanging raw data.
Problem Solving Using the Given Algorithm
Example Problem: Effect of weather on Play?

• Data Collection: weather_forecast data.
• Model Development: Train a Decision Tree or Random Forest.
• Evaluation: Use appropriate metrics.
• Deployment: Deploy the model for real-time predictions.

Model Creation: [screenshot of the model-training code; a worked sketch follows below]
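A worked sketch of the weather-and-play example with scikit-learn (the toy rows below are classic play-tennis-style data, assumed here because the slides do not list the actual weather_forecast rows):

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Assumed toy data in the spirit of the slide's weather_forecast dataset
data = pd.DataFrame({
    "outlook": ["sunny", "sunny", "overcast", "rain", "rain", "rain", "overcast",
                "sunny", "sunny", "rain", "sunny", "overcast", "overcast", "rain"],
    "windy":   [False, True, False, False, False, True, True,
                False, False, False, True, True, False, True],
    "play":    ["no", "no", "yes", "yes", "yes", "no", "yes",
                "no", "yes", "yes", "yes", "yes", "yes", "no"],
})

X = pd.get_dummies(data[["outlook", "windy"]])   # one-hot encode the categoricals
y = data["play"]

model = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(model, feature_names=list(X.columns)))  # readable tree dump

# Predict for a new day: sunny and not windy
new_day = pd.get_dummies(pd.DataFrame([{"outlook": "sunny", "windy": False}]))
print(model.predict(new_day.reindex(columns=X.columns, fill_value=0)))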
Evaluation Metrics & Model Deployment
[App screenshots: Decision Tree and Random Forest demo apps]
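A minimal sketch of common evaluation metrics with scikit-learn, plus a deployment note (y_test and y_pred are assumed to come from a held-out split and model.predict):

from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

def report(y_test, y_pred):
    # Weighted averages handle multi-class labels; binary tasks work unchanged
    print("Accuracy :", accuracy_score(y_test, y_pred))
    print("Precision:", precision_score(y_test, y_pred, average="weighted"))
    print("Recall   :", recall_score(y_test, y_pred, average="weighted"))
    print("F1 score :", f1_score(y_test, y_pred, average="weighted"))
    print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))

# Deployment sketch (hypothetical app.py, following the Streamlit reference):
# import streamlit as st, joblib
# model = joblib.load("forest.joblib")      # a previously saved model
# if st.button("Predict"):
#     st.write(model.predict(features))     # features gathered from a form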
7. References:
• Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
• Scikit-learn Decision Trees documentation
• Scikit-learn Random Forest documentation
• Streamlit documentation (for model deployment)
• Decision Tree implementation
• Random Forest implementation
• Scikit-learn
THANK YOU
