Unit-5 Decision Trees & Ensemble Methods
• 1. **Data Collection**: Gather the dataset containing the features and the target variable you want to predict.
• 2. **Data Preprocessing**: This step involves handling missing values, encoding categorical variables, and
splitting the dataset into a training set and a testing set for evaluation.
• 3. **Tree Building**: The tree-building process typically follows a recursive, top-down approach. At each node
of the tree:
• - Select the best feature to split the data on, using a criterion such as entropy or Gini impurity (a small impurity sketch follows this list).
• - Split the data into subsets based on the chosen feature.
• - Recursively repeat the process on each subset until certain stopping criteria are met (e.g., maximum tree
depth, minimum number of samples per leaf).
• 4. **Stopping Criteria**: These criteria determine when to stop growing the tree. Common stopping criteria
include reaching a maximum tree depth, having a minimum number of samples in a node, or when further
splitting does not significantly improve model performance.
• 5. **Pruning (Optional)**: After the tree is built, pruning can be applied to reduce overfitting. Pruning involves
removing parts of the tree that do not provide significant improvements in prediction accuracy on a validation
dataset.
• 6. **Prediction**: Once the tree is constructed, it can be used to make predictions on new data. For
classification tasks, predictions are made by traversing the tree from the root to a leaf node and assigning the
majority class in that leaf node. For regression tasks, predictions are made by averaging the target values of
samples in the leaf node.
• 7. **Evaluation**: Finally, evaluate the performance of the decision tree model on the testing set
using appropriate evaluation metrics such as accuracy, precision, recall, F1-score (for classification),
or mean squared error (for regression).
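• The impurity criteria mentioned in step 3 are straightforward to compute directly. Below is a minimal sketch (illustrative only; the function names are not from these notes) of entropy and Gini impurity for an array of class labels:
```python
import numpy as np

def entropy(labels):
    """Entropy: -sum(p * log2(p)) over the class proportions in `labels`."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini(labels):
    """Gini impurity: 1 - sum(p^2) over the class proportions in `labels`."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

labels = np.array([0, 0, 1, 1, 1, 2])
print(entropy(labels))  # ~1.459 (class proportions 2/6, 3/6, 1/6)
print(gini(labels))     # ~0.611
```
• At each candidate split, the tree builder compares the impurity of the parent node with the weighted impurity of the child nodes and keeps the split with the largest reduction.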
• When implementing a decision tree, libraries like scikit-learn in Python provide convenient functions
for building and training decision tree models. Here's a simplified example using scikit-learn:
```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the Iris data, hold out a test set, fit a decision tree, and evaluate it
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
clf = DecisionTreeClassifier(max_depth=3, random_state=42).fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```
• C4.5:
• C4.5, developed by Ross Quinlan as a successor to the ID3 algorithm, introduces several improvements:
• 1. **Handling Continuous Attributes**: Unlike ID3, which only works with categorical attributes, C4.5 can handle both categorical and continuous attributes. It does this by sorting the values of a continuous attribute, evaluating candidate thresholds between adjacent sorted values, and splitting on the best threshold.
• 2. **Handling Missing Values**: C4.5 includes a mechanism to handle missing attribute values. Rather than simply imputing the most common value (as is often done with ID3), C4.5 discounts instances with missing values when computing the gain of a candidate split and then distributes those instances fractionally across the branches, in proportion to the observed frequency of each attribute value.
• 3. **Information Gain Ratio**: While ID3 uses information gain to select the best attribute for splitting, C4.5 uses the information gain ratio, which divides the information gain by the split information of the partition. This adjusts for the bias of plain information gain towards attributes with many distinct values and encourages smaller trees with more meaningful splits (a worked sketch follows this list).
• 4. **Pruning**: C4.5 includes a pruning step to reduce overfitting. After the decision tree is built, pruning involves
removing branches that do not significantly improve the tree's accuracy on a separate validation dataset. Pruning
helps to create simpler, more generalizable trees.
• 5. **Dealing with Overfitting**: C4.5 addresses overfitting by using pruning and by setting a minimum number of
instances required to split a node. This helps prevent the algorithm from creating overly complex trees that capture
noise in the training data.
• 6. **Tree Representation**: Like ID3, the resulting decision tree in C4.5 is represented in a hierarchical structure,
where each internal node represents a decision based on an attribute, and each leaf node represents a class label.
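• As a concrete illustration of points 1 and 3 above, the following toy sketch (an illustrative example, not taken from these notes) computes the information gain and gain ratio of a threshold split on a continuous attribute:
```python
import numpy as np

def entropy(y):
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gain_ratio(x, y, threshold):
    """Information gain and gain ratio for the binary split x <= threshold."""
    left, right = y[x <= threshold], y[x > threshold]
    w_left, w_right = len(left) / len(y), len(right) / len(y)
    gain = entropy(y) - (w_left * entropy(left) + w_right * entropy(right))
    # Split information penalizes splits that fragment the data into uneven pieces
    split_info = -(w_left * np.log2(w_left) + w_right * np.log2(w_right))
    return gain, gain / split_info

# C4.5 sorts the continuous attribute and evaluates thresholds between adjacent values
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0, 0, 0, 1, 1, 1])
print(gain_ratio(x, y, threshold=3.5))  # perfect split: gain = 1.0, gain ratio = 1.0
```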
• C4.5 has been influential in the field of machine learning and data mining due to its effectiveness and flexibility. It has
inspired many variations and improvements, including the popular open-source implementation called C5.0.
• CART:
• CART, which stands for Classification and Regression Trees, is a versatile decision tree algorithm
introduced by Leo Breiman, Jerome Friedman, Richard Olshen, and Charles Stone. CART can be used
for both classification and regression tasks, making it highly flexible. Here's an overview of the CART
algorithm:
• Binary Splitting: Unlike ID3 and C4.5, which can produce multi-way splits, CART performs binary splits at each node of the tree. It considers all possible splits for each attribute and selects the one that yields the largest reduction in a criterion such as Gini impurity (for classification) or mean squared error (for regression).
• Handling Continuous and Categorical Attributes: CART can handle both continuous and categorical attributes. For continuous attributes, it finds the best split point according to the chosen criterion. For categorical attributes, it searches over binary partitions of the category values, grouping the categories into two subsets.
• Pruning: CART includes a pruning step to prevent overfitting. After the full tree is grown, cost-complexity pruning iteratively removes the weakest subtrees while monitoring the tree's performance on a separate validation dataset (or via cross-validation). Pruning helps to create simpler, more interpretable trees that generalize well to unseen data.
• Regression Trees: In regression tasks, CART constructs regression trees to predict continuous target variables. At each node, it chooses the split that minimizes the mean squared error between the predicted values (the mean target value in each child node) and the actual values of the target variable (a brief sketch follows this overview).
• Classification Trees: In classification tasks, CART constructs classification trees to predict class labels.
At each node, it minimizes the Gini impurity, which measures the degree of impurity in the node. CART
aims to create pure nodes with predominantly one class label.
• Tree Representation: The resulting decision tree in CART is represented in a hierarchical structure,
similar to other decision tree algorithms. Each internal node represents a decision based on an attribute,
and each leaf node represents a predicted class label (for classification) or a predicted value (for
regression).
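• A minimal sketch of a CART-style regression tree is shown below (illustrative only; the dataset and the ccp_alpha value are assumptions). scikit-learn's decision trees are CART-based and expose minimal cost-complexity pruning through the ccp_alpha parameter:
```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

# Regression tree: each split minimizes the mean squared error of the child nodes
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# ccp_alpha > 0 turns on minimal cost-complexity pruning; larger values prune more
tree = DecisionTreeRegressor(ccp_alpha=0.01, random_state=0)
tree.fit(X_train, y_train)
print("Test MSE:", mean_squared_error(y_test, tree.predict(X_test)))
```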
Bagging & Boosting and their impact on bias and variance:
• Bagging:
– Process: Bagging involves training multiple base learners independently on random subsets of the
training data, sampled with replacement (bootstrap sampling). Each base learner is trained on a
different subset of the data.
– Combining Predictions: In bagging, predictions from the base learners are typically averaged (for regression tasks) or aggregated by majority voting (for classification tasks) to make the final prediction (a minimal sketch follows below).
– Impact on Bias and Variance:
• Bias: Bagging leaves bias largely unchanged; the averaged ensemble has roughly the same bias as an individual base learner, because averaging predictions does not make any single learner more expressive.
• Variance: Bagging's main benefit is variance reduction. Each base learner is trained on a different bootstrap sample, which introduces diversity among the models, and averaging these diverse models smooths out their individual errors, making the ensemble more robust to variations in the training data and less prone to overfitting.
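• The sketch below (illustrative; it assumes a recent scikit-learn release, where the base learner argument of BaggingClassifier is named estimator) follows the process described above: bootstrap samples of the training data, one decision tree per sample, and aggregation of the trees' predictions:
```python
from sklearn.datasets import load_iris
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 trees, each fit on a bootstrap sample (sampling with replacement);
# class predictions are aggregated across trees (voting / averaged probabilities)
bagger = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=50,
                           bootstrap=True, random_state=0)
bagger.fit(X_train, y_train)
print("Bagging accuracy:", accuracy_score(y_test, bagger.predict(X_test)))
```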
• Boosting:
– Process: Boosting involves training a sequence of base learners iteratively, where each subsequent
learner focuses more on the instances that were misclassified by the previous ones. Examples are
weighted based on their classification performance during training.
– Combining Predictions: Boosting combines the predictions of all base learners in a weighted vote or weighted sum, giving more weight to learners that performed better during training (a minimal sketch follows below).
– Impact on Bias and Variance:
• Bias: Boosting tends to reduce bias by iteratively improving the model's ability to fit the
training data. It can learn complex patterns in the data, potentially leading to lower bias.
• Variance: Boosting can increase variance as it adapts the model to the training data, potentially
leading to overfitting. However, techniques like early stopping and regularization can be used to
mitigate this issue.
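• For comparison, here is a minimal boosting sketch using AdaBoost (illustrative; again assuming a recent scikit-learn release where the base learner argument is estimator), with shallow decision stumps that are successively reweighted toward previously misclassified examples:
```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit 100 depth-1 trees (stumps) sequentially; each round reweights the training
# examples so later stumps focus on instances the earlier ones misclassified, and
# the final prediction is a weighted vote of all stumps.
boost = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                           n_estimators=100, learning_rate=0.5, random_state=0)
boost.fit(X_train, y_train)
print("AdaBoost accuracy:", accuracy_score(y_test, boost.predict(X_test)))
```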