ML-3-Decision Tree
Unit-5:
REINFORCEMENT LEARNING: Introduction to Reinforcement Learning, Learning Task, Example of Reinforcement Learning in Practice, Learning Models for Reinforcement (Markov Decision Process, Q-Learning: Q-Learning Function, Q-Learning Algorithm), Application of Reinforcement Learning, Introduction to Deep Q-Learning.
GENETIC ALGORITHMS: Introduction, Components, GA Cycle of Reproduction, Crossover, Mutation, Genetic Programming, Models of Evolution and Learning, Applications.
Books:
1. Tom M. Mitchell, Machine Learning, McGraw-Hill Education (India) Private Limited, 2013.
2. Ethem Alpaydin, Introduction to Machine Learning (Adaptive Computation and Machine Learning), MIT Press.
3. Stephen Marsland, Machine Learning: An Algorithmic Perspective, CRC Press, 2009.
4. Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer-Verlag.
5. M. Gopal, Applied Machine Learning, McGraw Hill Education.
Unit-3:
DECISION TREE LEARNING - Decision tree learning algorithm, Inductive bias, Inductive
inference with decision trees, Entropy and information theory, Information gain, ID-3 Algorithm,
Issues in Decision tree learning.
Decision tree
A decision tree is a non-parametric supervised learning algorithm, which is utilized for both
classification and regression tasks.
Decision tree learning is a method for approximating discrete-valued target functions, in
which the learned function is represented by a decision tree.
Learned trees can also be re-represented as sets of if-then rules to improve human readability.
Decision trees classify instances by sorting them down the tree from the root to some leaf node,
which provides the classification of the instance. Each node in the tree specifies a test of some
attribute of the instance, and each branch descending from that node corresponds to one of the
possible values for this attribute.
An instance is classified by starting at the root node of the tree, testing the attribute specified by
this node, then moving down the tree branch corresponding to the value of the attribute. This
process is then repeated for the subtree rooted at the new node.
For example, in the decision tree for the PlayTennis concept (with Outlook tested at the root), the instance (Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong) would be sorted down the left-most branch (Outlook = Sunny, then Humidity = High) and would therefore be classified as a negative instance (PlayTennis = No).
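A minimal Python sketch of this classification procedure, with the standard PlayTennis tree from Mitchell's example encoded as nested dictionaries (the dictionary representation and the classify helper are our own illustrative choices):

# Decision tree encoded as {attribute: {attribute value: subtree or class label}}
play_tennis_tree = {
    "Outlook": {
        "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(tree, instance):
    # A leaf node is just a class label; return it as the classification.
    if not isinstance(tree, dict):
        return tree
    attribute = next(iter(tree))               # attribute tested at this node
    value = instance[attribute]                # the instance's value for that attribute
    return classify(tree[attribute][value], instance)   # follow the matching branch

instance = {"Outlook": "Sunny", "Temperature": "Hot", "Humidity": "High", "Wind": "Strong"}
print(classify(play_tennis_tree, instance))    # -> "No" (a negative instance)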
Decision tree learning is generally best suited to problems with the following characteristics:
Instances are represented by attribute-value pairs
The target function has discrete output values
Disjunctive descriptions may be required
The training data may contain missing attribute values
Given a collection of training examples, there are typically many decision trees consistent with these examples. The search strategy of ID3 determines which of these trees it outputs, and this preference constitutes its inductive bias.
Approximate inductive bias of ID3: shorter trees are preferred over larger trees.
A closer approximation to the inductive bias of ID3: shorter trees are preferred over longer trees, and trees that place high information gain attributes close to the root are preferred over those that do not.
Inductive inference refers to the process of generalizing knowledge from specific examples to make
probabilistic predictions about new, unseen instances.
The Inductive Learning Algorithm (ILA) is an iterative, inductive machine learning algorithm for generating a set of classification rules of the form "IF-THEN" from a set of examples; at each iteration it produces a new rule and appends it to the rule set.
Entropy and Information Theory
Entropy characterizes the (im)purity of an arbitrary collection of examples. For a collection S containing positive and negative examples of some target concept, the entropy of S relative to this Boolean classification is

Entropy(S) = -(p+) log2(p+) - (p-) log2(p-)

where p+ and p- are the proportions of positive and negative examples in S. For example, if S contains 9 positive and 5 negative examples (written [9+, 5-]), then

Entropy([9+, 5-]) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940
Note:
The entropy is 0 if all members of S belong to the same class
Entropy([14+, 0-]) = -(14/14) log2(14/14) - (0/14) log2(0/14) = 0 (using the convention 0 log2 0 = 0)
The entropy is 1 when the collection contains an equal number of positive and negative
examples.
Entropy([7+, 7-]) = -(7/14) log2(7/14) - (7/14) log2(7/14) = 1
If the collection contains unequal numbers of positive and negative examples, the entropy is
between 0 and 1.
More generally, if the target attribute can take on c different values, then the entropy of S relative to this c-wise classification is defined as

Entropy(S) = Σ_{i=1..c} -p_i log2(p_i)

where p_i is the proportion of S belonging to class i. If the target attribute can take on c possible values, the entropy can be as large as log2(c).
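The sketch below computes these entropy values directly from the definition (the entropy helper, which takes the per-class example counts, is our own illustrative function, not library code):

from math import log2

def entropy(class_counts):
    # Entropy of a collection, given the number of examples in each class.
    total = sum(class_counts)
    proportions = [c / total for c in class_counts if c > 0]   # 0 * log2(0) is taken to be 0
    return sum(-p * log2(p) for p in proportions)

print(entropy([9, 5]))     # about 0.940  (mixed collection)
print(entropy([14, 0]))    # 0.0          (all members in the same class)
print(entropy([7, 7]))     # 1.0          (equal numbers of positive and negative examples)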
Information gain
Information gain measures the expected reduction in entropy: it is the difference between the entropy before the split and the weighted average entropy after splitting the dataset on the values of a given attribute. The ID3 (Iterative Dichotomiser 3) decision tree algorithm uses information gain:

Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|S_v| / |S|) Entropy(S_v)

where Values(A) is the set of all possible values for attribute A, and S_v is the subset of S for which attribute A has value v.
The information gain due to sorting the original 14 examples by the attribute Wind may then be calculated as follows. Here Values(Wind) = {Weak, Strong}, S = [9+, 5-], S_Weak = [6+, 2-] and S_Strong = [3+, 3-]:

Gain(S, Wind) = Entropy(S) - (8/14) Entropy(S_Weak) - (6/14) Entropy(S_Strong)
             = 0.940 - (8/14)(0.811) - (6/14)(1.00)
             = 0.048
Information gain is precisely the measure used by ID3 to select the best attribute at each step in
growing the tree.
For example, of the two attributes Humidity and Wind, which is the better classifier over the full set of 14 training examples? Gain(S, Humidity) = 0.151 while Gain(S, Wind) = 0.048, so the answer is Humidity.
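The same comparison can be reproduced with a short gain helper (reusing the entropy function from the sketch above; the class splits are those of the standard 14-example PlayTennis data):

def information_gain(parent_counts, child_counts_per_value):
    # Gain = entropy before the split minus the weighted entropy after the split.
    total = sum(parent_counts)
    weighted_child_entropy = sum(
        (sum(counts) / total) * entropy(counts)
        for counts in child_counts_per_value
    )
    return entropy(parent_counts) - weighted_child_entropy

s = [9, 5]                                                # the 14 examples: [9+, 5-]
gain_wind = information_gain(s, [[6, 2], [3, 3]])         # Weak: [6+, 2-], Strong: [3+, 3-]
gain_humidity = information_gain(s, [[3, 4], [6, 1]])     # High: [3+, 4-], Normal: [6+, 1-]
print(round(gain_wind, 3), round(gain_humidity, 3))       # 0.048 and about 0.152 (0.151 when intermediate entropies are rounded)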
ID-3 Algorithm
ID3 stands for Iterative Dichotomiser 3 and is named such because the algorithm iteratively
(repeatedly) dichotomizes (divides) features into two or more groups at each step.
Invented by Ross Quinlan, ID3 uses a top-down greedy approach to build a decision tree.
Our basic algorithm, ID3, learns decision trees by constructing them top down, beginning with the
question "which attribute should be tested at the root of the tree?" To answer this question, each
instance attribute is evaluated using a statistical test to determine how well it alone classifies the
training examples.
The best attribute is selected and used as the test at the root node of the tree. A descendant of the
root node is then created for each possible value of this attribute, and the training examples are
sorted to the appropriate descendant node (i.e., down the branch corresponding to the example's
value for this attribute).
The entire process is then repeated using the training examples associated with each descendant node to select the best attribute to test at that point in the tree. This forms a greedy search for an acceptable decision tree, in which the algorithm never backtracks to reconsider earlier choices.
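A compact sketch of this top-down procedure is given below. It is a simplified version of ID3 (it reuses the entropy and information_gain helpers above, and it omits refinements such as handling missing values or unseen attribute values):

from collections import Counter

def id3(examples, attributes, target):
    # examples: list of dicts mapping attribute names (and the target) to values.
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:                    # all examples share one class: make a leaf
        return labels[0]
    if not attributes:                           # no attributes left: predict the majority class
        return Counter(labels).most_common(1)[0][0]

    def gain_of(attribute):
        parent = list(Counter(labels).values())
        children = [
            list(Counter(ex[target] for ex in examples if ex[attribute] == v).values())
            for v in {ex[attribute] for ex in examples}
        ]
        return information_gain(parent, children)

    best = max(attributes, key=gain_of)          # test the highest-gain attribute at this node
    tree = {best: {}}
    for value in {ex[best] for ex in examples}:  # one descendant per observed attribute value
        subset = [ex for ex in examples if ex[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)
    return tree

Called with the 14 PlayTennis examples and the attributes Outlook, Temperature, Humidity and Wind, this produces a nested-dictionary tree of the same form used in the classification sketch earlier.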
An Illustrative Example
Consider the standard PlayTennis training set of 14 examples. ID3 computes the information gain of each candidate attribute (Outlook, Temperature, Humidity, Wind); Outlook has the highest gain and is selected as the root test. The process is then repeated on the examples sorted down each branch until every leaf contains examples of a single class or all attributes have been used.
Issues in Decision Tree Learning
Overfitting is a significant practical difficulty for decision tree learning and many other learning methods. Random noise in the training examples, or coincidental regularities when the training set is small, can lead to overfitting.
There are several approaches to avoiding overfitting in decision tree learning. These can be grouped
into two classes:
Pre-pruning – stop growing the tree early: while the tree is being grown, a node of low importance is not expanded (or is removed), before the tree reaches the point where it perfectly classifies the training data.
Post-pruning – allow the tree to grow to its full depth, possibly overfitting the data, and then prune back nodes of low significance.
A common criterion for deciding which nodes to prune is the training and validation set approach: the available data are separated into a training set used to grow the tree and a separate validation set used to evaluate pruning decisions. Even though the learner may be misled by random errors and coincidental regularities within the training set, the validation set provides a safety check against overfitting.
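As a concrete sketch of the grow-then-prune pattern with a validation set, the code below uses scikit-learn's cost-complexity post-pruning (a different pruning criterion than the one described above, but the same validate-to-select idea; the dataset and split sizes are illustrative assumptions):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# Grow a full tree on the training set, then enumerate candidate pruning levels.
full_tree = DecisionTreeClassifier(criterion="entropy", random_state=0)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)

# Fit one pruned tree per alpha; the validation set decides how much pruning to keep.
best_alpha, best_score = 0.0, -1.0
for alpha in path.ccp_alphas:
    pruned = DecisionTreeClassifier(criterion="entropy", ccp_alpha=alpha, random_state=0)
    pruned.fit(X_train, y_train)
    score = pruned.score(X_val, y_val)
    if score > best_score:
        best_alpha, best_score = alpha, score

print("selected ccp_alpha:", best_alpha, "validation accuracy:", round(best_score, 3))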
Our initial definition of ID3 is restricted to attributes that take on a discrete set of values. Continuous-valued decision attributes can also be incorporated into the learned tree. This is accomplished by dynamically defining new discrete-valued attributes that partition the continuous attribute values into a discrete set of intervals. In particular, for an attribute A that is continuous-valued, the algorithm can dynamically create a new Boolean attribute A_c that is true if A < c and false otherwise, where the threshold c is chosen to maximize information gain.
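A sketch of one common way to pick such a threshold: candidate cutpoints are placed midway between adjacent sorted values where the class label changes, and the candidate with the highest information gain is kept (this reuses the entropy and information_gain helpers above; the data are Mitchell's Temperature example):

from collections import Counter

def best_threshold(values, labels):
    # Choose a cutpoint c for continuous attribute A, defining the Boolean test A < c.
    pairs = sorted(zip(values, labels))
    candidates = [
        (pairs[i][0] + pairs[i + 1][0]) / 2      # midpoint between adjacent sorted values
        for i in range(len(pairs) - 1)
        if pairs[i][1] != pairs[i + 1][1]        # only where the class label changes
    ]
    parent = list(Counter(labels).values())

    def gain_for(c):
        below = [label for value, label in pairs if value < c]
        above = [label for value, label in pairs if value >= c]
        return information_gain(parent, [list(Counter(below).values()),
                                         list(Counter(above).values())])

    return max(candidates, key=gain_for)

temperature = [40, 48, 60, 72, 80, 90]
play_tennis = ["No", "No", "Yes", "Yes", "Yes", "No"]
print(best_threshold(temperature, play_tennis))   # 54.0 -> new Boolean attribute Temperature < 54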
An alternative selection measure that addresses the bias of information gain towards attributes with many values is the gain ratio, which divides the gain by the split information:

SplitInformation(S, A) = - Σ_{i=1..c} (|S_i| / |S|) log2(|S_i| / |S|)

GainRatio(S, A) = Gain(S, A) / SplitInformation(S, A)

where S1 through Sc are the c subsets of examples resulting from partitioning S by the c-valued attribute A.
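A sketch of these two measures in code, reusing the entropy and information_gain helpers defined earlier (the guard against a zero split information is needed when nearly all examples share one attribute value):

def split_information(child_counts_per_value):
    # Entropy of S with respect to the values of attribute A itself.
    subset_sizes = [sum(counts) for counts in child_counts_per_value]
    return entropy(subset_sizes)

def gain_ratio(parent_counts, child_counts_per_value):
    # Penalizes attributes that split the data into many small subsets.
    split_info = split_information(child_counts_per_value)
    if split_info == 0:                       # all examples share a single attribute value
        return 0.0
    return information_gain(parent_counts, child_counts_per_value) / split_info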
A variety of other selection measures have been proposed as well. However, experimental studies suggest that the choice of attribute selection measure has a smaller impact on final accuracy than the extent and method of post-pruning.
In certain cases the available data may be missing values for some attributes. One strategy for dealing with a missing attribute value is to assign it the value that is most common among the training examples at the node where the attribute is being evaluated.
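A small sketch of this strategy (pure Python, with None marking a missing value; the examples passed in are those associated with the current node, represented as a list of dictionaries):

from collections import Counter

def fill_most_common(examples, attribute):
    # Replace missing values of an attribute with its most common observed value.
    observed = [ex[attribute] for ex in examples if ex[attribute] is not None]
    most_common_value = Counter(observed).most_common(1)[0][0]
    for ex in examples:
        if ex[attribute] is None:
            ex[attribute] = most_common_value
    return examples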
In some learning tasks the instance attributes have associated measurement costs, and these costs can vary significantly from attribute to attribute. In such cases we prefer decision trees that use low-cost attributes where possible, relying on high-cost attributes only when needed to produce reliable classifications. ID3 can be modified to take attribute costs into account by introducing a cost term into the attribute selection measure. For example, we might divide the Gain by the cost of the attribute, so that lower-cost attributes are preferred.
While such cost-sensitive measures do not guarantee finding an optimal cost-sensitive decision
tree, they do bias the search in favor of low-cost attributes.
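A tiny sketch of this cost-sensitive selection (the attribute names, gains and costs below are purely hypothetical numbers for illustration):

attribute_gains = {"Temperature": 0.12, "BloodTest": 0.20}   # hypothetical information gains
attribute_costs = {"Temperature": 1.0, "BloodTest": 5.0}     # hypothetical measurement costs

def cost_sensitive_score(attribute):
    # Divide the gain by the cost, so lower-cost attributes are preferred.
    return attribute_gains[attribute] / attribute_costs[attribute]

best = max(attribute_gains, key=cost_sensitive_score)
print(best)   # "Temperature": its lower cost outweighs its slightly lower gain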
Reference
https://www.datacamp.com/tutorial/decision-tree-classification-python
https://www.kdnuggets.com/2020/01/decision-tree-algorithm-explained.html
https://scikit-learn.org/stable/modules/tree.html
EXAMPLE:
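A minimal end-to-end sketch in the spirit of the tutorials cited above, using scikit-learn's DecisionTreeClassifier with the entropy criterion (the dataset, depth limit and split sizes are illustrative choices):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# criterion="entropy" makes the splits information-gain based, as in ID3.
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=1)
clf.fit(X_train, y_train)

print("test accuracy:", round(clf.score(X_test, y_test), 3))
print(export_text(clf, feature_names=list(load_iris().feature_names)))   # the tree as if-then rules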