Lec 3&4
Tanmay Basu
$$E(X) = -\sum_{i=1}^{2} p_i \log_2 p_i = -\frac{5}{14}\log_2\frac{5}{14} - \frac{9}{14}\log_2\frac{9}{14} = 0.94$$
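As a quick numeric check of this value, here is the same expression in plain Python (this snippet is my own illustration and not part of the slides):

from math import log2

e = -(5/14) * log2(5/14) - (9/14) * log2(9/14)  # entropy of a 9-versus-5 class split
print(round(e, 2))  # 0.94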
$$IG(X, \text{outlook}) = E(X) - \sum_{v \in \text{outlook}} \frac{|X : \text{outlook} = v|}{|X|}\, E(X : \text{outlook} = v)$$
$$E(X \mid \text{outlook} = \text{sunny}) = -\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5} = 0.970$$
$$E(X \mid \text{outlook} = \text{overcast}) = -1\,\log_2 1 - 0\,\log_2 0 = 0$$
$$E(X \mid \text{outlook} = \text{rainy}) = -\frac{3}{5}\log_2\frac{3}{5} - \frac{2}{5}\log_2\frac{2}{5} = 0.970$$
$$IG(X, \text{outlook}) = 0.94 - \left(\frac{5}{14}\cdot 0.97 + \frac{4}{14}\cdot 0 + \frac{5}{14}\cdot 0.970\right) = 0.94 - 0.692 = 0.248$$
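This calculation can be reproduced with a small, generic information-gain routine. The sketch below is my own Python illustration (names such as information_gain are not from the slides); the per-value class counts (sunny 2/3, overcast 4/0, rainy 3/2) are the ones implied by the entropies above:

from collections import Counter, defaultdict
from math import log2

def entropy(labels):
    # Shannon entropy (base 2) of a list of class labels
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(pairs):
    # pairs: (feature_value, class_label) tuples for the whole set X
    subsets = defaultdict(list)
    for value, label in pairs:
        subsets[value].append(label)
    labels = [label for _, label in pairs]
    weighted = sum(len(s) / len(pairs) * entropy(s) for s in subsets.values())
    return entropy(labels) - weighted

data = ([("sunny", "yes")] * 2 + [("sunny", "no")] * 3
        + [("overcast", "yes")] * 4
        + [("rainy", "yes")] * 3 + [("rainy", "no")] * 2)
print(round(information_gain(data), 3))  # ~0.247, i.e. the 0.248 shown above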
How Does Information Gain Work?
$$IG(X, \text{wind} : \text{outlook} = \text{sunny}) = E(X : \text{outlook} = \text{sunny}) - \sum_{v \in \text{wind}} \frac{|X : \text{wind} = v \,\&\, \text{outlook} = \text{sunny}|}{|X : \text{outlook} = \text{sunny}|}\, E(X : \text{wind} = v \,\&\, \text{outlook} = \text{sunny})$$
$$\therefore\; IG(X, \text{wind} : \text{outlook} = \text{sunny}) = 0.97 - \left(\frac{2}{5}\cdot 1 + \frac{3}{5}\cdot 0.918\right) = 0.97 - 0.950 = 0.020$$
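As before, a short numeric check using only the fractions shown above (plain Python, my own illustration):

from math import log2

e_sunny = -(2/5) * log2(2/5) - (3/5) * log2(3/5)  # 0.971
weighted = 2/5 * 1.0 + 3/5 * 0.918                # weighted entropy of the wind split within the sunny subset
print(round(e_sunny - weighted, 3))               # ~0.020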
Thus, the left subtree of outlook (the outlook = sunny branch) will be split on humidity, since it has the highest information gain among the remaining features. Eventually, the decision tree will look as follows:
Here $c$ is the number of classes in the data set and $p_{i,j}$ is the proportion of examples in $X_j$ that belong to the $i$-th class.
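In these symbols, the standard textbook form of the Gini index for a subset $X_j$ (stated here for reference; it is the usual definition rather than a formula quoted from the slides) is:

$$Gini(X_j) = 1 - \sum_{i=1}^{c} p_{i,j}^{2}$$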
The Gini index tends to isolate the largest class from the rest of the data.
$$\mathrm{GainRatio}(X, f) = \frac{IG(X, f)}{E(X, f)} \qquad (6)$$
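Reading $E(X, f)$ in equation (6) as the entropy of the distribution of $f$'s values over $X$ (the "split information" used by C4.5; this reading is my assumption, since the denominator is not spelled out above), the gain ratio of outlook can be computed as follows:

from math import log2

def value_entropy(props):
    # entropy of a feature's value distribution, e.g. outlook's 5/4/5 split of 14 examples
    return -sum(p * log2(p) for p in props if p > 0)

ig_outlook = 0.248                             # information gain of outlook from above
split_info = value_entropy([5/14, 4/14, 5/14]) # E(X, outlook) under this reading, ~1.577
print(round(ig_outlook / split_info, 3))       # gain ratio of outlook, ~0.157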