STAT 432: Basics of Statistical Learning: Trees and Random Forests
https://teazrq.github.io/stat432/
Classification and Regression Trees (CART)
Tree-based Methods
Titanic Survival
Classification and Regression Trees
Example
[Figure: scatter plot of the two-class example data in the (x1, x2) plane.]
Example
[Figure: the same example data, shown across multiple panels.]
Example
• There are many popular R packages that can fit a CART model: rpart, tree, and party.
• Read the reference manual carefully!
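• A minimal sketch of fitting a tree with rpart, assuming a data frame dat (a hypothetical name) with a binary factor response y and covariates x1 and x2:

    library(rpart)

    # fit a classification tree; method = "class" requests a
    # classification tree (Gini splits by default)
    fit <- rpart(y ~ x1 + x2, data = dat, method = "class")

    plot(fit)   # draw the tree skeleton
    text(fit)   # annotate the splits, producing a display like the tree figure below
    predict(fit, newdata = dat, type = "class")   # predicted class labels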
Example
[Figure: the fitted classification tree. The root splits at x2 < −0.644432, with further splits at x1 < 0.767078, x2 < 0.722725, x1 < −0.76759, x1 < −0.61653, x2 < 0.356438, x1 < 0.547162, x2 < −0.28598, and x2 < 0.369569; each terminal node is labeled with its predicted class (0 or 1).]
Tree Algorithm: Recursive Partitioning
[Diagram, built up over several slides: starting from the root node, the sample is split by the rule Age ≤ 45 (No/Yes); one child becomes the terminal node A1, while the other is an internal node that is split again by Female into terminal nodes A2 and A3. The fitted value is constant within each terminal node: $\hat{f}(x)$ for $x \in A_1$, $x \in A_2$, and $x \in A_3$.]
Classification and Regression Trees
Constructing Splitting Rules
Splitting Using Continuous Covariates
$A_L = \{x \in A,\ x^{(j)} \le c\}$
$A_R = \{x \in A,\ x^{(j)} > c\}$
Impurity for Classification
$$\text{score} = \text{Gini}(A) - \frac{|A_L|}{|A|}\,\text{Gini}(A_L) - \frac{|A_R|}{|A|}\,\text{Gini}(A_R),$$
where $\text{Gini}(A) = 1 - \sum_k p_k^2$ and $p_k$ is the proportion of class $k$ among the observations in node $A$.
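• A minimal sketch of this score in R (gini and split_score are hypothetical helper names, not from any package):

    gini <- function(y) {                 # Gini impurity of a node
      p <- table(y) / length(y)           # class proportions
      1 - sum(p^2)
    }

    split_score <- function(y, x, c) {    # impurity reduction of the split x <= c
      n <- length(y); nL <- sum(x <= c); nR <- n - nL
      gini(y) - (nL / n) * gini(y[x <= c]) - (nR / n) * gini(y[x > c])
    }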
Choosing the Split
The best split maximizes the impurity-reduction score over all covariates $j$ and all candidate cutpoints $c$.
Other Impurity Measures
• Shannon entropy: $-\sum_k p_k \log p_k$
• Misclassification error: $1 - \max_k p_k$
Comparing Impurity Measures
Comparing Different Measures
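• For a two-class node with class-1 proportion p, the three measures can be plotted side by side; a sketch (the entropy is halved only so that all three curves pass through the same point at p = 1/2):

    p <- seq(0.001, 0.999, length.out = 200)
    gini     <- 2 * p * (1 - p)                         # Gini impurity
    entropy  <- -(p * log2(p) + (1 - p) * log2(1 - p))  # Shannon entropy
    misclass <- pmin(p, 1 - p)                          # misclassification error

    matplot(p, cbind(gini, entropy / 2, misclass), type = "l", lty = 1,
            xlab = "proportion of class 1", ylab = "impurity")
    legend("topright", c("Gini", "entropy / 2", "misclassification"),
           lty = 1, col = 1:3)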
Regression Problems
For regression problems, the same idea applies with node variance playing the role of impurity:
$$\text{score} = \text{Var}(A) - \frac{|A_L|}{|A|}\,\text{Var}(A_L) - \frac{|A_R|}{|A|}\,\text{Var}(A_R),$$
where for any node $A$, $\text{Var}(A)$ is just the variance of the node samples:
$$\text{Var}(A) = \frac{1}{|A|} \sum_{i \in A} (y_i - \bar{y}_A)^2,$$
with $\bar{y}_A$ the average of the $y_i$'s in node $A$.
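• A sketch of scanning all cutpoints of one covariate for the variance-based score (var_n and best_cut are hypothetical names):

    var_n <- function(y) mean((y - mean(y))^2)   # node variance, with 1/|A| scaling

    best_cut <- function(x, y) {
      xs   <- sort(unique(x))
      cuts <- (head(xs, -1) + tail(xs, -1)) / 2  # midpoints between observed values
      scores <- sapply(cuts, function(c) {
        n <- length(y); nL <- sum(x <= c)
        var_n(y) - (nL / n) * var_n(y[x <= c]) - ((n - nL) / n) * var_n(y[x > c])
      })
      cuts[which.max(scores)]                    # cutpoint with the largest reduction
    }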
Categorical Predictors
$A_L = \{x \in A,\ x^{(j)} \in C\}$
$A_R = \{x \in A,\ x^{(j)} \notin C\}$
where $C$ is a subset of the levels of the categorical covariate $x^{(j)}$.
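• With $K$ levels there are $2^{K-1} - 1$ candidate subsets $C$, which is expensive for large $K$. For binary outcomes, a classical shortcut orders the levels by their within-level mean of y and scans only the $K - 1$ ordered splits; a sketch, assuming a factor x and a 0/1 response y:

    # order the levels of x by the within-level mean of y
    lev    <- names(sort(tapply(y, x, mean)))
    x_rank <- as.numeric(factor(x, levels = lev))  # level ranks, 1 .. K

    # now treat x_rank like a continuous covariate: each split x_rank <= k
    # corresponds to a subset C of the original levels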
Overfitting and Tree Pruning
Cost-Complexity Pruning
• First, grow the maximal tree $T_{\max}$ (possibly one observation per terminal node).
• Specify a complexity penalty parameter $\alpha$.
• For any sub-tree of $T_{\max}$, denoted $T \preceq T_{\max}$, calculate
$$C_\alpha(T) = \sum_{\text{all terminal nodes } A \text{ of } T} |A| \cdot \text{Gini}(A) + \alpha |T| = C(T) + \alpha |T|,$$
where $|T|$ is the number of terminal nodes of $T$.
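• In rpart, the penalty is controlled by the complexity parameter cp (a rescaled version of $\alpha$); a minimal sketch, assuming a data frame train with factor response y:

    library(rpart)

    # grow a large tree (cp = 0) with 10-fold cross-validation
    fit <- rpart(y ~ ., data = train, method = "class",
                 control = rpart.control(cp = 0, xval = 10))
    printcp(fit)   # cross-validated error for each cp value

    # prune back to the cp value with the smallest cross-validated error
    best   <- fit$cptable[which.min(fit$cptable[, "xerror"]), "CP"]
    pruned <- prune(fit, cp = best)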
Missing Values
Remark
Random Forests
Weak and Strong Learners
Bagging Predictors
Ensemble of Trees
[Diagram: bootstrap samples are drawn from the data $\mathcal{D}_n$; a tree is fit to each sample, and their predictions are aggregated into $\hat{f}(x)$.]
Bagging Predictors
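• A minimal by-hand sketch of bagging regression trees (train and test are hypothetical data frames with response y):

    library(rpart)

    B    <- 100
    pred <- matrix(NA, nrow(test), B)

    for (b in 1:B) {
      idx <- sample(nrow(train), replace = TRUE)  # bootstrap sample
      fit <- rpart(y ~ ., data = train[idx, ])    # tree on the bootstrap sample
      pred[, b] <- predict(fit, newdata = test)
    }

    yhat <- rowMeans(pred)  # average the B trees (majority vote for classification)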
CART vs. Bagging
Remarks about Bagging
Random Forests
Tuning Parameter: mtry
Tuning Parameter: nodesize
Tuning Parameters
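• In the randomForest package these are the mtry and nodesize arguments (train is a hypothetical data frame with response y):

    library(randomForest)

    rf <- randomForest(y ~ ., data = train,
                       ntree = 500,   # number of trees
                       mtry = 3,      # covariates tried at each split; defaults are
                                      # sqrt(p) for classification and p/3 for regression
                       nodesize = 5)  # minimum size of terminal nodes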
CART vs. Bagging vs. RF
[Figure, three panels comparing the fits of CART, Bagging, and RF.]
Smoothness Effect of Random Forests
[Figure, two panels comparing CART and RF fits; the RF fit is visibly smoother.]
Random Forests vs. Kernel
RF vs. Kernel
Variable Importance
For each tree $b$, let $\text{Err}^b_0$ be its out-of-bag prediction error, and let $\text{Err}^b_j$ be the out-of-bag error after randomly permuting the values of covariate $j$. Then
$$\text{VI}^b_j = \frac{\text{Err}^b_j}{\text{Err}^b_0} - 1$$
• Average $\text{VI}^b_j$ across all trees:
$$\text{VI}_j = \frac{1}{B} \sum_{b=1}^{B} \text{VI}^b_j$$
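• The randomForest package computes a closely related permutation measure when importance = TRUE; a sketch (train is a hypothetical data frame):

    library(randomForest)

    rf <- randomForest(y ~ ., data = train, importance = TRUE)
    importance(rf, type = 1)  # permutation (mean decrease in accuracy) importance
    varImpPlot(rf)            # plot the importance scores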
Variable Importance in RF
[Figure: bar plot of variable importance scores across predictors x1 through x47; vertical axis from 0 to 10.]