10 Random - Forest - Algo
a. Randomly select “k” features from the total “m” features, where k << m
b. Among the “k” features, calculate the node “d” using the best split point
c. Split the node into daughter nodes using the best split
d. Repeat steps a to c until “l” number of nodes has been reached
e. Build the forest by repeating steps a to d “n” times to create “n” trees
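The tree-building steps above can be sketched in plain Python. This is a minimal illustration, not a production implementation, and it makes a few assumptions that the steps do not spell out: Gini impurity is used to pick the "best split", a depth limit (max_depth) stands in for the "l" nodes stopping rule, and each tree is trained on a bootstrap sample of the rows. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, feature_ids):
    """Return the (feature, threshold) pair with the lowest weighted Gini, or None."""
    best, best_score = None, float("inf")
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # this threshold does not separate the rows
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def build_tree(rows, labels, k, max_depth, rng):
    """Grow one tree, sampling k candidate features at every node."""
    majority = Counter(labels).most_common(1)[0][0]
    if max_depth == 0 or len(set(labels)) == 1:
        return majority  # leaf: predict the most common label
    feature_ids = rng.sample(range(len(rows[0])), k)  # pick k of the m features
    split = best_split(rows, labels, feature_ids)     # best split among those k
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]   # daughter nodes
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [y for _, y in left], k, max_depth - 1, rng),
        build_tree([r for r, _ in right], [y for _, y in right], k, max_depth - 1, rng),
    )


def build_forest(rows, labels, n_trees, k, max_depth, seed=0):
    """Repeat the tree-building procedure n_trees times."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample of the rows (assumption: not stated in the steps above)
        idx = [rng.randrange(len(rows)) for _ in range(len(rows))]
        forest.append(
            build_tree([rows[i] for i in idx], [labels[i] for i in idx], k, max_depth, rng)
        )
    return forest


# Hypothetical toy data: m = 3 features, two classes
rows = [[1.0, 5.0, 0.2], [1.2, 4.8, 0.1], [3.1, 1.0, 0.9],
        [2.9, 1.3, 0.8], [1.1, 5.2, 0.7], [3.0, 0.9, 0.3]]
labels = [0, 0, 1, 1, 0, 1]
forest = build_forest(rows, labels, n_trees=5, k=2, max_depth=3)
```

Each tree is either a bare label (a leaf) or a tuple (feature, threshold, left subtree, right subtree). Sampling a fresh set of k features at every node is what keeps the trees different from one another.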
To make a prediction with the trained forest:
Take the test features, use the rules of each randomly created decision tree to predict the outcome,
and store each predicted outcome (target)
Calculate the votes for each predicted target
Take the predicted target with the most votes as the final prediction of the random forest algorithm
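The voting step can be illustrated with a short sketch. It assumes the per-tree predictions have already been collected into a list; the function name forest_predict and the example predictions are hypothetical.

```python
from collections import Counter


def forest_predict(tree_predictions):
    """Return the majority-vote target and the vote counts."""
    votes = Counter(tree_predictions)       # votes for each predicted target
    winner, _ = votes.most_common(1)[0]     # target with the most votes
    return winner, votes


# Hypothetical outcomes predicted by five trees for one test sample
tree_predictions = ["yes", "no", "yes", "yes", "no"]
prediction, votes = forest_predict(tree_predictions)
print(votes)       # Counter({'yes': 3, 'no': 2})
print(prediction)  # yes
```

With an even number of trees a tie is possible. In this sketch a tie goes to whichever target was seen first, which is a design choice rather than part of the algorithm.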
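from sklearn.neighbors import KNeighborsClassifier

# Baseline: a single k-nearest-neighbours classifier with 5 neighbours
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
knn.score(X_test, y_test)  # mean accuracy on the test set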
Out[4]:
0.6388888888888888
In [5]:
Out[5]:
0.6944444444444444
With the KNN classifier alone, the accuracy is 63.88%. With bagging applied over the same KNN
classifier, the score improves to 69.44%. That gain shows how powerful bagging can be.
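The comparison can be reproduced in outline with scikit-learn's BaggingClassifier. The sketch below is not the notebook's original cell: the dataset (load_wine), the split, and the bagging settings (n_estimators=20, max_samples=0.5) are all assumptions chosen for illustration, so the scores will not necessarily match the ones shown above.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in data (assumption): any labelled dataset would do here
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Baseline: one KNN classifier trained on the full training set
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
knn_score = knn.score(X_test, y_test)

# Bagging: 20 KNN classifiers, each fitted on a bootstrap sample
# containing half of the training rows; predictions are combined by voting
bagged_knn = BaggingClassifier(
    KNeighborsClassifier(n_neighbors=5),
    n_estimators=20,
    max_samples=0.5,
    random_state=0,
)
bagged_knn.fit(X_train, y_train)
bagged_score = bagged_knn.score(X_test, y_test)

print("KNN alone: ", knn_score)
print("Bagged KNN:", bagged_score)
```

Bagging does not guarantee a higher score on every dataset or split. It mainly reduces variance, so the benefit is largest when the base classifier is sensitive to which training rows it sees.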
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, -1].values
Splitting the dataset into the Training set and Test set
y_pred = classifier.predict(X_test)
'c' argument looks like a single numeric RGB or RGBA sequence, which should
be avoided as value-mapping will have precedence in case its length matches
with 'x' & 'y'. Please use a 2-D array with a single row if you really want
to specify the same RGB or RGBA value for all points.