
Machine Learning

Experiment No 3
Aim: To implement the ensemble learning techniques bagging and boosting.

Objective: LO3: To demonstrate ensemble techniques to combine predictions from different models.

Theory:

What is an Ensemble?

Figure: Concept of an ensemble.

Ensemble Learning: Bagging & Boosting


How to combine weak learners to build a stronger learner to
reduce bias and variance in your ML model

Name – Aarushi Tiwari Roll no. - 60



Figure 1. Bagging and Boosting | Spreadsheet, Robot and Idea icons by Freepik on Flaticon

The bias and variance tradeoff is one of the key concerns when working with machine learning algorithms. Fortunately, there are Ensemble Learning based techniques that machine learning practitioners can take advantage of in order to tackle the bias and variance tradeoff: bagging and boosting. This write-up explains how bagging and boosting work, what their components are, and how you can apply them to your ML problem. It is divided into the following sections:

- What is Bagging?
- What is Boosting?
- AdaBoost


What is Bagging?

Bagging, or Bootstrap Aggregation, was formally introduced by Leo Breiman in 1996 [3]. Bagging is an Ensemble Learning technique which aims to reduce learning error through the use of a set of homogeneous machine learning algorithms. The key idea of bagging is the use of multiple base learners, each trained separately on a random sample from the training set, whose outputs are combined through a voting or averaging approach to produce a more stable and accurate model.

The two main components of the bagging technique are random sampling with replacement (bootstrapping) and the set of homogeneous machine learning algorithms (ensemble learning). The bagging process is quite easy to understand: first, "n" subsets are extracted from the training set, then these subsets are used to train "n" base learners of the same type. To make a prediction, each of the "n" learners is fed with the test sample, and the outputs of the learners are averaged (in the case of regression) or voted on (in the case of classification). Figure 2 shows an overview of the bagging architecture.
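As a rough, hand-rolled sketch of this process (the bagging_predict helper below is hypothetical, not part of scikit-learn; it assumes numpy arrays and binary 0/1 labels), the bootstrap-and-vote idea looks like this:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagging_predict(x_train, y_train, x_test, n_learners=5, sample_size=50, seed=23):
    """Train n_learners trees on bootstrap samples and combine them by majority vote."""
    rng = np.random.default_rng(seed)
    all_predictions = []
    for _ in range(n_learners):
        # Bootstrap: draw sample_size indices with replacement from the training set
        idx = rng.integers(0, len(x_train), size=sample_size)
        learner = DecisionTreeClassifier(max_depth=3)
        learner.fit(x_train[idx], y_train[idx])
        all_predictions.append(learner.predict(x_test))
    # Majority vote across the learners (works for binary 0/1 labels)
    votes = np.stack(all_predictions)
    return (votes.mean(axis=0) >= 0.5).astype(int)

The scikit-learn BaggingClassifier used later handles the sampling, training and voting internally.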


Figure 2. Bagging | Image by Author

It is important to note that the number of subsets, as well as the number of items per subset, will be determined by the nature of your ML problem, and the same goes for the type of ML algorithm to be used. In addition, Leo Breiman mentions in his paper that classification problems require more subsets than regression problems.

For implementing bagging, scikit-learn provides the BaggingClassifier class, which makes it easy. For a basic execution we only need to provide a few parameters, such as the base learner, the number of estimators and the maximum number of samples per subset.

Code snippet 1. Bagging implementation
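A condensed version of that setup is shown below (the full listing, including the train/test split, appears in the Implementation section at the end of this write-up):

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Base learner: a shallow decision tree; 5 bootstrap subsets of 50 samples each
# (note: in scikit-learn >= 1.2 the parameter base_estimator is renamed to estimator)
tree = DecisionTreeClassifier(max_depth=3, random_state=23)
bagging = BaggingClassifier(base_estimator=tree, n_estimators=5,
                            max_samples=50, bootstrap=True)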


In the previous code snippet, a bagging-based model was created for the well-known breast cancer dataset. A Decision Tree was used as the base learner, and 5 subsets were created randomly with replacement from the training set (to train 5 decision tree models). The number of items per subset was 50. By running it we get:

Train score: 0.9583568075117371
Test score: 0.941048951048951

One of the key advantages of bagging is that it can be executed in parallel, since there is no dependency between estimators. For small datasets, a few estimators will be enough (as in the example above); larger datasets may require more estimators.
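In scikit-learn this parallelism is exposed through the n_jobs parameter of BaggingClassifier; for instance, reusing the tree, x_train and y_train from the snippet above:

# n_jobs=-1 uses all available CPU cores to fit the 5 base learners in parallel
bagging_parallel = BaggingClassifier(base_estimator=tree, n_estimators=5,
                                     max_samples=50, bootstrap=True, n_jobs=-1)
bagging_parallel.fit(x_train, y_train)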

Great, so far we have seen what bagging is and how it works. Let's now see what boosting is, what its components are and how it relates to bagging.

What is Boosting?

Boosting is an Ensemble Learning technique that, like bagging, makes use of a set of base learners to improve the stability and effectiveness of an ML model. The idea behind a boosting architecture is the generation of sequential hypotheses, where each hypothesis tries to improve on or correct the mistakes made by the previous one [4]. The central idea of boosting is the implementation of homogeneous ML algorithms in a sequential way, where each of these ML algorithms tries to improve the stability of the model by focusing on the errors made by the previous ML algorithm. The way in which the errors of each base learner are taken into account by the next base learner in the sequence is the key differentiator between the variations of the boosting technique.

The boosting technique has been studied and improved over the years, and several variations have been added to the core idea. Some of the most popular are AdaBoost (Adaptive Boosting), Gradient Boosting and XGBoost (Extreme Gradient Boosting). As mentioned above, the key differentiator between boosting-based techniques is the way in which errors are penalized (by modifying weights or minimizing a loss function), as well as how the data is sampled.
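As a rough illustration of the "minimizing a loss function" flavour, gradient boosting for regression with squared loss can be sketched as small trees fitted sequentially to the residuals of the running prediction (the gradient_boost_fit helper is a simplified, hypothetical sketch, not the actual scikit-learn GradientBoostingRegressor implementation):

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(x, y, n_rounds=5, learning_rate=0.1):
    """Sequentially fit small trees to the residuals of the current ensemble."""
    prediction = np.full(len(y), y.mean())      # start from the mean of the targets
    trees = []
    for _ in range(n_rounds):
        residuals = y - prediction              # negative gradient of the squared loss
        tree = DecisionTreeRegressor(max_depth=2)
        tree.fit(x, residuals)
        prediction += learning_rate * tree.predict(x)
        trees.append(tree)
    return y.mean(), trees                      # base value plus the sequence of trees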

For a better understanding of the differences between some of the boosting techniques, let's look in a general way at how AdaBoost and Gradient Boosting, two of the most common variations of the boosting technique, work.

AdaBoost

AdaBoost is an algorithm based on the boosting technique; it was introduced in 1995 by Freund and Schapire [5]. AdaBoost implements a vector of weights to penalize those samples that were incorrectly inferred (by increasing their weight) and reward those that were correctly inferred (by decreasing their weight). Updating this weight vector generates a distribution from which it is more likely to draw the samples with higher weight (that is, those that were incorrectly inferred), and this new sample is introduced to the next base learner in the sequence. This is repeated until a stopping criterion is met. Likewise, each base learner in the sequence is assigned a weight: the higher its performance, the higher the weight and the greater the impact of this base learner on the final decision. Finally, to make a prediction, each base learner in the sequence is fed with the test data, and the predictions of the models are voted on (in the classification case) or averaged (in the regression case). Figure 3 shows the descriptive architecture of the AdaBoost operation.

Figure 3. AdaBoost: a descriptive architecture | Image by Author
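The weight update at the heart of this procedure can be sketched for binary labels in {-1, +1} as follows (the adaboost_round helper is a simplified, discrete-AdaBoost illustration, not the exact variant scikit-learn implements):

import numpy as np

def adaboost_round(weights, y_true, y_pred):
    """One boosting round: compute the learner weight and update the sample weights."""
    miss = y_pred != y_true
    err = weights[miss].sum() / weights.sum()      # weighted error of this learner
    alpha = 0.5 * np.log((1 - err) / err)          # learner weight: higher for lower error
    # Increase the weight of misclassified samples, decrease it for correct ones
    weights = weights * np.exp(np.where(miss, alpha, -alpha))
    return weights / weights.sum(), alpha          # renormalize to a distribution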

Scikit-learn provides the AdaBoostClassifier class to implement the AdaBoost technique; let's see how to perform a basic implementation.


Implementation

Bagging
# For this basic implementation, we only need these modules
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier

# Load the well-known Breast Cancer dataset
# Split into train and test sets
x, y = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25,
                                                    random_state=23)

# For simplicity, we use as base estimator a Decision Tree with fixed parameters
tree = DecisionTreeClassifier(max_depth=3, random_state=23)

# The bagging ensemble classifier is initialized with:
# base_estimator = DecisionTree (renamed to "estimator" in scikit-learn >= 1.2)
# n_estimators = 5 : 5 subsets will be created to train 5 Decision Tree models
# max_samples = 50 : 50 items will be drawn randomly per subset
# bootstrap = True : the sampling is done with replacement
bagging = BaggingClassifier(base_estimator=tree, n_estimators=5,
                            max_samples=50, bootstrap=True)

# Training
bagging.fit(x_train, y_train)

# Evaluating
print(f"Train score: {bagging.score(x_train, y_train)}")
print(f"Test score: {bagging.score(x_test, y_test)}")


Output:

Train score: 0.9583568075117371
Test score: 0.941048951048951

Boosting
# For this basic implementation, we only need these modules
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier

# Load the well-known Breast Cancer dataset
# Split into train and test sets
x, y = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25,
                                                    random_state=23)

# The base learner will be a decision tree with depth = 2
tree = DecisionTreeClassifier(max_depth=2, random_state=23)

# AdaBoost initialization
# base_estimator = DecisionTree (renamed to "estimator" in scikit-learn >= 1.2)
# n_estimators = 5 : the number of estimators will be 5
# learning_rate = 0.1 : scales down the contribution (weight) of each estimator
adaboost = AdaBoostClassifier(base_estimator=tree, n_estimators=5,
                              learning_rate=0.1, random_state=23)

# Train!
adaboost.fit(x_train, y_train)

# Evaluation
print(f"Train score: {adaboost.score(x_train, y_train)}")
print(f"Test score: {adaboost.score(x_test, y_test)}")

Output:
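After fitting, the weight assigned to each base learner in the sequence (discussed in the AdaBoost section above) can be inspected through the fitted model's estimator_weights_ and estimator_errors_ attributes, for example:

# Higher weight means greater influence of that base learner on the final vote
for i, (w, e) in enumerate(zip(adaboost.estimator_weights_,
                               adaboost.estimator_errors_)):
    print(f"Learner {i}: weight = {w:.3f}, weighted error = {e:.3f}")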


Conclusion: Thus, we studied how to implement the ensemble learning techniques Bagging and Boosting.
