Module 5 - Part 1
INTRODUCTION TO
MACHINE LEARNING
MODULE 5 (UNSUPERVISED LEARNING): ENSEMBLE METHODS, VOTING, BAGGING, BOOSTING. UNSUPERVISED LEARNING - CLUSTERING METHODS - SIMILARITY MEASURES, K-MEANS CLUSTERING, EXPECTATION-MAXIMIZATION FOR SOFT CLUSTERING, HIERARCHICAL CLUSTERING METHODS, DENSITY-BASED CLUSTERING
MODULE 5—PART I
Ensemble methods, Voting, Bagging,
Boosting. Unsupervised Learning - Clustering
Methods - Similarity measures, K-means
clustering
Classification
$y_i = \sum_{j=1}^{L} w_j d_{ji}$

where $d_{ji}$ is the vote of base learner $j$ for class $i$, $w_j \ge 0$ are the learner weights, and $L$ is the number of base learners.
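A minimal Python/NumPy sketch of this weighted-voting rule; the votes d and the weights w below are made-up illustrative numbers, not values from the course material:

```python
import numpy as np

# d[j, i] = vote of base learner j for class i (e.g. estimated class probabilities)
d = np.array([
    [0.7, 0.3],   # learner 1
    [0.6, 0.4],   # learner 2
    [0.2, 0.8],   # learner 3
])

# Learner weights: non-negative and summing to 1 (plain voting uses equal weights)
w = np.array([0.5, 0.3, 0.2])

# Combined score for each class: y_i = sum_j w_j * d_ji
y = w @ d
print("combined scores:", y)              # [0.57, 0.43]
print("predicted class:", int(np.argmax(y)))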
Advantages:
Reduces variance by averaging predictions.
Robust against overfitting, especially with high-variance models like
decision trees.
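A minimal sketch of bagging (bootstrap aggregating), one ensemble method with exactly these properties: each high-variance tree is trained on a bootstrap sample and the predictions are averaged. The dataset and ensemble size below are illustrative assumptions, not from the course material:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic two-class data (illustrative only)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

n_estimators = 25
trees = []
for _ in range(n_estimators):
    # Bootstrap sample: draw n examples with replacement
    idx = rng.integers(0, len(X), size=len(X))
    tree = DecisionTreeClassifier()          # high-variance base learner
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# Aggregate by averaging the 0/1 predictions (equivalently, majority vote)
votes = np.mean([t.predict(X) for t in trees], axis=0)
y_pred = (votes >= 0.5).astype(int)
print("training accuracy of the bagged ensemble:", np.mean(y_pred == y))
```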
BOOSTING
Boosting aims to reduce bias and improve predictive accuracy.
In boosting, models are trained sequentially, where each new model
focuses on correcting the mistakes of the previous one.
Boosting gives more weight to misclassified examples, and each
subsequent model tries to improve upon the errors made by the
previous ones.
Example: AdaBoost (Adaptive Boosting) and Gradient Boosting.
AdaBoost:
Each weak learner is trained on the data, and misclassified examples
are given higher weights so that the next model pays more attention
to these difficult examples.
Final predictions are made by a weighted majority vote of the weak
learners.
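A compact sketch of the AdaBoost reweighting idea, using depth-1 scikit-learn trees (decision stumps) as the weak learners; the data, number of rounds, and random seed are illustrative assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)       # labels in {-1, +1}

n_rounds = 20
sample_w = np.full(len(X), 1.0 / len(X))          # start with uniform example weights
stumps, alphas = [], []

for _ in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1)   # weak learner
    stump.fit(X, y, sample_weight=sample_w)
    pred = stump.predict(X)

    err = np.sum(sample_w * (pred != y)) / np.sum(sample_w)
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))   # weight of this learner

    # Increase the weight of misclassified examples, decrease the correct ones
    sample_w *= np.exp(-alpha * y * pred)
    sample_w /= sample_w.sum()

    stumps.append(stump)
    alphas.append(alpha)

# Final prediction: weighted majority vote of the weak learners
scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("training accuracy:", np.mean(np.sign(scores) == y))
```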
Contd…
• Gradient Boosting:
• Models are built sequentially, but instead of adjusting the weights, gradient
boosting fits the next model to the residuals (the errors of the previous model).
This effectively minimizes the error at each iteration.
• Advantages:
• Reduces bias and increases accuracy.
• Works well with weak learners (e.g., shallow decision trees)
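A minimal regression sketch of this residual-fitting idea, using shallow scikit-learn trees as the weak learners; the synthetic data, number of rounds, and learning rate are illustrative assumptions:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

n_rounds, lr = 50, 0.1
pred = np.full_like(y, y.mean())         # start from a constant prediction
trees = []

for _ in range(n_rounds):
    residuals = y - pred                  # errors of the current ensemble
    tree = DecisionTreeRegressor(max_depth=2)
    tree.fit(X, residuals)                # next model is fit to the residuals
    pred += lr * tree.predict(X)          # shrink each correction by the learning rate
    trees.append(tree)

print("training MSE:", np.mean((y - pred) ** 2))
```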
Boosting generates a sequence of base-learners, each focusing on the previous one's errors (Freund and Schapire, 1996).
Minkowski Distance

$d(A, B) = \left( \sum_{i=1}^{n} |a_i - b_i|^p \right)^{1/p}$, where $p \ge 1$ ($p = 1$ gives the Manhattan distance, $p = 2$ the Euclidean distance).

Cosine Similarity

$\cos \theta = \dfrac{A \cdot B}{\|A\| \, \|B\|}$

Where:
• $A \cdot B$ is the dot product of vectors A and B.
• $\|A\|$ and $\|B\|$ are the magnitudes (Euclidean norms) of vectors A and B.
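A small NumPy sketch of both measures; the two vectors are arbitrary illustrative values:

```python
import numpy as np

def minkowski(a, b, p=2):
    """Minkowski distance; p=2 is Euclidean, p=1 is Manhattan."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their Euclidean norms."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

A = np.array([1.0, 2.0, 3.0])
B = np.array([2.0, 0.0, 1.0])

print("Euclidean (p=2):", minkowski(A, B, p=2))
print("Manhattan (p=1):", minkowski(A, B, p=1))
print("Cosine similarity:", cosine_similarity(A, B))
```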
Repeat the process of assigning each point to its nearest centre and recomputing the centres until the centres converge to fixed points (see the sketch below).
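A minimal NumPy sketch of this k-means loop, using synthetic 2-D data and k = 2; none of these numbers come from the worked example that follows:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2))                              # synthetic 2-D data
k = 2
centres = X[rng.choice(len(X), size=k, replace=False)]     # initialise with k random points

for _ in range(100):
    # Assign each point to its nearest centre (Euclidean distance)
    dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    labels = dists.argmin(axis=1)

    # Recompute each centre as the mean of the points assigned to it
    new_centres = np.array([X[labels == j].mean(axis=0) for j in range(k)])

    # Stop when the centres no longer move
    if np.allclose(new_centres, centres):
        break
    centres = new_centres

print("final cluster centres:\n", centres)
```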
Hint: The distance between the points (x1, x2) and (y1, y2) is calculated using the familiar distance formula of elementary analytical geometry:

$d = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$
3. We compute the distances of the given data points from the cluster
centers.
5. We compute the distances of the given data points from the new cluster
centers.
New cluster centres: (2, 1.75) and (4.5, 4).
7. We compute the distances of the given data points from the new
cluster centers.
11. These are identical to the cluster centres calculated in Step 8. So there
will be no reassignment of data points to different clusters and hence the
computations are stopped here.
12. Conclusion: The k-means clustering algorithm with k = 2 applied to the dataset yields the following clusters and the associated cluster centres: