
Mean Shift Clustering


• Mean shift is a non-parametric clustering algorithm that provides
more flexibility in identifying clusters and does not require prior
knowledge of the number of clusters.
• Mean shift clustering is a density-based clustering algorithm that
identifies the modes of a density function, which represent the
clusters.
• In other words, it finds the areas of the dataset where the probability
density function is the highest and clusters the data points in those
areas together.
• This method of clustering can be useful for identifying clusters of data
that may not be easily separated using other methods.
• In many cases, mean shift clustering finds a good clustering of the
data, and it is often used in practice for this reason.
How does Mean Shift Clustering work?
• The algorithm starts by initializing a window or kernel around each data
point.
• The kernel can be any type of function that decreases in value as the
distance from the center of the kernel increases.
• The most common kernel function used in mean shift clustering is the
Gaussian kernel.
• The algorithm then computes the mean shift vector for each data point.
• The mean shift vector represents the direction in which the density
function is increasing the most, and its magnitude represents the rate
of increase.
• The mean shift vector is computed as follows:

  m(xᵢ) = [ Σ_{xⱼ ∈ N(xᵢ)} K(xⱼ − xᵢ) · xⱼ ] / [ Σ_{xⱼ ∈ N(xᵢ)} K(xⱼ − xᵢ) ] − xᵢ

• where m(xᵢ) is the mean shift vector for data point xᵢ, K is the kernel
function, and N(xᵢ) is the set of data points within the window or
kernel centered at xᵢ.

• The mean shift algorithm then updates the position of each data
point by shifting it in the direction of the mean shift vector:

  xᵢ ← xᵢ + m(xᵢ)

• Once the algorithm converges (the shifts become negligibly small), the
clusters are defined as the sets of data points that converge to the
same mode of the density function.
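The procedure above can be sketched from scratch. This is a minimal illustration, not a reference implementation: the function name, convergence tolerance, and the threshold used to merge converged points into modes are all illustrative choices.

```python
import numpy as np

def mean_shift(X, bandwidth=1.5, tol=1e-5, max_iter=300):
    """Shift every point to the Gaussian-kernel-weighted mean of the data
    around it until the shifts vanish, then group points that landed on
    the same mode."""
    points = X.astype(float).copy()
    for _ in range(max_iter):
        shifted = np.empty_like(points)
        for i, x in enumerate(points):
            # Gaussian kernel weights of all original data points w.r.t. x
            d2 = np.sum((X - x) ** 2, axis=1)
            w = np.exp(-d2 / (2 * bandwidth ** 2))
            # Weighted mean; the difference from x is the mean shift vector m(x)
            shifted[i] = (w[:, None] * X).sum(axis=0) / w.sum()
        moved = np.linalg.norm(shifted - points)
        points = shifted
        if moved < tol:
            break
    # Points whose final positions (almost) coincide form one cluster
    labels = -np.ones(len(X), dtype=int)
    modes = []
    for i, p in enumerate(points):
        for k, m in enumerate(modes):
            if np.linalg.norm(p - m) < bandwidth / 2:  # illustrative merge radius
                labels[i] = k
                break
        else:
            modes.append(p)
            labels[i] = len(modes) - 1
    return np.array(modes), labels
```

Note that the weights are computed against the original data X at every step (the standard, non-blurring form of the procedure), so the density being climbed stays fixed while the query points move.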

• Hyperparameter of the Mean Shift algorithm:


• Bandwidth: It determines the size of the kernel (or window) used for
density estimation. A larger bandwidth will result in a smoother and
more spread-out density estimate, while a smaller bandwidth will
result in a more localized density estimate.
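The bandwidth's effect can be sketched with scikit-learn's MeanShift (which uses a flat kernel) and its estimate_bandwidth helper; the data points, bandwidth values, and quantile below are made-up illustrations:

```python
import numpy as np
from sklearn.cluster import MeanShift, estimate_bandwidth

# Illustrative data: two tight groups of three points each
X = np.array([[1.0, 1], [1.5, 1.5], [2, 2], [5, 5], [5.5, 5.5], [6, 6]])

# Small bandwidth -> narrow window -> many small clusters
narrow = MeanShift(bandwidth=0.5).fit(X)
# Large bandwidth -> wide window -> everything merges into one cluster
wide = MeanShift(bandwidth=10.0).fit(X)
# A data-driven starting value; the quantile is itself a smoothness knob
h = estimate_bandwidth(X, quantile=0.5)
auto = MeanShift(bandwidth=h).fit(X)

print(len(narrow.cluster_centers_),
      len(auto.cluster_centers_),
      len(wide.cluster_centers_))
```

In practice the bandwidth is the parameter worth tuning: too small and every point becomes its own mode, too large and distinct groups are smoothed into one.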
Choosing the Right Kernel
• Problem: Consider the following 2D data points:
• X={(1, 1), (1.5, 1.5), (2, 2), (5, 5), (5.5, 5.5), (6, 6)}. We will use a
Gaussian Kernel with a bandwidth (h) of 1.5 to compute the PDF and
perform Mean Shift clustering.
• Step 1: Decide Kernel Function: Gaussian Kernel
• Given n data points xᵢ, i = 1, …, n, in a d-dimensional space ℝᵈ, the
multivariate kernel density estimate obtained with kernel K(x) and
window radius or bandwidth h is

  f(x) = (1 / (n·hᵈ)) Σᵢ₌₁ⁿ K((x − xᵢ) / h)
• Mean Shift vector:

  m(x) = [ Σᵢ K((x − xᵢ)/h) · xᵢ ] / [ Σᵢ K((x − xᵢ)/h) ] − x

• The mean shift vector always points toward the direction of the maximum
increase in the density.
• The mean shift procedure, obtained by successively computing the mean
shift vector and translating the window by it, is guaranteed to converge
to a point where the gradient of the density function is zero.
• In Mean Shift Clustering, every data point iteratively moves towards the
centroid (or mode) of its local neighborhood, and eventually, all points
converge to high-density regions. These high-density regions represent the
final centroids of the clusters.
• The final positions of the points are the final centroids of the clusters.
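The worked problem can be reproduced with scikit-learn's MeanShift. Note that it implements a flat kernel rather than the Gaussian kernel assumed in the problem statement, but for these well-separated points it should recover the same two groups:

```python
import numpy as np
from sklearn.cluster import MeanShift

# Data points and bandwidth (h = 1.5) from the worked problem
X = np.array([[1.0, 1], [1.5, 1.5], [2, 2], [5, 5], [5.5, 5.5], [6, 6]])
ms = MeanShift(bandwidth=1.5).fit(X)

print(ms.labels_)           # which cluster each point converged to
print(ms.cluster_centers_)  # final centroids (the converged modes)
```

Each group of three points lies within one window of its own mean, so the points of each group collapse onto a single mode, giving centroids near (1.5, 1.5) and (5.5, 5.5).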
Step-by-Step Explanation of the problem
How Final Centroids Are Obtained
Conclusion
Advantages of Mean Shift Clustering over K-Means Clustering
• No need to specify the number of clusters (k) as a hyperparameter.
The algorithm automatically learns the number of clusters from the
data.
• Mean-Shift can find clusters of arbitrary shapes. It can handle
complex cluster structures. So, we don’t need to make any
assumptions on the shape of clusters.
• Mean-Shift tends to be more robust to noise and outliers in the data.
Unlike K-Means, it does not rely on distances to the centroids of the
clusters. Instead, it relies on the density of the data points.
• In K-Means, we assume all clusters are roughly the same size. Mean
Shift can handle clusters of varying sizes because it focuses on the
density of points.
• Mean Shift can handle clusters that are not well separated by linear
decision boundaries.
• The output does not depend on the random initialization of clusters.

Disadvantages of Mean-Shift Clustering
• Mean-Shift is computationally expensive: each iteration costs roughly
O(n²) in the number of data points n.
• We need to define the bandwidth, i.e., the radius of the region to
search through when assigning data points to clusters.
Scenarios in which to use Mean Shift and DBSCAN
Comparison of K-Means, Hierarchical, DBSCAN, and Mean Shift Clustering
