Lecture - 10 Unsupervised Learning & K-Means Clustering
K-Means Clustering
Unsupervised Learning
[Figure: supervised learning vs. unsupervised learning.]
Unsupervised Learning - Applications
• Clustering allows you to automatically split the dataset into groups according to
similarity. Often, however, cluster analysis overestimates the similarity between groups
and doesn’t treat data points as individuals. For this reason, cluster analysis is a poor
choice for applications like customer segmentation and targeting.
• Anomaly detection can automatically discover unusual data points in your dataset. This
is useful in pinpointing fraudulent transactions, discovering faulty pieces of hardware, or
identifying an outlier caused by a human error during data entry.
• Association rules discovery identifies sets of items that frequently occur together in your
dataset. Retailers often use it for basket analysis, because it allows analysts to discover
goods often purchased at the same time and develop more effective marketing and
merchandising strategies.
• Latent variable models are commonly used for data preprocessing, such as reducing the
number of features in a dataset (dimensionality reduction) or decomposing the dataset
into multiple components.
Clustering for Understanding
What is Cluster Analysis?
Clustering Example – news.google.com
Cluster formation methods
Centroid-based clusters
k-Means Clustering Algorithm
• An iterative algorithm that partitions the dataset, based on its features, into K predefined, non-overlapping clusters or subgroups.
• It makes the data points within each cluster (intra-cluster) as similar as possible, while keeping the clusters themselves as far apart as possible.
• It assigns each data point to the cluster whose centroid is nearest, so that the sum of squared distances between the data points and their cluster's centroid is minimized, where the cluster's centroid is the arithmetic mean of the data points in that cluster (see the objective below).
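Formally (a standard formulation, not spelled out on the slide), k-means minimizes the within-cluster sum of squares:

J = \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2

where \mu_j is the centroid (arithmetic mean) of cluster C_j.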
Step 1: Initialization
Step 2: Cluster Assignment
Step 3: Moving Centroid
Step 4: Optimization
Step 5: Convergence
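A minimal NumPy sketch of the five steps above (a sketch with illustrative names, assuming no cluster ever becomes empty; not the lecture's reference code):

import numpy as np

def kmeans(X, k, max_iters=100, seed=0):
    """Plain k-means on an (n, d) array X; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Step 1: Initialization -- pick k distinct data points as starting centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(max_iters):
        # Step 2: Cluster assignment -- each point goes to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: Moving centroid -- recompute each centroid as its cluster's mean
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Steps 4-5: Optimization / convergence -- repeat until centroids stop moving
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids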
Euclidean Distance
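For two points p = (p_1, p_2) and q = (q_1, q_2):

d(p, q) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2}

For example, for the demo's q(1, 1) and p(1, 0): d(p, q) = \sqrt{(1-1)^2 + (1-0)^2} = 1.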
k-Means Demo – Step 1: Initialization
ID  X1  X2
A    1   1
B    1   0
C    0   2
D    2   4
E    3   5
k-Means Demo – Step 2: Cluster Assignment
Consider the distance between q(1, 1) and p(1, 0): d(p, q) = 1.

ID  X1  X2
A    1   1
B    1   0
C    0   2
D    2   4
E    3   5

Initial centroids:
CID  X1  X2
1     1   1
2     0   2
k-Means Demo – Step 2: Cluster Assignment
Distance from each point to each centroid:

ID  X1  X2  d(·, C1)  d(·, C2)
A    1   1   0          1.4
B    1   0   1          2.2
C    0   2   1.4        0
D    2   4   3.2        2.8
E    3   5   4.5        4.2

Centroids:
CID  X1  X2
1     1   1
2     0   2
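As a check on one table entry: d(D, C2) = \sqrt{(2-0)^2 + (4-2)^2} = \sqrt{8} \approx 2.8.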
k-Means Demo – Step 2: Cluster Assignment
Each point is assigned to its nearest centroid (CID):

ID  X1  X2  d(·, C1)  d(·, C2)  CID
A    1   1   0          1.4       1
B    1   0   1          2.2       1
C    0   2   1.4        0         2
D    2   4   3.2        2.8       2
E    3   5   4.5        4.2       2

Centroids:
CID  X1  X2
1     1   1
2     0   2
k-Means Demo – Step 3: Moving Centroid
ID  X1  X2  d(·, C1)  d(·, C2)  CID
A    1   1   0          1.4       1
B    1   0   1          2.2       1
C    0   2   1.4        0         2
D    2   4   3.2        2.8       2
E    3   5   4.5        4.2       2

Old centroids:
CID  X1  X2
1     1   1
2     0   2

New centroids (cluster means):
CID  X1    X2
1     1     0.5
2     1.67  3.67
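The arithmetic behind the new centroids: cluster 1 holds A and B, so C1 = ((1+1)/2, (1+0)/2) = (1, 0.5); cluster 2 holds C, D, and E, so C2 = ((0+2+3)/3, (2+4+5)/3) = (1.67, 3.67).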
k-Means Demo – Step 4: Optimization
Distances are recomputed against the new centroids; note that C now switches to cluster 1:

ID  X1  X2  d(·, C1)  d(·, C2)  CID
A    1   1   0.5        2.7       1
B    1   0   0.5        3.7       1
C    0   2   1.8        2.4       1
D    2   4   3.6        0.5       2
E    3   5   4.9        1.9       2

Old centroids:
CID  X1  X2
1     1   1
2     0   2

Current centroids:
CID  X1    X2
1     1     0.5
2     1.67  3.67
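Because C switched to cluster 1, the centroids move once more: C1 = mean(A, B, C) = ((1+1+0)/3, (1+0+2)/3) = (0.67, 1) and C2 = mean(D, E) = ((2+3)/2, (4+5)/2) = (2.5, 4.5).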
k-Means Demo – Step 5: Convergence
ID  X1  X2  d(·, C1)  d(·, C2)  CID
A    1   1   0.33       3.81      1
B    1   0   1.05       4.74      1
C    0   2   1.20       3.54      1
D    2   4   3.28       0.71      2
E    3   5   4.63       0.71      2

Old centroids:
CID  X1    X2
1     1     0.5
2     1.67  3.67

New centroids:
CID  X1    X2
1     0.67  1
2     2.5   4.5

No change in the cluster assignments, done!
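For comparison, a sketch of the same demo using scikit-learn's KMeans (assumes scikit-learn is installed; init is set to the demo's starting centroids):

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 1], [1, 0], [0, 2], [2, 4], [3, 5]])  # points A-E
init = np.array([[1.0, 1.0], [0.0, 2.0]])               # demo's initial centroids
km = KMeans(n_clusters=2, init=init, n_init=1).fit(X)
print(km.labels_)           # A, B, C in one cluster; D, E in the other
print(km.cluster_centers_)  # approximately [[0.67, 1.0], [2.5, 4.5]]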
Another Example
[Figure: two rows of scatter plots (axes x and y) showing k-means progress over Iteration 1, Iteration 2, and Iteration 3.]
k-Means Clustering: Discussion
k-Means Clustering: Advantages