
Unit – III

Unsupervised Learning, Clustering, Support Vector Machines


• Parametric methods apply when the data is known to have a distribution
• Use the data to estimate the parameters of the distribution
– Typically few in number
• May be too rigid in some cases – i.e., the same distribution is always assumed
Clustering

• (Q) Supervised or unsupervised?

K-Means Clustering
• A method to assign a cluster to each sample
• Minimize the overall cost = reconstruction error = total distance between samples and their cluster “centers”
• Choosing a cluster for each sample: assign it to the nearest center

K-Means Clustering
• Once new centers are obtained, recalculate the assignments
• Repeat with the new means and re-assign the samples
• Continue until the centers no longer change
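
A minimal sketch of this loop (the data, function name, and initialization scheme are illustrative assumptions, not from the slides):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random initial centers
    for _ in range(iters):
        # Assignment step: each sample goes to its nearest center
        labels = np.argmin(np.linalg.norm(X[:, None] - centers, axis=2), axis=1)
        # Update step: each center becomes the mean of its assigned samples
        # (assumes no cluster ends up empty)
        new_centers = np.array([X[labels == i].mean(axis=0) for i in range(k)])
        if np.allclose(new_centers, centers):  # stop when centers stop moving
            break
        centers = new_centers
    return centers, labels

# Hypothetical 2-D data with two obvious groups
X = np.array([[1, 1], [1.2, 0.9], [0.8, 1.1], [5, 5], [5.1, 4.8], [4.9, 5.2]])
centers, labels = kmeans(X, k=2)
print(centers, labels)
```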

K-Means Clustering
• Choose a random set of cluster centers
• The reconstruction error should be minimized; hence, differentiate it and equate to zero
• The resulting new cluster centers are nothing but the means of the samples assigned to each cluster
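
In symbols (notation assumed, following the standard k-means derivation), with b_i^t = 1 when sample x^t is assigned to center m_i:

```latex
E = \sum_{t}\sum_{i} b_i^t \,\lVert x^t - m_i \rVert^2,
\qquad
\frac{\partial E}{\partial m_i} = 0
\;\Longrightarrow\;
m_i = \frac{\sum_t b_i^t \, x^t}{\sum_t b_i^t}
```
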
Leader Clustering
• Some outliers may skew the means
• Add a parameter “t” – a maximum distance threshold
– If a sample is not within t of any cluster, it becomes a new cluster head
• Recalculate with the new means and continue
• (Q) What is the value of t for which leader clustering becomes K-means?
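
A minimal single-pass sketch of the leader rule (the helper name and data are illustrative; the slides' variant also recalculates the means and continues):

```python
import numpy as np

def leader_clustering(X, t):
    # Single pass over the data: a sample farther than t from every
    # existing leader starts a new cluster
    leaders = [X[0]]
    labels = []
    for x in X:
        dists = [np.linalg.norm(x - m) for m in leaders]
        if min(dists) <= t:
            labels.append(int(np.argmin(dists)))   # join the nearest cluster
        else:
            leaders.append(x)                      # becomes a new cluster head
            labels.append(len(leaders) - 1)
    return np.array(leaders), labels

X = np.array([[0.0], [0.3], [5.0], [5.2], [20.0]])  # 20.0 is an outlier
print(leader_clustering(X, t=1.0)[1])  # -> [0, 0, 1, 1, 2]
```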

EM Algorithm

Expectation Maximization (EM) Algorithm
• Unsupervised data – only data, no class labels
• Hence, the data consists of two parts – observables X and unknowns Z
• E-step: we estimate these labels given our current knowledge of the components
• M-step: we update our component knowledge given the labels estimated in the E-step
– (Q) Is K-Means also a type of EM algorithm?

EM Example
• Missing normally distributed data
• Assume a Gaussian variable
• {5, 11, x, x}
• Start: choose a random value for x
• E-step: calculate the mean of the data
• M-step: replace x with the new mean from the previous step
• Continue until convergence (Will it?)
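
A tiny numeric sketch of this iteration: with both missing values imputed as m, the update is m ← (5 + 11 + 2m)/4, whose fixed point is m = 8, so it does converge:

```python
x = 0.0  # arbitrary starting guess for the missing values
for _ in range(50):
    m = (5 + 11 + 2 * x) / 4   # E-step: mean under the current imputation
    if abs(m - x) < 1e-9:      # stop once the value settles
        break
    x = m                      # M-step: re-impute the missing values
print(x)  # -> 8.0, independent of the starting guess
```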

Coin Example

Sample EM Algorithm
• Consider the six samples and the true likelihood of the two classes
• The problem in EM:
– What we have: K (i.e., the number of classes), the form of the distribution, the data points
– What we don’t have: the parameters of the distribution, the class labels
• How to get the two class means?

Sample EM Algorithm
• (Q) Outline the application of the EM algorithm for Gaussian Mixtures.
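
A minimal sketch of EM for a two-component 1-D Gaussian mixture (the data and initial values are made up for illustration):

```python
import numpy as np

# Hypothetical 1-D data; two components assumed (K = 2)
x = np.array([1.0, 1.2, 0.8, 5.0, 5.3, 4.9])
mu = np.array([0.0, 1.0])    # initial means
var = np.array([1.0, 1.0])   # initial variances
pi = np.array([0.5, 0.5])    # initial mixing weights

def gauss(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

for _ in range(50):
    # E-step: responsibility of each component for each sample
    r = pi * gauss(x[:, None], mu, var)      # shape (n, 2)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, variances from responsibilities
    n_k = r.sum(axis=0)
    pi = n_k / len(x)
    mu = (r * x[:, None]).sum(axis=0) / n_k
    var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / n_k

print(mu)  # -> approximately [1.0, 5.07], the two class means
```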

Hierarchical Clustering

Agglomerative Clustering
• Agglomerate – collect into a group
• Idea: start with each sample as a separate cluster
– Bottom-up approach
• Start by combining samples close to each other
– Distance, e.g., for documents: a word-to-vector representation with Euclidean distance
• Each iteration – agglomerate (i.e., combine) the two closest groups
• Continue until only one group is left

Agglomerative Clustering
• Consider the second iteration of agglomerative clustering – i.e., all groups have two elements
• How to combine two groups – what is the distance between two groups with two elements each? (see the sketch below)
– Single-link clustering – calculate all four (how?) pairwise distances; the smallest is the distance between the groups
– Complete-link clustering – the distance between clusters is the largest distance over all pairs
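
A short sketch of the two group distances (the group contents are hypothetical):

```python
import numpy as np

def single_link(A, B):
    # smallest pairwise distance between any a in A and b in B
    return min(np.linalg.norm(a - b) for a in A for b in B)

def complete_link(A, B):
    # largest pairwise distance between any a in A and b in B
    return max(np.linalg.norm(a - b) for a in A for b in B)

# Two two-element groups: 2 x 2 = 4 pairwise distances to compare
A = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]
B = [np.array([3.0, 0.0]), np.array([5.0, 0.0])]
print(single_link(A, B))    # 2.0  (closest pair)
print(complete_link(A, B))  # 5.0  (farthest pair)
```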

• Perform agglomerative clustering and show the intermediate steps on the following distance matrix:

      1   2   3   4
  1   0   7   4   8
  2       0   2   1
  3           0   8
  4               0
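
One way to check the intermediate steps, assuming SciPy's hierarchical clustering utilities:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# Condensed upper-triangular distances, in the order
# d(1,2), d(1,3), d(1,4), d(2,3), d(2,4), d(3,4)
d = np.array([7.0, 4.0, 8.0, 2.0, 1.0, 8.0])

# Single-link agglomeration; each row: (cluster i, cluster j, distance, size)
print(linkage(d, method="single"))
# Merges samples 2 and 4 first (d = 1), then sample 3 joins them (d = 2),
# then sample 1 joins (d = 4) -- indices are 0-based in the output
```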

Dendrogram
• Used to visualize the clusters
• Clusters are joined at a “height” – based on the linkage type
• Two clusters combined into one – an agglomeration
• Can be “cut” at a height h so that nothing is grouped beyond that distance
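
A minimal sketch of drawing and cutting a dendrogram with SciPy (reusing the distance matrix from the exercise above):

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

# Condensed distances d(1,2), d(1,3), d(1,4), d(2,3), d(2,4), d(3,4)
d = np.array([7.0, 4.0, 8.0, 2.0, 1.0, 8.0])
Z = linkage(d, method="single")

dendrogram(Z, labels=["1", "2", "3", "4"])   # joins drawn at their linkage heights
plt.ylabel("height")
plt.show()

# Cutting at height h keeps only the merges made below h
print(fcluster(Z, t=3.0, criterion="distance"))  # h = 3 -> clusters {1} and {2, 3, 4}
```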

• (Q) Draw the dendrogram using single-link clustering and cut it at height 0.55. What are the resulting clusters?

Kernelised SVM

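A minimal sketch of a kernelised SVM, assuming scikit-learn's SVC with an RBF kernel on made-up XOR-style data (a pattern no linear boundary can separate):

```python
import numpy as np
from sklearn.svm import SVC

# Made-up XOR-style data: not separable by any straight line
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])

# The RBF kernel implicitly maps samples to a richer feature space,
# where a linear separator exists
clf = SVC(kernel="rbf", C=10.0, gamma=2.0)
clf.fit(X, y)
print(clf.predict(X))  # -> [0 1 1 0] on the training points
```
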
Kernels
• Example kernel types for feature extraction: Identity, Blur, Edge Detection, Sharpening
• Operations: Convolution, Pooling, Dilation, Erosion, Cross-correlation
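
A short sketch of one such operation – convolving an image patch with a sharpening kernel (the values are illustrative, using SciPy):

```python
import numpy as np
from scipy.signal import convolve2d

# Hypothetical 5x5 grayscale patch
img = np.arange(25, dtype=float).reshape(5, 5)

# A common 3x3 sharpening kernel (one of the types listed above)
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]], dtype=float)

# Convolution slides the kernel over the image and sums the products
out = convolve2d(img, sharpen, mode="same", boundary="symm")
print(out)
```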

