
Clustering 3: Hierarchical clustering (continued); choosing the number of clusters

Ryan Tibshirani
Data Mining: 36-462/36-662

January 31 2013

Optional reading: ISL 10.3, ESL 14.3

1
Even more linkages
Last time we learned about hierarchical agglomerative clustering; the basic idea is to repeatedly merge the two most similar groups, as measured by the linkage

Three linkages so far: single, complete, and average linkage. Properties:

- Single and complete linkage can have problems with chaining and crowding, respectively, but average linkage doesn't
- Cutting an average linkage tree provides no interpretation, but there is a nice interpretation for single and complete linkage trees
- Average linkage is sensitive to a monotone transformation of the dissimilarities d_ij, but single and complete linkage are not
- All three linkages produce dendrograms with no inversions

Actually, there are many more linkages out there, each having
different properties. Today: we’ll look at two more

2
Reminder: linkages
Our setup: given X_1, ..., X_n and pairwise dissimilarities d_ij. (E.g., think of X_i ∈ R^p and d_ij = ‖X_i − X_j‖_2)

Single linkage: measures the closest pair of points

    d_single(G, H) = min_{i∈G, j∈H} d_ij

Complete linkage: measures the farthest pair of points

    d_complete(G, H) = max_{i∈G, j∈H} d_ij

Average linkage: measures the average dissimilarity over all pairs

    d_average(G, H) = (1 / (n_G · n_H)) Σ_{i∈G, j∈H} d_ij
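As a quick illustration (not from the slides), here is a minimal R sketch that computes these three linkage scores directly for two made-up groups of points:

set.seed(1)
xG = matrix(rnorm(10 * 2), ncol = 2)           # group G: 10 hypothetical points in R^2
xH = matrix(rnorm(8 * 2, mean = 3), ncol = 2)  # group H: 8 hypothetical points in R^2
cross = as.matrix(dist(rbind(xG, xH)))[1:10, 11:18]  # d_ij for i in G, j in H

d.single = min(cross)      # closest cross-group pair
d.complete = max(cross)    # farthest cross-group pair
d.average = mean(cross)    # average over all cross-group pairs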

3
Centroid linkage
Centroid linkage¹ is commonly used. Assume that X_i ∈ R^p, and d_ij = ‖X_i − X_j‖_2. Let X̄_G, X̄_H denote the group averages for G, H. Then:

    d_centroid(G, H) = ‖X̄_G − X̄_H‖_2

[Scatter plot.] Example (dissimilarities d_ij are distances, groups are marked by colors): the centroid linkage score d_centroid(G, H) is the distance between the group centroids (i.e., the group averages)

¹ Eisen et al. (1998), "Cluster Analysis and Display of Genome-Wide Expression Patterns"
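Continuing the hypothetical two-group sketch from above, the centroid linkage score is just the distance between the two group means:

xbar.G = colMeans(xG)                        # group average of G
xbar.H = colMeans(xH)                        # group average of H
d.centroid = sqrt(sum((xbar.G - xbar.H)^2))  # ‖X̄_G − X̄_H‖_2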
4
Centroid linkage is the standard in biology
Centroid linkage is simple: easy to understand, and easy to
implement. Maybe for these reasons, it has become the standard
for hierarchical clustering in biology

5
Centroid linkage example
Here n = 60, X_i ∈ R^2, d_ij = ‖X_i − X_j‖_2. Cutting the tree at some heights wouldn't make sense ... because the dendrogram has inversions! But we can, e.g., still look at the output with 3 clusters

[Figure: scatter plot of the n = 60 points colored by the 3-cluster assignment, alongside the centroid linkage dendrogram (height axis), which has inversions.]

Cut interpretation: there isn’t one, even with no inversions

6
Shortcomings of centroid linkage
- Can produce dendrograms with inversions, which really messes up the visualization
- Even if we were lucky enough to have no inversions, there is still no interpretation for the clusters resulting from cutting the tree
- Answers change with a monotone transformation of the dissimilarity measure d_ij = ‖X_i − X_j‖_2. E.g., changing to d_ij = ‖X_i − X_j‖_2^2 would give a different clustering
[Two scatter plots of the same data clustered with centroid linkage, with panel titles "distance" and "distance^2": the left panel uses d_ij = ‖X_i − X_j‖_2, the right uses d_ij = ‖X_i − X_j‖_2^2, and the resulting clusterings differ.]

7
Minimax linkage
Minimax linkage² is a newcomer. First define the radius of a group of points G around X_i as r(X_i, G) = max_{j∈G} d_ij. Then:

    d_minimax(G, H) = min_{i∈G∪H} r(X_i, G ∪ H)

[Scatter plot.] Example (dissimilarities d_ij are distances, groups marked by colors): the minimax linkage score d_minimax(G, H) is the smallest radius encompassing all points in G and H. The center X_c is the black point
² Bien et al. (2011), "Hierarchical Clustering with Prototypes via Minimax Linkage"
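Continuing the same hypothetical sketch, the minimax linkage score and the corresponding center can be computed directly:

xGH = rbind(xG, xH)            # all points in G ∪ H
dGH = as.matrix(dist(xGH))     # pairwise distances within G ∪ H
radii = apply(dGH, 1, max)     # r(X_i, G ∪ H) for each candidate center X_i
d.minimax = min(radii)         # smallest covering radius
Xc = xGH[which.min(radii), ]   # the corresponding center, itself a data point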
8
Minimax linkage example
Same data as before. Cutting the tree at h = 2.5 gives the clustering assignments marked by the colors

[Figure: scatter plot of the data colored by cluster assignment, alongside the minimax linkage dendrogram (height axis) cut at h = 2.5.]

Cut interpretation: each point X_i belongs to a cluster whose center X_c satisfies d_ic ≤ 2.5

9
Properties of minimax linkage

- Cutting a minimax tree at a height h gives a nice interpretation: each point is ≤ h in dissimilarity to the center of its cluster. (This is related to a famous set cover problem)
- Produces dendrograms with no inversions
- Unchanged by a monotone transformation of the dissimilarities d_ij
- Produces clusters whose centers are chosen among the data points themselves. Remember that, depending on the application, this can be a very important property. (Hence minimax clustering is the analogue of K-medoids in the world of hierarchical clustering)

10
Example: Olivetti faces dataset

(From Bien et al. (2011))


11
(From Bien et al. (2011))
12
Centroid and minimax linkage in R

The function hclust in the base distribution performs hierarchical agglomerative clustering with centroid linkage (as well as many other linkages)

E.g.,

d = dist(x)                                # pairwise Euclidean distances
tree.cent = hclust(d, method="centroid")   # hierarchical clustering with centroid linkage
plot(tree.cent)                            # draw the dendrogram

The function protoclust in the package protoclust implements hierarchical agglomerative clustering with minimax linkage
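E.g., a usage sketch, assuming the protoclust package is installed (protocut is its function for cutting the tree; check the package documentation for the exact arguments and return values):

library(protoclust)
tree.mm = protoclust(d)        # minimax linkage on the same dissimilarities d
plot(tree.mm)
cut3 = protocut(tree.mm, k=3)  # cluster assignments plus the prototype (center) of each cluster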

13
Linkages summary
Linkage     No inversions?   Cut interpretation?   Unchanged by monotone transformation?   Notes
Single      yes              yes                   yes                                     chaining
Complete    yes              yes                   yes                                     crowding
Average     yes              no                    no
Centroid    no               no                    no                                      simple
Minimax     yes              yes                   yes                                     centers are data points

Note: this doesn't tell us what the "best" linkage is

What's missing here: a detailed empirical comparison of how they perform. On top of this, remember that choosing a linkage can be very situation dependent

14
Designing a clever radio system (e.g., Pandora)
Suppose we have a bunch of songs, and dissimilarity scores between each pair. We're building a clever radio system: a user is going to give us an initial song, and a measure of how "risky" he is willing to be, i.e., the maximal tolerable dissimilarity between suggested songs

How could we use hierarchical clustering, and with what linkage?


15
Placing cell phone towers

Suppose we are helping to place cell phone towers on top of some buildings throughout the city. The cell phone company is looking to build a small number of towers, such that no building is further than half a mile from a tower

How could we use hierarchical clustering, and with what linkage?

16
How many clusters?

Sometimes, using K-means, K-medoids, or hierarchical clustering, we might have no problem specifying the number of clusters K ahead of time, e.g.,
- Segmenting a client database into K clusters for K salesmen
- Compressing an image using vector quantization, where K controls the compression rate

Other times, K is implicitly defined by cutting a hierarchical clustering tree at a given height, e.g., designing a clever radio system or placing cell phone towers

But in most exploratory applications, the number of clusters K is unknown. So we are left asking the question: what is the "right" value of K?

17
This is a hard problem
Determining the number of clusters is a hard problem!

Why is it hard?
- Determining the number of clusters is a hard task for humans to perform (unless the data are low-dimensional). Not only that, it's just as hard to explain what it is we're looking for. Usually, statistical learning is successful when at least one of these is possible

Why is it important?
- E.g., it might make a big difference scientifically if we were convinced that there were K = 2 subtypes of breast cancer vs. K = 3 subtypes
- One of the (larger) goals of data mining/statistical learning is automatic inference; choosing K is certainly part of this

18
Reminder: within-cluster variation
We’re going to focus on K-means, but most ideas will carry over
to other settings

Recall: given the number of clusters K, the K-means algorithm approximately minimizes the within-cluster variation

    W = Σ_{k=1}^K Σ_{C(i)=k} ‖X_i − X̄_k‖_2^2

over clustering assignments C, where X̄_k is the average of the points in group k: X̄_k = (1/n_k) Σ_{C(i)=k} X_i

Clearly a lower value of W is better. So why not just run K-means for a bunch of different values of K, and choose the value of K that gives the smallest W(K)?
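A minimal R sketch of trying exactly that (with made-up data, not the slides' data): track W(K) over a range of K using kmeans, which returns W as tot.withinss.

set.seed(1)
x = matrix(rnorm(250 * 2), ncol = 2)   # hypothetical data
W = sapply(1:10, function(k)
  kmeans(x, centers = k, nstart = 20)$tot.withinss)
plot(1:10, W, type = "b", xlab = "K", ylab = "Within-cluster variation W(K)")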

19
That’s not going to work
Problem: within-cluster variation just keeps decreasing

Example: n = 250, p = 2, K = 1, ..., 10

[Figure: scatter plot of the data, alongside the within-cluster variation W(K) plotted against K; W(K) decreases steadily as K grows.]

20
Between-cluster variation
Within-cluster variation measures how tightly grouped the clusters
are. As we increase the number of clusters K, this just keeps going
down. What are we missing?

Between-cluster variation measures how spread apart the groups are from each other:

    B = Σ_{k=1}^K n_k ‖X̄_k − X̄‖_2^2

where as before X̄_k is the average of the points in group k, and X̄ is the overall average, i.e.

    X̄_k = (1/n_k) Σ_{C(i)=k} X_i   and   X̄ = (1/n) Σ_{i=1}^n X_i
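A minimal sketch (continuing the made-up data x from the earlier kmeans sketch): kmeans reports B as betweenss, and it can also be computed by hand from the definition.

km = kmeans(x, centers = 2, nstart = 20)
xbar = colMeans(x)                     # overall average X̄
diffs = sweep(km$centers, 2, xbar)     # rows are the X̄_k − X̄
B = sum(km$size * rowSums(diffs^2))    # Σ_k n_k ‖X̄_k − X̄‖_2^2
all.equal(B, km$betweenss)             # matches kmeans' own bookkeeping (up to rounding)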

21
Example: between-cluster variation
Example: n = 100, p = 2, K = 2


[Figure: scatter plot of the n = 100 points in two clusters, with the cluster averages X̄_1, X̄_2 and the overall average X̄ marked.]

    B = n_1 ‖X̄_1 − X̄‖_2^2 + n_2 ‖X̄_2 − X̄‖_2^2

    W = Σ_{C(i)=1} ‖X_i − X̄_1‖_2^2 + Σ_{C(i)=2} ‖X_i − X̄_2‖_2^2

22
Still not going to work
Bigger B is better, so can we use it to choose K? Problem: between-cluster variation just keeps increasing

Running example: n = 250, p = 2, K = 1, ..., 10

[Figure: scatter plot of the data, alongside the between-cluster variation B(K) plotted against K; B(K) increases steadily as K grows.]

23
CH index
Ideally we’d like our clustering assignments C to simultaneously
have a small W and a large B

This is the idea behind the CH index.³ For clustering assignments coming from K clusters, we record the CH score:

    CH(K) = [B(K)/(K − 1)] / [W(K)/(n − K)]

To choose K, just pick some maximum number of clusters to be considered, K_max (e.g., K_max = 20), and choose the value of K with the largest score CH(K), i.e.,

    K̂ = argmax_{K ∈ {2,...,K_max}} CH(K)

³ Calinski and Harabasz (1974), "A dendrite method for cluster analysis"
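A rough R sketch of this recipe (using the made-up data x, and the betweenss and tot.withinss fields returned by kmeans, as in the earlier sketches):

n = nrow(x)
Kmax = 10
CH = sapply(2:Kmax, function(k) {
  km = kmeans(x, centers = k, nstart = 20)
  (km$betweenss / (k - 1)) / (km$tot.withinss / (n - k))
})
Khat = (2:Kmax)[which.max(CH)]   # K with the largest CH score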
24
Example: CH index
Running example: n = 250, p = 2, K = 2, ..., 10.

[Figure: scatter plot of the data, alongside the CH index plotted against K; the maximum is at K = 4.]

We would choose K = 4 clusters, which seems reasonable


General problem: the CH index is not defined for K = 1. We could
never choose just one cluster (the null model)!
25
Gap statistic
It’s true that W (K) keeps dropping, but how much it drops at any
one K should be informative

The gap statistic⁴ is based on this idea. We compare the observed within-cluster variation W(K) to W_unif(K), the within-cluster variation we'd see if we instead had points distributed uniformly (over an encapsulating box). The gap for K clusters is defined as

    Gap(K) = log W_unif(K) − log W(K)

The quantity log W_unif(K) is computed by simulation: we average the log within-cluster variation over, say, 20 simulated uniform data sets. We also compute the standard error s(K) of log W_unif(K) over the simulations. Then we choose K by

    K̂ = min { K ∈ {1, ..., K_max} : Gap(K) ≥ Gap(K+1) − s(K+1) }
⁴ Tibshirani et al. (2001), "Estimating the number of clusters in a data set via the gap statistic"
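A rough R sketch of this procedure (made-up data x from the earlier sketches; published implementations differ in details such as the exact standard-error factor):

gap.stat = function(x, Kmax = 10, nref = 20) {
  logW = function(data, k)
    log(kmeans(data, centers = k, nstart = 20)$tot.withinss)
  obs = sapply(1:Kmax, function(k) logW(x, k))
  rng = apply(x, 2, range)                     # bounding box of the data
  ref = replicate(nref, {
    xu = apply(rng, 2, function(r) runif(nrow(x), r[1], r[2]))  # uniform reference data
    sapply(1:Kmax, function(k) logW(xu, k))
  })                                           # Kmax x nref matrix of log W_unif(K)
  list(gap = rowMeans(ref) - obs,              # Gap(K)
       se  = apply(ref, 1, sd) / sqrt(nref))   # a simple standard error s(K)
}
gs = gap.stat(x)
# smallest K with Gap(K) >= Gap(K+1) - s(K+1)
Khat = which(head(gs$gap, -1) >= tail(gs$gap, -1) - tail(gs$se, -1))[1]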
26
Example: gap statistic
Running example: n = 250, p = 2, K = 1, ..., 10

[Figure: scatter plot of the data, alongside Gap(K) plotted against K; the rule selects K = 3.]

We would choose K = 3 clusters, which is also reasonable


The gap statistic does especially well when the data fall into one
cluster. (Why? Hint: think about the null distribution that it uses)
27
CH index and gap statistic in R
The CH index can be computed using the kmeans function in the base distribution, which returns both the within-cluster variation and the between-cluster variation (Homework 2)

E.g.,

k = 5
km = kmeans(x, k, alg="Lloyd")   # "alg" partially matches kmeans' algorithm argument
names(km)                        # the return items include tot.withinss and betweenss
# Now use some of these return items to compute ch
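One way to finish this off (a sketch, essentially the same computation shown on the CH index slide, using the betweenss and tot.withinss return items):

n = nrow(x)
ch = (km$betweenss / (k - 1)) / (km$tot.withinss / (n - k))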

The gap statistic is implemented by the function gap in the package lga, and by the function gap in the package SAGx. (Beware: these functions are poorly documented ... it's unclear what clustering method they're using)

28
Once again, it really is a hard problem

(Taken from George Cassella’s CMU talk on January 16 2011)


29
(From George Cassella’s CMU talk on January 16 2011)

30
Recap: more linkages, and determining K
Centroid linkage is commonly used in biology. It measures the
distance between group averages, and is simple to understand and
to implement. But it also has some drawbacks (inversions!)

Minimax linkage is a little more complex. It asks the question: "which point's furthest point is closest?", and defines the answer as the cluster center. This could be useful for some applications

Determining the number of clusters is both a hard and important problem. We can't simply try to find the K that gives the smallest achieved within-cluster variation. We defined between-cluster variation, and saw we also can't choose K to just maximize this

Two methods for choosing K: the CH index, which looks at a ratio of between to within, and the gap statistic, which is based on the difference between the within-cluster variation for our data and what we'd see from uniform data
31
Next time: principal components analysis

Finding interesting directions in our data set

(From ESL page 67)

32
