2.1 Clustering
▪ Manhattan distance
Question:
Compute the Euclidean distance and the Manhattan distance
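As a hedged illustration of the two distance measures, the sketch below computes both for a pair of points. The points (1, 2) and (4, 6) and the function names are assumptions made for this example; they are not the data from the question.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance: square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def manhattan_distance(p, q):
    """City-block distance: sum of the absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

# Hypothetical points, chosen only for illustration.
p = (1, 2)
q = (4, 6)

print(euclidean_distance(p, q))  # 5.0  (sqrt(3^2 + 4^2))
print(manhattan_distance(p, q))  # 7    (|3| + |4|)
```

The Manhattan distance is never smaller than the Euclidean distance between the same two points, which the values 7 and 5.0 reflect.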
1. Centroid-based clustering methods
2. Gaussian mixture model clustering methods
3. Hierarchical clustering methods
4. Density-based clustering methods
▪ Centroid-based clustering searches for a predetermined number of clusters
within an unlabeled and possibly multidimensional dataset.
▪ Each data record is assigned to one, and only one, cluster.
▪ The rule is that the distance between a data record and each cluster's
centroid is calculated, and the data record is assigned to the cluster with the
minimum distance.
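The minimum-distance rule above can be sketched as follows. This is a minimal illustration that assumes Euclidean distance; the function name `nearest_centroid` and the sample centroids and records are hypothetical.

```python
import math

def nearest_centroid(record, centroids):
    """Return the index of the centroid closest to the record.

    Assumes Euclidean distance; ties go to the lowest index.
    """
    distances = [math.dist(record, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical centroids and data records, chosen only for illustration.
centroids = [(0.0, 0.0), (10.0, 10.0)]
records = [(1.0, 2.0), (9.0, 8.0), (4.0, 4.0)]

# Each record receives exactly one cluster label.
labels = [nearest_centroid(r, centroids) for r in records]
print(labels)  # [0, 1, 0]
```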
▪ K-Means Clustering: a centroid-based clustering approach commonly used in
practice.
▪ The approach consists of three main steps: initialization, assignment, and
update.
▪ In the initialization step, the number of clusters is assumed (i.e., k is
predetermined), and the centroid of each cluster is randomly defined.
▪ A simple procedure for defining the initial placement of a cluster's centroid is to
locate it at one of the given data records.
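One possible way to realize this initialization is to draw k distinct records at random and use them as the starting centroids. The sketch below assumes this approach; the function name `init_centroids`, the `seed` parameter, and the sample records are hypothetical.

```python
import random

def init_centroids(records, k, seed=None):
    """Pick k distinct positions in the record list to serve as initial centroids."""
    rng = random.Random(seed)  # seed is optional and only makes the draw repeatable
    return rng.sample(records, k)

# Hypothetical data records, chosen only for illustration.
records = [(1.0, 1.0), (1.5, 2.0), (3.0, 4.0), (5.0, 7.0), (3.5, 5.0)]

initial = init_centroids(records, k=2, seed=0)
print(initial)
```

Because the sample is drawn without replacement, two centroids coincide only if the dataset itself contains duplicate records.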
▪ In the assignment step, the clusters are formed by connecting each data record
with its nearest centroid.
▪ In the update step, a more accurate centroid of each cluster is calculated as the
mean point of its included data records. Then, the assignment and update steps are
repeated until convergence (i.e., until the assignments no longer change).
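Putting the initialization, assignment, and update steps together, a minimal K-Means sketch might look like the following. It assumes Euclidean distance, stops when the assignments no longer change, and keeps the previous centroid if a cluster becomes empty; the function name `k_means`, the `max_iter` and `seed` parameters, and the sample records are all assumptions made for illustration.

```python
import math
import random

def k_means(records, k, max_iter=100, seed=None):
    """Minimal K-Means sketch: returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialization: place each centroid at a randomly chosen data record.
    centroids = rng.sample(records, k)
    labels = None
    for _ in range(max_iter):
        # Assignment: connect each record with its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(r, centroids[j]))
            for r in records
        ]
        # Convergence (assumed criterion): assignments did not change.
        if new_labels == labels:
            break
        labels = new_labels
        # Update: move each centroid to the mean point of its records.
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

# Hypothetical, well-separated data records, chosen only for illustration.
records = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
           (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

centroids, labels = k_means(records, k=2, seed=0)
print(centroids)
print(labels)
```

On data this well separated, the first three records end up in one cluster and the last three in the other; which cluster is numbered 0 depends on the random initialization.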
Given: