K Means Clustering - Experiment 12
1. Title:
Perform unsupervised classification (clustering) using the k-means algorithm in a Python environment
2. Aim:
To study the given dataset and apply the k-means algorithm to group it into clusters
using Python.
3. Theory:
K-means clustering is an unsupervised learning algorithm that groups an unlabelled dataset into K
clusters. Here K is the number of clusters to be created: for K=2 there will be two clusters, for
K=3 there will be three clusters, and so on. It is an iterative algorithm that divides the
unlabelled dataset into K clusters in such a way that each data point belongs to only one cluster,
together with other points that have similar properties. It is a convenient way to discover groups
in an unlabelled dataset without the need for any labelled training data. It is a centroid-based
algorithm, where each cluster is associated with a centroid. The main aim of the algorithm is to
minimize the sum of squared distances between the data points and the centroids of their assigned
clusters. The algorithm takes the unlabelled dataset as input, assigns each data point to its
nearest centroid, recomputes each centroid as the mean of the points assigned to it, and repeats
these two steps until the assignments no longer change. The value of K must be chosen before the
algorithm is run.
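The iterative procedure described above can be sketched in a few lines of NumPy. This is only a
minimal illustration, not the program for this experiment: the function name kmeans, the random
choice of initial centroids, the max_iters and seed parameters, and the six-point toy dataset are
all assumptions made for the example.

```python
import numpy as np

def kmeans(points, k, max_iters=100, seed=0):
    """Illustrative k-means on an (n_samples, n_features) array."""
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points picked at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]

    def assign(cents):
        # Distance from every point to every centroid, then nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - cents[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(max_iters):
        labels = assign(centroids)
        # Move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids

    labels = assign(centroids)
    # Quantity being minimised: sum of squared distances to assigned centroids.
    inertia = float(((points - centroids[labels]) ** 2).sum())
    return labels, centroids, inertia

# Toy data (hypothetical): two well-separated groups of three points each.
points = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
                   [8.0, 8.0], [9.0, 8.5], [8.5, 9.0]])
labels, centroids, inertia = kmeans(points, k=2)
print(labels)
print(centroids)
print(inertia)
```

With K=2 the first three points should receive one label and the last three the other. Which
group is numbered 0 depends on the random initial centroids, so only the grouping is meaningful,
not the label values themselves.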
4. Pre-Questions:
1. What is clustering?
2. How is unsupervised classification different from supervised classification?
3. Which Python library is used to import k-means clustering?
5. Post-Questions:
1. What are the applications of clustering?
2. What are the different methods of clustering?
3. What are the parameters on the basis of which k-means clustering is done?
6. Program Code: