K Means Clustering - Experiment 12

Unsupervised

Uploaded by

Prateek Verma

EXPERIMENT NO. 12

1. Title:
Perform unsupervised classification using the k-means algorithm in a Python environment

2. Aim:
To study the given dataset and apply the k-means algorithm to group its data points into
clusters using Python.

3. Theory:

K-means clustering is an unsupervised machine learning algorithm that groups an unlabelled
dataset into different clusters. Unsupervised learning is the process of letting an algorithm
operate on unlabelled, unclassified data without supervision. Without any prior training on
labelled data, the machine's job in this case is to organize unsorted data according to
similarities, patterns, and differences. The goal of clustering is to divide the population or
set of data points into a number of groups so that the data points within each group are more
similar to one another than to the data points in the other groups. It is essentially a grouping
of things based on how similar and different they are to one another.
The algorithm works as follows:
1. First, we randomly initialize k points, called means or cluster centroids.
2. We assign each item to its closest mean and update that mean's coordinates to the average
of the items assigned to its cluster so far.
3. We repeat the process for a given number of iterations, and at the end we have our clusters.
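
The steps above can be sketched in plain Python. This is a minimal illustration, not the scikit-learn implementation used later in this experiment: the function names `squared_distance` and `k_means`, the `iterations` and `seed` parameters, and the choice of starting from k randomly picked data points are all assumptions made for the example. It also uses the common batch variant, which recomputes every centroid once per pass rather than after each individual item.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, iterations=100, seed=0):
    """Batch k-means sketch. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Step 1: pick k data points at random as the initial centroids.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Step 2a: assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 2b: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Step 3: stop early once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels
```

Calling `k_means(points, k=2)` on a list of `(x, y)` tuples returns the final centroids and one cluster index per point. Which cluster receives index 0 depends on the random initialization, so only the grouping is meaningful, not the label values themselves.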

Here K is the number of clusters to be created, chosen in advance: if K=2 there will be two
clusters, for K=3 there will be three clusters, and so on. It is an iterative algorithm that
divides the unlabelled dataset into k different clusters in such a way that each data point
belongs to only one group, whose members have similar properties. It offers a convenient way to
discover the groups in an unlabelled dataset without the need for any training labels. It is a
centroid-based algorithm, where each cluster is associated with a centroid. The main aim of this
algorithm is to minimize the sum of squared distances between the data points and the centroids
of their clusters. The algorithm takes the unlabelled dataset as input, divides the dataset into
k clusters, and repeats the process until the cluster assignments stop changing or an iteration
limit is reached. The value of k must be chosen before running the algorithm.
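
The quantity being minimized can be made concrete with a short sketch. The usual formulation sums the squared Euclidean distance from each point to the centroid of its assigned cluster; scikit-learn reports this value as the `inertia_` attribute of a fitted `KMeans` object. The helper name `wcss` and the toy data below are assumptions made for illustration only.

```python
def wcss(points, centroids, labels):
    """Within-cluster sum of squared Euclidean distances."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((a - b) ** 2 for a, b in zip(point, centroid))
    return total


# Hypothetical toy data: two tight pairs of points on the x-axis.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
centroids = [(1.0, 0.0), (11.0, 0.0)]

good = wcss(points, centroids, labels=[0, 0, 1, 1])  # each point with its neighbour
bad = wcss(points, centroids, labels=[0, 1, 0, 1])   # neighbours split apart
print(good, bad)  # 4.0 164.0
```

The sensible grouping gives a much smaller value than the mixed-up one, which is why assigning each point to its nearest centroid lowers the objective.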

4. Pre-Questions:
1. What is clustering?
2. How is unsupervised classification different from supervised classification?
3. Which library provides the k-means clustering implementation?

5. Post-Questions:
1. What are the applications of clustering?
2. What are the different methods of clustering?
3. What are the parameters on which k-means clustering is based?

6. Program Code:

## import libraries

from sklearn.cluster import KMeans
import pandas as pd
from matplotlib import pyplot as plt

## read the CSV file into a dataframe

df = pd.read_csv('/kaggle/input/income-dataset-for-k-means/income.csv')
df
## check the first 5 rows (do not reassign df, or only 5 rows remain)
df.head()
## check the basic statistics, shape, and column info of the data
df.describe()
df.shape
df.info()
## to plot scatter plot between age and income
plt.scatter(df.Age, df['Income($)'])
plt.xlabel('Age')
plt.ylabel('Income($)')

## to fit dataset into clusters

km = KMeans(n_clusters=3)
y_predicted = km.fit_predict(df[['Age','Income($)']])
y_predicted

## print the predicted cluster number for each datapoint

df['cluster']=y_predicted
df.head()

##to check the cluster centers

km.cluster_centers_

##to plot the different datapoints as per their assigned clusters

df1 = df[df.cluster==0]
df2 = df[df.cluster==1]
df3 = df[df.cluster==2]
plt.scatter(df1.Age,df1['Income($)'],color='green')
plt.scatter(df2.Age,df2['Income($)'],color='red')
plt.scatter(df3.Age,df3['Income($)'],color='black')
plt.scatter(km.cluster_centers_[:,0],km.cluster_centers_[:,1],color='purple',marker='*',label='centroid')
plt.xlabel('Age')
plt.ylabel('Income ($)')
plt.legend()

7. Outcome: The k-means algorithm was used successfully to cluster an income dataset into
3 clusters using Python.

Fig. 1. Input Dataset

Fig. 2. 3 Clusters using k-means clustering

8. Conclusion: Thus we have successfully completed k-means clustering of the income
dataset.
