SHAHEED BHAGAT SINGH STATE UNIVERSITY, FEROZEPUR, PUNJAB
PRACTICAL FILE OF AI AND ML
Submitted by:- Akash Kumar, 2007755, CSE (Data Science)
Submitted to:- Dr. Sunny Behal (HOD of CSE)
PRACTICAL NO.1
Linear Regression in Python
……………………………..
PRACTICAL NO.2
Predict Employee Attrition Using Machine Learning & Python
………………………………….
PRACTICAL NO.3
Python Implementation of the K-Means Clustering Algorithm
Here’s how to implement the K-Means Clustering Algorithm in Python. The steps are:
1. Data pre-processing
2. Finding the optimal number of clusters using the elbow method
3. Training the K-Means algorithm on the training data set
4. Visualizing the clusters
1. Data Pre-Processing. Import the libraries, load the dataset,
and extract the independent variables.
# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('Mall_Customers_data.csv')
# Columns 3 and 4 are the features plotted later: annual income and spending score
x = dataset.iloc[:, [3, 4]].values
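2. Find the optimal number of clusters using the elbow
method. Fit K-Means for k = 1 to 10, record the within-cluster
sum of squares (WCSS) of each fit, and plot WCSS against k. The
"elbow", where the curve stops dropping sharply, suggests a
suitable k. Here’s the code you use: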
# Finding the optimal number of clusters using the elbow method
from sklearn.cluster import KMeans
wcss_list = []  # WCSS value for each k

# Fit K-Means for k = 1 to 10 and record the WCSS of each fit
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)

mtp.plot(range(1, 11), wcss_list)
mtp.title('The Elbow Method Graph')
mtp.xlabel('Number of clusters (k)')
mtp.ylabel('WCSS')
mtp.show()
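To make the quantity on the y-axis concrete, here is a minimal sketch of how WCSS can be computed by hand. The helper name wcss and the four toy points are made up for illustration and are not part of the mall dataset; scikit-learn reports the same kind of quantity as inertia_.

```python
import numpy as np

def wcss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    diffs = points - centroids[labels]
    return float(np.sum(diffs ** 2))

# Toy data (hypothetical): two tight groups of 2-D points
points = np.array([[1.0, 1.0], [1.0, 3.0], [9.0, 9.0], [9.0, 11.0]])

# k = 1: every point belongs to one cluster centred on the overall mean
labels_k1 = np.zeros(len(points), dtype=int)
centroids_k1 = points.mean(axis=0, keepdims=True)

# k = 2: each group gets its own centroid
labels_k2 = np.array([0, 0, 1, 1])
centroids_k2 = np.array([[1.0, 2.0], [9.0, 10.0]])

print(wcss(points, labels_k1, centroids_k1))  # 132.0
print(wcss(points, labels_k2, centroids_k2))  # 4.0
```

Going from one cluster to two cuts WCSS sharply here because the data really has two groups; adding further clusters would reduce it only slightly, which is the flattening the elbow plot looks for.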
3. Train the K-Means algorithm on the training dataset.
Reuse the KMeans call from the previous section, but pass
n_clusters=5 instead of i, because five clusters need to be
formed, and call fit_predict to get a cluster label for each
customer. Here’s the code:
# Training the K-Means model on the dataset
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)  # cluster label (0 to 4) for each row of x
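As a rough illustration of what training does internally, the sketch below alternates the two classic K-Means steps (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is a simplified version under stated assumptions: the function name lloyd_kmeans, the fixed starting centroids, and the toy points are hypothetical, and scikit-learn's KMeans additionally uses k-means++ initialization and a convergence check.

```python
import numpy as np

def lloyd_kmeans(points, init_centroids, n_iter=10):
    """Simplified K-Means: run a fixed number of assign/update rounds."""
    centroids = init_centroids.astype(float).copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Assignment step: distance from every point to every centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its members
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return labels, centroids

# Toy data and starting centroids (hypothetical)
points = np.array([[1.0, 1.0], [1.0, 3.0], [9.0, 9.0], [9.0, 11.0]])
init = np.array([[0.0, 0.0], [10.0, 10.0]])

labels, centroids = lloyd_kmeans(points, init)
print(labels)     # cluster index per point
print(centroids)  # final centroid coordinates
```

The labels array plays the same role as y_predict above, and centroids corresponds to kmeans.cluster_centers_.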
4. Visualize the Clusters. Since this model has five
clusters, we need to visualize each one.
# Visualizing the clusters
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1], s=100, c='blue', label='Cluster 1')  # first cluster
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1], s=100, c='green', label='Cluster 2')  # second cluster
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1], s=100, c='red', label='Cluster 3')  # third cluster
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1], s=100, c='cyan', label='Cluster 4')  # fourth cluster
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1], s=100, c='magenta', label='Cluster 5')  # fifth cluster
# Cluster centres
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='yellow', label='Centroid')
mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()
…………………………………….
PRACTICAL NO.4
Build a Movie Recommendation System in Python using Machine Learning
How to build a Movie Recommendation System using Machine Learning
Building the movie recommendation engine consists of the following steps:
1. Perform Exploratory Data Analysis (EDA) on the data
2. Build the recommendation system
3. Get recommendations
Step 1: Perform Exploratory Data Analysis (EDA) on the data
The dataset contains two CSV files, credits and movies.
The credits file contains the metadata about each movie,
and the movies file contains information such as the name
and id of the movie, its budget, and the languages in which
it was released.
Let’s load the movie dataset using pandas.
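A minimal sketch of such a loading step follows. Everything specific in it is an assumption made for illustration: the rows, the column names (id, movie_id, keywords, cast, director, genres), the join key, and the way the "soup" text column is built. In-memory text stands in for the two CSV files so the example runs on its own; with real files, the file paths would be passed to pd.read_csv instead.

```python
import io
import pandas as pd

# Tiny stand-ins for the two CSV files (made-up rows, hypothetical columns)
movies_csv = io.StringIO(
    "id,title,genres,keywords\n"
    "1,Movie A,action thriller,hero city\n"
    "2,Movie B,action adventure,hero team\n"
    "3,Movie C,romance drama,love letter\n"
)
credits_csv = io.StringIO(
    "movie_id,cast,director\n"
    "1,actorone actortwo,directorx\n"
    "2,actorone actorthree,directory\n"
    "3,actorfour,directorz\n"
)

movies_df = pd.read_csv(movies_csv)
credits_df = pd.read_csv(credits_csv)

# Assumption: the two files share a movie id that can be used as the join key
movies_df = movies_df.merge(credits_df, left_on="id", right_on="movie_id")

# Assumption: "soup" is one text string per movie, built by joining text columns
movies_df["soup"] = (
    movies_df["keywords"] + " " + movies_df["cast"] + " "
    + movies_df["director"] + " " + movies_df["genres"]
)

print(movies_df.shape)
print(movies_df[["title", "soup"]])
```

A quick look at shape, head() and info() on the loaded frames is a common first EDA step before building any features.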
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# movies_df is assumed to be loaded already and to contain a text column "soup"
count_vectorizer = CountVectorizer(stop_words="english")
count_matrix = count_vectorizer.fit_transform(movies_df["soup"])
print(count_matrix.shape)

# Pairwise cosine similarity between all movies
cosine_sim2 = cosine_similarity(count_matrix, count_matrix)
print(cosine_sim2.shape)

# Map each movie title to its row position
movies_df = movies_df.reset_index()
indices = pd.Series(movies_df.index, index=movies_df['title'])

def get_recommendations(title, cosine_sim=cosine_sim2):
    idx = indices[title]
    # List of (a, b) pairs where a is the row index of a movie and b is its similarity score
    similarity_scores = list(enumerate(cosine_sim[idx]))
    similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)
    # Skip the first entry (normally the movie itself) and keep the next 10
    similarity_scores = similarity_scores[1:11]
    movies_indices = [ind[0] for ind in similarity_scores]
    movies = movies_df["title"].iloc[movies_indices]
    return movies

print("################ Content Based System #############")
print("Recommendations for The Dark Knight Rises")
print(get_recommendations("The Dark Knight Rises", cosine_sim2))
print()
print("Recommendations for The Avengers")
print(get_recommendations("The Avengers", cosine_sim2))
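To show what the similarity score measures, here is a small hand-rolled sketch of cosine similarity on word counts. The function name cosine and the three example strings are hypothetical, and the tokenization is a plain whitespace split; CountVectorizer also lowercases text and removes English stop words, so its numbers can differ on real data.

```python
import math
from collections import Counter

def cosine(text_a, text_b):
    """Cosine similarity between the word-count vectors of two strings."""
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[w] * b[w] for w in a)  # words missing from b count as 0
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty string is treated as similar to nothing
    return dot / (norm_a * norm_b)

# Made-up "soup" strings for three hypothetical movies
soup_a = "hero city action thriller"
soup_b = "hero team action adventure"
soup_c = "love letter romance drama"

print(cosine(soup_a, soup_b))  # 0.5: two of four words are shared
print(cosine(soup_a, soup_c))  # 0.0: no words in common
```

Movies whose descriptions share more words score closer to 1, which is why sorting a row of the similarity matrix in descending order yields the recommendations.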
………………………
PRACTICAL NO.5
Predicting When an Employee Will Leave Your Company
…………………………………..