RUNGTA COLLEGE OF ENGINEERING & TECHNOLOGY
DEPARTMENT OF Computer Science and Engineering
LAB MANUAL
Machine Learning LAB
SEMESTER: 7th
RUNGTA COLLEGE
Rungta Educational Campus,
Kohka-Kurud Road, Bhilai,
Chhattisgarh, India
Phone No. 0788-6666666
MANAGED BY : SANTOSH RUNGTA GROUP OF INSTITUTIONS
Prepared By
<Faculty Name 1> <Faculty Name 2>
PREPARED AS PER THE SYLLABUS PRESCRIBED BY
CHHATTISGARH SWAMI VIVEKANAND TECHNICAL UNIVERSITY, BHILAI
List of DOs & DON’Ts.
DOs:
▪ Remove your shoes outside the laboratory.
▪ Come to the lab prepared for the experiment to be performed.
▪ Take help from the Manual / Work Book while preparing for the experiment.
▪ For any abnormal working of the machine, consult the Faculty In-charge / Lab Assistant.
▪ Shut down the machine and switch off the power supply after performing
the experiment.
▪ Maintain silence and proper discipline in the lab.
▪ Enter your machine number in the Login register.
DON’Ts :
▪ Do not bring any magnetic material in the lab.
▪ Do not eat or drink anything in the lab.
▪ Do not tamper with the instruments in the lab or disturb their settings.
LIST OF EXPERIMENTS
AS PER THE SYLLABUS PRESCRIBED BY THE UNIVERSITY
Chhattisgarh Swami Vivekanand Technical University, Bhilai
Branch: Computer Science and Engineering          Semester: VII
Subject: Machine Learning Lab                     Subject Code: D022721(022)
Total Lab Periods: 36                             Batch Size: 30
Maximum Marks: 40                                 Minimum Marks: 20
COURSE OBJECTIVES
1. To be able to use NumPy along with Matplotlib for visual representation of data.
2. To be able to create Supervised Learning models in Python.
3. To be able to create Unsupervised Learning models in Python.
4. To be able to implement Artificial Neural Networks in Python.
COURSE OUTCOMES
After successful completion of this course, the students will be able to:
1. Apply NumPy along with Matplotlib for visual analysis of data.
2. Apply Supervised Learning models for problem solving.
3. Apply Unsupervised Learning models for problem solving.
4. Apply Artificial Neural Networks for problem solving.
List of Experiments
1. Write programs to understand the use of Matplotlib for Simple Interactive
Chart, Set the Properties of the Plot, matplotlib and NumPy.
2. Write programs to understand the use of Matplotlib for Working with Multiple
Figures and Axes, Adding Text, Adding a Grid and Adding a Legend.
3. Write programs to understand the use of Matplotlib for Working with Line
Chart, Histogram, Bar Chart, Pie Charts.
4. Write a program in Python to implement Linear Regression for house price
prediction.
(DataSource: https://forge.scilab.org/index.php/p/rdataset/source/file/master/csv/MASS/Boston).
5. Write a program in Python to implement K Nearest Neighbor classifier for diabetes classification.
(DataSource: https://www.kaggle.com/uciml/pima-indians-diabetes-database/data).
6. Build a Naive Bayes model in Python to tackle a spam classification problem.
(DataSource: https://www.kaggle.com/uciml/sms-spam-collection-dataset/downloads/spam.csv/1).
7. Write a Python code to tackle a multi-class classification problem where the
challenge is to classify wine into three types using Decision Tree.
(DataSource: https://gist.github.com/tijptjik/9408623/archive/b237fa5848349a14a14e5d4107dc7897c21951f5.zip).
8. Write a program in Python to implement Support Vector Machine for diabetes
classification.
(DataSource: https://www.kaggle.com/uciml/pima-indians-diabetes-database/data).
9. Demonstrate the application of Artificial Neural Network using Python.
TEXT BOOKS
1. Hands On Machine Learning With Python – John Anderson, AI Sciences LLC.
2. Python for Data Analysis, Wes McKinney, O’Reilly.
REFERENCE BOOKS
LIST OF EXPERIMENTS
AS PER RUNGTA COLLEGE OF ENGINEERING & TECHNOLOGY
Exp. No.  Name of Experiment
1         Write programs to understand the use of Matplotlib for Simple Interactive
          Chart, Set the Properties of the Plot, matplotlib and NumPy.
2         Write programs to understand the use of Matplotlib for Working with Multiple
          Figures and Axes, Adding Text, Adding a Grid and Adding a Legend.
3         Write programs to understand the use of Matplotlib for Working with Line
          Chart, Histogram, Bar Chart, Pie Charts.
4         Write a program in Python to implement Linear Regression for house price
          prediction.
5         Write a program in Python to implement K Nearest Neighbor classifier for
          diabetes classification.
6         Build a Naive Bayes model in Python to tackle a spam classification problem.
7         Write a Python code to tackle a multi-class classification problem where the
          challenge is to classify wine into three types using Decision Tree.
8         Write a program in Python to implement Support Vector Machine for diabetes
          classification.
9         Demonstrate the application of Artificial Neural Network using Python.
Experiment No. 1
Aim : Write programs to understand the use of Matplotlib for Simple
Interactive Chart, Set the Properties of the Plot, matplotlib and NumPy.
Source Code :
1.1
#Simple Interactive Chart using Matplotlib:
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
# Create a basic line plot
plt.plot(x, y)
# Set labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Simple Interactive Chart')
# Display the interactive plot
plt.show()
1.2
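#Setting the Properties of the Plot: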
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y1 = [10, 8, 6, 4, 2]
y2 = [5, 4, 3, 2, 1]
# Create two lines on the same plot
plt.plot(x, y1, label='Line 1', color='blue', linestyle='dashed',
marker='o', markersize=8)
plt.plot(x, y2, label='Line 2', color='green', linestyle='dashdot',
marker='s', markersize=8)
# Set labels, title, and legend
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Plot with Custom Properties')
plt.legend()
# Display the plot
plt.show()
1.3
#Using Matplotlib and NumPy together:
import matplotlib.pyplot as plt
import numpy as np
# Generate data using NumPy
x = np.linspace(0, 2*np.pi, 100) # 100 points between 0 and 2π
y = np.sin(x)
# Create a sine wave plot
plt.plot(x, y, label='Sine Wave', color='red', linewidth=2)
# Set labels, title, and legend
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Sine Wave using Matplotlib and NumPy')
plt.legend()
# Display the plot
plt.show()
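Note: plt.show() opens the figure in a blocking window. For a truly interactive chart (for example, in an IPython session), Matplotlib's interactive mode can be switched on with plt.ion(), after which plotting commands update the open figure immediately. A minimal sketch of this usage is given below; the sample data values are illustrative.
import matplotlib.pyplot as plt
plt.ion()                          # turn on interactive mode
fig, ax = plt.subplots()
line, = ax.plot([1, 2, 3, 4, 5], [10, 8, 6, 4, 2], marker='o')
ax.set_title('Interactive Mode Example')
line.set_ydata([2, 4, 6, 8, 10])   # the change appears without calling plt.show()
plt.draw()
plt.ioff()                         # turn interactive mode off again
plt.show()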
Experiment No. 2
Aim : Write programs to understand the use of Matplotlib for Working with
Multiple Figures and Axes, Adding Text, Adding a Grid and Adding a Legend
Source Code :
2.1
#Working with Multiple Figures and Axes:
import matplotlib.pyplot as plt
import numpy as np
# Sample data
x = np.linspace(0, 2*np.pi, 100)
y1 = np.sin(x)
y2 = np.cos(x)
# Create two separate figures and axes
fig1, ax1 = plt.subplots()
fig2, ax2 = plt.subplots()
# Plot data on the first figure
ax1.plot(x, y1, label='Sine Wave', color='blue')
ax1.set_xlabel('X-axis')
ax1.set_ylabel('Y-axis')
ax1.set_title('Sine Wave')
# Plot data on the second figure
ax2.plot(x, y2, label='Cosine Wave', color='red')
ax2.set_xlabel('X-axis')
ax2.set_ylabel('Y-axis')
ax2.set_title('Cosine Wave')
# Display both figures
plt.show()
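The program above uses two separate figures. Several axes can also be placed inside a single figure with plt.subplots(nrows, ncols); a short sketch using the same sine/cosine data is given below.
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 2*np.pi, 100)
# One figure containing two axes stacked vertically
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(6, 6))
ax1.plot(x, np.sin(x), color='blue')
ax1.set_title('Sine Wave')
ax2.plot(x, np.cos(x), color='red')
ax2.set_title('Cosine Wave')
fig.tight_layout()  # avoid overlapping titles and labels
plt.show()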
2.2
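#Adding Text to the Plot: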
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
# Create a basic line plot
plt.plot(x, y)
# Add text to the plot
plt.text(3, 6, 'Important Point', fontsize=12, color='red', ha='center',
va='center')
# Set labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Plot with Text')
# Display the plot
plt.show()
2.3
#Adding a Grid:
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
# Create a basic line plot
plt.plot(x, y)
# Add a grid to the plot
plt.grid(True, linestyle='--', linewidth=0.5, color='gray', alpha=0.7)
# Set labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Plot with Grid')
# Display the plot
plt.show()
2.4
#Adding a Legend:
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y1 = [10, 8, 6, 4, 2]
y2 = [5, 4, 3, 2, 1]
# Create two lines on the same plot
plt.plot(x, y1, label='Line 1', color='blue')
plt.plot(x, y2, label='Line 2', color='green')
# Add a legend to the plot
plt.legend()
# Set labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Plot with Legend')
# Display the plot
plt.show()
Experiment No. 3
Aim : Write programs to understand the use of Matplotlib for Working with
Line Chart, Histogram, Bar Chart, Pie Charts.
Source Code :
3.1
#Working with Line Chart:
import matplotlib.pyplot as plt
# Sample data
x = [1, 2, 3, 4, 5]
y = [10, 8, 6, 4, 2]
# Create a line chart
plt.plot(x, y, marker='o', linestyle='-', color='blue')
# Set labels and title
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Line Chart Example')
# Display the plot
plt.show()
3.2
#Working with Histogram:
import matplotlib.pyplot as plt
import numpy as np
# Generate random data for the histogram
data = np.random.randn(1000)  # 1000 samples from a standard normal distribution
# Create a histogram
plt.hist(data, bins=20, edgecolor='black', alpha=0.7)
# Set labels and title
plt.xlabel('Values')
plt.ylabel('Frequency')
plt.title('Histogram Example')
# Display the plot
plt.show()
3.3
#Working with Bar Chart:
import matplotlib.pyplot as plt
# Sample data
categories = ['Category A', 'Category B', 'Category C', 'Category D']
values = [20, 30, 15, 25]
# Create a bar chart
plt.bar(categories, values, color='green')
# Set labels and title
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Chart Example')
# Display the plot
plt.show()
3.4
#Working with Pie Chart:
import matplotlib.pyplot as plt
# Sample data
categories = ['Category A', 'Category B', 'Category C', 'Category D']
sizes = [30, 15, 20, 35]
# Create a pie chart
plt.pie(sizes, labels=categories, autopct='%1.1f%%', shadow=True,
startangle=90, colors=['red', 'green', 'blue', 'yellow'])
# Set title
plt.title('Pie Chart Example')
# Display the plot
plt.show()
Experiment No. 4
Aim : Write a program in Python to implement Linear Regression for house
price prediction.
Source Code :
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
# Load the data from the provided URL
url = "https://forge.scilab.org/index.php/p/rdataset/source/file/master/csv/MASS/Boston.csv"
boston_data = pd.read_csv(url)
# (If the downloaded CSV contains an unnamed row-index column, drop it before modelling.)
# Select the features and target variable
X = boston_data.drop('medv', axis=1)  # Features (all columns except 'medv')
y = boston_data['medv']  # Target variable
# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a Linear Regression model
model = LinearRegression()
# Fit the model on the training data
model.fit(X_train, y_train)
# Make predictions on the test data
y_pred = model.predict(X_test)
# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print("R-squared:", r2)
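After fitting, the learned parameters can be inspected to see how each feature influences the predicted price. A minimal sketch, assuming the model and X from the program above:
# Pair each feature name with its learned coefficient
for feature, coef in zip(X.columns, model.coef_):
    print(f"{feature}: {coef:.4f}")
print("Intercept:", model.intercept_)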
Experiment No. 5
Aim : Write a program in Python to implement K Nearest Neighbor
classifier for diabetes classification.
Source Code :
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
# The Pima Indians Diabetes dataset must first be downloaded from Kaggle
# (https://www.kaggle.com/uciml/pima-indians-diabetes-database/data) and saved
# locally, e.g. as 'diabetes.csv'; Kaggle page URLs cannot be read directly with pandas.
diabetes_data = pd.read_csv("diabetes.csv")
# Select the features and target variable
X = diabetes_data.drop('Outcome', axis=1)  # Features (all columns except 'Outcome')
y = diabetes_data['Outcome']  # Target variable
# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a KNN classifier with k=3 (you can change the value of k)
knn_classifier = KNeighborsClassifier(n_neighbors=3)
# Fit the classifier on the training data
knn_classifier.fit(X_train, y_train)
# Make predictions on the test data
y_pred = knn_classifier.predict(X_test)
# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:")
print(conf_matrix)
print("Classification Report:")
print(class_report)
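The value of k above is fixed at 3. A common refinement is to select k by cross-validation; the sketch below uses scikit-learn's GridSearchCV, assuming X_train and y_train from the program above. The range of k values searched is only an example.
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
# Search odd values of k from 1 to 19 using 5-fold cross-validation
param_grid = {'n_neighbors': list(range(1, 20, 2))}
grid = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)
print("Best k:", grid.best_params_['n_neighbors'])
print("Best cross-validated accuracy:", grid.best_score_)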
Experiment No. 6
Aim : Build a Naive Bayes model in Python to tackle a spam classification
problem.
Source Code :
import nltk
nltk.download('punkt')
nltk.download('stopwords')
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
import string
# The SMS Spam Collection dataset must first be downloaded from Kaggle
# (https://www.kaggle.com/uciml/sms-spam-collection-dataset/downloads/spam.csv/1)
# and saved locally as 'spam.csv'; Kaggle page URLs cannot be read directly with pandas.
spam_data = pd.read_csv("spam.csv", encoding='ISO-8859-1')
# Select the features (SMS text) and target variable (spam or ham)
X = spam_data['v2']
y = spam_data['v1']
# Preprocess the text data
def preprocess_text(text):
    # Tokenize the text into individual words
    tokens = word_tokenize(text)
    # Remove punctuation and convert to lowercase
    table = str.maketrans('', '', string.punctuation)
    tokens = [word.translate(table).lower() for word in tokens]
    # Remove stopwords
    stop_words = set(stopwords.words('english'))
    tokens = [word for word in tokens if word not in stop_words]
    # Reconstruct the text from the processed tokens
    return ' '.join(tokens)
X = X.apply(preprocess_text)
# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a Bag-of-Words representation of the text data
vectorizer = CountVectorizer()
X_train_counts = vectorizer.fit_transform(X_train)
X_test_counts = vectorizer.transform(X_test)
# Create a Naive Bayes classifier
naive_bayes_classifier = MultinomialNB()
# Fit the classifier on the training data
naive_bayes_classifier.fit(X_train_counts, y_train)
# Make predictions on the test data
y_pred = naive_bayes_classifier.predict(X_test_counts)
# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:")
print(conf_matrix)
print("Classification Report:")
print(class_report)
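Once trained, the model can classify a new message by passing it through the same preprocessing and vectorizer. A small sketch, assuming preprocess_text, vectorizer and naive_bayes_classifier from the program above; the example message is hypothetical.
# Classify a new, unseen SMS message
new_message = "Congratulations! You have won a free prize. Call now."  # hypothetical example
new_counts = vectorizer.transform([preprocess_text(new_message)])
print("Predicted label:", naive_bayes_classifier.predict(new_counts)[0])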
Experiment No. 7
Aim : Write a Python code to tackle a multi-class classification problem
where the challenge is to classify wine into three types using Decision Tree.
Source Code :
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import requests
from zipfile import ZipFile
from io import BytesIO
# Download and extract the Wine dataset from the provided link
url = "https://gist.github.com/tijptjik/9408623/archive/b237fa5848349a14a14e5d4107dc7897c21951f5.zip"
response = requests.get(url)
with ZipFile(BytesIO(response.content)) as zip_file:
    zip_file.extractall()
# Load the data from the extracted file
# (The name of the extracted folder can vary; adjust the path to wine.csv if needed.)
wine_data = pd.read_csv("9408623-9408623/b237fa5848349a14a14e5d4107dc7897c21951f5/wine.csv")
# Select the features and target variable
# (The target column is assumed here to be named 'Wine_Type'; if the CSV header uses a
# different name, such as 'Wine', update it accordingly.)
X = wine_data.drop('Wine_Type', axis=1)  # Features (all columns except 'Wine_Type')
y = wine_data['Wine_Type']  # Target variable
# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create a Decision Tree classifier
decision_tree_classifier = DecisionTreeClassifier()
# Fit the classifier on the training data
decision_tree_classifier.fit(X_train, y_train)
# Make predictions on the test data
y_pred = decision_tree_classifier.predict(X_test)
# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:")
print(conf_matrix)
print("Classification Report:")
print(class_report)
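The fitted tree can also be visualised, which helps explain how the classifier separates the three wine types. A minimal sketch using scikit-learn's plot_tree, assuming decision_tree_classifier and X from the program above:
import matplotlib.pyplot as plt
from sklearn.tree import plot_tree
plt.figure(figsize=(16, 8))
plot_tree(decision_tree_classifier,
          feature_names=list(X.columns),
          filled=True,   # colour nodes by majority class
          fontsize=8)
plt.show()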
Experiment No. 8
Aim : Write a program in Python to implement Support Vector Machine for
diabetes classification.
Source Code :
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
# The Pima Indians Diabetes dataset must first be downloaded from Kaggle
# (https://www.kaggle.com/uciml/pima-indians-diabetes-database/data) and saved
# locally, e.g. as 'diabetes.csv'; Kaggle page URLs cannot be read directly with pandas.
diabetes_data = pd.read_csv("diabetes.csv")
# Select the features and target variable
X = diabetes_data.drop('Outcome', axis=1)  # Features (all columns except 'Outcome')
y = diabetes_data['Outcome']  # Target variable
# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create an SVM classifier
svm_classifier = SVC()
# Fit the classifier on the training data
svm_classifier.fit(X_train, y_train)
# Make predictions on the test data
y_pred = svm_classifier.predict(X_test)
# Evaluate the classifier
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:")
print(conf_matrix)
print("Classification Report:")
print(class_report)
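SVMs are sensitive to feature scale, so accuracy often improves when the inputs are standardised before training. A short sketch using a scikit-learn Pipeline with StandardScaler, assuming the train/test split from the program above:
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
# Standardise the features, then fit the SVM on the scaled data
scaled_svm = make_pipeline(StandardScaler(), SVC())
scaled_svm.fit(X_train, y_train)
print("Accuracy with scaling:", accuracy_score(y_test, scaled_svm.predict(X_test)))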
Experiment No. 9
Aim : Demonstrate the application of Artificial Neural Network using
Python.
Source Code :
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.datasets import load_iris
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
# Load the Iris dataset
iris_data = load_iris()
X = iris_data.data
y = iris_data.target
# Convert target labels to one-hot encoding
y_encoded = to_categorical(y)
# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y_encoded, test_size=0.2, random_state=42)
# Build the ANN model
model = Sequential()
model.add(Dense(10, input_dim=4, activation='relu'))
model.add(Dense(3, activation='softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer='adam',
metrics=['accuracy'])
# Train the model
model.fit(X_train, y_train, epochs=100, batch_size=5)
# Evaluate the model on the test data
loss, accuracy = model.evaluate(X_test, y_test)
print("Test Loss:", loss)
print("Test Accuracy:", accuracy)
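The trained network can then be used to classify a new flower measurement. A minimal sketch, assuming the model and iris_data from the program above; the sample values below are illustrative.
# Predict the class of a single new sample (sepal/petal measurements in cm)
new_sample = np.array([[5.1, 3.5, 1.4, 0.2]])  # illustrative values
probabilities = model.predict(new_sample)
predicted_class = np.argmax(probabilities, axis=1)[0]
print("Predicted class:", iris_data.target_names[predicted_class])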