
DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND DATA SCIENCE

SUBJECT: MACHINE LEARNING LABORATORY

YEAR/ SEMESTER: II / IV

LAB MANUAL (AD3461)


CONTENTS

EX.NO   DATE   LIST OF EXPERIMENTS   PAGE NO   MARKS   SIGN
CYCLE-I
1. For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-Elimination algorithm to output a description of the set of all hypotheses consistent with the training examples.
2. Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate data set for building the decision tree and apply this knowledge to classify a new sample.
3. Build an Artificial Neural Network by implementing the Backpropagation algorithm and test the same using appropriate data sets.
4. Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file and compute the accuracy with a few test data sets.
5. Implement naïve Bayesian Classifier model to classify a set of documents and measure the accuracy, precision, and recall.
CYCLE-II
6. Write a program to construct a Bayesian network to diagnose CORONA infection using standard WHO Data Set.
7. Apply EM algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using the k-Means algorithm. Compare the results of these two algorithms.
8. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data set. Print both correct and wrong predictions.
9. Implement the non-parametric Locally Weighted Regression algorithm in order to fit data points. Select an appropriate data set for your experiment and draw graphs.

SIGNATURE OF STAFF IN CHARGE

Ex.No:1 IMPLEMENTATION OF CANDIDATE-ELIMINATION ALGORITHM
Date:


AIM:

To implement and demonstrate the Candidate-Elimination algorithm, for a given set of training data
examples stored in a .CSV file, to output a description of the set of all hypotheses consistent with the
training examples.

ALGORITHM:

Step 1: Load the data set.

Step 2: Initialize the General hypothesis (G) and the Specific hypothesis (S).
Step 3: For each training example:
Step 4: If the example is positive:
            if attribute_value == hypothesis_value:
                Do nothing
            else:
                replace the attribute value with '?' (generalizing it)
Step 5: If the example is negative:
            Make the general hypothesis more specific.
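
The program below expects the training examples in a .CSV file whose last column holds a Yes/No target. As a point of reference, a minimal sketch of such a file, assuming the classic six-attribute EnjoySport data (the column names and rows here are illustrative, not part of the original manual):

Sky,AirTemp,Humidity,Wind,Water,Forecast,EnjoySport
Sunny,Warm,Normal,Strong,Warm,Same,Yes
Sunny,Warm,High,Strong,Warm,Same,Yes
Rainy,Cold,High,Strong,Warm,Change,No
Sunny,Warm,High,Strong,Cool,Change,Yes
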
PROGRAM:

import numpy as np
import pandas as pd

# Loading data from a CSV file
data = pd.DataFrame(data=pd.read_csv(r'E:\BALA\AI\Lab programs\pgms\dataset.csv'))
print(data)

# Separating concept features from the target
concepts = np.array(data.iloc[:, 0:-1])
print(concepts)

# Isolating the target: copying the last column into the target array
target = np.array(data.iloc[:, -1])
print(target)

def learn(concepts, target):
    '''learn() implements the learning method of the Candidate-Elimination algorithm.

    Arguments:
        concepts - an array with all the feature values
        target - an array with the corresponding output values'''
    # Initialise specific_h with the first instance from concepts.
    # .copy() makes sure a new array is created instead of just pointing
    # to the same memory location.
    specific_h = concepts[0].copy()
    print("\nInitialization of specific_h and general_h")
    print(specific_h)
    general_h = [["?" for i in range(len(specific_h))] for i in range(len(specific_h))]
    print(general_h)
    # The learning iterations
    for i, h in enumerate(concepts):
        # Positive example: generalise specific_h where it disagrees
        if target[i] == "Yes":
            for x in range(len(specific_h)):
                # Change values in S and G only if the values differ
                if h[x] != specific_h[x]:
                    specific_h[x] = '?'
                    general_h[x][x] = '?'
        # Negative example: change values only in G, making it more specific
        if target[i] == "No":
            for x in range(len(specific_h)):
                if h[x] != specific_h[x]:
                    general_h[x][x] = specific_h[x]
                else:
                    general_h[x][x] = '?'
        print("\nSteps of Candidate Elimination Algorithm", i + 1)
        print(specific_h)
        print(general_h)
    # Find indices of rows that remain fully general, meaning they were unchanged
    indices = [i for i, val in enumerate(general_h) if val == ['?', '?', '?', '?', '?', '?']]
    for i in indices:
        # Remove those rows from general_h
        general_h.remove(['?', '?', '?', '?', '?', '?'])
    # Return final values
    return specific_h, general_h

s_final, g_final = learn(concepts, target)
print("\nFinal Specific_h:", s_final, sep="\n")
print("\nFinal General_h:", g_final, sep="\n")
OUTPUT:
RESULT:

Thus the Candidate-Elimination algorithm, which outputs all hypotheses consistent with the training examples, was implemented in Python, executed, and verified successfully.
Viva Questions:
1. What is Machine Learning?

2. Why do we need Machine Learning algorithms?

3. What are the applications of Machine Learning?

4. List the types of Machine Learning algorithms.

5. What is deep learning?


Ex.No:2
IMPLEMENTATION OF DECISION TREE USING ID3 ALGORITHM
Date:

AIM:

To build a decision tree using the ID3 algorithm and classify a new sample using Python.

ALGORITHM:

Step 1: Observe the dataset and import the necessary basic Python libraries.
Step 2: Read the dataset.
Step 3: Calculate the entropy of the whole dataset.
Step 4: Calculate the entropy of the filtered dataset.
Step 5: Calculate the information gain for a feature (e.g., Outlook).
Step 6: Find the most informative feature (the feature with the highest information gain).
Step 7: Add a node to the tree.
Step 8: Perform the ID3 algorithm and generate a tree.
Step 9: Find the unique classes of the label.
Step 10: Predict from the tree.
Step 11: Evaluate the test dataset.
Step 12: Check the test dataset.
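
Steps 3 to 5 revolve around entropy and information gain. A small worked sketch of both, using the class counts of the standard 14-example PlayTennis data (9 Yes / 5 No) and its Outlook feature; these counts come from the textbook example, not from the program below:

import math

# Entropy of a Boolean-labelled set with p positive and n negative examples
def entropy(p, n):
    result = 0.0
    for count in (p, n):
        if count > 0:
            q = count / (p + n)
            result -= q * math.log2(q)
    return result

# Whole-set entropy for 9 Yes / 5 No
print(entropy(9, 5))  # ~0.940

# Information gain of Outlook = whole-set entropy minus the weighted
# entropies of its subsets: Sunny (2+,3-), Overcast (4+,0-), Rain (3+,2-)
gain = (entropy(9, 5)
        - (5 / 14) * entropy(2, 3)
        - (4 / 14) * entropy(4, 0)
        - (5 / 14) * entropy(3, 2))
print(gain)  # ~0.247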

PROGRAM:
import numpy as np
import math
import csv

def read_data(filename):
    with open(filename, 'r') as csvfile:
        datareader = csv.reader(csvfile, delimiter=',')
        headers = next(datareader)
        metadata = []
        traindata = []
        for name in headers:
            metadata.append(name)
        for row in datareader:
            traindata.append(row)
    return (metadata, traindata)

class Node:
    def __init__(self, attribute):
        self.attribute = attribute
        self.children = []
        self.answer = ""

    def __str__(self):
        return self.attribute

def subtables(data, col, delete):
    dict = {}
    items = np.unique(data[:, col])
    count = np.zeros((items.shape[0], 1), dtype=np.int32)
    for x in range(items.shape[0]):
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                count[x] += 1
    for x in range(items.shape[0]):
        dict[items[x]] = np.empty((int(count[x]), data.shape[1]), dtype="|S32")
        pos = 0
        for y in range(data.shape[0]):
            if data[y, col] == items[x]:
                dict[items[x]][pos] = data[y]
                pos += 1
        if delete:
            dict[items[x]] = np.delete(dict[items[x]], col, 1)
    return items, dict

def entropy(S):
    items = np.unique(S)
    if items.size == 1:
        return 0
    counts = np.zeros((items.shape[0], 1))
    sums = 0
    for x in range(items.shape[0]):
        counts[x] = sum(S == items[x]) / (S.size * 1.0)
    for count in counts:
        sums += -1 * count * math.log(count, 2)
    return sums

def gain_ratio(data, col):
    items, dict = subtables(data, col, delete=False)
    total_size = data.shape[0]
    entropies = np.zeros((items.shape[0], 1))
    intrinsic = np.zeros((items.shape[0], 1))
    for x in range(items.shape[0]):
        ratio = dict[items[x]].shape[0] / (total_size * 1.0)
        entropies[x] = ratio * entropy(dict[items[x]][:, -1])
        intrinsic[x] = ratio * math.log(ratio, 2)
    total_entropy = entropy(data[:, -1])
    iv = -1 * sum(intrinsic)
    for x in range(entropies.shape[0]):
        total_entropy -= entropies[x]
    return total_entropy / iv

def create_node(data, metadata):
    # If all remaining examples share one class, make a leaf node
    if (np.unique(data[:, -1])).shape[0] == 1:
        node = Node("")
        node.answer = np.unique(data[:, -1])[0]
        return node
    gains = np.zeros((data.shape[1] - 1, 1))
    for col in range(data.shape[1] - 1):
        gains[col] = gain_ratio(data, col)
    split = np.argmax(gains)
    node = Node(metadata[split])
    metadata = np.delete(metadata, split, 0)
    items, dict = subtables(data, split, delete=True)
    for x in range(items.shape[0]):
        child = create_node(dict[items[x]], metadata)
        node.children.append((items[x], child))
    return node

def empty(size):
    s = ""
    for x in range(size):
        s += "   "
    return s

def print_tree(node, level):
    if node.answer != "":
        print(empty(level), node.answer)
        return
    print(empty(level), node.attribute)
    for value, n in node.children:
        print(empty(level + 1), value)
        print_tree(n, level + 2)

metadata, traindata = read_data(r"E:\BALA\AI\Lab programs\pgms\Tennisdata.csv")
data = np.array(traindata)
node = create_node(data, metadata)
print_tree(node, 0)
OUTPUT:

RESULT:
Thus the program to implement the decision tree based ID3 algorithm using Python was executed and verified successfully.
Viva Questions:
1. What are data types in C? Why are they important?

2. What is the difference between int, float, and double data types?

3. What is the size of char, int, float, and double in C?

4. What is the difference between signed and unsigned data types?

5. What is the role of the sizeof() operator in determining data type sizes?
Ex.No:3 IMPLEMENTATION OF BACK PROPAGATION ALGORITHM TO BUILD AN ARTIFICIAL NEURAL NETWORK
Date:

AIM:

To implement the Back Propagation algorithm to build an Artificial Neural Network.

ALGORITHM:
1. Inputs X arrive through the preconnected path.
2. The input is modeled using real weights W. The weights are usually selected randomly.
3. Calculate the output of every neuron from the input layer, through the hidden layers, to the output layer (a single neuron's forward step is sketched below).
4. Calculate the error in the outputs.
5. Travel back from the output layer to the hidden layer and adjust the weights so that
the error decreases. Keep repeating the process until the desired output is achieved.
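
A minimal sketch of one neuron's forward pass, with illustrative weights and inputs (not values from the program below):

from math import exp

# One neuron with two inputs; the last weight acts as the bias,
# matching the convention in the program below
weights = [0.5, -0.4, 0.1]
inputs = [1.0, 2.0]

activation = weights[-1] + sum(w * x for w, x in zip(weights[:-1], inputs))
output = 1.0 / (1.0 + exp(-activation))   # sigmoid transfer (step 3)
derivative = output * (1.0 - output)      # used when propagating error back (step 5)

print(activation, output, derivative)     # -0.2, ~0.450, ~0.247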

PROGRAM:

from math import exp
from random import seed
from random import random

# Initialize a network
def initialize_network(n_inputs, n_hidden, n_outputs):
    network = list()
    hidden_layer = [{'weights': [random() for i in range(n_inputs + 1)]} for i in range(n_hidden)]
    network.append(hidden_layer)
    output_layer = [{'weights': [random() for i in range(n_hidden + 1)]} for i in range(n_outputs)]
    network.append(output_layer)
    return network

# Calculate neuron activation for an input
def activate(weights, inputs):
    activation = weights[-1]
    for i in range(len(weights) - 1):
        activation += weights[i] * inputs[i]
    return activation

# Transfer neuron activation
def transfer(activation):
    return 1.0 / (1.0 + exp(-activation))

# Forward propagate input to a network output
def forward_propagate(network, row):
    inputs = row
    for layer in network:
        new_inputs = []
        for neuron in layer:
            activation = activate(neuron['weights'], inputs)
            neuron['output'] = transfer(activation)
            new_inputs.append(neuron['output'])
        inputs = new_inputs
    return inputs

# Calculate the derivative of a neuron output
def transfer_derivative(output):
    return output * (1.0 - output)

# Backpropagate error and store in neurons
def backward_propagate_error(network, expected):
    for i in reversed(range(len(network))):
        layer = network[i]
        errors = list()
        if i != len(network) - 1:
            for j in range(len(layer)):
                error = 0.0
                for neuron in network[i + 1]:
                    error += (neuron['weights'][j] * neuron['delta'])
                errors.append(error)
        else:
            for j in range(len(layer)):
                neuron = layer[j]
                errors.append(neuron['output'] - expected[j])
        for j in range(len(layer)):
            neuron = layer[j]
            neuron['delta'] = errors[j] * transfer_derivative(neuron['output'])

# Update network weights with error
def update_weights(network, row, l_rate):
    for i in range(len(network)):
        inputs = row[:-1]
        if i != 0:
            inputs = [neuron['output'] for neuron in network[i - 1]]
        for neuron in network[i]:
            for j in range(len(inputs)):
                neuron['weights'][j] -= l_rate * neuron['delta'] * inputs[j]
            neuron['weights'][-1] -= l_rate * neuron['delta']

# Train a network for a fixed number of epochs
def train_network(network, train, l_rate, n_epoch, n_outputs):
    for epoch in range(n_epoch):
        sum_error = 0
        for row in train:
            outputs = forward_propagate(network, row)
            expected = [0 for i in range(n_outputs)]
            expected[row[-1]] = 1
            sum_error += sum([(expected[i] - outputs[i]) ** 2 for i in range(len(expected))])
            backward_propagate_error(network, expected)
            update_weights(network, row, l_rate)
        print('>epoch=%d, lrate=%.3f, error=%.3f' % (epoch, l_rate, sum_error))

# Test training the backprop algorithm
seed(1)
dataset = [[2.7810836, 2.550537003, 0],
           [1.465489372, 2.362125076, 0],
           [3.396561688, 4.400293529, 0],
           [1.38807019, 1.850220317, 0],
           [3.06407232, 3.005305973, 0],
           [7.627531214, 2.759262235, 1],
           [5.332441248, 2.088626775, 1],
           [6.922596716, 1.77106367, 1],
           [8.675418651, -0.242068655, 1],
           [7.673756466, 3.508563011, 1]]
n_inputs = len(dataset[0]) - 1
n_outputs = len(set([row[-1] for row in dataset]))
network = initialize_network(n_inputs, 2, n_outputs)
train_network(network, dataset, 0.5, 20, n_outputs)
for layer in network:
    print(layer)
OUTPUT:

>epoch=0, lrate=0.500, error=6.350


>epoch=1, lrate=0.500, error=5.531
>epoch=2, lrate=0.500, error=5.221
>epoch=3, lrate=0.500, error=4.951
>epoch=4, lrate=0.500, error=4.519
>epoch=5, lrate=0.500, error=4.173
>epoch=6, lrate=0.500, error=3.835
>epoch=7, lrate=0.500, error=3.506
>epoch=8, lrate=0.500, error=3.192
>epoch=9, lrate=0.500, error=2.898
>epoch=10, lrate=0.500, error=2.626
>epoch=11, lrate=0.500, error=2.377
>epoch=12, lrate=0.500, error=2.153
>epoch=13, lrate=0.500, error=1.953
>epoch=14, lrate=0.500, error=1.774
>epoch=15, lrate=0.500, error=1.614
>epoch=16, lrate=0.500, error=1.472
>epoch=17, lrate=0.500, error=1.346
>epoch=18, lrate=0.500, error=1.233
>epoch=19, lrate=0.500, error=1.132
[{'weights': [-1.4688375095432327, 1.850887325439514, 1.0858178629550297], 'output': 0.029980305604426185, 'delta': 0.0059546604162323625}, {'weights': [0.37711098142462157, -0.0625909894552989, 0.2765123702642716], 'output': 0.9456229000211323, 'delta': -0.0026279652850863837}]
[{'weights': [2.515394649397849, -0.3391927502445985, -0.9671565426390275], 'output': 0.23648794202357587, 'delta': 0.04270059278364587}, {'weights': [-2.5584149848484263, 1.0036422106209202, 0.42383086467582715], 'output': 0.7790535202438367, 'delta': -0.03803132596437354}]

RESULT:

Thus the Back Propagation algorithm to build an Artificial Neural Network was implemented successfully.
Ex.No:4 IMPLEMENTATION OF NAÏVE BAYESIAN CLASSIFIER FOR A
SAMPLE TRAINING DATASET AND TO COMPUTE ACCURACY
Date:

AIM:
To implement the Naïve Bayesian classifier for the Tennis data set and to compute the accuracy with a few test data sets.

ALGORITHM:
Step 1: Convert the data set into a frequency table.
Step 2: Create a likelihood table by finding the probabilities, e.g., overcast probability = 0.29 and probability of playing = 0.64.
Step 3: Use the Naive Bayesian equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of the prediction.

Problem: Players will play if the weather is sunny. Is this statement correct? We can solve it using the method of posterior probability discussed above.

P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)

Here we have P(Sunny | Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, and P(Yes) = 9/14 = 0.64.

Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 = 0.60, which is the higher probability.
Step 4: Exit.
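
The worked posterior above can be checked directly; all values are taken from the text:

# P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
p_sunny_given_yes = 3 / 9    # P(Sunny | Yes) = 0.33
p_yes = 9 / 14               # P(Yes) = 0.64
p_sunny = 5 / 14             # P(Sunny) = 0.36

p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(round(p_yes_given_sunny, 2))   # 0.6 -> "play" is the more probable class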

PROGRAM:
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB

data = pd.read_csv(r"E:\BALA\AI\Lab programs\pgms\Tennis.csv")
print("The first 5 values of data is :\n", data.head())

# Obtain train data and train output
X = data.iloc[:, :-1]
print("\nThe first 5 values of train data is\n", X.head())

y = data.iloc[:, -1]
print("\nThe first 5 values of train output is\n", y.head())

# Convert the categorical columns into numbers
le_outlook = LabelEncoder()
X.Outlook = le_outlook.fit_transform(X.Outlook)
le_Temperature = LabelEncoder()
X.Temperature = le_Temperature.fit_transform(X.Temperature)
le_Humidity = LabelEncoder()
X.Humidity = le_Humidity.fit_transform(X.Humidity)
le_Windy = LabelEncoder()
X.Windy = le_Windy.fit_transform(X.Windy)
print("\nNow the train data is :\n", X.head())

le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the train output is\n", y)

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)

classifier = GaussianNB()
classifier.fit(X_train, y_train)

from sklearn.metrics import accuracy_score

print("Accuracy is:", accuracy_score(classifier.predict(X_test), y_test))

OUTPUT:

Accuracy is: 0.6666666666666666

RESULT:

Thus the program to implement the Naïve Bayesian classifier and compute the accuracy with a few test data sets using Python was executed and verified successfully.
Viva Questions:
1. What are input and output statements in C?

2. What is the difference between printf() and scanf()?

3. What is the difference between an expression and a statement in C?

4. What are logical expressions? Give an example using && and ||

5. What is the purpose of putchar() and getchar()?


Ex.No:5 IMPLEMENTATION OF NAÏVE BAYESIAN CLASSIFIER MODEL TO CLASSIFY A SET OF DOCUMENTS AND TO MEASURE THE ACCURACY, PRECISION, AND RECALL
Date:

AIM:
To classify a set of documents using the Naïve Bayesian classifier and to measure the accuracy, precision, and recall.

ALGORITHM:
Step 1: Import basic libraries.
Step 2: Import the dataset.
Step 3: Preprocess the data.
Step 4: Train the model.
Step 5: Test and evaluate the model.
Step 6: Visualize the model.
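
Step 3 is carried out below with a bag-of-words count followed by TF-IDF reweighting. A minimal sketch of that pipeline on a hypothetical two-document toy corpus (the documents here are illustrative):

from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer

docs = ["the cat sat", "the dog sat on the mat"]
counts = CountVectorizer().fit_transform(docs)     # word-count matrix
tfidf = TfidfTransformer().fit_transform(counts)   # reweighted by TF-IDF
print(tfidf.toarray().round(2))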

PROGRAM:

from sklearn.datasets import fetch_20newsgroups
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
import numpy as np

categories = ['alt.atheism', 'soc.religion.christian', 'comp.graphics', 'sci.med']

twenty_train = fetch_20newsgroups(subset='train', categories=categories, shuffle=True)
twenty_test = fetch_20newsgroups(subset='test', categories=categories, shuffle=True)

print(len(twenty_train.data))
print(len(twenty_test.data))
print(twenty_train.target_names)
print("\n".join(twenty_train.data[0].split("\n")))
print(twenty_train.target[0])
OUTPUT:

from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer()
X_train_tf = count_vect.fit_transform(twenty_train.data)

from sklearn.feature_extraction.text import TfidfTransformer

tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_tf)
X_train_tfidf.shape

from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import accuracy_score
from sklearn import metrics

mod = MultinomialNB()
mod.fit(X_train_tfidf, twenty_train.target)

X_test_tf = count_vect.transform(twenty_test.data)
X_test_tfidf = tfidf_transformer.transform(X_test_tf)
predicted = mod.predict(X_test_tfidf)

print("Accuracy:", accuracy_score(twenty_test.target, predicted))
print(classification_report(twenty_test.target, predicted, target_names=twenty_test.target_names))
print("confusion matrix is \n", metrics.confusion_matrix(twenty_test.target, predicted))

OUTPUT:

RESULT:

Thus the accuracy, precision, and recall were measured using the Naïve Bayesian classifier model.
Ex.No:6 CONSTRUCTION OF A BAYESIAN NETWORK TO DIAGNOSE CORONA
INFECTION USING STANDARD WHO DATA SET
Date:

AIM:

To construct a Bayesian network to diagnose corona infection using WHO data set.

ALGORITHM:
This Naive Bayes procedure is broken down into steps:
Step 1: Separate by class.
Step 2: Summarize the dataset.
Step 3: Summarize the data by class.
Step 4: Gaussian probability density function.

Step 5: Class probabilities.
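
Step 4 relies on the Gaussian probability density function. A minimal sketch with illustrative values:

import math

# Gaussian probability density:
# f(x) = (1 / (sqrt(2*pi) * sigma)) * exp(-(x - mean)^2 / (2 * sigma^2))
def gaussian_pdf(x, mean, stdev):
    exponent = math.exp(-((x - mean) ** 2) / (2 * stdev ** 2))
    return exponent / (math.sqrt(2 * math.pi) * stdev)

print(gaussian_pdf(1.0, 1.0, 1.0))   # density at the mean: ~0.399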

PROGRAM:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('ggplot')
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

covid_19_data = pd.read_csv(r"E:\BALA\AI\Lab programs\pgms\covid_19_data.csv")
print(f'The shape of the dataframe is {covid_19_data.shape}')
print()

print(covid_19_data.info())
print()

# Replace the '?' placeholders with NaN so they count as missing values
covid_19_data.replace(to_replace='?', value=np.NaN, inplace=True)

print(covid_19_data.describe(include='all'))
print()
print(covid_19_data['Country/Region'].value_counts())
print(covid_19_data.isnull().sum())

sns.countplot(x='Country/Region', data=covid_19_data, linewidth=3)
plt.show()

covid_19_data[['ObservationDate', 'Province/State', 'Country/Region', 'Last Update',
               'Confirmed', 'Deaths', 'Recovered']].hist(bins=50, figsize=(15, 8))
plt.show()

# Fill missing values with the most frequent value of each column
covid_19_data['Country/Region'].fillna(covid_19_data['Country/Region'].mode()[0], inplace=True)
covid_19_data['Confirmed'].fillna(covid_19_data['Confirmed'].mode()[0], inplace=True)

X = covid_19_data.drop(['Deaths'], axis=1)
y = covid_19_data.Recovered
X = X[['Confirmed', 'Recovered']]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

NB_classifier = GaussianNB()
NB_classifier.fit(X_train, y_train)
y_predict = NB_classifier.predict(X_test)

cm = confusion_matrix(y_test, y_predict)
sns.heatmap(cm, annot=True, cmap='Blues')
plt.show()
print(classification_report(y_test, y_predict))

OUTPUT:
RESULT:

Thus the program to diagnose corona infection using a Bayesian network was successfully implemented using Python.
Ex.No:7 COMPARISON OF CLUSTERING IN EM ALGORITHM AND K-MEANS
ALGORITHM USING THE SAME DATA SETS
Date:

AIM:

To compare the clustering in EM algorithm and K-means algorithm using the same data sets.

ALGORITHM:
The K-means implementation is as follows:
Step 1: Choose the number of clusters k.
Step 2: Select k random points from the data as centroids.
Step 3: Assign all the points to the closest cluster centroid.
Step 4: Recompute the centroids of the newly formed clusters.
Step 5: Repeat steps 3 and 4 until the centroids stop changing.

The EM implementation is as follows:

1. Expectation step (E-step): Estimate (guess) all missing values in the dataset, so
that after completing this step there are no missing values.
2. Maximization step (M-step): Use the estimated data from the E-step to update the
parameters.
3. Repeat the E-step and M-step until the values converge.
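
One assignment-and-update round of k-means (Steps 3 and 4), sketched on hypothetical 1-D toy data with illustrative initial centroids:

import numpy as np

points = np.array([1.0, 1.2, 5.0, 5.2])
centroids = np.array([0.0, 6.0])   # Step 2: illustrative initial centroids

labels = np.argmin(np.abs(points[:, None] - centroids[None, :]), axis=1)  # Step 3
centroids = np.array([points[labels == k].mean() for k in range(2)])      # Step 4
print(labels, centroids)   # [0 0 1 1] [1.1 5.1]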

PROGRAM:

from sklearn.cluster import KMeans
from sklearn import preprocessing
from sklearn.mixture import GaussianMixture
from sklearn.datasets import load_iris
import sklearn.metrics as sm
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

dataset = load_iris()
# print(dataset)

X = pd.DataFrame(dataset.data)
X.columns = ['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width']
y = pd.DataFrame(dataset.target)
y.columns = ['Targets']
# print(X)

plt.figure(figsize=(14, 7))
colormap = np.array(['red', 'lime', 'black'])

# REAL PLOT
plt.subplot(1, 3, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real')

# K-MEANS PLOT
plt.subplot(1, 3, 2)
model = KMeans(n_clusters=3)
model.fit(X)
predY = np.choose(model.labels_, [0, 1, 2]).astype(np.int64)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[predY], s=40)
plt.title('KMeans')

# GMM (EM) PLOT
scaler = preprocessing.StandardScaler()
scaler.fit(X)
xsa = scaler.transform(X)
xs = pd.DataFrame(xsa, columns=X.columns)
gmm = GaussianMixture(n_components=3)
gmm.fit(xs)
y_cluster_gmm = gmm.predict(xs)
plt.subplot(1, 3, 3)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y_cluster_gmm], s=40)
plt.title('GMM Classification')
plt.show()
OUTPUT:

RESULT:
Thus the program to compare clustering in the EM and K-means algorithms with a few data sets was performed successfully.
Ex.No:8 IMPLEMENTATION OF K-NEAREST NEIGHBOUR ALGORITHM TO
CLASSIFY THE IRIS DATA SET
Date:

AIM:

To implement K-Nearest Neighbour algorithm to classify iris data set.

ALGORITHM:
Step 1: Load the iris data set and split it into training and test sets.

Step 2: Choose the number of neighbours k.

Step 3: For each test sample, calculate the distance to every training sample.

Step 4: Select the k nearest training samples and assign the test sample the majority class among them.

Step 5: Compare the predictions with the true labels, print both correct and wrong predictions, and report the accuracy.
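
A minimal sketch of Steps 3 and 4 for k = 1, on two hypothetical 2-D training points:

import numpy as np

train_X = np.array([[1.0, 1.0], [5.0, 5.0]])
train_y = np.array([0, 1])
query = np.array([1.2, 0.9])

distances = np.linalg.norm(train_X - query, axis=1)   # Euclidean distance (Step 3)
print(train_y[np.argmin(distances)])                  # nearest label: 0 (Step 4)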

PROGRAM:

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
import numpy as np

dataset = load_iris()
X_train, X_test, y_train, y_test = train_test_split(dataset["data"], dataset["target"], random_state=0)
kn = KNeighborsClassifier(n_neighbors=1)
kn.fit(X_train, y_train)
for i in range(len(X_test)):
    x = X_test[i]
    x_new = np.array([x])
    prediction = kn.predict(x_new)
    print("TARGET=", y_test[i], dataset["target_names"][y_test[i]],
          "PREDICTED=", prediction, dataset["target_names"][prediction])
print(kn.score(X_test, y_test))
OUTPUT:

TARGET= 2 virginica PREDICTED= [2] ['virginica']


TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 2 virginica PREDICTED= [2] ['virginica']
TARGET= 1 versicolor PREDICTED= [1] ['versicolor']
TARGET= 0 setosa PREDICTED= [0] ['setosa']
TARGET= 1 versicolor PREDICTED= [2] ['virginica']
0.9736842105263158

RESULT:
Thus the program for the K-Nearest Neighbour algorithm was implemented successfully using the iris data set.
Viva Questions:
1. What are the different types of decision-making statements in C?

2. What is the difference between if-else and switch statements?

3. Can we use multiple conditions in an if statement? How?

4. Can we use if inside a switch statement? Explain with an example

5. What is the ternary (?:) operator? How does it work as a decision-making tool?
Ex.No:9 IMPLEMENTATION OF THE NON-PARAMETRIC LOCALLY WEIGHTED
REGRESSION ALGORITHM IN ORDER TO FIT DATA POINTS
Date:

AIM:

To implement the non-parametric Locally Weighted Regression algorithm in order to fit data points.

ALGORITHM:
Step 1: Read the given data sample to X and the curve (linear or non-linear) to Y.

Step 2: Set the value of the smoothing (free) parameter τ.

Step 3: Set the point of interest x0, which is a subset of X.

Step 4: Determine the weight matrix W, whose diagonal entries are given by
w(i) = exp(-(x(i) - x0)² / (2τ²)).

Step 5: Determine the value of the model parameter β using β = (XᵀWX)⁻¹XᵀWy.

Step 6: Prediction = x0 · β.
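
Steps 4 to 6 can be evaluated directly at a single query point. A minimal sketch with illustrative data and τ (the program below uses the iterative LOWESS variant instead):

import numpy as np

x = np.linspace(0, 10, 50)
y = 2 * x + np.random.randn(50)
x0, tau = 5.0, 1.0

X = np.c_[np.ones_like(x), x]                           # design matrix with bias column
W = np.diag(np.exp(-(x - x0) ** 2 / (2 * tau ** 2)))    # weight matrix of Step 4
beta = np.linalg.pinv(X.T @ W @ X) @ X.T @ W @ y        # beta = (X'WX)^-1 X'Wy (Step 5)
print(np.array([1.0, x0]) @ beta)                       # prediction at x0, close to 10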

PROGRAM:

from math import ceil
import math
import numpy as np
from scipy import linalg
import matplotlib.pyplot as plt

def lowess(x, y, f, iterations):
    n = len(x)
    r = int(ceil(f * n))
    # Bandwidth for each point: distance to its r-th nearest neighbour
    h = [np.sort(np.abs(x - x[i]))[r] for i in range(n)]
    w = np.clip(np.abs((x[:, None] - x[None, :]) / h), 0.0, 1.0)
    w = (1 - w ** 3) ** 3          # tricube weighting kernel
    yest = np.zeros(n)
    delta = np.ones(n)
    for iteration in range(iterations):
        for i in range(n):
            weights = delta * w[:, i]
            b = np.array([np.sum(weights * y), np.sum(weights * y * x)])
            A = np.array([[np.sum(weights), np.sum(weights * x)],
                          [np.sum(weights * x), np.sum(weights * x * x)]])
            beta = linalg.solve(A, b)          # solve the weighted least-squares system
            yest[i] = beta[0] + beta[1] * x[i]
        # Robustifying iterations: down-weight points with large residuals
        residuals = y - yest
        s = np.median(np.abs(residuals))
        delta = np.clip(residuals / (6.0 * s), -1, 1)
        delta = (1 - delta ** 2) ** 2
    return yest

n = 100
x = np.linspace(0, 2 * math.pi, n)
y = np.sin(x) + 0.3 * np.random.randn(n)
f = 0.25
iterations = 3
yest = lowess(x, y, f, iterations)

plt.plot(x, y, "r.")
plt.plot(x, yest, "b-")
plt.show()

OUTPUT:

RESULT:

Thus the non-parametric Locally Weighted Regression algorithm to fit data points was
implemented successfully.
Viva Questions:
1. What is recursion in C? How does it work?

2. What is the difference between iteration and recursion?

3. What are the advantages and disadvantages of recursion?

4. What is the role of the function stack in recursion?

5. What is tail recursion? How is it different from regular recursion?
