
L. D. College of Engineering

Lab Manual for

Artificial Intelligence and Machine Learning
(Subject Code: 3161710)

B.E. 6th Semester - IC

Name of Faculty: Dr. Dipesh Makwana & Prof. Kruti Joshi

Enrolment No: 190280117004

Name of Student: Dhrumil Bhavsar


Index

Sr. No.   Name of Experiment

1.   Implementation of linear regression
2.   Implementation of gradient descent with single variable
3.   Implementation of gradient descent with multi-variable
4.   Polynomial Regression
5.   Logistic Regression
6.   An Implementation of Artificial Neural Networks using Back propagation
7.   Implementation of SVM with simple features
8.   Implementing K-means clustering algorithm
9.   Introduction to Python Programming for Machine Learning
10.  Write a program for the concept of decision tree to develop a piecewise linear model and test it as well.
11.  Write a program for kNN algorithm for classification of IRIS dataset
12.  Write a program using Bayes algorithm for email classification (spam or non-spam) for the open-sourced data set from the UC Irvine Machine Learning Repository
13.  Write a program using SVM on IRIS dataset and carry out classification.
14.  Write a program using SVM algorithm for Boston house price prediction dataset to predict price of houses from certain features


ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING – 3161710

SEMESTER VI

PRACTICAL – 1

AIM: Implementation of linear regression using MATLAB.

THEORY:

Regression analysis is a predictive modelling technique that investigates the relationship between a dependent variable and one or more independent variables.

Linear Regression

Linear regression is a basic and commonly used type of predictive analysis which usually works on continuous data.

A linear regression graph is made by generating a scatter plot between the independent and dependent variables and drawing a regression line, which is the line closest to most of the points on the plot. This line has minimum error and will be used for prediction on newer data. This line is a linear representation of the plot and is described by:

y = B1x + B0 + E0
Where,

y = Dependent Variable

x = Independent Variable

B1 = Slope of the regression line

B0 = Y intercept

E0 = error in data of the variables


Example of a Linear Regression plot

Types of Linear Regression:

Based on the number of independent variables, there are two types of linear regression:


1. Simple Linear Regression:

In simple linear regression, the dependent variable depends only on a single independent variable.

For simple linear regression, the form of the model is:
Y = β0 + β1X

Here,
 Y is a dependent variable.
 X is an independent variable.
 β0 and β1 are the regression coefficients.
 β0 is the intercept or the bias that fixes the offset to a line.
 β1 is the slope or weight that specifies the factor by which X has an
impact on Y.

The following 3 cases are possible:


 
Case-01: β1 < 0
 
 It indicates that variable X has negative impact on Y.
 If X increases, Y will decrease and vice-versa.


Case-02: β1 = 0
 
 It indicates that variable X has no impact on Y.
 If X changes, there will be no change in Y.

Case-03: β1 > 0
 
 It indicates that variable X has positive impact on Y.
 If X increases, Y will increase and vice-versa.


2. Multiple Linear Regression:

In multiple linear regression, the dependent variable depends on more than one independent variable.

For multiple linear regression, the form of the model is:

Y = β0 + β1X1 + β2X2 + β3X3 + …… + βnXn

Here,
 Y is a dependent variable.
 X1, X2, …., Xn are independent variables.
 β0, β1,…, βn are the regression coefficients.
 βj (1<=j<=n) is the slope or weight that specifies the factor by which Xj has an
impact on Y.
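For illustration, here is a small sketch (in Python/NumPy rather than MATLAB; the house-price data and features below are made up) of fitting such a multiple linear regression model by least squares:

import numpy as np

# Hypothetical data: house price (Y) from size and number of bedrooms (X1, X2)
X1 = np.array([50, 70, 80, 100, 120])      # size in square metres
X2 = np.array([1, 2, 2, 3, 3])             # number of bedrooms
Y  = np.array([150, 210, 230, 300, 350])   # price in thousands

# Design matrix with a column of ones for the intercept beta0
A = np.column_stack([np.ones(len(Y)), X1, X2])

# Least-squares estimate of [beta0, beta1, beta2]
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(beta)                                 # regression coefficients

# Prediction for a new house: Y = beta0 + beta1*X1 + beta2*X2
print(beta[0] + beta[1]*90 + beta[2]*2)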

MATLAB Code/Program:
x = [10;15;20;25;30;35;40;45;50;55;60];
y = [365;387;451;499;567;609;677;725;777;808;989];

plot (x,y, 'b.');

xlabel ('Time (hours)');


ylabel ('Output of Process (gm/mol)');

X = [ones(length(x),1) x];

[A,~,~,~,STATS] = regress(y,X);

hold on

xplot = [min(x), max(x)];


yplot = A(1) +A(2)*xplot;
plot (xplot,yplot, 'r');
legend('Data', 'Model');


Output:

Example 1:
The table below shows some data from the early days of the Italian clothing company
Benetton. Each row in the table shows Benetton’s sales for a year and the amount spent on
advertising that year. In this case, our outcome of interest is sales—it is what we want to
predict. If we use advertising as the predictor variable, linear regression estimates that Sales =
168 + 23 Advertising. That is, if advertising expenditure is increased by one million Euro, then
sales will be expected to increase by 23 million Euros, and if there was no advertising, we
would expect sales of 168 million Euros.


Code:
x = [23;26;30;34;43;48;52;57;58];
y = [651;762;856;1063;1190;1298;1421;1440;1518];

plot (x,y, 'b.');

xlabel ('Advertising (Million Euros)');


ylabel ('Sales (Million Euros)');

X = [ones(length(x),1) x];

[A,~,~,~,STATS] = regress(y,X);

hold on

xplot = [min(x), max(x)];


yplot = A(1)+A(2)*xplot;

plot(xplot,yplot, 'r');
legend('Data', 'Model');

Output:

Conclusion:
In this experiment we were able to implement linear regression
using MATLAB.


PRACTICAL – 2

AIM: Implementation of Gradient descent algorithm for single variable.

Theory:

Model Representation
First, the goal of most machine learning algorithms is to construct a model: a
hypothesis that can be used to estimate Y based on X. The hypothesis, or model,
maps inputs to outputs. So, for example, say I train a model based on a bunch of
housing data that includes the size of the house and the sale price. By training a
model, I can give you an estimate on how much you can sell your house for
based on its size. This is an example of a regression problem — given some
input, we want to predict a continuous output.
The hypothesis is usually presented as

hϴ(x) = ϴ0 + ϴ1x

The theta values ϴ0 and ϴ1 are the parameters.


Some quick examples of how we visualize the hypothesis:

Choosing ϴ0 = 1.5 and ϴ1 = 0 yields h(x) = 1.5 + 0x.

A slope of 0 means h(x) will always be the constant 1.5.
This looks like:


The goal of creating a model is to choose parameters, or theta values, so that h(x)
is close to y for the training data, x and y. So for this data

X = [1, 1, 2, 3, 4, 3, 4, 6, 4]

Y = [2, 1, 0.5, 1, 3, 3, 2, 5, 4]


Cost Function

We need a function that will minimize the parameters over our dataset. One common function that is often used is the mean squared error, which measures the difference between the estimator (the dataset) and the estimated value (the prediction). It looks like this:

J(ϴ) = (1/m) Σ (h(x(i)) − y(i))²

It turns out we can adjust the equation a little to make the calculation down the track a little simpler. We end up with:

J(ϴ) = (1/2m) Σ (h(x(i)) − y(i))²

Let's apply this cost function to the following data:


For now we will calculate some theta values and plot the cost function by hand. Since this function passes through (0, 0), we are only looking at a single parameter, the slope. From here on out, I'll refer to the cost function as J(ϴ).
For J(1), we get 0. No surprise — a slope of 1 yields a straight line that fits the data perfectly. How about J(0.5)?

The MSE function gives us a value of 0.58. Let’s plot both our values so far:

J(1) = 0
J(0.5) = 0.58
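As a quick check of the J(0.5) value, assume the sample data is the three points (1, 1), (2, 2), (3, 3) on the line y = x (the data set shown in the original figure is an assumption here):

# Hypothetical data assumed for this worked example: the line y = x sampled at x = 1, 2, 3
x = [1, 2, 3]
y = [1, 2, 3]

theta = 0.5                    # slope being evaluated; the intercept is 0
m = len(x)

# J(theta) = (1/2m) * sum of squared errors
J = sum((theta * xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)
print(J)                       # 0.5833..., i.e. the 0.58 quoted above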


We will go ahead and calculate some more values of J(ϴ).

And if we join the dots together nicely,


We can see that the cost function is at a minimum when theta = 1. This makes
sense — our initial data is a straight line with a slope of 1 (the orange line in the
figure above).

Gradient Descent

We minimized J(ϴ) by trial and error above — just trying lots of values and visually inspecting the resulting graph. There must be a better way?
Cue gradient descent. Gradient descent is a general algorithm for minimizing a function, in this case the mean squared error cost function.
Gradient descent basically does what we were doing by hand — it changes the theta values, or parameters, bit by bit, until we hopefully arrive at a minimum.
We start by initializing theta0 and theta1 to any two values, say 0 for both, and go from there. Formally, the algorithm repeats the following update for each parameter until convergence:

ϴj := ϴj − α · ∂J(ϴ)/∂ϴj

where α, alpha, is the learning rate, or how quickly we want to move towards the minimum. If α is too large, however, we can overshoot.


MATLAB Code/Program:
clc
clear
close all
x = [0,0.14,0.18,0.28,0.37,0.45,0.56,0.69,0.78,1.00];
y = [0, 0.27,0.56,0.21,0.66,0.32,0.65,0.87,1.09, 0.51];
n = length(x);
b0 = 0;
b1 = 0;
plot(x,y, '.b');
hold on
for c = 1:500
    y0 = b1*x + b0;                % current predictions
    CF = sum((y0-y).^2)/(2*n);     % cost function J(theta)
    e = sum(y0-y);                 % gradient term for the intercept b0
    g = sum((y0-y).*x);            % gradient term for the slope b1

    b0 = b0 - (e*0.01);            % learning rate alpha = 0.01
    b1 = b1 - (g*0.01);

    Y = b1*x + b0;
    tem = plot(x,Y,'r');           % plot the current fit line

    pause(0.1);
    if(c~=500)
        delete(tem)                % erase the line before the next iteration
    end
end
Output:
Output:

Conclusion:
In this experiment we were able to implement Gradient Descent for
single variable using MATLAB.


PRACTICAL – 3

AIM: Implementation of gradient descent algorithm for multiple variables.

Theory:

In the case of multivariate linear regression, the output value depends on multiple input values. The relationship between the input values, the format of the different input values, and the range of the input values all play an important role in linear model creation and prediction.

h(x) = θ0 + θ1x1 + θ2x2 + ⋯ + θnxn

Where: x1, x2, …, xn are the multiple input values.

If we consider the house price example, then the factors affecting its price, like house size, number of bedrooms, location etc., are nothing but the input variables of the above hypothesis function.

Cost Function

Our cost function remains the same as in single-variable linear regression.

Gradient Descent Algorithm

The gradient descent algorithm keeps the same form as in univariate linear regression, but here we have to perform the update for all the theta values (number of theta values = number of features + 1). A vectorized sketch of this update is shown below.
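The following is a small illustrative sketch in Python/NumPy (the data is made up) of batch gradient descent for h(x) = θ0 + θ1x1 + θ2x2, updating all theta values together:

import numpy as np

# Hypothetical training data with two input features x1, x2
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [4.0, 5.0]])
y = np.array([6.0, 5.0, 10.0, 14.0])

m, n_features = X.shape
A = np.column_stack([np.ones(m), X])   # add a column of ones for theta0
theta = np.zeros(n_features + 1)       # number of theta values = features + 1
alpha = 0.05                           # learning rate

for _ in range(2000):
    predictions = A @ theta            # h(x) for every sample at once
    errors = predictions - y
    gradient = (A.T @ errors) / m      # dJ/dtheta for all parameters together
    theta = theta - alpha * gradient   # simultaneous update of every theta

print(theta)                           # learned [theta0, theta1, theta2]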


MATLAB Code/Program:
clc
clear all
close all
figure;
x=[1,5,7,11,3];
y=[1,3,9,6,4];
format long
m=0;                          % slope
c=0;                          % intercept
u=[];
v=[];
plot(x,y,'bo','linewidth',3);
axis([0 12 0 10]);
hold on;
pause(1);
for ua=1:6                    % number of passes over the data
    for i=1:length(x)         % per-sample (stochastic) gradient descent
        predicted=m*x(i)+c;
        error=predicted-y(i);
        g=m-0.01*error*x(i);  % update slope
        m=g;
        k=c-0.01*error;       % update intercept
        c=k;
        u=[u m];
        v=[v c];
        e=g*x+k;              % current fit line
        rs=plot(x,e,'k','linewidth',3);
        axis([0 12 0 10]);
        pause(0.3);
        if(ua~=6 || i~=length(x))
            delete(rs);
        end
    end
end
Output:

Conclusion:

In this experiment we were able to implement Gradient Descent


for multi variable using MATLAB.


PRACTICAL – 4

AIM: Implementation of Polynomial regression.

Theory:
 Polynomial Regression is a regression algorithm that models the relationship between a dependent variable (y) and an independent variable (x) as an nth degree polynomial. The Polynomial Regression equation is given below:

y = b0 + b1x1 + b2x1^2 + b3x1^3 + ...... + bnx1^n


 It is also called the special case of Multiple Linear Regression in ML.
Because we add some polynomial terms to the Multiple Linear
regression equation to convert it into Polynomial Regression.
 It is a linear model with some modification in order to increase the
accuracy.
 The dataset used in Polynomial regression for training is of non-linear
nature.
 It makes use of a linear regression model to fit the complicated and
non-linear functions and datasets.
 Hence, "In Polynomial regression, the original features are converted
into Polynomial features of required degree (2,3,n) and then modelled
using a linear model."
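A small illustrative sketch of this idea in Python with scikit-learn (the data below is made up): the original feature x is converted into polynomial features and then fitted with an ordinary linear model.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical non-linear data for illustration
x = np.linspace(-3, 3, 30).reshape(-1, 1)
y = 0.5 * x.ravel() ** 2 + x.ravel() + np.random.normal(0, 0.3, 30)

# Convert the original feature into polynomial features of degree 2
poly = PolynomialFeatures(degree=2)
x_poly = poly.fit_transform(x)                   # columns: 1, x, x^2

# Fit an ordinary linear model on the transformed features
model = LinearRegression()
model.fit(x_poly, y)

print(model.intercept_, model.coef_)             # b0 and the weights for x, x^2
print(model.predict(poly.transform([[2.0]])))    # prediction at x = 2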

Need for Polynomial Regression:

The need of Polynomial Regression in ML can be understood in the below


points:
 If we apply a linear model to a linear dataset, it gives a good result, as we saw in Simple Linear Regression. But if we apply the same model, without any modification, to a non-linear dataset, it produces a poor fit: the loss function increases, the error rate becomes high, and accuracy decreases.


 So, for such cases, where data points are arranged in a non-linear


fashion, we need the Polynomial Regression model. We can
understand it in a better way using the below comparison diagram of
the linear dataset and non-linear dataset.
 In the below image, we have taken a dataset which is arranged non-
linearly. So, if we try to cover it with a linear model, then we can
clearly see that it hardly covers any data point. On the other hand, a
curve is suitable to cover most of the data points, which is of the
Polynomial model.
 Hence, if the datasets are arranged in a non-linear fashion, then we
should use the Polynomial Regression model instead of Simple Linear
Regression.

 To understand the need for polynomial regression, let’s generate some


random dataset first.

 The data generated looks like:


 Let’s apply a linear regression model to this dataset.

 The plot of the best fit line is:

 We can see that the straight line is unable to capture the patterns in the
data. This is an example of under-fitting. Compute the RMSE and R²-
score of the linear line.
 To overcome under-fitting, we need to increase the complexity of the
model.
 To generate a higher order equation, we can add powers of the original features as new features. The linear model,

y = θ0 + θ1x

 can be transformed to

y = θ0 + θ1x + θ2x²


 This is still considered to be a linear model, as the coefficients/weights associated with the features are still linear. x² is only a feature. However, the curve that we are fitting is quadratic in nature.
 To convert the original features into their higher order terms we will use
the Polynomial Features class provided by scikit-learn. Next, we train the
model using Linear Regression.
 Fitting a linear regression model on the transformed features gives the
below plot.

 It is quite clear from the plot that the quadratic curve is able to fit the data
better than the linear line. Compute the RMSE and R²-score of the
quadratic plot.
 If we try to fit a cubic curve (degree=3) to the dataset, we can see that it
passes through more data points than the quadratic and the linear plots.


 Below is a comparison of fitting linear, quadratic and cubic curves on the

dataset.
 If we further increase the degree to 20, we can see that the curve passes
through more data points. Below is a comparison of curves for degree 3 and
20.


 For degree=20, the model is also capturing the noise in the data. This is an
example of over-fitting. Even though this model passes through most of
the data, it will fail to generalize on unseen data.
 To prevent over-fitting, we can add more training samples so that the
algorithm doesn’t learn the noise in the system and can become more
generalized.

How do we choose an optimal model?


 To answer this question we need to understand the bias vs variance trade-
off.
 The Bias vs Variance trade-off
 Bias refers to the error due to the model’s simplistic assumptions in fitting
the data. A high bias means that the model is unable to capture the patterns
in the data and this results in under-fitting.
 Variance refers to the error due to the complex model trying to fit the data.
High variance means the model passes through most of the data points and
it results in over-fitting the data.
 The below picture summarizes our learning.


 From the above picture we can observe that as the model complexity
increases, the bias decreases and the variance increase and vice-versa.
Ideally, a machine learning model should have low variance and low
bias. But practically it’s impossible to have both. Therefore, to achieve a
good model that performs well both on the train and unseen data, a trade-
off is made.

Advantages of using Polynomial Regression:

 Polynomial regression provides the best approximation of the relationship between the dependent and independent variable.
 A broad range of functions can be fit under it.
 Polynomial regression can fit a wide range of curvature.

Disadvantages of using Polynomial Regression:


 The presence of one or two outliers in the data can seriously affect the
results of the nonlinear analysis.
 These are too sensitive to the outliers.
 In addition, there are unfortunately fewer model validation tools for the
detection of outliers in nonlinear regression than there are for linear
regression.

MATLAB Code/Program:
clc
x = [1 3 5 6 7 2 5 4 9 7 8 3 5 6];
y = [10 20 40 87 62 56 71 22 29 91 29 35 23 30];
plot(x,y,'ro','linewidth', 2);
hold on
p5 = polyfit(x,y,5);
xc = 1:.1:10;
y5 = polyval(p5,xc);
plot(xc,y5, 'g.-','linewidth',3)
grid
legend('original data','5th order fit')

Output:


Conclusion:

In this experiment we were able to implement polynomial


regression using MATLAB.

PRACTICAL – 5

Aim: Implementation of logistic regression.

Theory:

Logistic regression is a supervised learning classification algorithm used to predict the probability of a target variable. The target or dependent variable is dichotomous in nature: it has only two possible classes.

• In simple words, the dependent variable is binary in nature having data coded
as either 1 or 0.


• Mathematically, a logistic regression model predicts P(Y=1) as a function of


X. It is one of the simplest ML algorithms that can be used for various
classification problems such as spam detection, Diabetes prediction, cancer
detection etc.

Curve of Logistic Regression

Type of Logistic Regression:

On the basis of the categories,

Logistic Regression can be classified into three types:

Binomial: In binomial Logistic regression, there can be only two possible types
of the dependent variables, such as 0 or 1, Pass or Fail, etc.

Multinomial: In multinomial Logistic regression, there can be 3 or more


possible unordered types of the dependent variable, such as "cat", "dogs", or
"sheep"

Ordinal: In ordinal Logistic regression, there can be 3 or more possible ordered


types of dependent variables, such as "low", "Medium", or "High".

Logistic Function (Sigmoid Function):


The sigmoid function is a mathematical function used to map the predicted


values to probabilities.

 It maps any real value into another value within a range of 0 and 1.
 The value of the logistic regression must be between 0 and 1, which
cannot go beyond this limit, so it forms a curve like the "S" form. The S-
form curve is called the Sigmoid function or the logistic function.
 In logistic regression, we use the concept of a threshold value, which decides the predicted class: values above the threshold tend to 1, and values below the threshold tend to 0.

Regression Models:

• Binary Logistic Regression Model − The simplest form of logistic regression


is binary or binomial logistic regression in which the target or dependent
variable can have only 2 possible types, either 1 or 0.

• Multinomial Logistic Regression Model − Another useful form of logistic


regression is multinomial logistic regression in which the target or dependent
variable can have 3 or more possible unordered types i.e. the types having no
quantitative significance.

Logistic Regression Equation:

The Logistic regression equation can be obtained from the Linear Regression
equation. The mathematical steps to get Logistic Regression equations are given
below:

o We know the equation of the straight line can be written as:

   y = b0 + b1x1 + b2x2 + ... + bnxn

o In Logistic Regression y can be between 0 and 1 only, so let's divide the above equation by (1 − y):

   y / (1 − y);  which is 0 for y = 0 and infinity for y = 1

o But we need a range between −[infinity] and +[infinity], so taking the logarithm of the equation, it becomes:

   log[ y / (1 − y) ] = b0 + b1x1 + b2x2 + ... + bnxn

The above equation is the final equation for Logistic Regression.

Binary Logistic Regression Model of ML

• The simplest form of logistic regression is binary or binomial logistic


regression in which the target or dependent variable can have only 2 possible
types either 1 or 0. It allows us to model a relationship between multiple
predictor variables and a binary/binomial target variable. In case of logistic
regression, the linear function is basically used as an input to another function
such as 𝑔 in the following relation:

hθ(x) = g(θᵀx),  where 0 ≤ hθ(x) ≤ 1

• Here, 𝑔 is the logistic or sigmoid function which can be given as follows:

g(z) = 1 / (1 + e^(−z)),  where z = θᵀx

• The sigmoid curve can be represented with the help of the following graph. We can see that the values on the y-axis lie between 0 and 1 and that the curve crosses the axis at 0.5.

The classes can be divided into positive or negative. The output, lying between 0 and 1, is the probability of the positive class. For our implementation, we interpret the output of the hypothesis function as positive if it is ≥ 0.5, otherwise negative.

• We also need to define a loss function to measure how well the algorithm
performs using the weights on functions, represented by theta as follows:

h = g(Xθ)

J(θ) = (1/m) · (−yᵀ log(h) − (1 − y)ᵀ log(1 − h))


• Now, after defining the loss function our prime goal is to minimize the loss
function. It can be done with the help of fitting the weights which means by
increasing or decreasing the weights. With the help of derivatives of the loss
function w.r.t each weight, we would be able to know what parameters should
have high weight and what should have smaller weight.

• The following gradient descent equation tells us how loss would change if we
modified the parameters −

∂J(θ)/∂θj = (1/m) Xᵀ (g(Xθ) − y)
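To make the loss and gradient above concrete, here is a small illustrative sketch in Python/NumPy (the data is made up) of a few gradient descent steps for binary logistic regression, using the 0.5 threshold mentioned earlier for prediction:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data: a bias column plus one feature, and binary labels
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]])
y = np.array([0.0, 0.0, 1.0, 1.0])

m = len(y)
theta = np.zeros(X.shape[1])
alpha = 0.5                                              # learning rate

for _ in range(5000):
    h = sigmoid(X @ theta)                               # h = g(X theta)
    loss = (-y @ np.log(h) - (1 - y) @ np.log(1 - h)) / m   # J(theta), mirrors the formula above
    gradient = X.T @ (h - y) / m                         # dJ/dtheta
    theta = theta - alpha * gradient

print(theta, loss)
print((sigmoid(X @ theta) >= 0.5).astype(int))           # predicted classes with the 0.5 threshold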

MATLAB Code/Program:

x = rand (100, 1);


y = x > 0.5;
y (1:60) = x (1:60) > 0.3; % to avoid perfect separation
% fit model
mdl = fitglm (x, y, "distribution", "binomial");
xnew =linspace (0, 1, 1000)'; %test data
ynew = predict (mdl, xnew);
scatter (x,y);
hold on
plot(xnew,ynew)

Output:


Conclusion:

In this experiment we were able to implement logistic regression


using MATLAB.

PRACTICAL – 6

AIM: Implementation of Artificial Neural Network using Back


propagation.

Theory:

Artificial Neural Networks

• A neural network is a group of connected I/O units where each connection has
a weight associated with its computer programs. It helps you to build predictive
models from large databases. This model builds upon the human nervous

system. It helps you to conduct image understanding, human learning, computer


speech, etc.

Back-propagation

• Back-propagation is the essence of neural net training. It is the method of fine-


tuning the weights of a neural net based on the error rate obtained in the
previous epoch (i.e. iteration). Proper tuning of the weights allows you to
reduce error rates and to make the model reliable by increasing its
generalization.

• Back-propagation is a short form for "backward propagation of errors." It is a


standard method of training artificial neural networks. This method helps to
calculate the gradient of a loss function with respect to all the weights in the
network.

Working of Back-propagation: Simple Algorithm

Consider the following diagram:

1. Inputs X arrive through the preconnected path

2. Input is modeled using real weights W. The weights are usually randomly
selected.

3. Calculate the output for every neuron from the input layer, to the hidden
layers, to the output layer.

4. Calculate the error in the outputs

Error = Actual Output – Desired Output

5. Travel back from the output layer to the hidden layer to adjust the weights
such that the error is decreased.

6. Keep repeating the process until the desired output is achieved
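As an illustration of steps 1 to 6, the sketch below (in Python/NumPy, with made-up weights and a single training example) runs one forward pass and one weight update for a tiny 2-2-1 sigmoid network:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical tiny network: 2 inputs -> 2 hidden neurons -> 1 output
x = np.array([0.0, 1.0])          # one training input
target = 1.0                      # desired output
W1 = np.array([[0.1, 0.4],        # hidden-layer weights (2x2), chosen arbitrarily
               [0.8, 0.6]])
W2 = np.array([0.3, 0.9])         # output-layer weights (1x2)
lr = 0.5                          # learning rate

# Forward pass (step 3)
hidden = sigmoid(W1 @ x)
output = sigmoid(W2 @ hidden)

# Error at the output (step 4)
error = target - output

# Backward pass (step 5): deltas for the output and hidden neurons
delta_out = output * (1 - output) * error
delta_hidden = hidden * (1 - hidden) * W2 * delta_out

# Weight updates (step 5 continued); repeat steps 3-5 until the error is small (step 6)
W2 = W2 + lr * delta_out * hidden
W1 = W1 + lr * np.outer(delta_hidden, x)

print(output, error)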


Need of Back-propagation (Advantages)

Most prominent advantages of Back-propagation are:

• Back-propagation is fast, simple and easy to program.

• It has no parameters to tune apart from the number of inputs.

• It is a flexible method as it does not require prior knowledge about the


network.

• It is a standard method that generally works well.

• It does not need any special mention of the features of the function to be
learned.

Disadvantages of using Back-propagation

• The actual performance of back-propagation on a specific problem is


dependent on the input data.

• Back-propagation can be quite sensitive to noisy data

• You need to use the matrix-based approach for back-propagation instead of


mini-batch.

Types of Back-propagation Networks

Two Types of Back-propagation Networks are:

• Static Back-propagation

• Recurrent Back-propagation

Static back-propagation:

• It is one kind of back-propagation network which produces a mapping of a


static input for static output. It is useful to solve static classification issues like
optical character recognition.

Recurrent Back-propagation:

• Recurrent back-propagation is fed forward until a fixed value is achieved.


After that, the error is computed and propagated backward.

The main difference between both of these methods is: that the mapping is rapid
in static back propagation while it is non-static in recurrent back-propagation.

MATLAB Code:
input = [0 0; 0 1; 1 0; 1 1];
output = [0;1;1;0];
bias = [-1 -1 -1];
coeff = 1;
iterations = 9;
weights = [0 0 0; 1 3 4; 9 5 2];
for i = 1:iterations
    out = zeros(4,1);
    numIn = length(input(:,1));
    for j = 1:numIn
        % Forward pass: hidden neurons H1, H2 and the output neuron
        H1 = bias(1,1)*weights(1,1)+input(j,1)*weights(1,2)+input(j,2)*weights(1,3);
        x2(1) = sigma(H1);
        H2 = bias(1,2)*weights(2,1)+input(j,1)*weights(2,2)+input(j,2)*weights(2,3);
        x2(2) = sigma(H2);
        x3_1 = bias(1,3)*weights(3,1)+x2(1)*weights(3,2)+x2(2)*weights(3,3);
        out(j) = sigma(x3_1);
        % Backward pass: deltas for the output and hidden neurons
        delta3_1 = out(j)*(1-out(j))*(output(j)-out(j));
        delta2_1 = x2(1)*(1-x2(1))*weights(3,2)*delta3_1;
        delta2_2 = x2(2)*(1-x2(2))*weights(3,3)*delta3_1;

        for k = 1:3
            if k == 1   % bias weights
                weights(1,k) = weights(1,k)+coeff*bias(1,1)*delta2_1;
                weights(2,k) = weights(2,k)+coeff*bias(1,2)*delta2_2;
                weights(3,k) = weights(3,k)+coeff*bias(1,3)*delta3_1;
            else        % input and hidden-activation weights
                weights(1,k) = weights(1,k)+coeff*input(j,k-1)*delta2_1;
                weights(2,k) = weights(2,k)+coeff*input(j,k-1)*delta2_2;
                weights(3,k) = weights(3,k)+coeff*x2(k-1)*delta3_1;
            end
        end
    end
end
disp(out)


function y = sigma(x)
y = 1./(1+exp(-x))
end

Output:

aiml6

y =

0.5000

y =

0.7311

y =

0.9936

y =

0.5020

y =

0.5006

y =

0.1801

Conclusion:
In this experiment we were able to implement an Artificial Neural
Network using back-propagation in MATLAB.

PRACTICAL – 7

AIM: Implementation of Support Vector Machines using simple features.

Theory:


Support vector machines (SVMs) are powerful yet flexible supervised machine
learning algorithms which are used both for classification and regression. But
generally, they are used in classification problems.

An SVM model is basically a representation of different classes separated by a hyperplane in multidimensional space. The hyperplane is generated in an iterative manner by SVM so that the error can be minimized. The goal of SVM is to divide the datasets into classes to find a maximum marginal hyperplane.

The followings are important concepts in SVM −


 Support Vectors − Data points that are closest to the hyperplane are called support vectors. The separating line will be defined with the help of these data points.
 Hyperplane − As we can see in the above diagram, it is a decision plane or space which divides a set of objects having different classes.
 Margin − It may be defined as the gap between two lines on the closest data points of different classes. It can be calculated as the perpendicular distance from the line to the support vectors. A large margin is considered a good margin and a small margin is considered a bad margin.


The main goal of SVM is to divide the datasets into classes to find a maximum marginal hyperplane, and it can be done in the following two steps:

1. First, SVM will generate hyperplanes iteratively that segregate the classes in the best way.
2. Then, it will choose the hyperplane that separates the classes correctly.
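To see the support vectors and the margin in code, here is a small illustrative sketch using scikit-learn in Python (the points are made up); the MATLAB program below does the equivalent with fitcsvm:

import numpy as np
from sklearn import svm

# Hypothetical, linearly separable 2-D data with labels -1 and +1
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.0],
              [5.0, 5.0], [6.0, 5.5], [5.5, 6.5]])
y = np.array([-1, -1, -1, 1, 1, 1])

# Linear SVM; a very large C approximates a hard margin
clf = svm.SVC(kernel='linear', C=1e6)
clf.fit(X, y)

print(clf.support_vectors_)          # the data points closest to the hyperplane
w = clf.coef_[0]                     # normal vector of the separating hyperplane
b = clf.intercept_[0]
print(w, b, 2.0 / np.linalg.norm(w)) # hyperplane parameters and margin width
print(clf.predict([[2.0, 2.0], [6.0, 6.0]]))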

MATLAB Code/Program:

rng(1); % For reproducibility


r = sqrt(rand(100,1)); % Radius
t = 2*pi*rand(100,1); % Angle
data1 = [r.*cos(t), r.*sin(t)]; % Points
r2 = sqrt(3*rand(100,1)+1); % Radius
t2 = 2*pi*rand(100,1); % Angle
data2 = [r2.*cos(t2), r2.*sin(t2)]; % points
figure;
plot(data1(:,1),data1(:,2),'g.','MarkerSize',15)
hold on
plot(data2(:,1),data2(:,2),'b.','MarkerSize',15)
ezpolar(@(x)1);ezpolar(@(x)2);
axis equal
hold off

%Put the data in one matrix and make a vector of class labels
data3 = [data1;data2];
theclass = ones(200,1);
theclass(1:100) = -1;

%Train the SVM Classifier
cl = fitcsvm(data3,theclass,'KernelFunction','rbf',...
    'BoxConstraint',Inf,'ClassNames',[-1,1]);

% Predict scores over the grid


d = 0.02;
[x1Grid,x2Grid] = meshgrid(min(data3(:,1)):d:max(data3(:,1)),...
min(data3(:,2)):d:max(data3(:,2)));
xGrid = [x1Grid(:),x2Grid(:)];
[~,scores] = predict(cl,xGrid);

% Plot the data and the decision boundary


figure;
h(1:2) = gscatter(data3(:,1),data3(:,2),theclass,'rb','.');
hold on
ezpolar(@(x)1);
h(3) =
plot(data3(cl.IsSupportVector,1),data3(cl.IsSupportVector,2),'ko');
contour(x1Grid,x2Grid,reshape(scores(:,2),size(x1Grid)),[0 0],'k');
legend(h,{'-1','+1','Support Vectors'});
axis equal


hold off

Output:

Conclusion:
In this experiment we were able to implement support vector
machine for simple feature using MATLAB.

PRACTICAL – 8

Aim: Implementing K -means clustering algorithm.

Theory:


K-Means Clustering is an unsupervised learning algorithm that is used to solve


the clustering problems in machine learning or data science.

In this algorithm, the unlabeled dataset is classified into different clusters. Here
K defines the number of pre-defined clusters that need to be created in the
process, as if K=2, there will be two clusters, and for K=3, there will be three
clusters, and so on.

The algorithm takes the unlabeled dataset as input, divides the dataset into k-
number of clusters, and repeats the process until it does not find the best
clusters. The value of k should be predetermined in this algorithm.

The k-means clustering algorithm mainly performs two tasks:

o Determines the best value for K center points or centroids by an iterative


process.
o Assigns each data point to its closest k-center. Those data points which
are near to the particular k-center, create a cluster.

Hence each cluster has datapoints with some commonalities, and it is away
from other clusters.

The below diagram explains the working of the K-means Clustering Algorithm:


The working of the K-Means algorithm is explained in the below steps:

Step-1: Select the number K to decide the number of clusters.

Step-2: Select random K points or centroids. (It can be other from the input
dataset).

Step-3: Assign each data point to their closest centroid, which will form the
predefined K clusters.

Step-4: Calculate the variance and place a new centroid of each cluster.

Step-5: Repeat the third step, which means reassign each datapoint to the new
closest centroid of each cluster.

Step-6: If any reassignment occurs, then go to step-4.

Step-7: The model is ready.


Let's understand the above steps by considering the visual plots:

 Suppose we have two variables M1 and M2. The x-y axis scatter plot of
these two variables is given below:

o Let's take number k of clusters, i.e., K=2, to identify the dataset and to
put them into different clusters. It means here we will try to group
these datasets into two different clusters.
o We need to choose some random k points or centroid to form the
cluster. These points can be either the points from the dataset or any
other point. So, here we are selecting the below two points as k points,
which are not the part of our dataset. Consider the below image:


 Now we will assign each data point of the scatter plot to its closest K-
point or centroid. We will compute it by applying some mathematics that
we have studied to calculate the distance between two points. So, we will
draw a median between both the centroids. Consider the below image:

 From the above image, it is clear that points left side of the line is near to
the K1 or blue centroid, and points to the right of the line are close to the
yellow centroid. Let's color them as blue and yellow for clear
visualization.


 As we need to find the closest cluster, so we will repeat the process by


choosing a new centroid. To choose the new centroids, we will compute
the center of gravity of these centroids, and will find new centroids as
below:

 Next, we will reassign each datapoint to the new centroid. For this, we
will repeat the same process of finding a median line. The median will be
like below image:


 From the above image, we can see, one yellow point is on the left side of
the line, and two blue points are right to the line. So, these three points
will be assigned to new centroids.

 As reassignment has taken place, so we will again go to the step-4, which


is finding new centroids or K-points.
 We will repeat the process by finding the center of gravity of centroids,
so the new centroids will be as shown in the below image:


 As we got the new centroids so again will draw the median line and
reassign the data points. So, the image will be:

 We can see in the above image; there are no dissimilar data points on
either side of the line, which means our model is formed. Consider the
below image:


 As our model is ready, so we can now remove the assumed centroids, and
the two final clusters will be as shown in the below image:

 The performance of the K-means clustering algorithm depends upon


highly efficient clusters that it forms. But choosing the optimal number of
clusters is a big task. There are some different ways to find the optimal
number of clusters, but here we are discussing the most appropriate
method to find the number of clusters or value of K. The method is given
below:

Elbow Method

 The Elbow method is one of the most popular ways to find the optimal
number of clusters. This method uses the concept of WCSS
value. WCSS stands for Within Cluster Sum of Squares, which defines
the total variations within a cluster. The formula to calculate the value of
WCSS (for 3 clusters) is given below:

WCSS = Σ(Pi in Cluster1) distance(Pi, C1)² + Σ(Pi in Cluster2) distance(Pi, C2)² + Σ(Pi in Cluster3) distance(Pi, C3)²

In the above formula of WCSS,

 Σ(Pi in Cluster1) distance(Pi, C1)²: it is the sum of the squares of the distances between each data point and its centroid within Cluster 1, and similarly for the other two terms.

 To measure the distance between data points and centroid, we can use
any method such as Euclidean distance or Manhattan distance.


 To find the optimal value of clusters, the elbow method follows the below
steps:

o It executes the K-means clustering on a given dataset for different K


values (ranges from 1-10).
o For each value of K, calculates the WCSS value.
o Plots a curve between calculated WCSS values and the number of
clusters K.
o The sharp point of bend or a point of the plot looks like an arm, then
that point is considered as the best value of K.

 Since the graph shows the sharp bend, which looks like an elbow, hence
it is known as the elbow method. The graph for the elbow method looks
like the below image:
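A small illustrative sketch of the elbow method in Python with scikit-learn (the data is synthetic); the WCSS value described above is available as the fitted model's inertia_ attribute:

import numpy as np
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Synthetic data: two well-separated blobs
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(100, 2) * 0.75 + [2, 2],
               rng.randn(100, 2) * 0.50 - [2, 2]])

wcss = []
for k in range(1, 11):                       # K values from 1 to 10
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    wcss.append(km.inertia_)                 # within-cluster sum of squares

plt.plot(range(1, 11), wcss, 'o-')
plt.xlabel('Number of clusters K')
plt.ylabel('WCSS')
plt.title('Elbow method')
plt.show()                                   # the sharp bend (the "elbow") suggests the best K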

Advantages of K-Means Clustering


 There’s a reason why top professionals prefer the K-Means clustering
algorithm. Some benefits it offers:
o It is a fast, robust, and easier to understand the algorithm.
o The end-efficiency is relatively high
o Offers phenomenal results when data sets are different from each
other. For higher variables values, K-Means works comparatively
quicker
o The clusters produced with K-Means are relatively tighter than
other clustering methods.


K-Means Algorithm Using MATLAB:


 K-Means is a largely used algorithm used by many professionals dealing
with data science, machine learning, artificial intelligence, cryptography,
and cyber security.
 The core objective of using this algorithm is to find out the centroid of
each cluster. The data given to a programmer is heterogeneous. Here is
the MATLAB code for plotting the centroid of each cluster and assign the
coordinates of each centroid:

MATLAB Code:
rng default; % For reproducibility
X = [randn(100,2)*0.75+ones(100,2);
     randn(100,2)*0.5-ones(100,2)];
opts = statset('Display','final');
[idx,C] = kmeans(X,4,'Distance','cityblock','Replicates',5,'Options',opts);
plot(X(idx==1,1),X(idx==1,2),'r.','MarkerSize',12);
hold on;
plot(X(idx==2,1),X(idx==2,2),'b.','MarkerSize',12);
plot(X(idx==3,1),X(idx==3,2),'g.','MarkerSize',12);
plot(X(idx==4,1),X(idx==4,2),'y.','MarkerSize',12);
plot(C(:,1),C(:,2),'kx','MarkerSize',15,'LineWidth',3);   % cluster centroids
legend('Cluster 1','Cluster 2','Cluster 3','Cluster 4','Centroids','Location','NW');
title('Cluster Assignments and Centroids');
hold off;
for i = 1:size(C,1)
    disp(['Centroid ', num2str(i), ': X1 = ', num2str(C(i,1)), '; X2 = ', num2str(C(i,2))]);
end

Output:


Results:
 The centroids obtained are as follows:
   o The value of X1 & X2 for Centroid 1: 1.3661; 1.7232
   o The value of X1 & X2 for Centroid 2: -1.015; -1.053
   o The value of X1 & X2 for Centroid 3: 1.6565; 0.36376
   o The value of X1 & X2 for Centroid 4: 0.35134; 0.85358

Conclusion:
In this experiment we were able to implement K-mean clustering
algorithm using MATLAB.


PRACTICAL – 9

Aim: Introduction to Python Programming for Machine Learning.

Problem Statement: Run the commands using Anaconda- Jupyter notebook.

Theory:

Important Libraries

1. NumPy : Numerical Python

 It is a useful component that makes Python one of the favourite languages for Data Science.
 It basically stands for Numerical Python and consists of
multidimensional array objects.
 By using NumPy, we can perform the following important operations −
 Mathematical and logical operations on arrays.
 Fourier transformation
 Operations associated with linear algebra.
We can also see NumPy as a replacement for MATLAB, because NumPy is mostly used along with SciPy (Scientific Python) and Matplotlib (plotting library).

2. Pandas

Pandas is an open-source Python Library used for high-performance data


manipulation and data analysis using its powerful data structures. 

With the help of Pandas, in data processing we can accomplish the following
five steps −

 Load
 Prepare
 Manipulate


 Model
 Analyze

Key Features of Pandas

 Fast and efficient DataFrame object with default and customized


indexing.
 Tools for loading data into in-memory data objects from different file
formats.
 Data alignment and integrated handling of missing data.
 Reshaping and pivoting of data sets.
 Label-based slicing, indexing and sub setting of large data sets.
 Columns from a data structure can be deleted or inserted.
 Group by data for aggregation and transformations.
 High performance merging and joining of data.
 Time Series functionality.

Pandas deals with the following data structures −

 Series
 DataFrame

3. Scipy: Scientific Python

The SciPy library of Python is built to work with NumPy arrays and provides
many user-friendly and efficient numerical practices such as routines for
numerical integration and optimization. Together, they run on all popular
operating systems, are quick to install and are free of charge. NumPy and SciPy
are easy to use, but powerful enough to depend on by some of the world's
leading scientists and engineers.

4. Scikit-learn

The following are some features of Scikit-learn that makes it so useful −


 It is built on NumPy, SciPy, and Matplotlib.
 It is open source.


 Wide range of machine learning algorithms covering major areas of


ML like classification, clustering, regression, dimensionality reduction,
model selection etc. can be implemented with the help of it.

5. Matplotlib
 Matplotlib is a python library used to create 2D graphs and plots by
using python scripts.
 It has a module named pyplot which makes things easy for plotting by
providing feature to control line styles, font properties, formatting axes
etc.
 It supports a very wide variety of graphs and plots namely - histogram,
bar charts, power spectra, error charts etc.
 It is used along with NumPy to provide an environment that is an
effective open source alternative for MatLab. It can also be used with
graphics toolkits like PyQt and wxPython.

Python Code:
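A minimal sketch of the kind of commands that could be run in the Jupyter notebook to exercise the libraries described above (the array and table values are arbitrary):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# NumPy: create an array and perform some mathematical operations
a = np.array([1, 2, 3, 4, 5])
print(a.mean(), a.sum(), a ** 2)

# Pandas: build a small DataFrame and inspect it
df = pd.DataFrame({'hours': [10, 15, 20, 25],
                   'output': [365, 387, 451, 499]})
print(df.describe())

# Matplotlib: plot one column against another
plt.plot(df['hours'], df['output'], 'bo-')
plt.xlabel('hours')
plt.ylabel('output')
plt.show()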


Output :

Conclusion:


PRACTICAL – 10

Aim: Write a program for the concept of decision tree to develop a piecewise
linear model and test it as well.

Problem Statement: Generate a synthetic data set using the following function, and split it into training, validation, and testing sample points. Write a program for the concept of decision tree to develop a piecewise linear model and test it as well.

y = x/2 + sin(x) + ε,  where ε is random noise

Steps:
1. Import libraries
2. Prepare data
3. Split the data into training, validation and test sets
4. Fit model 
5. Evaluate the model.

Python Program:
1. Import libraries

import numpy as np
from sklearn import linear_model, datasets, tree
import matplotlib.pyplot as plt
%matplotlib inline

2. Prepare data:

number_of_samples = 100 
x = np.linspace(-np.pi, np.pi, number_of_samples) 
y = 0.5*x+np.sin(x)+np.random.random(x.shape) 
plt.scatter(x,y,color='black') #Plot y-vs-x in dots 
plt.xlabel('x-input feature') 
plt.ylabel('y-target values') 
plt.title('Fig 5: Data for linear regression') 
plt.show() 


3. Split the data into training, validation and test sets


 
random_indices = np.random.permutation(number_of_samples) 
#Training set 
x_train = x[random_indices[:70]] 
y_train = y[random_indices[:70]] 
#Validation set 
x_val = x[random_indices[70:85]] 
y_val = y[random_indices[70:85]] 
#Test set 
x_test = x[random_indices[85:]] 
y_test = y[random_indices[85:]] 

4. Fit the decision tree model to the data


 
maximum_depth_of_tree = np.arange(10)+1 
train_err_arr = [] 
val_err_arr = [] 
test_err_arr = [] 
 
for depth in maximum_depth_of_tree: 
     
    model = tree.DecisionTreeRegressor(max_depth=depth) 
    #sklearn takes the inputs as matrices. Hence, we reshape the arrays into column matrices 
    x_train_for_line_fitting = np.matrix(x_train.reshape(len(x_train),1)) 
    y_train_for_line_fitting = np.matrix(y_train.reshape(len(y_train),1)) 
 
    #Fit the line to the training data 
    model.fit(x_train_for_line_fitting, y_train_for_line_fitting) 
 
    #Plot the line 
    plt.figure() 
    plt.scatter(x_train, y_train, color='black') 
    plt.plot(x.reshape((len(x),1)),model.predict(x.reshape((len(x),1))),color='blue') 
    plt.xlabel('x-input feature') 
    plt.ylabel('y-target values') 
    plt.title('Line fit to training data with max_depth='+str(depth)) 
    plt.show() 
5. Evaluate the model.

    mean_train_error = np.mean( (y_train - model.predict(x_train.reshape(len(x_train),1)))**2 ) 


    mean_val_error = np.mean( (y_val - model.predict(x_val.reshape(len(x_val),1)))**2 ) 
    mean_test_error = np.mean( (y_test - model.predict(x_test.reshape(len(x_test),1)))**2 ) 
     
    train_err_arr.append(mean_train_error) 
    val_err_arr.append(mean_val_error) 
    test_err_arr.append(mean_test_error) 
 
    print ('Training MSE: ', mean_train_error, '\nValidation MSE: ', mean_val_error, '\nTest MSE:
', mean_test_error) 
     
plt.figure() 
plt.plot(train_err_arr,c='red') 
plt.plot(val_err_arr,c='blue') 
plt.plot(test_err_arr,c='green') 
plt.legend(['Training error', 'Validation error', 'Test error']) 
plt.title('Variation of error with maximum depth of tree') 
plt.show() 

Output :

Conclusion :


PRACTICAL – 11

Aim: Write a program for KNN algorithm for classification of IRIS dataset

Problem Statement: Write a program for kNN algorithm for classification of


IRIS dataset. 

Python Program:
1. Import Libraries

from __future__ import print_function 
 
import numpy as np 
from sklearn import datasets, neighbors, linear_model, tree 
from sklearn.decomposition import PCA 
from sklearn.neighbors import KNeighborsClassifier 
from sklearn.datasets import load_iris, fetch_olivetti_faces 
from sklearn.model_selection import train_test_split 
from sklearn.decomposition import PCA as RandomizedPCA 
from sklearn.metrics import classification_report 
from sklearn.metrics import confusion_matrix 
import matplotlib.pyplot as plt 
from time import time 
%matplotlib inline 

2. Prepare dataset 

First, we will prepare the dataset. The dataset we choose is a modified version
of the Iris dataset. We choose only the first two input feature dimensions
viz sepal-length and sepal-width (both in cm) for ease of visualization. 
iris = load_iris() 
X = iris.data[:,:2] #Choosing only the first two input-features 
Y = iris.target 
 
number_of_samples = len(Y) 
 
print(number_of_samples) 
 
#Splitting into training and test sets 
random_indices = np.random.permutation(number_of_samples) 
#Training set 
num_training_samples = int(number_of_samples*0.75) 
x_train = X[random_indices[:num_training_samples]] 
y_train = Y[random_indices[:num_training_samples]] 
 
#Test set 


x_test = X[random_indices[num_training_samples:]] 
y_test = Y[random_indices[num_training_samples:]] 
 
#Visualizing the training data 
X_class0 = np.asmatrix([x_train[i] for i in range(len(x_train)) if y_
train[i]==0]) #Picking only the first two classes 
Y_class0 = np.zeros((X_class0.shape[0]),dtype=np.int) 
X_class1 = np.asmatrix([x_train[i] for i in range(len(x_train)) if y_
train[i]==1]) 
Y_class1 = np.ones((X_class1.shape[0]),dtype=np.int) 
X_class2 = np.asmatrix([x_train[i] for i in range(len(x_train)) if y_
train[i]==2]) 
Y_class2 = np.full((X_class2.shape[0]),fill_value=2,dtype=np.int) 
 
plt.scatter([X_class0[:,0]],[ X_class0[:,1]],color='red') 
plt.scatter([X_class1[:,0]],[ X_class1[:,1]],color='blue') 
plt.scatter([X_class2[:,0]], [X_class2[:,1]],color='green') 
plt.xlabel('sepal length') 
plt.ylabel('sepal width') 
plt.legend(['class 0','class 1','class 2']) 
plt.title('Fig 1: Visualization of training data') 
plt.show() 

Note that the first class is linearly separable from the other two classes but the
second and third classes are not linearly separable from each other. 

3. K-nearest neighbour classifier algorithm 

Now that our training data is ready, we will jump right into the classification
task. Just to remind you, the K-nearest neighbor is a non-parametric learning
algorithm and does not learn a parameterized function that maps the input
to the output. Rather it looks up the training set every time it is asked to
classify a point and finds out the K nearest neighbors of the query point. The
class corresponding to majority of the points is output as the class of the
query point. 

model = neighbors.KNeighborsClassifier(n_neighbors = 10) # K = 10 
model.fit(x_train, y_train) 

4. Visualize the working of the algorithm 

Let's see how the algorithm works. We choose the first point in the test set as
our query point. 

query_point = np.array([5.9,2.9]) 
true_class_of_query_point = 1 
predicted_class_for_query_point = model.predict([query_point]) 
print("Query point: {}".format(query_point)) 
print("True class of query point: {}".format(true_class_of_query_point)) 
query_point.shape 

Let's visualize the point and its K = 10 nearest neighbors. 

neighbors_object = neighbors.NearestNeighbors(n_neighbors=10) 
neighbors_object.fit(x_train) 
distances_of_nearest_neighbors, indices_of_nearest_neighbors_of_query_point 
= neighbors_object.kneighbors([query_point]) 
nearest_neighbors_of_query_point = x_train[indices_of_nearest_neighbors_of_
query_point[0]] 
print("The query point is: {}\n".format(query_point)) 
print("The nearest neighbors of the query point are:\n {}\
n".format(nearest_neighbors_of_query_point)) 
print("The classes of the nearest neighbors are: {}\
n".format(y_train[indices_of_nearest_neighbors_of_query_point[0]])) 
print("Predicted class for query point:
{}".format(predicted_class_for_query_point[0])) 
 
plt.scatter([X_class0[:,0]], [X_class0[:,1]],color='red') 
plt.scatter([X_class1[:,0]], [X_class1[:,1]],color='blue') 
plt.scatter([X_class2[:,0]], [X_class2[:,1]],color='green') 
plt.scatter(query_point[0], query_point[1],marker='^',s=75,color='black') 
plt.scatter(nearest_neighbors_of_query_point[:,0], nearest_neighbors_of_que
ry_point[:,1],marker='s',s=150,color='yellow',alpha=0.30) 
plt.xlabel('sepal length') 
plt.ylabel('sepal width') 
plt.legend(['class 0','class 1','class 2']) 
plt.title('Fig 3: Working of the K-NN classification algorithm') 
plt.show() 

def evaluate_performance(model, x_test, y_test): 
    test_set_predictions = [model.predict(x_test[i].reshape((1,len(x_test[i
]))))[0] for i in range(x_test.shape[0])] 
    test_misclassification_percentage = 0 
    for i in range(len(test_set_predictions)): 
        if test_set_predictions[i]!=y_test[i]: 
            test_misclassification_percentage+=1 
    test_misclassification_percentage *= 100/len(y_test) 
    return test_misclassification_percentage 

5. Evaluate the performances on the validation and test sets 

print("Evaluating K-NN classifier:") 


test_err = evaluate_performance(model, x_test, y_test) 
print('test misclassification percentage = {}%'.format(test_err)) 


Output:

Conclusion:


Practical – 12

Name: Deepankar Patnaik


Enrollment: 190280117057

Aim: Write a program using Bayes algorithm for email classification (spam or
non-spam) for the open-sourced data set from the UC Irvine Machine Learning
Repository.

Problem Statement:
Write a program using Bayes algorithm for email classification (spam or non-
spam) for the open sourced data set from the UC Irvine Machine Learning
Repository

Python Program:
import numpy as np
from sklearn.model_selection import train_test_split

datafile = open('C:/Users/AntennaPC/Desktop/spambase.data','r')

# Download spambase.data from the MS Team of this course, save it, and
# give the file path from your pc

data = []
for line in datafile:
line = [float(element) for element in line.rstrip('\n').split(',')]
data.append(np.asarray(line))

num_features = 48
X = [data[i][:num_features] for i in range(len(data))]
y = [int(data[i][-1]) for i in range(len(data))]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,


random_state=42)

#Making likelihood estimations

#Find the two classes

X_train_class_0 = [X_train[i] for i in range(len(X_train)) if


y_train[i]==0]


X_train_class_1 = [X_train[i] for i in range(len(X_train)) if


y_train[i]==1]

#Find the class specific likelihoods of each feature

likelihoods_class_0 = np.mean(X_train_class_0, axis=0)/100.0


likelihoods_class_1 = np.mean(X_train_class_1, axis=0)/100.0

#Calculate the class priors

num_class_0 = float(len(X_train_class_0))
num_class_1 = float(len(X_train_class_1))

prior_probability_class_0 = num_class_0 / (num_class_0 + num_class_1)


prior_probability_class_1 = num_class_1 / (num_class_0 + num_class_1)

log_prior_class_0 = np.log10(prior_probability_class_0)
log_prior_class_1 = np.log10(prior_probability_class_1)

def calculate_log_likelihoods_with_naive_bayes(feature_vector, Class):


assert len(feature_vector) == num_features
log_likelihood = 0.0 #using log-likelihood to avoid underflow
if Class==0:
for feature_index in range(len(feature_vector)):
if feature_vector[feature_index] == 1: #feature present
log_likelihood +=
np.log10(likelihoods_class_0[feature_index])
elif feature_vector[feature_index] == 0: #feature absent
log_likelihood += np.log10(1.0 -
likelihoods_class_0[feature_index])
elif Class==1:
for feature_index in range(len(feature_vector)):
if feature_vector[feature_index] == 1: #feature present
log_likelihood +=
np.log10(likelihoods_class_1[feature_index])
elif feature_vector[feature_index] == 0: #feature absent
log_likelihood += np.log10(1.0 -
likelihoods_class_1[feature_index])
else:
raise ValueError("Class takes integer values 0 or 1")

return log_likelihood

def calculate_class_posteriors(feature_vector):
log_likelihood_class_0 =
calculate_log_likelihoods_with_naive_bayes(feature_vector, Class=0)
log_likelihood_class_1 =
calculate_log_likelihoods_with_naive_bayes(feature_vector, Class=1)

log_posterior_class_0 = log_likelihood_class_0 + log_prior_class_0


log_posterior_class_1 = log_likelihood_class_1 + log_prior_class_1


return log_posterior_class_0, log_posterior_class_1

def classify_spam(document_vector):
feature_vector = [int(element>0.0) for element in document_vector]
log_posterior_class_0, log_posterior_class_1 =
calculate_class_posteriors(feature_vector)
if log_posterior_class_0 > log_posterior_class_1:
return 0
else:
return 1

#Predict spam or not on the test set

predictions = []
for email in X_test:
predictions.append(classify_spam(email))

def evaluate_performance(predictions, ground_truth_labels):


correct_count = 0.0
for item_index in range(len(predictions)):
if predictions[item_index] == ground_truth_labels[item_index]:
correct_count += 1.0
accuracy = correct_count/len(predictions)
return accuracy
accuracy_of_naive_bayes = evaluate_performance(predictions, y_test)
print(accuracy_of_naive_bayes)

for i in range(100):
    print(predictions[i], y_test[i])

Output :

Conclusion :


Practical – 13

Aim: Write a program using SVM on IRIS dataset and carry out classification.

Problem Statement: Write a program using SVM on IRIS dataset and carry out
classification.

Program:
1. Import Libraries

from __future__ import division, print_function


import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
# from sklearn.cross_validation import train_test_split
import matplotlib.pyplot as plt
%matplotlib inline

2. Prepare dataset 
iris = datasets.load_iris()
X = iris.data[:,:2]
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y,


test_size=0.25, random_state=42)

3. Use Support Vector Machine with different kinds of kernels and


evaluate performance
def evaluate_on_test_data(model=None):
predictions = model.predict(X_test)
correct_classifications = 0
for i in range(len(y_test)):
if predictions[i] == y_test[i]:
correct_classifications += 1
accuracy = 100*correct_classifications/len(y_test) #Accuracy as
a percentage
return accuracy

kernels = ('linear','poly','rbf')
accuracies = []
for index, kernel in enumerate(kernels):
model = svm.SVC(kernel=kernel)
model.fit(X_train, y_train)
acc = evaluate_on_test_data(model)
accuracies.append(acc)


print("{} % accuracy obtained with kernel = {}".format(acc,


kernel))

4. Visualize the Visualize the decision boundaries


#Train SVMs with different kernels
svc = svm.SVC(kernel='linear').fit(X_train, y_train)
rbf_svc = svm.SVC(kernel='rbf', gamma=0.7).fit(X_train, y_train)
poly_svc = svm.SVC(kernel='poly', degree=3).fit(X_train, y_train)

#Create a mesh to plot in


h = .02 # step size in the mesh
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
np.arange(y_min, y_max, h))

#Define title for the plots


titles = ['SVC with linear kernel',
'SVC with RBF kernel',
'SVC with polynomial (degree 3) kernel']

for i, clf in enumerate((svc, rbf_svc, poly_svc)):


# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
plt.figure(i)

Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot


Z = Z.reshape(xx.shape)
plt.contourf(xx, yy, Z, cmap=plt.cm.Paired, alpha=0.8)

# Plot also the training points


plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.ocean)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')
plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xticks(())
plt.yticks(())
plt.title(titles[i])

plt.show()

5. Check the support vectors

#Checking the support vectors of the polynomial kernel (for example)

print("The support vectors are:\n", poly_svc.support_vectors_)


Output :

Conclusion :


Practical – 14

Name: Deepankar Patnaik


Enrollment: 190280117057

Aim: Write a program using SVM algorithm for Boston house price prediction
dataset to predict price of houses from certain features.

Problem Statement: Write a program using SVM algorithm for Boston house
price prediction dataset to predict price of houses from certain features.

Program:
1. Import Libraries

from __future__ import division, print_function


import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
# from sklearn.cross_validation import train_test_split
import matplotlib.pyplot as plt
%matplotlib inline

2. Load data from the Boston dataset


boston = datasets.load_boston()
X = boston.data
y = boston.target

X_train, X_test, y_train, y_test = train_test_split(X, y,


test_size=0.25, random_state=42)

3. Use Support Vector Machine with different kinds of kernels and


evaluate performance.
def evaluate_on_test_data(model=None):
predictions = model.predict(X_test)
sum_of_squared_error = 0
for i in range(len(y_test)):
err = (predictions[i]-y_test[i]) **2
sum_of_squared_error += err
mean_squared_error = sum_of_squared_error/len(y_test)
RMSE = np.sqrt(mean_squared_error)
return RMSE


kernels = ('linear','rbf')
RMSE_vec = []
for index, kernel in enumerate(kernels):
model = svm.SVR(kernel=kernel)
model.fit(X_train, y_train)
RMSE = evaluate_on_test_data(model)
RMSE_vec.append(RMSE)
print("RMSE={} obtained with kernel = {}".format(RMSE, kernel))

Output :

Conclusion:
