
Lokesh T00691325

The document contains a series of tasks related to machine learning, including predicting class labels using a Simple Bayesian Classifier, discussing the Leave-One-Out method for validation, explaining overfitting in inductive inference, calculating linear regression parameters, and solving the XOR problem using Support Vector Machines with a specific kernel. Each task provides detailed calculations and explanations, demonstrating various concepts in data mining and machine learning. The document serves as an examination paper for a machine learning course.


NAME: Lokesh Reddy Syamala

ML EXAM I Date: 10/12/2021

(For tasks 1, 4 and 5, explain/calculate every step and do not use libraries)

Task 1: (20 points) For the training set given below, predict the classification of the
following sample X = {2,1,1, Class =?}
using Simple Bayesian Classifier

Sample   A1   A2   A3   C
  1       1    2    1   1
  2       0    0    1   1
  3       2    1    2   2
  4       1    2    1   2
  5       0    1    2   1
  6       2    2    2   2
  7       1    0    1   1

# Training set in matrix form (rows are samples; columns are A1, A2, A3)
Sample = [[1,2,1],[0,0,1],[2,1,2],[1,2,1],[0,1,2],[2,2,2],[1,0,1]]
# Class labels C
Class = [1,1,2,2,1,2,1]
# N: number of samples
N = len(Sample)
# n1: number of samples with class 1
n1 = Class.count(1)
# p1: prior probability of class 1
p1 = n1/N
# n2: number of samples with class 2
n2 = Class.count(2)
# p2: prior probability of class 2
p2 = n2/N
# All the counts below are initialised to 1 to avoid zero probabilities
# Number of A1 attributes being 2 when class is 1
count1_A1_2 = 1
# Number of A2 attributes being 1 when class is 1
count1_A2_1 = 1
# Number of A3 attributes being 1 when class is 1
count1_A3_1 = 1
# Number of A1 attributes being 2 when class is 2
count2_A1_2 = 1
# Number of A2 attributes being 1 when class is 2
count2_A2_1 = 1
# Number of A3 attributes being 1 when class is 2
count2_A3_1 = 1
# Iterating through all the samples and updating the counts accordingly
for i in range(N):
    if Sample[i][0] == 2:
        if Class[i] == 1:
            count1_A1_2 += 1
        else:
            count2_A1_2 += 1
    if Sample[i][1] == 1:
        if Class[i] == 1:
            count1_A2_1 += 1
        else:
            count2_A2_1 += 1
    if Sample[i][2] == 1:
        if Class[i] == 1:
            count1_A3_1 += 1
        else:
            count2_A3_1 += 1

# Finding the probability of the class being 1 when Sample = {2,1,1}
result1 = (count1_A1_2/n1)*(count1_A2_1/n1)*(count1_A3_1/n1)*p1
# Finding the probability of the class being 2 when Sample = {2,1,1}
result2 = (count2_A1_2/n2)*(count2_A2_1/n2)*(count2_A3_1/n2)*p2
# Comparing the calculated probabilities and outputting the class
# corresponding to the larger probability
if result1 > result2:
    print("Predicted class is 1")
else:
    print("Predicted class is 2")

Output:
Predicted class is 2
Task 2: (20 points) In which situations would you recommend the Leave-One-Out method for validation of data mining results?

The leave-one-out approach is essentially n-fold cross-validation, where n is the number of instances in the dataset. It is recommended in two situations: when other validation schemes do not give reliable results, and when we want to use as much of the data as possible for training. Each instance is held out one at a time, and the learning scheme is trained on the remaining instances.
The model is assessed by its accuracy on the held-out instance, scored 1 or 0 for success or failure. The final error estimate is the average of all n evaluations, one for each member of the dataset.
The approach has two advantages. First, the largest amount of data feasible is used for training in each iteration, which presumably enhances the likelihood that the classifier will be correct. Second, the procedure is deterministic, with no random sampling, so it is pointless to repeat it: the result will be the same each time.
Set against this is the high computing cost, as the entire learning procedure must be repeated n times, which is typically impractical for large datasets. Nonetheless, leave-one-out offers the best chance of getting the most out of a small dataset and obtaining the most accurate estimate possible.
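The procedure described above can be sketched in a few lines of plain Python. This is a minimal illustration on an assumed toy dataset, using a simple 1-nearest-neighbour rule as the learning scheme so that no libraries are needed; any classifier could take its place.

```python
# Leave-one-out cross-validation sketch: hold out each instance in turn,
# train on the rest, score 1 or 0, and average the n results.

def nearest_neighbour_predict(train_X, train_y, x):
    """Predict the class of x as the class of its closest training point."""
    best_i = min(range(len(train_X)),
                 key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best_i]

def leave_one_out_error(X, y):
    """Average 0/1 error over all n held-out instances."""
    n = len(X)
    errors = 0
    for i in range(n):
        train_X = X[:i] + X[i+1:]   # all instances except the i-th
        train_y = y[:i] + y[i+1:]
        if nearest_neighbour_predict(train_X, train_y, X[i]) != y[i]:
            errors += 1
    return errors / n

# Assumed toy dataset, for illustration only
X = [[1, 2], [0, 0], [2, 1], [1, 2], [0, 1], [2, 2], [1, 0]]
y = [1, 1, 2, 2, 1, 2, 1]
print(leave_one_out_error(X, y))
```

Note that the loop runs the full training procedure n times, which is exactly the computational cost objection raised above.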

Task 3: (20 points) What is meant by the term overfitting in the context of inductive
inference? Give example(s) and solution(s)

The process of reaching a general conclusion from specific examples is known as inductive inference. Overfitting occurs when a model learns the detail and noise in the training data to the point where this degrades the model's performance on fresh data. In other words, the model picks up noise and random fluctuations in the training data and learns them as if they were concepts.
For example, assume a model is trained on a dataset of 1000 skills that customers will pick, together with the outcomes based on the information they supply. On the data it was trained with, the model has a 99% accuracy rate, but when it is run on a completely new dataset, only 50% accuracy is observed. The model is unable to generalize beyond its training data; this is a case of overfitting.
To avoid this issue, it is usually preferable to cross-validate. In cross-validation, the original training data is used to produce multiple splits, and the model is then tuned using these splits. In other circumstances, more data and cleaner samples can also be used.
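The train/test accuracy gap described above can be made concrete with a small sketch (a toy setup assumed for illustration, not the 1000-skill example itself): a "model" that simply memorizes its training pairs scores perfectly on the data it has seen but no better than guessing on fresh data whose labels are pure noise.

```python
import random

random.seed(0)
# Labels are random noise, so there is no real concept to learn.
train = [(i, random.choice([0, 1])) for i in range(20)]
test = [(i, random.choice([0, 1])) for i in range(20, 40)]

memory = dict(train)   # the "model": a lookup table of the training data

def predict(x):
    # Recall the memorized label; guess 0 for inputs never seen before.
    return memory.get(x, 0)

train_acc = sum(predict(x) == y for x, y in train) / len(train)
test_acc = sum(predict(x) == y for x, y in test) / len(test)
print(train_acc, test_acc)  # training accuracy is 1.0; test accuracy near chance
```

The perfect training score is exactly the symptom of overfitting: the model has learned the noise itself, which cannot generalize.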
Task 4 : ( 20 points) Given the data set with two dimensions X and Y:

X Y

1 4
4 2
3 3
5 2

Use a linear regression method to calculate the parameters α and β where y = α + βx.


(calculate every step and not using libraries)

X − Mx    Y − My    (X − Mx)²    (X − Mx)(Y − My)
 -2.25      1.25      5.0625        -2.8125
  0.75     -0.75      0.5625        -0.5625
 -0.25      0.25      0.0625        -0.0625
  1.75     -0.75      3.0625        -1.3125

                      SS = 8.75     SP = -4.75

Σx = 13; Σy = 11; Mean of X (Mx) = 3.25; Mean of Y (My) = 2.75; SS = 8.75; SP = -4.75

β = SP / SS = -4.75 / 8.75 = -0.54
α = My − β·Mx = 2.75 − (−0.54 × 3.25) = 4.51
y = 4.51 − 0.54x
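The arithmetic above can be checked step by step with built-ins only, following the same SS/SP formulation:

```python
# Verify the hand calculation: means, SS, SP, then slope and intercept.
X = [1, 4, 3, 5]
Y = [4, 2, 3, 2]
n = len(X)
Mx = sum(X) / n        # mean of X = 3.25
My = sum(Y) / n        # mean of Y = 2.75
SS = sum((x - Mx) ** 2 for x in X)                    # sum of squares = 8.75
SP = sum((x - Mx) * (y - My) for x, y in zip(X, Y))   # sum of products = -4.75
beta = SP / SS                 # slope, about -0.54
alpha = My - beta * Mx         # intercept, about 4.51
print(round(alpha, 2), round(beta, 2))  # 4.51 -0.54
```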
Task 5: Support Vector Machines (SVM). The Mercer kernel used to solve the XOR
problem is given by K(xi, xj) = (1 + xiᵀxj)^p. What is the smallest positive integer p for which
the XOR problem is solved? Show the kernel and XOR problem solution using SVM (20
points)

Let the two-dimensional vectors be xi = [xi1, xi2] and xj = [xj1, xj2].

Here, K(xi, xj) = (1 + xiᵀxj)^p.
With p = 1 the kernel is linear, and a linear decision boundary cannot separate XOR, so try p = 2.
Let the kernel K(xi, xj) = (1 + xiᵀxj)²
We should show K(xi, xj) = Φ(xi)ᵀΦ(xj).
Therefore,
K(xi, xj) = (1 + xi1·xj1 + xi2·xj2)²

= 1 + xi1²xj1² + 2·xi1xj1·xi2xj2 + xi2²xj2² + 2·xi1xj1 + 2·xi2xj2

= Φ(xi)ᵀΦ(xj), where Φ(x) = [1, x1², √2·x1x2, x2², √2·x1, √2·x2]ᵀ

In this six-dimensional feature space the XOR points become linearly separable. Hence, the smallest positive integer p that solves the XOR problem is 2.
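The expansion above can be verified numerically: the explicit feature map Φ reproduces the kernel for every pair of XOR points, and the √2·x1·x2 coordinate alone already separates the two classes. This sketch assumes the common ±1 encoding of the XOR inputs.

```python
import math

def phi(x):
    """Explicit feature map for the p = 2 polynomial kernel."""
    x1, x2 = x
    return [1, x1**2, math.sqrt(2)*x1*x2, x2**2,
            math.sqrt(2)*x1, math.sqrt(2)*x2]

def kernel(xi, xj, p=2):
    """K(xi, xj) = (1 + xi.xj)^p for two-dimensional inputs."""
    return (1 + xi[0]*xj[0] + xi[1]*xj[1]) ** p

points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
labels = [-1, 1, 1, -1]   # XOR with +/-1 encoding (assumed)

# Phi(xi).Phi(xj) equals the kernel value for every pair of XOR points
for a in points:
    for b in points:
        dot = sum(u*v for u, v in zip(phi(a), phi(b)))
        assert abs(dot - kernel(a, b)) < 1e-9

# The sqrt(2)*x1*x2 coordinate takes opposite signs on the two XOR classes,
# so a linear boundary in feature space separates them.
for x, y in zip(points, labels):
    print(x, y, phi(x)[2])
```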
