
ARTIFICIAL INTELLIGENCE (ADVANCED)

A Course under the Centre of Excellence, an Initiative of the Department of Science and
Technology, Government of Bihar

GOVERNMENT POLYTECHNIC SAHARSA

Presenter:
Prof. Shubham
HoD (Computer Science and Engineering)
Today's Class
➢Precision and Recall
➢ROC Curve
➢Naive Bayes
➢Support Vector Machines
➢K-Nearest Neighbor
Precision and Recall
Precision is the ratio of true positives to all predicted positives. For
our problem statement, that would be the proportion of patients who actually
have heart disease among all the patients we flag as having it.
Mathematically:

Precision = TP / (TP + FP)

Recall, also known as the true positive rate (TPR), is the percentage of data
samples that a machine learning model correctly identifies as belonging to a
class of interest (the "positive class") out of the total samples for that class.
Mathematically:

Recall = TP / (TP + FN)
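A minimal sketch of computing both metrics with scikit-learn; the labels and predictions below are invented for illustration:

from sklearn.metrics import precision_score, recall_score

# Hypothetical ground truth and model predictions (1 = heart disease).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# Precision = TP / (TP + FP); Recall = TP / (TP + FN).
print("Precision:", precision_score(y_true, y_pred))  # 4 TP, 1 FP -> 0.8
print("Recall:", recall_score(y_true, y_pred))        # 4 TP, 1 FN -> 0.8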
ROC Curve
The Receiver Operating Characteristic (ROC) curve is an evaluation metric
for binary classification problems. It is a probability curve that plots
the TPR against the false positive rate (FPR) at various threshold values,
essentially separating the 'signal' from the 'noise.'
In other words, it shows the performance of a classification model at all
classification thresholds. The Area Under the Curve (AUC) is the measure
of the ability of a binary classifier to distinguish between classes and is
used as a summary of the ROC curve.
The higher the AUC, the better the model’s performance at distinguishing
between the positive and negative classes.
When AUC = 1, the classifier can correctly distinguish all the positive class points from the negative ones.
If, however, the AUC were 0, the classifier would predict all negatives as positives and all positives as
negatives.
When 0.5 < AUC < 1, there is a high chance that the classifier can distinguish the positive class values
from the negative ones, because it detects more true positives and true negatives than false positives
and false negatives.
When AUC = 0.5, the classifier cannot distinguish between positive and negative class points, meaning
it predicts either a random class or a constant class for all the data points.
So, the higher the AUC value for a classifier, the better its ability to distinguish between the positive
and negative classes.
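As a sketch, the curve and its AUC can be computed from predicted probabilities with scikit-learn; the scores below are invented for illustration:

from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical true labels and predicted probabilities for the positive class.
y_true = [0, 0, 1, 1, 0, 1, 0, 1]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.5, 0.7]

# TPR and FPR at each threshold induced by the scores.
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print("FPR:", fpr)
print("TPR:", tpr)

# AUC summarizes the whole curve in one number (1.0 = perfect, 0.5 = random).
print("AUC:", roc_auc_score(y_true, y_scores))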
Naive Bayes Classification
It is a classification technique based on Bayes’ Theorem with an independence assumption among
predictors.
In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class
is unrelated to the presence of any other feature.
The Naïve Bayes classifier is a popular supervised machine learning algorithm used for
classification tasks such as text classification.
•The Naïve Bayes classifier is one of the simplest and most effective classification algorithms; it helps in
building fast machine learning models that can make quick predictions.
•It is a probabilistic classifier, which means it predicts on the basis of the probability of an object.
•Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment analysis, and
classifying articles.
The Naïve Bayes algorithm combines two words, Naïve and Bayes, which can be described as:
•Naïve: It is called naïve because it assumes that the occurrence of a certain feature is independent of the
occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste, then a
red, spherical, and sweet fruit is recognized as an apple. Hence each feature individually contributes to
identifying it as an apple, without depending on the others.
•Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.
•Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is
used to determine the probability of a hypothesis with prior knowledge.
It depends on the conditional probability.
•The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) · P(A) / P(B)

where P(A|B) is the posterior probability of hypothesis A given evidence B, P(B|A) is the
likelihood of the evidence given the hypothesis, P(A) is the prior probability of the
hypothesis, and P(B) is the probability of the evidence.
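A quick worked computation of the rule in Python; the numbers are invented for illustration:

# Hypothetical setup: 1% of patients have a disease (prior P(A)), the test
# detects it 95% of the time (likelihood P(B|A)), and it also fires on 5%
# of healthy patients (P(B|not A)).
p_a = 0.01
p_b_given_a = 0.95
p_b_given_not_a = 0.05

# Total probability of a positive test, P(B).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).
p_a_given_b = p_b_given_a * p_a / p_b
print(f"P(A|B) = {p_a_given_b:.3f}")  # about 0.161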
Where is Naive Bayes Used?
Face Recognition
As a classifier, it is used to identify faces or facial features, such as the nose, mouth,
eyes, etc.
Weather Prediction
It can be used to predict if the weather will be good or bad.
Medical Diagnosis
Doctors can diagnose patients by using the information that the classifier provides.
Healthcare professionals can use Naive Bayes to indicate if a patient is at high risk for
certain diseases and conditions, such as heart disease, cancer, and other ailments.
News Classification
With the help of a Naive Bayes classifier, Google News can recognize whether an article is
political news, world news, and so on.
Naïve Bayes Example
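As a minimal illustration of the spam-filtering use case mentioned above (the toy messages and labels are invented), a Naive Bayes text classifier can be sketched with scikit-learn:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy training data: invented messages labeled spam (1) or not spam (0).
messages = [
    "win a free prize now",
    "limited offer click here",
    "meeting rescheduled to monday",
    "lunch tomorrow with the team",
]
labels = [1, 1, 0, 0]

# Turn each message into word counts; MultinomialNB models these counts
# per class under the naive independence assumption.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(messages)

model = MultinomialNB()
model.fit(X, labels)

# Classify a new message.
test = vectorizer.transform(["free prize offer"])
print(model.predict(test))  # expected: [1] (spam)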
Advantages of Naive Bayes Classifier
The following are some of the benefits of the Naive Bayes classifier:
•It is simple and easy to implement
•It doesn't require as much training data as many other algorithms
•It handles both continuous and discrete data
•It is highly scalable with the number of predictors and data points
•It is fast and can be used to make real-time predictions
•It is not sensitive to irrelevant features
KNN (K-Nearest Neighbor)
The K-Nearest Neighbor (KNN) algorithm is a popular machine learning technique
used for classification and regression tasks.
It relies on the idea that similar data points tend to have similar labels or values.
The K-NN algorithm assumes similarity between the new case and the available cases and
puts the new case into the category that is most similar to the available categories.
•K-NN can be used for regression as well as classification, but it is mostly used
for classification problems.
•K-NN is a non-parametric algorithm, which means it makes no assumptions about the
underlying data.
•It is also called a lazy learner algorithm because it does not learn from the training set
immediately; instead, it stores the dataset and, at the time of classification, performs the
computation on it.
How does K-NN work?
The working of K-NN can be explained with the algorithm below:
•Step-1: Select the number K of neighbors.
•Step-2: Calculate the Euclidean distance from the new point to the training points.
•Step-3: Take the K nearest neighbors as per the calculated Euclidean distance.
•Step-4: Among these K neighbors, count the number of data points in each category.
•Step-5: Assign the new data point to the category for which the number of
neighbors is maximum.
•Step-6: Our model is ready.
•There is no particular way to determine the best value for K,
so we need to try several values and pick the one that performs
best. A commonly preferred starting value for K is 5.
•A very low value of K, such as K=1 or K=2, can be noisy and
makes the model sensitive to outliers.
•Larger values of K reduce noise, but they can smooth over genuine
class boundaries, as shown in the sketch below.
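A minimal sketch of this search over K with scikit-learn, using a synthetic dataset in place of real data:

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class dataset standing in for real data.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Try a few candidate values of K and compare cross-validated accuracy.
for k in (1, 3, 5, 7, 9):
    knn = KNeighborsClassifier(n_neighbors=k)
    score = cross_val_score(knn, X, y, cv=5).mean()
    print(f"K={k}: mean accuracy={score:.3f}")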

Advantages of KNN Algorithm:


•It is simple to implement.
•It is robust to noisy training data
•It can be more effective if the training data is large.
Disadvantages of KNN Algorithm:
•The value of K always needs to be chosen, which can be
tricky at times.
•The computation cost is high because the distance from the new
point to every training sample must be calculated.
Support Vector Machine
Support Vector Machine (SVM) is a supervised machine learning algorithm used for
both classification and regression.
The main objective of the SVM algorithm is to find the optimal hyperplane in an N-
dimensional space that can separate the data points of different classes in the
feature space.
The hyperplane is chosen so that the margin between the closest points of different
classes is as large as possible.
The dimension of the hyperplane depends upon the number of features. If the
number of input features is two, then the hyperplane is just a line.
If the number of input features is three, then the hyperplane becomes a 2-D plane.
•Support Vectors: These are the points that are closest to the hyperplane. A separating
line will be defined with the help of these data points.
•Margin: It is the distance between the hyperplane and the observations closest to the
hyperplane (the support vectors). In SVM, a large margin is considered a good margin.
There are two types of margins, hard margin and soft margin; more on these two
in a later section.
Linear SVM
Linear SVM can be used only when the data is perfectly linearly separable.
Perfectly linearly separable means that the data points can be classified into two classes
using a single straight line (in 2D).
Non-Linear SVM
When the data is not linearly separable, we can use Non-Linear SVM: when the data
points cannot be separated into two classes by a straight line (in 2D), we use advanced
techniques such as the kernel trick to classify them. In most real-world applications the
data points are not linearly separable, so the kernel trick is used to handle them.
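As a sketch, both variants can be fit with scikit-learn's SVC on a synthetic non-linearly-separable dataset; the kernel parameter switches between the linear and non-linear cases:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two interleaving half-moons: a classic non-linearly-separable dataset.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Linear SVM: a single straight decision boundary.
linear_svm = SVC(kernel="linear").fit(X_train, y_train)
print("Linear kernel accuracy:", linear_svm.score(X_test, y_test))

# Non-linear SVM: the RBF kernel implicitly maps the data into a
# higher-dimensional space where a separating hyperplane exists.
rbf_svm = SVC(kernel="rbf").fit(X_train, y_train)
print("RBF kernel accuracy:", rbf_svm.score(X_test, y_test))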
