ARTIFICIAL INTELLIGENCE LEC 3
Recall, also known as the true positive rate (TPR), is the percentage of data
samples that a machine learning model correctly identifies as belonging to a
class of interest—the “positive class”—out of the total samples for that class.
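In formula form: Recall = TP / (TP + FN), where TP is the count of true positives and FN the count of false negatives. A minimal sketch of computing recall with scikit-learn (the labels below are invented for illustration):

from sklearn.metrics import recall_score

y_true = [1, 1, 1, 0, 0, 1]   # invented ground-truth labels; 1 = positive class
y_pred = [1, 0, 1, 0, 1, 1]   # invented model predictions

# TP = 3, FN = 1, so recall = 3 / (3 + 1) = 0.75
print(recall_score(y_true, y_pred))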
ROC Curve
The Receiver Operating Characteristic (ROC) curve is an evaluation metric
for binary classification problems. It is a probability curve that plots
the TPR against the FPR at various threshold values, essentially separating
the ‘signal’ from the ‘noise.’
In other words, it shows the performance of a classification model at all
classification thresholds. The Area Under the Curve (AUC) is the measure
of the ability of a binary classifier to distinguish between classes and is
used as a summary of the ROC curve.
The higher the AUC, the better the model’s performance at distinguishing
between the positive and negative classes.
When AUC = 1, the classifier can correctly distinguish all the positive class points from all the
negative class points. When AUC = 0, the classifier predicts every negative as positive and every
positive as negative.
When 0.5 < AUC < 1, there is a good chance that the classifier can distinguish the positive class values
from the negative ones, because it detects more true positives and true negatives than false positives
and false negatives.
When AUC = 0.5, the classifier cannot distinguish between positive and negative class points;
it is effectively predicting either a random class or the same class for every data point.
So, the higher the AUC value for a classifier, the better its ability to distinguish between positive and negative
classes.
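A minimal sketch of computing the ROC curve and AUC with scikit-learn (the dataset and model here are illustrative, not part of the lecture):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_curve, roc_auc_score

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
scores = model.predict_proba(X_te)[:, 1]          # probability of the positive class

fpr, tpr, thresholds = roc_curve(y_te, scores)    # one (FPR, TPR) point per threshold
print("AUC =", roc_auc_score(y_te, scores))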
Naive Bayes Classification
It is a classification technique based on Bayes’ Theorem with an independence assumption among
predictors.
In simple terms, a Naive Bayes classifier assumes that the presence of a particular feature in a class
is unrelated to the presence of any other feature.
The Naïve Bayes classifier is a popular supervised machine learning algorithm used for
classification tasks such as text classification.
•Naïve Bayes is one of the simplest and most effective classification algorithms; it helps build
fast machine learning models that can make quick predictions.
•It is a probabilistic classifier, which means it predicts on the basis of the probability of each class
given an object's features.
•Popular applications of the Naïve Bayes algorithm include spam filtering, sentiment analysis, and
article classification.
The name Naïve Bayes is made up of two words, Naïve and Bayes, which can be described as:
•Naïve: It is called naïve because it assumes that the occurrence of a certain feature is independent of the
occurrence of other features. For example, if a fruit is identified on the basis of color, shape, and taste, then
a red, spherical, and sweet fruit is recognized as an apple; each feature contributes individually to
identifying the apple, without depending on the others.
•Bayes: It is called Bayes because it relies on the principle of Bayes' Theorem.
•Bayes' theorem, also known as Bayes' Rule or Bayes' law, is
used to determine the probability of a hypothesis given prior knowledge.
It depends on conditional probability.
•The formula for Bayes' theorem is given as:
P(A|B) = P(B|A) × P(A) / P(B)
where P(A|B) is the posterior probability of hypothesis A given evidence B, P(B|A) is the likelihood of
the evidence given the hypothesis, P(A) is the prior probability of the hypothesis, and P(B) is the
probability of the evidence.
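As a quick numeric sketch of the formula (all probabilities below are invented for illustration):

# Bayes' theorem on invented numbers: P(spam | message contains "offer").
p_spam = 0.20               # prior P(A)
p_offer_given_spam = 0.60   # likelihood P(B|A)
p_offer = 0.25              # evidence P(B)

p_spam_given_offer = p_offer_given_spam * p_spam / p_offer
print(p_spam_given_offer)   # 0.48, the posterior P(A|B)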
Where is Naive Bayes Used?
Face Recognition
As a classifier, it can be used to identify faces or facial features such as the nose, mouth,
and eyes.
Weather Prediction
It can be used to predict if the weather will be good or bad.
Medical Diagnosis
Doctors can diagnose patients by using the information that the classifier provides.
Healthcare professionals can use Naive Bayes to indicate if a patient is at high risk for
certain diseases and conditions, such as heart disease, cancer, and other ailments.
News Classification
With the help of a Naive Bayes classifier, Google News can recognize whether an article is
politics, world news, and so on.
Naïve Bayes Example
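As a small worked example, consider classifying a red, spherical fruit as in the apple discussion above (all counts below are invented for illustration):

# Invented training counts: 10 apples (8 red, 9 spherical), 10 oranges (1 red, 10 spherical).
p_apple, p_orange = 0.5, 0.5                  # class priors
p_red_a, p_sph_a = 8 / 10, 9 / 10             # feature likelihoods given apple
p_red_o, p_sph_o = 1 / 10, 10 / 10            # feature likelihoods given orange

# Naive assumption: features are independent given the class, so
# P(class | red, spherical) is proportional to P(class) * P(red|class) * P(spherical|class).
score_apple = p_apple * p_red_a * p_sph_a     # 0.36
score_orange = p_orange * p_red_o * p_sph_o   # 0.05

total = score_apple + score_orange
print("P(apple | red, spherical) =", score_apple / total)   # about 0.88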
Advantages of Naive Bayes Classifier
The following are some of the benefits of the Naive Bayes classifier:
•It is simple and easy to implement
•It requires relatively little training data
•It handles both continuous and discrete data
•It is highly scalable with the number of predictors and data points
•It is fast and can be used to make real-time predictions
•It is not sensitive to irrelevant features
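As a minimal usage sketch, scikit-learn provides several Naive Bayes variants; the example below uses GaussianNB on a toy dataset (the dataset choice is illustrative):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = GaussianNB().fit(X_tr, y_tr)      # learns per-class feature distributions
print("accuracy:", clf.score(X_te, y_te))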
KNN (K-Nearest Neighbor)
The K-Nearest Neighbor (KNN) algorithm is a popular machine learning technique
used for classification and regression tasks.
It relies on the idea that similar data points tend to have similar labels or values.
The K-NN algorithm measures the similarity between a new case and the available cases and
puts the new case into the category most similar to the available categories.
• The K-NN algorithm can be used for regression as well as for classification, but it is mostly used
for classification problems.
•K-NN is a non-parametric algorithm, which means it does not make any assumption about the
underlying data.
•It is also called a lazy learner algorithm because it does not learn from the training set
immediately; instead, it stores the dataset and performs the computation only at
classification time.
How does K-NN work?
The working of K-NN can be explained by the steps below (a code sketch follows the list):
•Step-1: Select the number K of neighbors.
•Step-2: Calculate the Euclidean distance from the new data point to each point in the dataset.
•Step-3: Take the K nearest neighbors according to the calculated distances.
•Step-4: Among these K neighbors, count the number of data points in each category.
•Step-5: Assign the new data point to the category with the maximum number of
neighbors.
•Step-6: Our model is ready.
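A minimal from-scratch sketch of these steps (the toy data and function name are invented for illustration):

import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=5):
    # Step-2: Euclidean distance from x_new to every training point.
    dists = np.linalg.norm(X_train - x_new, axis=1)
    # Step-3: indices of the K nearest neighbors.
    nearest = np.argsort(dists)[:k]
    # Steps 4-5: count the categories among the neighbors and return the majority.
    votes = Counter(y_train[nearest])
    return votes.most_common(1)[0][0]

# Invented toy data: two 2-D clusters labeled 0 and 1.
X_train = np.array([[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X_train, y_train, np.array([2, 2]), k=3))   # prints 0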
•There is no particular way to determine the best value for "K",
so we need to try several values and pick the best among them. A
commonly used default is K = 5.
•A very low value of K, such as K=1 or K=2, can be noisy and
make the model sensitive to outliers.
•Larger values of K smooth out noise, but if K is too large the
neighborhood may include points from other classes, and
predictions become slower.
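A common way to try values of K is cross-validation; a minimal sketch (the candidate values and dataset are illustrative):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Evaluate several candidate K values with 5-fold cross-validation.
for k in (1, 3, 5, 7, 9):
    score = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print("K =", k, "mean accuracy =", round(score, 3))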