Classification Evaluation Metrics
SYED QASIM RAZA FATIMI
What is Classification?
Classification is about predicting class labels
given input data. In binary classification, only
two possible output classes exist (i.e., a dichotomy).
In multiclass classification, more than two
possible classes can be present.
What are classification metrics?
Classification metrics are used in
machine learning to evaluate the
performance of a classification model.
Accuracy:
Accuracy is the percentage of correctly classified
instances in a dataset. It is calculated by dividing
the number of correctly classified instances by
the total number of instances in the dataset.
How to find Accuracy:
MATHEMATICALLY:
$\text{Accuracy} = \dfrac{\text{Number of correct predictions}}{\text{Number of all predictions}}$
USING SCIKIT-LEARN:
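A minimal sketch of how this could look with scikit-learn's accuracy_score, using made-up label lists:

from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]   # actual class labels (made-up example)
y_pred = [0, 1, 0, 0, 1, 0, 1, 1]   # labels predicted by the model

# 7 of the 8 predictions are correct -> accuracy = 7 / 8
print(accuracy_score(y_true, y_pred))  # 0.875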
How Much Accuracy is Good?
There is no universal threshold: how much accuracy counts as good depends on the problem and, in particular, on how balanced the classes are.
PROBLEM WITH ACCURACY
Accuracy can be misleading on imbalanced datasets: a model that always predicts the majority class can reach a high accuracy while never detecting the minority class. This is why the confusion matrix and the metrics derived from it are needed.
Confusion Matrix
A confusion matrix is a table used to evaluate the performance
of a classification model by showing the number of correct and
incorrect classifications it makes. For binary classification it is
a 2x2 matrix, where the rows represent the actual class labels and the
columns represent the predicted class labels.
How to find the confusion matrix?
FORMULA:
$\text{Accuracy} = \dfrac{TP + TN}{TP + FP + FN + TN}$
USING SCIKIT-LEARN:
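A minimal sketch with scikit-learn's confusion_matrix on the same made-up labels, showing that the formula above agrees with accuracy_score:

from sklearn.metrics import confusion_matrix

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 0, 1, 1]

# For binary labels the matrix is laid out as
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()
print(cm)
print((tp + tn) / (tp + fp + fn + tn))  # 0.875, same as accuracy_score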
Terminology:
TRUE POSITIVE (TP): a positive instance correctly predicted as positive.
TRUE NEGATIVE (TN): a negative instance correctly predicted as negative.
FALSE POSITIVE (FP): a negative instance incorrectly predicted as positive.
FALSE NEGATIVE (FN): a positive instance incorrectly predicted as negative.
TYPE-1 ERROR: another name for a false positive.
TYPE-2 ERROR: another name for a false negative.
Confusion matrix for multiclass classification:

                 Predicted Values
Actual Values    Setosa   Versicolor   Virginica
Setosa              5          0            0
Versicolor          0          3            1
Virginica           0          1            5
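A minimal sketch that reproduces the table above, assuming made-up predictions for the three iris species:

from sklearn.metrics import confusion_matrix

classes = ["Setosa", "Versicolor", "Virginica"]
y_true = ["Setosa"] * 5 + ["Versicolor"] * 4 + ["Virginica"] * 6
y_pred = (["Setosa"] * 5                          # all 5 Setosa predicted correctly
          + ["Versicolor"] * 3 + ["Virginica"]    # one Versicolor confused with Virginica
          + ["Versicolor"] + ["Virginica"] * 5)   # one Virginica confused with Versicolor

print(confusion_matrix(y_true, y_pred, labels=classes))
# [[5 0 0]
#  [0 3 1]
#  [0 1 5]]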
Precision
Precision is the ratio of true positives (the number of
correctly classified positive instances) to the total
number of positive predictions made by the model.
$\text{Precision} = \dfrac{\text{True Positive}}{\text{True Positive} + \text{False Positive}}$
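A minimal sketch with scikit-learn's precision_score on made-up labels, checked against the formula:

from sklearn.metrics import precision_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0, 1, 1]

# TP = 4, FP = 1  ->  precision = 4 / 5
print(precision_score(y_true, y_pred))  # 0.8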
Recall:
Recall is the ratio of true positives to the total
number of actual positive instances in the
dataset.
$\text{Recall} = \dfrac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}$
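A minimal sketch with scikit-learn's recall_score on made-up labels, checked against the formula:

from sklearn.metrics import recall_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 0, 1, 1]

# TP = 4, FN = 1  ->  recall = 4 / 5
print(recall_score(y_true, y_pred))  # 0.8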
F1 score
The F1 score is the harmonic mean of precision
and recall. It is calculated as 2*(precision * recall)
/ (precision + recall).
$\text{F1 score} = 2 \cdot \dfrac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
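A minimal sketch showing that scikit-learn's f1_score matches the harmonic-mean formula on made-up labels:

from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0, 1, 1]

p = precision_score(y_true, y_pred)   # 0.8
r = recall_score(y_true, y_pred)      # 0.8
print(2 * p * r / (p + r))            # 0.8
print(f1_score(y_true, y_pred))       # same value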
ROC and AUC:
The ROC (Receiver Operating Characteristic) curve is a plot of
the true positive rate (TPR) against the false positive rate
(FPR) for different threshold values. It is used to evaluate the
performance of a classification model at different
classification thresholds. The AUC (Area Under the Curve)
summarizes the ROC curve as a single number between 0 and 1;
the closer it is to 1, the better the model separates the two classes.
FORMULA:
$TPR = \dfrac{TP}{TP + FN}$
$FPR = \dfrac{FP}{FP + TN}$
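A minimal sketch, assuming the model outputs a probability score for the positive class (the made-up y_scores below); roc_curve sweeps the threshold and roc_auc_score gives the area under the resulting curve:

from sklearn.metrics import roc_curve, roc_auc_score

y_true   = [0, 0, 1, 1, 0, 1, 1, 0]
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.3]  # predicted probability of the positive class

fpr, tpr, thresholds = roc_curve(y_true, y_scores)
print(fpr)                               # false positive rate at each threshold
print(tpr)                               # true positive rate at each threshold
print(roc_auc_score(y_true, y_scores))   # area under the ROC curve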
Thank You!