W6 CSE 4781 Classification Metrics

The document discusses classification metrics in machine learning, including confusion matrix, accuracy, precision, recall, F1 score, and ROC-AUC. It explains how these metrics evaluate the performance of classification algorithms, using examples like fraud detection and email spam classification. The ROC AUC score is highlighted as a comprehensive measure of model quality across various classification thresholds.


S M Sadakatul Bari
Asst. Prof., AE (Avionics), AAUB
CLASSIFICATION METRICS
IN MACHINE LEARNING
• Confusion Matrix
• Accuracy
• Precision
• Recall
• F1 score
• ROC-AUC
CONFUSION MATRIX
• A confusion matrix is a table that evaluates the performance of a classification algorithm.
• The matrix compares the actual (target) values with the values predicted by the machine learning (ML) model.
Example:

Fraud detection: predicting whether a payment transaction is fraudulent.

The words “positive” and “negative” refer to the target and non-target classes.
In this example, fraud is our target, so transactions flagged as fraudulent are the “positives.”

Now let’s say we have an email spam classification model. It is a binary classification problem: the two possible classes are “spam” and “not spam.”

After training the model, we generated predictions for 10000 emails in the validation dataset.
We already know the actual labels, so we can evaluate the quality of the model predictions.
The calculations below imply the following confusion matrix (spam is the positive class):

                   Predicted spam   Predicted not spam
Actual spam        TP = 600         FN = 300
Actual not spam    FP = 100         TN = 9000
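As a minimal sketch (not from the original slides), the same matrix can be computed with scikit-learn; the label arrays below are synthetic stand-ins arranged to reproduce the counts above:

import numpy as np
from sklearn.metrics import confusion_matrix

# Synthetic validation labels: 900 actual spam (1), 9100 actual not spam (0).
y_true = np.array([1] * 900 + [0] * 9100)
# Predictions arranged to give 600 TP, 300 FN, 100 FP, 9000 TN.
y_pred = np.array([1] * 600 + [0] * 300 + [1] * 100 + [0] * 9000)

# scikit-learn returns [[TN, FP], [FN, TP]] for labels sorted as [0, 1].
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tp, fp, fn, tn)  # 600 100 300 9000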
Accuracy: How often is the model correct overall?

Accuracy = (TP + TN) / (TP + TN + FP + FN)

In our example above, accuracy is (600 + 9000) / 10000 = 0.96. The model was correct in 96% of cases.
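The same calculation as a short sketch, with the counts taken from the example confusion matrix:

# TP = 600, TN = 9000, FP = 100, FN = 300 in the example.
tp, tn, fp, fn = 600, 9000, 100, 300
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 0.96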
Precision: How often does the model correctly predict the positive class?

Precision = TP / (TP + FP)

In our example above, precision is 600 / (600 + 100) ≈ 0.86. When predicting “spam,” the model was correct in 86% of cases.
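The precision calculation as a sketch, using the same example counts:

tp, fp = 600, 100           # true positives, false positives
precision = tp / (tp + fp)
print(round(precision, 2))  # 0.86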
Recall: How many of the actual positive samples in the dataset does the model correctly identify (true positives)?

Recall = TP / (TP + FN)

Recall can also be called sensitivity or true positive rate (TPR).

In our example above, recall is 600 / (600 + 300) ≈ 0.67. The model correctly found 67% of spam emails; the other 33% made their way to the inbox unlabeled.
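The recall calculation as a sketch, again using the example counts:

tp, fn = 600, 300        # true positives, false negatives
recall = tp / (tp + fn)
print(round(recall, 2))  # 0.67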
The F1 score is the harmonic mean (a kind of average) of precision and recall.

F1 = 2 × Precision × Recall / (Precision + Recall)

• Preferable to plain accuracy for class-imbalanced datasets, since F1 ignores true negatives and a model cannot score well simply by predicting the majority class.
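A sketch of the F1 calculation for the example (precision ≈ 0.86, recall ≈ 0.67):

tp, fp, fn = 600, 100, 300
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.75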


What is a ROC curve?

ROC stands for Receiver Operating Characteristic. The ROC curve is a graphical representation of the performance of a binary classifier at different classification thresholds.

The curve plots the True Positive Rate (TPR = TP / (TP + FN)) against the False Positive Rate (FPR = FP / (FP + TN)) as the decision threshold varies.
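A minimal sketch of tracing a ROC curve with scikit-learn's roc_curve; the labels and scores below are toy values invented for illustration:

import numpy as np
from sklearn.metrics import roc_curve

y_true  = np.array([0, 0, 1, 0, 1, 0, 1, 1])                   # actual labels
y_score = np.array([0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])   # predicted P(positive)

# Each (FPR, TPR) pair is one point on the ROC curve, at one threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")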
What is a ROC AUC score?

ROC AUC stands for Receiver Operating Characteristic Area Under the Curve.

The ROC AUC score is a single number that summarizes the classifier's performance across all possible classification thresholds: it is the area under the ROC curve.

• ROC AUC reflects the model quality in one number. A single metric is convenient, especially when comparing multiple models.

• It sums up the performance across the different classification thresholds. It is a valuable “overall” quality measure, whereas precision and recall provide a quality “snapshot” at a given decision threshold.

• ROC AUC measures the model's ability to discriminate between the positive and negative classes, regardless of their relative frequencies in the dataset.

• During model training, it helps compare multiple ML models against each other.
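A minimal sketch of computing the score with scikit-learn's roc_auc_score, on the same toy data as above:

import numpy as np
from sklearn.metrics import roc_auc_score

y_true  = np.array([0, 0, 1, 0, 1, 0, 1, 1])
y_score = np.array([0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

# Area under the ROC curve: one number across all thresholds.
auc = roc_auc_score(y_true, y_score)
print(f"ROC AUC = {auc:.2f}")  # ROC AUC = 0.81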
REFERENCES

• https://www.evidentlyai.com/classification-metrics
• https://www.youtube.com/watch?v=LxcRFNRgLCs
• https://www.youtube.com/watch?v=Joh3LOaG8Q0
