Ch 07 Evaluation


EVALUATION

The Blossoms School, Aligarh | Class X | Mohd Suhail Athar


Evaluation

Evaluation is the process of assessing the reliability of an AI model by feeding a test dataset into the model and comparing the model's outputs with the actual answers.
Overfitting

Overfitting occurs when a machine learning model or algorithm performs exceptionally well on the training data but fails to generalize effectively to new, unseen data.

Using the training data for evaluation is one of the causes of overfitting.
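As an illustration, here is a minimal sketch (assuming Python with scikit-learn is available, and using made-up toy data) of holding out a separate test set so the model is never evaluated on the data it was trained on:

```python
# A minimal sketch of holding out a test set for evaluation.
# The data here is purely illustrative.
from sklearn.model_selection import train_test_split

X = [[i] for i in range(100)]   # toy feature values
y = [0] * 50 + [1] * 50         # toy labels

# Keep 20% of the data aside purely for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), "training samples,", len(X_test), "test samples")
```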
Evaluation Terminologies

Suppose we have an AI model which is used for the prediction of forest fires.
Now, to test the efficiency of this model, we need to take two conditions into consideration:

1) Prediction: Refers to the model's estimate, guess or forecast of an outcome.

2) Reality: Refers to the actual, observed outcome.

Few more terms:
The terms below are used to describe the results of a model's predictions when compared to the actual outcomes or ground truth.

1) True Positive: When an event happens both in the prediction and in reality.

2) True Negative: When an event does not happen either in the prediction or in reality.

3) False Positive: When an event happens in the prediction but not in reality.

4) False Negative: When an event does not happen in the prediction but happens in reality.
Case 1: Is there a forest fire? Prediction: Yes, Reality: Yes → True Positive

Case 2: Is there a forest fire? Prediction: No, Reality: No → True Negative

Case 3: Is there a forest fire? Prediction: Yes, Reality: No → False Positive

Case 4: Is there a forest fire? Prediction: No, Reality: Yes → False Negative
Confusion Matrix

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by an ML algorithm/model. It is used to measure the performance of a model.

A good model is one which has high TP and TN rates, and low FP and FN rates.
Confusion Matrix

                    Reality - Yes       Reality - No
Prediction - Yes    True Positive       False Positive
Prediction - No     False Negative      True Negative
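As an illustration, here is a small plain-Python sketch of counting the four confusion-matrix cells from hypothetical prediction and reality lists (1 = "forest fire", 0 = "no forest fire"):

```python
# Hypothetical predictions and ground-truth values for eight cases.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
reality     = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each confusion-matrix cell by comparing prediction with reality.
tp = sum(1 for p, r in zip(predictions, reality) if p == 1 and r == 1)
tn = sum(1 for p, r in zip(predictions, reality) if p == 0 and r == 0)
fp = sum(1 for p, r in zip(predictions, reality) if p == 1 and r == 0)
fn = sum(1 for p, r in zip(predictions, reality) if p == 0 and r == 1)

print({"TP": tp, "TN": tn, "FP": fp, "FN": fn})  # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```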
EVALUATION METHODS

Accuracy

Precision

Recall

F1 Score
Accuracy

Accuracy is defined as the percentage of correct predictions out of all the observations.
A prediction is said to be correct if it matches the reality.

Accuracy % = (Correct Predictions / Total Cases) × 100

Accuracy % = ((TP + TN) / (TP + TN + FP + FN)) × 100
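A minimal sketch of this formula in Python, reusing the hypothetical counts from the earlier sketch (TP = 3, TN = 3, FP = 1, FN = 1):

```python
# Accuracy as a percentage of correct predictions out of all cases.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn) * 100

print(accuracy(tp=3, tn=3, fp=1, fn=1))  # 75.0
```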
Precision

Precision is defined as the ratio of true positive cases to all the cases where the prediction is positive.
That is, it tells us how many of the predicted positive instances are actually positive.

Precision = True Positives / All Predicted Positives

Precision = TP / (TP + FP)

Precision value ranges from 0 to 1.
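The same formula as a small Python sketch, again with the hypothetical counts used above:

```python
# Precision: fraction of predicted positives that are actually positive.
def precision(tp, fp):
    return tp / (tp + fp)

print(precision(tp=3, fp=1))  # 0.75
```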
Recall

Recall can be defined as the ratio of actual positive cases that are correctly identified.
Recall value ranges from 0 to 1.

Recall = True Positives / (True Positives + False Negatives)

Recall = TP / (TP + FN)
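And the recall formula in the same hypothetical setting:

```python
# Recall: fraction of actual positives that the model correctly identified.
def recall(tp, fn):
    return tp / (tp + fn)

print(recall(tp=3, fn=1))  # 0.75
```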
Choosing between Precision and Recall

It depends on the situation/condition.

Consider the situation of a Forest Fire:

o Here, a False Negative can cost us a lot and is too risky.
o That is, the fire alarm says, 'All is fine', but in reality, there is a fire.
o In cases with a high FN cost, we should choose Recall.

For reference:
FN – Prediction No, Reality Yes
FP – Prediction Yes, Reality No
Choosing between Precision and Recall

Consider the situation of Mining:

o Suppose a model predicted petroleum in some area. Workers kept digging but found nothing.
o Here, a False Positive case can be very costly.
o In cases with a high FP cost, we should choose Precision.
Which one is more important – Precision or Recall ?

Both measures are important.

Since both measures are important, there is a need for a parameter which takes both Precision and Recall into account, i.e. the F1 Score.
F1 Score

F1 score can be defined as the measure of balance between precision and recall.

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)

F1 Score value ranges from 0 to 1.
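A small sketch of the F1 formula, using the hypothetical precision and recall values computed in the earlier sketches:

```python
# F1 Score: harmonic-mean style balance between precision and recall.
def f1_score(precision, recall):
    return 2 * (precision * recall) / (precision + recall)

print(f1_score(0.75, 0.75))  # 0.75
```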


How do you suggest which evaluation metric is more important for any case?

The F1 Score is the more important evaluation metric in any case, as it maintains a balance between precision and recall.

A model has good performance if its F1 Score is high.
An Example: Confusion Matrix of Disease Prediction by an AI model.

                    Reality - Yes    Reality - No
Prediction - Yes        100              10
Prediction - No           5              50

By analyzing the confusion matrix, we get –

1) A total of 165 predictions are made. That is, a total of 165 patients were tested.
2) The model predicted 110 patients with the disease, and 55 patients without any disease.
3) In reality, 105 patients in the sample have the disease, and 60 patients do not.
An Example: Confusion Matrix of Disease Prediction by an AI model.

                    Reality - Yes    Reality - No
Prediction - Yes        100              10
Prediction - No           5              50

True Positive Cases : 100
True Negative Cases : 50
False Positive Cases: 10
False Negative Cases: 5
An Example: Confusion Matrix of Disease Prediction by an AI model.

                    Reality - Yes    Reality - No
Prediction - Yes     100 (TP)          10 (FP)
Prediction - No        5 (FN)          50 (TN)

Accuracy: (TP + TN) / Total = ((100 + 50) / 165) × 100 ≈ 91 %

Precision: TP / Predicted Yes = TP / (TP + FP) = 100 / (100 + 10) ≈ 0.91

Recall: TP / Actual Yes = TP / (TP + FN) = 100 / (100 + 5) ≈ 0.95

F1 Score: 2 × [(0.91 × 0.95) / (0.91 + 0.95)] ≈ 0.93
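As a cross-check, here is a short Python sketch re-computing the figures for this example (TP = 100, TN = 50, FP = 10, FN = 5):

```python
# Re-computing the disease-prediction example to verify the slide's figures.
tp, tn, fp, fn = 100, 50, 10, 5

accuracy  = (tp + tn) / (tp + tn + fp + fn) * 100          # ~90.9 %
precision = tp / (tp + fp)                                  # ~0.91
recall    = tp / (tp + fn)                                  # ~0.95
f1        = 2 * precision * recall / (precision + recall)   # ~0.93

print(round(accuracy, 1), round(precision, 2), round(recall, 2), round(f1, 2))
```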
