Ch 07 Evaluation


EVALUATION

The Blossoms School, Aligarh | Class X | Mohd Suhail Athar


Evaluation

Evaluation is the process of assessing the reliability of an AI model by feeding a test dataset into the model and comparing the model's outputs with the actual answers.
Overfitting

Overfitting occurs when a machine learning model or algorithm performs exceptionally well on the training data but fails to generalize effectively to new, unseen data.

Using the training data for evaluation is one of the causes of overfitting.
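As an illustration, here is a minimal sketch (assuming Python with scikit-learn is available, and using made-up toy data) of holding out a separate test set so the model is never evaluated on the data it was trained on:

```python
# A minimal sketch of holding out a test set for evaluation.
# The data here is purely illustrative.
from sklearn.model_selection import train_test_split

X = [[i] for i in range(100)]   # toy feature values
y = [0] * 50 + [1] * 50         # toy labels

# Keep 20% of the data aside purely for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(len(X_train), "training samples,", len(X_test), "test samples")
```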
Evaluation Terminologies

Suppose we have an AI model which is used for the prediction of forest fires.
Now, to test the efficiency of this model, we need to take two conditions into consideration:

1) Prediction: Refers to the model's estimate, guess or forecast of an outcome.

2) Reality: Refers to the actual, observed outcome.

Few more terms:
The terms below are used to describe the results of a model's predictions when compared to the actual outcomes or ground truth.

1) True Positive: When an event happens both in the prediction and in reality.

2) True Negative: When an event does not happen either in the prediction or in reality.

3) False Positive: When an event happens in the prediction but not in reality.

4) False Negative: When an event does not happen in the prediction but happens in reality.
Case 1: Is there a forest fire? Prediction: Yes, Reality: Yes → True Positive

Case 2: Is there a forest fire? Prediction: No, Reality: No → True Negative

Case 3: Is there a forest fire? Prediction: Yes, Reality: No → False Positive

Case 4: Is there a forest fire? Prediction: No, Reality: Yes → False Negative
Confusion Matrix

A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by an ML algorithm/model. It is used to measure the performance of a model.

A good model is one which has high TP and TN rates, and low FP and FN rates.
Confusion Matrix

                    Reality - Yes       Reality - No
Prediction - Yes    True Positive       False Positive
Prediction - No     False Negative      True Negative
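As an illustration, here is a small plain-Python sketch of counting the four confusion-matrix cells from hypothetical prediction and reality lists (1 = "forest fire", 0 = "no forest fire"):

```python
# Hypothetical predictions and ground-truth values for eight cases.
predictions = [1, 0, 1, 1, 0, 0, 1, 0]
reality     = [1, 0, 0, 1, 0, 1, 1, 0]

# Count each confusion-matrix cell by comparing prediction with reality.
tp = sum(1 for p, r in zip(predictions, reality) if p == 1 and r == 1)
tn = sum(1 for p, r in zip(predictions, reality) if p == 0 and r == 0)
fp = sum(1 for p, r in zip(predictions, reality) if p == 1 and r == 0)
fn = sum(1 for p, r in zip(predictions, reality) if p == 0 and r == 1)

print({"TP": tp, "TN": tn, "FP": fp, "FN": fn})  # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1}
```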
EVALUATION METHODS

Accuracy

Precision

Recall

F1 Score
Accuracy

Accuracy is defined as the percentage of correct predictions out of all the observations.
A prediction is said to be correct if it matches the reality.

Accuracy % = (Correct Predictions / Total Cases) × 100

Accuracy % = ((TP + TN) / (TP + TN + FP + FN)) × 100
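A minimal sketch of this formula in Python, reusing the hypothetical counts from the earlier sketch (TP = 3, TN = 3, FP = 1, FN = 1):

```python
# Accuracy as a percentage of correct predictions out of all cases.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn) * 100

print(accuracy(tp=3, tn=3, fp=1, fn=1))  # 75.0
```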
Precision

Precision is defined as the ratio of true positive cases to all the cases where the prediction is positive.
That is, it tells us how many of the predicted positive instances are actually positive.

Precision = True Positives / All Predicted Positives

Precision = TP / (TP + FP)

Precision value ranges from 0 to 1.
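The same formula as a small Python sketch, again with the hypothetical counts used above:

```python
# Precision: fraction of predicted positives that are actually positive.
def precision(tp, fp):
    return tp / (tp + fp)

print(precision(tp=3, fp=1))  # 0.75
```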
Recall

Recall can be defined as the ratio of actual positive cases that are correctly identified.
Recall value ranges from 0 to 1.

Recall = True Positives / (True Positives + False Negatives)

Recall = TP / (TP + FN)
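And the recall formula in the same hypothetical setting:

```python
# Recall: fraction of actual positives that the model correctly identified.
def recall(tp, fn):
    return tp / (tp + fn)

print(recall(tp=3, fn=1))  # 0.75
```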
Choosing between Precision and Recall

It depends on the situation/condition.

Consider the situation of a Forest Fire:

o Here, a False Negative can cost us a lot and is too risky.
o That is, the fire alarm says, 'All is fine', but in reality, there is a fire.
o In cases with a high FN cost, we should choose Recall.

For reference:
FN – Prediction No, Reality Yes
FP – Prediction Yes, Reality No
Choosing between Precision and Recall

Consider the situation of Mining:

o Suppose a model predicted petroleum in some area. Workers kept digging but found nothing.
o Here, a False Positive case can be very costly.
o In cases with a high FP cost, we should choose Precision.
Which one is more important – Precision or Recall ?

Both measures are important.

Since both measures are important, there is a need for a parameter which takes both Precision and Recall into account, i.e. the F1 Score.
F1 Score

F1 score can be defined as the measure of balance between precision and recall.

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)

F1 Score value ranges from 0 to 1.
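A small sketch of the F1 formula, using the hypothetical precision and recall values computed in the earlier sketches:

```python
# F1 Score: harmonic-mean style balance between precision and recall.
def f1_score(precision, recall):
    return 2 * (precision * recall) / (precision + recall)

print(f1_score(0.75, 0.75))  # 0.75
```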


How do you suggest which evaluation metric is more important for any case?

The F1 Score is the more important evaluation metric in any case, as it maintains a balance between precision and recall.

A model has good performance if its F1 Score is high.
An Example: Confusion Matrix of Disease Prediction by an AI model.

                    Reality - Yes    Reality - No
Prediction - Yes        100              10
Prediction - No           5              50

By analyzing the confusion matrix, we get –

1) A total of 165 predictions are made. That is, a total of 165 patients were tested.
2) The model predicted 110 patients with the disease, and 55 patients without any disease.
3) In reality, 105 patients in the sample have the disease, and 60 patients do not.
An Example: Confusion Matrix of Disease Prediction by an AI model.

                    Reality - Yes    Reality - No
Prediction - Yes        100              10
Prediction - No           5              50

True Positive Cases : 100
True Negative Cases : 50
False Positive Cases: 10
False Negative Cases: 5
An Example: Confusion Matrix of Disease Prediction by an AI model.

                    Reality - Yes    Reality - No
Prediction - Yes     100 (TP)          10 (FP)
Prediction - No        5 (FN)          50 (TN)

Accuracy: (TP + TN) / Total = ((100 + 50) / 165) × 100 ≈ 91 %

Precision: TP / Predicted Yes = TP / (TP + FP) = 100 / (100 + 10) ≈ 0.91

Recall: TP / Actual Yes = TP / (TP + FN) = 100 / (100 + 5) ≈ 0.95

F1 Score: 2 × [(0.91 × 0.95) / (0.91 + 0.95)] ≈ 0.93
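As a cross-check, here is a short Python sketch re-computing the figures for this example (TP = 100, TN = 50, FP = 10, FN = 5):

```python
# Re-computing the disease-prediction example to verify the slide's figures.
tp, tn, fp, fn = 100, 50, 10, 5

accuracy  = (tp + tn) / (tp + tn + fp + fn) * 100          # ~90.9 %
precision = tp / (tp + fp)                                  # ~0.91
recall    = tp / (tp + fn)                                  # ~0.95
f1        = 2 * precision * recall / (precision + recall)   # ~0.93

print(round(accuracy, 1), round(precision, 2), round(recall, 2), round(f1, 2))
```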
