
Last night, after about six months, the children of Gaza slept peacefully.

In the past six months, last night was the first night that no one in Gaza was martyred.
Session 13 - Sunday, 26 Farvardin 1403 (14 April 2024)
A confusion matrix is a square matrix that reports the counts of the true
positive (TP), true negative (TN), false positive (FP), and false negative (FN)
predictions of a classifier

Figure 6.10: A confusion matrix for our data


>>> from sklearn.metrics import confusion_matrix
>>> # pipe_svc is the scaling + SVC pipeline fitted on the training data earlier in the chapter
>>> pipe_svc.fit(X_train, y_train)
>>> y_pred = pipe_svc.predict(X_test)
>>> confmat = confusion_matrix(y_true=y_test, y_pred=y_pred)
>>> print(confmat)
[[71  1]
 [ 2 40]]

>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots(figsize=(2.5, 2.5))
>>> ax.matshow(confmat, cmap=plt.cm.Blues, alpha=0.3)
>>> # annotate each cell of the matrix with its count
>>> for i in range(confmat.shape[0]):
...     for j in range(confmat.shape[1]):
...         ax.text(x=j, y=i, s=confmat[i, j],
...                 va='center', ha='center')
>>> ax.xaxis.set_ticks_position('bottom')
>>> plt.xlabel('Predicted label')
>>> plt.ylabel('True label')
>>> plt.show()
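The snippet above assumes that pipe_svc and the train/test split were created earlier in the chapter. A minimal, self-contained setup along those lines might look like the following sketch; the exact pipeline steps, split parameters, and label relabeling are assumptions, so the resulting counts may differ slightly from Figure 6.10.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Breast Cancer Wisconsin data; in this built-in loader class 0 = malignant and
# class 1 = benign, so the labels are flipped to match the text (class 1 = malignant)
X, y = load_breast_cancer(return_X_y=True)
y = 1 - y

# Stratified 80/20 split so both classes keep their proportions (split is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=1)

# Standardize the features, then fit a support vector classifier
pipe_svc = make_pipeline(StandardScaler(), SVC(random_state=1))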
Assuming that class 1 (malignant) is the positive class
in this example, our model correctly classified 71 of the
examples that belong to class 0 (TN) and 40 examples
that belong to class 1 (TP), respectively. However, our
model also misclassified two examples from class 1 as
class 0 (FN), and it predicted that one example is
malignant although it is a benign tumor (FP). (p. 195)
Both the prediction error (ERR) and accuracy (ACC) provide general information about how
many examples are misclassified. The error can be understood as the sum of all false
predictions divided by the number of total predictions, and the accuracy is calculated as the
sum of correct predictions divided by the total number of predictions, respectively:
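Written out in terms of the confusion-matrix counts, these definitions read:

\[
ERR = \frac{FP + FN}{FP + FN + TP + TN},
\qquad
ACC = \frac{TP + TN}{FP + FN + TP + TN} = 1 - ERR
\]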
The true positive rate (TPR) and false positive rate (FPR) are performance
metrics that are especially useful for imbalanced class problems:
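With P and N denoting the total numbers of actual positives and actual negatives, the two rates are defined as:

\[
FPR = \frac{FP}{N} = \frac{FP}{FP + TN},
\qquad
TPR = \frac{TP}{P} = \frac{TP}{FN + TP}
\]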

In tumor diagnosis, for example, we are more concerned about the detection of malignant
tumors in order to help a patient with the appropriate treatment. However, it is also important
to decrease the number of benign tumors incorrectly classified as malignant (FP) to not
unnecessarily concern patients. In contrast to the FPR, the TPR provides useful information
about the fraction of positive (or relevant) examples that were correctly identified out of the
total pool of positives (P).
The performance metrics precision (PRE) and recall (REC) are related to
those TP and TN rates, and in fact, REC is synonymous with TPR:
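In the notation used above:

\[
REC = TPR = \frac{TP}{P} = \frac{TP}{FN + TP}
\]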

In other words, recall quantifies how many of the relevant records (the
positives) are captured as such (the true positives). Precision quantifies how
many of the records predicted as relevant (the sum of true and false positives)
are actually relevant (true positives):
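In formula form:

\[
PRE = \frac{TP}{TP + FP}
\]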
The F1 score is a machine learning evaluation metric that summarizes a model's predictive performance by combining its precision and recall scores.

The accuracy metric computes how many times a model made a correct
prediction across the entire dataset. This can be a reliable metric only if the
dataset is class-balanced; that is, each class of the dataset has the same number
of samples.

Nevertheless, real-world datasets are often heavily class-imbalanced, which can make this
metric unreliable. For example, if a binary dataset has 90 samples in class-1 and 10 samples
in class-2, a model that always predicts "class-1", regardless of the input, is still 90% accurate.
Can such a model be called a good predictor? This is where the F1 score comes into play.
The F1 score is calculated as the harmonic
mean of the precision and recall scores. It
ranges from 0-100%, and a higher F1 score
denotes a better quality classifier.
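To make this concrete, here is a small sketch that mirrors the 90/10 example above; the data and the always-predict-the-majority-class model are purely illustrative:

import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# 90 samples of class 1 (majority) and 10 samples of class 0 (minority)
y_true = np.array([1] * 90 + [0] * 10)

# A degenerate model that predicts the majority class for every sample
y_pred = np.ones_like(y_true)

# Accuracy looks impressive, but the minority class is never detected
print(accuracy_score(y_true, y_pred))                          # 0.9
print(f1_score(y_true, y_pred, pos_label=0, zero_division=0))  # 0.0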
Which counts play the role of FP and FN depends on which class is treated as
positive: in the example shown, when the positive class is considered, FP is 12
and FN is 8; if the negative class is treated as the positive class instead, the
two switch places, so FP becomes 8 and FN becomes 12.
• Receiver operating characteristic (ROC) curves are useful tools to select models for
classification based on their performance with respect to the FPR and TPR, which are
computed by shifting the decision threshold of the classifier.
• The diagonal of a ROC graph can be interpreted as random guessing, and classification
models that fall below the diagonal are considered as worse than random guessing.
• A perfect classifier would fall into the top-left corner of the graph with a TPR of 1 and an
FPR of 0.
• Based on the ROC curve, we can then compute the so-called ROC area under the curve
(ROC AUC) to characterize the performance of a classification model.
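A sketch of how these quantities might be computed with scikit-learn, assuming the fitted pipe_svc pipeline and the test split from the earlier code; the decision_function scores supply the continuous values over which the threshold is shifted:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

# Continuous scores are needed so the decision threshold can be varied
y_scores = pipe_svc.decision_function(X_test)

# FPR/TPR pairs for every threshold, plus the area under the curve
fpr, tpr, thresholds = roc_curve(y_true=y_test, y_score=y_scores)
roc_auc = roc_auc_score(y_test, y_scores)

plt.plot(fpr, tpr, label=f'ROC curve (AUC = {roc_auc:.2f})')
plt.plot([0, 1], [0, 1], linestyle='--', label='Random guessing')  # diagonal baseline
plt.xlabel('False positive rate (FPR)')
plt.ylabel('True positive rate (TPR)')
plt.legend(loc='lower right')
plt.show()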
Accuracy:
Percentage of predictions that are correct, over the entire dataset.

Precision:
Percentage of positive instances out of the total predicted positive instances.
Take it as finding out 'how much the model is right when it says it is right'.

Recall/Sensitivity/True Positive Rate:

Percentage of positive instances out of the total actual positive instances.
The denominator (TP + FN) is therefore the actual number of positive instances
present in the dataset. Take it as finding out 'how many of the actual positives
the model captured, and how many it missed'.
Specificity:
Percentage of negative instances out of the total actual negative instances.
The denominator (TN + FP) is therefore the actual number of negative instances
present in the dataset. It is similar to recall, but the focus shifts to the
negative instances, for example, finding out how many healthy patients were
correctly told they don't have cancer.
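In formula form, the two measures just described are:

\[
\text{Recall} = \frac{TP}{TP + FN},
\qquad
\text{Specificity} = \frac{TN}{TN + FP}
\]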

F1 score:

It is the harmonic mean of precision and recall. It takes the contribution of
both into account, so the higher the F1 score, the better.
https://www.datacamp.com/tutorial/what-is-a-confusion-matrix-in-machine-learning
Actual values = ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'dog', 'dog', 'cat',
'dog', 'dog', 'cat', 'dog', 'dog', 'cat']

Predicted values = ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog', 'cat',
'dog', 'dog', 'cat', 'dog', 'dog', 'cat']
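These two lists can be tabulated directly with scikit-learn; passing labels= makes the ordering explicit (rows are actual labels, columns are predicted labels):

from sklearn.metrics import confusion_matrix

actual = ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat', 'dog',
          'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat']
predicted = ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat',
             'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat']

print(confusion_matrix(actual, predicted, labels=['dog', 'cat']))
# [[11  2]    11 dogs correctly identified, 2 dogs predicted as cats
#  [ 1  6]]    1 cat predicted as a dog,    6 cats correctly identified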
The F1 score is a number between 0 and 1 and is the harmonic mean of precision
and recall. Compared to the arithmetic mean, the harmonic mean penalizes extreme
values more. The F-score should be high (ideally 1).
https://neptune.ai/blog/performance-metrics-in-machine-learning-complete-guide

How to Evaluate the Performance of a Machine Learning Model
https://www.kdnuggets.com/2020/09/performance-machine-learning-model.html
Count plot showing how many patients have heart disease and how many do not
https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
Sensitivity and Specificity
Recall, Precision and F-Measure
Recall and precision can be combined into a single summary measure. One
example is the F-measure, which is the harmonic mean of recall and precision:
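Written out, with PRE and REC as defined earlier:

\[
F_1 = 2 \cdot \frac{PRE \times REC}{PRE + REC}
\]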
Receiver operating characteristic (ROC) graphs are useful tools to select models
for classification based on their performance with respect to the FPR and TPR,
which are computed by shifting the decision threshold of the classifier.

The diagonal of a ROC graph can be interpreted as random guessing, and
classification models that fall below the diagonal are considered as worse than
random guessing.

A perfect classifier would fall into the top-left corner of the graph with a TPR of 1
and an FPR of 0. Based on the ROC curve, we can then compute the so-called ROC
area under the curve (ROC AUC) to characterize the performance of a classification model.

Figure 6.11: The ROC plot
• Class imbalance is a quite common problem when working with real-world
data—examples from one class or multiple classes are over-represented in a
dataset

• We can think of several domains where this may occur, such as spam filtering,
fraud detection, or screening for diseases

• One way to deal with imbalanced class proportions during model fitting is to
assign a larger penalty to wrong predictions on the minority class
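In scikit-learn, one common way to apply such a penalty is the class_weight parameter that most classifiers accept; a minimal sketch, where the imbalanced toy data is purely illustrative:

import numpy as np
from sklearn.svm import SVC

# Toy imbalanced data: 90 majority-class and 10 minority-class examples
rng = np.random.RandomState(1)
X = np.vstack([rng.normal(0.0, 1.0, size=(90, 2)),
               rng.normal(2.0, 1.0, size=(10, 2))])
y = np.array([0] * 90 + [1] * 10)

# class_weight='balanced' scales the error penalty inversely to class frequency,
# so mistakes on the rare class count more during fitting
svm = SVC(kernel='linear', class_weight='balanced', random_state=1)
svm.fit(X, y)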
• At the beginning of this chapter, we discussed how to chain different transformation
techniques and classifiers in convenient model pipelines that help us to train and
evaluate machine learning models more efficiently
• We then used those pipelines to perform k-fold cross-validation, one of the essential
techniques for model selection and evaluation
• Using k-fold cross-validation, we plotted learning and validation curves to diagnose
common problems of learning algorithms, such as overfitting and underfitting
• Using grid search, randomized search, and successive halving, we further fine-tuned
our model
• We then used confusion matrices and various performance metrics to evaluate and
optimize a model’s performance for specific problem tasks
• Finally, we concluded this chapter by discussing different methods for dealing with
imbalanced data, which is a common problem in many real-world applications
Now, you should be well equipped with the essential techniques to build
supervised machine learning models for classification successfully
o True
o False
