Estimating the Performance of a Classifier - II
True and False Positives and Negatives
• The idea of a confusion matrix has already been introduced. Here we consider the case where there are two classes, which we will call positive and negative (or simply + and −).
• In this case the confusion matrix consists of four cells, which can be labelled TP, FP, FN and TN.
True and False Positives and Negatives
• It is often useful to distinguish between the two types of classification error: false positives and false negatives.
• False positives (also known as Type I errors) occur when instances that should be classified as negative are classified as positive.
• False negatives (also known as Type II errors) occur when instances that should be classified as positive are classified as negative.
• Depending on the application, one type of error may be far more serious than the other; in medical screening, for example, a false negative is usually considered more serious than a false positive (see the sketch below).
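As a rough illustration (the labels, predictions and helper name confusion_counts below are made up for this sketch, not part of the original slides), the four cells of the confusion matrix, including the two error types, can be counted directly from the true and predicted labels:

def confusion_counts(y_true, y_pred):
    """Count the four confusion-matrix cells for a two-class (+/-) problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == "+" and p == "+")
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == "-" and p == "-")
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == "-" and p == "+")  # false positives (Type I errors)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == "+" and p == "-")  # false negatives (Type II errors)
    return tp, fp, fn, tn

# Hypothetical test set of 10 instances
y_true = ["+", "+", "+", "+", "-", "-", "-", "-", "-", "-"]
y_pred = ["+", "+", "+", "-", "-", "-", "-", "-", "+", "+"]
print(confusion_counts(y_true, y_pred))  # (3, 2, 1, 4)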
Performance Measures
We can now define a number of performance measures for a classifier applied to a given test set. Two of the most important for what follows are the True Positive Rate (TP Rate) and the False Positive Rate (FP Rate).
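As an illustrative sketch using the standard definitions (continuing the hypothetical confusion_counts helper above), the most common measures can be computed from the four cell counts:

def performance_measures(tp, fp, fn, tn):
    """Standard measures derived from the confusion-matrix cell counts."""
    tp_rate = tp / (tp + fn)              # also known as recall or sensitivity
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"TP Rate": tp_rate, "FP Rate": fp_rate,
            "Precision": precision, "Accuracy": accuracy}

print(performance_measures(3, 2, 1, 4))
# {'TP Rate': 0.75, 'FP Rate': 0.333..., 'Precision': 0.6, 'Accuracy': 0.7}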
ROC Graphs
• The TP Rate and FP Rate values of different classifiers on the same test set are often represented diagrammatically by a ROC Graph, with the FP Rate on the horizontal axis and the TP Rate on the vertical axis.
• ROC stands for 'Receiver Operating Characteristics', a name that reflects the technique's original use in signal processing applications.
ROC Graphs
• The points (0, 1), (1, 0), (1, 1) and (0, 0) correspond
to the four special cases mentioned before in order.
• The first (0,1) is located at the best possible position
on the graph, the top left-hand corner.
• The second (1, 0) is at the worst possible position,
the bottom right-hand corner.
• If all the classifiers are good ones, all the points on
the ROC Graph are likely to be around the top left-
hand corner.
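As a small sketch (matplotlib is assumed to be available; the point labels are only for illustration), the four special points can be plotted on a ROC Graph to see why the top left-hand corner is the best position:

import matplotlib.pyplot as plt

# The four special points discussed above, as (FP Rate, TP Rate) pairs
special = {"(0, 1) best": (0, 1), "(1, 0) worst": (1, 0),
           "(1, 1)": (1, 1), "(0, 0)": (0, 0)}

fig, ax = plt.subplots()
for label, (fpr, tpr) in special.items():
    ax.scatter(fpr, tpr)
    ax.annotate(label, (fpr, tpr))
ax.plot([0, 1], [0, 1], linestyle="--")   # diagonal (discussed later, under AUC)
ax.set_xlabel("FP Rate")
ax.set_ylabel("TP Rate")
ax.set_title("ROC Graph: special points")
plt.show()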
ROC Graphs
• One classifier is better than another if its
corresponding point on the ROC Graph is to the
‘north-west’ of the other’s.
• A classifier represented by the point (0.1, 0.6) is better than one represented by (0.2, 0.5).
– It has both a lower FP Rate and a higher TP Rate.
• If we compare the points (0.1, 0.6) and (0.2, 0.7), the latter has a higher TP Rate but also a higher FP Rate.
– Neither classifier is superior to the other on both measures, and the one chosen will depend on the relative importance the user gives to the two measures (see the sketch below).
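A minimal sketch of this comparison (the function name dominates is made up for illustration):

def dominates(a, b):
    """True if classifier a lies to the 'north-west' of classifier b on a ROC Graph,
    i.e. a is at least as good on both measures and strictly better on at least one.
    Each classifier is given as an (fp_rate, tp_rate) pair."""
    fpr_a, tpr_a = a
    fpr_b, tpr_b = b
    return fpr_a <= fpr_b and tpr_a >= tpr_b and (fpr_a < fpr_b or tpr_a > tpr_b)

print(dominates((0.1, 0.6), (0.2, 0.5)))  # True: lower FP Rate and higher TP Rate
print(dominates((0.2, 0.7), (0.1, 0.6)))  # False: higher TP Rate but also higher FP Rate
print(dominates((0.1, 0.6), (0.2, 0.7)))  # False: neither classifier dominates the other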
ROC Curves
• In general, each classifier corresponds to a
single point on a ROC Graph.
• However, some classification algorithms have one or more parameters that can be 'tuned'.
– In this case, it is natural to think of a series of classifiers, one for each parameter setting, each represented by a point on the graph.
– The points can be joined to form a ROC Curve (a rough sketch of this follows).
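As a hedged sketch of this idea (scikit-learn and its roc_curve function are assumed to be available; the labels and scores below are made up), varying a decision threshold over a classifier's scores yields one (FP Rate, TP Rate) point per threshold, and joining the points gives a ROC Curve:

from sklearn.metrics import roc_curve

# Hypothetical test set: true labels (1 = positive) and classifier scores
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1]

# Each threshold on the score corresponds to one classifier, i.e. one point on the curve
fpr, tpr, thresholds = roc_curve(y_true, y_score)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold {th:.2f}: FP Rate {f:.2f}, TP Rate {t:.2f}")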
An example ROC Curve
Examining ROC curves can give insights into the best way of tuning a classification algorithm.
In the example curve, performance clearly degrades after the third point in the series.
AUC or Area Under Curve (ROC Curve)
Important Notes:
• Points along the diagonal (where TP Rate = FP Rate) represent a situation similar to making predictions by random guessing; the point (0.5, 0.5), for example, corresponds to tossing an unbiased coin. In other words, the classification model is no better than random.
• All models whose points lie above the diagonal perform better than a model that makes predictions randomly.
• All models whose points lie below the diagonal perform worse than a model that makes predictions randomly.
AUC
In the figure, the green curve has better predictive performance than the red curve.
AUC
• The further a ROC curve is lifted up and away from the diagonal, the better the model.
• Since the X-axis has a maximum of 1 (the FP Rate ranges from 0 to 1, i.e. 0% to 100%) and the Y-axis also has a maximum of 1 (the TP Rate ranges from 0 to 1), the total area of the plot is 1 × 1 = 1.
• The more the ROC curve is lifted up and away from the diagonal (towards the top horizontal line, TP Rate = 1), the larger the area under it, and the closer that area is to 1.
• If the ROC curve coincides with the diagonal, the area under it is 0.5 (a diagonal divides a square in half).
AUC
• The area under the ROC curve is known as the AUC.
• This area should therefore be greater than 0.5 for a model to be acceptable.
• A model with an AUC of 0.5 or less is worthless.
• This area is thus a natural measure of the predictive accuracy of the model.
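A minimal sketch of computing this area with the trapezoidal rule (the ROC points below are made up, and the helper name auc_trapezoid is hypothetical):

def auc_trapezoid(fpr, tpr):
    """Area under a ROC curve via the trapezoidal rule.
    Points must be ordered by increasing FP Rate."""
    area = 0.0
    for i in range(1, len(fpr)):
        width = fpr[i] - fpr[i - 1]
        avg_height = (tpr[i] + tpr[i - 1]) / 2
        area += width * avg_height
    return area

# (FP Rate, TP Rate) points of a hypothetical ROC curve
fpr = [0.0, 0.0, 0.2, 0.6, 1.0]
tpr = [0.0, 0.6, 0.8, 1.0, 1.0]

print(auc_trapezoid(fpr, tpr))                # about 0.90 here: well above 0.5
print(auc_trapezoid([0.0, 1.0], [0.0, 1.0]))  # the diagonal gives exactly 0.5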
AUC
A rough guide for classifying the accuracy, using the traditional academic point system:
0.90-1.00 = excellent (A)
0.80-0.90 = good (B)
0.70-0.80 = fair (C)
0.60-0.70 = poor (D)
0.50-0.60 = fail (F)
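As a small sketch, the rough guide above can be written as a simple lookup (the function name auc_grade is made up for illustration):

def auc_grade(auc):
    """Map an AUC value to the rough academic-style grade given above."""
    if auc >= 0.90: return "excellent (A)"
    if auc >= 0.80: return "good (B)"
    if auc >= 0.70: return "fair (C)"
    if auc >= 0.60: return "poor (D)"
    if auc >= 0.50: return "fail (F)"
    return "worse than random"

print(auc_grade(0.85))  # good (B)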