PROS - Ivanna Kristianti T - Predicting Receiver Operating Characteristic - Fulltext

This document discusses methods for predicting classifier performance measurements like the receiver operating characteristic (ROC) curve, area under the ROC curve (AUC), and arithmetic means of accuracies (Ameans) based on the distributions of data samples. The key aspects covered are defining ROC curves and AUC, reviewing Ameans, and presenting algorithms for predicting these performance metrics using probability density functions and distributions of the data samples without requiring experimentation. The predicted measurements may help evaluate feature discriminability.

2nd 2013 IEEE Conference on Control, Systems and Industrial Informatics (ICCSII)

Bandung, Indonesia, June 23-26, 2013

Predicting Receiver Operating Characteristic Curve, Area Under Curve, and Arithmetic Means of Accuracies based on the Distribution of Data Samples

Ivanna Kristianti Timotius

Department of Electronic Engineering


Satya Wacana Christian University
Salatiga, Indonesia
[email protected]

Abstract-Measuring the performance of a classifier is an essential step in building a classification method for a two-class classification problem. The Receiver Operating Characteristic (ROC) curve, the Area Under the ROC Curve (AUC), and the Arithmetic Means of Accuracies (Ameans) are classifier performance measurements that are typically calculated by conducting an experiment. This paper presents methods for predicting these classifier performance measurements from the data sample distributions. The experiments show that the predicted results are similar to the empirical results obtained with the testing data set. Therefore, the methods are applicable to predicting classifier performance without conducting an experiment. The predicted performance measurements might be useful in evaluating the discriminability of a feature sample.

Keywords-classifier performance measurement; receiver operating characteristic curve; area under curve; arithmetic means of accuracies; data sample distribution

978-1-4673-5817-0/13/$31.00 ©2013 IEEE

I. Introduction

The performances of some classifiers depend on the arbitrary selection of decision thresholds. In such classifiers, the Receiver Operating Characteristic (ROC) curve is employed to give an empirical description of the decision threshold effect [1]. In the case of two-class classification based on a decision threshold, if the data distributions of the positive and negative data samples are as depicted in Fig. 1, the performance of the classifier depends greatly on the chosen decision threshold. Moving the decision threshold toward the negative class will decrease the number of correctly classified data in the negative class, but will increase the number of correctly classified data in the positive class. Changing the decision threshold to any value will give different numbers of correctly classified data in the negative and positive classes. An ROC curve is used to depict the effect of changing this decision threshold.

The ROC curve is typically shaped empirically by using a testing data set. However, this paper proposes a method to predict the ROC curve based on the distribution of the data samples. Knowing the predicted ROC curve, the predicted Area Under ROC Curve (AUC) can be calculated as well. In addition, this paper also proposes an algorithm to predict the value of the decision threshold having the highest value of arithmetic means of accuracies (Ameans) [2] based on the distributions of the data samples.

Figure 1. An example of the distribution of data samples

These predicted performance measurements might give information about the discriminability of a feature fed to a classifier. Commonly, a discriminative feature is a feature having a high mean difference between classes and low variation within classes. If a non-discriminative feature is fed to a classifier, it might decrease the classification accuracy and/or increase the classification time. The authors in [3] define the discriminability (or relevance) of a feature based on the mean difference between the classes. This paper proposes a performance measurement prediction method that might be used to evaluate the discriminability of a feature based on the average and the standard deviation of the feature samples (assuming that the feature sample distributions are Gaussian) or based on the distribution of the feature samples.

The rest of this paper is organized as follows. Section 2 gives a brief explanation of the ROC curve and AUC. Section 3 reviews the concept of Ameans. Section 4 presents the ROC, AUC, and Ameans predicting methods. Section 5 presents the experimental design and results of the methods. Finally, Section 6 presents the conclusions of the work.

II. Receiver Operating Characteristic (ROC) Curve and Area Under ROC Curve (AUC)

An ROC curve or an ROC graph [4] is a depiction of classifier performance in an ROC space. An ROC space is a two-dimensional space in which the true positive rate (TP rate)
is plotted on the Y axis and the false positive rate (FP rate) is plotted on the X axis [1][5]. The TP rate, also known as the true positive fraction (TPF) or sensitivity, is defined as:

$$\text{TP rate} = \frac{\text{number of positives correctly classified}}{\text{number of total positives}} \tag{1}$$

The FP rate, also known as the false positive fraction (FPF), is defined as:

$$\text{FP rate} = \frac{\text{number of negatives incorrectly classified}}{\text{number of total negatives}} \tag{2}$$

In a two-class classifier using a decision threshold, each value of the decision threshold gives a pair of TP rate and FP rate, and consequently a point in the ROC space. By conducting experiments using several decision thresholds, the classifier will produce several points in the ROC space. These points can be connected to produce a curve, which is called an ROC curve.

The point (0, 0) in ROC space represents the strategy of never issuing a positive classification. The point (1, 1) represents the strategy of always issuing a positive classification. The point (0, 1) represents perfect classification. Basically, it is desirable to have a classifier with a high TP rate and a low FP rate.

The Area Under ROC Curve (AUC) is one possible method to compare classifiers based on their ROC curves [4][6]. Each ROC curve produces one value of AUC. The AUC ranges between 0 and 1, since it is a portion of the ROC space, which has the area of a unit square.

III. Arithmetic Means of Accuracies

The arithmetic means of accuracies (Ameans) is one possible method for comparing classifiers between several points in the ROC space [2]. A non-weighted Ameans for a two-class classification is defined as:

$$\text{Ameans} = \frac{1}{2}\left(\text{TP rate} + \text{TN rate}\right) \tag{3}$$

where

$$\text{TN rate} = 1 - \text{FP rate} = \frac{\text{number of negatives correctly classified}}{\text{number of total negatives}} \tag{4}$$

The value of Ameans ranges between 0 and 1. Ameans can be used to measure the performance of classifiers that can only create one point in the ROC space. A better classifier is a classifier producing a higher Ameans.

IV. Predicting the ROC Curve, Area Under ROC Curve, and the Arithmetic Means of Accuracies

The ROC curve predicting method proposed in this paper is based on the probability density function of the negative data samples, $f_N(x)$, and the probability density function of the positive data samples, $f_P(x)$. The cumulative distribution functions for the negative and positive data samples are respectively denoted by $F_N(x)$ and $F_P(x)$. Note that if a classifier classifies the data samples based on a feature of the data, $f_N(x)$ and $f_P(x)$ are the probability density functions of the feature samples. Also, if a classifier is based on a parameter calculated from the feature samples, $f_N(x)$ and $f_P(x)$ are the probability density functions of the parameter. Given a decision threshold, $th$, the predicted TN rate, FP rate, TP rate, and FN rate are calculated by the following equations.

$$\text{Predicted TN rate} = F_N(th) \tag{5}$$

$$\text{Predicted FP rate} = 1 - \text{Predicted TN rate} \tag{6}$$

$$\text{Predicted FN rate} = F_P(th) \tag{7}$$

$$\text{Predicted TP rate} = 1 - \text{Predicted FN rate} \tag{8}$$

By changing the decision threshold, $th$, several points in the ROC space can be generated. Thus, an ROC curve can be predicted. By knowing the ROC curve, the predicted AUC of the ROC curve can be calculated. The Ameans of each point on the ROC curve and the maximum value of the Ameans can be calculated as well.

One possible method to choose the best decision threshold in a classifier is by observing the point on the ROC curve which has the maximum value of Ameans. This paper proposes a method of choosing this decision threshold, $t_{best}$, so that the maximum value of Ameans can be calculated by the following equation.

$$\text{Predicted Max Ameans} = \frac{1}{2}\left(F_N(t_{best}) + \left(1 - F_P(t_{best})\right)\right) \tag{9}$$

The method of choosing the decision threshold, $t_{best}$, starts with predicting that the highest Ameans is obtained if $t_{best}$ is set at the intersection of the negative and positive probability density functions, as shown in the following equation.

$$f_N(t_{best}) = f_P(t_{best}) \tag{10}$$

In the case that $f_N(x)$ is a Gaussian probability density function [7] having mean $m_N$ and standard deviation $\sigma_N$, and $f_P(x)$ is a Gaussian probability density function having mean $m_P$ and standard deviation $\sigma_P$, the best decision threshold, $t_{best}$, can be calculated using the following equation.
$$t_{best} = \begin{cases} \dfrac{m_P^2 - m_N^2}{2(m_P - m_N)}, & \text{if } \sigma_N = \sigma_P \\[2ex] \dfrac{-b_1 + \sqrt{b_1^2 - 4 a_1 c_1}}{2 a_1}, & \text{if } \sigma_N < \sigma_P \\[2ex] \dfrac{-b_2 - \sqrt{b_2^2 - 4 a_2 c_2}}{2 a_2}, & \text{if } \sigma_N > \sigma_P \end{cases} \tag{11}$$

where

$$a_1 = \sigma_P^2 - \sigma_N^2 \tag{12}$$

$$b_1 = 2\left(m_P \sigma_N^2 - m_N \sigma_P^2\right) \tag{13}$$

$$c_1 = m_N^2 \sigma_P^2 - m_P^2 \sigma_N^2 - 2 \sigma_N^2 \sigma_P^2 \ln\left(\frac{\sigma_P}{\sigma_N}\right) \tag{14}$$

$$a_2 = \sigma_N^2 - \sigma_P^2 \tag{15}$$

$$b_2 = 2\left(m_N \sigma_P^2 - m_P \sigma_N^2\right) \tag{16}$$

$$c_2 = m_P^2 \sigma_N^2 - m_N^2 \sigma_P^2 - 2 \sigma_N^2 \sigma_P^2 \ln\left(\frac{\sigma_N}{\sigma_P}\right) \tag{17}$$

V. Experiments and Results

The experiments are conducted by generating Gaussian distributed random data. The class distributions have $m_N = 4$, $m_P = 8$, $\sigma_N = 3$, and $\sigma_P = 2$. The first experiment generates 1000 data for each class. The second experiment uses an imbalanced data set (1000 data for the positive class and 500 data for the negative class). The empirical ROC curves of this classifier, obtained by changing the decision threshold, are shown in Fig. 2.

The AUCs calculated from the empirical ROC curves are shown in Table I. In the experiments using these random data, the empirical Ameans obtained with the decision thresholds are calculated. Subsequently, the points having the maximum values of Ameans are shown in Fig. 2.

The predicted ROC curve calculated based on the mean and the standard deviation of the classes is shown in Fig. 2. The predicted point with the maximum value of Ameans, which is obtained from the predicting method, is also shown in Fig. 2. The values of the AUC and Ameans of Fig. 2 are shown in Table I.

The predicted ROC curve shown in Fig. 2 is similar to the ROC curves obtained empirically from the balanced data set and the imbalanced data set. The predicted AUC and predicted maximum Ameans are similar to the experimental results as well. Therefore, these prediction methods are applicable in predicting the classifier performances.

TABLE I. The AUC and Ameans of Fig. 2

Classifier Performance      Experiments                 Prediction
Measurement                 Balanced     Imbalanced
AUC                         87.35%       87.12%         86.63%
Maximum value of Ameans     80.00%       79.90%         79.50%

Figure 2. The predicted ROC Curve (x-axis: FP rate; curves: empirical ROC curve, empirical ROC curve for the imbalanced data set, and predicted ROC curve, each with its point of maximum Ameans)

The effects of the data sample mean and standard deviation on the predicted ROC curve, AUC, and maximum Ameans are shown in Fig. 3 and Fig. 4. The circles shown in Fig. 3 and Fig. 4 depict the points in ROC space having the maximum Ameans on the associated ROC curves. The intersection that appears in Fig. 4 indicates that there is a point with equal performance resulting from different values of standard deviation and different values of decision threshold, $th$.

Figure 3. Predicted ROC curves with $m_N = 0$, $\sigma_N = 2$, and $\sigma_P = 1$ (x-axis: FP rate).
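The prediction steps of Section IV can be sketched for the Gaussian parameters used in this section ($m_N = 4$, $m_P = 8$, $\sigma_N = 3$, $\sigma_P = 2$). This is a minimal sketch under stated assumptions, not the author's implementation: it uses Python's standard `statistics.NormalDist` for the Gaussian CDFs, an evenly spaced threshold grid, and trapezoidal integration for the AUC, and the helper names (`predict_roc`, `auc_trapezoid`, `predicted_ameans`, `best_threshold`) are hypothetical. Rather than hard-coding one case of Eq. (11), it solves the quadratic that follows from Eq. (10) and keeps the root with the larger predicted Ameans.

```python
import math
from statistics import NormalDist


def predict_roc(neg, pos, n_points=2001):
    """Predicted ROC points (FP rate, TP rate) from Eqs. (5)-(8)."""
    # Assumption: sweep thresholds over +/- 6 standard deviations of both classes.
    lo = min(neg.mean - 6 * neg.stdev, pos.mean - 6 * pos.stdev)
    hi = max(neg.mean + 6 * neg.stdev, pos.mean + 6 * pos.stdev)
    points = []
    for i in range(n_points):
        th = lo + (hi - lo) * i / (n_points - 1)
        tn_rate = neg.cdf(th)      # Eq. (5)
        fp_rate = 1.0 - tn_rate    # Eq. (6)
        fn_rate = pos.cdf(th)      # Eq. (7)
        tp_rate = 1.0 - fn_rate    # Eq. (8)
        points.append((fp_rate, tp_rate))
    return sorted(points)          # ascending FP rate


def auc_trapezoid(points):
    """Area under the predicted ROC points by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def predicted_ameans(neg, pos, th):
    """Predicted Ameans at threshold th, Eq. (9)."""
    return 0.5 * (neg.cdf(th) + (1.0 - pos.cdf(th)))


def best_threshold(neg, pos):
    """Intersection of the two Gaussian densities, Eq. (10)."""
    m_n, s_n, m_p, s_p = neg.mean, neg.stdev, pos.mean, pos.stdev
    if math.isclose(s_n, s_p):
        return (m_n + m_p) / 2.0
    # Quadratic a*t^2 + b*t + c = 0 obtained by taking logs of f_N(t) = f_P(t).
    a = s_n**2 - s_p**2
    b = 2.0 * (m_n * s_p**2 - m_p * s_n**2)
    c = (m_p**2 * s_n**2 - m_n**2 * s_p**2
         - 2.0 * s_n**2 * s_p**2 * math.log(s_n / s_p))
    root = math.sqrt(b * b - 4.0 * a * c)
    candidates = [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]
    # Keep the intersection that gives the larger predicted Ameans.
    return max(candidates, key=lambda t: predicted_ameans(neg, pos, t))


neg, pos = NormalDist(4, 3), NormalDist(8, 2)
roc = predict_roc(neg, pos)
auc = auc_trapezoid(roc)
t_best = best_threshold(neg, pos)
max_ameans = predicted_ameans(neg, pos, t_best)
print(f"AUC={auc:.4f}  t_best={t_best:.3f}  max Ameans={max_ameans:.4f}")
```

With these parameters the printed AUC and maximum Ameans should land close to the Prediction column of Table I (about 86.6% and 79.5%); small differences can come from the grid resolution and rounding.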
Figure 4. Predicted ROC curves with $m_N = 0$, $m_P = 2$, and one class standard deviation fixed at 2 (x-axis: FP rate).

The ROC curve, AUC, and Ameans depend significantly on the mean and standard deviation of the data samples. Commonly, data samples with a higher mean difference and lower standard deviations are easier to classify.

The next experiments are done using the second and third classes of the Iris flower data set [8]. Each class of the Iris data set contains 50 data. The data set consists of information about the sepal length as the first feature, sepal width as the second feature, petal length as the third feature, and petal width as the fourth feature of the flowers. The average and the standard deviation of the second and third classes of this Iris data set are given in Table II. Using these statistical values and the assumption that the data distributions are Gaussian, the ROC curves, AUC, and maximum value of Ameans from classifiers based on decision thresholds are predicted and shown in Table IV and Fig. 5. In Fig. 5, the predicted ROC curves from the first feature until the fourth feature are respectively denoted by ROC1, ROC2, ROC3, and ROC4. The predicted points with the maximum value of Ameans associated with the ROC curves are respectively denoted by Ameans1, Ameans2, Ameans3, and Ameans4 in Fig. 5.

Further experiments are done using the Euclidean distance between the data and the average of the second and third classes. The average and standard deviation of the Euclidean distances are shown in Table III. The ROC curves, AUC, and maximum value of Ameans based on these data are predicted and shown in Table IV and Fig. 5. In Fig. 5, the predicted ROC curves obtained from the experiment based on the data sample distance to the average of the second and third classes are respectively denoted by ROCc2 and ROCc3. The predicted points having the maximum value of Ameans associated with the ROC curves are denoted respectively by Ameansc2 and Ameansc3 in Fig. 5.

The results in Table IV and Fig. 5 give us information about the discriminability of the features. The highest discriminability is obtained by using the fourth feature of the Iris data set, which has the lowest standard deviation within the classes. The second best discriminability is obtained using the third feature, which has the highest average difference between the classes.

TABLE II. The statistical value of the second and third class in Iris data set

Statistical Value                                First     Second    Third     Fourth
                                                 Feature   Feature   Feature   Feature
Average of the 2nd class                         5.94      2.77      4.26      1.33
Average of the 3rd class                         6.59      2.97      5.55      2.03
Average difference of the 2nd and 3rd class      0.65      0.20      1.29      0.70
Standard deviation of the 2nd class              0.52      0.31      0.47      0.20
Standard deviation of the 3rd class              0.64      0.32      0.55      0.27

TABLE III. The statistical value of the distance between the classes and the average of the classes

Distance                                   Average   Standard Deviation
2nd class to the average of 2nd class      0.71      0.34
3rd class to the average of 2nd class      1.74      0.70
2nd class to the average of 3rd class      1.70      0.60
3rd class to the average of 3rd class      0.82      0.45

TABLE IV. The Predicted AUC and Ameans of Fig. 5

Feature of Iris data set                   AUC       Maximum value of Ameans
First feature (sepal length)               78.47%    71.75%
Second feature (sepal width)               67.35%    62.59%
Third feature (petal length)               96.22%    89.74%
Fourth feature (petal width)               97.93%    93.20%
Distance to the mean of the 2nd class      90.86%    85.38%
Distance to the mean of the 3rd class      87.83%    80.13%

Figure 5. Experimental results using the Iris data set (x-axis: FP rate; curves ROC1 to ROC4, ROCc2, and ROCc3 with their maximum-Ameans points).
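As a rough cross-check of the AUC column of Table IV, the feature statistics of Table II can be plugged into a closed form for two Gaussian classes. The sketch below assumes the standard binormal identity $\text{AUC} = \Phi\left((m_P - m_N)/\sqrt{\sigma_N^2 + \sigma_P^2}\right)$, which is not stated in the paper, and treats the second class as negative and the third class as positive; the helper name `gaussian_auc` is hypothetical. Because the inputs are the rounded values of Table II, agreement with Table IV is only expected to within a fraction of a percentage point.

```python
from math import sqrt
from statistics import NormalDist


def gaussian_auc(m_neg, s_neg, m_pos, s_pos):
    """AUC of a threshold classifier on two Gaussian classes (binormal identity)."""
    return NormalDist().cdf((m_pos - m_neg) / sqrt(s_neg**2 + s_pos**2))


# (mean of 2nd class, std of 2nd class, mean of 3rd class, std of 3rd class),
# copied from Table II.
table_ii = {
    "sepal length": (5.94, 0.52, 6.59, 0.64),
    "sepal width": (2.77, 0.31, 2.97, 0.32),
    "petal length": (4.26, 0.47, 5.55, 0.55),
    "petal width": (1.33, 0.20, 2.03, 0.27),
}

predicted = {name: gaussian_auc(*stats) for name, stats in table_ii.items()}
for name, auc in sorted(predicted.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: predicted AUC = {auc:.2%}")
```

The ranking produced this way (petal width, petal length, sepal length, sepal width) follows the discriminability ordering discussed above.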
VI. Conclusions

Methods for predicting the ROC curve, AUC, and the maximum Ameans based on the sample distributions have been presented. These methods are shown to produce results similar to the experimental results.

The predicted performance measurements might be useful as tools for examining feature sample discriminability. This is because the measurements depend not only on the mean difference between classes, but also on the standard deviation within classes (if the feature samples are assumed to have Gaussian distributions). In the future, we will conduct further research on the feature discriminability examination.

References

[1] C. E. Metz, "Basic principles of ROC analysis," Seminars in Nuclear Medicine, vol. VIII, no. 4, 1978.
[2] I. K. Timotius and S. G. Miaou, "Arithmetic means of accuracies: a classifier performance measurement for imbalanced data set," Proc. of Int. Conf. on Audio, Language, and Image Processing, 2010.
[3] M. T. Coimbra and J. P. S. Cunha, "MPEG-7 visual descriptor - contribution for automated feature extractor in capsule endoscopy," IEEE Trans. on Circuits and Systems for Video Technology, vol. 16, no. 5, pp. 628-637, 2006.
[4] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, issue 8, pp. 861-874, 2006.
[5] F. Provost and T. Fawcett, "Analysis and visualization of classifier performance: comparison under imprecise class and cost distributions," Proc. of the Third Int. Conf. on Knowledge Discovery and Data Mining, pp. 43-48, 1997.
[6] C. X. Ling, J. Huang, and H. Zhang, "AUC: a statistically consistent and more discriminating measure than accuracy," Proc. of Int. Joint Conf. on Artificial Intelligence, 2003.
[7] P. Z. Peebles, Probability, Random Variables, and Random Signal Principles, 3rd ed., McGraw-Hill: Singapore, 1993.
[8] R. A. Fisher, "The use of multiple measurements in taxonomic problems," Annals of Eugenics, vol. 7, no. 2, pp. 179-188, 1936.