Microelectron. Reliab., Vol. 30, No. 6, pp. 1085-1090, 1990. 0026-2714/90 $3.00 + .00
Printed in Great Britain. © 1990 Pergamon Press plc

AN S-SHAPED SOFTWARE RELIABILITY GROWTH MODEL WITH TWO TYPES OF ERRORS

NISHI KAREER¹, P. K. KAPUR² and P. S. GROVER¹
¹Department of Computer Science and ²Department of Operational Research, University of Delhi, Delhi 110 007, India
(Received for publication 11 December 1989)
Abstract--An S-shaped software reliability growth model (SRGM) based on a non-homogeneous Poisson
process (NHPP) with two types of errors has been proposed. The errors have been classified depending
upon their severity. We have estimated the model parameters and obtained the optimum release policies
which minimize the cost subject to achieving a given level of reliability. Numerical results illustrating the
applicability of the proposed model are also presented.
1. INTRODUCTION

As software forms an important part of many critical missions such as space shuttles, and of important systems such as nuclear reactors and heart monitors, the reliable operation of these projects and systems depends critically on the reliable operation of their software components. Because these systems depend totally on the underlying software, the concept of software reliability has gained considerable importance over the years.

The life cycle (LC) of software involves a series of production activities and can generally be divided into four phases: design, coding, testing and operation/maintenance [1]. In spite of great advances in programming technology, there are many chances for errors to occur through human imperfection at every step. In other words, software can never be made error free.

Efficient management of the testing (manufacturer's end) and maintenance (user's end) phases of the software LC is very important. A delay in the release of software may cost a company a significant amount of money in penalties or lost revenue, but a premature release may also backfire, as it often means more cost in terms of fixes done in the field (user's end) and damage to the company's reputation. It is thus often desirable to estimate the attributes of software that give an idea of its reliability at any point of time in its LC. Several attempts have been made in the past to estimate reliability measures such as the initial error content, the error detection rate and the number of remaining errors at any time of the software LC, and a number of models have been proposed. Review articles by Shantikumar [2], Yamada and Osaki [3] and Goel [4] summarized most of these models, which can be classified into several categories [2-5]. One of these consists of software reliability growth models (SRGMs) based on a non-homogeneous Poisson process (NHPP). A brief overview of these models is presented in Section 2.

2. SOFTWARE RELIABILITY GROWTH MODELS

The general assumption in these models is that the correction of defects during development does not introduce any new defects, so that this ideal process of perfect debugging allows the reliability to increase throughout the testing process. The proposed model also falls in this category and is described in detail in Section 3. The simplest model is the exponential SRGM based on an NHPP given by Goel and Okumoto [6]. It has a mean value function given by

$$ m(t) = a(1 - e^{-bt}), \quad a > 0,\ b > 0, \tag{1} $$

where $a$ is the expected number of errors to be eventually detected and $b$ is the error detection rate per error at time $t$.

A modification of this model which takes care of different types of errors in the software depending on their severity is the modified exponential model developed by Yamada and Osaki [7]. This SRGM assumes two types of errors in the software: Type I errors are easy to detect, with $b_1$ as the error detection rate per Type I error, and Type II errors are difficult to detect, with $b_2$ as the detection rate per Type II error. The mean value function is

$$ m(t) = \sum_i p_i a (1 - \exp(-b_i t)), \tag{2} $$

where $p_i$ is the error content proportion of type $i$ errors,

$$ 0 < p_i < 1, \quad \sum_i p_i = 1, \quad 0 < b_2 < b_1 < 1, \quad i = 1, 2. $$

The same can be generalized to $k$ types of errors. A similar type of model has been proposed by Ohba [8] to analyze a failure detection process in module-structured software, and has been named the hyperexponential SRGM. This model has a mean value function

$$ m(t) = \sum_{i=1}^{k} a_i (1 - \exp(-b_i t)), \tag{3} $$
where $k$ is the number of clusters with similar characteristics, $a_i$ is the number of errors to be eventually detected in cluster $i$ and $b_i$ is the error detection rate for cluster $i$.

In contrast to exponential software reliability growth, S-shaped growth, first given by Ohba [8], is observed more often in real-life situations. The reasons for this are many. The testing process, which is assumed to be a failure detection phenomenon in the exponential SRGM, is actually a two-phase process consisting of failure detection and its eventual removal by isolation. The S-shaped SRGM takes care of the time taken to isolate and remove a fault, so the testing process is viewed correctly as a fault isolation process and not a failure detection process. To apply this model, it is therefore important that the data be fault isolation data (which include failure detection) and not failure detection data [8]. Fault isolation data are more accurate, as some faults may be removed during the fault isolation phase of another fault without detection by the test team (the test team is often different from the correction team). Such faults, although included in fault isolation data, will not be reported in failure detection data. The mean value function is given by

$$ m(t) = a[1 - (1 + bt)e^{-bt}], \quad a, b > 0, \tag{4} $$

where $a$ is the number of errors to be eventually detected and $b$ is the error detection rate.

These models help us to draw inferences about the cost-effectiveness and reliability of software systems. The cost of a software system is measured from the start of the testing time and includes the costs of testing, debugging before release and debugging after release. Software engineers have long stressed that a defect found early in the software LC can be repaired at much less expense than one found later in the LC. Therefore, to be sure that the fewest possible defects remain in the software at the start of the maintenance phase, the commonest method is to lengthen the testing phase, as it is cheaper to correct a defect here than if the same defect is detected after release.

The aim of software development then is to minimize this cost and still keep the reliability at an acceptable level. In other words, the cost of testing and debugging should be balanced against the increase in reliability, to produce a software system at minimal cost while simultaneously meeting the reliability objectives. Such policies have been named 'optimal software release policies'. They were first derived by Okumoto and Goel [9] for each criterion individually by using the exponential SRGM, and more recently by Yamada and Osaki [7].

3. MODIFIED S-SHAPED SRG MODEL

In this paper, we propose a modified S-shaped SRGM based on an NHPP, which reflects the real-life situation more closely by accounting for the types of errors in the software. In the absence of the suitable data required, we estimate the parameters of the model using Mishra's data [10] for the sake of illustration, and discuss the optimum release policies which would minimize cost subject to attaining a desired level of reliability. However, it may be pointed out that Mishra's data [10] fit the modified exponential model perfectly (also illustrated in this paper), because the failure detection phenomenon here is an error removal phenomenon in which the time required to debug an error is assumed to be negligible. We also apply the modified S-shaped SRGM to these data. The fit in this case is not expected to be as good as that of the modified exponential model, because of the absence of the additional data required on the debugging phase. However, it brings out the applicability of the proposed model clearly.

The mean value function of the modified S-shaped SRGM is given by

$$ m(t) = \sum_i p_i a [1 - (1 + b_i t)\exp(-b_i t)] = \sum_i m_i(t), \tag{5} $$

where $a$ is the number of errors to be eventually detected and

$$ m_i(t) = a p_i [1 - (1 + b_i t)\exp(-b_i t)], \quad i = 1, 2. $$

$p_i$ is the error content proportion of type $i$ errors,

$$ 0 < p_i < 1, \quad 0 < b_2 < b_1 < 1, \quad a > 0. $$

Moreover,

$$ \sum_i p_i = 1. $$

The model has a time-dependent error detection rate

$$ d(t) = \sum_i \left[ \frac{p_i b_i^2 \exp(-b_i t)}{\sum_j p_j \exp(-b_j t)(1 + b_j t)} \right] t. \tag{6} $$

The reliability $R(x/t)$ of software is the probability of its working error free in $(t, t + x)$, given that the last failure occurrence time is $t > 0$, $x > 0$:

$$ R(x/t) = \exp[-(m(t + x) - m(t))] = \exp\Big[-\sum_i p_i a \big((1 + b_i t)\exp(-b_i t) - (1 + b_i(t + x))\exp[-b_i(t + x)]\big)\Big]. \tag{7} $$

4. PARAMETER ESTIMATION

The model parameters $a$, $b_1$, $b_2$ are estimated as follows. We use the method proposed by Ohba [8]. We assume that the data are available in the form of pairs $(t_i, z_i)$, $i = 1, \ldots, n$, where $z_i$ is the cumulative number of errors detected up to $t_i$.
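As a minimal sketch of the quantities defined in Section 3, the mean value function (5), the detection rate (6) and the reliability (7) can be evaluated as below. The function names and the parameter values `A`, `P`, `B` are hypothetical, chosen only for illustration; they are not estimates from the paper.

```python
import math

def mean_value(t, a, p, b):
    """Modified S-shaped mean value function m(t) of eq. (5)."""
    return sum(a * p_i * (1.0 - (1.0 + b_i * t) * math.exp(-b_i * t))
               for p_i, b_i in zip(p, b))

def detection_rate(t, p, b):
    """Time-dependent error detection rate d(t) of eq. (6)."""
    num = sum(p_i * b_i ** 2 * t * math.exp(-b_i * t) for p_i, b_i in zip(p, b))
    den = sum(p_i * (1.0 + b_i * t) * math.exp(-b_i * t) for p_i, b_i in zip(p, b))
    return num / den

def reliability(x, t, a, p, b):
    """R(x/t) = exp(-(m(t + x) - m(t))) of eq. (7)."""
    return math.exp(-(mean_value(t + x, a, p, b) - mean_value(t, a, p, b)))

# Hypothetical parameters (assumption): total error content, type proportions
# (summing to 1) and detection rates with 0 < b2 < b1 < 1.
A = 100.0
P = (0.6, 0.4)
B = (0.01, 0.002)

if __name__ == "__main__":
    for t in (0.0, 100.0, 500.0, 2000.0):
        print(t, mean_value(t, A, P, B), detection_rate(t, P, B),
              reliability(50.0, t, A, P, B))
```

Under these assumptions, m(t) starts at zero and approaches a as t grows, and d(t) equals m'(t)/(a - m(t)), which is one way to sanity-check an implementation.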
The joint probability that the data pairs $(t_i, z_i)$, $i = 1, \ldots, n$, are observed is

$$ \Pr[m(0) = 0, m(t_1) = z_1, \ldots, m(t_n) = z_n] = \prod_{i=1}^{n} \frac{[m(t_i) - m(t_{i-1})]^{(z_i - z_{i-1})}}{(z_i - z_{i-1})!} \exp\{-[m(t_i) - m(t_{i-1})]\}, \tag{8} $$

where $t_0 = 0$, $z_0 = 0$, $m(t) = \sum_i m_i(t)$ and

$$ m_i(t) = p_i a (1 - (1 + b_i t)\exp(-b_i t)), \quad i = 1, 2. $$

This joint probability function may be used as the likelihood function for estimating the model parameters. The estimates can be found by maximizing the log likelihood $L$:

$$ L = \sum_i (z_i - z_{i-1}) \ln[m(t_i) - m(t_{i-1})] - \sum_i \ln (z_i - z_{i-1})! - m(t_n). \tag{9} $$

Taking the derivatives of $L$ with respect to $a$, $b_1$ and $b_2$, and equating them to zero, we obtain the following equations:

$$ a = z_n \Big/ \sum_i p_i [1 - (1 + b_i t_n)\exp(-b_i t_n)], $$

$$ a b_k t_n^2 \exp(-b_k t_n) = \sum_i \frac{(z_i - z_{i-1})\,(b_k t_i^2 \exp(-b_k t_i) - b_k t_{i-1}^2 \exp(-b_k t_{i-1}))}{\sum_j p_j [(1 + b_j t_{i-1})\exp(-b_j t_{i-1}) - (1 + b_j t_i)\exp(-b_j t_i)]}, \quad k = 1, 2. \tag{10} $$

Solving the above three equations numerically gives the maximum likelihood estimates of $a$, $b_1$ and $b_2$.

Before proceeding further, we first use the modified exponential model on Mishra's data, which are provided for each type of error, described as major and minor. However, we point out here that Mishra used the exponential model of Goel and Okumoto [6] independently on the two types of errors and obtained their estimates, and also the estimates of total error content and the error detection rate for the combined case, which were unfortunately incorrect. We give below the correct estimates and then apply the modified exponential model to the same data to illustrate the utility of the model.

Exponential SRGM:

                            EST a       EST b
    (1) Major errors:       163.813     0.28759 x 10^-3
    (2) Minor errors:       315.551     0.25756 x 10^-3
    (3) Combined data:      478.2       0.2686 x 10^-3      MSE = 1148

(4) Modified exponential SRGM:

    EST a   = 570.0674
    EST b_1 = 0.3069 x 10^-3
    EST b_2 = 0.8242 x 10^-4
    p_1     = 64.07 x 10^-2
    p_2     = 35.93 x 10^-2
    MSE     = 1148

where EST = estimated; MSE = mean square error.

The mean value functions for cases (3) and (4) have been fitted on the actual data in Figs 1 and 2. We may notice from (3) and (4) that the MSE in the two cases is the same and the parameter estimates give an equally good fit (as shown in Figs 1 and 2). Nevertheless, the exponential model as applied independently to the data of the two types of errors does not reflect the real-life phenomenon exactly. It gives $b_2 > b_1$, i.e. the detection rate of Type II errors (which are difficult to detect) is given as greater than the detection rate of Type I errors (which are easy to detect), whereas it seems more plausible to assume $b_2 < b_1$, as is reflected in the modified exponential model.
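The estimation procedure of Section 4 can be sketched as follows: the log likelihood (9), the closed-form expression for a from the first equation of (10), and a coarse search over (b_1, b_2). The grid search is a stand-in chosen for simplicity, not the numerical method used in the paper, and the data `TIMES`, `COUNTS` and proportions `P` are hypothetical values invented for illustration.

```python
import math

def mean_value(t, a, p, b):
    """Modified S-shaped mean value function m(t) of eq. (5)."""
    return sum(a * p_i * (1.0 - (1.0 + b_i * t) * math.exp(-b_i * t))
               for p_i, b_i in zip(p, b))

def log_likelihood(times, counts, a, p, b):
    """Log likelihood L of eq. (9) for cumulative counts z_i observed at t_i."""
    total, t_prev, z_prev = 0.0, 0.0, 0
    for t_i, z_i in zip(times, counts):
        d = z_i - z_prev                      # errors detected in (t_{i-1}, t_i]
        dm = mean_value(t_i, a, p, b) - mean_value(t_prev, a, p, b)
        total += d * math.log(dm) - math.lgamma(d + 1)   # lgamma(d + 1) = ln d!
        t_prev, z_prev = t_i, z_i
    return total - mean_value(times[-1], a, p, b)

def a_hat(times, counts, p, b):
    """First equation of (10): a = z_n / sum_i p_i [1 - (1 + b_i t_n) exp(-b_i t_n)]."""
    t_n, z_n = times[-1], counts[-1]
    return z_n / sum(p_i * (1.0 - (1.0 + b_i * t_n) * math.exp(-b_i * t_n))
                     for p_i, b_i in zip(p, b))

def grid_search(times, counts, p, grid):
    """Coarse search over 0 < b2 < b1 (assumption: replaces solving (10))."""
    best = None
    for b1 in grid:
        for b2 in grid:
            if not 0.0 < b2 < b1:
                continue
            a = a_hat(times, counts, p, (b1, b2))
            ll = log_likelihood(times, counts, a, p, (b1, b2))
            if best is None or ll > best[0]:
                best = (ll, a, b1, b2)
    return best

# Hypothetical data (assumption): observation times and cumulative error counts.
TIMES = [100.0, 200.0, 400.0, 800.0, 1600.0]
COUNTS = [3, 10, 28, 55, 80]
P = (0.6, 0.4)
GRID = [i * 2e-4 for i in range(1, 26)]

if __name__ == "__main__":
    print(grid_search(TIMES, COUNTS, P, GRID))
```

For fixed b_1 and b_2, the value returned by `a_hat` makes m(t_n) equal to z_n, which is exactly what the first equation of (10) states; a finer search or a root finder applied to (10) would refine the rates.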
Fig. 1. Exponential curve fitted on actual data for the combined case of major and minor errors. (Plot of actual and estimated errors against testing time.)

Fig. 2. Modified exponential curve fitted on the actual data. (Plot of actual and estimated errors against testing time.)
For the same data, by solving (10) numerically, our modified S-shaped SRG model parameters are

    a   = 307
    b_1 = 0.18057 x 10^-2,   b_2 = 0.5919 x 10^-3
    p_1 = 64.07 x 10^-2,     p_2 = 35.93 x 10^-2
    MSE = 6234.

The mean value function (5) fitted on the actual data is shown in Fig. 3.

Fig. 3. Modified S-shaped curve fitted on the actual data. (Plot of errors against testing time.)

5. OPTIMAL RELEASE PROBLEM AND POLICIES

Let $C(T)$ represent the cost of the software as a function of testing time (as mentioned earlier, the cost of software is measured from the start of the testing time) and let $R_0$ represent the desired level of reliability to be achieved. The optimal release problem can then be stated as

    minimize C(T)
    s.t. R(x/T) ≥ R_0,

where $T \ge 0$ and $0 < R_0 < 1$. The objective is to find an optimal release time $T = T^*$ for the software.

As the cost of software starts from the testing phase, we need to consider the cost only for the testing and maintenance phases. Let

    c_{1i} = cost of type i error removal during testing
    c_{2i} = cost of type i error removal during operation
    c_3 = cost of testing per unit time
    T_{LC} = length of LC of the software.

Then

$$ C(T) = \sum_i [c_{1i} m_i(T) + c_{2i}(m_i(T_{LC}) - m_i(T))] + c_3 T. $$

Differentiating $C(T)$ w.r.t. $T$ and equating it to zero, we obtain

$$ \sum_i [c_{2i} - c_{1i}]\, m_i'(T) = c_3. \tag{11} $$

Let us consider that

$$ m'(t) = \sum_i m_i'(t), $$

where

$$ m_i'(t) = a p_i b_i^2 t \exp(-b_i t), \quad i = 1, 2, $$

$$ \Rightarrow m_i''(t) = a p_i b_i^2 \exp(-b_i t)(1 - b_i t), \quad i = 1, 2. $$

It can be easily shown that

    m_i''(t) > 0 for 0 < t < 1/b_i,
    m_i''(t) < 0 for 1/b_i < t < ∞,
    1/b_i is a point of maxima of m_i'(t),
    m_i'(t) is a concave function.

Therefore, as $m'(t)$ is a sum of two concave functions, it is also a concave function.

$$ \text{L.H.S. (11)} = (c_{21} - c_{11})\, m_1'(T) + (c_{22} - c_{12})\, m_2'(T) = l\, m_1'(T) + m\, m_2'(T); $$

$l$, $m$ are positive constants as $c_{2i} > c_{1i}$, $i = 1, 2$. As $m_i'(T)$ is a concave function, it follows that L.H.S. (11) is also a concave function, with its point of maxima as $t_{max}$.

Hence, two cases of L.H.S. (11) arise:

Case (I):

$$ \sum_i (c_{2i} - c_{1i})\, m_i'(t_{max}) > c_3. $$

There then exist two positive points $T_a$, $T_b$ satisfying (11), $0 < T_a < T_b < \infty$, such that

    C''(T) < 0 at T = T_a,   C''(T) > 0 at T = T_b.

Comparing $C(0)$ and $C(T_b)$, three cases of cost arise:

(i) $C(T_b) > C(0)$. There then exists a positive and unique point $T_c$ such that $C(T_c) = C(T_b)$.
(ii) $C(T_b) < C(0)$. There then exist two positive points $T_e$ and $T_f$ such that $C(T_e) = C(T_f) = C(0)$.
(iii) $C(T_b) = C(0)$.

Case (II):

$$ \sum_i (c_{2i} - c_{1i})\, m_i'(t_{max}) \le c_3. $$

Then, for all $T$, $C(T)$ increases.

Taking the log of the reliability function, we obtain

$$ \ln R(x/t) = R_1 + R_2, $$

where

$$ R_i = -p_i a \{(1 + b_i t)\exp(-b_i t) - (1 + b_i(t + x))\exp[-b_i(t + x)]\}. $$

Differentiating $R_i$ with respect to $t$ and equating to zero, we obtain

$$ p_i a b_i^2 \exp(-b_i t)[t(1 - \exp(-b_i x)) - x \exp(-b_i x)] = 0, $$

which gives

$$ t_{x_i} = (x \exp(-b_i x))/(1 - \exp(-b_i x)), \quad t_{x_i} > 0. $$
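As a numerical check of the stationary point just derived, the sketch below evaluates t_{x_i} for each error type and compares R_i at that point with nearby values. It uses the modified S-shaped estimates quoted in the text and x = 100, the value used in the paper's numerical example; the function names are hypothetical.

```python
import math

def t_x(x, b):
    """Stationary point of R_i: t = x exp(-b x) / (1 - exp(-b x))."""
    return x * math.exp(-b * x) / (1.0 - math.exp(-b * x))

def log_reliability_term(t, x, a, p, b):
    """R_i = -p a {(1 + b t) e^{-b t} - (1 + b (t + x)) e^{-b (t + x)}}."""
    return -p * a * ((1.0 + b * t) * math.exp(-b * t)
                     - (1.0 + b * (t + x)) * math.exp(-b * (t + x)))

# Estimates quoted in the text for the modified S-shaped model.
A = 307.0
P = (0.6407, 0.3593)
B = (0.18057e-2, 0.5919e-3)
X = 100.0  # length of the error-free interval, as in the numerical example

if __name__ == "__main__":
    for p_i, b_i in zip(P, B):
        t_star = t_x(X, b_i)
        print(t_star, log_reliability_term(t_star, X, A, p_i, b_i))
```

Each R_i is smaller at t_{x_i} than at points on either side of it, which is the behaviour the argument relies on.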
It can be easily shown that

    dR_i/dt < 0 for 0 < t < t_{x_i},
    dR_i/dt > 0 for t_{x_i} < t < ∞,

which implies that $t_{x_i}$ is a point of minimum and hence $R_i$ is a convex function.

Therefore $\ln R(x/t)$ (being the sum of two convex functions) is a convex function, and hence $R(x/t)$ is a convex function, with $t_{min}$ as its point of minima.

The cases of reliability then are

(i) $R(x/t_{min}) < R(x/0) < R_0 < 1$. There then exists a unique point $T_0 > 0$ such that $R(x/T_0) = R_0$.
(ii) $0 < R(x/t_{min}) < R_0 < R(x/0)$. There then exist two points $T_1$ and $T_2$ such that $R(x/T_1) = R(x/T_2) = R_0$.
(iii) $R(x/0) = R_0$. There then exists a unique point $T_0 > 0$ such that $R(x/T_0) = R_0$.
(iv) $0 < R_0 \le R(x/t_{min})$. The reliability here is achieved at all points.

The cost reliability optimal software release problem for a modified S-shaped error removal phenomenon can then be summarized as follows.

Theorem. Let $c_{2i} > c_{1i} > 0$, $c_3 > 0$, $x \ge 0$ and $0 < R_0 < 1$, $i = 1, 2$. Let $T^*$ = optimal release time.

Case I. Assume

$$ \sum_i (c_{2i} - c_{1i})\, m_i'(t_{max}) > c_3. $$

(1) (a) If $C(T_b) > C(0)$ and $R(x/t_{min}) < R(x/0) < R_0 < 1$, then
        (i) $T_0 > T_b \Rightarrow T^* = T_0$
        (ii) $T_c < T_0 \le T_b \Rightarrow T^* = T_b$
        (iii) $0 < T_0 \le T_c \Rightarrow T^* = T_0$.
    (b) If $C(T_b) > C(0)$ and $0 < R(x/t_{min}) < R_0 < R(x/0)$, then $T^* = 0$, as at zero the reliability is greater than $R_0$ and the cost is also minimum.
    (c) If $C(T_b) > C(0)$ and $R(x/0) = R_0$, then $T^* = 0$. Although the reliability level is also attained at $T_0$, the cost is minimum at zero.
    (d) If $C(T_b) > C(0)$ and $0 < R_0 \le R(x/t_{min})$, then the required reliability is attained at all points but the cost is minimum at zero; therefore $T^* = 0$.

(2) (a) If $C(T_b) < C(0)$ and $R(x/t_{min}) < R(x/0) < R_0 < 1$, then $T^* = \max(T_b, T_0)$.
    (b) If $C(T_b) < C(0)$ and $0 < R(x/t_{min}) < R_0 < R(x/0)$, then
        (i) $T_2 \ge T_f \Rightarrow T^* = 0$
        (ii) $T_b < T_2 < T_f \Rightarrow T^* = T_2$
        (iii) $T_2 \le T_b \Rightarrow T^* = T_b$.
    (c) If $C(T_b) < C(0)$ and $R(x/0) = R_0$, then
        (i) $T_0 \ge T_f \Rightarrow T^* = 0$
        (ii) $T_b < T_0 < T_f \Rightarrow T^* = T_0$
        (iii) $T_e < T_0 \le T_b \Rightarrow T^* = T_b$
        (iv) $0 < T_0 \le T_e \Rightarrow T^* = 0$.
    (d) If $C(T_b) < C(0)$ and $0 < R_0 < R(x/t_{min})$, then $T^* = T_b$.

(3) (a) If $C(T_b) = C(0)$ and $R(x/t_{min}) < R(x/0) < R_0 < 1$, then $T^* = \max(T_b, T_0)$.
    (b) If $C(T_b) = C(0)$ and $0 < R(x/t_{min}) < R_0 < R(x/0)$, then $T^* = 0$.
    (c) If $C(T_b) = C(0)$ and $R(x/0) = R_0$, then $T^* = 0$.
    (d) If $C(T_b) = C(0)$ and $0 < R_0 \le R(x/t_{min})$, then reliability is attained at all points, and
        (i) $R(x/0) \ge R(x/T_b) \Rightarrow T^* = 0$
        (ii) $R(x/0) < R(x/T_b) \Rightarrow T^* = T_b$.

Case II. Assume

$$ \sum_i (c_{2i} - c_{1i})\, m_i'(t_{max}) \le c_3. $$

Here the cost increases at all points.

    (i) If $R(x/0) < R_0 < 1$, then $T^* = T_0$.
    (ii) If $0 < R_0 < R(x/0)$, then $T^* = 0$.

Numerical example

The maximum likelihood estimates of the model parameters $a$, $b_1$ and $b_2$ are

    a   = 307
    b_1 = 0.18057 x 10^-2
    b_2 = 0.5919 x 10^-3.

Suppose that

    c_{11} = 50, c_{12} = 90, c_{21} = 1000, c_{22} = 1200, c_3 = 100.

Let $x = 100$. Then $R(x/0) = 0.04825$. If $R_0 = 0.6$ and $T_{LC} = 7500$, then the cost reliability optimal software release problem to be solved is

$$ \text{Minimize } C(T) = 50 m_1(T) + 90 m_2(T) + 1000(m_1(7500) - m_1(T)) + 1200(m_2(7500) - m_2(T)) + 100T, $$

subject to $R(100/T) \ge 0.6$, $T \ge 0$.

Fig. 4. Reliability curve during the LC of the software (numerical example). (Plot of software reliability against testing time.)
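The numerical example can be reproduced approximately with the sketch below, which evaluates R(x/0), solves R(x/T) = R_0 for T_0 by bisection, and evaluates the cost function C(T). The parameter values are those quoted in the text (with p_1 and p_2 as estimated for the modified S-shaped model); the bisection bracket [2000, 7500] is an assumption, chosen to lie in the region where the reliability is increasing in T, and the function names are hypothetical.

```python
import math

A = 307.0
P = (0.6407, 0.3593)
B = (0.18057e-2, 0.5919e-3)
C1 = (50.0, 90.0)       # c_1i: removal cost during testing
C2 = (1000.0, 1200.0)   # c_2i: removal cost during operation
C3 = 100.0              # testing cost per unit time
T_LC = 7500.0
X = 100.0
R0 = 0.6

def m_i(t, i):
    """Mean value function of type i errors, m_i(t) in eq. (5)."""
    return A * P[i] * (1.0 - (1.0 + B[i] * t) * math.exp(-B[i] * t))

def m(t):
    return m_i(t, 0) + m_i(t, 1)

def reliability(x, t):
    """R(x/t) of eq. (7)."""
    return math.exp(-(m(t + x) - m(t)))

def cost(T):
    """C(T) = sum_i [c_1i m_i(T) + c_2i (m_i(T_LC) - m_i(T))] + c_3 T."""
    return sum(C1[i] * m_i(T, i) + C2[i] * (m_i(T_LC, i) - m_i(T, i))
               for i in range(2)) + C3 * T

def solve_t0(lo=2000.0, hi=T_LC, tol=1e-6):
    """Bisection for R(X/T) = R0 on a bracket where R is increasing (assumption)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if reliability(X, mid) < R0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print("R(x/0) =", reliability(X, 0.0))
    print("T0     =", solve_t0())
    print("C(0)   =", cost(0.0))
```

The computed R(x/0) agrees with the value 0.04825 given in the text, and the bisection returns a T_0 close to the reported 6580. Evaluating `cost` at other release times allows the cost cases of the theorem to be compared directly.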
As $0 < R(x/0) < R_0 < 1$, $T_b = 1180$, $T_0 = 6580$ and $C(T_b) < C(0)$ (also obvious from Fig. 5a). Hence from the theorem we have

$$ T^* = \max(T_b, T_0) = \max(1180, 6580) = 6580. $$

Figures 4, 5a and 5b illustrate this cost reliability optimal software release problem.

Fig. 5(a). Total cost curve during the LC of the software (numerical example). (Plot of total expected software cost against testing time.)

Fig. 5(b). Magnified cost curve for the range 0-1500 units of testing time (numerical example). (Plot of total expected software cost against testing time.)

It is important for the decision-maker to decide on the optimum time for the end of testing and the beginning of the operation phase. The aim, as mentioned earlier, should be to strike a balance between the software cost and its reliability. As is clear from the above example, if the aim were only to minimize cost, then the optimal release time would have been $T_b = 1180$ (Fig. 5b), but then from Fig. 4, this decision would have resulted in poor-quality software with a very low reliability of working error free for $x$ (= 100) units of time after release. This example helps to demonstrate the importance of considering the cost and reliability criteria together in taking a decision about the release time of the software system.

REFERENCES

1. Conte, Dunsmore and Shen, Software Engineering: Metrics and Models. Benjamin/Cummings, Menlo Park, CA (1986).
2. J. G. Shantikumar, Software reliability models: a review. Microelectron. Reliab. 23(5), 903-943 (1983).
3. S. Yamada and S. Osaki, Reliability growth models for hardware and software systems based on non-homogeneous Poisson process: a survey. Microelectron. Reliab. 23(1), 91-122 (1983).
4. A. L. Goel, Software reliability models: assumptions, limitations, and applicability. IEEE Trans. Software Engng SE-11(12) (1985).
5. J. Musa and K. Okumoto, Software Reliability: Measurement, Prediction, Application. McGraw-Hill, New York (1987).
6. A. L. Goel and K. Okumoto, Time dependent error detection rate model for software reliability and other performance measures. IEEE Trans. Reliab. R-28(3), 206-211 (1979).
7. S. Yamada and S. Osaki, Optimal software release policies with simultaneous cost and reliability requirements. Eur. J. Operational Res. 31, 46-51 (1987).
8. M. Ohba, Software reliability analysis models. IBM J. Res. Develop. 28(4), 428-443 (1984).
9. K. Okumoto and A. L. Goel, Optimum release time for software systems based on reliability and cost criteria. J. Systems Software 1, 315-318 (1980).
10. P. N. Mishra, Software reliability analysis. IBM Systems J. 22(3), 262-269 (1983).