Integrating Testing With Reliability
SUMMARY
Software testing and reliability activities are integrated to demonstrate how the two interact in
achieving testing efficiency and the reliability that results from the tests.
Integrating means modeling the execution of a variety of tests on a directed graph representation of an
example program. A complexity metric is used to construct the nodes, edges, and paths of the example
program. Models are developed to represent the efficiency and achieved reliability of black box and white
box tests. Evaluations are made of path, independent path, node, program construct, and random tests
to ascertain which, if any, is superior with respect to efficiency and reliability. Overall, path testing has
the edge in test efficiency. The results depend on the nature of the directed graph in relation to the type
of test. Although there is no dominant method, in most cases the tests that provide detailed coverage are
better. For example, path testing discovers more faults than independent path testing. Predictions are
made of the reliability and fault correction that results from implementing various test strategies. It is
believed that these methods can be used by researchers and practitioners to evaluate the efficiency and
reliability of other programs.
KEY WORDS: test efficiency; software reliability; modeling efficiency and reliability
1. INTRODUCTION
Software is a complex intellectual product. Inevitably, some errors are made during requirements
formulation as well as during designing, coding, and testing the product. State-of-the-practice software
development processes to achieve high-quality software include measures that are intended
to discover and correct faults resulting from these errors, including reviews, audits, screening by
language-dependent tools, and several levels of tests. Managing these errors involves describing,
classifying, and modeling the effects of the remaining faults in the delivered product and thereby
helping to reduce their number and criticality [1].
One approach to achieving high-quality software is to investigate the relationship between testing
and reliability. Thus, the problem that this research addresses is the comprehensive integration of
testing and reliability methodologies. Although other researchers have addressed bits and pieces of
the relationship between testing and reliability, it is believed that this is the first research to integrate
testing efficiency, the reliability resulting from tests, modeling the execution of tests with directed
graphs, using complexity metrics to represent the graphs, and evaluations of path, independent path,
node, random node, white box, and black box tests.
One of the reasons for advocating the integration of testing with reliability is that, as recommended
by Hamlet [2], the risk of using software can be assessed based on reliability information. He states
that the primary goal of testing should be to measure the reliability of tested software. Therefore,
it is undesirable to consider testing and reliability prediction as disjoint activities.
When integrating testing and reliability, it is important to know when there has been enough
testing to achieve reliability goals. Thus, determining when to stop a test is an important management
decision. Several stopping criteria have been proposed, including the probability that the software
has a desired reliability and the expected cost of remaining faults [3]. The probabilities associated
with path and node testing in a directed graph are used to estimate how closely the desired reliability
of 1.0 can be approached. To address the cost issue, the cost of remaining faults is estimated
explicitly in monetary units and implicitly by the number of remaining faults compared with the
total number of faults in the directed graph of a program.
Given that it cannot be shown that there are no more errors in the program, use heuristic arguments
based on thoroughness and sophistication of testing effort and trends in the resulting discovery
of faults to argue the plausibility of the lower risk of remaining faults [4]. The progress in fault
discovery and removal is used as a heuristic metric for judging when testing is ‘complete’. At each
stage of testing, reliability is estimated to gauge the efficiency of the various testing methods: path,
independent path, random path, node, and random node.
A further complication involves the dynamic nature of programs. If a failure occurs during
preliminary testing and the code is changed, the software may now work for a test case that did
not work previously. But the code’s behavior on preliminary testing can no longer be guaranteed.
To account for this possibility, testing should be restarted. The expense of doing this is often
prohibitive [6]. It would be possible to model this effect, but at the cost of unmanageable model
complexity engendered by restarting the testing. It appears that this effect would best be modeled
by simulation.
The analysis starts with the notations that are used in the integrated testing and reliability
approach to achieving high-quality software. Refer to these notations when reading the equations
and analyses.
1.2.2.2. Probabilities.
p( j): probability of traversing path j;
p(n): probability of traversing node n.
1.2.2.4. Reliabilities.
R_n: empirical reliability at node n prior to fault removal;
R_p: empirical reliability of the program prior to fault removal;
U_n: empirical unreliability at node n prior to fault removal;
U(j): empirical unreliability on path j prior to fault removal;
R(j): empirical reliability of path j prior to fault removal;
R(c, k): empirical reliability achieved by testing construct c during test k, after fault removal;
r_e(n): empirical number of remaining faults at node n prior to fault removal.
2. TEST STRATEGIES
There are two major types of tests, white box and black box, each with its own type of test
case [7]:
White box testing is based on the knowledge about the internal structure of the software under test
(e.g. knowledge of the structure of decision statements). The adequacy of test cases is assessed
in terms of the level of coverage of the structure they reach (e.g. comprehensiveness of covering
nodes, edges, and paths in a directed graph) [8].
White box test case: A set of test inputs, execution conditions, and expected results developed for
a particular objective, such as exercising a particular program path or verifying compliance with a
specific requirement. For example, exercise particular program paths with the objective of achieving
high reliability by discovering multiple faults on these paths.
In black box testing, it may be easier to derive tests at higher levels of abstraction. Information
about the final implementation is then introduced in stages, so that the additional tests required by
increased knowledge of the structure arrive in small, manageable amounts; this greatly simplifies
structural, or white box, testing. However, it is not clear whether black box testing (e.g. testing
If Then Else statements) preceding or following white box testing (e.g. identifying If Then Else
paths) affects test effectiveness.
Black box test case: Specifications of inputs, predicted results, and a set of execution conditions
for a test item. In addition, because only the functionality of the software is of concern in black box
testing, this testing method emphasizes executing functions and examining their input and output
data [9]. For example, specify functions to force the execution of program constructs (e.g. While
Do) that are expected to cause an entire set of faults to be encountered and removed. No inputs
and outputs are specified because the example C++ program executes paths independently of its
inputs; path execution depends only on function probabilities. The inputs specify parameters and
variables used in program computations, not path execution probabilities, and the program produces
a standard output that depends only on the function probabilities.
A variant of black box testing captures the values of and changes in variables during a test. In
regression testing, for example, this approach can be used to determine whether a program, modified
by removing faults, behaves correctly [10]. However, in this research, rather than observing variable
changes, black box testing is conducted by observing the results of executing decision statements
and determining whether the correct decision is made.
3. TESTING PROCESS
Four types of testing are used, as described below. Path testing involves visiting both nodes and
paths in testing a program, whereas node testing involves only testing nodes. For example, in
testing an If Then Else construct, the If Then and Else components are visited in path testing,
whereas in node testing, only the If component is visited. Recognize the limitations of using
a directed graph for the purpose of achieving complete test coverage. For example, although it
is possible to represent initial conditions and boundary conditions [11] in a directed graph, the
amount of detail involved could make its use unwieldy. Therefore, it is better to represent only
the decision and sequence constructs in the graph. However, this is not a significant limitation
because the decision constructs account for the majority of complexity in most programs and high
complexity leads to low reliability. For illustrative purposes, a short program is used in Figure 1. This
program may appear to be simple. In fact, it is complex because of the iterative decision constructs.
Of course, only short programs are amenable to manual analysis. However, McCabe and associates
developed tools for converting program language representations to directed graphs for large
programs [12].
The following is an outline of the characteristics of the various testing schemes that were consid-
ered. First, identify program constructs: If Then, If Then Else, While Do, and Sequence in the
program to be tested. Then perform the following white box tests.
In path testing, it is desired to distinguish the independent paths from the non-independent paths.
Therefore, as the McCabe complexity metric [13] represents the number of independent paths to
test, faults are randomly planted at nodes of a directed graph that is constructed with edges and
nodes based on this metric. This process provides a random number of faults that are encountered
as each path is traversed. Note that in path testing the selection of paths is pre-determined.
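To make the fault-planting step concrete, the following minimal C++ sketch seeds a random number of faults at each of the eight nodes of the example graph. The generator, seed, and per-node fault range are illustrative assumptions, not the exact procedure used to produce Figure 2.

    #include <cstdio>
    #include <random>
    #include <vector>

    int main() {
        const int numNodes = 8;                           // nodes n = 1..8, as in Figure 1
        std::mt19937 gen(42);                             // fixed seed so the run is repeatable
        std::uniform_int_distribution<int> faults(1, 7);  // assumed per-node fault range

        std::vector<int> f(numNodes + 1, 0);              // f[n] = faults planted at node n
        int nf = 0;                                       // n_f: total faults in the program
        for (int n = 1; n <= numNodes; ++n) {
            f[n] = faults(gen);
            nf += f[n];
            std::printf("node %d: %d faults\n", n, f[n]);
        }
        std::printf("total faults n_f = %d\n", nf);
        return 0;
    }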
[Figure 1. Directed graph of the example program: nodes n = 1, ..., 8, edges i = 1, ..., 10, and the program constructs (If Then Else, While Do, If Then, and sequence) from the start node to node n = 8.]
As opposed to path testing, which uses pre-determined paths, random path testing produces a random
selection of paths. Thus, using the directed graph based on the McCabe metric, a random selection
of path execution sequences and the same random distribution of faults at nodes as in path testing,
a different sequence of fault encounters at the nodes will occur, compared with path testing.
Using the directed graph based on the McCabe metric and the same distribution of faults as before,
node testing randomly encounters faults as only the nodes are visited.
Using the directed graph based on the McCabe metric and a different random distribution of faults
than is used in the other tests, random node testing encounters a different set of faults, compared
with node testing. A different random distribution of faults is used because otherwise the same
result would be achieved as in node testing.
After the four types of white box tests have been conducted, perform the following steps:
Conduct black box testing: force function execution and observe resulting fault encounters and
removals [9].
Conduct white box testing: observe the response of a program to path and node testing [9].
Make reliability predictions using Schneidewind single parameter model (SSPM) [14], with
randomly generated fault data. Faults are generated randomly so that there will be no bias in the
fault distribution. Therefore, the fault distribution is not intended to be representative of a particular
environment. Rather, it is designed to be generic.
Predict the number of remaining faults and reliability with SSPM and compare with the empirical
test values.
Compare reliability predictions with results of black box and white box testing.
4. ASSUMPTIONS
Recognize that the following assumptions impose limitations on the integrated testing and relia-
bility approach. However, all models are abstractions of reality. Therefore, the assumptions do not
significantly detract from addressing the research questions below.
When faults are discovered and removed, no new faults are introduced. This assumption overstates
the reliability resulting from the various testing methods, but its effect will be experienced by all
the testing methods.
The probability of traversing a node is independent of the probability of traversing other nodes.
This is the case in the directed graph, which is used in the example program. It would not be the
case in all programs.
No faults are removed until a given test is complete. Therefore, as path testing visits some of
the same nodes on different tests, the expected number of faults encountered can exceed the actual
number of faults.
5. RESEARCH QUESTIONS
The following questions seem to be important in evaluating the efficiency of test methods and their
integration with reliability:
1. Does an independent path testing strategy lead to higher efficiency and reliability than path
and random path testing?
2. Does a node testing strategy lead to higher reliability and efficiency than random node testing?
3. Does the McCabe complexity metric [13] assist in organizing software tests?
4. Which of the testing strategies yields the highest reliability prior to or after fault removal?
5. Do reliability metrics, using SSPM, produce more accurate reliability assessments than node
and random node testing?
6. Which testing method, white box or black box, provides more efficient testing and higher
reliability?
The following equations are used to implement the testing strategies and reliability predictions.
The expected number of faults encountered at node n is given by the following equation:

E(n) = p(n) f(n)    (1)

where p(n) is determined by the branch probabilities in Figure 1 and f(n) is determined by
randomly generating the number of faults at each node.
The probability of traversing path j is given by the following equation:

p(j) = \prod_{n=1}^{n_{n_j}} p(n)    (2)

where n_{n_j} is the number of nodes on path j. Then, using Equations (1) and (2) yields the
following equation for the expected number of faults encountered on path j:

E(j) = \left[ \sum_{n=1}^{n_{n_j}} p(n) f(n) \right] p(j)    (3)

Furthermore, summing Equation (3) over the number of paths n_j in a program yields the expected
number of faults in a program, based on path testing, in the following equation:

E_p = \sum_{j=1}^{n_j} E(j)    (4)

According to Equation (1), the empirical reliability at node n prior to fault removal is given in the
following equation:

R_n = 1 - \frac{p(n) f(n)}{\sum_{n=1}^{n_n} p(n) f(n)}    (5)
Now, the empirical unreliability at node n, according to Equation (5), is given by the following
equation:

U_n = 1 - R_n    (6)

Then, using Equations (5) and (6), the unreliability on path j prior to fault removal is given by
the following equation:

U(j) = [p(j)] \sum_{n=1}^{n_{n_j}} U_n = [p(j)] \sum_{n=1}^{n_{n_j}} \frac{p(n) f(n)}{\sum_{n=1}^{n_n} p(n) f(n)}    (7)

Then, using Equation (7), the reliability of path j prior to fault removal is given by the following
equation:

R(j) = 1 - U(j)    (8)

Finally, the reliability of the program R_p is limited by the minimum of the path reliabilities computed
in Equation (8). Thus, the following equation is obtained:

R_p = \min_j R(j)    (9)

Continuing the analysis, the empirical number of remaining faults at node n, prior to fault removal,
is found according to the following equation:

r_e(n) = n_f - \sum_{n=1}^{n_n} p(n) f(n)    (10)
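To make Equations (1)–(10) concrete, the following minimal C++ sketch computes them for a small hypothetical graph; the node probabilities, fault counts, path definitions, and fault total are illustrative assumptions, not the data of Figures 1 and 2.

    #include <algorithm>
    #include <cstdio>
    #include <vector>

    int main() {
        // Assumed data: p[n] = probability of traversing node n, f[n] = faults at node n.
        std::vector<double> p = {0.6, 0.4, 0.5, 1.0};
        std::vector<double> f = {3.0, 2.0, 4.0, 1.0};
        std::vector<std::vector<int>> paths = {{0, 2, 3}, {1, 2, 3}};  // two assumed paths
        const double nf = 10.0;                       // n_f: total faults (assumed)

        double sumPF = 0.0;                           // denominator of Eqs (5) and (7)
        for (std::size_t n = 0; n < p.size(); ++n) sumPF += p[n] * f[n];

        double Ep = 0.0, Rp = 1.0;
        for (const auto& path : paths) {
            double pj = 1.0, pathPF = 0.0;
            for (int n : path) { pj *= p[n]; pathPF += p[n] * f[n]; }  // Eqs (1), (2)
            double Ej = pathPF * pj;                  // Eq (3): expected faults on path j
            Ep += Ej;                                 // Eq (4): accumulate over paths
            double Uj = 0.0;
            for (int n : path) Uj += pj * (p[n] * f[n] / sumPF);       // Eq (7)
            double Rj = 1.0 - Uj;                     // Eq (8)
            Rp = std::min(Rp, Rj);                    // Eq (9): weakest path bounds R_p
            std::printf("p(j)=%.3f E(j)=%.3f R(j)=%.3f\n", pj, Ej, Rj);
        }
        double re = nf - sumPF;                       // Eq (10): remaining faults
        std::printf("E_p=%.3f R_p=%.3f r_e=%.3f\n", Ep, Rp, re);
        for (std::size_t n = 0; n < p.size(); ++n)    // Eqs (5), (6) per node
            std::printf("node %zu: R_n=%.3f\n", n + 1, 1.0 - p[n] * f[n] / sumPF);
        return 0;
    }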
To obtain an operational profile of how program space was used, Horgan and colleagues [15]
identified the possible functions of their program and generated a graph capturing the connectivity
of these functions. Each node in the graph represented a function. Two nodes, A and B, were
connected if control could flow from function A to function B. There was a unique start and end
node representing functions at which execution began and was terminated, respectively. A path
through the graph from the start node to the end node represents one possible program execution.
In 1976, Thomas McCabe proposed a complexity metric based on the idea of the directed graph
as a representation of the complexity of a program. The directed graph can be based on functions,
as in the case of Horgan’s approach, or program statements that are used in this paper. McCabe
proposed that his metric be a basis for developing a testing strategy [13]. The McCabe complexity
metric is used as the basis for constructing the example directed graph that is used to illustrate the
integration of testing and reliability [16]. There are various definitions of this metric. The one that
is used here is given in the following equation [16]:

M_c = (n_e - n_n) + 1    (11)
[Figure 2. Directed graph of the example program with the number of faults planted at each node. Bold = number of faults planted for path testing (24 faults total). Italics = number of faults planted for random node testing (27 faults total).]
Here, n_n = -M_c + (n_e + 1), where n_n is the number of nodes representing program statements
(e.g. If Then Else) and conditions (e.g. Then, Else), and n_e is the number of edges representing
program control flow transitions, as depicted in Figure 1.
This definition is convenient to use for the testing strategy because it corresponds to the number
of independent paths and the number of independent circuits (‘window panes’) in a directed graph.
See Figures 1 and 2. Strategy means that paths are emphasized in the test plan.
The approach is to specify M_c and n_e and compute n_n from Equation (11). Then, knowing the
number of edges and nodes in a directed graph for a given complexity, the information is in hand
to represent a program. In the case of the While Do construct, only one iteration is counted in
computing M_c.
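As a check on Equation (11): reading the counts from Figure 1, the example program has n_e = 10 edges and n_n = 8 nodes, so M_c = (10 - 8) + 1 = 3, which matches the three independent paths identified later in Table I.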
The directed graph of the program shown in Figure 1 is based on a C++ program that was
written for a software reliability model [17]. The program has 420 C++ statements and computes
the cumulative number of failures, actual reliability, predicted reliability, rate of change of
predicted reliability, mean number of failures, fault correction rate, mean fault correction rate, fault
correction delay, prediction errors, and maximum likelihood parameter estimation. The directed
graph represents the decision logic of the model.
8. TESTING STRATEGIES
As pointed out by Voas and McGraw, some people erroneously believe that testing involves only
software. In fact, according to them, testing also involves specifications [18]. This research supports
their view; in fact, the While Do loop in Figure 1 represents the condition for being able to make
reliability predictions if the specified boundary condition on the equations is satisfied in the C++
program. Therefore, in all of the following testing strategies that are implemented in the example
program, it is implicit that testing encompasses both specifications and software.
In black box testing, the tester is unconcerned with the internals of the program being tested.
Rather, the tester is interested in how the program behaves according to its specifications. Test data
are derived from its specifications [9]. In contrast, in white box testing the tester is interested in how
the program will behave according to its internal logic. Test data are derived based on the internal
workings of the program [9]. Some authors propose that black and white box testing be integrated
in order to improve test efficiency and reduce test cost [19]. This would not be wise because each
strategy has objectives that are fundamentally different.
In white box testing, based on the test data, a program can be forced to follow certain paths.
Research indicates that applying one or more white box testing methods in conjunction with func-
tional testing can increase program reliability when the following two-step procedure is used:
1. Evaluate the adequacy of test data constructed using functional testing.
2. Enhance these test data to satisfy one or more criteria provided by white box testing
methods [20].
In this research, the approach adapts step 2: test data are generated to satisfy path coverage criteria
in order to find additional faults.
In black box testing the program is forced to execute constructs (e.g. While Do) that are associated
with the functional specifications. For example, continue to compute functions while there is input
data. In white box testing, test nodes and paths are associated with the detailed logic and functions of
the program. For example, a path would involve computing the error in predicting reliability metrics.
Related to black box testing is the concept of the operational profile, wherein the functions of
the program, the occurrence rates of the functions, and the occurrence probabilities are listed [21].
Here the functions are the program constructs (e.g. If Then Else), as shown in Figure 1. In the
example program, the occurrence rates of all constructs are 100%. Thus, rather than using
occurrence rates, the importance of the constructs is more relevant (e.g. While Do is more important
than If Then).
In the study of system functionality reported in [7], the authors began with the assumption that
coding errors tend to be regional. Analysis of the results of testing the 53 system tasks within the
six functional categories supported this assumption. The data indicate that tasks and categories
with high execution quantities had more field deficiencies. These tasks and categories were more
complex, containing a broader range of functions made possible through additional lines of code.
Owing to this complexity, these areas were more susceptible to errors.
These results suggest that there should be a focus in the testing effort on complex high payoff
areas of a program such as the While Do construct and associated constructs (see Figure 2), where
there is a concentration of faults. These constructs experience high probabilities of path execution
(i.e. ‘execution quantities’).
According to AT&T [22], the earlier a problem is discovered, the easier and less expensive it is to
fix, making software development more cost-effective. AT&T uses a ‘break it early’ strategy. The
use of independent path testing attempts to implement this approach because it is comparatively
easy and quick to expose the faults of these tests, with the expectation of revealing a large number
of faults early in the testing process.
As stated in [23], one of the principles of testing is the following: define test completion criteria.
The test effort has specific, quantifiable goals. Testing is completed only when the goals have been
reached (e.g. testing is complete when the tests that address 100% functional coverage of the system
have all been executed successfully). Although this is a noble goal, and it is achieved in the small
example program, it is infeasible to achieve 100% coverage in a large program with multiple iterations.
Such an approach would be unwise owing to the high cost of achieving fault removal at the margin.
Another principle stated in [23] is to verify test coverage: track the amount of functional coverage
achieved by the successful execution of each test. This principle is implemented as part of the black
box testing approach, where the discovery and removal of faults is tracked as each construct
(e.g. If Then Else) is executed (see Table II).
One metric of test effectiveness is the ratio of the number of paths traversed to the total number
of paths in the program [24]. This is a good beginning, but it is only one characteristic of an
effective metric. In addition, it is important to consider the presence of faults on the paths. This is
the approach described below.
In order to evaluate the effectiveness of testing strategies, compute the fault coverage by two
means: path coverage and edge coverage. Recall that the number of faults encountered during path
testing can exceed the actual number of faults. Therefore, path testing must take this factor into
account. Path testing efficiency is implemented by using the following equation that imposes the
constraint that the sum of faults found on paths must not exceed the number of faults in the program:
e(j) = \sum_{j=1}^{n_j} E(j) \Big/ n_f = \sum_{j=1}^{n_j} \left[ \sum_{n=1}^{n_{n_j}} p(n) f(n) \right] p(j) \Big/ n_f    (12)

for \sum_{j=1}^{n_j} E(j) \le n_f.

As long as the constraint is satisfied, path testing is efficient because no more testing is done
than is necessary to find all of the faults. However, for \sum_{j=1}^{n_j} E(j) > n_f, path testing is inefficient
because more testing is done than is necessary to find all of the faults.
For independent path testing, use Equation (12) just for the independent paths and compare the
result with that obtained using all paths.
Another metric of efficiency is \sum_{j=1}^{n_j} E(j) compared with n_f. This metric is computed using
only the independent paths. The computations are then compared to see which testing method produces
the greater fault coverage in relation to the number of faults in the program.
The final metric of testing efficiency is node testing efficiency, given in the following equation:

e(n) = \sum_{n=1}^{n_n} p(n) f(n) \Big/ n_f    (13)
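The following minimal C++ sketch computes the cumulative path testing efficiency of Equation (12) and flags the point at which the constraint is violated; the per-path values E(j) are illustrative assumptions, not the data behind Figure 3.

    #include <cstdio>
    #include <vector>

    int main() {
        // Assumed E(j) values per path from Equation (3), not the paper's data.
        std::vector<double> E = {4.2, 1.1, 3.3, 2.0, 5.5, 4.0, 4.1, 2.8};
        const double nf = 24.0;              // n_f: faults in the program (Figure 2)

        double cumE = 0.0;
        for (std::size_t j = 0; j < E.size(); ++j) {
            cumE += E[j];
            double ej = cumE / nf;           // Eq (12): cumulative efficiency after path j
            std::printf("after path %zu: e(j) = %.3f (%s)\n", j + 1, ej,
                        cumE <= nf ? "efficient" : "inefficient");
        }
        // Eq (13), node testing efficiency, is the same ratio computed over the
        // nodes instead of the paths: e(n) = sum p(n) f(n) / n_f.
        return 0;
    }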
An important point about test strategy evaluation is the effects on testing efficiency of the order of
path testing because the number of faults encountered could vary with sequence. Take the approach
in Figures 1 and 2 that path sequence is top down, testing the constructs such as If Then Else in
order.
First, note that these are limited experiments in terms of the number of examples it is feasible to use.
Therefore, there is no claim that these results can be extrapolated to the universe of the integration
of testing with reliability strategies. However, it is suggested that researchers and practitioners can
use these methods as a template for this type of research. In addition, the directed graph in the
program example is small. However, this is not a limitation of the approach because large programs
are modular (or should be). Thus, a large program can be represented by a set of directed graphs
for modules and the methods could be applied to each one.
Table I shows the results from developing the path and node connection matrix corresponding to
the directed graph in Figure 1, which shows the independent circuits and lists the paths: independent
and non-independent. In this table a ‘1’ indicates connectivity and a ‘0’ indicates no connectivity.
‘Path number’ identifies paths that are used in the plots below. A path is defined as the sequence
of nodes that are connected as indicated by a ‘1’ in the table. This table allows us to identify the
independent paths that provide a key feature of the white box testing strategy. These paths are
italicized in Table I: paths 1, 3, and 4. The definition of an independent path is that it cannot be
formed by a combination of other paths in the directed graph [13].
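A connection matrix such as Table I can be turned into the path list by a depth-first traversal from the start node to the end node. The following C++ sketch does this for a small assumed matrix (not the actual matrix of the example program), with loops such as While Do unrolled to one iteration so the graph is acyclic.

    #include <cstdio>
    #include <vector>

    const int N = 5;                       // assumed nodes 0..4; 0 = start, 4 = end
    bool adj[N][N] = {                     // a '1' indicates connectivity, as in Table I
        {0, 1, 1, 0, 0},
        {0, 0, 0, 1, 0},
        {0, 0, 0, 1, 1},
        {0, 0, 0, 0, 1},
        {0, 0, 0, 0, 0},
    };

    void dfs(int node, std::vector<int>& path) {
        path.push_back(node);
        if (node == N - 1) {               // reached the end node: report one path
            for (int n : path) std::printf("%d ", n + 1);
            std::printf("\n");
        } else {
            for (int next = 0; next < N; ++next)
                if (adj[node][next]) dfs(next, path);
        }
        path.pop_back();                   // backtrack and try the next branch
    }

    int main() {
        std::vector<int> path;
        dfs(0, path);                      // enumerate every start-to-end path
        return 0;
    }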
[Figure 3. Path testing efficiency e(j) vs test number (path) j.]
Figure 2 shows how faults are randomly seeded in the nodes of the directed graph in order to
evaluate the various tests. It also shows the minimum reliability path, which is the reliability of the
program because the reliability of the program cannot be greater than the reliability of the weakest
path. The minimum reliability path is also noted in Figure 7, where R( j) = 0.7292.
Noting that in Figure 3 paths are equivalent to the number of tests, this figure indicates that
path testing is efficient only for the first seven paths; after that there is more testing than necessary
to achieve efficiency because tests 1, . . . , 7 have removed all the faults. All independent paths are
efficient based on the fact that these paths were identified in Table I. However, Figure 4 tells another
story: Here the expected number of faults that is found in both path testing and independent path
testing is compared with the number of actual faults. Although independent path testing is efficient,
it accounts for only 35.42% of the faults in the program. This result dramatically shows that it is
unwise to rely on independent path testing alone to achieve high reliability.
In Figure 5, recognizing that the number of nodes is equivalent to the number of tests, it is seen
that, with node testing, the tests do not cover all the faults in the program (i.e. efficiency = 0.7917).
Of the three testing strategies, path testing provides the best coverage. It finds all of the faults but
at the highest cost. The best method depends on the application, with path testing advisable for
mission critical applications, and independent path and node testing appropriate for commercial
applications because of their lower cost.
Up to this point, the testing strategies have been static. That is, path testing, independent path testing,
and node testing have been conducted, considering the number of tests, but without considering
[Figure 4. The expected number of cumulative faults encountered (and removed) sum E(j) vs test number j. Independent paths j = 1, 3, and 5 find 8.500 faults, 35.42% of the n_f = 24 faults in the program; the efficient region is sum E(j) <= n_f.]
[Figure 5. Node testing cumulative expected number of faults found sum [p(n) f(n)] vs number of nodes n.]
[Figure 6. Predicted remaining failures at test time T for SSPM (MRE = 0.4510) and the Yamada S-shaped model (MRE = 0.8753), and actual remaining failures ra(T), vs test time T.]
test time. Of course, time does not stand still in testing. With each node and edge traversal, there is
an elapsed time. Now bring time into the analysis so that a software reliability model can be used
to predict the reliability of the program in the example directed graph.
There are a number of reliability predictions that can be made to answer the question ‘when
to stop testing?’ Among these are remaining failures and reliability [20], which are predicted below
using the SSPM [14]. In order to consider more than one model for the analysis, the remaining
failures were also predicted using the Yamada S-shaped model [25], and its mean relative prediction
error was compared with that of SSPM. The result was that SSPM has the lower mean relative error
(MRE = 0.4510 vs MRE = 0.8753 for Yamada), as shown in Figure 6, which compares the actual
remaining failures with the predicted values for SSPM and Yamada.
In addition, the mean number of failures in the test intervals was predicted for both models
and their MREs were compared. For SSPM the value was 0.3865 and for Yamada the value was
0.4572. Thus, because of better prediction accuracy, SSPM predictions are compared with the results
obtained with node and random node testing.
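The MRE comparison can be reproduced with a few lines of C++, assuming the conventional definition of relative error; the prediction and actual series below are illustrative assumptions, not the data behind Figure 6.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Mean relative error: average of |predicted - actual| / actual over the test times.
    double mre(const std::vector<double>& predicted, const std::vector<double>& actual) {
        double sum = 0.0;
        for (std::size_t t = 0; t < actual.size(); ++t)
            sum += std::fabs(predicted[t] - actual[t]) / actual[t];
        return sum / actual.size();
    }

    int main() {
        std::vector<double> actual = {24, 20, 15, 11, 8, 5, 3, 1};  // assumed ra(T)
        std::vector<double> sspm   = {22, 18, 16, 12, 9, 6, 3, 2};  // assumed R(T)
        std::printf("MRE = %.4f\n", mre(sspm, actual));
        return 0;
    }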
The first step in applying SSPM is to estimate the single parameter α from the randomly generated
faults present at the directed graph nodes. (The parameter α is defined as the rate of change of the
failure rate, and t is the program test time.) The faults are seeded randomly in the directed graph
using the Excel random number generator.
Now, in preparing to develop the equation for predicting the remaining failures, the cumulative
number of failures predicted to occur at test interval T is computed as follows [14]:

F(T) = \int_0^T e^{-\alpha t} \, dt = (1/\alpha)[1 - e^{-\alpha T}]    (14)
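A minimal sketch of Equation (14), with the parameter value alpha = 0.3 as an illustrative assumption rather than an estimate from the seeded fault data:

    #include <cmath>
    #include <cstdio>

    int main() {
        const double alpha = 0.3;   // assumed rate of change of the failure rate
        for (int T = 1; T <= 8; ++T) {
            // Eq (14): cumulative failures predicted to occur by test interval T
            double F = (1.0 / alpha) * (1.0 - std::exp(-alpha * T));
            std::printf("T = %d  F(T) = %.3f\n", T, F);
        }
        return 0;
    }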
For the purpose of the testing model, consider black box testing to be composed of successive tests,
each one exercising a program construct, encountering faults in the construct, and removing them.
Thus, formulate the reliability based on test k of construct c as follows:
R(c, k) = \sum_k n(c, k) \Big/ n_f    (18)
where n(c, k) is the number of faults removed on test k and n f is the number of faults in the
program in Equation (17). Thus, fault removals are accumulated with each test, until as many faults
as possible have been removed. The number of faults removed is limited by the number of faults
associated with the constructs in the program.
In addition to reliability, the efficiency of black box testing is evaluated in Equation (19) as

e(c, k) = \sum_k n(c, k) \Big/ k    (19)

The meaning of Equation (19) is that e(c, k) computes the cumulative faults removed divided by the
test number k, which is equal to the number of tests.
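The following C++ sketch traces Equations (18) and (19) over a series of black box tests; the per-test fault removals n(c, k) are illustrative assumptions, not the values of Table II.

    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<int> removed = {8, 6, 4, 3, 2, 1};  // assumed n(c,k) for tests k = 1..6
        const double nf = 24.0;                         // n_f: faults in the program

        int cumulative = 0;
        for (std::size_t k = 1; k <= removed.size(); ++k) {
            cumulative += removed[k - 1];
            double R = cumulative / nf;                      // Eq (18): reliability after test k
            double e = static_cast<double>(cumulative) / k;  // Eq (19): efficiency after test k
            std::printf("test %zu: R(c,k) = %.3f  e(c,k) = %.2f\n", k, R, e);
        }
        return 0;
    }

With these assumed removals, R(c, k) rises toward 1.0 at a decreasing rate as the removals taper off with each successive test.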
1. Does an independent path testing strategy lead to higher efficiency and reliability than path
testing and random path testing? Independent path testing alone will not uncover nearly all of
the faults in a program. In the experiments, the results were even worse when paths were
selected randomly, because random paths can be duplicated, rendering random path testing
inefficient. This fact made it difficult to compare random testing efficiency with path testing
because not every path was tested with random testing. Instead of comparing individual path
efficiencies, the coefficients of variation for random path and path testing efficiency were
computed to gain a sense of the variation in this metric. The values are 0.5544 and 0.5680 for
random path testing and path testing, respectively. Thus, there is little to choose between them
in terms of variability of efficiency.
Note that because path testing traverses all nodes and edges, theoretically, path testing
would yield a reliability of 1.0 after fault removal; this high reliability cannot be obtained
with independent path testing and random path testing.
2. Does a node testing strategy lead to higher reliability and efficiency than random node testing?
As shown in Figure 9, node testing provides higher prediction accuracy.
3. Does the McCabe complexity metric [13] assist in organizing software tests? Yes, even though,
as has been shown, independent path testing lacks complete fault coverage. Nevertheless,
the metric is useful for identifying the major components of a program to test.
4. Which testing strategy yields the highest reliability prior to fault removal? This question is
addressed in Figure 7, which shows the superiority of path testing in early tests, with node
testing and random node testing catching up in later tests. Thus, overall, path testing is superior.
This is to be expected because path testing exercises both nodes and edges.
5. Do reliability metrics, using SSPM, produce more accurate reliability assessments than node
and random node testing? The answer for the remaining failure predictions is ‘no’, as Figure 8
demonstrates. The answer for reliability predictions is also ‘no’ as shown in Figure 9 where
node testing produces minimum error. These results reinforce the idea that testing can produce
reliability assessment accuracy that a reliability model may not be able to achieve.
6. Which testing method, white box or black box, provides more efficient testing and higher
reliability? This question is addressed in Table II, which shows the results of the black box
testing strategy. See Figure 2 to understand the fault removal process by noting how many
faults are planted at each construct. Because black box (Equation (19)) and white box testing
(Equation (13)) efficiency are computed differently, it is necessary to compare them on the
basis of cumulative faults removed, as a function of test number. When black box in Table II
is compared with path testing (i.e. white box testing) in Figure 4, it is seen that for the same
[Figure 7. Reliability obtained prior to fault removal by path testing R(j), node testing Rn, and random node testing rn vs test number (j, n). Path tests: j = 1, ..., 12; node tests: n = 1, ..., 8. Rn catches up to R(j) at n = 6; rn catches up to R(j) at n = 8; the minimum reliability path has R(j) = 0.7292.]
number of tests, black box is superior (removes more faults). The reason is that this particular
type of black box testing exercises complete program constructs, finding and removing a large
number of faults during each test.
Now, comparing the black box testing of Table II with the white box testing of Figure 7, it is
seen that white box yields the higher reliability. This is to be expected because white box testing
provides a more detailed coverage of a program’s faults.
Thus far, it has been assumed that faults encountered in traversing a directed graph
representation of a program have been removed (i.e. corrected). In reality, this may not be the
case unless fault correction is explicitly considered. There are several software reliability models
that include fault correction in addition to reliability prediction. These models are advantageous
because the results of tests, based on fault correction, are used in reliability prediction to improve
the accuracy of prediction. One such model [26,27] is used to make predictions based on fault
correction. It would not make sense to compare test efficiency of the fault correction model with, for
example, that of the path testing model because, as explained, the former includes fault correction
[Figure 8. Remaining failures vs test time t (SSPM predictions compared with testing results).]
but the latter does not. However, insight into the effectiveness of fault correction can be obtained
by evaluating, for example, fault correction delay time over a series of test time intervals.
It was shown in [26,27], using Shuttle flight software failure data, that the cumulative number of
faults corrected by test time T, C(T ), is related to the cumulative number of failures F(T ) detected
by time T . In addition, in the case of the Shuttle data, the number of faults is equal to the number
of failures. This is assumed to be the case in the hypothetical fault data of Figure 2, which is
used in the predictions that follow. C(T) and F(T) are related by the delay time d_T, the time
between fault detection and completion of fault correction. Recalling that for SSPM, F(T) is given
in Equation (20), C(T) can be expressed in Equation (21):

C(T) = F(T - d_T)    (21)
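A minimal sketch of the delay-time relation, with alpha and d_T as illustrative assumptions rather than values fitted to the Shuttle data:

    #include <cmath>
    #include <cstdio>

    // Eq (14): cumulative failures detected by test time T under SSPM.
    double F(double T, double alpha) {
        return (1.0 / alpha) * (1.0 - std::exp(-alpha * T));
    }

    int main() {
        const double alpha = 0.3;   // assumed SSPM parameter
        const double dT = 1.5;      // assumed delay between detection and completed correction
        for (int T = 1; T <= 8; ++T) {
            double detected  = F(T, alpha);
            double corrected = (T > dT) ? F(T - dT, alpha) : 0.0;  // Eq (21): C(T) = F(T - dT)
            std::printf("T = %d  F(T) = %.3f  C(T) = %.3f\n", T, detected, corrected);
        }
        return 0;
    }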
[Figure 9. Reliability predictions vs test time t; node testing is the most accurate.]
Important aspects of fault correction and testing that are not covered by models such as the above
are the fault correction efficiencies in the various phases of software development, which must be
provided by empirical evidence. In a Hewlett-Packard division application, 31% of the requirements
faults were eliminated in the requirements phase, 30% in preliminary design, 15% during detailed
design, and 24% during testing. Additionally,
[Figure 10. SSPM predicted cumulative failures F(T), correction delay dT, cumulative faults corrected C(T), remaining faults R(T), and fault correction rate CR(T) vs test time interval T.]
51% of the detailed design faults slipped into the testing phase. The other important aspect of
efficiency is the effort required to remove the faults. This investigation confirms that it is costly to
wait. The total effort expended to remove 236 intra-phase faults was 250.5 h while it took 1964.8 h
to remove the 248 faults that were corrected in later phases. Faults undetected within the originating
phase took approximately eight times more effort to correct. In fact, the problem does not get better
as time passes. Faults found in the field are at least an order of magnitude more expensive to
fix than those found during testing. Faults that propagate to later phases of development produce a
nearly exponential increase in the effort, and thus in the cost, of fixing those faults [29].
A confirming example is provided by the Praxis critical systems development of the Certification
Authority for the Multos smart card scheme on behalf of Mondex International. The authors claim
that correctness by construction is possible and practical. It demands a development process that
builds correctness into every step. It demands rigorous requirements definition, precise system-
behavior specification, solid and verifiable design, and code whose behavior is precisely understood.
It demands defect removal and prevention at every step. The number of system faults is low
compared with systems developed using less formal approaches. The distribution of effort clearly
shows that fault fixing constituted a relatively small part of the effort (6%) [30]; this contrasts
with many critical projects where fixing of late-discovered faults takes a large proportion of project
resources, as in the Hewlett–Packard example.
Experiences like these should lead the software engineering community to adopt (1) phase-
dependent predictions in reliability and testing models and (2) defect removal and fault correction
in all phases of the development process.
14. CONCLUSIONS
For white box testing, path testing was the most efficient overall. This is not surprising because
path testing exercises all components of a program—statements and transitions among statements.
Although not surprising, it was comforting to find that the law of diminishing returns has not been
overturned by the black box testing result in Table II, where both testing efficiency and reliability
increase at a decreasing rate. Results such as these can be used as a stopping rule to prevent an
organization’s testing budget from being exceeded.
An interesting result, based on Table II and Figure 3, is the superiority of black box testing
over white box testing in finding and removing faults, due to its coverage of complete program
constructs. On the other hand, the application of white box testing yields higher reliability than
black box testing because the former, using path testing, for example, mirrors a program’s state
transitions that are related to complexity, and complexity is highly related to reliability [31].
It is not clear whether these results would hold up if different programs and directed graphs
were used. A fertile area for future research would be to experiment with the test strategies on
different programs. Because it is clear that models are insufficient for capturing pertinent details of
the reliability and testing process, it is important to include empirical evidence in evaluating testing
strategies. Therefore, a promising area for future research would be to incorporate empirical data,
such as the data in the previous section, into integrated testing and reliability models to see whether testing
efficiency is improved.
REFERENCES
1. IEEE/AIAA P1633TM /Draft 13. Draft Standard for Software Reliability Prediction, November 2007.
2. Hamlet D. Foundations of software testing: Dependability theory. Proceedings of the Second ACM SIGSOFT Symposium
on Foundations of Software Engineering, 1994; 128–139.
3. Prowell SJ. A cost-benefit stopping criterion for statistical testing. Proceedings of the 37th Annual Hawaii International
Conference on System Sciences (HICSS’04)—Track 9, 2004; 90304b.
4. Hailpern B, Santhanam P. Software debugging, testing, and verification. IBM Systems Journal 2002; 41(1).
5. Beizer B. Software Testing Techniques (2nd edn). Van Nostrand Reinhold: New York, 1990.
6. Reliable Software Technologies Corporation. http://www.cigital.com/.
7. Chen TY, Yu YT. On the expected number of failures detected by subdomain testing and random testing. IEEE
Transactions on Software Engineering 1996; 22(2):109–119.
8. Tonella P, Ricca F. A 2-layer model for the white-box testing of Web applications. Sixth IEEE International Workshop
on Web Site Evolution (WSE’04), 2004; 11–19.
9. Howden WE. Functional Program Testing and Analysis. McGraw-Hill: New York, 1987.
10. Xie T, Notkin D. Checking inside the black box: Regression testing by comparing value spectra. IEEE Transactions on
Software Engineering 2005; 31(10):869–883.
11. Myers G. The Art of Software Testing. Wiley: New York, 1979.
12. http://www.mccabe.com/.
13. McCabe TJ. A complexity measure. IEEE Transactions on Software Engineering 1976; SE-2(4):308–320.
14. Schneidewind NF. A new software reliability model. The R&M Engineering Journal 2006; 26(3):6–22.
15. Wong WE, Horgan JR, Mathur AP, Pasquini A. Test set size minimization and fault detection effectiveness: A case study
in a space application. COMPSAC’97—21st International Computer Software and Applications Conference, 1997; 522.
16. Fenton NF, Pfleeger SL. Software Metrics: A Rigorous & Practical Approach (2nd edn). PWS Publishing Company:
Boston, 1997.
17. Schneidewind NF. Reliability modeling for safety critical software. IEEE Transactions on Reliability 1997; 46(1):88–98.
18. Voas JM, McGraw G. Software Fault Injection: Inoculating Programs Against Errors. Wiley: New York, 1998.
19. Beydeda S, Gruhn V, Stachorski M. A graphical class representation for integrated black- and white-box testing.
Seventeenth IEEE International Conference on Software Maintenance (ICSM’01), 2001; 706.
20. Horgan JR, Mathur AP. Assessing testing tools in research and education. IEEE Software 1992; 9(3):61–69.
21. Musa JD. Software Reliability Engineering: More Reliable Software, Faster and Cheaper (2nd edn). Authorhouse: 2004.
22. General Accounting Office (GAO). Best Practices: A More Constructive Test Approach is Key to Better Weapon System
Outcomes. GAO: Washington, 2000.
23. Mogyorodi GE (Bloodworth Integrated Technology, Inc.). What is requirements-based testing? CrossTalk, March 2003.
24. Schick GJ, Wolverton RW. A History of Software Reliability Modeling. University of Southern California and Thompson
Ramo Wooldridge Corporation (undated).
25. Xie M. Software Reliability Modelling. World Scientific: Singapore, 1991.
26. Schneidewind NF. Modeling the fault correction processes, part 2. The R&M Engineering Journal 2004; 24(Part 2(1)):
6–14; ISSN 0277-9633.
27. Schneidewind NF. Modeling the fault correction processes, part 1. The R&M Engineering Journal 2003; 23(Part 1(4)):
6–15; ISSN 0277-9633.
28. Zage D, Zage W. An analysis of the fault correction process in a large-scale SDL production model. Twenty-fifth
International Conference on Software Engineering (ICSE’03), 2003; 570.
29. Runeson P, Holmstedt Jönsson M, Scheja F. Are found defects an indicator of software correctness? An investigation
in a controlled case study. Fifteenth International Symposium on Software Reliability Engineering (ISSRE’04), 2004;
91–100.
30. Hall A, Chapman R. Correctness by construction: Developing a commercial secure system. IEEE Software 2002;
19(1):18–25.
31. Khoshgoftaar TM, Munson JC. Predicting software development errors using software complexity metrics. IEEE Journal
on Selected Areas in Communications 1990; 8(2):253–261.
32. Keller T, Schneidewind NF, Thornton PA. Predictions for increasing confidence in the reliability of the space shuttle
flight software. Proceedings of the AIAA Computing in Aerospace 10, San Antonio, TX, 28 March 1995; 1–8.