
QUALITY ASSURANCE

Michael Weintraub
Fall, 2015
Unit Objective
• Understand what quality assurance means
• Understand QA models and processes
Definitions According to NASA
• Software Assurance: The planned and systematic set of activities that ensures that software life cycle processes and
products conform to requirements, standards, and procedures.
• Software Quality: The discipline of software quality is a planned and systematic set of activities to ensure quality is
built into the software. It consists of software quality assurance, software quality control, and software quality engineering. As
an attribute, software quality is (1) the degree to which a system, component, or process meets specified requirements. (2)
The degree to which a system, component, or process meets customer or user needs or expectations [IEEE 610.12 IEEE
Standard Glossary of Software Engineering Terminology].
• Software Quality Assurance: The function of software quality that assures that the standards, processes, and
procedures are appropriate for the project and are correctly implemented.
• Software Quality Control: The function of software quality that checks that the project follows its standards,
processes, and procedures, and that the project produces the required internal and external (deliverable) products.
• Software Quality Engineering: The function of software quality that assures that quality is built into the software by
performing analyses, trade studies, and investigations on the requirements, design, code and verification processes and
results to assure that reliability, maintainability, and other quality factors are met.
• Software Reliability: The discipline of software assurance that 1) defines the requirements for software controlled
system fault/failure detection, isolation, and recovery; 2) reviews the software development processes and products for
software error prevention and/or controlled change to reduced functionality states; and 3) defines the process for measuring
and analyzing defects and defines/derives the reliability and maintainability factors.
• Verification: Confirmation by examination and provision of objective evidence that specified requirements have been
fulfilled [ISO/IEC 12207, Software life cycle processes]. In other words, verification ensures that “you built it right”.
• Validation: Confirmation by examination and provision of objective evidence that the particular requirements for a specific
intended use are fulfilled [ISO/IEC 12207, Software life cycle processes.] In other words, validation ensures that “you built
the right thing”.

From: http://www.hq.nasa.gov/office/codeq/software/umbrella_defs.htm
Software Quality Assurance

Technology Objective: Designing a quality system and writing quality software
√ The tech team aims to deliver a correctly behaving system to the client

Software Quality Assurance is about assessing whether the system meets expectations

Доверяй, но проверяй (Doveryay, no proveryay)
Russian proverb: "Trust, but verify"


Validation Versus Verification
Validation: Are we building the right product or service?
Verification: Are we building the product or service right?

Both involve testing, done at every stage, but “testing can only show the presence of errors, not their absence” (Dijkstra).
Validation

Typically a client-leaning activity; after all, the client is the one who asked for the system.

Typical activities: product trials, user-experience evaluation.
Verification
Optimist: it’s about showing correctness/goodness
Pessimist: it’s about identifying defects

[Diagram: the system under test – does good input produce good output, and does bad input produce an appropriate response?]
Quality versus Reliability
Quality Assurance: assessing whether a software component or system produces the expected/correct/accepted behavior or output relationship for a given set of inputs, OR assessing features of the software.

Reliability: the probability of failure-free software operation for a specified duration in a particular environment. Cool phrases: “Five 9’s”, “No down-time”.
Fun Story – First Computer Bug (1947)
The First "Computer Bug". Moth found
trapped between points at Relay # 70, Panel
F, of the Mark II Aiken Relay Calculator while
it was being tested at Harvard University, 9
September 1947.

The operators affixed the moth to the computer log, with the entry: "First actual case of bug being found". They put out the word that they had "debugged" the machine, thus introducing the term "debugging a computer program".

In 1988, the log, with the moth still taped by the entry, was in the Naval Surface Warfare Center Computer Museum at Dahlgren, Virginia. The log is now housed at the Smithsonian Institution’s National Museum of American History, which has corrected the date from 1945 to 1947. Courtesy of the Naval Surface Warfare Center, Dahlgren, VA., 1988. NHHC Photograph Collection, NH 96566-KN (Color).

From https://www.facebook.com/navalhistory/photos/a.77106563343.78834.76845133343/10153057920928344/
Testing is Computationally Hard
The space is huge and it is generally infeasible to test
anything completely

Assessing quality is an exercise in establishing confidence in a system, or minimizing risks.

Other factors include:
• Quality of the Process
• Quality of the Team
• Quality of the Environment

[Diagram: a stack of layers – App, OS, VM, Host OS, Hardware – where each layer introduces risk]
Lots to Consider
• Component behavior
• Interactions between components
• System and sub-system behavior
• Interactions between sub-systems
• Negative path
• Behavior under load
• Behavior over time
• Usability
Two Approaches

Static Evaluations: making judgments without executing the code
Dynamic Evaluations: executing the code and judging performance
Static Technique - Reviews
Fundamental QA technique

Peer(s) review an artifact for correctness and clarity; often a formal process.

Value: finding issues at design/definition time rather than waiting for the results of the step to complete.

Highly effective, but does not replace the need for dynamic techniques.

[Diagram: reviews apply to artifacts across the life cycle – Requirements, Architecture & Design, Implementation, Test Plans]
One Extreme: Jury/Peer Reviews
Before anything is accepted, someone other than the creator must review it and approve it.

• Single reviewer model
  – Usually a “certified” / senior person
• Panel model
  – Highly structured reviews
  – Can take significant preparation

• Usually done at the design or development stage
• May introduce delay between when code is written and when it gets reviewed
Reviews
Models exist for either the reviewer or the author to lead the discussion.
The author usually provides participants with materials to study in advance.
Requires positive and open attitudes, and preparation.

Value
• Second opinion on clarity, effectiveness, and efficiency
• Learning from others
• Avoids “board blindness” in seeing flaws
• Peer pressure to be neat and tie up loose ends

[Diagram: a review meeting with a Moderator and Scribe, the Author, and a review panel of Peers, Experts, and Client(s)]
Paired Programming
Lightweight peer review: one person drives while the other watches/reviews.

Derived from Extreme Programming; a current favorite in agile.

When compared to solo development models, it MAY cause a higher initial cost per module created (time and resources), BUT it yields higher quality and lower overall cost.

 Continuous review
 Shared problem solving
 Better communications
 Learning from peer
 Social!
 Peer pressure

See as an example:
http://collaboration.csc.ncsu.edu/laurie/Papers/XPSardinia.PDF
What do reviews look for?
Clarity
Can the reader easily and directly understand what the artifact is doing?

Correctness
Analysis of the algorithm used

Common Code Faults
1. Data: initialization, value ranges, and type mismatches
2. Control: are all the branches really necessary (are the conditions properly and efficiently organized)? Do loops terminate?
3. Input: are all parameters or collected values used?
4. Output: is every output assigned a value?
5. Interface faults: parameter numbers, types, and order; structures and shared memory
6. Storage management: memory allocation, garbage collection, inefficient memory access
7. Exception handling: what can go wrong, what error conditions are defined, and how are they handled?

List adapted from W. Arms: http://www.cs.cornell.edu/Courses/cs5150/2015fa/slides/H2-testing.pdf
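As a hedged illustration (hypothetical functions, not from the slides), here are two of these faults in miniature – uninitialized data and a loop that may not terminate – the kind of thing a reviewer is asked to catch:

#include <vector>

// Fault 1 (data initialization): 'total' is read before it is ever assigned.
int sumPositive(const std::vector<int>& values) {
    int total;                        // should be: int total = 0;
    for (int v : values) {
        if (v > 0) total += v;        // first use reads an indeterminate value
    }
    return total;
}

// Fault 2 (control / loop termination): if 'step' is zero or negative, the loop never ends.
int countSteps(int start, int limit, int step) {
    int steps = 0;
    for (int i = start; i < limit; i += step) {   // no guard against step <= 0
        ++steps;
    }
    return steps;
}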


Examples
You are asked to sort an array. There are many algorithms to sort an array. [You aren’t going to use a library function, so you have to write this yourself.]

Many choices exist. Suppose you are deciding between bubble sort, quicksort, and merge sort.
All will work (sort an array), but which will be the better code?

Bubble sort is very easy to write: two loops. Slow on average, O(n^2) – how big will n be? Sorts in place, so it needs essentially no extra memory.

Quicksort is complicated to write. O(n log n) on average, O(n^2) worst case. Sorts in place, with only a small amount of extra memory. Very effective on in-memory data. Most implementations are very fast.

Mergesort is moderate to write. O(n log n) worst case. Memory required is a function of the data structure. Very effective on data that requires external access.
Expressively Logical…

#include <cmath>   // for pow

Version A (single exit point):

bool SquareRoot(double dValue, double &dSquareRoot)
{
    bool bRetValue = false;

    if (dValue < 0) {
        dSquareRoot = 0.0;
        bRetValue = false;
    }
    else {
        dSquareRoot = pow(dValue, 0.5);
        bRetValue = true;
    }
    return bRetValue;
}

Version B (early return):

bool SquareRoot(double dValue, double &dSquareRoot)
{
    dSquareRoot = 0.0;

    if (dValue < 0)
        return false;

    dSquareRoot = pow(dValue, 0.5);
    return true;
}
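Either version could be exercised with a minimal check (a sketch only; it assumes one of the SquareRoot versions above is compiled into a small test program):

#include <cassert>
#include <cmath>

int main() {
    double result = 0.0;
    assert(SquareRoot(9.0, result) && std::fabs(result - 3.0) < 1e-9);   // normal case
    assert(!SquareRoot(-1.0, result));                                   // negative input is rejected
    assert(SquareRoot(0.0, result) && result == 0.0);                    // boundary case
    return 0;
}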
Static Program Analyzers
Evaluate code modules automatically, looking for errors or odd things:
 Loops or programs with multiple exits (more common) or entries (less common)
 Undeclared, uninitialized, or unused variables
 Unused functions/procedures, parameter mismatches
 Unassigned pointers
 Memory leaks
 Show paths through code/system
 Show how outputs depend on inputs
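For instance, a static analyzer would typically flag several issues in a fragment like this (hypothetical code, shown only to illustrate the categories above):

#include <cstdlib>

int riskyFunction(int n) {
    int unused = 0;                              // unused variable
    int result;                                  // uninitialized if n <= 0
    int* buffer = (int*) std::malloc(n * sizeof(int));
    for (int i = 0; i < n; ++i) {
        buffer[i] = i;                           // malloc result never checked for NULL
        result = buffer[i];
    }
    return result;                               // buffer is never freed: memory leak
}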
Rules of Defensive Programming
(taken from Bill Arms)
Based on Murphy’s Law: anything that can go wrong, will

1. Write SIMPLE code
2. If code is difficult to read, RE-WRITE IT
3. Test implicit assumptions – check all parameters passed in from other modules
4. Eliminate all compiler warnings from code
5. It never hurts to check system states after modification
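A small sketch of rule 3 (hypothetical function; the exact error-handling policy is a project choice):

#include <stdexcept>
#include <vector>

// Check all parameters passed in from other modules instead of assuming they are sane.
double averageOf(const std::vector<double>& samples) {
    if (samples.empty()) {
        throw std::invalid_argument("averageOf: empty sample set");   // fail loudly, not silently
    }
    double sum = 0.0;
    for (double s : samples) sum += s;
    return sum / samples.size();
}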
Dynamic Evaluations

Quick Terminology
• Mistake – a human action that results in an incorrect result
• Fault / Defect – an incorrect step, process, or data within the software
• Failure – the inability of the software to perform within performance criteria
• Error – the difference between the observed and the expected value or behavior

Objective
Write test cases and organize them into suites that cause failure and illuminate faults.
Ideally you will fail in striving for this objective, but you will be surprised how successful you may be.
Who is a Tester?

Developers – good for exposing known risk areas
Experienced Outsiders and Clients – good for finding gaps missed by developers
Inexperienced Users – good for finding other errors
Mother Nature – always finds the hidden flaw


Approaches
1. Top Down
   – System flows are tested
   – Units are stubbed
   – Especially useful in UIs/UX, workflows, and very large systems

2. Bottom Up
   – Each unit is tested on its own

3. Stress
   – Test at or past design limits
Testing Flow (Dynamic Evaluation)

[Flow diagram of dynamic evaluation stages: Unit Test → Integration Test → Functional Test → System Test → Performance → Soak / Operational Readiness → Installation → Client Operational / Acceptance Test]
Two Forms of Testing: Black Box and White Box

Black Box Testing
• No access to the internal workings of the system under test (SUT)
• Testing against specifications
  – The tester knows what the SUT’s I/O or behavior should be
• The tester observes the results or behavior
• With software, this tests the interface:
  → What is input to the system?
  → What can you do from the outside to change the system?
  → What is output from the system?
Can a Component Developer Do
Black Box Testing?
White Box Testing
• Have access to the internal workings of the system under test (SUT)
• Testing against specifications, with access to algorithms, data structures, and messaging
• The tester observes the results or behavior
• Testing evaluates logical paths through code
  – Conditionals
  – Loops
  – Branches
• Impossible to exercise all paths completely, so you make compromises
  – Focus on only the important paths
  – Focus on only the important data structures
• Keeping components small is a big help here
Ground Floor – Unit Testing
Tests focus on an individual component:
1. Interfaces
2. Messages
3. Shared memory
4. Internal functions

Emphasizes adherence to the specifications.

Code bases often include the code and the unit tests as a coherent piece; usually done by the developers building the component.

Unit tests decouple the developer from the code: individual code ownership is not required if unit tests protect the code.

Unit tests enable refactoring: after each small change, the unit tests can verify that a change in structure did not introduce a change in functionality.
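A minimal unit-test sketch in this spirit (hypothetical clampPercent component; real projects would normally use a test framework such as Google Test rather than bare asserts):

#include <cassert>

// Unit under test: clamp a value to the 0..100 range.
int clampPercent(int value) {
    if (value < 0) return 0;
    if (value > 100) return 100;
    return value;
}

int main() {
    assert(clampPercent(50)  == 50);    // typical value
    assert(clampPercent(0)   == 0);     // lower boundary
    assert(clampPercent(100) == 100);   // upper boundary
    assert(clampPercent(-5)  == 0);     // below range
    assert(clampPercent(250) == 100);   // above range
    return 0;
}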
What Makes for a Good Test

Test Perspective
• Either addresses a partition of inputs or tests for common developer errors
• Automated
• Runs fast
  – To encourage frequent use
• Small in scope
  – Tests one thing at a time
• When a failure occurs, it should pinpoint the issue and not require much debugging
  – Failure messages help make the issue clear
  – You should not have to refer to the test to understand the issue

Tester Perspective
Know why the test exists
  – It should target finding specific problems
  – It should optimize the cost of defining and running the test against the likelihood of finding a fault/failure
Organizing Testing

Test Plan
Describes test activities:
1. Scope
2. Approach
3. Resources
4. Schedule
Identifies:
• What is to be tested
• The tasks required to do the testing
• Who will do each task
• The test environment
• The test design techniques
• Entry and exit criteria to be used
• Risk identification and contingency planning

Test Suite
A set of test cases and scripts to measure answers.
Often the post-condition of one test is used as the precondition for the next one, OR tests may be executed in any order.

Adapted from http://sqa.stackexchange.com/questions/9119/test-suite-vs-test-plan


Defect Severity
An assessment of a defect’s impact.
Can be a major source of contention between dev and test.

Critical – Show stopper. The functionality cannot be delivered unless the defect is cleared. It does not have a workaround.
Major – Major flaw in functionality, but the system can still be released. There is a workaround, but it is not obvious and is difficult.
Minor – Affects minor functionality or non-critical data. There is an easy workaround.
Trivial – Does not affect functionality or data. It does not even need a workaround. It does not impact productivity or efficiency. It is merely an inconvenience.
Test Exit Report – Input to the Go/No-Go Decision
1. Document Purpose – short description of the objective
2. Application Overview – overview of the SUT
3. Testing Scope – describes the functions/modules in and out of scope for testing; also identifies what was omitted
4. Metrics – results of testing, including summaries:
   – Number of test cases planned vs. executed
   – Number of test cases passed/failed
   – Number of defects identified and their status & severity
   – Distribution of defects
5. Types of testing performed – description of the tests run
6. Test Environment and Tools – description of the environment; helpful for recreating issues and understanding context
7. Recommendations – workaround options
8. Exit Criteria – statement of whether the SUT passes or not
9. Conclusion / Sign Off – go / no-go recommendation
Testing Hint #1 – Mess With Inputs
• If a single value, try
– Negative values
– Alternate types
– Very small or very large inputs (overflow buffers if you can)
– Null values

• If input is a sequence, try


– Using a single valued sequence
– Repeated values
– Varying the length of sequences and the order of the data
– Forcing situations where the first, last and middle values are used

• Try to force each and every error message

• Try to force computational overflows or underflows
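For example, a sequence-handling routine can be pushed through most of these cases with a handful of inputs (hypothetical cappedSum function, used only for illustration):

#include <cassert>
#include <vector>

// Hypothetical unit under test: sum of a sequence, saturating at a cap.
long long cappedSum(const std::vector<long long>& values, long long cap) {
    long long sum = 0;
    for (long long v : values) {
        sum += v;
        if (sum > cap) return cap;
    }
    return sum;
}

int main() {
    assert(cappedSum({}, 100) == 0);              // empty sequence
    assert(cappedSum({7}, 100) == 7);             // single-valued sequence
    assert(cappedSum({5, 5, 5}, 100) == 15);      // repeated values
    assert(cappedSum({60, 60}, 100) == 100);      // force the overflow/cap path
    assert(cappedSum({-10, 5}, 100) == -5);       // negative values
    return 0;
}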


Testing Hint #2 – Force Every Path

Each logical path – each execution path through the code – must be exercised at least once

• If…then…else = Two paths


• Switch…case() = One path per case
• +one path if no catch-all case

• Repeat…Until ≥ Two paths


• While…Do ≥ Two paths

• Object member functions = 1 path per signature
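As a small illustration (hypothetical function): an if…then…else gives two paths, so a white-box suite needs at least two cases, plus a boundary check on the condition:

#include <cassert>
#include <string>

const char* gradeFor(int score) {
    if (score >= 60) {
        return "pass";    // path 1
    } else {
        return "fail";    // path 2
    }
}

int main() {
    assert(std::string(gradeFor(75)) == "pass");   // exercises the if branch
    assert(std::string(gradeFor(40)) == "fail");   // exercises the else branch
    assert(std::string(gradeFor(60)) == "pass");   // boundary value on the condition
    return 0;
}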


Testing Hint #3 – Mess With Interfaces
• Remember, interfaces may involve
  1. References to data or functions
     • Data may be passed by reference or by value
     • Methods only have data interfaces
  2. Shared memory
  3. Messages
  (Internals: functions and data)

• Set interface parameters to extremely low and high values
• Set pointer values to NULL
• Mis-type the parameters or violate value boundaries
  – e.g. set an input as negative where the signature expects ≥ 0
• Call the component so it will fail and check the failure reactions
• Pass too few or too many parameters
• Bombard the interface with messages
• With shared memory, vary accessor instantiation and access activities
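A sketch of what “messing with an interface” can look like for a simple call-based interface (hypothetical copyInts function; message- and shared-memory interfaces would need their own harnesses):

#include <cassert>

// Hypothetical interface: copies 'count' ints from src to dst,
// returning the number copied, or -1 on a bad call.
int copyInts(const int* src, int* dst, int count) {
    if (src == nullptr || dst == nullptr || count < 0) return -1;   // failure reaction
    for (int i = 0; i < count; ++i) dst[i] = src[i];
    return count;
}

int main() {
    int src[3] = {1, 2, 3};
    int dst[3] = {0, 0, 0};
    assert(copyInts(src, dst, 3) == 3);          // normal call
    assert(copyInts(nullptr, dst, 3) == -1);     // NULL pointer parameter
    assert(copyInts(src, nullptr, 3) == -1);     // NULL pointer parameter
    assert(copyInts(src, dst, -1) == -1);        // violate the value boundary (count >= 0)
    return 0;
}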


Testing Hint #4 – Be Diabolical

Try to break the system: use data with extreme values to crash it.
Life Lessons
• If unit testing is not thorough, all subsequent testing will likely be a waste of time.
• You should always take the time to do a good job with unit testing
  – Even when the project is falling behind
• Unit tests will be most needed when you have the least amount of time
  – Unit tests should be created before they are needed, not when you need them
• The end of a project is almost always compressed
  – Developers often defer testing-related tasks until as late as possible
System Test
Integrating components and sub-systems to create the system.
Testing checks component compatibility, interactions, correctly passing information, and timing.

Like Unit Test, activities focus on following uses and data:
1. Typical
2. Boundaries
3. Outliers
4. Failures

Unlike Unit Test:
• Components may come from many, independent parties
• Bespoke development may meet off-the-shelf or reused components
• Testing becomes a group activity
• Testing may move to an independent team altogether
WARNING

System testing will be a complete and utter waste if components are not thoroughly tested.
Unlike Components, Systems Have Emergent Behavior

Some behavior is only clear when you put components together.
This has to be tested too, although it can be very hard to plan in advance!

Integrating Multiple Parties May Introduce Conflict
System Integration Implications
• Components may come from multiple, possibly independent, parties
  – Are COTS components trusted?
• Bespoke development may meet off-the-shelf or reused components
• Testing becomes a group activity
• Testing may move to an independent team altogether

• Who controls integration readiness?
  – What does lab entry mean?
• How to assign credit for test results, and who is responsible for repairs?
• How to maintain momentum
  – when everyone isn’t at the table?
  – when partner priorities are not shared?
  – What about open source?
Testing Focus
Emphasizes component compatibility, interactions, correctly passing information, and timing.

Integration aims to find misunderstandings one component introduces when it interacts with other components.

Use cases are a useful testing model:
• They force components to interact
• Sequence diagrams form a strong basis for designing these tests
  – They articulate the inputs required and the expected behaviors and outputs
Iterative Development Leads to Iterative Testing
Two senses:
1. Create tests incrementally
2. Run tests iteratively (Regression Testing)
   a. On check-in and branch merge, test all affected modules
   b. On check-in, test all modules
   c. Per a schedule, test all modules
      – e.g. daily

Each change, especially after a bug fix, should mean adding at least one new test case (see the sketch below).

It is always best to test after each change as completely as you can, and completely before a release.
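For instance, a bug fix and its accompanying regression test might look like this (hypothetical applyDiscount function; the second assert is the new test case that pins the fix):

#include <cassert>

// Hypothetical fix: a discounted price can no longer go below zero.
double applyDiscount(double price, double discount) {
    double result = price - discount;
    return result < 0.0 ? 0.0 : result;   // the bug fix
}

int main() {
    assert(applyDiscount(10.0, 3.0) == 7.0);    // existing behavior still holds
    assert(applyDiscount(5.0, 8.0) == 0.0);     // new regression test for the fixed bug
    return 0;
}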
Your testing is good enough until a problem shows that it is not good enough

It is hard to know when you should feel enough confidence to release the system.
Confidence comes, in part, from the subset of possible tests selected.

Picking the Subset
Selection based on company policy:
• Every statement must be executed at least once
• Every path must be exercised
• Crafted from specific end-user use cases (scenario testing)
Selection based on testing team experience

[Quadrant diagram: with low test quality, few defects are found regardless of software quality; with high test quality, many defects are found when software quality is low and few when software quality is high]
Measuring Quality: Defect Density
Using the past to estimate the future.

Judges code stability by comparing the past number of bugs per code measure (lines of code, number of modules, …) to presently measured levels.

bugDensity_release(i) = (bugs_pre-release(i) + bugs_post-release(i)) / codeMeasure

If the density for the next release’s additional code is within the ranges of prior releases, it is a candidate for release – unless test or development practices have improved.

[Bar chart: defect density by release – a release-2 density well above release 1 signals poor software quality; a density in line with release 1 is the expected quality; a density well below may signal poor test coverage/quality rather than better code]
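Worked example (illustrative numbers only): if release 1 shipped 50 KLOC with 120 pre-release and 20 post-release bugs, its density is (120 + 20) / 50 = 2.8 defects per KLOC. If release 2 adds 10 KLOC and testing has found only 12 defects (1.2 per KLOC), that is well below the historical range – which may mean better code, but more often signals poor test coverage.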
Measuring Quality: Defect Seeding
Using a known quantity to infer the unknown.

Judges code stability by intentionally inserting bugs into a program and then measuring how many get found, as an estimator for the actual number of bugs.

bugs_release(i) = (seededBugs_planted(i) / seededBugs_found(i)) * bugs_found(i)

Challenges
1. Seeding is not easy. Placing the right kinds of bugs in enough of the code is hard.
   – Bad seeding – seeds that are too easy or too hard to find – creates a false sense of confidence in your reviews and testing
     • Too easy: finding them doesn’t mean that most or all of the real bugs were found
     • Too hard: danger of looking past the Goodenov line, or of looking for things that aren’t there
2. Seeded code must be cleansed of any missed seeds before release. Post clean-up, the code must be tested to ensure nothing got accidentally broken.
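Worked example (illustrative numbers only): if 20 seeded bugs are planted and testing finds 15 of them along with 60 real bugs, the estimate is (20 / 15) * 60 = 80 real bugs in total, suggesting roughly 20 real bugs remain undiscovered.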
Measuring Quality: Capture-Recapture
Applies an estimating technique used in predicting wildlife populations (Humphrey, Introduction to the Team Software Process, Addison Wesley, 2000).

Uses data collected by two or more independent collectors, gathered via reviews or tests.

Example: Estimating a Turtle Population
You tag 5 turtles and release them.
You later catch 10 turtles; two have tags.

Total # of turtles / 5 tagged turtles = 10 caught turtles / 2 tagged turtles
Total # of turtles = (10 turtles * 5 turtles) / 2 turtles = 25 turtles
Capture-Recapture
Each collector finds some defects out of the total number of defects; some of the defects found will overlap.

Method
1. Count the number of defects found by each collector (A, B)
2. Count the number of intersecting defects found by both collectors (C)
3. Calculate defects found = (A + B) - C
4. Estimate total defects = (A * B) / C
5. Estimate remaining defects = (A * B) / C - [(A + B) - C]

If there are more than two collectors, assign A to the highest collected number and set B to the rest of the collected defects. When multiple engineers find the same defect, count it just once.
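Worked example (illustrative numbers only): reviewer A finds 25 defects, reviewer B finds 20, and 10 defects appear on both lists (C = 10). Defects found = (25 + 20) - 10 = 35. Estimated total defects = (25 * 20) / 10 = 50. Estimated remaining defects = 50 - 35 = 15.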
Performance Testing
Measures the system’s capacity to process load.
Involves creating and executing an operational profile that reflects the expected pattern of use.

Performance: aims to assess compliance with non-functional requirements
Stress: identifies defects that emerge only under load
Endurance: measures reliability and availability

Ideally the system should degrade gracefully rather than collapse under load.
Under load, other issues like protocol overhead or timing take center stage.
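A toy throughput measurement in the same spirit (a sketch only – real performance testing would drive the deployed system with dedicated load-generation tooling and a realistic operational profile):

#include <chrono>
#include <iostream>

// Hypothetical operation standing in for one request's worth of work.
long long doWork(long long n) {
    long long acc = 0;
    for (long long i = 0; i < n; ++i) acc += i % 7;
    return acc;
}

int main() {
    using clock = std::chrono::steady_clock;
    const int requests = 10000;
    long long sink = 0;                                     // keeps the work from being optimized away
    auto start = clock::now();
    for (int i = 0; i < requests; ++i) sink += doWork(1000);   // simulate a steady load
    double elapsed = std::chrono::duration<double>(clock::now() - start).count();
    std::cout << requests / elapsed << " requests/second (checksum " << sink << ")\n";
    return 0;
}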
