Title: Part III: Execution
Part III: Execution Based Verification and Validation
- Katerina Goseva-Popstojanova
- Lane Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown, WV
- katerina_at_csee.wvu.edu www.csee.wvu.edu/katerina
Outline
- Introduction
- Definitions, objectives and limitations
- Testing principles
- Testing criteria
- Testing techniques
- Black box testing
- White box testing
- Fault based testing
- Fault injection
- Mutation testing
- Hierarchy for test adequacy criteria
Outline (continued)
- Testing levels
- Unit testing
- Integration testing
- Top-down
- Bottom-up
- Regression testing
- Validation testing
- Non-functional testing
- Configuration testing
- Recovery Testing
- Safety testing
- Security testing
- Stress testing
- Performance testing
Black box testing
- Also called behavioral or functional testing
- The program test cases are based on the system specification
- Test planning can begin early in the software process
Black box testing - Equivalence partitioning
- Input data and output results often fall into different classes where all members of a class are related
- Each of these classes is an equivalence partition where the program behaves in an equivalent way for each class member
- An equivalence class represents a set of valid or invalid states for input conditions
Black box testing - Equivalence partitioning
- Equivalence classes may be defined according to the following guidelines
  - If an input condition specifies a range, one valid and two invalid equivalence classes are defined
  - If an input condition requires a specific value, one valid and two invalid equivalence classes are defined
  - If an input condition specifies a member of a set, one valid and one invalid equivalence class are defined
  - If an input condition is Boolean, one valid and one invalid class are defined
Black box testing - Equivalence partitioning
- Example
  - If the input is a 5-digit integer between 10,000 and 99,999 (input condition: range), the equivalence partitions are
    - less than 10,000 (invalid)
    - between 10,000 and 99,999 (valid)
    - more than 99,999 (invalid)
  - Choose test cases at the boundary of these sets
    - 9,999 10,000 99,999 100,000
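The partitions and boundary values of this example can be sketched in a few lines of Python. This is only an illustration: the function names `partition` and `boundary_values` and the three representative values are assumptions, not part of the slides.

```python
LOW, HIGH = 10_000, 99_999  # valid range taken from the example


def partition(value: int) -> str:
    """Return the equivalence partition that an input value falls into."""
    if value < LOW:
        return "invalid: below range"
    if value > HIGH:
        return "invalid: above range"
    return "valid"


def boundary_values(low: int, high: int) -> list[int]:
    """Values just outside and exactly on each edge of a closed range."""
    return [low - 1, low, high, high + 1]


# One representative per partition (arbitrary picks), plus the boundary values.
representatives = [5_000, 50_000, 150_000]
for value in representatives + boundary_values(LOW, HIGH):
    print(f"{value:>7} -> {partition(value)}")
```

One representative per class is enough under the assumption that the program behaves the same way for every member of a class; the boundary values are added because faults tend to cluster at the edges of a range.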
Black box testing - Boundary value analysis
- Boundary value analysis (BVA) requires selection of test cases that exercise bounding values
- Most software engineers intuitively perform BVA to some degree
- BVA is a technique that complements equivalence partitioning
- Rather than focusing solely on input conditions, BVA derives test cases from the output domain as well
Black box testing - Random testing
- Random testing is essentially a black box testing technique in which a program is tested by randomly selecting some subset of all possible inputs
- The distribution of the inputs may be arbitrary or may reflect the distribution of the inputs in the application environment (i.e., the operational profile)
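A minimal sketch of random testing with and without an operational profile follows. The program under test (`classify`), the oracle, and the profile (90% valid inputs, 10% inputs that are too small) are all hypothetical and chosen only for illustration.

```python
import random


def classify(value: int) -> str:
    """Hypothetical program under test: validates a 5-digit integer."""
    return "valid" if 10_000 <= value <= 99_999 else "invalid"


def oracle(value: int) -> str:
    """Independent check of the expected result (here: count the digits)."""
    return "valid" if value >= 0 and len(str(value)) == 5 else "invalid"


def random_inputs(n, rng, profile=None):
    """Draw n inputs uniformly, or according to an operational profile.

    profile maps (low, high) ranges to their relative frequency of use.
    """
    if profile is None:
        return [rng.randint(0, 200_000) for _ in range(n)]
    ranges = list(profile)
    weights = [profile[r] for r in ranges]
    chosen = rng.choices(ranges, weights=weights, k=n)
    return [rng.randint(low, high) for low, high in chosen]


rng = random.Random(42)  # fixed seed so that a failing run can be reproduced
# Assumed profile: 90% of real inputs are valid, 10% are too small.
profile = {(10_000, 99_999): 0.9, (0, 9_999): 0.1}

failures = [v for v in random_inputs(1_000, rng, profile)
            if classify(v) != oracle(v)]
print(f"{len(failures)} failures in 1000 random tests")
```

Random testing needs some oracle to judge each result; here a second, independently written check plays that role.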
White box testing
- Also called glass-box testing or structural testing
- Derivation of test cases according to program structure
  - Control flow coverage
  - Data flow coverage
- Higher coverage is better
- White box testing is not an alternative to black box testing. Rather, it is a complementary approach
White box testing
- The starting point for white box testing is a program flow graph
  - Nodes represent one or more statements
  - Arcs (also called edges or links) represent flow of control
Binary search (Java)
Binary search program flow graph
Control flow coverage
- All-paths coverage: traverse all possible paths
  - Equivalent to exhaustively testing the program
- In general, all-paths coverage is not possible
  - Loops result in an infinite number of possible paths
  - If there are no loops, but only branching instructions, the number of possible paths increases exponentially with the number of branching points
  - There may be paths that can never be executed (in which case the program quite likely contains a fault)
Control flow coverage
- All-statements coverage: execute every statement at least once
  - Rather weak criterion
  - It is relatively easy to achieve 100% statement coverage
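A small illustration of why statement coverage is a weak criterion: a single test can execute every statement and still miss a fault that shows only when a branch is skipped. The function `final_price` and its seeded fault are hypothetical.

```python
def final_price(price: int, is_member: bool) -> int:
    """Hypothetical function: members get 10% off, others pay full price."""
    discount = price  # seeded fault: should be 0
    if is_member:
        discount = price // 10
    return price - discount


# This single test executes every statement of final_price, and it passes.
assert final_price(100, True) == 90

# The fault shows only when the if-body is skipped, which statement
# coverage does not require.
print(final_price(100, False))  # prints 0, although 100 is expected
```

A test set that also takes the false branch of the `if` would reveal the fault.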
Control flow coverage
- All-branches coverage: every possible branch is taken at least once
- Nodes that contain a condition, such as the boolean expression in an if statement, can be a combination of elementary predicates connected by logical operators
  - i > 0 ∨ j > 0 requires at least two tests to guarantee that both branches are taken. For example, i=1, j=1 and i=0, j=0. The other possible combinations of truth values of the atomic predicates (i=1, j=0 and i=0, j=1) need not be considered to achieve branch coverage
- Extended branch coverage requires that all possible combinations of truth values of the elementary predicates in conditions be covered
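The difference between branch coverage and extended branch coverage can be sketched for the decision `i > 0 or j > 0`. The helper names and the two test sets below are illustrative assumptions.

```python
from itertools import product


def branch_taken(i: int, j: int) -> bool:
    """Outcome of the decision `i > 0 or j > 0`."""
    return i > 0 or j > 0


def predicate_values(i: int, j: int) -> tuple[bool, bool]:
    """Truth values of the two elementary predicates."""
    return (i > 0, j > 0)


def branch_coverage(tests) -> float:
    """Fraction of the two decision outcomes (True, False) exercised."""
    outcomes = {branch_taken(i, j) for i, j in tests}
    return len(outcomes) / 2


def extended_branch_coverage(tests) -> float:
    """Fraction of the four predicate truth-value combinations exercised."""
    seen = {predicate_values(i, j) for i, j in tests}
    all_combinations = set(product([True, False], repeat=2))
    return len(seen & all_combinations) / len(all_combinations)


two_tests = [(1, 1), (0, 0)]
four_tests = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(branch_coverage(two_tests), extended_branch_coverage(two_tests))    # 1.0 0.5
print(branch_coverage(four_tests), extended_branch_coverage(four_tests))  # 1.0 1.0
```

Two tests suffice for full branch coverage of this decision, while extended branch coverage needs all four combinations.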
Control flow coverage
- All-independent paths coverage
- An independent path is any path through the program which traverses at least one new edge in the flow graph (exercising one or more new conditions)
  - 1, 2, 3, 8, 9
  - 1, 2, 3, 4, 6, 7, 2
  - 1, 2, 3, 4, 5, 7, 2
  - 1, 2, 3, 4, 6, 7, 2, 8, 9
- If all independent program paths are executed we can be sure that
  - Every statement has been executed at least once
  - Every branch has been executed for true and false conditions
Control flow coverage
- The number of independent paths in a program can be discovered by computing the cyclomatic complexity
  - Cyclomatic complexity = Number of edges - Number of nodes + 2 = 11 - 9 + 2 = 4
  - For programs without goto statements, cyclomatic complexity is one more than the number of conditions in a program
- Design test cases to execute each of these paths
- A dynamic program analyser may be used to check that paths have been executed
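The arithmetic can be checked against the four independent paths listed for the binary search flow graph. The sketch below assumes that every edge of that graph appears on at least one of the listed paths, which is consistent with the stated count of 11 edges and 9 nodes; the function names are illustrative.

```python
# The four independent paths listed for the binary search flow graph.
paths = [
    [1, 2, 3, 8, 9],
    [1, 2, 3, 4, 6, 7, 2],
    [1, 2, 3, 4, 5, 7, 2],
    [1, 2, 3, 4, 6, 7, 2, 8, 9],
]

# Assumption: every edge of the graph lies on at least one listed path.
edges = {(a, b) for path in paths for a, b in zip(path, path[1:])}
nodes = {n for path in paths for n in path}


def cyclomatic_complexity(edges, nodes) -> int:
    """V(G) = E - N + 2 for a connected flow graph with one entry and one exit."""
    return len(edges) - len(nodes) + 2


def new_edges_per_path(paths):
    """Count the edges each path adds beyond those of the paths before it."""
    seen, counts = set(), []
    for path in paths:
        path_edges = set(zip(path, path[1:]))
        counts.append(len(path_edges - seen))
        seen |= path_edges
    return counts


print(len(edges), len(nodes), cyclomatic_complexity(edges, nodes))  # 11 9 4
print(new_edges_per_path(paths))  # [4, 4, 2, 1]
```

Every listed path contributes at least one edge not seen on the earlier paths, which is what makes each of them independent.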
Data flow coverage
- Data flow testing selects test paths of a program according to the location of definitions and uses of variables in the program
- A variable is defined in a certain statement if it is assigned a (new) value
  - DEF(S) = { X | statement S contains a definition of X }
- After that, the new value will be used in subsequent statements
  - USE(S) = { X | statement S contains a use of X }
Data flow coverage
- A variable definition in statement X is alive in statement Y if there exists a path from X to Y on which that variable is not assigned a new value at any intermediate node; such a path is called definition-clear
- Two types of variable use
  - P-uses: predicate uses (like those in the conditional part of an if-statement)
  - C-uses: all other uses (like uses in computations or I/O statements)
- With data flow analysis too, we may define test adequacy criteria and use these criteria to guide testing
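To make these definitions concrete, here is a minimal sketch that finds every use reachable from a definition along a definition-clear path. The five-node flow graph, its variables, and the function names are hypothetical; it corresponds roughly to `x = read(); if x > 0: y = x * 2 else: y = 0; print(y)`.

```python
# Hypothetical flow graph: successors, definitions and uses per node.
SUCC = {1: [2], 2: [3, 4], 3: [5], 4: [5], 5: []}
DEF = {1: {"x"}, 2: set(), 3: {"y"}, 4: {"y"}, 5: set()}
P_USE = {1: set(), 2: {"x"}, 3: set(), 4: set(), 5: set()}
C_USE = {1: set(), 2: set(), 3: {"x"}, 4: set(), 5: {"y"}}


def reached_uses(def_node: int, var: str):
    """Uses of var reachable from its definition at def_node along
    definition-clear paths, as (use_node, kind) pairs."""
    uses, stack, visited = set(), list(SUCC[def_node]), set()
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        if var in P_USE[node]:
            uses.add((node, "P"))
        if var in C_USE[node]:
            uses.add((node, "C"))
        # A redefinition kills the old value: do not search past this node.
        if var not in DEF[node]:
            stack.extend(SUCC[node])
    return uses


def def_use_pairs():
    """All (variable, def_node, use_node, kind) tuples in the flow graph."""
    return sorted(
        (var, d, u, kind)
        for d in SUCC
        for var in DEF[d]
        for u, kind in reached_uses(d, var)
    )


for pair in def_use_pairs():
    print(pair)
```

Data flow adequacy criteria are then stated over these pairs, for example by requiring that a test set exercise a definition-clear path for each of them.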
Data flow coverage
- All-uses coverage: traverse a definition-clear path between each definition of a variable and each (P- or C-) use of that definition and each successor of that use
  - Each successor is included to force all branches following a P-use to be taken
- All-DU-paths coverage: requires that each definition-clear path is either cycle-free or a simple cycle
  - Slightly stronger criterion than all-uses coverage
Data flow coverage
- All-defs coverage: requires each definition to be used at least once
- All-C-uses/Some-P-uses coverage: requires definition-clear paths from each definition to each computational use; if a definition is used only in predicates, at least one definition-clear path to a predicate use must be executed
Data flow coverage
- All-P-uses/Some-C-uses coverage: requires definition-clear paths from each definition to each predicate use; if a definition is used only in computations, at least one definition-clear path to a computational use must be executed
- All-P-uses coverage: requires definition-clear paths from each definition to each predicate use
Telcordia's ATAC white box testing tool
http://xsuds.argreenhouse.com
ATAC
- ATAC is a coverage analysis tool that aids in testing programs written in the C or C++ programming language
- Using ATAC focuses on three main activities
  - instrumenting the software to be tested
    - occurs at compile-time
    - large systems can be instrumented a-piece-at-a-time
  - executing software tests
  - determining how well the software has been tested
    - display uncovered source code
    - generate reports that reveal the current coverage measures for each criterion
Program instrumentation, test execution and coverage analysis
ATAC
- ATAC highlights the covered and uncovered blocks in the source code and prioritizes them into an order in which you should try to cover them
- Constructing the tests is the role of the tester; ATAC does not construct the tests or determine what inputs are needed to cover the uncovered code
- ATAC, however, simplifies the tester's job by guiding him or her into creating a small set of high-efficiency, high-leverage test cases that yield high coverage quickly
ATAC - Display of uncovered source code
- Each color represents a certain weight
  - If, for example, a block has weight 30, it means any test case that causes that block to be exercised, or covered, is guaranteed to cover a minimum of 29 other blocks as well
- White represents zero weight and red represents the highest weight among all blocks in the file
  - If a block is highlighted in white, it means that it has already been covered by a test case and covering it again will not add new coverage
  - If, on the other extreme, a block is highlighted in red, it means that it has not been covered by any test case so far and covering it first is the most efficient way to add new coverage to the program
ATAC
- Coverage criteria
  - Block: a sequence of instructions that, except for the last instruction, is free of branches and function calls
    - a block may contain more than one statement if no branching occurs between statements
    - a statement may contain multiple blocks if branching occurs inside the statement
    - an expression may contain multiple blocks if branching is implied within the expression (e.g., conditional, logical-and, and logical-or expressions)
ATAC - Basic blocks
- Block 1 consists of a logical expression embedded within a compound conditional expression
- Block 2 consists of an entire conditional expression
- Block 3 consists of the entire body of an if-statement
ATAC - Report on current coverage
ATAC
- Coverage criteria
  - Decision (i.e., branch coverage)
  - C-uses
  - P-uses
  - All-uses (sum of P-uses and C-uses coverage measures)
ATAC - An example: C-uses
ATAC - An example: P-uses
ATAC
- Other coverage criteria
  - function-entry: all functions are called at least once
  - function-return: all explicit and implicit returns or exits from a function are executed at least once
    - complete function-return coverage usually guarantees complete function-entry coverage, since a function usually has at least one return or exit
  - function-call: each call to a function is covered at least once
    - complete function-call coverage does not guarantee complete function-entry coverage, since it is possible to have a function that does not contain any function calls
Fault based testing
- Aimed at finding a test set with a high ability to detect faults
- Two techniques
  - Fault injection
  - Mutation testing
Fault injection
- Non-software example
  - We want to estimate the number of pikes in Lake Soft
  - Catch a number of pikes, N, in another lake (say, Lake Hard)
  - Mark them and throw them into Lake Soft
  - Catch a number of pikes, M, in Lake Soft
  - Suppose that M' out of the M pikes are found to be marked; the total number of pikes originally present in Lake Soft is then estimated as (M - M') × N / M'
Fault injection
- Fault injection
  - Artificially inject (seed) a number of faults into the program
  - When the program is tested, we will discover both injected faults and new ones
  - The total number of faults is then estimated from the ratio of those two numbers
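The estimate follows the same ratio reasoning as the pike example: if the tests rediscover a certain fraction of the seeded faults, they are assumed to have found the same fraction of the real ones. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
def estimate_real_faults(seeded: int, found_seeded: int, found_real: int) -> float:
    """Estimate the number of real faults in the program.

    seeded       -- number of faults injected before testing
    found_seeded -- injected faults rediscovered by the tests
    found_real   -- previously unknown (real) faults discovered by the tests
    """
    if found_seeded == 0:
        raise ValueError("no seeded fault was found; the estimate is undefined")
    return found_real * seeded / found_seeded


# Illustrative numbers only: 20 faults seeded; the tests find 15 of them
# plus 6 real faults.
total = estimate_real_faults(seeded=20, found_seeded=15, found_real=6)
print(f"estimated real faults: {total:.1f}")           # 8.0
print(f"estimated still undetected: {total - 6:.1f}")  # 2.0
```

The estimate is only as good as the assumption that seeded and real faults are equally hard to find.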
Fault injection
- Several assumptions underlie this method
  - Injected faults are representative of real faults
  - Both real and injected faults have the same distribution
- The results can be trusted if we find many injected faults and relatively few others
  - The opposite is not true
Mutation testing
- A large number of variants (mutants) of a program is generated
  - Each of these mutants differs slightly from the original version
- Usually mutants are obtained by mechanically applying a set of simple transformations called mutation operations
  - Replace a constant by another constant
  - Replace a variable by another variable
  - Replace a constant by a variable
  - Replace an arithmetic operator by another arithmetic operator
  - Replace a logical operator by another logical operator
  - Insert a unary operator
  - Delete a statement
Mutation testing
- Next, all these mutants are executed using a given test set
  - When a test produces a different result for one of the mutants, that mutant is said to be dead
  - Mutants that produce the same results for all the tests are said to be alive
- The mutation adequacy score of a test set is given by
  - D/M, where D is the number of dead mutants and M is the total number of mutants
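The score D/M can be computed with a short sketch. The program under test (`max_of`), the three hand-written mutants, and the test sets are hypothetical; real tools generate mutants mechanically.

```python
def max_of(a: int, b: int) -> int:
    """Original program under test."""
    return a if a > b else b


# Hand-written mutants, each applying one simple mutation operation.
mutants = {
    "a > b  ->  a < b": lambda a, b: a if a < b else b,
    "a > b  ->  a >= b": lambda a, b: a if a >= b else b,
    "else b ->  else a": lambda a, b: a if a > b else a,
}


def mutation_score(original, mutants, tests) -> float:
    """D / M: fraction of mutants whose output differs on at least one test."""
    dead = sum(
        any(mutant(*args) != original(*args) for args in tests)
        for mutant in mutants.values()
    )
    return dead / len(mutants)


weak_tests = [(2, 1)]
better_tests = [(2, 1), (1, 2)]
print(mutation_score(max_of, mutants, weak_tests))
print(mutation_score(max_of, mutants, better_tests))
```

The single test kills one of the three mutants, and adding `(1, 2)` kills a second. The `a >= b` mutant returns the same value as the original for every input, so no test can kill it; such equivalent mutants keep the score below 1 regardless of test quality.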
Mutation testing
- Suppose we have a program P with a component T
- Two variants of mutation testing
  - strong mutation testing
    - Requires that tests produce different results for program P and a mutant P'
  - weak mutation testing
    - Requires that component T and its mutant T' produce different results; at the level of P this difference need not crop up
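The distinction can be illustrated with a tiny program in which a difference at component level is sometimes masked at program level. The component, its mutant, and the program below are hypothetical.

```python
def component(x: int) -> int:
    """Component T: doubles its input."""
    return x * 2


def component_mutant(x: int) -> int:
    """Mutant T': the operator * is replaced by +."""
    return x + 2


def program(x: int, t=component) -> bool:
    """Program P: reports whether the component's result is non-negative."""
    return t(x) >= 0


def weakly_killed(x: int) -> bool:
    """The test distinguishes T from T' at component level."""
    return component(x) != component_mutant(x)


def strongly_killed(x: int) -> bool:
    """The test distinguishes P from P' (P using T') at program level."""
    return program(x) != program(x, component_mutant)


for x in (2, 3, -1):
    print(x, weakly_killed(x), strongly_killed(x))
# 2 False False
# 3 True False
# -1 True True
```

The input 3 kills the mutant only in the weak sense: the component results differ (6 versus 5), but the program output is `True` either way.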
Mutation testing
- Several assumptions underlie mutation testing
  - Incorrect versions of a program differ from a correct version by relatively minor faults
  - Tests that can reveal simple faults can also reveal complex faults
Comparison of test techniques
- Most techniques are heuristic in nature and lack a sound theoretical basis
- Manual test techniques rely heavily on the qualities of the participants in the test process
- Relating different test adequacy criteria
  - Seeks a sound theoretical answer to questions such as: is the All-uses adequacy criterion stronger or weaker than the All-branches adequacy criterion?
Comparison of test techniques
- First, let us define the notion of stronger
  - Criterion X is stronger than criterion Y if, for all programs P and all test sets TS, X-adequacy implies Y-adequacy
  - In testing literature this relation is known as subsume
  - In this sense, the All-uses criterion is stronger than the All-branches criterion
- A problem with any graph-based adequacy criterion is that it can only deal with paths that can be executed (feasible paths)
Hierarchy for test adequacy criteria
[Figure: subsumption hierarchy of test adequacy criteria]
- An arrow A → B indicates that A is stronger than (subsumes) B
- Arrows with an asterisk (*) denote relations which hold only for the infeasible version
- Criteria shown: All-paths, All-DU-paths, All-uses, All-C-uses/Some-P-uses, All-P-uses/Some-C-uses, Strong mutation, All-defs, All-P-uses, All-independent paths, Extended branch, Weak mutation, All-branches, All-statements
Hierarchy for test adequacy criteria
[Figure: hierarchy of the function-level coverage criteria]
- Criteria shown: All-statements, Function-return, Function-call, Function-entry
Conclusion
- Experiments indicate that there is no best testing technique
- Different techniques tend to reveal different types of faults
- The use of multiple testing techniques results in the discovery of more faults