Title: Whitebox Testing
1 Whitebox Testing
2 Plan
- Test Data Adequacy
- Program-based adequacy
- Relation to TMM
3 Test Data Adequacy
- The extent to which a set of test data is enough for testing a program/unit with a given specification
- Classification by source of information
- Specification-based
- Required testing is specified in terms of identified features of the unit under test
- Blackbox testing
- Equivalence partitioning, boundary value analysis, etc.
- Program-based
- Required testing is specified in terms of properties of the unit under test
- Whitebox testing
- More on this now
4 More Formally
- A test adequacy criterion, C, is a function $C : P \times S \times T \to [0, 1]$
- where
- P: the set of programs
- S: the set of specifications
- T: the set of test sets, $T = 2^D$, where
- D: the set of inputs for elements in P
- The output of C may be interpreted as the quality of the test set
- A threshold on the output may be used as a stopping rule
5 Example: Statement Adequacy
- $C(P, S, T) = 1$ if all statements of P are executed when tested with T
- $C(P, S, T) = 0$ otherwise
- Do all programs have a statement adequate test set?
- Can you devise an algorithm to check whether a program has dead code?
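A minimal sketch of statement adequacy as a 0/1 function, assuming the program is instrumented by hand: the toy program `classify`, the statement ids `s1` to `s4`, and the helper `statement_adequacy` are all hypothetical names introduced for illustration. The specification argument is ignored here because statement adequacy only looks at the program.

```python
def classify(x, executed):
    """Toy program under test; `executed` collects the ids of statements run."""
    executed.add("s1")
    if x > 0:
        executed.add("s2")
        label = "positive"
    else:
        executed.add("s3")
        label = "non-positive"
    executed.add("s4")
    return label


# Hand-maintained list of statement ids (an assumption of this sketch).
ALL_STATEMENTS = {"s1", "s2", "s3", "s4"}


def statement_adequacy(program, all_statements, test_set):
    """Return 1 if the test set executes every statement, 0 otherwise."""
    executed = set()
    for test_input in test_set:
        program(test_input, executed)
    return 1 if all_statements <= executed else 0


print(statement_adequacy(classify, ALL_STATEMENTS, [5]))      # one branch only
print(statement_adequacy(classify, ALL_STATEMENTS, [5, -1]))  # both branches
```

If a program contained a statement that no input can reach, this function would return 0 for every test set, which is why the dead-code question matters for the criterion.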
6 Test Data Adequacy
- Classification by testing approach
- Structural testing
- Adequacy specified in terms of the test set's coverage of the structure of the program (or specification)
- Branch and path coverage
- Data flow-based coverage
- Fault-based testing
- Adequacy is some measurement of the tests' ability to detect faults/defects
- Mutation testing
- Error-based testing
- Requires tests of certain error-prone points of a program
- (Loop testing)
7 Chord Example
8 Chord Example
9 Control Flow Adequacy
- Control flow modeled as a directed flow graph
- A node represents a linear sequence of computations
- An edge represents conditional transfer of control
- Invariants
- There is a node (the begin node) with in-degree 0 representing computation start
- There is a node (the end node) with out-degree 0 representing computation end
- All nodes must be on a path from the begin node to the end node
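The three invariants above can be checked mechanically. The following is a sketch only, assuming the flow graph is given as a dictionary from node to successor list; the function name `check_flow_graph` and both example graphs are hypothetical.

```python
def check_flow_graph(edges, begin, end):
    """Check the flow graph invariants for a graph given as node -> successors."""
    nodes = set(edges) | {m for succs in edges.values() for m in succs} | {begin, end}

    # Build the predecessor relation so we can also search backwards.
    preds = {n: set() for n in nodes}
    for n, succs in edges.items():
        for m in succs:
            preds[m].add(n)

    def reachable(start, neighbours):
        """Nodes reachable from `start` by following `neighbours`."""
        seen, stack = {start}, [start]
        while stack:
            for m in neighbours(stack.pop()):
                if m not in seen:
                    seen.add(m)
                    stack.append(m)
        return seen

    from_begin = reachable(begin, lambda n: edges.get(n, ()))
    to_end = reachable(end, lambda n: preds[n])

    return (
        not preds[begin]           # begin node has in-degree 0
        and not edges.get(end)     # end node has out-degree 0
        and from_begin == nodes    # every node is reachable from begin
        and to_end == nodes        # every node can reach end
    )


good = {"begin": ["test"], "test": ["then", "else"],
        "then": ["end"], "else": ["end"], "end": []}
bad = {"begin": ["end"], "orphan": ["end"], "end": []}

print(check_flow_graph(good, "begin", "end"))
print(check_flow_graph(bad, "begin", "end"))
```

A node lies on some begin-to-end path exactly when it is reachable from the begin node and the end node is reachable from it, which is why two searches suffice.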
10 Flow Graph
[Figure: flow graph with nodes 0, 1, 1a, 1b, 1c, 1d, 1e, 2, 3]
11 Path Coverage
- Execution of a test from the test set T gives an execution path, p, of the flow graph
- Path Coverage
- A set P of execution paths satisfies the path coverage criterion if and only if P contains all execution paths from the begin node to the end node in the flow graph
- Clearly impractical: a single loop can already yield an unbounded number of paths
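To get a feel for why path coverage is impractical even without loops, the sketch below counts begin-to-end paths in an acyclic flow graph. It assumes a graph of k consecutive if/else statements; `count_paths` and `diamond_chain` are hypothetical helpers, and the counting only terminates for graphs without cycles.

```python
from functools import lru_cache


def count_paths(edges, begin, end):
    """Count begin-to-end paths in an acyclic graph (node -> successors)."""

    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == end:
            return 1
        return sum(paths_from(m) for m in edges.get(node, ()))

    return paths_from(begin)


def diamond_chain(k):
    """Flow graph of k >= 1 consecutive if/else statements."""
    edges = {}
    for i in range(k):
        nxt = f"d{i + 1}" if i + 1 < k else "end"
        edges[f"d{i}"] = [f"t{i}", f"f{i}"]  # decision with two outcomes
        edges[f"t{i}"] = [nxt]
        edges[f"f{i}"] = [nxt]
    return edges


for k in (1, 3, 10, 20):
    print(k, count_paths(diamond_chain(k), "d0", "end"))
```

Each additional decision doubles the number of paths, so k decisions in sequence give 2^k paths; with a loop in the graph the number is no longer bounded at all.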
12 Statement Coverage
- What does a test set look like?
- Actual parameters
- id
- Object state
- predecessor, self, successor
- Example adequate set?
- (3, n1, n1, n1)
- (4, n1, n3, n1)
13 Branch Coverage
- A set P of execution paths satisfies the branch
coverage criterion if and only if for all edges e
in the flow graph, there is at least one path p
in P such that p contains the edge e.
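The branch coverage definition translates almost directly into a set comparison. This is a sketch under the assumption that paths are lists of node names and the flow graph is a set of edges; the tiny graph (an `if` without `else`) and all names are hypothetical.

```python
# Flow graph of an `if` without `else`: the body may be skipped.
GRAPH_NODES = {"begin", "then", "end"}
GRAPH_EDGES = {("begin", "then"), ("then", "end"), ("begin", "end")}


def covered_edges(paths):
    """All edges that occur on at least one of the given paths."""
    return {edge for path in paths for edge in zip(path, path[1:])}


def branch_adequate(graph_edges, paths):
    """True if every edge of the flow graph lies on some path."""
    return graph_edges <= covered_edges(paths)


def statement_adequate(graph_nodes, paths):
    """True if every node of the flow graph lies on some path."""
    return graph_nodes <= {node for path in paths for node in path}


one_path = [["begin", "then", "end"]]
two_paths = one_path + [["begin", "end"]]

print(statement_adequate(GRAPH_NODES, one_path), branch_adequate(GRAPH_EDGES, one_path))
print(statement_adequate(GRAPH_NODES, two_paths), branch_adequate(GRAPH_EDGES, two_paths))
```

The single path visits every node but never takes the edge that skips the body, so it is statement adequate without being branch adequate.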
14 Branch Coverage
[Figure: flow graph with nodes 0, 1, 1a, 1b, 1c, 1d, 1e, 2, 3]
- Example adequate set?
- (3, n1, n1, n1)
- (4, n1, n3, n1)
- (4, n3, n2, n4)
15 Decision/Condition Coverage
- Branch coverage looks only at the compound value of a decision
- Condition coverage
- Each condition in a decision should take on both true and false at least once
- Decision/condition coverage
- Condition coverage, plus each decision outcome should be exercised
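The difference between the two criteria shows up already for a decision with two conditions. The sketch below assumes the hypothetical decision `a and b` and represents each test as a pair of truth values for the two conditions; short-circuit evaluation is deliberately ignored.

```python
def decision(a, b):
    """Hypothetical compound decision with two conditions."""
    return a and b


def condition_coverage(tests):
    """Each condition takes on both True and False at least once."""
    return all({test[i] for test in tests} == {True, False} for i in range(2))


def decision_coverage(tests):
    """The decision as a whole evaluates to both True and False."""
    return {bool(decision(a, b)) for a, b in tests} == {True, False}


def decision_condition_coverage(tests):
    return condition_coverage(tests) and decision_coverage(tests)


only_conditions = [(True, False), (False, True)]
both = [(True, True), (False, False)]

print(condition_coverage(only_conditions), decision_coverage(only_conditions))
print(decision_condition_coverage(both))
```

With `(True, False)` and `(False, True)` every condition flips, yet the decision is false both times, so condition coverage alone does not imply that both branches are taken.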
16 Condition Coverage
[Figure: flow graph with nodes 0, 1, 1a, 1b, 1c, 1d, 1e, 2, 3]
- Example adequate set?
- (3, n1, n1, n1)
- (4, n1, n3, n1)
- (4, n3, n2, n4)
- (2, n1, n3, n1)
- (1, n3, n2, n3)
- (2, n3, n1, n3)
(Maybe. Exercise: minimize it)
17 Independent Paths
- The cyclomatic number of a graph, G, is given as
- $v(G) = E - N + p$
- where E is the number of edges, N the number of nodes, and p the number of strongly connected components
- v(G) gives the maximal size of a set of independent paths in a graph
- Suggests a reasonable number of independent test cases to aim for
18 Add pseudo flow
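A small sketch of the computation, under two assumptions that are not stated on the slides: the "pseudo flow" is read as one extra edge from the end node back to the begin node, and the flow graph satisfies the invariant that every node lies on a begin-to-end path. Together these make the graph strongly connected, so p = 1. The function name and example graphs are hypothetical.

```python
def cyclomatic_number(edges):
    """v(G) = E - N + p for a flow graph given as node -> successors.

    Assumes one pseudo edge end -> begin is added, so that p = 1.
    """
    nodes = set(edges) | {m for succs in edges.values() for m in succs}
    num_edges = sum(len(succs) for succs in edges.values()) + 1  # + pseudo edge
    return num_edges - len(nodes) + 1


if_else = {"begin": ["then", "else"], "then": ["end"], "else": ["end"], "end": []}
while_loop = {"begin": ["test"], "test": ["body", "end"], "body": ["test"], "end": []}
two_ifs = {"d0": ["t0", "f0"], "t0": ["d1"], "f0": ["d1"],
           "d1": ["t1", "f1"], "t1": ["end"], "f1": ["end"], "end": []}

print(cyclomatic_number(if_else))
print(cyclomatic_number(while_loop))
print(cyclomatic_number(two_ifs))
```

For these structured examples the result equals the number of binary decisions plus one, which is a convenient sanity check.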
19 Data-flow-based Coverage
- We are here interested in places where data (variables) are defined and used
- Data flow of variables according to execution paths
- Defined (def)
- A value is bound to the variable
- int x = 5; x = 10;
- Used (use)
- Variable is utilized but not changed
- Predicate use (p-use)
- if (x > 0)
- Computational use (c-use)
- y = x + x
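Defs and uses become test requirements once they are paired up along def-clear paths. The sketch below assumes a flow graph where each node is annotated by hand with the variables it defines and uses, and that within a node uses happen before defs; the four-node program and the function `def_use_pairs` are hypothetical.

```python
def def_use_pairs(edges, defs, uses):
    """Return (variable, def node, use node) triples linked by a def-clear path."""
    pairs = set()
    for def_node, variables in defs.items():
        for var in variables:
            seen = set()
            stack = list(edges.get(def_node, ()))
            while stack:
                node = stack.pop()
                if node in seen:
                    continue
                seen.add(node)
                if var in uses.get(node, ()):
                    pairs.add((var, def_node, node))
                # A redefinition ends the def-clear path; otherwise keep going.
                if var not in defs.get(node, ()):
                    stack.extend(edges.get(node, ()))
    return pairs


# n1: x = 5      n2: if x > 0      n3: x = 10 (then branch)      n4: y = x + x
edges = {"n1": ["n2"], "n2": ["n3", "n4"], "n3": ["n4"], "n4": []}
defs = {"n1": {"x"}, "n3": {"x"}, "n4": {"y"}}
uses = {"n2": {"x"}, "n4": {"x"}}

for pair in sorted(def_use_pairs(edges, defs, uses)):
    print(pair)
```

The use in `n4` can see the value from `n1` (when the branch is skipped) or from `n3`, so a test set that only ever takes the then branch leaves one def-use pair unexercised.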
20 Data-flow-based Coverage
- Main adequacy criteria
- All def
- All def-use paths
- Defs?
- id
- start, end, idx
- Uses?
- p-uses?
- c-uses?
but no cycles!
21 Fault-adequacy criteria
- Estimating the number of remaining faults?
- Error seeding
- Basic idea
- Introduce artificial faults into code under development in a random fashion
- Assume these faults are representative
- Run tests
- Measure how many seeded and inherent faults are detected
- Estimate number of inherent faults
- Caveat
- Seems hard to create faults under above assumptions in practice
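One common way to turn the measurement into an estimate, sketched here as an assumption since the slides do not give a formula: if the tests find the same fraction of inherent faults as of seeded faults, then the total number of inherent faults is roughly (inherent found) × (seeded) / (seeded found). The function names and the numbers in the example are hypothetical.

```python
def estimate_inherent_faults(seeded, seeded_found, inherent_found):
    """Estimate the total number of inherent faults from error seeding results.

    Assumes seeded faults are as easy to detect as inherent ones.
    """
    if seeded_found == 0:
        raise ValueError("no seeded fault detected; the estimate is undefined")
    return inherent_found * seeded / seeded_found


def estimate_remaining_faults(seeded, seeded_found, inherent_found):
    """Estimated inherent faults that the tests have not yet revealed."""
    return estimate_inherent_faults(seeded, seeded_found, inherent_found) - inherent_found


# Example: 20 faults seeded, 10 of them found, plus 15 inherent faults found.
print(estimate_inherent_faults(20, 10, 15))
print(estimate_remaining_faults(20, 10, 15))
```

The estimate is only as good as the representativeness assumption, which is exactly the caveat raised above.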
22 Mutation testing
- Mutant
- Given a program p, a mutant m is an alternative program that is generated from p (in some way)
- Goal of mutation testing
- Kill the mutants
- Given p and a collection of mutants, M, of p, create test set t that distinguishes p from each mutant m in M
- => mutation adequate
23
- Test case that distinguishes?
- (2, n1, n1, n1)
- Remember input
- (id, predecessor, self, successor)
24 Mutation test adequacy
- What if a lot of mutants are still alive?
- Test data may be inadequate
- Mutants may be semantically equivalent to the original program
- Mutation adequacy score: $D / (M - E)$
- D: number of dead mutants
- M: total number of mutants
- E: number of equivalent mutants
- Determining E is undecidable in general
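A small sketch of how the score is computed, assuming a hypothetical two-argument maximum function as the program and three hand-written mutants. Since equivalence is undecidable in general, the set of equivalent mutants is supplied by hand here rather than computed.

```python
def original(a, b):
    """Program under test: maximum of two numbers."""
    return a if a > b else b


MUTANTS = {
    "ge_for_gt": lambda a, b: a if a >= b else b,  # equivalent: same result when a == b
    "lt_for_gt": lambda a, b: a if a < b else b,   # computes the minimum instead
    "always_a": lambda a, b: a,                    # ignores the comparison
}
EQUIVALENT = {"ge_for_gt"}  # determined by inspection, not by the code


def killed(mutant, tests):
    """A mutant is dead if some test distinguishes it from the original."""
    return any(mutant(*test) != original(*test) for test in tests)


def mutation_score(tests):
    dead = sum(1 for mutant in MUTANTS.values() if killed(mutant, tests))
    return dead / (len(MUTANTS) - len(EQUIVALENT))


print(mutation_score([(1, 1)]))
print(mutation_score([(2, 1)]))
print(mutation_score([(1, 2)]))
```

The input `(1, 1)` kills nothing, `(2, 1)` kills only the minimum mutant, and `(1, 2)` kills both non-equivalent mutants; no test can kill the equivalent one, which is why it is subtracted in the denominator.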
25 Justification of mutation testing
- Competent programmer hypothesis
- Programmers create programs that are close to being correct
- => Make small changes to the original program
- Coupling effect hypothesis
- Test cases that detect simple errors (i.e., simple mutants) also detect complex errors
- => Make simple changes to the original program
26 When is an adequacy criterion adequate?
- Weyuker, 1988
- 1. Applicability axiom
- For every program there exists an adequate test set
- 2. Non-Exhaustive Applicability axiom
- There is a program P and a test set T such that P is adequately tested by T, and T is not an exhaustive test set
- 3. Monotonicity axiom
- If T is adequate for P and T is a subset of T', then T' is adequate for P
- 1 + 3 => exhaustive testing is always adequate
- 4. Inadequate Empty Set axiom
- The empty set is not an adequate test set for any program
27 Weyuker, 1988
- Non-obvious axioms
- 5. Antiextensionality
- There are programs P and Q such that P is semantically equivalent to Q, T is adequate for P, but T is not adequate for Q
- 6. General Multiple Change Property
- There are programs P and Q which are the same shape, and a test set T such that T is adequate for P, but T is not adequate for Q
- (Same shape: the programs differ only by minor syntactic changes)
- 7. Antidecomposition
- There exists a program P and component Q such that T is adequate for P, T' is the set of vectors of values that variables can assume on entrance to Q for some t of T, and T' is not adequate for Q
- => System scope coverage does not necessarily achieve component coverage
- Why? Examples?
- 8. Anticomposition
- There exist programs P and Q, and test set T, such that T is adequate for P, and the set of vectors of values that variables can assume on entrance to Q for inputs in T is adequate for Q, but T is not adequate for P;Q
- (; = composition)
- => Adequate testing at component level is not equivalent to adequate testing of a system of components
- Why? Examples?
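One possible answer to the "Examples?" question for antidecomposition (axiom 7), sketched under the assumption that adequacy means executing every executable statement. The program `p`, the component `q`, the statement ids, and the hand-written sets of executable statements are all hypothetical.

```python
def q(x, trace):
    """Component Q: absolute value; executed statement ids go into `trace`."""
    if x >= 0:
        trace.add("q_nonneg")
        return x
    trace.add("q_neg")
    return -x


def p(x, trace, q_inputs):
    """Program P: only ever passes squares to Q, so Q never sees a negative value."""
    trace.add("p_square")
    q_inputs.append(x * x)
    return q(x * x, trace)


# Determined by inspection: inside P the negative branch of Q is unreachable.
EXECUTABLE_IN_P = {"p_square", "q_nonneg"}
EXECUTABLE_IN_Q = {"q_nonneg", "q_neg"}

T = [-2, 3]

trace_p, t_prime = set(), []
for t in T:
    p(t, trace_p, t_prime)

trace_q = set()
for value in t_prime:  # T': the values Q actually received
    q(value, trace_q)

adequate_for_p = EXECUTABLE_IN_P <= trace_p
adequate_for_q = EXECUTABLE_IN_Q <= trace_q
print(adequate_for_p, adequate_for_q)
```

T executes everything that is executable in the context of P, yet the values that reach Q never exercise its negative branch, so Q taken on its own is not adequately tested.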
28 Weyuker, 1988
- These are not enough
- Assign a Gödel number to each program so that it can be retrieved algorithmically from this number
- A test set T is Gödel adequate for a program with Gödel number p if p belongs to T
- This is a crappy adequacy criterion, but it conforms to the previous axioms
- What then?
- Renaming
- Let P be a renaming of Q; then T is adequate for P if and only if T is adequate for Q
- Complexity
- For every n, there is a program P, such that P is adequately tested by a size n test set, but not by any size n-1 test set
- Statement Coverage
- If T is adequate for P, then T causes every executable statement of P to be executed.
30 The Testing Maturity Model
- Level 1: Initial
- "In the beginning was chaos."
- Level 2: Phase definition
- Institutionalize procedures
- Start test planning
- Develop test/debug goals
- Level 3: Integration
- Control/monitor the test process
- Integrate testing in the software development cycle
- Establish training program
- Establish test organisation
- Level 4: Management and measurement
- Software quality evaluation
- Establish test measurement program
- Organisation-wide review program
- Level 5: Optimisation/defect prevention/quality control
- Test process optimisation
- Quality control
- Use process data for defect prevention
31 TMM Level 2
- Maturity goal
- Institutionalize basic testing techniques and methods
- Whitebox (and blackbox) testing are fundamental techniques
- But different organizations/projects need to choose, e.g., based on criticality of system and skill level of developers
- Need to be reflected in
- Policy statements
- Test plans
- Training material
- Also contributions to higher levels of TMM
- Can be embedded in software lifecycle
- Can contribute to monitoring quality
32 Roles
- Developer/tester
- Learn and use techniques
- Report
- Select techniques judiciously
- Managers
- Ensure proper education
- Provide resources
- Design organizational policies and standards
- User/client
- Give precise (enough) descriptions of requirements
- Not directly involved
33 Selecting Techniques
34 Summary
- Whitebox testing is a supplement to other testing techniques
- A lot of sub-techniques exist
- Mostly useful on smaller units
- In practice blackbox and whitebox are used more or less simultaneously