Title: Software Testing
1Software Testing
- Sudipto Ghosh
- CS 406 Fall 99
- November 23, 1999
2Learning Objectives
- Coverage Principle
- Saturation principle
- Mutation testing
- Test case generation and tool support
- Testing OO programs
- Testing distributed and real-time systems
3Principle underlying test assessment
- Uniform principle that underlies test assessment
throughout the testing process
- Coverage Principle
- Result of intensive research at Purdue and other
research groups in software testing
4The coverage principle
- To formulate and understand the coverage
principle, we need to understand
- coverage domains
- coverage elements
- Coverage domain
- finite domain, related to the program under test
- Coverage element
- individual elements in the domain
5The coverage principle (contd)
[Figure: coverage domains and their coverage elements. Domains shown: requirements, functions, statements, branches, all-uses, all paths]
6The coverage principle
- Measuring test adequacy and improving a test set
against a set of well-defined, increasingly
strong, coverage domains leads to improved
confidence in the reliability of the system under
test.
7The coverage principle (contd)
- Note the following properties of a coverage
domain
- It is related to the program under test.
- It is finite.
- It may come from program requirements, related to
the inputs and outputs.
- Exercise: Give examples.
- It may come from the program code.
- Exercise: Give examples.
- It aids in measuring test adequacy and the
progress made in testing.
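A minimal sketch of how adequacy against a finite coverage domain might be measured. The toy function `classify`, its hand-placed branch labels, and the helper `coverage` are all hypothetical names chosen for illustration; they are not from the lecture.

```python
def classify(x, covered):
    """Toy program under test; `covered` records which branch elements ran."""
    if x < 0:
        covered.add("negative")
        return "negative"
    if x == 0:
        covered.add("zero")
        return "zero"
    covered.add("positive")
    return "positive"


# The coverage domain: a finite set of elements derived from the code.
COVERAGE_DOMAIN = {"negative", "zero", "positive"}


def coverage(test_inputs):
    """Return the fraction of coverage elements exercised by the test set."""
    covered = set()
    for x in test_inputs:
        classify(x, covered)
    return len(covered & COVERAGE_DOMAIN) / len(COVERAGE_DOMAIN)
```

A test set such as `[1, 2]` exercises only one of the three elements, while `[-1, 0, 5]` exhausts this domain. Exhausting the domain says nothing about faults that the domain does not capture.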
8Using coverage criteria
- Improving coverage improves our confidence in the
correct functioning of the program under test
- Given a program P, and a test set T, suppose that
T is adequate w.r.t. a coverage criterion C.
- Does that mean that P is error free?
- Obviously ???
9Test effort
- Several measures available
- Size of the test set T
- A test set with a larger number of test cases
corresponds to higher effort than one with fewer
test cases
10Error detection effectiveness
- Each coverage criterion has its error detection
ability.
- How do you measure effectiveness?
- Fraction f of faults guaranteed to be revealed by
a test T that satisfies C.
- At least a fraction f of the faults in P will be
revealed by test T that satisfies C.
- No absolute measure of effectiveness of any given
coverage criterion for a general class of
programs and arbitrary test sets.
11Error detection effectiveness (contd)
- Empirical studies conducted by researchers give
us an idea of the relative goodness of various
coverage criteria.
- For a variety of criteria, we can make a
statement like
- Criterion C1 is definitely better than C2.
- Sometimes, we can say
- Criterion C1 is probably better than C2.
12Error detection effectiveness (contd)
- Such information allows us to construct a
hierarchy of coverage criteria
- This hierarchy helps in organizing and managing
testing.
13The saturation effect
- The rate at which new faults are discovered
reduces as test adequacy w.r.t. a finite coverage
domain increases.
- The rate reduces to zero when the coverage domain
is exhausted.
[Figure: rate of fault discovery plotted against coverage, from 0 to 1]
14Saturation effect: Fault view
[Figure: remaining faults (axis marks N, M, 0) plotted against testing effort (axis marks tfs, tfe, tds, tdfe, tme); the first phase is labeled Functional]
15Saturation effect: Reliability view
[Figure: reliability plotted against testing effort, showing true reliability (R), estimated reliability, and the saturated region. Curve segments are labeled Functional, Decision, Dataflow, and Mutation; reliability levels are marked Rf, Rd, Rdf, Rm; effort markers are tfs, tfe, tds, tde, tdfs, tdfe, tms]
Functional, decision, dataflow and mutation coverage provide various test evaluation criteria.
16Mutation testing
- Fundamentally different from previous approaches.
- Take a program.
- Create mutants by making simple changes.
- Goal of testing is to make sure that each mutant
produces an output different from the output of
the original program.
17Mutant operators
- A mutant operator makes a small unit change in the program
- Replace an arithmetic operator by another
- Replace a constant by another
- Change an array reference
- Change the label for a goto statement
- Replace a variable by a special value (like 0)
- First-order mutant obtained by making exactly one
change.
- Need to model real faults.
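As an illustration of the "replace an arithmetic operator by another" operator, the sketch below generates first-order mutants of a small Python function using the standard `ast` module. The replacement table `REPLACEMENTS` and the function name `first_order_mutants` are assumptions made for this example, and `ast.unparse` requires Python 3.9 or later.

```python
import ast
import copy

# Hypothetical replacement table: which operators each arithmetic operator mutates into.
REPLACEMENTS = {
    ast.Add: [ast.Sub, ast.Mult],
    ast.Sub: [ast.Add],
    ast.Mult: [ast.Add],
}


def first_order_mutants(source):
    """Return source text of every mutant that differs by exactly one operator."""
    tree = ast.parse(source)
    binops = [n for n in ast.walk(tree) if isinstance(n, ast.BinOp)]
    mutants = []
    for index, node in enumerate(binops):
        for new_op in REPLACEMENTS.get(type(node.op), []):
            # Copy the whole tree so each mutant contains a single change.
            mutant_tree = copy.deepcopy(tree)
            target = [n for n in ast.walk(mutant_tree)
                      if isinstance(n, ast.BinOp)][index]
            target.op = new_op()
            mutants.append(ast.unparse(mutant_tree))
    return mutants
```

For the source `def f(a, b): return a + b`, this yields two mutants, one returning `a - b` and one returning `a * b`.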
18Basis for mutation testing
- Competent programmer hypothesis
- Coupling effect
19Competent programmer hypothesis
- Programmers are generally competent and don't
create programs at random.
- A programmer produces a program that is very
close to a correct program.
- A correct program can be constructed from an
incorrect program with minor changes in the
program.
20Coupling effect
- Test cases that distinguish programs with minor
differences from each other are so sensitive that
they will also distinguish programs with more
complex differences.
21Mutation testing
- You are building up your test set T with test
cases.
- Generate mutants for P. Suppose that there are N
mutants.
- Execute each mutant and P on each test case in T.
- Find how many mutants were distinguished. These
mutants are called dead. Let there be D dead
mutants.
22Mutation testing
- For each mutant that was not distinguished
(called live), find out how many are equivalent
to P, i.e. they will always produce the same
output as P. Let E be the number of equivalent
mutants.
- The mutation score is D / (N - E).
- Add more test cases to T and continue testing
until the mutation score is 1.
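The procedure can be sketched on a toy example, taking the mutation score to be D / (N - E), the dead mutants divided by the non-equivalent ones. The program `original`, the hand-written list `MUTANTS`, and the helper names are hypothetical; the equivalent mutant is identified by hand here, since equivalence cannot be decided automatically in general.

```python
def original(a, b):
    """Program P under test."""
    return a + b


# Hand-written first-order mutants of P (N = 3).
MUTANTS = [
    lambda a, b: a - b,  # '+' replaced by '-'
    lambda a, b: a * b,  # '+' replaced by '*'
    lambda a, b: b + a,  # operands swapped: equivalent to P for numbers
]
EQUIVALENT = 1  # E, determined by inspection


def dead_mutants(tests):
    """Count mutants distinguished from P by at least one test case (D)."""
    return sum(
        any(mutant(*t) != original(*t) for t in tests)
        for mutant in MUTANTS
    )


def mutation_score(dead, total, equivalent):
    """Mutation score D / (N - E)."""
    return dead / (total - equivalent)
```

With `T = [(2, 2)]` only the subtraction mutant dies, giving a score of 0.5. Adding the test case `(1, 2)` kills the multiplication mutant as well and raises the score to 1.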
23Using mutation testing to improve test sets
- Empirically shown to be stronger than c-uses and
p-uses.
- Suppose that you have used some coverage criteria
and you have a test set T adequate w.r.t. the
criteria.
- You have removed all the errors found during
testing and now the program passes the tests in
T.
- You can use mutation testing to improve T.
24Performance
- Serious problem with mutation testing
- Number of mutants generated with the first-order
operators is large
- Fortran: for a program with L lines of code, the
total number of mutants is roughly L^2
- All have to be compiled and executed
- Tester has to spend considerable time to decide
if some mutants are equivalent. This is an
undecidable problem.
25Test case generation and tool support
- Once you have decided on a coverage criterion, there
are two problems
- Decide if a test set is adequate w.r.t. the
criterion
- Generate a set of test cases
- Difficult to do both tasks manually
- Feasibility of a path is an undecidable problem
26Test generation
- Fully automated tool not possible
- Tools aid the tester
- Ways
- Tool randomly generates tests until the criterion
is satisfied
- Generates a lot of unnecessary tests
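A minimal sketch of random generation against a branch coverage criterion, showing why many generated tests turn out to be unnecessary. The toy program `branch_of`, the input range, and the `max_tries` cutoff are assumptions for this example only.

```python
import random


def branch_of(x):
    """Toy program under test: returns the label of the branch x exercises."""
    if x < 0:
        return "negative"
    if x == 0:
        return "zero"
    return "positive"


ALL_BRANCHES = {"negative", "zero", "positive"}


def random_tests_until_covered(rng, low=-5, high=5, max_tries=10_000):
    """Generate random inputs until every branch is covered or max_tries is hit."""
    covered, kept, generated = set(), [], 0
    while covered != ALL_BRANCHES and generated < max_tries:
        x = rng.randint(low, high)
        generated += 1
        branch = branch_of(x)
        if branch not in covered:  # this test adds coverage, so keep it
            covered.add(branch)
            kept.append(x)
    return kept, generated
```

Calling `random_tests_until_covered(random.Random(0))` returns the tests that each added coverage together with the total number generated. The difference between the two counts is the number of tests that contributed nothing to the criterion.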
27Test generation
- Testing is done manually by performing structural
testing iteratively
- Tool gives feedback on what paths are left
- Static dataflow tools can be used to decide what
input values should be given for that path to be
executed
- Symbolic evaluation techniques can be used
- Ingenuity and creativity of the tester are
important
- Often the criterion is weakened (< 100%)
28Testing tools
- Usual steps
- Instrument the program with probes
- Execute the program with test cases
- Analyze the results of the probe data
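The three steps can be sketched with Python's `sys.settrace`, which here stands in for inserted probes by reporting each executed line. The helper `run_with_probes` and the sample function `absolute` are hypothetical names; a real coverage tool would typically rewrite or annotate the program instead.

```python
import sys


def run_with_probes(func, *args):
    """Run func and return the executed line offsets, relative to its def line."""
    executed = set()
    code = func.__code__

    def tracer(frame, event, arg):
        # Record only 'line' events that belong to the function under test.
        if frame.f_code is code and event == "line":
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)          # step 1: instrument
    try:
        func(*args)               # step 2: execute with a test case
    finally:
        sys.settrace(previous)    # restore whatever tracer was active before
    return executed               # step 3: probe data for analysis


def absolute(x):
    if x < 0:
        return -x
    return x
```

Running `run_with_probes(absolute, 5)` and `run_with_probes(absolute, -5)` gives two different sets of offsets; their union shows which lines the test set has reached and which remain uncovered.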
29Testing object oriented programs
- Not really different if you are using black-box
testing
- Why is white-box testing different for OO?
- White-box likely to be at unit level.
- Basic unit is Class
- Fundamental differences between testing classes
and functions
- Testing methods individually and functions may be
similar
30Testing OO Programs
- Testing all methods separately is not the same as
testing the class
- A class cannot be tested directly; only an instance
can be tested.
- How do you test abstract classes?
- Control flow is characterized by message passing
among objects; there is no sequential control flow
within a class like in a function
- Different approach required.
31Testing OO Programs
- State associated with an object also influences the
path of execution; methods of a class can
communicate among themselves using the state.
- For a function, arguments and global variables
determine the path of execution
- Inheritance
- structure of inheritance hierarchy
- kind of inheritance
32Problems with inheritance
- Structure (lattice or tree)
- Lattice: a class at the bottom may inherit some
features more than once
- Classes are more complex and error-prone
- Kind (strict or redefining allowed)
- Dynamic binding of operations: we don't know
until run-time whether the operation is defined
33Techniques
- Exercise the routines provided by an object with
the goal of uncovering errors in the
implementation of the routines or state of the
object or both
- Find patterns of method invocation of the object
under test with different arguments.
- State-based testing tests for interaction by
changing the state of the object under which the
methods are tested.
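A minimal sketch of state-based testing: the same method is invoked on a fresh object placed in different states, and the outcome and resulting state are observed. The class `BoundedStack`, its three abstract states, and the driver `drive` are hypothetical, introduced only for this example.

```python
class BoundedStack:
    """Hypothetical class under test with abstract states: empty, partial, full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []

    def push(self, item):
        if len(self.items) >= self.capacity:
            raise OverflowError("stack is full")
        self.items.append(item)

    def pop(self):
        if not self.items:
            raise IndexError("stack is empty")
        return self.items.pop()

    def state(self):
        if not self.items:
            return "empty"
        return "full" if len(self.items) == self.capacity else "partial"


def drive(setup, method, *args):
    """Put a fresh object into a state, invoke one method, report what happened."""
    stack = BoundedStack(capacity=2)
    for item in setup:
        stack.push(item)
    before = stack.state()
    try:
        getattr(stack, method)(*args)
        outcome = "ok"
    except (OverflowError, IndexError) as exc:
        outcome = type(exc).__name__
    return before, outcome, stack.state()
```

For instance, `drive([1], "push", 2)` moves the object from partial to full, while `drive([1, 2], "push", 3)` shows that the same method fails with `OverflowError` when the object starts out full.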
34Testing distributed software
- Uniprocessor environment assumptions do not hold
- Global environment
- Deterministic execution
- Sequential execution of instructions
- Insertion of debugging statements does not alter
execution of product
- Timing-dependent faults
- Need special techniques
- history files
35Issues in testing real-time systems
- Real-world inputs
- No control over timing
- No control over order
- Similar problems as with distributed software
- Often no human operators
- More robustness required
- high fault-tolerance
- Difficulty in setting up test cases
36Testing real-time systems
- Structure analysis
- Correctness proofs
- Systematic testing
- Statistical testing
- Simulation