1
Mutation testing
I am grateful to Kostas Adamopoulos for his
permission to use slides he prepared about
mutation testing as part of this week's slides
2
Tutorial 6 - solutions
  • 1. Binkley and Harman found that the typical size
    of a slice was about a third of the program from
    which it is constructed. However, how can they be
    sure that their slices are typical? Provide a
    critique of their choice of slicing criteria for
    their study.
  • Sample Answer
  • In a given application, only a certain kind of
    slice will be constructed. For example, if a
    program is to be decomposed to help with
    comprehension, then slices need only be
    constructed at the end of a procedure for
    important variables. The slices in this
    situation are only a small subset of the whole
    space of all possible slices considered by
    Binkley and Harman. Thus, in different
    applications, the size of a typical slice may
    be very different.
  • Binkley and Harman include all criteria. This
    will include those which tend towards degenerate
    criteria, such as backward slicing right at the
    start of a procedure, which may produce a very
    small slice. This might artificially reduce the
    average size of a slice for the sample of all
    possible criteria, compared to realistic
    choices of sets of criteria.

3
Tutorial 6 - solutions
  • 2. Why is an amorphous slice always as small and
    possibly smaller than a syntax-preserving slice
    constructed for the same criterion?
  • Sample Answer
  • Amorphous slicing permits transformation but
    does not require it. All syntax-preserving
    slices are also amorphous slices (though the
    reverse is not true). Therefore, a valid (though
    silly) amorphous slicing algorithm would be to
    simply produce the syntax-preserving slice.
    Therefore an amorphous slice need be no larger
    than the corresponding syntax-preserving slice.
    However, as we have seen, the ability to use
    transformation can reduce the size of the slice,
    so amorphous slices may be smaller than their
    syntax-preserving counterparts.

4
Tutorial 6 - solutions
  • 3. Give two situations in which syntax-preserving
    slicing would be better than amorphous slicing
    and two in which amorphous slicing would be
    better than syntax-preserving slicing
  • Sample Answer
  • In debugging, we want to find a bug in a program,
    so it would not be helpful if the slice
    transformed the program: syntax-preserving
    slicing would be better.
  • In measuring cohesion using the overlap of
    slices, we need to perform operations like union
    and intersection on slices, and this requires
    syntax-preservation to be meaningful:
    syntax-preserving slicing would be better.
  • In testing we are concerned with generating test
    cases. If we are trying to cover some
    predicate-controlled branch, then we may as well
    have an amorphous slice, which will be smaller
    (especially if it is faster). The user does not
    need to see the slice, so it does not matter if
    the syntax is preserved: amorphous slicing will
    be better.
  • In comprehension of an unfamiliar program,
    slicing can be used to split the program into
    smaller parts: the smaller the better. Since
    amorphous slices will be no larger than
    syntax-preserving slices, amorphous slicing will
    be better.

5
Tutorial 6 - solutions
  • 4. Consider the program below
  • D = 2*r
  • FaceArea = Pi*r*r
  • C = Pi*D
  • SurfaceArea = 2*FaceArea + C*h
  • slice = SurfaceArea
  • What is the smallest amorphous slice on the final
    value of the variable slice?
  • It is
  • slice = 2 * (r^2 + h*r) * Pi

6
Tutorial 6 - solutions
  • The syntax-preserving slice for the same program
    and criterion is the whole program
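
The answer to question 4 can be sanity-checked by running both versions on a few inputs. The sketch below assumes the program reads D = 2*r, FaceArea = Pi*r*r, C = Pi*D, SurfaceArea = 2*FaceArea + C*h; the function names are made up for illustration.

```python
import math


def original_program(r, h):
    """The five-statement program; returns the final value of `slice`."""
    d = 2 * r
    face_area = math.pi * r * r
    c = math.pi * d
    surface_area = 2 * face_area + c * h
    slice_ = surface_area
    return slice_


def amorphous_slice(r, h):
    """Candidate one-statement amorphous slice: 2 * (r^2 + h*r) * Pi."""
    return 2 * (r ** 2 + h * r) * math.pi


# Both versions should agree (up to floating-point rounding) on every input.
for r, h in [(1.0, 2.0), (0.5, 10.0), (3.0, 0.0)]:
    assert math.isclose(original_program(r, h), amorphous_slice(r, h))
```

Agreement on sample inputs is only evidence, not a proof; the algebra 2*Pi*r*r + Pi*2*r*h = 2*(r^2 + h*r)*Pi is what justifies the slice.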

7
Faults and Failures
  • Fault is the group of incorrect statements in
    the program that causes a failure
  • Failure is an external, incorrect behaviour of a
    program (i.e. an incorrect output or a runtime
    failure)

Faults generate Failures
8
MT is based on two basic Assumptions
  • The Competent Programmer Hypothesis: In general
    programmers are competent. That is, the programs
    they write are nearly correct. The program
    differs from a correct version in only a few
    small ways.
  • The Coupling Effect Hypothesis: Large program
    faults, particularly those of a semantic nature,
    are coupled with smaller syntactic faults that
    can be detected with mutation testing
    (hypothesised 1978, supported empirically 1992,
    demonstrated theoretically 1995, but is it true?).

9
What is Mutation Testing
  • White-box, error-based testing technique
  • Built-in adequacy criteria
  • The quality of a test set is measured according
    to its effectiveness or ability to detect faults
  • Tool support
  • Goal is generating good sets of tests rather than
    finding faults

10
The Idea Underpinning Mutation Testing
  • Seeding the implementation with a fault (mutating
    the original program) by applying a mutation
    operator
  • Then determine whether testing identifies this
    fault
  • Different result
  • the fault introduced has been identified
  • If the test case distinguishes between the mutant
    and the original program
  • it is said to kill the mutant
  • Same result
  • the mutant and the original program produce
    identical results
  • the mutant is still alive

11
How it works (again)
Program P
Apply mutation operator
Mutant P'
Test both P and P' using the same test case T
  • If P and P' give different results, then mutant
    P' is killed by test case T, so T can detect the
    difference between the correct and buggy program
  • If P and P' give the same results then either
  • Test case T is not good enough to detect the
    fault, so we need to come up with a better test
  • Or P and P' are equivalent programs: P' is an
    equivalent mutant, and no tests can be devised
    that can distinguish them

So this is mainly a way to judge the
effectiveness of our test data, to improve them,
and by doing that to detect faults (syntactic and
semantic ones)
12
Test Data Effectiveness
  • Mutation testing provides a way to judge the
    effectiveness of the test data
  • The test set should kill all the mutants
  • If not, then we can improve the test set
  • Test generation may be based on mutation testing
  • Tests are generated to kill the mutants
  • In the same way mutants should be of high
    performance
  • Difficult to be eliminated

13
Mutation Score (MS)
  • Mutation Score (Adequacy Score) of Program P and
    test set T is

$$MS(P, T) = \frac{\text{number of killed mutants}}{\text{number of non-equivalent mutants}}$$

where the number of non-equivalent mutants is the
total number of mutants minus the number of
equivalent mutants, and $0 \le MS \le 1$ (or
$0\% \le MS \le 100\%$). Test data is
mutation-adequate if its mutation score is 100%
(in this case they kill all non-equivalent mutants)
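
As a small worked example of the mutation score definition, the following sketch computes MS from three counts. The function name and the sample numbers are illustrative only.

```python
def mutation_score(killed, total, equivalent):
    """Killed mutants divided by non-equivalent mutants, as a value in [0, 1]."""
    non_equivalent = total - equivalent
    if non_equivalent <= 0:
        raise ValueError("need at least one non-equivalent mutant")
    if not 0 <= killed <= non_equivalent:
        raise ValueError("killed must lie between 0 and the non-equivalent count")
    return killed / non_equivalent


# 50 mutants, 5 of them equivalent, 36 killed: 36 / 45 = 0.8, i.e. 80%.
print(mutation_score(killed=36, total=50, equivalent=5))
```

In practice the number of equivalent mutants is usually not known exactly, so the score can only be estimated.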
14
Mutants
  • Original Program P
  • Mutant P' of P
  • A program similar to P
  • P' differs from P by a single mutation
  • Each kind of mutation corresponds to a typical
    error programmers usually make
  • Off-by-one, spelling, typos, etc.
  • e.g. imagine that P' was identical to P except
    that exactly one operator was changed to another

15
Mutation Operators (also called mutant operator,
mutagenic operator, mutagen, mutation
transformation, mutation rule)
  • It is a rule that is applied to a program to
    create mutants
  • Replace each operand by every other syntactically
    legal operand
  • Modify expressions by replacing operators and
    inserting new operators
  • Delete entire statements, etc.
  • Categorised into mutation classes
  • Statement, Operator, Variable, Constant, etc.
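
To make the idea of a mutation operator as a rule concrete, here is a minimal sketch of an arithmetic-operator-replacement rule written against Python's standard `ast` module (Python 3.9 or later for `ast.unparse`). The class and function names are hypothetical, and the four-operator set is an assumption, not the operator set of any tool named in these slides.

```python
import ast

# Arithmetic operators that may be swapped for one another (an assumed, minimal set).
ARITHMETIC_OPS = (ast.Add, ast.Sub, ast.Mult, ast.Div)


class ReplaceOneOperator(ast.NodeTransformer):
    """Replace the operator of the target-th binary expression with new_op."""

    def __init__(self, target, new_op):
        self.target = target
        self.new_op = new_op
        self.seen = 0
        self.changed = False

    def visit_BinOp(self, node):
        self.generic_visit(node)  # visit nested expressions first
        index = self.seen
        self.seen += 1
        if index == self.target and not isinstance(node.op, self.new_op):
            self.changed = True
            return ast.BinOp(left=node.left, op=self.new_op(), right=node.right)
        return node


def arithmetic_mutants(source):
    """Return the source text of every mutant that differs by one operator."""
    count = sum(isinstance(n, ast.BinOp) for n in ast.walk(ast.parse(source)))
    mutants = []
    for target in range(count):
        for new_op in ARITHMETIC_OPS:
            transformer = ReplaceOneOperator(target, new_op)
            tree = ast.fix_missing_locations(transformer.visit(ast.parse(source)))
            if transformer.changed:
                mutants.append(ast.unparse(tree))
    return mutants


print(arithmetic_mutants("x = x + y"))
# ['x = x - y', 'x = x * y', 'x = x / y']
```

Each generated program differs from the original by exactly one change, which is the single-mutation property of a mutant.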

16
Examples of Mutation Operators
Some Mothra Mutation Operators for Fortran (from
a total of 22 operators)
Proteum has 71 Mutation Operators for C,
categorised as follows: Statement 15, Operator
46, Variable 7, Constant 3
17
Creating a Mutant
Apply a Mutation Operator: change one operator
(op) to another (op')
Original Program P: .. c = a op b ..
Mutant P' of Original Program P: .. c = a op' b ..
18
Testing a Mutant
Original Program P: .. c = a op b ..
Mutant P': .. c = a op' b .. (op' is the mutated operator)
Apply Test Case T to both P and P'
Result R (from P)
Result R' (from P')
If R <> R' then mutant P' is killed by Test Case T
If R = R' then improve Test Case T and test again
Loop: if we continue improving test case T and
still get R = R' then P' is possibly an
Equivalent Mutant
19
An Example of a Mutant
  • Program P: x = y + z
  • A mutant P' of P: x = y * z
  • Test case 1 (y = 3, z = 1) kills the mutant:
    P gives x = 4, P' gives x = 3
  • Test case 2 (y = 2, z = 2): the mutant is still
    live; P gives x = 4, P' gives x = 4
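
The two test cases can be replayed in a few lines. This sketch takes P as x = y + z and the mutant as x = y * z, which is consistent with the results on the slide (4 versus 3, then 4 versus 4); the function names are illustrative.

```python
def p(y, z):
    """Original program P."""
    return y + z


def p_mutant(y, z):
    """Mutant P': the + operator is replaced by *."""
    return y * z


def kills(test_input):
    """A test case kills the mutant when P and P' give different results."""
    y, z = test_input
    return p(y, z) != p_mutant(y, z)


print(kills((3, 1)))  # True: P gives 4, P' gives 3
print(kills((2, 2)))  # False: both give 4, so the mutant is still live
```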
20
Equivalent Mutant Problem
  • An equivalent mutant is syntactically different
    from the original program, but has the same
    behavior.
  • The general problem of deciding whether a mutant
    is equivalent to the original program is
    theoretically undecidable.
  • This is a hugely important obstacle which will
    need to be overcome to facilitate mutation
    testing in practice.

21
Example of an Equivalent Mutant
  • Original Program P: if (y == 2 and z == 2) x = y + z
  • Mutant P' of P: if (y == 2 and z == 2) x = y * z
  • P gives x = 4, P' gives x = 4

P' is an equivalent mutant of P because no
possible test can ever kill this mutant. If the
condition is true the mutated statement returns
x = 4 for any possible test case, same as the
original statement.
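
A brute-force search over a small input range illustrates why no test kills this mutant. The sketch assumes the guard is "y equals 2 and z equals 2" and that the mutation replaces + by *; the default starting value of x and the search range are arbitrary choices.

```python
from itertools import product


def p(y, z, x=0):
    """Original: the assignment only runs when y == 2 and z == 2."""
    if y == 2 and z == 2:
        x = y + z
    return x


def p_mutant(y, z, x=0):
    """Mutant: + replaced by * inside the same guard."""
    if y == 2 and z == 2:
        x = y * z
    return x


def distinguishing_inputs(values):
    """All (y, z) pairs drawn from `values` on which P and P' disagree."""
    return [(y, z) for y, z in product(values, repeat=2)
            if p(y, z) != p_mutant(y, z)]


print(distinguishing_inputs(range(-10, 11)))  # []
```

An empty result over a finite range does not prove equivalence in general. Here the proof is the guard itself: the mutated statement only runs when 2 + 2 and 2 * 2 coincide.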
22
Equivalent Mutant Problem
  • Determining whether a mutant is equivalent is not
    decidable
  • How do we know whether a mutant which remains
    unkilled is simply hard to kill (stubborn) or
    equivalent?
  • Could we avoid generating equivalent mutants?

23
Large Number of Mutants
[Figure: the original program P surrounded by its mutants P1, P2, P3, P4, ..., Pn; each mutant differs from P at a single mutated location, marked M]
24
Large Number of Mutants
  • Even for simple programs n can be a very large
    number
  • n depends on the size of P and on how many
    mutation operators we apply to P
  • Reducing the number of mutants is the second
    problem that needs to be addressed

25
Existing Methodologies to Reduce Large Number of
Mutants
  • Selective Mutation
  • Reduces the number of mutation operators applied
  • Mutant Sampling
  • Randomly selects a subset of mutants

The number of mutants under test is reduced
26
Advances in Mutation Testing
  • Reduction technique: Selective Mutation
  • Approximation technique: Weak Mutation
  • Algorithmic execution technique: Schema-based
    Mutation
  • Heuristics for detecting equivalent mutants
  • Algorithms for automatic test data generation
  • Interface Mutation, Class Mutation
  • Distribution of computational expense
  • Avoiding human intensiveness

27
Selective Mutation (a "do fewer" approach)
  • Applying mutation with only the most critical
    mutation operators being used (key operators)
  • provide almost the same coverage as non-selective
    mutation
  • Select only mutants that are truly distinct from
    other mutants
  • decreases the number of mutants produced
  • reduces computational cost significantly
  • Getting as much testing strength as possible with
    as few mutants as possible

28
Mutation Sampling (a "do fewer" approach)
  • Sampling only a randomly selected subset of the
    mutants to run
  • Using samples of some a priori fixed size
  • Using samples without a priori fixed size
  • select mutants until sufficient evidence has been
    collected to determine that a statistically
    appropriate sample size has been reached
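
The fixed-size variant of mutant sampling can be sketched with the standard library's random sampling. The function name, the `fraction` parameter and the seed are illustrative assumptions; the variant without an a priori size needs a statistical stopping rule that is not shown here.

```python
import random


def sample_mutants(mutants, fraction, seed=None):
    """Return a random subset holding roughly `fraction` of the mutants."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not mutants:
        return []
    rng = random.Random(seed)  # a seed makes the sample reproducible
    size = max(1, round(len(mutants) * fraction))
    return rng.sample(mutants, size)


all_mutants = [f"mutant_{i}" for i in range(100)]
subset = sample_mutants(all_mutants, fraction=0.1, seed=1)
print(len(subset))  # 10
```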

29
Weak Mutation (a "do smarter" approach)
  • An approximation technique that compares the
    internal states of the mutant and the original
    program immediately after execution of the
    mutated portion of the program
  • Reduces the computational cost, but do we really
    get what we want?

30
Weak Mutation Example
Original: x = x + y; x = z; Print (x)
Weak mutation: x = x + y; /* inspect x */ x = z; Print (x)
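
The difference between weak and strong killing can be seen in a small sketch. It assumes the example program is x = x + y followed by x = z and a print of x, and that the mutant changes the first statement to x = x - y; both are assumptions made for illustration.

```python
def run(first_statement, x, y, z):
    """Return (x just after the first statement, the final printed value)."""
    x = first_statement(x, y)
    inspected = x  # weak mutation compares the state at this point
    x = z          # this assignment overwrites the effect of the first one
    return inspected, x


def original(x, y):
    return x + y


def mutant(x, y):
    return x - y


def weakly_killed(x, y, z):
    """Internal states differ immediately after the mutated statement."""
    return run(original, x, y, z)[0] != run(mutant, x, y, z)[0]


def strongly_killed(x, y, z):
    """Final outputs differ."""
    return run(original, x, y, z)[1] != run(mutant, x, y, z)[1]


print(weakly_killed(1, 2, 3))    # True: 3 versus -1 after the first statement
print(strongly_killed(1, 2, 3))  # False: both print 3
```

Under this reading, weak mutation reports the mutant as killed even though no output ever changes, which is one way to see the question "do we really get what we want?".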
31
Using Distributed Computational Resources (a "do
smarter" approach)
  • Using novel computer architectures to distribute
    the computational expense over several machines

32
Using Intelligent Algorithms (a "do smarter"
approach)
  • Intelligently storing state information, this
    technique factors the expense of running a mutant
    over several related mutant executions and
    thereby lowers the total computational cost

33
Schema-based Mutation (a "do faster" approach)
  • Not mutating an intermediate form
  • The Mutant Schema Generation (MSG) method encodes
    all mutations into one source-level program, a
    metamutant
  • This program is compiled (once), with the same
    compiler used during development and is executed
    in the same operational environment at
    compiled-program speeds

34
Example
Original statement: x = x + y
Metamutant:
Switch (N)
  Case 1: x = x - y
  Case 2: x = x / y
  Case 3: x = x * y
  Case 4: x = z + y
  Case 5: ...
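
A metamutant can be sketched as one function with a selector argument, so that a single program stands for the original and all of its mutants. The sketch assumes the original statement is x = x + y and picks four mutations for the cases; the exact operators and their numbering are assumptions, and n = 0 is used here for the unmutated program.

```python
def metamutant(n, x, y, z):
    """One source-level program encoding the original (n = 0) and four mutants."""
    if n == 0:
        return x + y  # original statement
    if n == 1:
        return x - y  # arithmetic operator replaced
    if n == 2:
        return x / y
    if n == 3:
        return x * y
    if n == 4:
        return z + y  # variable x replaced by z
    raise ValueError("unknown mutant number")


def killed_mutants(test_input):
    """Mutant numbers whose result differs from the original on this input."""
    x, y, z = test_input
    expected = metamutant(0, x, y, z)
    return [n for n in range(1, 5) if metamutant(n, x, y, z) != expected]


print(killed_mutants((6, 2, 6)))  # [1, 2, 3]: mutant 4 survives because z == x
print(killed_mutants((6, 2, 1)))  # [1, 2, 3, 4]
```

The selector plays the role of the switch variable N: the program is built once and each mutant is chosen at run time.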

35
Mutation Testing Tools
  • Mothra (for Fortran 77 )
  • Downloadable: http://www.isse.gmu.edu/ofut/rsrch/mut.html
  • For UNIX systems
  • 22 mutation operators
  • Interpretive approach
  • Proteum: PROgram TEsting Using Mutants (for C)
  • Proteum/IM, Proteum/IM 2.0, Proteum/FSM,
    Proteum/ST, Proteum/PN
  • Downloadable?
  • For UNIX systems
  • 71 operators
  • Separate compilation approach
  • Jester: the JUnit test tester (for Java)
  • Downloadable: http://jester.sourceforge.net/
  • Insure++ (for C)
  • Commercial product

36
Mothra
  • Mothra is a suite of tools for performing
    mutation testing for Fortran 77
  • Interpretive execution
  • Mutgen for generating mutants
  • A testing harness for running a test on a set of
    mutant programs and recording the results
  • Godzilla for automatically generating test cases

37
Using Mothra
  • Select and generate a set of mutants
  • Generate an initial set of test cases and the
    corresponding outputs that they generate
  • Confirm the outputs are correct
  • Repeat until all mutants are killed
  • Run the mutants on the test sets
  • Equivalent mutants
  • Generate and confirm new tests
  • When you are done, you have an adequate suite of
    tests
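
The "repeat until all mutants are killed" loop can be sketched generically. This is not Mothra's interface; the function, the mutant names and the test inputs are made up to show the shape of the workflow.

```python
def live_mutants(original, mutants, tests):
    """Names of mutants that no test distinguishes from the original."""
    live = []
    for name, mutant in mutants.items():
        if all(mutant(*inputs) == original(*inputs) for inputs in tests):
            live.append(name)
    return live


def original(y, z):
    return y + z


mutants = {
    "times": lambda y, z: y * z,
    "minus": lambda y, z: y - z,
}

tests = [(2, 2)]
print(live_mutants(original, mutants, tests))  # ['times']

# A live mutant prompts a new test; its expected output must still be confirmed.
tests.append((3, 1))
print(live_mutants(original, mutants, tests))  # []
```

Any mutant that stays live after repeated attempts has to be examined by hand to decide whether it is equivalent or merely stubborn.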

38
Proteum Family Tools
  • Proteum is a suite of tools for performing
    mutation testing for C programs
  • Unit testing (Proteum), Integration testing
    (Proteum/IM), both (Proteum/IM 2.0), Finite State
    Machines (Proteum/FSM), Statecharts
    Specifications (Proteum/ST), Petri Nets
    specification (Proteum/PN)
  • Test case handling (execution, inclusion/exclusion,
    etc.), Mutant handling (creation, selection,
    execution, analysis), Adequacy analysis (mutation
    score and reports)
  • Allows separate compilation: each mutant is
    individually created, compiled, linked, and run
  • This approach can be significantly faster (15-20
    times) than an interpretive system if mutant run
    times greatly exceed individual compilation/link
    times; otherwise a compilation bottleneck may result

39
Fundamental premise of Mutation Testing
  • In practice, if the software contains a fault,
    there will usually be a set of mutants that can
    only be killed by a test case that also detects
    the fault.

40
Future Testing Systems
  • Programmer submits a program unit
  • System replies with a set of input/output pairs
    that are guaranteed to form an effective test of
    the unit by being close to mutation adequate

41
Current Research at King's
  • Kostas Adamopoulos is a PhD student working on
    Mutation Testing
  • We are looking at search as a way of attacking
    the twin problems of number of mutants and
    equivalent mutants.
  • This part of the lecture is not examinable. Feel
    free to leave now (quietly) if you are not
    interested.
  • Make sure you come back for the tutorial though
  • Ok, for the two of you left, come to the front

42
Co-evolution?
  • GA for Mutants
  • Fitness is measured according to the ability to
    avoid being killed
  • If this ability is too high then penalize the
    fitness of this mutant because it is probably an
    equivalent one
  • GA for Test Cases
  • Fitness is measured according to the ability of
    killing mutants

Two competitive populations. Can this lead to
Co-evolution?
43
Co-evolution
  • Fitness of each individual of one population is
    re-evaluated with respect to the other population
  • Achieves selective mutation
  • Mutation operators not selected a priori
  • Individual mutants selected
  • Tailored to the specific program under test,
    based upon their fitness
  • Guarantees non equivalent mutants
  • Stubborn mutants might also be eliminated (?)
  • The robustness of the algorithm probably will
    rediscover eliminated stubborn mutants (?)

44
Work in Progress and Future Work
  • Mutation tool GAs for Co-evolution
  • Comparison of real results and simulation
  • Comparative analysis with other methodologies
    (selective mutation, mutant sampling)

45
High-order Mutants
  • This methodology could be used to check the
    validity of the Coupling Effect Hypothesis
  • Large program faults, particularly those of a
    semantic nature are coupled with smaller
    syntactic faults that can be detected with
    mutation testing
  • Will a test set that is effective for simple,
    first-order mutants be equally effective for
    more complex, high-order mutants?

46
Mutation testing Tutorial
  • 1. What is an equivalent mutant and why are
    equivalent mutants a problem?
  • 2. For the program fragment x = x + y give five
    examples of mutants which are equivalent and five
    which are not equivalent
  • 3. Give examples of test cases which kill your
    five non-equivalent mutants
  • 4. Now make up some simple program fragments and
    try to think of some more stubborn mutants of
    these fragments. That is, mutants which are hard
    to kill but which are not equivalent.