Title: Test Case Filtering and Prioritization Based on Coverage of Combinations of Program Elements
1. Test Case Filtering and Prioritization Based on Coverage of Combinations of Program Elements
- Wes Masri and Marwa El-Ghali
- American Univ. of Beirut, ECE Department
- Beirut, Lebanon
- wm13_at_aub.edu.lb
2. Test Case Filtering
- Test case filtering is concerned with selecting from a test suite T a subset T' that is capable of revealing most of the defects revealed by T
- Approach: select T' to cover all elements covered by T
3. Test Case Filtering: What to Cover?
- Existing techniques cover singular program elements of varying granularity
  - methods, statements, branches, def-use pairs, slice pairs, and information flow pairs
- Previous studies have shown that increasing the granularity leads to revealing more defects at the expense of larger subsets
4. Test Case Filtering
- This work explores covering suspicious combinations of simple program elements
- The number of possible combinations is exponential w.r.t. the number of singular elements → use an approximation algorithm
- We use a genetic algorithm
5. Test Case Filtering: Conjectures
- Combinations of program elements are more likely to characterize complex failures
- The percentage of failing tests is typically much smaller than that of the passing tests
- Each defect causes a small number of tests to fail
- Given groups of (structurally) similar tests, smaller ones are more likely to be failure-inducing than larger ones
6. Test Case Filtering: Steps
- Given a test suite T, generate execution profiles of simple program elements (statements, branches, and def-use pairs)
- Choose a threshold Mfail for the maximum number of tests that could fail due to a single defect
- Use the genetic algorithm to generate C, a set of combinations of simple program elements that were covered by fewer than Mfail tests → suspicious combinations
- Use a greedy algorithm to extract T', the smallest subset of T that covers all the combinations in C
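The greedy extraction in the last step is a classic greedy set-cover heuristic: repeatedly pick the test covering the most still-uncovered suspicious combinations. The sketch below is an illustration under that reading; the test names, combination names, and coverage mapping are hypothetical, not the authors' implementation.

```python
# Greedy set-cover sketch of step 4: extract a small subset T' of tests
# that covers every suspicious combination in C. Hypothetical data layout:
# `coverage` maps a test id to the set of suspicious combinations it covers.

def greedy_reduce(coverage):
    """Return a small list of tests whose covered combinations
    together equal the union of all covered combinations."""
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        # Pick the test covering the most still-uncovered combinations.
        best = max(coverage, key=lambda t: len(coverage[t] & uncovered))
        if not coverage[best] & uncovered:
            break  # nothing left that any test can cover
        selected.append(best)
        uncovered -= coverage[best]
    return selected

coverage = {
    "t1": {"c1", "c2"},
    "t2": {"c2", "c3", "c4"},
    "t3": {"c4"},
}
print(greedy_reduce(coverage))  # ['t2', 't1'] — t3 is redundant
```

The exact minimum set cover is NP-hard, which is why a greedy approximation is the standard choice here.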
7. Genetic Algorithm
- A genetic algorithm solves a problem by
  - Operating on an initial population of candidate solutions, or chromosomes
  - Evaluating their quality using a fitness function
  - Using transformations to create new generations with improved quality
  - Ultimately evolving to a single solution
8. Fitness Function
- We use the following equation:
  - fitness(combination) = 1 − %tests
- where %tests is the percentage of test cases that exercised the combination
- The smaller the percentage, the higher the fitness
- The aim is to end up with a manageable set of combinations in which each combination occurred in at most Mfail tests
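The fitness equation above can be sketched directly; the profile representation (one set of elements per test, with a combination "exercised" when all its elements appear in a test's profile) is an assumption for illustration.

```python
# Sketch of fitness(combination) = 1 - %tests, where %tests is the
# fraction of test cases whose execution profile exercised the combination.

def fitness(combination, profiles):
    """combination: a set of program elements.
    profiles: one set of exercised elements per test case."""
    exercising = sum(1 for p in profiles if combination <= p)  # subset test
    return 1.0 - exercising / len(profiles)

profiles = [{"s1", "s2"}, {"s1"}, {"s2", "s3"}, {"s1", "s2", "s3"}]
print(fitness({"s1", "s2"}, profiles))  # 0.5: exercised by 2 of 4 tests
```

A combination exercised by few tests scores near 1, matching the conjecture that small, structurally distinct test groups are the likely failure-inducing ones.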
9. Initial Population Generation
- Generated from the union of all execution profiles
- Size 50 in our implementation
- A 0 bit stays 0 always; a 1 bit stays 1 with small probability P
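One plausible reading of this slide: a chromosome is a bit vector over the union of all profiled elements, and each element present in the union is kept in a chromosome only with small probability P (elements absent from every profile stay 0). The sketch below follows that reading; the parameter names and default P are assumptions.

```python
import random

# Hypothetical sketch of initial population generation: each chromosome
# is a sparse bit vector over the union of all execution profiles.
# A bit that is 0 in the union stays 0; a union bit of 1 is kept as 1
# only with small probability p_keep (the slides' P).

def initial_population(profiles, size=50, p_keep=0.05):
    """profiles: one set of exercised elements per test case."""
    union = sorted(set().union(*profiles))  # every element seen anywhere
    population = [
        [1 if random.random() < p_keep else 0 for _ in union]
        for _ in range(size)
    ]
    return union, population

union, pop = initial_population([{"a", "b"}, {"b", "c"}], size=10, p_keep=0.5)
print(union)  # ['a', 'b', 'c']
```

Keeping chromosomes sparse matters here: a combination with many elements is exercised by almost no test, so its fitness saturates without being informative.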
10. Transformation Operator
- Combines two parent chromosomes to produce a child
- Passes down properties from each, favoring the parent with the higher fitness
- Goal: the child should have a better fitness than its parents
- Replace the parent with the worse fitness with the child
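A biased uniform crossover is one simple way to realize this operator: each bit is taken from the fitter parent with some probability greater than one half. The `bias` value below is an assumption, not a figure from the slides.

```python
import random

# Sketch of the transformation operator: combine two parent bit vectors,
# taking each bit from the fitter parent with probability `bias`.
# The bias value 0.7 is an assumed illustration.

def transform(parent_a, fit_a, parent_b, fit_b, bias=0.7):
    if fit_b > fit_a:
        # Ensure parent_a refers to the fitter parent.
        parent_a, parent_b = parent_b, parent_a
    return [a if random.random() < bias else b
            for a, b in zip(parent_a, parent_b)]

child = transform([1, 0, 1, 0], 0.9, [0, 1, 0, 1], 0.2)
print(child)
```

Per the slide, the caller then evaluates the child's fitness and replaces the worse parent with the child, so population size stays constant while average fitness tends to rise.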
11. Solution Set
- The obtained solution set contains all the encountered combinations with high-enough fitness values → suspicious combinations
12. Experimental Work
- Our subject programs included
  - The JTidy HTML syntax checker and pretty printer: 1000 tests, 8 defects, 47 failures
  - The NanoXML XML parser: 140 tests, 4 defects, 20 failures
13. Experimental Work
- We profiled the following program elements
  - basic blocks or statements (BB)
  - basic-block edges or branches (BBE)
  - def-use pairs (DUP)
- Next we applied the genetic algorithm to generate the following
  - a pool of BBcomb
  - a pool of BBEcomb
  - a pool of DUPcomb
  - a pool of ALLcomb (combinations of BBs, BBEs, and DUPs)
- The values of Mfail we chose for JTidy and NanoXML were 100 and 20, respectively
14. JTidy Results

Profile Type   % Tests Selected   % Defects Revealed
BB                   5.3                55.0
BBcomb               9.6                65.6
BBE                  6.5                78.7
BBEcomb             10.2                87.5
DUP                 11.7                81.2
DUPcomb             14.1                87.5
ALL                 12.4                94.8
ALLcomb             14.1               100.0
SliceP              26.7               100.0

- In the case of ALLcomb, 14.1% of the original test suite was needed to exercise all of the combinations exercised by the original test suite, and these tests revealed all the defects revealed by the original test suite
- In previous work we showed that coverage of slice pairs (SliceP) performed better than coverage of BB, BBE, and DUP; this is why we include the results of SliceP here for comparison
15. Comparison with Random Sampling
- The figure (not reproduced here) compares the various techniques to random sampling
- All variations performed better than random sampling
- BBcomb revealed 10.6% more defects than BB but selected 4.2% more tests
- BBEcomb revealed 8.8% more defects than BBE but selected 3.7% more tests
- DUPcomb revealed 6.3% more defects than DUP but selected 2.4% more tests
- ALLcomb performed better than SliceP, since it revealed all defects, as SliceP did, but selected 12.6% fewer tests
16. Experimental Work
- Concerning BBcomb, BBEcomb, and DUPcomb, the additional cost due to the selection of more tests might not be well justified, since the rate of improvement is no better than it is for random sampling
- Concerning ALLcomb, not only did it perform better than SliceP, but it is considerably less costly
  - It took 90 seconds on average per test to generate its profiles (i.e., BBs, BBEs, and DUPs), whereas it took 1200 seconds per test to generate the SliceP profiles (1 day vs. 2 weeks)
17. NanoXML Observations
- BB, BBE, DUP, and ALL did not perform any better than random sampling, whereas BBcomb, BBEcomb, DUPcomb, and ALLcomb performed noticeably better
- BBcomb, BBEcomb, DUPcomb, and ALLcomb revealed all the defects, but at relatively high cost, since over 50% of the tests needed to be executed
- The cost of running the genetic algorithm and the greedy selection algorithm has to be factored in when comparing our techniques to others
18. Test Case Prioritization
- Test case prioritization aims at scheduling the tests in T so that the defects are revealed as early as possible
- Summary of our technique
  - Prioritize combinations in terms of their suspiciousness
  - Then assign the priority of a given combination to the tests that cover it
19. Test Case Prioritization: Steps
- Identify combinations that were exercised by 1 test; assign that test priority 1, and add it to T'
- Identify combinations that were exercised by 2 tests; assign those tests priority 2, and add them to T'
- And so on, until all tests are prioritized, Mfail is exceeded, or all combinations were explored
- Use the greedy algorithm to reduce T'
- Any remaining tests that were not prioritized will be scheduled to run randomly following the prioritized tests
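The priority-assignment loop in the steps above can be sketched as follows; the mapping from combinations to exercising tests, and the test names, are hypothetical illustrations.

```python
# Sketch of the prioritization steps: tests covering a combination that
# was exercised by exactly k tests get priority k (lower runs earlier),
# for k = 1 up to the Mfail threshold. Tests left without a priority
# are scheduled randomly after the prioritized ones.

def prioritize(comb_to_tests, m_fail):
    """comb_to_tests: maps each suspicious combination to the set of
    tests that exercised it. Returns {test: priority}."""
    priority = {}
    for k in range(1, m_fail + 1):
        for tests in comb_to_tests.values():
            if len(tests) == k:
                for t in tests:
                    priority.setdefault(t, k)  # keep best (lowest) priority
    return priority

print(prioritize({"c1": {"t3"}, "c2": {"t1", "t2"}}, m_fail=2))
```

Here `t3` gets priority 1 because it alone exercises `c1`, matching the conjecture that combinations exercised by very few tests are the most suspicious.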
20. JTidy Prioritization Results

Element   % Tests   % Defects
BBcomb      6.75      56.25
BBEcomb     7.55      81.25
DUPcomb    12.6       87.5
ALLcomb    13.05     100.0

- JTidy prioritization results when step 3 is satisfied, i.e., when all tests are prioritized, Mfail is exceeded, or all combinations were explored
- Observation: using BBcomb, BBEcomb, and DUPcomb, not all defects were revealed. Combinations of BBs, BBEs, and DUPs (ALLcomb) are needed to reveal all defects.
21. NanoXML Prioritization Results

Element   % Tests   % Defects
BBcomb     50.2      100.0
BBEcomb    50.8      100.0
DUPcomb    52.8      100.0
ALLcomb    53.5      100.0

- Observation: all defects were revealed using BBcomb, BBEcomb, DUPcomb, or ALLcomb, but at a high cost in selected tests.
22. Conclusion
- Our techniques performed better than similar coverage-based techniques that consider program elements of the same type and that do not take their combinations into account
- We will conduct a more thorough empirical study
- We will use the APFD (Average Percentage of Faults Detected) measure to evaluate prioritization