Simplifying and Isolating Failure-Inducing Input

1
Simplifying and Isolating Failure-Inducing Input
  • Delta Debugging
  • Presented by Nir Peer
  • University of Maryland

2
Introduction
3
Overview
  • Given some test case, a program fails.
  • What is the minimal test case that still produces
    the failure?
  • Also, what is the difference between a passing
    and a failing test case?
  • or in other words

4
How do we go from this
<td align=left valign=top><SELECT NAME="op_sys" MULTIPLE SIZE=7>
<OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1
<OPTION VALUE="Windows 95">Windows 95<OPTION VALUE="Windows 98">Windows 98
<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows 2000
<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7
<OPTION VALUE="Mac System 7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1
<OPTION VALUE="Mac System 8.0">Mac System 8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5
<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac System 9.x">Mac System 9.x
<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux
<OPTION VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD
<OPTION VALUE="NetBSD">NetBSD<OPTION VALUE="OpenBSD">OpenBSD
<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS
<OPTION VALUE="HP-UX">HP-UX<OPTION VALUE="IRIX">IRIX
<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS
<OPTION VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1
<OPTION VALUE="Solaris">Solaris<OPTION VALUE="SunOS">SunOS
<OPTION VALUE="other">other</SELECT></td>
<td align=left valign=top><SELECT NAME="priority" MULTIPLE SIZE=7>
<OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2
<OPTION VALUE="P3">P3<OPTION VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT></td>
<td align=left valign=top><SELECT NAME="bug_severity" MULTIPLE SIZE=7>
<OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical
<OPTION VALUE="major">major<OPTION VALUE="normal">normal
<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial
<OPTION VALUE="enhancement">enhancement</SELECT></td></tr></table>
File
Print
Segmentation Fault
5
into this
<SELECT>
File
Print
Segmentation Fault
6
Motivation
  • The Mozilla open-source web browser project
    receives several dozen bug reports a day.
  • Each bug report has to be simplified
  • Eliminate all details irrelevant to producing the
    failure
  • To facilitate debugging
  • To make sure it does not duplicate a similar bug
    report
  • In July 1999, Bugzilla listed more than 370 open
    bug reports for Mozilla.
  • These were not even simplified
  • Mozilla engineers were overwhelmed with work
  • They created the Mozilla BugAThon, a call for
    volunteers to process bug reports

7
Motivation
  • Simplifying meant turning bug reports into
    minimal test cases
  • where every part of the input would be
    significant in reproducing the failure
  • What we want is the simplest HTML page that still
    produces the fault.
  • Decomposing specific bug reports into simple test
    cases is of general interest
  • Let's automate this task!

8
Simplification of test cases
  • The minimizing delta debugging algorithm ddmin
  • Takes a failing test case
  • Simplifies it by successive testing
  • Stops when a minimal test case is reached
  • where removing any single input entity will cause
    the failure to disappear

9
How to minimize a test case?
  • Test subsets with removed characters (shown in
    grey)
  • A given test case
  • Fails (✘) if Mozilla crashes on it
  • Passes (✔) otherwise

10
How to minimize a test case?
Original failing input
Try removing half. Now everything passes: we've
lost the error-inducing input!
Try removing a quarter: OK, found something!
11
How to minimize a test case?
Try removing a quarter instead
OK, we've got something! So keep it, and continue
Good, carry on
Lost it! Try removing an eighth instead
12
How to minimize a test case?
Removing an eighth
Good, keep it!
Lost it! Try removing a sixteenth instead
Great! We're making progress
OK, now let's see if removing single characters
helps us reduce it even more
13
How to minimize a test case?
Removing a single character
Reached a minimal test case!
Therefore, this should be our test case
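The walkthrough on slides 10 to 13 can be approximated by a short loop. This is a simplified greedy chunk-removal sketch, not the ddmin algorithm itself; the function names `shrink` and `crashes`, and the sample page, are illustrative assumptions rather than anything taken from the talk.

```python
def shrink(text, fails):
    """Greedy chunk removal: keep a smaller input only while it still fails.

    Assumes fails(text) is true for the initial text.
    """
    chunk = max(len(text) // 2, 1)
    while True:
        i, progress = 0, False
        while i < len(text):
            candidate = text[:i] + text[i + chunk:]
            if fails(candidate):
                text, progress = candidate, True  # still fails: keep the smaller input
            else:
                i += chunk                        # lost the failure: keep this chunk
        if chunk > 1:
            chunk //= 2                           # retry with finer-grained removals
        elif not progress:
            return text                           # no single character can be removed


def crashes(page):
    # Hypothetical stand-in for "Mozilla crashes when printing this page".
    return "<SELECT>" in page


minimal = shrink("<td><SELECT><OPTION>All</SELECT></td>", crashes)
print(minimal)  # <SELECT>
```

The loop stops only after a full pass of single-character removals makes no progress, so the result is a test case from which no single character can be removed.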
14
Formalization
15
Testing for Change
  • The execution of a program is determined by a
    number of circumstances
  • The program code
  • Data from storage or input devices
  • The program's environment
  • The specific hardware
  • and so on
  • We're only interested in the changeable
    circumstances
  • Those whose change may cause a different program
    behavior

16
The Change that Causes a Failure
  • Denote the set of possible configurations of
    circumstances by $R$.
  • Each $r \in R$ determines a specific program run.
  • This $r$ could be
  • a failing run, denoted by $r_{✘}$
  • a passing run, denoted by $r_{✔}$
  • Given a specific $r_{✘}$
  • We focus on the difference between $r_{✘}$ and some
    $r_{✔} \in R$ that works
  • This difference is the change which causes the
    failure
  • The smaller this change, the better it qualifies
    as a failure cause

17
The Change that Causes a Failure
  • Formally, the difference between $r_{✔}$ and $r_{✘}$ is
    expressed as a mapping $\delta$ which changes the
    circumstances of a program run
  • The exact definition of $\delta$ is problem specific
  • In the Mozilla example, applying $\delta$ means to
    expand a trivial (empty) HTML input to the full
    failure-inducing HTML page.

Definition 1 (Change). A change $\delta$ is a mapping $\delta: R \to R$.
The set of changes is $C = R \to R$. The relevant change between
two runs $r_{✔}, r_{✘} \in R$ is a change $\delta \in C$ s.t.
$\delta(r_{✔}) = r_{✘}$.
18
Decomposing Changes
  • We assume that the relevant change $\delta$ can be
    decomposed into a number of elementary changes
    $\delta_1, \dots, \delta_n$.
  • In general, this can be an atomic decomposition
  • Changes that cannot be decomposed any further

Definition 2 (Composition of changes). The change composition
$\circ: C \times C \to C$ is defined as
$(\delta_i \circ \delta_j)(r) = \delta_i(\delta_j(r))$.
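A minimal sketch of Definitions 1 and 2, assuming a run is modeled as a plain string and each change is a Python function on strings. The names `compose` and `append`, and the three-character sample, are illustrative assumptions and do not come from the talk.

```python
from functools import reduce
from typing import Callable

Run = str                       # assumption: a run is identified by its input text
Change = Callable[[Run], Run]   # Definition 1: a change maps runs to runs


def compose(*changes: Change) -> Change:
    """Definition 2: (d_i o d_j)(r) = d_i(d_j(r)); the rightmost change is applied first."""
    return lambda r: reduce(lambda acc, d: d(acc), reversed(changes), r)


def append(ch: str) -> Change:
    """Hypothetical elementary change: append one character to the input."""
    return lambda r: r + ch


r_pass = ""                                   # the passing run: empty input
d1, d2, d3 = append("a"), append("b"), append("c")
r_fail = compose(d3, d2, d1)(r_pass)          # d3(d2(d1(r_pass)))
print(r_fail)  # abc
```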
19
Test Cases and Tests
  • According to the POSIX 1003.3 standard for
    testing frameworks, we distinguish three test
    outcomes
  • The test succeeds (PASS, written here as ✔)
  • The test has produced the failure it was intended
    to capture (FAIL, written here as ✘)
  • The test produced indeterminate results
    (UNRESOLVED, written as ?)

Definition 3 (rtest). The function $\mathit{rtest}: R \to \{✘, ✔, ?\}$
determines for a program run $r \in R$ whether some specific failure
occurs (✘) or not (✔) or whether the test is unresolved (?).
Axiom 4 (Passing and failing run). $\mathit{rtest}(r_{✔}) = ✔$
and $\mathit{rtest}(r_{✘}) = ✘$ hold.
20
Test Cases and Tests
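  • We identify each run by the set of changes being
    applied to $r_{✔}$
  • We define $c_{✔}$ as the empty set $\emptyset$, which identifies
    $r_{✔}$ (no changes applied)
  • The set of all changes $c_{✘} = \{\delta_1, \delta_2, \dots, \delta_n\}$
    identifies $r_{✘} = (\delta_1 \circ \delta_2 \circ \dots \circ \delta_n)(r_{✔})$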

Definition 5 (Test case). A subset $c \subseteq c_{✘}$ is
called a test case.
21
Test Cases and Tests
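Definition 6 (test). The function $\mathit{test}: 2^{c_{✘}} \to \{✘, ✔, ?\}$
is defined as follows: Let $c \subseteq c_{✘}$ be a test case with
$c = \{\delta_1, \delta_2, \dots, \delta_n\}$. Then
$\mathit{test}(c) = \mathit{rtest}((\delta_1 \circ \delta_2 \circ \dots \circ \delta_n)(r_{✔}))$ holds.
Corollary 7 (Passing and failing test cases). The following holds:
$\mathit{test}(c_{✔}) = \mathit{test}(\emptyset) = ✔$ (passing test case)
$\mathit{test}(c_{✘}) = \mathit{test}(\{\delta_1, \delta_2, \dots, \delta_n\}) = ✘$ (failing test case)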
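Definitions 3 to 6 and Corollary 7 can be sketched in a few lines. The sketch assumes that change i inserts the i-th character of a failing input and that the program under test fails whenever its input contains `<SELECT>`; the names `Outcome`, `rtest`, `apply_changes`, and `evaluate` (standing in for the function called test in Definition 6) are illustrative.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "✔"
    FAIL = "✘"
    UNRESOLVED = "?"


FAILING_INPUT = "<b><SELECT></b>"   # hypothetical stand-in for the failing HTML page


def rtest(run: str) -> Outcome:
    """Toy rtest: the 'program' fails whenever its input contains <SELECT>."""
    return Outcome.FAIL if "<SELECT>" in run else Outcome.PASS


def apply_changes(c: frozenset) -> str:
    """Apply the changes in c to the empty passing run; change i inserts character i."""
    return "".join(FAILING_INPUT[i] for i in sorted(c))


def evaluate(c: frozenset) -> Outcome:
    """Definition 6: the outcome of the run obtained by applying all changes in c."""
    return rtest(apply_changes(c))


c_pass = frozenset()                              # no changes applied
c_fail = frozenset(range(len(FAILING_INPUT)))     # all changes applied
print(evaluate(c_pass).value, evaluate(c_fail).value)  # ✔ ✘
```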
22
Minimizing Test Cases
23
Minimal Test Cases
  • If a test case $c \subseteq c_{✘}$ is a minimum, no other
    smaller subset of $c_{✘}$ causes a failure
  • But we don't want to have to test all $2^{|c_{✘}|}$ subsets of $c_{✘}$
  • So we'll settle for a local minimum
  • A test case is minimal if none of its subsets
    causes a failure

Definition 8 (Global minimum). A set $c \subseteq c_{✘}$ is
called the global minimum of $c_{✘}$ if
$\forall c' \subseteq c_{✘} \cdot (|c'| < |c| \Rightarrow \mathit{test}(c') \neq ✘)$ holds.
Definition 9 (Local minimum). A test case $c \subseteq c_{✘}$
is a local minimum of $c_{✘}$ or minimal if
$\forall c' \subset c \cdot (\mathit{test}(c') \neq ✘)$ holds.
24
Minimal Test Cases
  • Thus, if a test case c is minimal
  • It is not necessarily the smallest test case
    (there may be a different global minimum)
  • But each element of c is relevant in producing
    the failure
  • Nothing can be removed without making the failure
    disappear
  • However, determining that $c$ is minimal still
    requires $2^{|c|}$ tests
  • We can use an approximation instead
  • It is possible that removing several changes at
    once might make a test case smaller
  • But we'll only check if this is so when we remove
    up to n changes

25
Minimal Test Cases
  • We define n-minimality: removing any combination
    of up to n changes causes the failure to
    disappear
  • We're actually most interested in 1-minimal test
    cases
  • When removing any single change causes the
    failure to disappear
  • Removing two or more changes at once may result
    in an even smaller, still failing test case
  • But every single change on its own is significant
    in reproducing the failure

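Definition 10 (n-minimal test case). A test case
$c \subseteq c_{✘}$ is n-minimal if
$\forall c' \subset c \cdot (|c| - |c'| \leq n \Rightarrow \mathit{test}(c') \neq ✘)$ holds.
Consequently, $c$ is 1-minimal if
$\forall \delta_i \in c \cdot (\mathit{test}(c - \{\delta_i\}) \neq ✘)$ holds.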
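Definition 10 can be checked by brute force for small test cases. The sketch below assumes changes are distinct hashable values and that `fails(c)` reports whether a test case yields ✘; the toy `fails` is made up to show a test case that is 1-minimal but not 2-minimal.

```python
from itertools import combinations


def is_n_minimal(c, fails, n=1):
    """Definition 10: removing any 1..n changes from c never leaves a failing test case."""
    c = list(c)
    for k in range(1, min(n, len(c)) + 1):
        for removed in combinations(c, k):
            rest = [d for d in c if d not in removed]
            if fails(rest):
                return False
    return True


def fails(c):
    # Hypothetical failure condition: only these two change sets fail.
    return set(c) in ({1, 2}, {1, 2, 3, 4})


c = [1, 2, 3, 4]
print(is_n_minimal(c, fails, n=1))  # True: no single change can be removed
print(is_n_minimal(c, fails, n=2))  # False: removing changes 3 and 4 together still fails
```

The number of checks grows with the number of combinations, which is why the talk concentrates on 1-minimality.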
26
The Delta Debugging Algorithm
  • We partition a given test case $c_{✘}$ into subsets
  • Suppose we have $n$ subsets $\Delta_1, \dots, \Delta_n$
  • We test
  • each $\Delta_i$ and
  • its complement $\nabla_i = c_{✘} - \Delta_i$

27
The Delta Debugging Algorithm
  • Testing each $\Delta_i$ and its complement, we have four
    possible outcomes
  • Reduce to subset
  • If testing any $\Delta_i$ fails, it will be a smaller
    test case
  • Continue reducing $\Delta_i$ with $n = 2$ subsets
  • Reduce to complement
  • If testing any $\nabla_i = c_{✘} - \Delta_i$ fails, it will be a
    smaller test case
  • Continue reducing $\nabla_i$ with $n - 1$ subsets
  • Why $n - 1$ subsets and not $n = 2$ subsets?
    (Maintain granularity!)
  • Double the granularity
  • Done

28
The Delta Debugging Algorithm
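The four outcomes listed on the previous slide can be sketched as a loop. This is a hedged reading of the minimizing algorithm, assuming changes are distinct values, outcomes are the strings `"PASS"`, `"FAIL"`, and `"UNRESOLVED"`, and the granularity never exceeds the size of the test case; the helper `split` and the toy test are illustrative names. The toy test mirrors the example on the following slides, where the failure needs changes 1, 7, and 8.

```python
PASS, FAIL, UNRESOLVED = "PASS", "FAIL", "UNRESOLVED"


def split(c, n):
    """Partition list c into n contiguous subsets of near-equal size."""
    subsets, start = [], 0
    for i in range(n):
        end = start + (len(c) - start) // (n - i)
        subsets.append(c[start:end])
        start = end
    return subsets


def ddmin(c_fail, test):
    """Reduce a failing test case to a 1-minimal one (sketch)."""
    c, n = list(c_fail), 2
    while len(c) >= 2:
        subsets = split(c, n)
        reduced = False
        for s in subsets:                              # reduce to subset
            if test(s) == FAIL:
                c, n, reduced = s, 2, True
                break
        if not reduced:
            for s in subsets:                          # reduce to complement
                complement = [d for d in c if d not in s]
                if test(complement) == FAIL:
                    c, n, reduced = complement, max(n - 1, 2), True
                    break
        if not reduced:
            if n >= len(c):                            # done: already at single changes
                break
            n = min(len(c), 2 * n)                     # double the granularity
    return c


def toy_test(c):
    hit = {1, 7, 8} & set(c)
    if hit == {1, 7, 8}:
        return FAIL
    return UNRESOLVED if hit else PASS


print(ddmin(range(1, 9), toy_test))  # [1, 7, 8]
```

This sketch re-runs tests it has already seen (for example, with two subsets each subset is the other's complement); a cache of outcomes avoids that.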
29
Example
  • Consider the following minimal test case which
    consists of the changes $\delta_1$, $\delta_7$, and $\delta_8$
  • Any test case that includes only a subset of
    these changes results in an unresolved test
    outcome
  • A test case that includes none of these changes
    passes the test
  • We first partition the set of changes in two
    halves
  • neither of them passes the test

30
Example
  • We continue with granularity increased to four
    subsets
  • When testing the complements, the set $\nabla_2$ fails,
    thus removing changes $\delta_3$ and $\delta_4$
  • We continue with splitting $\nabla_2$ into three subsets

31
Example
  • Steps 9 to 11 have already been carried out and
    need not be repeated (marked in the original figure)
  • When testing $\nabla_2$, changes $\delta_5$ and $\delta_6$ can be
    eliminated
  • We reduce to $\nabla_2$ and continue with two subsets
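Skipping tests that have already been carried out amounts to caching outcomes by change set. A minimal sketch, assuming test cases are collections of hashable changes; `with_cache` and the toy test are hypothetical names, not part of the talk.

```python
def with_cache(test):
    """Wrap a test function so that each distinct change set is tested only once."""
    outcomes = {}

    def cached(c):
        key = frozenset(c)               # the order of changes does not matter
        if key not in outcomes:
            outcomes[key] = test(c)
        return outcomes[key]

    cached.outcomes = outcomes           # exposed so the distinct tests can be counted
    return cached


calls = []


def toy_test(c):
    calls.append(sorted(c))
    return "FAIL" if {1, 7, 8} <= set(c) else "UNRESOLVED"


cached_test = with_cache(toy_test)
cached_test([1, 2, 7, 8])
cached_test([8, 7, 2, 1])                # same change set: served from the cache
print(len(calls), len(cached_test.outcomes))  # 1 1
```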

32
Example
  • We increase granularity to four subsets and test
    each
  • Testing the complements shows that we can
    eliminate $\delta_2$

33
Example
  • The next steps show that none of the remaining
    changes $\delta_1$, $\delta_7$, and $\delta_8$ can be eliminated
  • To minimize this test case, a total of 19
    different tests were required

34
Case Studies
35
The GNU C Compiler
#define SIZE 20
double mult(double z[], int n) {
    int i, j;
    i = 0;
    for (j = 0; j < n; j++) {
        i = i + j + 1;
        z[i] = z[i] * (z[0] + 1.0);
    }
    return z[n];
}
void copy(double to[], double from[], int count) {
    int n = (count + 7) / 8;
    switch (count % 8) do {
        case 0: *to++ = *from++;  case 7: *to++ = *from++;
        case 6: *to++ = *from++;  case 5: *to++ = *from++;
        case 4: *to++ = *from++;  case 3: *to++ = *from++;
        case 2: *to++ = *from++;  case 1: *to++ = *from++;
    } while (--n > 0);
    return mult(to, 2);
}
int main(int argc, char *argv[]) {
    double x[SIZE], y[SIZE];
    double *px = x;
    while (px < x + SIZE)
        *px++ = (px - x) * (SIZE + 1.0);
    return copy(y, x, SIZE);
}
  • This program (bug.c) causes GCC 2.95.2 to crash
    when optimization is enabled
  • We would like to minimize this program in order
    to file a bug report
  • In the case of GCC, a passing program run is the
    empty input
  • For the sake of simplicity, we model change as
    the insertion of a single character
  • $r_{✔}$ is running GCC with an empty input
  • $r_{✘}$ means running GCC with bug.c
  • each change $\delta_i$ inserts the $i$th character of bug.c

36
The GNU C Compiler
  • The test procedure would
  • create the appropriate subset of bug.c
  • feed it to GCC
  • return ✘ iff GCC had crashed, and ✔ otherwise
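The test procedure above might look like the following sketch. It assumes changes are character indices into bug.c, and that a compiler crash can be recognized by the text "internal compiler error" on stderr; that detection rule, the function names, and the pluggable `crashes` parameter are assumptions made for illustration. The `gcc` flags used (`-O`, `-x c`, `-c`, `-o`, and `-` for standard input) are standard driver options.

```python
import os
import subprocess


def subset_source(source, changes):
    """Create the subset of the source selected by the given character indices."""
    return "".join(source[i] for i in sorted(changes))


def run_gcc(program):
    """Feed the program to GCC with optimization on; report whether it seems to crash.

    Assumption: a crash shows up as 'internal compiler error' on stderr.
    """
    result = subprocess.run(
        ["gcc", "-O", "-x", "c", "-c", "-", "-o", os.devnull],
        input=program, capture_output=True, text=True,
    )
    return "internal compiler error" in result.stderr.lower()


def gcc_test(changes, source, crashes=run_gcc):
    """Return 'FAIL' iff the compiler crashed on the subset, 'PASS' otherwise."""
    return "FAIL" if crashes(subset_source(source, changes)) else "PASS"
```

Passing a stub for `crashes` makes it possible to try the procedure without a compiler installed, for example `gcc_test({0, 1, 2}, "int x;", crashes=lambda p: p == "int")`.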

[Figure: test-case sizes during minimization, in characters: 755, 377, 188, 77]
37
The GNU C Compiler
  • The minimized code is
  • The test case is 1-minimal
  • No single character can be removed without
    removing the failure
  • Even all superfluous whitespace has been
    removed
  • The function name has shrunk from mult to a
    single t
  • This program actually has a semantic error
    (infinite loop), but GCC still isn't supposed to
    crash
  • So where could the bug be?
  • We already know it is related to optimization
  • If we remove the -O option to turn off
    optimization, the failure disappears

t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}
38
The GNU C Compiler
  • The GCC documentation lists 31 options to control
    optimization on Linux
  • It turns out that applying all of these options
    causes the failure to disappear
  • Some option(s) prevent the failure

-ffloat-store -fno-default-inline -fno-defer-pop
-fforce-mem -fforce-addr -fomit-frame-pointer
-fno-inline -finline-functions -fkeep-inline-functions
-fkeep-static-consts -fno-function-cse -ffast-math
-fstrength-reduce -fthread-jumps -fcse-follow-jumps
-fcse-skip-blocks -frerun-cse-after-loop -frerun-loop-opt
-fgcse -fexpensive-optimizations -fschedule-insns
-fschedule-insns2 -ffunction-sections -fdata-sections
-fcaller-saves -funroll-loops -funroll-all-loops
-fmove-all-movables -freduce-all-givs -fno-peephole
-fstrict-aliasing
39
The GNU C Compiler
  • We can use test case minimization in order to
    find the preventing option(s)
  • Each $\delta_i$ stands for removing a GCC option
  • Having all $\delta_i$ applied means to run GCC with no
    option (failing)
  • Having no $\delta_i$ applied means to run GCC with all
    options (passing)
  • After seven tests, the single option -ffast-math
    is found which prevents the failure
  • Unfortunately, it is a bad candidate for a
    workaround because it may alter the semantics of
    the program
  • Thus, we remove -ffast-math from the list of
    options and make another run
  • Again after seven tests, it turns out that
    -fforce-addr also prevents the failure
  • Further examination shows that no other option
    prevents the failure

40
The GNU C Compiler
  • So, this is what we can send to the GCC
    maintainers
  • The minimal test case
  • The failure only occurs with optimization
  • -ffast-math and -fforce-addr prevent the failure

41
Minimizing Fuzz
  • In a classical experiment by Miller et al.,
    several UNIX utilities were fed with fuzz input:
    a large number of random characters
  • The studies showed that in the worst case 40% of
    the basic programs crashed or went into infinite
    loops
  • We would like to use the ddmin algorithm to
    minimize the fuzz input sequences
  • We examine the following six UNIX utilities
  • NROFF (format documents for display)
  • TROFF (format documents for typesetter)
  • FLEX (fast lexical analyzer generator)
  • CRTPLOT (graphics filter for various plotters)
  • UL (underlining filter)
  • UNITS (convert quantities)

42
Minimizing Fuzz
  • The following table summarizes the
    characteristics of the different fuzz inputs used

✘S = segmentation fault, ✘A = arithmetic exception
43
Minimizing Fuzz
  • Out of the 6 × 16 = 96 test runs, the utilities
    crashed 42 times (43%)
  • We apply our algorithm to minimize the
    failure-inducing fuzz input
  • Our test function returns ✘ if the input made the
    program crash
  • or ✔ otherwise
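A fuzz test function of this kind could be sketched as follows. The sketch assumes a POSIX system, where Python's `subprocess` reports a child killed by signal N as return code -N, so a segmentation fault or arithmetic exception shows up as a negative return code. The names `fuzz`, `outcome`, and `fuzz_test`, the seed handling, and the timeout that maps a hang to an unresolved outcome are illustrative assumptions.

```python
import random
import subprocess


def fuzz(length, seed, printable_only=False):
    """Generate `length` random bytes, reproducibly for a given seed."""
    rng = random.Random(seed)
    low, high = (32, 126) if printable_only else (0, 255)
    return bytes(rng.randint(low, high) for _ in range(length))


def outcome(returncode):
    """Map a process return code to a test outcome: negative means killed by a signal."""
    return "FAIL" if returncode < 0 else "PASS"


def fuzz_test(command, data, timeout=10):
    """Feed data to the utility on standard input and classify the run."""
    try:
        result = subprocess.run(command, input=data, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return "UNRESOLVED"              # assumption: treat a hang as unresolved
    return outcome(result.returncode)
```

A call such as `fuzz_test(["units"], fuzz(1000, seed=1))` would then classify one run, provided the utility is installed.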