Delta: Heuristically Minimize - PowerPoint PPT Presentation

Slides: 27
Provided by: scottm5

1
Delta: Heuristically Minimize Interesting Files
delta.tigris.org
  • Daniel S. Wilkerson
  • work with Scott McPeak

2
This quarter million line file crashes my tool!
  • We had a quarter million line (preprocessed) C
    file that crashed our C front-end (Elsa)
  • How long would it take you to minimize that by
    hand?
  • Delta reduced it in a few hours to a page or two
    of code
  • While we did something else!

3
Delta Debugging Algorithm
  • Andreas Zeller's Delta Debugging Algorithm
  • For file minimization, reduces to this:
      • for each granularity g from 0 to log2 N
          • partition the file into 2^g parts
          • for each part
              • test if the file minus the part is still interesting
              • if so, permanently throw out that part
  • Result is one-minimal:
      • removing any one line will make the test fail
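
The loop above can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual delta script: the names `minimize` and `is_interesting` are hypothetical, the file is held in memory as a list of lines, chunk sizes are derived from the original length N, and the single-line pass is repeated until nothing more can be removed so that the result really is one-minimal.

```python
import math
from typing import Callable, List, Sequence


def minimize(lines: Sequence[str],
             is_interesting: Callable[[List[str]], bool]) -> List[str]:
    """Greedily remove chunks of halving size while the test still passes."""
    current = list(lines)
    n = len(current)
    if n == 0:
        return current
    max_g = math.ceil(math.log2(n)) if n > 1 else 0
    for g in range(max_g + 1):
        # 2^g parts of the original file means chunks of about N / 2^g lines.
        size = max(1, math.ceil(n / 2 ** g))
        changed = True
        while changed:
            changed = False
            i = 0
            while i < len(current):
                candidate = current[:i] + current[i + size:]
                if is_interesting(candidate):
                    current = candidate   # permanently throw out that part
                    changed = True
                else:
                    i += size             # keep the part, move to the next
            if size > 1:
                break  # assumption: only the single-line pass is repeated
    return current


# Hypothetical test: a file of lines a..h is interesting if it keeps b and e.
needed = {"b", "e"}
result = minimize(list("abcdefgh"), lambda f: needed <= set(f))
print(result)  # ['b', 'e']
```

In practice the test would write the candidate to disk and run an external script; the in-memory predicate here only keeps the sketch self-contained.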

4
Example: both blue needed
  • a
  • b (blue)
  • c
  • d
  • e (blue)
  • f
  • g
  • h

5
both blue needed, g = 0
  • a b c d e f g h: can't delete the box, since it
    contains both b and e

6
both blue needed, g = 1
  • a b c d: can't delete, contains b
  • e f g h: can't delete, contains e

7
both blue needed, g = 2
  • a b
  • c d: can delete
  • e f
  • g h: can delete

8
both blue needed, g = 3
  • a: can delete
  • b
  • e
  • f: can delete

9
both blue needed, final
  • b
  • e

10
You could do this manually...
  • and be much more clever
  • ...but delta is often faster
  • I find it surprising that, when minimizing a file
    exhibiting a certain behavior, brute force mostly
    wins over cleverness
  • "Computers are as dumb as hell but they go like
    60" -- Richard Feynman

11
Do a controlled experiment
  • An experiment does many things
      • the interesting bit
      • and the boilerplate just to make it go
  • A control is another experiment
      • that only does the boilerplate
  • Do both and subtract: finds the interesting bit
  • gcc -c F (control: F passes gcc)
  • oink F | grep 'error... (but not oink)
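
The control-plus-experiment test can be sketched as a small Python predicate. This is only an illustration: the function name `is_interesting`, its parameters, and the error marker are assumptions, and the demo uses stand-in commands rather than the real `gcc` and `oink` invocations, whose exact arguments the slide elides.

```python
import subprocess
import sys
from typing import Sequence


def is_interesting(control_cmd: Sequence[str],
                   experiment_cmd: Sequence[str],
                   error_marker: str) -> bool:
    """True when the control succeeds but the experiment reports the error."""
    control = subprocess.run(control_cmd, capture_output=True, text=True)
    if control.returncode != 0:
        # The boilerplate itself is broken, so this file tells us nothing.
        return False
    experiment = subprocess.run(experiment_cmd, capture_output=True, text=True)
    # Look for the marker the way `grep` would, in either output stream.
    return error_marker in experiment.stdout + experiment.stderr


# Stand-ins (assumptions): a command that succeeds plays the control,
# a command that prints an error message plays the tool under test.
control = [sys.executable, "-c", "pass"]
experiment = [sys.executable, "-c", "print('error: something went wrong')"]
print(is_interesting(control, experiment, "error"))
```

Requiring the control to pass keeps the minimizer from drifting toward files that fail for an uninteresting reason, such as no longer being valid C.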

12
topformflat: explaining hierarchical structure
  • To delta, a file is a sequence of lines
  • topformflat explains the nesting of C/C++
  • Simple flex filter that copies input to output
  • but doesn't print newlines nested deeper than a
    nesting-depth argument
  • Strategy: repeatedly minimize with increasing
    nesting depths
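
The filtering idea can be sketched in a few lines of Python. This is not the flex filter itself, only an approximation under stated assumptions: the name `flatten` is hypothetical, nesting is counted from `{` and `}` alone, and quotes and comments are ignored, so a brace inside a string literal would be miscounted.

```python
def flatten(source: str, level: int) -> str:
    """Copy source, replacing newlines nested deeper than `level` by spaces."""
    out = []
    depth = 0
    for ch in source:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "\n" and depth > level:
            out.append(" ")   # too deeply nested: keep the form on one line
        else:
            out.append(ch)
    return "".join(out)


example = "void f() {\n  if (x) {\n    y();\n  }\n}\nint z;\n"
for level in range(3):
    print(f"--- level {level} ---")
    print(flatten(example, level), end="")
```

At level 0 each top-level form ends up on one line, so a line-based minimizer removes whole functions or declarations; raising the level exposes statements inside them as separate lines.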

13
topformflat Example
void foo() for(...) x - 5 bar()
while(...) j void bar() z
17 foo() void baz() ...
14
topformflat Example, level 0
void foo() for(...) x - 5 bar() while(...) j
void bar() z 17 foo()
void baz() ...
15
topformflat Example, level 1
void foo() for(...) x - 5 bar()
while(...) j void bar() z 17
foo() void baz() ...
deleted
16
topformflat Example, level 2
void foo() for(...) x - 5 bar()
while(...) j void bar() z
17 foo() void baz() ...
17
Science: Most bugs exhibitable by small inputs
  • On any input size, the result is almost always
    small
      • for C input to a compiler, 1-2 pages of code.
  • Seems to be a phenomenon of computation
      • there actually is Science in Computer Science!
  • but not always
      • delta worked for a week and still had 50 files
      • a buffer had to fill up and then flush

18
The Configuration File Trick
  • Delta generalizes to many situations if you
      • parameterize the process with a file
      • minimize the file.
  • Simon Goldsmith was instrumenting Java system
    binaries
      • during class-loading the JVM would seg-fault;
        nothing really comprehensible would happen
      • wrote a script to read a config file for which
        instrumented classes to put into the jar file
      • use delta to minimize the config file
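
A rough Python sketch of the trick follows. Everything in it is hypothetical: the class names, the one-class-per-line config format, the `choose_classes` helper, and the simulated crash. The one-line-at-a-time loop in `shrink_config` is a simple stand-in for delta, not delta itself.

```python
from typing import Callable, Dict, List


def choose_classes(config_lines: List[str],
                   original: Dict[str, str],
                   instrumented: Dict[str, str]) -> Dict[str, str]:
    """Use the instrumented version only for classes named in the config."""
    selected = {line.strip() for line in config_lines if line.strip()}
    return {name: (instrumented[name] if name in selected else original[name])
            for name in original}


def shrink_config(config_lines: List[str],
                  crashes: Callable[[List[str]], bool]) -> List[str]:
    """Drop one config line at a time while the crash still reproduces."""
    current = list(config_lines)
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if crashes(candidate):
            current = candidate
        else:
            i += 1
    return current


# Hypothetical data: three classes, and a simulated crash that happens
# whenever the instrumented version of pkg.C ends up in the jar.
original = {"pkg.A": "A.class", "pkg.B": "B.class", "pkg.C": "C.class"}
instrumented = {name: path + ".instr" for name, path in original.items()}


def crashes(config: List[str]) -> bool:
    jar = choose_classes(config, original, instrumented)
    return jar["pkg.C"].endswith(".instr")


print(shrink_config(["pkg.A", "pkg.B", "pkg.C"], crashes))  # ['pkg.C']
```

The point of the indirection is that the thing being minimized is a plain text file, even though the failing input is a jar of binaries.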

19
Simulated Annealing
  • Simulated Annealing
  • Large, non-convex sub-space
  • Gradient of goodness
  • Random local moves
  • likely to find another point in the sub-space
  • Moves parameterizable by a temperature.
  • Some say the ability to sometimes get worse is
    essential
  • I say locality, randomness, and temperature

20
Delta as Simulated Annealing
  • space: files that pass your test
  • goodness: smaller file is better
  • local moves: chop out a chunk of file
      • note that we never get worse
      • so delta is greedy
  • temperature: chunk size
      • we have an exponential annealing schedule,
        which is not unusual, says Wikipedia anyway.
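
To make the "temperature = chunk size" reading concrete, the schedule can be computed directly. This is a small sketch, assuming chunk sizes of roughly N / 2^g for granularities g from 0 up to log2 N rounded up; the function name `chunk_schedule` is hypothetical.

```python
import math
from typing import List


def chunk_schedule(n_lines: int) -> List[int]:
    """Chunk size (the 'temperature') at each granularity g."""
    if n_lines <= 0:
        return []
    max_g = math.ceil(math.log2(n_lines)) if n_lines > 1 else 0
    return [max(1, math.ceil(n_lines / 2 ** g)) for g in range(max_g + 1)]


print(chunk_schedule(8))             # [8, 4, 2, 1]
print(len(chunk_schedule(250_000)))  # 19
```

The temperature halves at every step, so even a quarter million line file is down to single-line moves after 19 granularities.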

21
Delta: surprisingly effective
  • Especially given how ignorant and general it is
  • Most ideas for improvements are how to make the
    local moves better at staying in the space
  • These ideas generally require knowing what the
    file means.
  • Important point: But note how well delta already
    does knowing nothing!
  • and topformflat only knows nesting and quotes!

22
Improvement: use knowledge of dependencies to
improve moves
If you know the language semantics, reject moves
that would violate it, or only make moves that
would produce a legal file
decl
use
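
One way to picture a dependency-aware move filter is the sketch below. It is an assumption-laden illustration, not something the slides specify: lines are identified by index, `decl_of` is a hypothetical map from each use to the line that declares what it uses, and `move_is_legal` is an invented name.

```python
from typing import Dict, Iterable, Set


def move_is_legal(kept: Iterable[int], decl_of: Dict[int, int]) -> bool:
    """Reject a candidate that keeps a use but drops its declaration."""
    kept_set: Set[int] = set(kept)
    return all(decl in kept_set
               for use, decl in decl_of.items()
               if use in kept_set)


# Hypothetical three-line file:
#   0: int x;      (decl)
#   1: int y;
#   2: x = 1;      (use of line 0)
decl_of = {2: 0}
print(move_is_legal([1, 2], decl_of))  # False: the use survives, its decl does not
print(move_is_legal([0, 2], decl_of))  # True
print(move_is_legal([1], decl_of))     # True: decl and use removed together
```

Checking legality first is cheap, so a move that cannot possibly produce a legal file never costs a run of the real test.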
23
Fan Mail
  • From Flash Sheridan
  • This is just a quick thank-you note for Delta.
    ... it immediately reduced a ... bug file from
    16K lines to ten (GCC bug 22604).
  • Oddly enough, it initially found a different bug
    (22603), since I'd only specified "internal
    compiler error", not "segmentation fault".

24
Fan Mail, p.2
  • From Flash Sheridan
  • Delta has become even more valuable since my
    initial thank-you note.
  • I'm not sure it's helped with all of the GCC bugs
    I've been filing... but I couldn't have filed
    most of them without Delta.
  • Delta has always been able to find a radically
    smaller file, which I have been able to attach to
    my bug report.

25
Fan Mail, p.3
  • From Richard Guenther
  • delta is saving a lot of gcc developers life :) I
    would guess 1 of 3 bugs submitted to the gcc
    bugzilla get their testcase reduced using delta.
  • ... a little bit more accurate would be to say
    we're using delta to reduce all testcases from
    the gcc bugzilla in case they get entered
    unreduced.

26
Delta: This simple dumb script is everywhere!
  • One class devoted to it in both Berkeley and
    Stanford Software Engineering Courses
  • Berkeley: We've just assigned a delta-related
    homework to the students today
  • Stanford: I gave them a homework assignment for
    CS295 using delta. Feedback was positive but
    unquantified.
  • Why did it take so long to think of this simple
    thing?