Dynamically Discovering Likely Program Invariants to Support Program Evolution



1
Dynamically Discovering Likely Program Invariants
to Support Program Evolution
  • Michael D. Ernst, Jake Cockrell,
  • William G. Griswold, David Notkin
  • Presented by Nick Rutar

2
Program Invariants
  • Useful in software development
  • Protect programmers from making errant changes
  • Verify properties of a program
  • Can be explicitly stated in programs
  • Programmers can annotate code with invariants
  • This can take time and effort
  • Many important invariants will be missed

3
  • Could there be a way to dynamically discover
    program invariants???

4
Daikon: An Invariant Detector
  • Pick a source program (Daikon is language
    independent)
  • Instrument source program to trace variables of
    interest
  • Run instrumented program over test cases
  • Infer invariants over
  • Instrumented variables (variables present in
    source)
  • Derived variables
  • Created variables that might be of interest

5
Derived Variables
  • From any sequence s
  • Length: size(s)
  • Extremal elements: s[0], s[1], s[-1], s[-2]
  • From a numeric sequence s
  • sum(s), min(s), max(s)
  • From any sequence s and numeric variable i
  • Element at index: s[i], s[i-1]
  • Subsequences: s[0..i], s[0..i-1]
  • From function invocations
  • Number of calls so far
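
The derived variables listed on this slide can be illustrated with a short Python sketch. This is not Daikon's code: the function name derived_variables and the index_vars parameter are hypothetical, and Python's negative indexing stands in for the s[-1], s[-2] notation.

```python
def derived_variables(name, seq, index_vars=None):
    """Return a dict of derived variables for the sequence `seq`.

    `index_vars` (hypothetical parameter) maps the names of numeric
    variables to their values; it drives the s[i] and s[0..i] entries.
    """
    derived = {f"size({name})": len(seq)}
    # Extremal elements, when the sequence is long enough to have them.
    for idx in (0, 1, -1, -2):
        if -len(seq) <= idx < len(seq):
            derived[f"{name}[{idx}]"] = seq[idx]
    # Aggregates only make sense for non-empty numeric sequences.
    if seq and all(isinstance(x, (int, float)) for x in seq):
        derived[f"sum({name})"] = sum(seq)
        derived[f"min({name})"] = min(seq)
        derived[f"max({name})"] = max(seq)
    # Elements and subsequences relative to each numeric variable i.
    for var, i in (index_vars or {}).items():
        if 0 <= i < len(seq):
            derived[f"{name}[{var}]"] = seq[i]
            derived[f"{name}[0..{var}]"] = seq[: i + 1]
        if 0 <= i - 1 < len(seq):
            derived[f"{name}[{var}-1]"] = seq[i - 1]
        if 0 <= i <= len(seq):
            derived[f"{name}[0..{var}-1]"] = seq[:i]
    return derived
```

Calling derived_variables("B", [3, 5, 8], {"I": 2}) yields entries such as size(B), B[-1], sum(B), B[I] and B[0..I-1], which an invariant detector could then treat like ordinary traced variables.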

6
Example Program (taken from The Science of
Programming)
  • i, s := 0, 0
  • do i ≠ n →
  • i, s := i + 1, s + b[i]
  • Precondition
  • n ≥ 0
  • Postcondition
  • s = (Σ j : 0 ≤ j < n : b[j])
  • Loop Invariant
  • 0 ≤ i ≤ n and
  • s = (Σ j : 0 ≤ j < i : b[j])
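
For readers unfamiliar with guarded-command notation, the program above can be transliterated into Python, with the precondition, loop invariant and postcondition written as assert statements. This is a sketch of the slide's example, not code from the book or the paper; the function name array_sum and the extra bounds check on n are additions.

```python
def array_sum(b, n):
    """Sum the first n elements of b, checking the stated conditions."""
    # Precondition n >= 0; the upper bound is an added safety check.
    assert 0 <= n <= len(b)
    i, s = 0, 0
    while i != n:
        # Loop invariant: 0 <= i <= n and s is the sum of b[0..i-1].
        assert 0 <= i <= n and s == sum(b[:i])
        # Simultaneous assignment: the right-hand side uses the old i.
        i, s = i + 1, s + b[i]
    # Postcondition: s is the sum of b[0..n-1].
    assert s == sum(b[:n])
    return s
```

These are exactly the properties a programmer would have to write by hand; the point of the talk is to recover them from executions instead.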

7
Daikon results from the program (100 randomly
generated input arrays of length 7-13)
  • ENTER
  • N = size(B)
  • N in [7..13]
  • B: all elements ≥ -100
  • EXIT
  • N = I = orig(N) = size(B)
  • B = orig(B)
  • S = sum(B)
  • N in [7..13]
  • B: all elements ≥ -100
  • LOOP
  • N = size(B)
  • S = sum(B[0..I-1])
  • N in [7..13]
  • I in [0..13]
  • I ≤ N
  • B: all elements in [-100..100]
  • sum(B) in [-556..539]
  • B[0] nonzero in [-99..96]
  • B[-1] in [-88..99]
  • N ≠ B[-1]
  • B[0] ≠ B[-1]

boxes indicate generated invariants that match
expected ones
8
Architecture of the Daikon tool
[Diagram: Original Program -> Instrument -> Instrumented Program -> Run (over Test Suite) -> Data Trace -> Detect Invariants -> Invariants]
9
Instrument: Original Program -> Instrumented Program
  • Daikon has instrumenters for Java, C, and Lisp
  • Source to Source Translation
  • Determines which variables are in scope
  • Inserts code to dump the variables into an output
    file
  • Creates a declaration file
  • Variables being instrumented
  • Types in the original program
  • Representations in the trace file
  • Sets of variables that may be sensibly compared
  • Operates only on scalar numbers and arrays of
    numbers.
  • Scalar numbers include characters and booleans
  • Any other type is converted to one of these forms
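
The idea of inserting code that dumps in-scope variables can be imitated in Python with a decorator. This is only an analogy: Daikon's instrumenters rewrite source code, whereas this sketch wraps a function at run time. The names TRACE, instrument and total, and the ":::ENTER" / ":::EXIT" labels, are illustrative choices.

```python
import copy
import functools
import inspect

TRACE = []  # in-memory stand-in for the data trace file


def instrument(func):
    """Wrap func so that entry and exit values are appended to TRACE."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        # Snapshot the arguments: the variables in scope at procedure entry.
        entry_vars = copy.deepcopy(dict(bound.arguments))
        TRACE.append((f"{func.__name__}:::ENTER", entry_vars))
        result = func(*args, **kwargs)
        # At exit, dump the (possibly mutated) arguments and the return value.
        exit_vars = copy.deepcopy(dict(bound.arguments))
        exit_vars["return"] = result
        TRACE.append((f"{func.__name__}:::EXIT", exit_vars))
        return result

    return wrapper


@instrument
def total(b, n):
    return sum(b[:n])
```

After calling total([1, 2, 3], 2), TRACE holds one entry record and one exit record for the program points of total.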

10
Run: Instrumented Program -> Data Trace
  • At each program point of interest
  • Instrumented Program writes to a data trace file
  • All variables in scope
  • Global Variables
  • Procedure Arguments
  • Local Variables
  • Return Values (at procedure exits)
  • Modification bit
  • Whether a value has been set since last time
  • For small programs runtime may be I/O bound
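
A minimal sketch of a trace writer with a modification bit follows. It assumes that "since last time" means since the same program point was last reached; the slide does not say so explicitly. The Tracer class and its methods are hypothetical and do not reflect Daikon's trace file format.

```python
class Tracer:
    """Records variable values at program points with a modification bit."""

    def __init__(self):
        self.values = {}    # variable name -> current value
        self.version = {}   # variable name -> number of assignments so far
        self.seen = {}      # (point, name) -> version at the last dump there
        self.records = []   # stand-in for the data trace file

    def assign(self, name, value):
        """Set a variable; every assignment counts, even to the same value."""
        self.values[name] = value
        self.version[name] = self.version.get(name, 0) + 1

    def dump(self, point):
        """Write all variables with a bit saying whether each was set
        since this program point was last reached."""
        record = {}
        for name, value in self.values.items():
            modified = self.seen.get((point, name)) != self.version[name]
            record[name] = (value, int(modified))
            self.seen[(point, name)] = self.version[name]
        self.records.append((point, record))
        return record
```

With this design, a loop that only updates i produces records in which n carries bit 0 after the first iteration, so a detector could skip re-examining it.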

11
Detect Invariants: Data Trace -> Invariants
  • Single variable invariants (numeric or sequence)
  • Constant value: x = a (variable is a constant)
  • Uninitialized: x = uninit (variable is never set)
  • Modulus: x ≡ a (mod b) (x mod b = a always holds)
  • Multiple variables, up to 3 (numeric or sequence)
  • Linear relationship: y = ax + b
  • Reversal: x is the reverse of y
  • Invariants over x - y, x + y
  • These are just a few
  • Complete list can be found in the paper
  • Domain-Specific invariants can easily be coded in

12
Run Time of Daikon
  • Informally, can be characterized as
  • Time = O((vars³ × falsetime + trueinvs × testsuite) × program)
  • vars is the number of variables at a program
    point (in scope)
  • Most invariants are falsified quickly
  • Only true invariants are checked for the entire
    run
  • Potentially cubic because invariants involve at
    most 3 variables
  • falsetime is the (small constant) time to falsify
    a potential invariant
  • trueinvs is the (small) number of true invariants
    at a program point
  • testsuite is the size of the test suite
  • Must balance accuracy versus runtime
  • program is the number of instrumented program
    points
  • The default is proportional to the size of the
    program
  • Users can control the extent of instrumentation

13
Invariant Stability
  • Size of Test Suite
  • Too Small
  • Small number of invariants
  • More false invariants
  • Too large
  • Increases runtime linearly
  • Interesting vs. Uninteresting
  • Test suites of different sizes will yield more or
    fewer invariants
  • Uninteresting
  • Difference in a bound on a variable's range
  • Different small set of possible values
  • Interesting: everything else

14
Invariant differences (2500-element test suite)
Invariant type / test cases    500   1000   1500   2000
Identical unary               2129   2419   2553   2612
Missing unary                  125     47     27     14
Diff unary                     442    230    117     73
  Interesting                   57     18     10      8
  Uninteresting                385    212    107     65
Identical binary              5296   9102  12515  14089
Missing binary                4089   1921   1206    732
Diff binary                    109     45     24     19
  Interesting                   22     21     15     13
  Uninteresting                 87     24      9      6
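
A comparison like the one tabulated above can be sketched as follows. The representation is an assumption: each invariant is keyed by its variables and kind, with the details (a bound, a constant, a set of values) as the dictionary value. The example data are invented for illustration and are not taken from the paper.

```python
def compare_invariants(reference, other):
    """Classify the reference invariants against those from another suite.

    Both arguments map a key (variables, kind) to the invariant's details,
    e.g. (("x",), "range") -> (0, 40).
    """
    counts = {"identical": 0, "missing": 0, "different": 0}
    for key, detail in reference.items():
        if key not in other:
            counts["missing"] += 1      # not reported for the other suite
        elif other[key] == detail:
            counts["identical"] += 1    # same invariant, same details
        else:
            counts["different"] += 1    # same kind, different details
    return counts


# Invented example: invariants from a large suite and from a smaller one.
large_suite = {
    (("x",), "range"): (0, 40),
    (("y",), "constant"): 8,
    (("x", "y"), "<="): None,
}
small_suite = {
    (("x",), "range"): (0, 35),
    (("y",), "constant"): 8,
}
```

Here compare_invariants(large_suite, small_suite) reports one identical, one missing and one different invariant; the range difference would count as uninteresting under the slide's definition.
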
15
Invariants and Program Correctness
  • Compare invariants detected across programs
  • Correct versions of programs have more invariants
    than incorrect ones
  • Examination of 424 intro C programs from U of
    Washington
  • Given # of students, amount of money, # of
    pizzas, calculates whether the students can
    afford the pizzas.
  • Chose eight relevant invariants
  • people in [1..50]
  • pizzas in [1..10]
  • pizza_price in {9, 11}
  • excess_money in [0..40]
  • slices = 8 * pizzas
  • slices = 0 (mod 8)
  • slices_per in {0, 1, 2, 3}
  • slices_left ≤ people - 1
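
The eight goal invariants can be written as predicates and checked against traced values, as in the sketch below. The operators follow one reading of the slide (ranges, value sets, an equality, a congruence and an upper bound), and the sample values are invented. The study itself ran Daikon on each student program and counted which goal invariants it reported, so this direct check is a simplification.

```python
GOAL_INVARIANTS = {
    "people in [1..50]": lambda v: 1 <= v["people"] <= 50,
    "pizzas in [1..10]": lambda v: 1 <= v["pizzas"] <= 10,
    "pizza_price in {9, 11}": lambda v: v["pizza_price"] in (9, 11),
    "excess_money in [0..40]": lambda v: 0 <= v["excess_money"] <= 40,
    "slices = 8 * pizzas": lambda v: v["slices"] == 8 * v["pizzas"],
    "slices = 0 (mod 8)": lambda v: v["slices"] % 8 == 0,
    "slices_per in {0, 1, 2, 3}": lambda v: v["slices_per"] in (0, 1, 2, 3),
    "slices_left <= people - 1": lambda v: v["slices_left"] <= v["people"] - 1,
}


def goal_invariants_held(samples):
    """Return the goal invariants that hold on every sample."""
    return [text for text, holds in GOAL_INVARIANTS.items()
            if all(holds(v) for v in samples)]


# Invented samples: one consistent run and one with a miscounted slice total.
good_run = {"people": 10, "pizzas": 3, "pizza_price": 9, "excess_money": 5,
            "slices": 24, "slices_per": 2, "slices_left": 4}
bad_run = dict(good_run, slices=25)
```

All eight invariants hold for good_run alone; adding bad_run leaves six, because both slice-count invariants are violated.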

16
Relationship of Grade and Goal Invariants
           Invariants Detected
Grade     2     3     4     5     6
12        4     2     0     0     0
14        9     2     5     2     0
15       15    23    27    11     3
16       33    40    42    19     9
17       13    10    23    27     7
18       16     5    29    27    21
17
Other Applications of Invariants
  • Inserted as assert statements for testing
  • Double-check existing documentation
  • Check against existing assert statements
  • Useful when program self-checks are ineffective
  • Discovering Bugs
  • Generate test cases or validate existing test
    suites
  • Could possibly direct a correctness proof

18
Ongoing and Future Work
  • Increasing Relevance
  • Invariant is relevant if it assists programmer
  • Suppress invariants logically implied by others
  • Unrelated variables don't need to be compared
  • Ignore variables not assigned since last time
  • Viewing and Managing Invariants
  • Overwhelming for a programmer to sort through
  • Various tools for selective reporting of
    invariants
  • Ordering by category
  • Retrieves invariants based on supplied property
  • List of invariants by program point

19
More Ongoing Work
  • Improving Performance
  • Balance between invariant quality and runtime
  • Number of Derived Variables used
  • Richer Invariants
  • Invariants over Pointer based data structures
  • Computing Conditional Invariants

20
Resources
  • Daikon website
  • http://pag.lcs.mit.edu/daikon/download/
  • Contains links to
  • Papers
  • Source Code
  • User Manual
  • Developers Manual

21
Questions???