Title: Dynamically Discovering Likely Program Invariants to Support Program Evolution
1. Dynamically Discovering Likely Program Invariants to Support Program Evolution
- Michael D. Ernst, Jake Cockrell,
- William G. Griswold, David Notkin
- Presented by Nick Rutar
2. Program Invariants
- Useful in software development
- Protect programmers from making errant changes
- Verify properties of a program
- Can be explicitly stated in programs
- Programmers can annotate code with invariants
- This can take time and effort
- Many important invariants will be missed
3. Could there be a way to dynamically discover program invariants???
4. Daikon: An Invariant Detector
- Pick a source program (Daikon is language independent)
- Instrument source program to trace variables of interest
- Run instrumented program over test cases
- Infer invariants over
- Instrumented variables (variables present in source)
- Derived variables
- Created variables that might be of interest
5. Derived Variables
- From any sequence s
- Length: size(s)
- Extremal elements: s[0], s[1], s[-1], s[-2]
- From a numeric sequence
- sum(s), min(s), max(s)
- Any sequence s and numeric variable i
- Element at index: s[i], s[i-1]
- Subsequences: s[0..i], s[0..i-1]
- From function invocations
- Number of calls so far
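A minimal sketch of how the sequence-based derived variables listed above might be computed for one trace sample. The function name `derive` and the dictionary keys are illustrative assumptions, not Daikon's own code or naming; subsequence bounds are treated as inclusive.

```python
def derive(s, i=None):
    """Return derived variables for sequence s and an optional numeric variable i.

    Hypothetical helper for illustration only; not part of Daikon.
    """
    derived = {"size(s)": len(s)}
    # Extremal elements, recorded only where the sequence is long enough.
    for k in (0, 1, -1, -2):
        if -len(s) <= k < len(s):
            derived[f"s[{k}]"] = s[k]
    # Aggregates apply to non-empty numeric sequences.
    if s and all(isinstance(x, (int, float)) for x in s):
        derived.update({"sum(s)": sum(s), "min(s)": min(s), "max(s)": max(s)})
    # Elements and subsequences selected by the numeric variable i.
    if i is not None:
        if 0 <= i < len(s):
            derived["s[i]"] = s[i]
            derived["s[0..i]"] = s[: i + 1]
        if 1 <= i <= len(s):
            derived["s[i-1]"] = s[i - 1]
        if 0 <= i <= len(s):
            derived["s[0..i-1]"] = s[:i]  # empty when i == 0
    return derived


sample = derive([3, 1, 4, 1, 5], i=2)
```

Derived variables are then treated like any other variable when invariants are inferred, which is how a relationship such as S = sum(B[0..I-1]) can be reported.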
6. Example Program (taken from The Science of Programming)
- i, s := 0, 0
- do i ≠ n →
- i, s := i + 1, s + b[i]
- Precondition
- n ≥ 0
- Postcondition
- s = (Σ j : 0 ≤ j < n : b[j])
- Loop Invariant
- 0 ≤ i ≤ n and
- s = (Σ j : 0 ≤ j < i : b[j])
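The example program can be sketched in Python with its precondition, loop invariant, and postcondition written as assertions. The function name `array_sum` and the extra `n <= len(b)` check are assumptions added so the sketch is safe to run; they are not part of the original program.

```python
def array_sum(b, n):
    """Sum the first n elements of b, asserting the stated conditions."""
    # Precondition: n >= 0 (the upper bound on n is an added assumption).
    assert 0 <= n <= len(b)
    i, s = 0, 0
    while i != n:
        # Loop invariant: 0 <= i <= n and s equals the sum of b[0..i-1].
        assert 0 <= i <= n and s == sum(b[:i])
        # Simultaneous assignment: the right-hand side uses the old value of i.
        i, s = i + 1, s + b[i]
    # Postcondition: s equals the sum of b[0..n-1].
    assert s == sum(b[:n])
    return s
```

These hand-written assertions are the kind of properties a dynamic detector would be expected to rediscover from traces of the uninstrumented program.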
7. Daikon results from the program (100 randomly generated input arrays of length 7-13)
- ENTER
- N = size(B)
- N in [7..13]
- B: All elements >= -100
- EXIT
- N = I = orig(N) = size(B)
- B = orig(B)
- S = sum(B)
- N in [7..13]
- B: All elements >= -100
- LOOP
- N = size(B)
- S = sum(B[0..I-1])
- N in [7..13]
- I in [0..13]
- I <= N
- B: All elements in [-100..100]
- sum(B) in [-556..539]
- B[0] nonzero in [-99..96]
- B[-1] in [-88..99]
- N != B[-1]
- B[0] != B[-1]
boxes indicate generated invariants that match
expected ones
8. Architecture of the Daikon tool
[Diagram: Original Program → Instrument → Instrumented Program → Run (over Test Suite) → Data Trace → Detect Invariants → Invariants]
9. Instrument
[Diagram: Original Program → Instrument → Instrumented Program]
- Daikon has instrumenters for Java, C, and Lisp
- Source to Source Translation
- Determines which variables are in scope
- Inserts code to dump the variables into an output file
- Creates a declaration file
- Variables being instrumented
- Types in the original program
- Representations in the trace file
- Sets of variables that may be sensibly compared
- Operates only on scalar numbers and arrays of numbers
- Scalar numbers include characters and booleans
- Any other type is converted to one of these forms
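The idea of instrumentation can be illustrated with a runtime wrapper that records a procedure's arguments at entry and exit. This is a swapped-in technique for illustration: Daikon's instrumenters rewrite source code, whereas this sketch uses a Python decorator. The names `TRACE`, `instrument`, and `total` are hypothetical, and the in-memory list stands in for the output file.

```python
import copy
import functools
import inspect

TRACE = []  # in-memory stand-in for the data trace file (assumption)


def instrument(func):
    """Wrap func so its in-scope arguments are recorded at entry and exit."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        # Snapshot values so later mutation does not alter earlier records.
        TRACE.append((f"{func.__name__}:::ENTER",
                      copy.deepcopy(dict(bound.arguments))))
        result = func(*args, **kwargs)
        at_exit = copy.deepcopy(dict(bound.arguments))
        at_exit["return"] = result  # return value is only known at exit
        TRACE.append((f"{func.__name__}:::EXIT", at_exit))
        return result

    return wrapper


@instrument
def total(b, n):
    """Hypothetical procedure under test: sum of b[0..n-1]."""
    return sum(b[:n])
```

Calling `total([1, 2, 3], 2)` appends one ENTER record and one EXIT record to `TRACE`.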
10. Run
[Diagram: Instrumented Program → Run → Data Trace]
- At each program point of interest
- Instrumented Program writes to a data trace file
- All variables in scope
- Global Variables
- Procedure Arguments
- Local Variables
- Return Values (at procedure exits)
- Modification bit
- Whether a value has been set since last time
- For small programs runtime may be I/O bound
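A small sketch of what writing trace samples with a modification bit could look like. The class `TracedVars` and the tab-separated record layout are assumptions made for illustration and do not reproduce Daikon's actual trace file format.

```python
import io


class TracedVars:
    """Holds variable values plus a 'set since last dump' flag per variable."""

    def __init__(self):
        self._values = {}
        self._modified = set()

    def set(self, name, value):
        """Assign a variable and remember that it was set."""
        self._values[name] = value
        self._modified.add(name)

    def dump(self, point, out):
        """Write the program point, then name, value, and modification bit."""
        out.write(f"{point}\n")
        for name in sorted(self._values):
            bit = 1 if name in self._modified else 0
            out.write(f"{name}\t{self._values[name]!r}\t{bit}\n")
        self._modified.clear()  # the bit is relative to the previous dump


out = io.StringIO()  # stands in for the data trace file
v = TracedVars()
v.set("i", 0)
v.set("n", 7)
v.dump("LOOP", out)
v.set("i", 1)        # n is not assigned again before the next sample
v.dump("LOOP", out)
trace_lines = out.getvalue().splitlines()
```

In the second sample `n` carries a modification bit of 0, which lets a detector skip values that have not been set since the previous sample.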
11. Detect Invariants
[Diagram: Data Trace → Detect Invariants → Invariants]
- Single variable invariants (numeric or sequence)
- Constant value: x = a (variable is a constant)
- Uninitialized: x = uninit (variable is never set)
- Modulus: x ≡ a (mod b) (x mod b = a always holds)
- Multiple variables, up to 3 (numeric or sequence)
- Linear relationship: y = ax + b
- Reversal: x is the reverse of y
- Invariants over x - y, x + y
- These are just a few
- Complete list can be found in the paper
- Domain-specific invariants can easily be coded in
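A hedged sketch of checking three of the invariant forms above against trace samples: each candidate is proposed from the first samples and then dropped as soon as any sample falsifies it. The function names and return formats are assumptions for illustration; this is not Daikon's implementation.

```python
from fractions import Fraction


def constant(values):
    """Return 'x = a' if every sample equals the first one, else None."""
    a = values[0]
    return f"x = {a}" if all(v == a for v in values) else None


def modulus(values, b):
    """Return 'x = a (mod b)' if x mod b is the same for every sample."""
    a = values[0] % b
    return f"x = {a} (mod {b})" if all(v % b == a for v in values) else None


def linear(xs, ys):
    """Return (a, b) such that y = a*x + b holds for every sample, else None."""
    pairs = list(zip(xs, ys))
    x0, y0 = pairs[0]
    # Fit the line through the first sample and the first sample with another x.
    other = next(((x, y) for x, y in pairs if x != x0), None)
    if other is None:
        return None  # x never varies, so no slope can be fitted
    a = Fraction(other[1] - y0, other[0] - x0)  # exact rational arithmetic
    b = y0 - a * x0
    # Try to falsify the fitted line on all samples.
    return (a, b) if all(y == a * x + b for x, y in pairs) else None


pizzas = [1, 2, 3, 10]      # hypothetical sample values
slices = [8, 16, 24, 80]
```

With these samples, `linear(pizzas, slices)` yields slope 8 and intercept 0, and `modulus(slices, 8)` reports a remainder of 0.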
12. Run Time of Daikon
- Informally, can be characterized as
- Time = O((vars³ × falsetime + trueinvs × testsuite) × program)
- vars is the number of variables at a program point (in scope)
- Most invariants are falsified quickly
- Only true invariants are checked for the entire run
- Potentially cubic because invariants involve at most 3 variables
- falsetime is the (small constant) time to falsify a potential invariant
- trueinvs is the (small) number of true invariants at a program point
- testsuite is the size of the test suite
- Must balance accuracy versus runtime
- program is the number of instrumented program points
- The default is proportional to the size of the program
- Users can control the extent of instrumentation
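The informal bound can be made concrete with a little arithmetic. The sketch below counts how many one-, two-, and three-variable combinations exist at a program point (the source of the cubic term) and evaluates the bound with constant factors ignored. All input numbers are hypothetical, and the result is in arbitrary units, not seconds.

```python
from math import comb


def candidate_tuples(num_vars):
    """Number of 1-, 2-, and 3-variable combinations at one program point."""
    return comb(num_vars, 1) + comb(num_vars, 2) + comb(num_vars, 3)


def estimated_cost(num_vars, falsetime, trueinvs, testsuite, program):
    """Evaluate (vars^3 * falsetime + trueinvs * testsuite) * program."""
    return (num_vars ** 3 * falsetime + trueinvs * testsuite) * program


# Doubling the variables in scope multiplies the candidate tuples by roughly 8.
small = candidate_tuples(10)
large = candidate_tuples(20)

# Hypothetical inputs: 10 variables, unit falsification time,
# 20 true invariants, 100 test cases, 5 instrumented program points.
cost = estimated_cost(10, 1, 20, 100, 5)
```

In this example the test-suite term (20 × 100) already outweighs the cubic term (10³ × 1), which illustrates why true invariants, checked over the entire run, matter for the total cost.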
13. Invariant Stability
- Size of Test Suite
- Too small
- Small number of invariants
- More false invariants
- Too large
- Increases runtime linearly
- Interesting vs. Uninteresting
- Different size test suites will have more/fewer invariants
- Uninteresting
- Difference in a bound on a variable's range
- Different small set of possible values
- Interesting: everything else
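One way to picture the comparison is a function that classifies the invariants from a large reference suite relative to those from a smaller suite. The data representation (a key mapped to a kind and a detail) and the kind names `range` and `value_set` are assumptions chosen for this sketch.

```python
def compare(small, reference):
    """Classify reference invariants relative to those from a smaller suite.

    Each argument maps a key (program point plus variables) to a
    (kind, detail) pair. Hypothetical representation for illustration.
    """
    counts = {"identical": 0, "missing": 0, "interesting": 0, "uninteresting": 0}
    for key, (kind, detail) in reference.items():
        if key not in small:
            counts["missing"] += 1
        elif small[key] == (kind, detail):
            counts["identical"] += 1
        elif small[key][0] == kind and kind in ("range", "value_set"):
            # Only a bound or a small set of possible values differs.
            counts["uninteresting"] += 1
        else:
            counts["interesting"] += 1
    return counts


reference = {
    ("LOOP", "N"): ("range", (7, 13)),
    ("LOOP", "I", "N"): ("order", "I <= N"),
    ("EXIT", "S", "B"): ("equal", "S = sum(B)"),
    ("LOOP", "B[0]"): ("range", (-99, 96)),
}
small_suite = {
    ("LOOP", "N"): ("range", (8, 13)),        # same kind, different bound
    ("LOOP", "I", "N"): ("order", "I <= N"),  # identical
    ("LOOP", "B[0]"): ("nonzero", None),      # different kind of invariant
}
result = compare(small_suite, reference)
```

Here the changed bound on N counts as an uninteresting difference, while the changed form of the B[0] invariant counts as an interesting one.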
14. Invariant differences (2500-element test suite)
Invariant Type / Test Cases     500   1000   1500   2000
Identical Unary                2129   2419   2553   2612
Missing Unary                   125     47     27     14
Diff Unary                      442    230    117     73
  Interesting                    57     18     10      8
  Uninteresting                 385    212    107     65
Identical Binary               5296   9102  12515  14089
Missing Binary                 4089   1921   1206    732
Diff Binary                     109     45     24     19
  Interesting                    22     21     15     13
  Uninteresting                  87     24      9      6
15. Invariants and Program Correctness
- Compare invariants detected across programs
- Correct versions of programs have more invariants than incorrect ones
- Examination of 424 intro C programs from U of Washington
- Given number of students, amount of money, and number of pizzas, calculates whether the students can afford the pizzas
- Chose eight relevant invariants
- people ∈ [1..50]
- pizzas ∈ [1..10]
- pizza_price ∈ {9, 11}
- excess_money ∈ [0..40]
- slices = 8 × pizzas
- slices = 0 (mod 8)
- slices_per ∈ {0, 1, 2, 3}
- slices_left ≤ people - 1
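A rough sketch of scoring one program run against the eight goal invariants. It only checks whether each goal property survives every sample, which is a simplification of comparing against a detector's reported invariants. The bounds are an assumed reading of the list above, and the sample values are invented for illustration.

```python
# Each goal invariant is a predicate over one sample (a dict of variable values).
GOAL_INVARIANTS = {
    "people in [1..50]": lambda v: 1 <= v["people"] <= 50,
    "pizzas in [1..10]": lambda v: 1 <= v["pizzas"] <= 10,
    "pizza_price in {9, 11}": lambda v: v["pizza_price"] in (9, 11),
    "excess_money in [0..40]": lambda v: 0 <= v["excess_money"] <= 40,
    "slices = 8 * pizzas": lambda v: v["slices"] == 8 * v["pizzas"],
    "slices = 0 (mod 8)": lambda v: v["slices"] % 8 == 0,
    "slices_per in {0, 1, 2, 3}": lambda v: v["slices_per"] in (0, 1, 2, 3),
    "slices_left <= people - 1": lambda v: v["slices_left"] <= v["people"] - 1,
}


def goals_satisfied(samples):
    """Return the names of goal invariants that no sample falsifies."""
    return [name for name, holds in GOAL_INVARIANTS.items()
            if all(holds(v) for v in samples)]


good = {"people": 6, "pizzas": 2, "pizza_price": 9, "excess_money": 2,
        "slices": 16, "slices_per": 2, "slices_left": 4}
# Hypothetical buggy run: leftover slices were never divided among people.
bad = {"people": 5, "pizzas": 1, "pizza_price": 11, "excess_money": 0,
       "slices": 8, "slices_per": 1, "slices_left": 8}
```

A run that produces only `good`-style samples satisfies all eight goals, while adding the `bad` sample falsifies the bound on `slices_left`.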
16. Relationship of Grade and Goal Invariants
       Invariants Detected
Grade     2     3     4     5     6
12        4     2     0     0     0
14        9     2     5     2     0
15       15    23    27    11     3
16       33    40    42    19     9
17       13    10    23    27     7
18       16     5    29    27    21
17. Other Applications of Invariants
- Inserted as assert statements for testing
- Double-check existing documentation
- Check against existing assert statements
- Useful when program self-checks are ineffective
- Discovering Bugs
- Generate test cases or validate existing test suites
- Could possibly direct a correctness proof
18. Ongoing and Future Work
- Increasing Relevance
- Invariant is relevant if it assists programmer
- Suppress invariants logically implied by others
- Unrelated variables don't need to be compared
- Ignore variables not assigned since last time
- Viewing and Managing Invariants
- Overwhelming for a programmer to sort through
- Various tools for selective reporting of invariants
- Ordering by category
- Retrieves invariants based on supplied property
- List of invariants by program point
19. More Ongoing Work
- Improving Performance
- Balance between invariant quality and runtime
- Number of Derived Variables used
- Richer Invariants
- Invariants over Pointer based data structures
- Computing Conditional Invariants
20. Resources
- Daikon website
- http://pag.lcs.mit.edu/daikon/download/
- Contains links to
- Papers
- Source Code
- User Manual
- Developers Manual
21. Questions???