Title: Dynamically Discovering Likely Program Invariants to Support Program Evolution
1Dynamically Discovering Likely Program Invariants
to Support Program Evolution
- Michael D. Ernst, Jake Cockrell,
- William G. Griswold, David Notkin
- Presented by Nick Rutar
2Program Invariants
- Useful in software development
- Protect programmers from making errant changes
- Verify properties of a program
- Can be explicitly stated in programs
- Programmers can annotate code with invariants
- This can take time and effort
- Many important invariants will be missed
3Daikon - Dynamic Invariant Detector
- Dynamic -- From Program Executions
- Step 1: Instrument Source Program
- Trace Variables of Interest
- Step 2: Run Instrumented Program Over Test Suite
- Step 3: Infer Invariants from
- Instrumented Variables
- Derived Variables
4Example Program (taken from The Science of Programming)
- i := 0
- s := 0
- do i ≠ n →
- i, s := i + 1, s + b[i]
- Precondition
- n ≥ 0
- Postcondition
- s = (Σ j : 0 ≤ j < n : b[j])
- Loop Invariant
- 0 ≤ i ≤ n and
- s = (Σ j : 0 ≤ j < i : b[j])
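The annotations on this slide can be written as runtime assertions. The following is a minimal Python sketch, not code from the paper or the book; the function name `array_sum` is an assumption, and the loop body is read as a simultaneous assignment of `i` and `s`.

```python
def array_sum(b, n):
    """Sum b[0..n-1], asserting the precondition, loop invariant, and postcondition."""
    assert n >= 0                       # precondition
    i, s = 0, 0
    while i != n:
        # loop invariant, checked at the loop head
        assert 0 <= i <= n and s == sum(b[:i])
        # simultaneous assignment: b[i] uses the old value of i
        i, s = i + 1, s + b[i]
    assert s == sum(b[:n])              # postcondition
    return s
```

These are exactly the properties a programmer would have to write by hand; the point of the technique is to recover them from executions instead.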
5Daikon results from the program (100 randomly generated input arrays of length 7-13)
- ENTER
- N = size(B)
- N in [7..13]
- B - All elements >= -100
- EXIT
- N = I = orig(N) = size(B)
- B = orig(B)
- S = sum(B)
- N in [7..13]
- B - All elements >= -100
- LOOP
- N = size(B)
- S = sum(B[0..I-1])
- N in [7..13]
- I in [0..13]
- I <= N
- B - All elements in [-100..100]
- sum(B) in [-556..539]
- B[0] nonzero in [-99..96]
- B[-1] in [-88..99]
- N != B[-1] (negative)
- B[0] != B[-1] (negative)
6Instrumentation
- Insert instrumentation points
- Procedure Entry
- Procedure Exit
- Loop Heads
- Writes values to a file for
- All variables in scope
- Global Variables
- Procedure arguments
- Local Variables
- Procedure return values
- Available for Platforms
- LISP
- C/C++
- Java (from Daikon website)
- Eclipse plug-in available
- Perl (from Daikon website)
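To make the instrumentation points concrete, here is a hand-instrumented Python sketch of the array-sum example. It is an illustration only, not Daikon's instrumenter: the names `traced_sum` and `run_test_suite` are assumptions, and records are appended to an in-memory list rather than written to a trace file.

```python
import random


def traced_sum(b, n, trace):
    """Sum b[0..n-1], appending one record per instrumentation point."""
    trace.append(("ENTER", {"B": list(b), "N": n}))
    i, s = 0, 0
    while True:
        # Loop head: recorded each time the loop condition is tested.
        trace.append(("LOOP", {"B": list(b), "N": n, "I": i, "S": s}))
        if i == n:
            break
        i, s = i + 1, s + b[i]
    trace.append(("EXIT", {"B": list(b), "N": n, "I": i, "S": s}))
    return s


def run_test_suite(num_inputs=100, seed=0):
    """Run the instrumented program over random arrays of length 7-13."""
    rng = random.Random(seed)
    trace = []
    for _ in range(num_inputs):
        n = rng.randint(7, 13)
        b = [rng.randint(-100, 100) for _ in range(n)]
        traced_sum(b, n, trace)
    return trace
```

Each record pairs a program point with the values of all variables in scope, which is the raw material the inference step consumes.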
7Inferring invariants
- System checks for the following (x, y, z are variables; a, b, c are computed constants)
- Any variable
- constant or small number of values
- Numeric variable
- range (a ≤ x ≤ b)
- modulus, nonmodulus
- Multiple numbers
- linear relationship (such as x = ay + bz + c)
- functions (all those in standard lib, e.g. x = abs(y))
- comparisons (x < y, x ≤ y, x = y)
- invariants over x + y and x - y
- Sequence
- sortedness
- invariants over all elements (e.g., every element < 100)
- Multiple sequences
- subsequence, lexicographic relationship
- Sequence and scalar
- membership
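A few of these checks can be sketched in Python. This is a simplified illustration, not Daikon's implementation: the candidate table, the function names, and the gcd-based modulus test are all assumptions chosen for brevity.

```python
import math

# Hypothetical candidate invariants over a pair of numeric variables.
PAIR_CANDIDATES = {
    "x < y": lambda x, y: x < y,
    "x <= y": lambda x, y: x <= y,
    "x == y": lambda x, y: x == y,
    "x == abs(y)": lambda x, y: x == abs(y),
}


def surviving_pair_invariants(samples):
    """Return the candidates that hold on every (x, y) sample."""
    alive = dict(PAIR_CANDIDATES)
    for x, y in samples:
        # Drop every candidate that this sample contradicts.
        for name in [n for n, holds in alive.items() if not holds(x, y)]:
            del alive[name]
    return sorted(alive)


def value_range(values):
    """Range invariant a <= x <= b over the observed values."""
    return min(values), max(values)


def modulus(values):
    """Return (a, b) such that x % b == a for all values, or None if no b > 1 exists."""
    first = values[0]
    g = 0
    for v in values:
        g = math.gcd(g, abs(v - first))
    if g <= 1:  # 0 means the variable is constant; 1 means no useful modulus
        return None
    return first % g, g
```

For example, the samples `[(1, 2), (3, 5)]` leave only `x < y` and `x <= y` standing, and the values `[8, 16, 40]` yield the modulus invariant `(0, 8)`.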
8Inferring invariants (continued)
- Each potential invariant is tested
- When an invariant doesn't hold, it is not tested again
- Negative Invariants
- Relationships that are expected but don't occur in the input
- Probability limit decides if invariants are included
- Derived Variables
- Expressions treated the same as regular variables
- Include
- From any array: first and last elements, length
- From numeric array: sum, min, max
- From array and scalar: element at that index (a[i]), subarray up to, and subarray beyond, that index
- From function invocation: number of calls so far
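The derived variables and the probability limit can be illustrated with a short Python sketch. The function names, the exact subarray bounds, and the default limit of 0.01 are assumptions made for illustration. The probability formula assumes values uniformly distributed over a range of size r that includes 0, so that s samples all miss 0 by chance with probability (1 - 1/r)^s.

```python
def derived_variables(a, i=None):
    """Derive extra variables from array a and, optionally, scalar index i."""
    derived = {"size(a)": len(a)}
    if a:
        derived.update({
            "a[0]": a[0], "a[-1]": a[-1],
            "sum(a)": sum(a), "min(a)": min(a), "max(a)": max(a),
        })
    if i is not None and 0 <= i < len(a):
        derived.update({
            "a[i]": a[i],
            "a[0..i-1]": a[:i],     # subarray up to the index (exclusive here)
            "a[i+1..]": a[i + 1:],  # subarray beyond the index
        })
    return derived


def chance_of_never_zero(range_size, num_samples):
    """Probability that uniform samples over range_size values never equal 0."""
    return (1 - 1 / range_size) ** num_samples


def nonzero_is_justified(values, limit=0.01):
    """Report x != 0 only if its absence is unlikely to be a coincidence."""
    lo, hi = min(values), max(values)
    if 0 in values or not (lo < 0 < hi):
        return False
    return chance_of_never_zero(hi - lo + 1, len(values)) < limit
```

With only two samples spread over a range of 101 values, the absence of 0 is almost certainly a coincidence and the negative invariant is withheld; with twenty samples drawn from {-1, 1}, it is reported.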
9Using Invariants
- Modified Siemens replace (500 LOC) program
- Takes in regular expression and replacement string as input
- Copies input stream to output stream, replacing matched strings
- Added input pattern <pat>+, equivalent to <pat><pat>*
- Used invariants for a glimpse of how the program runs
- Found occurrences where initial belief was contradicted
- Prevented introducing bugs based on flawed knowledge of code
- Found instance of unreported array bounds error
10Using invariants (continued)
- Everything learned from replace could have been learned by a combination of
- Reading the code
- Static Analyses
- Selected Program Instrumentation
- Invariants give benefits that other approaches do not
- Inferred invariants are an abstraction of a larger amount of data
- Flags raised by unexpected invariants or expected invariants not appearing
- Queries against database build intuition about source of invariant
- Inferred invariants provide basis for programmer inferences
- Invariants provide beneficial degree of serendipity
11Results - Time
- Ran tests with between 500 and 3000 test inputs for replace
- On average, 71 variables per instrumentation point in replace
- 6 original, 65 derived, 52 scalars, 19 sequences
- On average, 10 derived for every original
- 1000 test cases
- Produce 10,120 samples per instrumentation point
- System takes 220 seconds to infer invariants
- 3000 test cases
- 33,801 samples
- Processing takes 540 seconds
- Invariant detection time grows quadratically with the number of variables over which invariants are checked
- Time grows linearly with test suite size
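One way to see where the quadratic growth can come from is to count the variable pairs over which binary invariants are candidates. This is a back-of-the-envelope sketch under the assumption that pairwise checks dominate the cost; the function name is hypothetical.

```python
from math import comb


def variable_pairs(num_variables):
    """Number of unordered variable pairs eligible for binary invariants."""
    return comb(num_variables, 2)
```

With the 71 variables per instrumentation point reported for replace, this gives 2485 pairs; doubling to 142 variables gives 10011, roughly four times as many.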
12Invariant Stability
- Relationship between test suite size and invariants
- Across test suites
- Identical - invariant same between two test suites
- Missing - invariant is present in one test suite, but not the other
- Different - invariant is different between two test suites
- Interesting - worthy of further study to determine relevance
- Uninteresting - peculiarity in the data
- S1 in [0..98] (99 values)
- S1 >= 0 (96 values)
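The identical / missing / different classification can be sketched as a comparison of two invariant sets. This is a simplified illustration, assuming each suite's invariants are stored in a dictionary keyed by the variable (or variable tuple) they constrain; that keying scheme and the function name are assumptions, not the paper's method.

```python
def compare_invariants(suite_a, suite_b):
    """Classify invariants from two test suites by key."""
    result = {"identical": [], "missing": [], "different": []}
    for key in sorted(set(suite_a) | set(suite_b)):
        if key not in suite_a or key not in suite_b:
            result["missing"].append(key)      # present in only one suite
        elif suite_a[key] == suite_b[key]:
            result["identical"].append(key)    # same invariant in both
        else:
            result["different"].append(key)    # same variables, different invariant
    return result
```

Deciding whether a "different" entry is interesting or merely a peculiarity of the data still requires a human judgment; the sketch only performs the mechanical comparison.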
13Invariant differences (2500-element test suite)
14Invariants and Program Correctness
- Compare invariants detected across programs
- Correct versions of programs have more invariants than incorrect ones
- Examination of 424 intro C programs from U of Washington
- Given # of students, amount of money, # of pizzas, calculates whether the students can afford the pizzas.
- Chose eight relevant invariants
- people in [1..50]
- pizzas in [1..10]
- pizza_price in {9, 11}
- excess_money in [0..40]
- slices = 8 * pizzas
- slices = 0 (mod 8)
- slices_per in {0, 1, 2, 3}
- slices_left <= people - 1
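Counting how many of the eight goal invariants hold over a program's recorded values could look like the Python sketch below. The exact form of each predicate is an assumed reading of the list above (ranges as inclusive, sets as membership), and the function name and record layout are hypothetical.

```python
# Assumed reading of the eight goal invariants as predicates over one sample.
GOAL_INVARIANTS = {
    "people in [1..50]": lambda v: 1 <= v["people"] <= 50,
    "pizzas in [1..10]": lambda v: 1 <= v["pizzas"] <= 10,
    "pizza_price in {9, 11}": lambda v: v["pizza_price"] in (9, 11),
    "excess_money in [0..40]": lambda v: 0 <= v["excess_money"] <= 40,
    "slices = 8 * pizzas": lambda v: v["slices"] == 8 * v["pizzas"],
    "slices = 0 (mod 8)": lambda v: v["slices"] % 8 == 0,
    "slices_per in {0, 1, 2, 3}": lambda v: v["slices_per"] in (0, 1, 2, 3),
    "slices_left <= people - 1": lambda v: v["slices_left"] <= v["people"] - 1,
}


def detected_goal_invariants(samples):
    """Return the goal invariants that hold on every recorded sample."""
    return [name for name, holds in GOAL_INVARIANTS.items()
            if all(holds(v) for v in samples)]
```

A program whose runs satisfy all eight predicates would score 8; a single run that leaves more slices than people minus one drops the count to 7.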
15Relationship of Grade and Goal Invariants
Invariants Detected
16Future Work (from 2001 paper)
- Increasing Relevance
- Invariant is relevant if it assists programmer
- Repress invariants logically implied by others
- Viewing and Managing Invariants
- Overwhelming for a programmer to sort through
- Various tools for selective reporting of invariants
- Improving Performance
- Balance between invariant quality and runtime
- Number of Derived Variables used
- Richer Invariants
- Invariants over Pointer based data structures
- Computing Conditional Invariants
17Resources
- Daikon website
- http://pag.csail.mit.edu/daikon/
- Contains links to
- Papers
- Source Code
- User Manual
- Developers Manual
18Questions???