Title: Program Analysis Using Randomization
1 Program Analysis (Using Randomization)
- Sumit Gulwani, George Necula
- (U.C. Berkeley)
2 What kind of Analysis?
- Any analysis that can be modelled as checking equivalence of two expressions in a program
- Of course, Equalities
- x = y
- (2x + 5 + 4y - 5) = 2(x + 2y)
- Certain kinds of Disequalities
- x = y + c (c ≠ 0) ⇒ x ≠ y
- x = c_1 E + c_2 (0 < c_2 < c_1) ⇒ x ≠ 0
- Certain kinds of Inequalities
- x = E^2 ⇒ x ≥ 0
3 Applications
- What use is inferring equalities, disequalities and inequalities?
- Translation Validation
- Compiler Optimizations
- Eliminating redundant computations, branches, memory reads.
- Partial Evaluation
- Program Verification
- Discovering useful loop invariants
- Interactive Debugging and Testing of Programs
4 Randomized Strategy
- Define a mapping F: Expressions → Polynomials such that
- P1 = P2 ⇒ E1 = E2 (Soundness)
- E1 = E2 modulo a set of certain transformation rules T ⇒ P1 = P2 (Completeness w.r.t. T)
- P1 = P2 can be determined by random testing with small error probability (Probabilistic Soundness)
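The probabilistic soundness claim can be illustrated with a short Python sketch of random testing for polynomial equality. It is not the authors' implementation: the name `probably_equal`, the prime modulus, and the number of trials are assumptions chosen for illustration. Two polynomials that agree at several random points of a large field are very likely identical, while a single disagreement proves they differ.

```python
import random

PRIME = (1 << 61) - 1  # a Mersenne prime; the field size is an assumption


def probably_equal(p1, p2, num_vars, trials=5, rng=None):
    """Return True if p1 and p2 agree on `trials` random points mod PRIME."""
    rng = rng or random.Random()
    for _ in range(trials):
        point = [rng.randrange(PRIME) for _ in range(num_vars)]
        if (p1(*point) - p2(*point)) % PRIME != 0:
            return False  # a witness: the polynomials definitely differ
    return True  # equal, up to a small error probability


# (x + y)^2 and x^2 + 2xy + y^2 are the same polynomial.
same = probably_equal(lambda x, y: (x + y) ** 2,
                      lambda x, y: x * x + 2 * x * y + y * y,
                      2, rng=random.Random(0))

# (x + y)^2 and x^2 + y^2 differ by 2xy.
different = probably_equal(lambda x, y: (x + y) ** 2,
                           lambda x, y: x * x + y * y,
                           2, rng=random.Random(0))
```

A `False` answer is always correct; a `True` answer can be wrong only if every random point happens to be a root of the difference polynomial.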
5 Checking Polynomial Equivalence
- P1 = P2
- V_i(P1) = V_i(P2)
- P1 = P2 + c
- V_i(P1) - V_i(P2) = c
- P1 = c1 P2 + c2 (0 < c2 < c1)
- c1 = GCD_i (V_i(P1) - V_{i+1}(P1))
- c2 = V_i(P1) mod c1
- P1 = P2^2
- V_i(P1) is a perfect square
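The checks listed above all work on sampled values V_i rather than on the polynomials themselves. The sketch below shows one way they might be coded in Python. The function names and the fixed sample points are assumptions for illustration; a real analysis would draw the points at random and use enough of them to keep the error probability small.

```python
from functools import reduce
from math import gcd, isqrt


def constant_offset(vals1, vals2):
    """Return c if V_i(P1) - V_i(P2) is the same c for every sample, else None."""
    diffs = {a - b for a, b in zip(vals1, vals2)}
    return diffs.pop() if len(diffs) == 1 else None


def affine_form(vals1):
    """Guess (c1, c2) with P1 = c1*P2 + c2 from samples of P1 alone."""
    c1 = reduce(gcd, (a - b for a, b in zip(vals1, vals1[1:])), 0)
    if c1 == 0:
        return None  # all samples are equal: P1 looks constant
    return c1, vals1[0] % c1


def all_perfect_squares(vals):
    """True if every sampled value is a perfect square."""
    return all(v >= 0 and isqrt(v) ** 2 == v for v in vals)


xs = [1, 2, 4, 7]  # fixed sample points, for reproducibility only
offset = constant_offset([x * x + 3 for x in xs], [x * x for x in xs])
form = affine_form([6 * x + 4 for x in xs])
squares = all_perfect_squares([(x + 1) ** 2 for x in xs])
```

Note that the GCD of sampled differences is only guaranteed to be a multiple of the true c1, so `affine_form` is an estimate that improves with more samples.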
6 Mapping Expressions to Polynomials
e ::= n | x_i | x_t | e + e | e × e | M(e)
Def(x_t) = e, e → P ⇒ x_t → P
n → n
e1 → P1, e2 → P2 ⇒ e1 + e2 → P1 + P2
e1 → P1, e2 → P2 ⇒ e1 × e2 → P1 × P2
7 Memory read
- M[x] = v1
- M[y] = v2
- M[z] = v3
- T = M[y]
- T = sel(upd(upd(upd(M0,x,v1),y,v2),z,v3), y)
- T = if (y = z) then v3
- else if (y = y) then v2
- else if (y = x) then v1
- else M0[y]
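The sel/upd reading of memory can be sketched in a few lines of Python. The representation here (a list of writes, newest last, plus a function for the initial memory M0) is an assumption made for illustration, not something the slides specify. A read scans the writes from newest to oldest, which is exactly the nested if-then-else above.

```python
def upd(mem, addr, val):
    """Return a new memory with one more write; `mem` itself is not mutated."""
    return mem + [(addr, val)]


def sel(mem, addr, initial):
    """Read `addr`: the newest write to an equal address wins, else initial memory."""
    for a, v in reversed(mem):
        if a == addr:
            return v
    return initial(addr)


def read_y(x, y, z):
    """T = sel(upd(upd(upd(M0,x,v1),y,v2),z,v3), y) for concrete addresses."""
    mem = upd(upd(upd([], x, "v1"), y, "v2"), z, "v3")
    return sel(mem, y, lambda a: ("M0", a))


no_alias = read_y(1, 2, 3)   # y differs from z, so the write of v2 is read
z_alias = read_y(1, 2, 2)    # y = z, so the later write of v3 is read
```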
8 Redundant Writes - 1
M[y] = v2; M[z] = v3; T = M[y]
a1 = a2 ⇒ M[a1,v1](a2) → v1
9 Redundant Writes - 2
M[y] = v1; M[z] = v3; T = M[y]
a1 ≠ a2 ⇒ M[a1,v1](a2) → M(a2)
10 Redundant Writes - 3
M[y] = v1; M[z] = v3; T = M[y]
a1 = a2 ⇒ M[a1,v1][a2,v2] → M[a2,v2]
11 Interchanging Writes
- M[y] = v1
- M[2z] = v2
- M[2z+1] = v3
- M[4z+3] = v4
- T = M[y]
M[y] = v1; M[2z+1] = v3; M[2z] = v2; M[4z+3] = v4; T = M[y]
M[y] = v1; M[2z+1] = v3; M[4z+3] = v4; M[2z] = v2; T = M[y]
a1 ≠ a2 ⇒ M[a1,v1][a2,v2] → M[a2,v2][a1,v1]
12 Mapping Memory Reads to Polynomials
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]
M[y] = v2; M[4z+3] = v6; M[2z+1] = v7; M[2z] = v4; T = M[y]
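13 Mapping Memory Reads to Polynomials
T = A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z+1} ¬A_{y,4z+3} A_{y,2z+1} v5
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} A_{y,2z} v4
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} A_{y,y+1} v3
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} ¬A_{y,y+1} A_{y,y} v2
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} ¬A_{y,y+1} ¬A_{y,y} A_{y,x} v1
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} ¬A_{y,y+1} ¬A_{y,y} ¬A_{y,x} M0[y]
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]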
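The expansion of a read into guarded terms follows a mechanical pattern: walking the writes from newest to oldest, each write contributes a term guarded by its own conditional and by the negation of every later conditional. The Python sketch below generates that expansion as text. The helper names `expand_read` and `show`, the string-valued addresses, and the `¬A_{a,b}` spelling of a negated conditional are assumptions for illustration.

```python
def expand_read(writes, read_addr):
    """Terms of T = M[read_addr], newest write first.

    Each term is (negated, positive, value): the read address differs from
    every address in `negated` and equals `positive` (None for initial memory).
    """
    terms, later = [], []
    for addr, val in reversed(writes):
        terms.append((list(later), addr, val))
        later.append(addr)
    terms.append((list(later), None, "M0[%s]" % read_addr))
    return terms


def show(terms, read_addr):
    """Render the terms as a sum of guarded values, one term per line."""
    def guard(addr, negated):
        return "%sA_{%s,%s}" % ("¬" if negated else "", read_addr, addr)

    lines = []
    for negated, positive, val in terms:
        factors = [guard(a, True) for a in negated]
        if positive is not None:
            factors.append(guard(positive, False))
        lines.append(" ".join(factors + [val]))
    return "T = " + "\n  + ".join(lines)


writes = [("x", "v1"), ("y", "v2"), ("y+1", "v3"), ("2z", "v4"),
          ("2z+1", "v5"), ("4z+3", "v6"), ("2z+1", "v7")]
terms = expand_read(writes, "y")
text = show(terms, "y")
```

For n writes the expansion has n + 1 terms and a quadratic number of conditionals in total, which is why the later reduction to a normal form matters.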
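15 Mapping Memory Reads to Polynomials
T = A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z+1} ¬A_{y,4z+3} A_{y,2z+1} v5
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} A_{y,2z} v4
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} A_{y,y+1} v3
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} ¬A_{y,y+1} A_{y,y} v2
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]
16 Mapping Memory Reads to Polynomials
T = A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z+1} ¬A_{y,4z+3} A_{y,2z+1} v5
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} A_{y,2z} v4
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} ¬A_{y,y+1} A_{y,y} v2
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]
17 Mapping Memory Reads to Polynomials
T = A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} A_{y,2z} v4
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} ¬A_{y,y+1} A_{y,y} v2
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]
18 Mapping Memory Reads to Polynomials
T = A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} A_{y,2z} v4
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z+1} ¬A_{y,2z} v2
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]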
20 Mapping Memory Reads to Polynomials
T = A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z+1} ¬A_{y,4z+3} A_{y,2z} v4
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z} v2
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]
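22 Mapping Memory Reads to Polynomials
T = A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + A_{y,2z} v4
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z} v2
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]
23 Mapping Memory Reads to Polynomials
M[y] = v2; M[4z+3] = v6; M[2z+1] = v7; M[2z] = v4; T = M[y]
T = A_{y,2z} v4
  + ¬A_{y,2z} A_{y,2z+1} v7
  + ¬A_{y,2z} ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z} ¬A_{y,2z+1} ¬A_{y,4z+3} A_{y,y} v2
  + ¬A_{y,2z} ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,y} M0[y]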
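25 Mapping Memory Reads to Polynomials
M[y] = v2; M[4z+3] = v6; M[2z+1] = v7; M[2z] = v4; T = M[y]
T = A_{y,2z} v4
  + ¬A_{y,2z} A_{y,2z+1} v7
  + ¬A_{y,2z} ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z} ¬A_{y,2z+1} ¬A_{y,4z+3} A_{y,y} v2
26 Mapping Memory Reads to Polynomials
M[y] = v2; M[4z+3] = v6; M[2z+1] = v7; M[2z] = v4; T = M[y]
T = A_{y,2z} v4
  + ¬A_{y,2z} A_{y,2z+1} v7
  + ¬A_{y,2z} ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z} ¬A_{y,2z+1} ¬A_{y,4z+3} v2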
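28 Mapping Memory Reads to Polynomials
M[y] = v2; M[4z+3] = v6; M[2z+1] = v7; M[2z] = v4; T = M[y]
T = A_{y,2z} v4
  + ¬A_{y,2z} A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z} ¬A_{y,2z+1} ¬A_{y,4z+3} v2
29 Mapping Memory Reads to Polynomials
M[y] = v2; M[4z+3] = v6; M[2z+1] = v7; M[2z] = v4; T = M[y]
T = A_{y,2z} v4
  + A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z} ¬A_{y,2z+1} ¬A_{y,4z+3} v2
30 Mapping Memory Reads to Polynomials
- M[x] = v1
- M[y] = v2
- M[y+1] = v3
- M[2z] = v4
- M[2z+1] = v5
- M[4z+3] = v6
- M[2z+1] = v7
- T = M[y]
M[y] = v2; M[4z+3] = v6; M[2z+1] = v7; M[2z] = v4; T = M[y]
T = A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + A_{y,2z} v4
  + ¬A_{y,2z+1} ¬A_{y,4z+3} ¬A_{y,2z} v2
T = A_{y,2z} v4
  + A_{y,2z+1} v7
  + ¬A_{y,2z+1} A_{y,4z+3} v6
  + ¬A_{y,2z} ¬A_{y,2z+1} ¬A_{y,4z+3} v2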
31 Reducing to Normal Form
- Eliminate terms with false conditionals
- Ignores earlier writes to the same/aliased locations
- Eliminate terms with contradictory conditionals
- Ignores writes to unaliased locations
- Remove true/dependent conditionals from terms
- Ignores the sequence of writes
32 Detecting true / false / dependent conditionals
- (x = e) is always true iff x - e = 0
- (x = e) is always false iff x - e ≠ 0
- We have a system of one equality of the form (x = e) and several inequalities of the form (x != ei).
- Let E = e - ei
- If E = 0 then (x = e) ⇒ NOT (x != ei)
- If E ≠ 0 then (x = e) ⇒ (x != ei)
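One possible way to classify E = e - ei from sampled values is sketched below in Python. It assumes that "E ≠ 0" is established through the disequality forms listed earlier in the talk (a nonzero constant, or c1·E' + c2 with 0 < c2 < c1); that reading, the function name `classify`, and the fixed sample points are assumptions, not something the slides state. With few samples the GCD test can be wrong, so a real analysis would use more random points.

```python
from functools import reduce
from math import gcd


def classify(diff, points):
    """Classify E = e - e_i from its values at the sample points."""
    vals = [diff(*p) for p in points]
    if all(v == 0 for v in vals):
        return "zero"       # (x = e) contradicts (x != e_i): drop the term
    g = reduce(gcd, (a - b for a, b in zip(vals, vals[1:])), 0)
    if g == 0 or vals[0] % g != 0:
        return "nonzero"    # (x = e) implies (x != e_i): drop the factor
    return "unknown"        # neither could be shown: keep the conditional


# Sample points (y, z); fixed here only so the example is reproducible.
points = [(3, 0), (5, 1), (8, 3), (13, 7)]

even_vs_odd = classify(lambda y, z: 2 * z - (4 * z + 3), points)
may_alias = classify(lambda y, z: (2 * z + 1) - (4 * z + 3), points)
off_by_one = classify(lambda y, z: (y + 1) - y, points)
same_addr = classify(lambda y, z: y - y, points)
```

Here 2z and 4z+3 can never be equal (their difference is always odd), while 2z+1 and 4z+3 coincide at z = -1, so that conditional has to be kept.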
33 Time Complexity
- Worst Case - Quadratic
- Might be Linear in Practice
- Use a Horner-like rule to evaluate the polynomial
- While identifying false conditionals which need to be removed (because of their dependency on the true conditional), only a few false conditionals need to be taken into account
- Big numbers can be avoided by replacing the product of false conditionals by the number of false conditionals.
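The Horner-like rule can be illustrated with a small Python comparison. The sketch assumes that a negated conditional is evaluated as (1 - A), which the slides do not state explicitly, and the function names are hypothetical. The expanded form re-multiplies the negated guards for every term, while the nested form T = A1·v1 + (1 - A1)·(A2·v2 + (1 - A2)·(...)) touches each write once.

```python
def eval_expanded(terms, m0):
    """Quadratic: re-multiplies the negated guards for every term."""
    total = 0
    for k, (a_k, v_k) in enumerate(terms):
        prod = a_k
        for a_j, _ in terms[:k]:
            prod *= 1 - a_j
        total += prod * v_k
    rest = 1
    for a_j, _ in terms:
        rest *= 1 - a_j
    return total + rest * m0


def eval_horner(terms, m0):
    """Linear: fold from the oldest write outwards, one step per write."""
    acc = m0
    for a_k, v_k in reversed(terms):
        acc = a_k * v_k + (1 - a_k) * acc
    return acc


# (value chosen for conditional A_k, value v_k), newest write first.
terms = [(3, 10), (5, 20), (7, 30)]
expanded = eval_expanded(terms, 4)
nested = eval_horner(terms, 4)
```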
34 Equivalent Conditionals
M[4y+2] = v1; M[4x+2] = v2; T = M[4y+2]
T = A_{y,x} v2 + ¬A_{y,x} v1
T = A_{4y+2,4x+2} v2 + ¬A_{4y+2,4x+2} v1
- Equivalent conditionals should be mapped to the same random variable
35 Detecting Equivalent Conditionals
- (e1 - e2) = e (e3 - e4) ⇒ ((e1 = e2) ⇔ (e3 = e4)) for some e ≠ 0
- P1 - P2 = P (P3 - P4) where P ≠ 0
- V_i(P1 - P2) mod V_i(P3 - P4) = 0
- V_i(P) = V_i(P1 - P2) / V_i(P3 - P4)
- Open Questions
- Can we detect that an integer divides any integer from a known set of n integers in less than O(n) time?
- Can we detect if a polynomial is divisible by some constant?
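A restricted version of this test, where the factor P is a nonzero integer constant, might look like the following Python sketch. The restriction to a constant factor, the name `constant_ratio`, and the fixed sample points are assumptions for illustration; the slides allow a more general nonzero P.

```python
def constant_ratio(d1, d2, points):
    """Return c if d1 == c * d2 at every sample point for one nonzero integer c."""
    ratios = set()
    for p in points:
        a, b = d1(*p), d2(*p)
        if b == 0:
            if a != 0:
                return None   # d2 vanishes where d1 does not
            continue          # both vanish: this point says nothing about c
        if a % b != 0:
            return None       # V_i(P3 - P4) does not divide V_i(P1 - P2)
        ratios.add(a // b)
    if len(ratios) == 1:
        c = ratios.pop()
        return c if c != 0 else None
    return None


# Sample points (x, y); fixed here only so the example is reproducible.
points = [(1, 2), (3, 7), (5, 4)]

# (4y+2 = 4x+2) and (y = x): the differences are 4(y - x) and (y - x).
scaled = constant_ratio(lambda x, y: (4 * y + 2) - (4 * x + 2),
                        lambda x, y: y - x, points)

# (y = x) and (y = 2x) are not equivalent conditionals.
unrelated = constant_ratio(lambda x, y: y - x,
                           lambda x, y: y - 2 * x, points)
```

When a constant ratio is found, both conditionals can be mapped to the same random variable.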
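36 Looking at values
M[y] = v1; M[z] = v2 + 1; M[x] = 2 v3; T = 3 M[y] + 5
- M[y] = 3 v1 + 5
- M[z] = 3 v2 + 8
- M[x] = 6 v3 + 5
- T = M[y]
T = 3 (A_{y,x} 2v3 + ¬A_{y,x} A_{y,z} (v2 + 1) + ¬A_{y,x} ¬A_{y,z} v1) + 5
T = A_{y,x} (6v3 + 5) + ¬A_{y,x} A_{y,z} (3v2 + 8) + ¬A_{y,x} ¬A_{y,z} (3v1 + 5)
⇔ 5 = 5 (A_{y,x} + ¬A_{y,x} A_{y,z} + ¬A_{y,x} ¬A_{y,z})
⇔ 5 = 5 (A_{y,x} + ¬A_{y,x} (A_{y,z} + ¬A_{y,z}))
⇔ 5 = 5 (A_{y,x} + ¬A_{y,x}) ⇔ true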
37 Looking at values
M[y] = v1; M[x] = y; T = M[y]
T = A_{y,x} y + ¬A_{y,x} v1
T = A_{y,x} (3y-2x) + ¬A_{y,x} v1
T = A_{y,x} ((3y-2x) mod (y-x)) + ¬A_{y,x} v1 = A_{y,x} (y mod (y-x)) + ¬A_{y,x} v1
T = A_{y,x} (y mod (y-x)) + ¬A_{y,x} v1
38 Polynomial Equivalence under a Predicate
- P1 = P2 (modulo E = 0)
- ⇔ P1 - P2 = P E
- ⇔ V_i(P1 - P2) mod V_i(E) = 0
- ⇔ V_i(P1) mod V_i(E) = V_i(P2) mod V_i(E)
- Open Question
- Is there an efficient algorithm for the generalized problem involving multiple predicates?
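The value-level test for equivalence under a single predicate can be sketched in Python as a divisibility check at sample points. The name `equal_modulo` and the fixed integer sample points are assumptions for illustration; a real analysis would use random points and more of them.

```python
def equal_modulo(p1, p2, e, points):
    """Check V_i(P1 - P2) mod V_i(E) == 0 at every sample point."""
    for pt in points:
        m = e(*pt)
        if m == 0:
            continue  # the predicate holds exactly here: nothing to divide by
        if (p1(*pt) - p2(*pt)) % m != 0:
            return False
    return True


# Sample points (x, y); fixed here only so the example is reproducible.
points = [(1, 4), (2, 9), (10, 3)]


def predicate(x, y):
    """E = y - x, so E = 0 is the conditional (y = x)."""
    return y - x


# Under y = x, the values y and 3y - 2x coincide.
agree = equal_modulo(lambda x, y: y, lambda x, y: 3 * y - 2 * x,
                     predicate, points)

# y and y + 1 never coincide, with or without the predicate.
disagree = equal_modulo(lambda x, y: y, lambda x, y: y + 1,
                        predicate, points)
```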
39 Dags (Loop Free Programs)
A_{y,z} v5 + ¬A_{y,z} ()
M[y] = v1
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
40 Dags (Loop Free Programs)
E2 (A_{y,z} v5 + ¬A_{y,z} ())
¬E2 (A_{y,z} v6 + ¬A_{y,z} ())
M[y] = v1
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
41 Dags (Loop Free Programs)
T = E2 A_{y,z} v5 + ¬E2 A_{y,z} v6 + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) ()
M[y] = v1
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
42 Dags (Loop Free Programs)
T = E2 A_{y,z} v5 + ¬E2 A_{y,z} v6 + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) T1
M[y] = v1
A_{y,x} v3 + ¬A_{y,x} A_{y,z} v2 + ¬A_{y,x} ¬A_{y,z} ()
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
43 Dags (Loop Free Programs)
T = E2 A_{y,z} v5 + ¬E2 A_{y,z} v6 + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) T1
M[y] = v1
E1 (A_{y,x} v3 + ¬A_{y,x} A_{y,z} v2 + ¬A_{y,x} ¬A_{y,z} ())
¬E1 (A_{y,y} v4 + ¬A_{y,y} ())
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
45 Dags (Loop Free Programs)
T = E2 A_{y,z} v5 + ¬E2 A_{y,z} v6 + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) T1
M[y] = v1
E1 (A_{y,x} v3 + ¬A_{y,x} A_{y,z} v2 + ¬A_{y,x} ¬A_{y,z} ())
¬E1 v4
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
46 Dags (Loop Free Programs)
T = E2 A_{y,z} v5 + ¬E2 A_{y,z} v6 + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) T1
M[y] = v1
T1 = E1 (A_{y,x} v3 + ¬A_{y,x} A_{y,z} v2 + ¬A_{y,x} ¬A_{y,z} ()) + ¬E1 v4
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
47 Dags (Loop Free Programs)
T = E2 A_{y,z} v5 + ¬E2 A_{y,z} v6 + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) T1
M[y] = v1
T1 = E1 (A_{y,x} v3 + ¬A_{y,x} A_{y,z} v2 + ¬A_{y,x} ¬A_{y,z} T2) + ¬E1 v4
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
T2 = A_{y,y} v1 + ¬A_{y,y} ()
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
49 Dags (Loop Free Programs)
T = E2 A_{y,z} v5 + ¬E2 A_{y,z} v6 + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) T1
M[y] = v1
T1 = E1 (A_{y,x} v3 + ¬A_{y,x} A_{y,z} v2 + ¬A_{y,x} ¬A_{y,z} T2) + ¬E1 v4
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
T2 = v1
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
50 Dags (Loop Free Programs)
T = E2 A_{y,z} v5
  + ¬E2 A_{y,z} v6
  + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) ¬E1 v4
  + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) E1 A_{y,x} v3
  + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) E1 ¬A_{y,x} A_{y,z} v2
  + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) E1 ¬A_{y,x} ¬A_{y,z} v1
M[y] = v1
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
M[z] = v6
M[z] = v5
T = M[y]
51 Dags (Loop Free Programs)
T = E2 A_{y,z} v5
  + ¬E2 A_{y,z} v6
  + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) ¬E1 v4
  + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) E1 A_{y,x} v3
  + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) E1 ¬A_{y,x} A_{y,z} v2
  + (E2 ¬A_{y,z} + ¬E2 ¬A_{y,z}) E1 ¬A_{y,x} ¬A_{y,z} v1
M[y] = v1
E1 ?
M[z] = v2; M[x] = v3
M[y] = v4
E2 ?
Time Complexity?
M[z] = v6
M[z] = v5
T = M[y]
52 Related Techniques
- Value Numbering
- Detects only equalities
- Relies on Structural Equivalence
- Completely Sound, Much less Complete
- Random Testing
- Cannot prove absence of bugs
- Exponential Complexity
53 Random Testing on Memory Programs
- Program P1 (a, b, c) Program P2(a, b, c)
- Mem(a) 3 Mem(a) 3
- Mem(b) 2 Mem(b) 2
- Mem(c) Mem(a) Mem(c) Mem(a)
- t Mem(b) t 22
- t 5t2 t s Mem(a)
- return t return t
- Only one case will show the error on random testing. Which one? (In general, O(n^n) possibilities!)
- a = b = c
- a = b ≠ c
- a = c ≠ b
- b = c ≠ a
- a ≠ b ≠ c
- Not easy to generate even a particular possibility
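The five cases listed above are exactly the ways of partitioning the three addresses into groups of equal values. A short Python sketch can enumerate such aliasing patterns for any number of addresses; the function name `aliasing_patterns` is hypothetical, and the count it produces (the number of set partitions) is bounded by the n^n figure on the slide.

```python
def aliasing_patterns(names):
    """Yield every partition of `names` into groups of mutually equal addresses."""
    if not names:
        yield []
        return
    first, rest = names[0], names[1:]
    for part in aliasing_patterns(rest):
        # Put `first` into each existing group in turn ...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ... or give it a group of its own.
        yield [[first]] + part


patterns = list(aliasing_patterns(["a", "b", "c"]))
count_for_four = sum(1 for _ in aliasing_patterns(["a", "b", "c", "d"]))
```

Random testing has to hit the one pattern that exposes the difference, and the number of patterns grows quickly with the number of addresses.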
54 Comparison with Symbolic Analysis
- e1 = e1
- Expected Winner: Symbolic Analysis
- Problem with Randomization
- may be doing unnecessary work
- e1 e2 e3 e4
- Expected Winner: Randomization
- Interactive queries can be answered in constant time
- Easily parallelizable
55 Future Work
- Extending the technique to handle loops
- Several interesting open problems
- Implement it for Translation Validation
56 Conclusion
- Randomization is a very simple and invaluable technique
- Randomization can work wonders where traditional verification techniques are prohibitively costly
"The intriguing possibility that axioms of randomness may constitute a useful fundamental source of truth independent of, but supplementary to, the standard axiomatic structure of mathematics suggests that probabilistic algorithms ought to be sought vigorously." - J.T. Schwartz