Title: A Polynomial-Time Algorithm for Global Value Numbering
1A Polynomial-Time Algorithm for Global Value Numbering
- SAS 2004
- Sumit Gulwani, George C. Necula
2Global Value Numbering
- Goal: Discover equivalent expressions in procedures
- Applications
  - Compiler optimizations
    - Copy propagation, constant propagation, common sub-expression elimination, induction variable elimination, etc.
  - Program verification
    - Discover loop invariants, verify program assertions
  - Discover equivalent computations across programs
    - Translation validation, plagiarism detection tools
3Global Value Numbering
- Challenge: Undecidable problem
- Simplification assumptions
  - Operators are uninterpreted (will not discover u = x)
  - Conditionals are non-deterministic (will not discover v = x)
- Will discover x = y
4Non-trivial Example
- One branch: x := b; y := b; z := F(b)
- Other branch: x := a; y := a; z := F(a)
- After the join: assert(x = y); assert(z = F(y))
5Related Work
- Algorithms that work on SSA form of the program
  - AWZ Algorithm (POPL 1988)
    - Polynomial, incomplete
  - RKS Algorithm (SAS 1999)
    - Polynomial, incomplete, improvement on AWZ
- Dataflow analysis or abstract interpretation based algorithms
  - Kildall's Algorithm (POPL 1973)
    - Exponential, complete
  - Our Algorithm (this paper)
    - Polynomial, complete
6Non-trivial Example
- One branch: x := b; y := b; z := F(b)
- Other branch: x := a; y := a; z := F(a)
- After the join: assert(x = y); assert(z = F(y))
- In SSA form: x = φ(a,b); y = φ(a,b); z = φ(F(a),F(b)); F(y) = F(φ(a,b))
- AWZ Algorithm: φ functions are uninterpreted
  - Fails to discover the second assertion
- RKS Algorithm: uses rewrite rules for normalization
  - Does not discover all assertions in slightly more involved examples
  - Rewrite rules are not applied exhaustively (exponentially many applications otherwise)
  - Rules are pessimistic in handling loops
7Outline
- Strong Equivalence DAG (SED)
- The Assignment Operation
- The Join Operation
- Pruning an SED
- Fixed Point Computation
8Representing Equivalences
- Program: a := 1; b := 2; x := F(1,2)
- Equivalences: {a, 1}, {b, 2}, {x, F(1,2)}
9Representing Equivalences
- Program: a := 1; b := 2; x := F(1,2)
- Equivalences: {a, 1}, {b, 2}, {x, F(1,2), F(a,2), F(1,b), F(a,b)}
- Such an explicit representation can be exponential in size.
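To make the blow-up concrete, here is a minimal Python sketch that enumerates the explicit class of x from the slide. The helper name `explicit_class` and the continuation `y := F(x,x); w := F(y,y)` are assumptions added for illustration; they are not part of the talk.

```python
from itertools import product

def explicit_class(var, arg_classes):
    """Explicit class of `var` after `var := F(e1, e2)`.

    arg_classes holds the explicit classes of e1 and e2; every combination
    of one term from each class yields a term equal to `var`.
    """
    return [var] + [f"F({e1},{e2})" for e1, e2 in product(*arg_classes)]

a_cls = ["a", "1"]                            # after a := 1
b_cls = ["b", "2"]                            # after b := 2
x_cls = explicit_class("x", [a_cls, b_cls])   # after x := F(1,2)
print(x_cls)   # ['x', 'F(a,b)', 'F(a,2)', 'F(1,b)', 'F(1,2)']

# Hypothetical continuation, not on the slide: y := F(x,x); w := F(y,y)
y_cls = explicit_class("y", [x_cls, x_cls])
w_cls = explicit_class("w", [y_cls, y_cls])
print(len(x_cls), len(y_cls), len(w_cls))   # 5 26 677
```

Three nested assignments already give a class of 677 terms, which is why a shared representation is needed.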
10Strong Equivalence DAG (SED)
- A data structure for representing equivalences.
- Nodes n = <Set of variables, Type>
  - Type: c, ⊥, F(n1,n2)
- For every variable x, there is exactly one node <V,t> s.t. x ∈ V
  - called Node(x)
- Terms(n): set of equivalent expressions
  - Terms(<V, ⊥>) = V
  - Terms(<V, c>) = V ∪ {c}
  - Terms(<V, F(n1,n2)>) = V ∪ { F(e1,e2) | e1 ∈ Terms(n1), e2 ∈ Terms(n2) }
11SED Example
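[Figure: an SED with four nodes]
- n1 = <{a}, 2>
- n2 = <{b}, ⊥>
- n3 = <{c, d}, F(n1,n2)>
- n4 = <{e}, F(n3,n2)>
- Terms(n1) = {a, 2}
- Terms(n2) = {b}
- Terms(n3) = {c, d, F(a,b), F(2,b)}
- Terms(n4) = {e, F(c,b), F(d,b), F(F(a,b),b), F(F(2,b),b)}
- Note that e = F(d,b), since F(d,b) ∈ Terms(Node(e))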
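The Terms definition can be checked against this example with a small Python sketch. The `Node` class, its field names, and the single binary symbol F are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from itertools import product
from typing import Optional, Tuple

@dataclass(frozen=True)
class Node:
    labels: frozenset                              # V: variables labelling the node
    const: Optional[str] = None                    # set when the type is a constant c
    args: Optional[Tuple["Node", "Node"]] = None   # set when the type is F(n1, n2)
    # const is None and args is None: the type is ⊥

def terms(n):
    """Terms(n): the set of expressions represented by node n."""
    out = set(n.labels)
    if n.const is not None:
        out.add(n.const)
    if n.args is not None:
        left, right = n.args
        out |= {f"F({e1},{e2})" for e1, e2 in product(terms(left), terms(right))}
    return out

# The four nodes of the SED example
n1 = Node(frozenset({"a"}), const="2")
n2 = Node(frozenset({"b"}))
n3 = Node(frozenset({"c", "d"}), args=(n1, n2))
n4 = Node(frozenset({"e"}), args=(n3, n2))

print("F(d,b)" in terms(n4))   # True, so e = F(d,b)
```

The DAG stores four nodes, while Terms(n4) is only expanded on demand.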
12Abstract Interpretation based algorithm
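[Figure: flowchart nodes and their transfer functions]
- Assignment node (x := e): input G, output Assignment(G, x := e)
- Conditional node: input G, outputs G1 = G and G2 = G
- Join node: inputs G1 and G2, output G = Join(G1, G2)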
13Example
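[Figure: flowchart with program points L1, L2, L3, L4]
- One branch: x := 1; y := 1; z := F(1,1)
- Other branch: x := 2; y := 2; z := F(2,2)
- After the join: u := F(x,y)
- Assert(u = z)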
14Outline
- Strong equivalence DAG (SED)
- The assignment operation
- The join operation
- Pruning an SED
- Fixed point computation
15The Assignment Operation
- Assignment(G, x := e)
  - It is the strongest postcondition of the equivalences represented by G w.r.t. the assignment x := e
- Delete label x from Node(x) in G
- Let n = <V,t> be the node in G s.t. e ∈ Terms(n)
  - (Add such a node to G if it does not already exist)
- Add x to node n.
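The steps above can be sketched in Python as follows. The `SED` class, its list-based node storage, and the expression encoding (strings for variables, ints for constants, `("F", e1, e2)` tuples for applications) are assumptions made for this sketch. It also resolves e before deleting the label x, so that assignments such as x := F(x,y) still refer to the old value of x.

```python
class SED:
    """Toy SED: node i has label set vars[i] and type types[i].

    A type is None (⊥), ("const", c), or ("F", i1, i2) with node indices.
    """
    def __init__(self):
        self.vars, self.types = [], []

    def _new_node(self, t):
        self.vars.append(set())
        self.types.append(t)
        return len(self.vars) - 1

    def node_of_var(self, x):
        for i, labels in enumerate(self.vars):
            if x in labels:
                return i
        n = self._new_node(None)        # unseen variable: fresh ⊥ node
        self.vars[n].add(x)
        return n

    def node_of_expr(self, e):
        """Find (or add) the node n with e in Terms(n)."""
        if isinstance(e, str):          # a variable
            return self.node_of_var(e)
        if isinstance(e, int):          # a constant
            t = ("const", e)
        else:                           # an application ("F", e1, e2)
            t = ("F", self.node_of_expr(e[1]), self.node_of_expr(e[2]))
        if t in self.types:
            return self.types.index(t)
        return self._new_node(t)

    def assign(self, x, e):
        n = self.node_of_expr(e)        # resolve e before x loses its old label
        for labels in self.vars:
            labels.discard(x)           # delete label x from Node(x)
        self.vars[n].add(x)             # add x to node n

g = SED()
g.assign("x", 1)
g.assign("y", 1)
g.assign("z", ("F", 1, 1))
g.assign("u", ("F", "x", "y"))
print(g.node_of_var("u") == g.node_of_var("z"))   # True: u and z share a node
```

Because x and y label the node of constant 1, F(x,y) resolves to the node that z already labels, so the assignment only adds the label u.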
16Example: The Assignment Operation
[Figure: SED G before and after the assignment; a node of type F gains the label u]
G' = Assignment(G, u := F(z,x))
17Outline
- Strong Equivalence DAG (SED)
- The Assignment Operation
- The Join Operation
- Pruning an SED
18The Join Operation
- Join(G1, G2)
  - Product construction of G1 and G2
  - If n = <V1,t1> ∈ G1 and m = <V2,t2> ∈ G2, then
    - (n,m) = <V1 ∩ V2, t1 ⊔ t2> ∈ Join(G1,G2)
- Definition of t1 ⊔ t2
  - c ⊔ c = c
  - F(n1,n2) ⊔ F(m1,m2) = F((n1,m1),(n2,m2))
  - t1 ⊔ t2 = ⊥, otherwise
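A direct Python sketch of this product construction is given below. The dictionary encoding of an SED (node id mapped to a pair of label set and type) and the node ids "c" and "f" are assumptions for illustration. The sketch builds every pair (n,m) and does no pruning.

```python
def join_type(t1, t2):
    """t1 ⊔ t2 for types None (⊥), ("const", c), or ("F", n1, n2)."""
    if t1 is None or t2 is None:
        return None
    if t1[0] == "const" and t1 == t2:
        return t1                                      # c ⊔ c = c
    if t1[0] == "F" and t2[0] == "F":
        return ("F", (t1[1], t2[1]), (t1[2], t2[2]))   # children are pairs
    return None                                        # ⊥ otherwise

def join(g1, g2):
    """Product construction: node (n, m) is <V1 ∩ V2, t1 ⊔ t2>."""
    return {(n, m): (v1 & v2, join_type(t1, t2))
            for n, (v1, t1) in g1.items()
            for m, (v2, t2) in g2.items()}

# SEDs after x := 1; y := 1; z := F(1,1) and after x := 2; y := 2; z := F(2,2)
g1 = {"c": ({"x", "y"}, ("const", 1)), "f": ({"z"}, ("F", "c", "c"))}
g2 = {"c": ({"x", "y"}, ("const", 2)), "f": ({"z"}, ("F", "c", "c"))}
g = join(g1, g2)

print(len(g))          # 4 nodes: Size(G1) * Size(G2)
print(g[("f", "f")])   # z's node, of type F with both children ("c", "c")
```

After the join, x and y share a ⊥ node and z labels a node of type F over that node, so z = F(x,y) is still represented even though the constants differ. The two pairs ("c","f") and ("f","c") carry no labels.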
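19Example: The Join Operation
[Figure: SEDs G1 and G2 and their join G = Join(G1,G2); node labels visible in the figure include <{y1}, F>, <{y2}, F>, <{y3}, ⊥>, <{y4, y5}, ⊥>, <{y6}, ⊥>, <{y7}, ⊥>]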
20Outline
- Strong equivalence DAG (SED)
- The assignment operation
- The join operation
- Pruning an SED
- Fixed point computation
21Motivation: The Prune Operation
- If G = Join(G1,G2), then Size(G) can be Size(G1) × Size(G2)
- There are programs with n joins such that the size of the SEDs after the joins is exponential in n.
- Discovering equivalences among all expressions vs. discovering equivalences among program expressions
- For the latter, it is sufficient to discover equivalences among all terms of size at most t at each program point (where t = # of variables × size of program). Thus, SEDs can be pruned to have a small size (k × t)
22The Prune Operation
- Prune(G,k)
  - For each node <V,t> such that V = ∅, check if it is a root of some DAG with all leaves labelled with at least one variable of size k.
  - If not, then delete all the nodes that are reachable only from <V,t>
23Example: The Prune Operation
[Figure: SED G]
24Outline
- Strong equivalence DAG (SED)
- The assignment operation
- The join operation
- Pruning an SED
- Fixed point computation
25Fixed Point Computation and Complexity
- The lattice of sets of such equivalences has height at most k
- Complexity
  - Dominated by the cost of join operations
  - Each join operation: O(k² N)
    - This requires doing pruning while computing the join
  - # of join operations: O(j × k)
  - Total cost: O(k³ N j)
- k = # of variables
- N = size of program
- j = # of join points in program
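The total follows by multiplying the two bounds above. One reading of the O(j × k) count, stated here as an assumption, is that each of the j join points is recomputed at most k times because the lattice has height at most k. The numbers in the last line are hypothetical and only show the scale.

```latex
\begin{align*}
\text{cost per join} &= O(k^2 N) \\
\#\,\text{join operations} &= O(j \cdot k) \\
\text{total cost} &= O(k^2 N) \cdot O(j \cdot k) = O(k^3 N j) \\
\text{e.g. } k = 10,\ N = 100,\ j = 5 &:\quad k^3 N j = 1000 \cdot 100 \cdot 5 = 5 \times 10^5
\end{align*}
```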
26Conclusion