Title: Structural Invariants
1Structural Invariants
- Ranjit Jhala Rupak Majumdar Ru-Gang Xu
2Generating Invariants
- Verification Conditions (VC)
  - Generic: Applicable to any user-specified assertion
  - Precise: Captures all path correlations
  - Manual intervention: Requires annotations
- Dataflow Analysis and Abstract Interpretation
  - Automatic: Little user intervention
  - Specialized: Uses a fixed abstraction
  - Imprecise: Merges paths
3Structure Invariants
- Lightweight VC Generation Technique
  - Precise: Does not capture all path correlations, but captures structural idioms
  - Automatic: Simple approximations of loop invariants
  - Scalable: Leverages well-optimized compiler techniques
  - Generic: Proves a wide range of safety properties
4Plan
- 1. Preliminaries
- 2. Dominator Invariants
- 3. φ-Strengthening
- 4. Extensions
- 5. Experiments
5Example
- Conditional locking on a predicate p

    0: lock = 0;
    1: if (p) {
    2:   assert(lock == 0);
    3:   lock = 1;
    4: }
    5: ...

- Stmt 0 dominates Stmt 1: all paths going to stmt 1 must go through stmt 0
- Between stmt 0 and stmt 1, lock does not get modified: the value of lock after stmt 0 is the same as the value of lock at stmt 2
6Two Compiler Algorithms
- Dominator Tree: captures control flow information
  - For two stmts n', n we say n' dominates n if every path to n goes through n'
  - n' is the immediate dominator of n iff n' dominates n and every other dominator of n (apart from n itself) is also a dominator of n'
  - A dominator tree is a tree whose nodes are statements, where each parent immediately dominates its children
- Static Single Assignment: captures dataflow information
  - Each variable is syntactically assigned once
  - φ-assignments deal with joins
  - x = φ(x1, x2, ..., xn)
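The dominator definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by the authors: the CFG encoding, the node names (`n1t` and `n1f` for the two outcomes of `if (p)`), and the simple iterative set-based algorithm are all assumptions made for the example.

```python
def dominators(cfg, entry):
    """Iteratively compute, for each node, the set of nodes that dominate it.

    cfg maps each node to the list of its successors. Every node is assumed
    to be reachable from entry. Each node dominates itself.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # n is dominated by itself and by whatever dominates all its predecessors.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominators(dom, entry):
    """Pick, for each node, the closest strict dominator.

    The strict dominators of a node form a chain, so the closest one is the
    strict dominator that itself has the most dominators.
    """
    idom = {}
    for n, ds in dom.items():
        if n != entry:
            idom[n] = max(ds - {n}, key=lambda d: len(dom[d]))
    return idom


# Hypothetical encoding of the conditional-locking example.
CFG = {
    "n0": ["n1t", "n1f"],
    "n1t": ["n2"],
    "n2": ["n3"],
    "n3": ["n5"],
    "n1f": ["n5"],
    "n5": [],
}

DOM = dominators(CFG, "n0")
IDOM = immediate_dominators(DOM, "n0")
```

In this encoding the join node `n5` is reached along both branches, so its only strict dominator is `n0`, which becomes its parent in the dominator tree.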
7Example
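Original program:

    0: lock = 0;
    1: if (p) {
    2:   assert(lock == 0);
    3:   lock = 1;
    4: }
    5: ...

SSA Form:

    0: lock0 = 0;
    1: if (p) {
    2:   assert(lock0 == 0);
    3:   lock1 = 1;
    4: }
    5: lock2 = φ(lock0, lock1);
    6: ...

Dominator Tree (figure): n0 is the root, with children n1,true, n1,false, and n5; n2 lies below n1,true and n3 below n2.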
8Dominator Invariants
Dominator tree annotated with node predicates (figure):
- n0: lock0 = 0
- n1,true: p
- n1,false: ¬p
- n2: assert(lock0 = 0)
- n3: lock1 = 1
- n5: lock2 = φ(lock0, lock1)

At n2: n0 ∧ n1,true = (lock0 = 0) ∧ p ⇒ (lock0 = 0)
9Dominator Invariants
- Theorem
  - For a node n, DInv(n) = n ∧ (⋀_{n' ∈ D(n)} n') is an n-invariant, where each node stands for its predicate and D(n) is the set of dominators of n
  - After executing a node n', n' holds
  - If n' dominates n, then along every path to n there is a point where n' holds
  - After the last occurrence of n', the only nodes visited are those that are dominated by n'
  - None of the variables in n' are modified
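As a rough illustration of the theorem, the sketch below conjoins the predicates of the dominators of n2 in the conditional-locking example and checks that the result implies the assertion. The predicate encoding, the node names, and the brute-force check over 0/1 values (standing in for a theorem prover) are assumptions made for this example only.

```python
from itertools import product

# Node predicates for the SSA form of the example, as functions of a state dict.
PRED = {
    "n0": lambda s: s["lock0"] == 0,   # lock0 = 0
    "n1t": lambda s: bool(s["p"]),     # branch taken: p
    "n1f": lambda s: not s["p"],       # branch not taken: not p
}

# Dominators of n2 (other than n2 itself), read off the dominator tree.
DOMINATORS = {"n2": ["n0", "n1t"]}


def dinv(node):
    """Conjunction of the predicates of all dominators of `node`."""
    return lambda s: all(PRED[d](s) for d in DOMINATORS[node])


def valid(hypothesis, goal, variables):
    """Check hypothesis => goal for every assignment of 0/1 to the variables."""
    for values in product([0, 1], repeat=len(variables)):
        s = dict(zip(variables, values))
        if hypothesis(s) and not goal(s):
            return False
    return True


ASSERT_N2 = lambda s: s["lock0"] == 0          # assert(lock0 == 0) at n2
RESULT = valid(dinv("n2"), ASSERT_N2, ["lock0", "p"])
```

Here `RESULT` is `True` because the invariant already contains `lock0 = 0`; without any hypothesis the same check fails.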
10Example Conditional Locking
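    00: lock0 = 0;
    01: if (p) {
    02:   assert(lock0 == 0);
    03:   lock1 = 1;
    04: }
    05: lock2 = φ(lock0, lock1);
    06: ...
    07: if (p) {
    08:   assert(lock2 == 1);
    09:   lock3 = 0;
    10: }
    11: lock4 = φ(lock2, lock3);

Dominator tree (figure): n0 has children n1,true, n1,false, and n5; n2 and n3 lie below n1,true; n5 has children n7,true, n7,false, and n11; n8 and n9 lie below n7,true.

Dominator Invariants are insufficient:

    (lock0 = 0) ∧ (lock2 = φ(lock0, lock1)) ∧ p ⇏ (lock2 = 1)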
11φ-Strengthening

    0: lock0 = 0;
    1: if (p) {
    2:   assert(lock0 == 0);
    3:   lock1 = 1;
    4: }
    5: lock2 = φ(lock0, lock1);
    6: ...

Recursively compute the dominator invariant of each predecessor of a φ-node:

    ((lock2 = lock1) ∧ DInv(n4)) ∨ ((lock2 = lock0) ∧ DInv(n1,false))
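12φ-Strengthening
Node predicates:
- 0: lock0 = 0
- 1,true: p
- 2: assert(lock0 = 0)
- 3: lock1 = 1
- 1,false: ¬p
- 5: lock2 = φ(lock0, lock1)

DInv(n4) = (lock1 = 1) ∧ p ∧ (lock0 = 0)
DInv(n1,false) = ¬p ∧ (lock0 = 0)

n0 is the immediate dominator of n5, so it dominates both n4 and n1,false. Its predicate can therefore be factored out using a relative dominator invariant:

    DInv(n', n) = n ∧ (⋀_{n'' ∈ D(n) \ D(n')} n'')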
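To make the effect of φ-strengthening concrete, the following sketch checks the second assertion of the two-conditional example, `assert(lock2 == 1)`, once with the plain dominator invariant and once with the φ-node replaced by a disjunction over its predecessors. The function names and the brute-force validity check over 0/1 values are illustrative assumptions, not part of the original tool.

```python
from itertools import product

VARS = ["p", "lock0", "lock1", "lock2"]


def valid(hypothesis, goal):
    """True if hypothesis => goal for every 0/1 assignment to VARS."""
    for values in product([0, 1], repeat=len(VARS)):
        s = dict(zip(VARS, values))
        if hypothesis(s) and not goal(s):
            return False
    return True


def goal(s):
    return s["lock2"] == 1                      # assert(lock2 == 1)


def plain_dinv(s):
    # Predicates of the dominators of the assertion: lock0 = 0 and p.
    # The phi-node is left uninterpreted, so lock2 is unconstrained.
    return s["lock0"] == 0 and bool(s["p"])


def dinv_n4(s):
    return s["lock1"] == 1 and bool(s["p"]) and s["lock0"] == 0


def dinv_n1_false(s):
    return (not s["p"]) and s["lock0"] == 0


def strengthened(s):
    # phi-strengthening: one disjunct per predecessor of the phi-node.
    phi = (s["lock2"] == s["lock1"] and dinv_n4(s)) or \
          (s["lock2"] == s["lock0"] and dinv_n1_false(s))
    return plain_dinv(s) and phi


WITHOUT_PHI = valid(plain_dinv, goal)
WITH_PHI = valid(strengthened, goal)
```

With the plain invariant the check fails (for instance `p = 1, lock0 = 0, lock2 = 0` is a counterexample), while the strengthened invariant rules out the `¬p` disjunct and forces `lock2 = lock1 = 1`.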
13φ-Strengthening
Figure (CFG and dominator tree): entry, Idom(n), the two predecessors n1 and n2 of n, and the φ-node n: x = φ(x1, x2).

Strengthened invariant at n:

    (((x = x1) ∧ DInv(Idom(n), n1)) ∨ ((x = x2) ∧ DInv(Idom(n), n2))) ∧ DInv(entry, Idom(n))
14k-Structural Invariants (k-SI)
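For n' dominating n:

    Ψ(n', n, k) = n ∧ ⋀_{n'' ∈ D(n) \ D(n')} (n'' ∧ Γ(n'', k))

Dealing with φ-nodes:

    Γ(n, k) = ⋁_{nj ∈ pred(n)} (Φ(n, j) ∧ Ψ(Idom(n), nj, k-1))

where Φ(n, j) equates the variable assigned by the φ-node n with its j-th argument.

Figure (dominator tree): Idom(n) above n, with nested regions labeled k, k-1, and k-2.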
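The mutual recursion between Ψ and Γ can be sketched as follows for the two-conditional locking example. This is only an illustration under stated assumptions: the node names, the dictionaries `IDOM`, `PRED`, and `PHI`, the choice to return "true" for Γ once the budget k reaches 0, and the brute-force validity check are all hypothetical and not taken from the original tool.

```python
from itertools import product

VARS = ["p", "lock0", "lock1", "lock2"]
TRUE = lambda s: True

# Immediate dominators (n1t/n1f and n7t are branch outcomes; names are illustrative).
IDOM = {"n1t": "n0", "n1f": "n0", "n2": "n1t", "n3": "n2",
        "n5": "n0", "n7t": "n5", "n8": "n7t"}

# Node predicates; phi-nodes and asserts contribute "true" here.
PRED = {
    "n0": lambda s: s["lock0"] == 0,
    "n1t": lambda s: bool(s["p"]),
    "n1f": lambda s: not s["p"],
    "n2": TRUE,
    "n3": lambda s: s["lock1"] == 1,
    "n5": TRUE,
    "n7t": lambda s: bool(s["p"]),
    "n8": TRUE,
}

# For each phi-node: (predecessor, equation chosen when coming from it).
PHI = {"n5": [("n1f", lambda s: s["lock2"] == s["lock0"]),
              ("n3", lambda s: s["lock2"] == s["lock1"])]}


def psi(top, n, k):
    """Conjoin predicates (and gamma) along the dominator-tree path from n up to top."""
    conjuncts = []
    node = n
    while True:
        conjuncts.append(PRED[node])
        conjuncts.append(gamma(node, k))
        if node == top:
            break
        node = IDOM[node]
    return lambda s: all(c(s) for c in conjuncts)


def gamma(n, k):
    """Disjunction over the predecessors of a phi-node; "true" otherwise.

    Assumption: once the budget k is exhausted, the phi-node is left unconstrained.
    """
    if n not in PHI or k <= 0:
        return TRUE
    disjuncts = [(eq, psi(IDOM[n], pred, k - 1)) for pred, eq in PHI[n]]
    return lambda s: any(eq(s) and inv(s) for eq, inv in disjuncts)


def valid(hypothesis, goal):
    """True if hypothesis => goal for every 0/1 assignment to VARS."""
    for values in product([0, 1], repeat=len(VARS)):
        s = dict(zip(VARS, values))
        if hypothesis(s) and not goal(s):
            return False
    return True


GOAL = lambda s: s["lock2"] == 1               # assert(lock2 == 1) at n8
ZERO_SI = valid(psi("n0", "n8", 0), GOAL)      # plain dominator invariant
TWO_SI = valid(psi("n0", "n8", 2), GOAL)       # phi-node unfolded
```

With k = 0 the invariant degenerates to the plain dominator invariant and the check fails; with k = 2 the φ-node at n5 is unfolded into its two predecessors and the assertion is discharged.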
15k-Structural Invariants (k-SI)
- k-SI unfolds the nesting structure of the program
- k is the branch-width sensitivity of the analysis
- Setting k to the number of nodes gives us the exact SI

Figure (dominator tree): Idom(n) above n, with nested regions labeled k, k-1, and k-2.
16Example Conditional Locking
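    00: lock0 = 0;
    01: if (p) {
    02:   assert(lock0 == 0);
    03:   lock1 = 1;
    04: }
    05: lock2 = φ(lock0, lock1);
    06: ...
    07: if (p) {
    08:   assert(lock2 == 1);
    09:   lock3 = 0;
    10: }
    11: lock4 = φ(lock2, lock3);

Dominator tree (figure): n0 has children n1,true, n1,false, and n5; n2 and n3 lie below n1,true; n5 has children n7,true, n7,false, and n11; n8 and n9 lie below n7,true.

2-SI is sufficient:

    Ψ(n0, n8, 2) ⇒ (lock2 = 1)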
17Example Conditional Locking
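    Ψ(n0, n8, 2) = n8 ∧ n7,true ∧ n5 ∧ Γ(n5, 2) ∧ n0
    Γ(n5, 2) = (Φ(n5, 1) ∧ Ψ(n0, n1,false, 1)) ∨ (Φ(n5, 2) ∧ Ψ(n0, n3, 1))

Dominator tree (figure): n0 has children n1,true, n1,false, and n5; n2 and n3 lie below n1,true; n5 has children n7,true, n7,false, and n11; n8 and n9 lie below n7,true.

Expanding the node predicates:

    Ψ(n0, n8, 2) = p ∧ (((lock2 = lock0) ∧ ¬p) ∨ ((lock2 = lock1) ∧ (lock1 = 1) ∧ p)) ∧ (lock0 = 0)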
18Interprocedural k-SI Callees
For assertions within a function g that calls f:
- When f is called, we define l = f(e1, e2, ..., en) as (∃L. Ψ(nfe, nfx, k))[l/ret, e1/x1, e2/x2, ..., en/xn]
  - Recursively construct the k-SI for the exit node of f
  - Rename all local variables of f
  - Substitute formals with actuals
  - Substitute the return value
- If f is recursive, l = f(e1, e2, ..., en) is true
19Interprocedural k-SI Callers
For assertions within a function f that is called by g:
- When f has callers, we generalize dominators by adding edges from every call site x = f(...) to the entry node nfe
- If n' dominates n, then every path from the entry node of main to n passes through n'
- The k-SI algorithm for transitive callers is the same as the intraprocedural algorithm
20Abstract Summarization
An abstract summary S of a function f is a subset of P × P' such that whenever an execution of f from a start state satisfying p ends in a state satisfying p', we have (p, p') ∈ S.

Using summaries: replace calls l = f(e1, e2, ..., en) with

    (∃L. ⋁_{(p, p') ∈ S} (p ∧ p'))[l/ret, e1/x1, e2/x2, ..., en/xn]

Making summaries:

    S = {(p, p') ∈ P × P' | p ∧ Ψ(nfe, nfx, k) ∧ p' is satisfiable}
21Abstract Summarization
- If ⋁P and ⋁P' are not both equivalent to true, we add
  - the assertion (∃L. ⋁P)[l/ret, e1/x1, e2/x2, ..., en/xn] in front of each call l = f(e1, e2, ..., en), for checking preconditions
  - the assertion ⋁P' at the exit nodes of functions, to check the postcondition
22k-SI with Pointers
    *p = *q + 5

Points-to analysis: q -> {a, b}, p -> {c, d}

    if (q == &a) tmp = a;
    if (q == &b) tmp = b;
    if (p == &c) c = tmp + 5;
    if (p == &d) d = tmp + 5;

- Run may-points-to analysis
- Substitute dereferences with the possible memory locations being pointed to
- Run k-SI
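The substitution step can be sketched as a small source-to-source rewrite. The helper below is hypothetical (its name, parameters, and string-based output are assumptions for illustration); it expands a statement of the form `*dst = *src + c` into guarded, pointer-free assignments using the may-points-to sets.

```python
def expand_deref_assignment(dst_ptr, src_ptr, offset, points_to, tmp="tmp"):
    """Rewrite `*dst_ptr = *src_ptr + offset` into guarded assignments.

    points_to maps each pointer name to the list of variables it may point to.
    One guard is emitted per possible target, first for the read, then for the write.
    """
    lines = []
    for target in points_to[src_ptr]:
        lines.append(f"if ({src_ptr} == &{target}) {tmp} = {target};")
    for target in points_to[dst_ptr]:
        lines.append(f"if ({dst_ptr} == &{target}) {target} = {tmp} + {offset};")
    return lines


# May-points-to sets from the slide: q -> {a, b}, p -> {c, d}.
POINTS_TO = {"q": ["a", "b"], "p": ["c", "d"]}
EXPANDED = expand_deref_assignment("p", "q", 5, POINTS_TO)
```

The resulting statements contain no dereferences, so the intraprocedural k-SI construction can be applied to them unchanged.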
23Limitations
Dealing with loops: if n is a φl-node (a φ-node at a loop head), then Γ(n, k) = true

    x0 = 1;
    while (*) {
    L:  x1 = φl(x0, x3);
        if (x1 == 1) x2 = 1;
        x3 = φ(x1, x2);
    }
    x4 = φ(x0, x3);
    assert(x4 == 1);

k-SI loses the value of x at L, making x unconstrained at the assert.
Dataflow analysis that tracks (x = 1) will prove the assertion.
However, only 13 false positives out of 684 total asserts were due to this limitation.
24Implementation
- psi: an assertion checker for C programs using structural invariants
  - CIL library
  - Flow-insensitive may-alias analysis
  - Simplify theorem prover
25Experiments
Tagged unions: a predicate must hold when a field is accessed
Locks: lock / unlock in strict alternation
Setuid: permissions set before syscalls
26Experiments
Scalable: k-SI runs at least an order of magnitude faster than complex tools such as BLAST
Effective: for simple properties, k-SI has a similar number of false positives (FP) to BLAST
(Chart comparing BLAST and k-SI)
27Experiments
- Precision Tradeoffs
  - Path sensitivity is necessary
  - Past k = 2, FP does not decrease. Complex control flow is rare and usually irrelevant
28Summary
- k-SI is a scalable, lightweight algorithm for finding invariants that prove useful properties of programs
  - Transform to SSA
  - Handle φ-nodes as disjunctions
  - Depth of that disjunction is a tunable parameter
  - Use an automatic theorem prover to check whether the assertion holds
Although 2-SI is simple, for many programs 2-SI is sufficient to prove the specified property
29Questions?