Title: ESP [Das et al PLDI 2002]
1. ESP [Das et al., PLDI 2002]
CleanL TSys DSL
DFA WP/SP MC ATP
- Interface usage rules in documentation
- Order of operations, data access
- Resource management
- Incomplete, wordy, not checked
- Violated rules => crashes
- Failed runtime checks
- Unreliable software
2. ESP [Das et al., PLDI 2002]
[Diagram: a C program and a set of rules are the inputs to ESP, which answers "Safe" or "Not Safe"]
3. ESP [Das et al., PLDI 2002]
- ESP is a program analysis that keeps track of object state at each program point
  - e.g. is a file handle open or closed?
- Challenge: scale to large programs
- One of the scalability issues: merge nodes
- Always analyzing both sides of a merge node separately => exponential (or non-terminating) program analyses
- ESP has a heuristic for handling merges that
  - avoids exponential blow-up and runs fast in practice
  - maintains enough precision to verify programs
4. Property simulation example: stdio usage in gcc
    void main() {
        if (dump) fil = fopen(dumpFile, "w");
        if (p) x = 0;
        else x = 1;
        if (dump) fclose(fil);
    }
5. Property simulation example: stdio usage in gcc
    void main() {
        if (dump) Open;
        if (p) x = 0;
        else x = 1;
        if (dump) Close;
    }
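7-8. Example: no path-sensitivity
[Figure: the example program annotated with the set of property states at successive program points; labels in order of appearance]
- {uninit}
- {uninit, Opened}
- {uninit, Opened}
- {error, uninit, Opened}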
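The state sets listed above can be reproduced with a small sketch of a path-insensitive analysis over the abstracted program (Open / Close events). The transition table, the rule that any unlisted transition goes to error, and the helper names `step`, `apply_event` and `analyze_path_insensitive` are assumptions made for illustration; they are not taken from the ESP paper.

```python
# Assumed finite-state property for a file handle (state names follow the slide labels).
TRANSITIONS = {
    ("uninit", "Open"): "Opened",
    ("Opened", "Close"): "uninit",
}

def step(state, event):
    """Advance one property state; any transition not listed goes to 'error' (assumption)."""
    if state == "error":
        return "error"
    return TRANSITIONS.get((state, event), "error")

def apply_event(states, event):
    """Apply an event to every property state in the set."""
    return {step(s, event) for s in states}

def analyze_path_insensitive():
    """Track only a set of property states; merge branches by set union."""
    states = {"uninit"}                              # at entry
    trace = [set(states)]
    # if (dump) Open;       taken branch fires Open, the other branch does nothing
    states = apply_event(states, "Open") | states
    trace.append(set(states))
    # if (p) x = 0; else x = 1;   no file events on either branch
    states = states | states
    trace.append(set(states))
    # if (dump) Close;      the analysis has forgotten that dump guarded the Open
    states = apply_event(states, "Close") | states
    trace.append(set(states))
    return trace

for point in analyze_path_insensitive():
    print(sorted(point))
```

Because the correlation between the two `if (dump)` tests is lost at the first merge, Close is applied to uninit as well, which is where the false error comes from.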
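9-10. Example: full path-sensitivity
[Figure: the example program annotated with symbolic states (property state | execution state) along one path; labels in order of appearance]
- [uninit]
- [uninit | dump=T]
- [Opened | dump=T]
- [Opened | dump=T, p=T]
- [Opened | dump=T, p=T, x=0]
- [Opened | dump=T, p=T, x=0]
- [uninit | dump=T, p=T, x=0]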
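11-12. Example: ESP technique
[Figure: the example program annotated with the symbolic states (property state | execution state) kept by ESP; labels in order of appearance]
- [uninit]
- [Opened | dump=T]
- [uninit | dump=F]
- [Opened | dump=T]
- [uninit | dump=F]
- [uninit | dump=T]
- [uninit | dump=F]
- [uninit | dump=T]
- [uninit]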
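A minimal sketch of the merge heuristic behind these labels: at a merge point, symbolic states with the same property state are joined (keeping only the facts they agree on), while states with different property states stay separate. The fact representation (a dict of known variable values), the transition table, and all helper names below are assumptions for illustration, not ESP's actual implementation.

```python
# Assumed finite-state property for a file handle (state names follow the slide labels).
TRANSITIONS = {("uninit", "Open"): "Opened", ("Opened", "Close"): "uninit"}

def step(prop, ev):
    """Unlisted transitions go to 'error' (assumption)."""
    return "error" if prop == "error" else TRANSITIONS.get((prop, ev), "error")

def assume(sym_states, var, value):
    """Follow a branch where var == value; drop states whose facts contradict it."""
    return [(prop, {**facts, var: value})
            for prop, facts in sym_states
            if facts.get(var, value) == value]

def assign(sym_states, var, value):
    """Record the effect of an assignment var = value."""
    return [(prop, {**facts, var: value}) for prop, facts in sym_states]

def event(sym_states, ev):
    """Fire a property event (Open / Close) on every symbolic state."""
    return [(step(prop, ev), facts) for prop, facts in sym_states]

def merge(sym_states):
    """Keep one symbolic state per property state; facts survive only if all agree."""
    grouped = {}
    for prop, facts in sym_states:
        if prop not in grouped:
            grouped[prop] = dict(facts)
        else:
            old = grouped[prop]
            grouped[prop] = {k: v for k, v in old.items()
                             if k in facts and facts[k] == v}
    return sorted(grouped.items(), key=lambda kv: kv[0])

def simulate():
    s = [("uninit", {})]
    # if (dump) Open;
    s = merge(event(assume(s, "dump", True), "Open") + assume(s, "dump", False))
    after_open = s
    # if (p) x = 0; else x = 1;
    s = merge(assign(assume(s, "p", True), "x", 0) +
              assign(assume(s, "p", False), "x", 1))
    after_p = s
    # if (dump) Close;
    s = merge(event(assume(s, "dump", True), "Close") + assume(s, "dump", False))
    return after_open, after_p, s

for point in simulate():
    print(point)
```

In this sketch the facts about p and x are dropped at the second merge because both branches are in the same property state, while dump is kept because it distinguishes Opened from uninit. That is enough to rule out the Close-on-uninit path, so no error state appears.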
13. Case study: stdio usage in gcc
- cc1 from gcc version 2.5.3 (Spec95)
- Does cc1 always print to opened files?
- cc1 is a complex program
- 140K non-blank, non-comment lines of C
- 2149 functions, 66 files, 1086 globals
- Call graph includes one 450-function SCC
14. Experimental results
- Precision
  - Verification succeeds for every file handle
  - No transitions to error => no false errors
- Scalability
  - Average per handle: 72.9 seconds, 49.7 MB
  - Single 1 GHz PIII laptop with 512 MB RAM
- Proved that
  - Each of the 646 calls to fprintf in the source code prints to a valid, open file
15. ESP follow-up
- ESP has since been run on large real-world applications
- ESP/X: local, intra-procedural version
- PSE: post-mortem analysis
  - run ESP backwards to figure out what caused a crash
16. Recap and conclusion
17. Course overview
- Cross-cutting issues
- Correctness
- Ordering transformations and analyses
- Dataflow analysis and variations
- iterative dataflow analysis
- program representations
- interprocedural
- flow-insensitive
- path-sensitive
- Applications
- Pointer analysis
- Optimizing OO languages
- Program reliability
- Rhodium
19. Flow-sensitive intraprocedural dataflow analysis
- Iterative dataflow analysis
  - flow functions, lattice-theoretic formulation
- Termination
  - monotonic flow functions + finite-height lattice
- Meet over all paths (MOP) vs. meet over all feasible paths vs. dataflow analysis
- For distributive problems, MOP = dataflow analysis
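As a reminder of what the iterative algorithm looks like, here is a minimal worklist sketch for a forward problem, instantiated with reaching definitions on a small made-up CFG. The CFG, the GEN/KILL sets and the function names are illustrative assumptions. Termination follows from the two conditions above: the transfer functions are monotonic and the lattice (sets of definitions under union) has finite height.

```python
from collections import deque

def solve_forward(cfg, transfer, join, bottom):
    """Worklist solver for a forward dataflow problem; returns the OUT fact of each node."""
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)
    out = {n: bottom for n in cfg}
    worklist = deque(cfg)                     # start with every node
    while worklist:
        n = worklist.popleft()
        in_fact = bottom
        for p in preds[n]:                    # join the facts of all predecessors
            in_fact = join(in_fact, out[p])
        new_out = transfer(n, in_fact)
        if new_out != out[n]:                 # fact changed: revisit successors
            out[n] = new_out
            for s in cfg[n]:
                if s not in worklist:
                    worklist.append(s)
    return out

# Hypothetical CFG: a loop whose body is a diamond (a -> b | c -> d -> a or exit).
CFG = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"],
       "d": ["a", "exit"], "exit": []}
# Reaching definitions: node a defines x, node b defines y, node c redefines x.
GEN = {"a": frozenset({"x@a"}), "b": frozenset({"y@b"}), "c": frozenset({"x@c"})}
KILL = {"a": frozenset({"x@c"}), "c": frozenset({"x@a"})}

def transfer(node, fact):
    """Standard gen/kill flow function: OUT = GEN | (IN - KILL)."""
    return GEN.get(node, frozenset()) | (fact - KILL.get(node, frozenset()))

result = solve_forward(CFG, transfer, lambda a, b: a | b, frozenset())
for node in CFG:
    print(node, sorted(result[node]))
```

Gen/kill problems such as this one are distributive, so the fixed point computed here coincides with the MOP solution.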
21. Program representations
- Simple
- AST
- CFG
- More advanced
- Dataflow Graph
- Control Dependence Graph
- Program Dependence Graph
- SSA
23. Interprocedural analysis
- Context-insensitive
- caller summaries and callee summaries
- Context-sensitive
- call strings as context (k-CFA, call-strings approach)
- dataflow info as context
- bottom-up, complete summaries
- top-down, partial summaries (partial transfer functions)
25. Flow-insensitive analysis
- Keep only one piece of information for the entire program/procedure
- Loses precision, but improves space consumption
27. Path-sensitive analysis
- Enhance dataflow to try to keep paths separate
- Two kinds of path-sensitive analysis
  - aim towards MOP
  - aim towards removing infeasible paths (branch correlations)
31. Pointer analysis
- Started with a simple, naïve intraprocedural analysis with allocation-site summaries
- To scale to large programs
  - make the naïve pointer analysis flow-insensitive (Andersen)
  - make each node have only one outgoing edge, which makes it near-linear time (Steensgaard)
  - add one level of flow to regain some precision (One-level flow)
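A minimal sketch of the flow-insensitive, inclusion-based (Andersen-style) step: statements are treated as an unordered set of constraints and iterated to a fixed point. The statement encoding (`addr`, `copy`, `load`, `store`) and the function name `andersen` are assumptions for illustration, and this naive loop is not an efficient implementation.

```python
def andersen(stmts):
    """Naive fixed-point computation of points-to sets; statement order is ignored."""
    pts = {}

    def add(var, targets):
        """Grow var's points-to set; report whether anything new was added."""
        cur = pts.setdefault(var, set())
        before = len(cur)
        cur |= targets
        return len(cur) != before

    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":                        # lhs = &rhs
                changed |= add(lhs, {rhs})
            elif kind == "copy":                      # lhs = rhs
                changed |= add(lhs, set(pts.get(rhs, ())))
            elif kind == "load":                      # lhs = *rhs
                for t in list(pts.get(rhs, ())):
                    changed |= add(lhs, set(pts.get(t, ())))
            elif kind == "store":                     # *lhs = rhs
                for t in list(pts.get(lhs, ())):
                    changed |= add(t, set(pts.get(rhs, ())))
    return pts

# Hypothetical program: p = &a; q = &b; r = &p; *r = q; s = *r;
program = [("addr", "p", "a"), ("addr", "q", "b"), ("addr", "r", "p"),
           ("store", "r", "q"), ("load", "s", "r")]
points_to = andersen(program)
for var in sorted(points_to):
    print(var, "->", sorted(points_to[var]))
```

Here p ends up pointing to both a and b because the store through r is never ordered against `p = &a`. A Steensgaard-style analysis would go further and unify a and b into a single node so that p keeps one outgoing edge.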
34. Program analysis and program reliability
- Property simulation
  - path-sensitive analysis in polynomial time
  - uses clever heuristic for merges
  - algorithm behind ESP
- Predicate abstraction and iterative refinement
  - given a set of predicates, compute the predicates that hold at each program point
  - iteratively refine the set of predicates
  - core of BLAST and SLAM
36. Looking forward (discussion)
- What are the current hot topics in compilers and program analysis?
- Compilers and program analysis 20 years from now?
37. Looking forward: Concurrency
- Hardware trends are making exploiting concurrency more and more important
- Language features and compiler technology to express and exploit concurrency
- Current examples
  - race detection
  - primitives for concurrency and efficient implementations (e.g. an atomic primitive)
38. Looking forward: Scalability
- Scale to large programs while retaining precision
- Current examples
  - Use scalable constraint solvers such as SAT (SATURN)
  - Use compact representations such as BDDs
39. Looking forward: Verification
- Tradeoffs between
- automation
- scalability
- precision
- domain-specificity
- Current examples
- ESP, BLAST, SLAM, Rhodium
40. Looking forward: Extensibility
- Removing the barrier to entry to the compiler
- New models of using compilers for
- domain-specific checkers
- domain-specific optimizations
- Current examples
- Rhodium, Collider