Title: Static Analysis: Data-Flow Analysis I
1Static Analysis: Data-Flow Analysis I
( AMP'09: Advanced Models & Programs, 2009 )
- Claus Brabrand
- IT University of Copenhagen
- ( brabrand_at_itu.dk )
2Agenda (23/3, '09)
- Introduction
- Undecidability, Reduction, and Approximation
- Data-flow Analysis
- Quick tour & running example
- Control-Flow Graphs
- Control-flow, data-flow, and confluence
- Science-Fiction Math
- Lattice theory, monotonicity, and fixed-points
- Putting it all together
- Example revisited
3Breaks?
I'll aim for more (but shorter) breaks
4Funny Point
- PS told you that arrays are just ptrs (in C)
- E1[E2] gets turned into *(E1+E2)
- e.g., a[7] is *(a+7)
- But we also know that + is commutative
- i.e., ∀x,y: x + y = y + x
- Thus: a[7] = *(a+7) = *(7+a) = 7[a]
- i.e., you may as well write 7[a] instead of a[7], which is the exact same thing; try it out! :-)
5C is a really weird lang
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *a = (int *) malloc(3 * sizeof(int));  /* room for three ints */
    a[2] = 7;
    int weird = 2[a];                          /* same as a[2], i.e. *(2+a) */
    printf("you'd think there'd be a type error, but this program compiles "
           "w/o warning and prints out seven (%d) just fine :-)\n", weird);
    free(a);
    return 0;
}
6Notes on Static Analysis
- Lecture Notes on Static Analysis
- by Michael I. Schwartzbach
- (BRICS, Uni. Aarhus)
- Chapters 1, 2, 4, 5, 6 (until p. 19)
- (Excl. pointers)
Claims to be "not overly formal", but the math
involved can be quite challenging (at times)
7Static Analysis
- Purpose (of static analysis)
- Gather information (on the running behavior of a program)
- ∀ program points
- Usage (of static analysis)
- Basis for subsequent error detection and optimization
[Figure: Static Analysis produces information (∀ program points), which feeds Error detection and Optimization]
8Analyses for Optimization
- Example Analyses
- Constant Propagation Analysis
- Precompute constants (e.g., replace 5*x+z by 42)
- Live Variables Analysis
- Dead-code elimination (e.g., get rid of unused variable z)
- Available Expressions Analysis
- Avoid recomputing already computed exprs (cache results)
9Analyses for Finding Errors
- Example Analyses
- Symbol Checking
- Catch (dynamic) symbol errors
- Type Checking
- Catch (dynamic) type errors
- Initialized Variable Analysis
- Catch uninitialized variables
10Quizzzzz: Optimization?
- If you want a fast C-program, should you use
- LOOP 1
- LOOP 2 (optimized by programmer)
- i.e., array-version or optimized pointer-version?
LOOP 1:
for (i = 0; i < N; i++) {
    a[i] = a[i] * 2000;
    a[i] = a[i] / 10000;
}
or LOOP 2:
b = a;
for (i = 0; i < N; i++) {
    *b = *b * 2000;
    *b = *b / 10000;
    b++;
}
11Answer
You'll learn how to make similar analyses over the next two weeks
- Results (of running the programs)
- Compilers use highly sophisticated static analyses for optimization!
- Recommendation: focus on writing clear code (for people and for compilers to understand)
13Conceptual Motivation
- Undecidability
- Reduction principle
- Approximation
14Rice's Theorem (1953)
- Examples
- does program P always halt?
- is the value of integer variable x always positive?
- does variable x always have the same value?
- which variables can pointer p point to?
- does expression E always evaluate to true?
- what are the possible outputs of program P?
"Any interesting problem about the runtime behavior of a program *) is undecidable" -- Rice's Theorem paraphrased (1953)
*) written in a Turing-complete language
15Undecidability (self-referentiality)
- Consider "The Book-of-all-Books"
- This book contains the titles of all books that do not have a self-reference (i.e. don't contain their title inside)
- Finitely many books, i.e.
- We can sit down and figure out whether to include each one or not...
- Q: What about "The Book-of-all-Books"?
- Should it be included or not?
- "Self-referential paradox" (many guises)
- e.g. "This sentence is false"
[Figure: "The Book-of-all-Books" lists "The Bible", "War and Peace", "Programming Languages, An Interp.-Based Approach", ...; whether it lists "The Book-of-all-Books" itself is marked "?"]
16Termination Undecidable!
- Assume termination is decidable (in Java)
- i.e. ∃ some program, halts: Program → bool
- Q: Does P0 loop or terminate...? *)
- If halts(p0) answers true, P0 loops; if it answers false, P0 halts; either way halts is wrong about P0
- Hence: halts cannot exist!
- i.e., "Termination is undecidable"
bool halts(Program p) { ... }
-- P0.java --
bool halts(Program p) { ... }
Program p0 = read_program("P0.java");
if (halts(p0)) loop(); else halt();
*) for Turing-complete languages
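The contradiction can be replayed on a toy model. The sketch below is only an illustration under stated assumptions: `HaltingDiagonal`, `Behavior`, `Program`, and `correctOnP0` are hypothetical names, and "looping" is modelled as a returned value rather than a real infinite loop, so the example itself terminates. It shows that whatever a candidate decider answers about P0, P0 does the opposite.

```java
import java.util.function.Predicate;

/** Toy model of the diagonal argument; all names here are illustrative. */
final class HaltingDiagonal {
    /** What a toy program does when run; LOOPS stands in for a real infinite loop. */
    enum Behavior { HALTS, LOOPS }

    /** A toy program that may consult a candidate halting decider while running. */
    interface Program {
        Behavior run(Predicate<Program> halts);
    }

    /** P0: ask the decider about yourself, then do the opposite. */
    static final class P0 implements Program {
        @Override
        public Behavior run(Predicate<Program> halts) {
            return halts.test(this) ? Behavior.LOOPS : Behavior.HALTS;
        }
    }

    /** Does the candidate decider predict P0's behavior correctly? */
    static boolean correctOnP0(Predicate<Program> halts) {
        Program p0 = new P0();
        boolean predictedToHalt = halts.test(p0);
        boolean actuallyHalts = p0.run(halts) == Behavior.HALTS;
        return predictedToHalt == actuallyHalts;
    }

    public static void main(String[] args) {
        // Two deliberately naive candidate deciders; both are wrong on P0.
        System.out.println(correctOnP0(p -> true));   // prints "false"
        System.out.println(correctOnP0(p -> false));  // prints "false"
    }
}
```

Any deterministic candidate passed to `correctOnP0` gets the same result, since P0 negates whatever the candidate says about it.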
17Rice's Theorem (1953)
- Examples
- does program P always halt?
- is the value of integer variable x always positive?
- does variable x always have the same value?
- which variables can pointer p point to?
- does expression E always evaluate to true?
- what are the possible outputs of program P?
"Any interesting problem about the runtime behavior of a program *) is undecidable" -- Rice's Theorem paraphrased (1953)
*) written in a Turing-complete language
[Figure: arrow labelled "reduction"]
18solve always-pos ⇒ solve halts
- Assume x-is-always-pos(P) is decidable
- Given P (here's how we could solve halts(P))
- Construct (clever) reduction program R
- Run the supposedly decidable analysis
- res = x-is-always-positive(R)
- Deduce from result
- if (res) then P loops! else P halts :-)
- THUS: x-is-always-pos(P) must be undecidable!
-- R.java --
int x = 1;
P /* insert program P here :-) */
x = -1;
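The reduction is just a program transformation plus a negation. The following is a minimal sketch, assuming programs are passed around as source text and that P never mentions the variable x; `ReductionSketch`, `buildR`, `haltsVia`, and the stand-in oracle in `main` are all hypothetical (a real decider for x-is-always-pos cannot exist, which is the whole point).

```java
import java.util.function.Predicate;

/** Sketch of the reduction from halts(P) to x-is-always-pos(R). */
final class ReductionSketch {
    /** Builds R from P's source text. Assumes P never mentions the variable x. */
    static String buildR(String p) {
        return "int x = 1;\n" + p + "\nx = -1;\n";
    }

    /** Turns a (hypothetical) decider for "x is always positive" into one for halting. */
    static Predicate<String> haltsVia(Predicate<String> xIsAlwaysPositive) {
        // x stays positive in R exactly when P never finishes, so negate the answer.
        return p -> !xIsAlwaysPositive.test(buildR(p));
    }

    public static void main(String[] args) {
        // Stand-in oracle, for illustration only: it just recognises one literal loop.
        Predicate<String> fakeOracle = r -> r.contains("while (true) {}");
        Predicate<String> halts = haltsVia(fakeOracle);
        System.out.println(halts.test("while (true) {}"));  // prints "false" (P loops)
        System.out.println(halts.test("int y = 2;"));       // prints "true" (P halts)
    }
}
```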
19Reduction Principle
- Reduction principle (in short)
- φ(P) undecidable ∧ (solve ψ(P) ⇒ solve φ(P)) ⟹ ψ(P) undecidable
- (the middle premise, solve ψ(P) ⇒ solve φ(P), is the reduction)
- Example
- halts(P) undecidable ∧ (solve x-is-always-pos(P) ⇒ solve halts(P)) ⟹ x-is-always-pos(P) undecidable
- Exercise
- Carry out the reduction + whole explanation for
- which variables can pointer p point to?
20Answer
- Assume which-var-q-points-to(P) is decidable
- Given P (here's how to (cleverly) decide halts(P))
- Construct (clever) reduction program R
- Run res = which-var-q-points-to(R)
- If (null ∈ res) P halts! else P loops! :-)
- THUS
- which-var-q-points-to(P) must be undecidable!
-- R.java --
ptr q = 0xffff;
P /* insert program P */
q = null;
21Undecidability
- Undecidability means that
- no-one can decide this line (for all programs)!
- However(!)
[Figure: the space of all programs, split by a line ("?") into "loops" and "halts"]
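22Side-Stepping Undecidability
- Compilers use safe approximations (computed via static analyses) such that every definite answer ("Okay!" or "Error!") is correct; otherwise the answer is "Dunno?"
However, just because it's undecidable, doesn't mean there aren't (good) approximations! Indeed, the whole area of static analysis works on side-stepping undecidability
[Figure: two safe approximations of the line between "error" and "ok" programs; one answers "Okay!" or "Dunno?", the other "Error!" or "Dunno?"]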
23Side-Stepping Undecidability
- Unsafe approximation
- For testing it may be okay to abandon safety and use unsafe approximations
[Figure: an "unsafe approximation" drawn across the "loops" / "halts" and "error" / "ok" lines]
"Here are some programs for you to (manually) consider!"
24Slack
- Undecidability means there'll always be a slack
- However, still useful (possible interpretations of "Dunno?")
- Treat as error (i.e., reject program)
- "Sorry, program not accepted!"
- Treat as warning (i.e., warn programmer)
- "Here are some potential problems"
[Figure: the slack between the "loops" / "halts" line and the approximation; answers are "Okay!" or "Dunno?"]
25Soundness & Completeness
- Soundness
- Analysis reports no errors ⇒ Really are no errors
- Completeness
- Analysis reports an error ⇒ Really is an error
[Figure: "Sound analysis" and "Complete analysis" drawn as approximations of the line between "error" and "ok" programs]
or alternative (equivalent) formulation, via contra-position: (P ⇒ Q) ⇔ (¬Q ⇒ ¬P)
- Soundness: Really are error(s) ⇒ Analysis reports error(s)
- Completeness: Really no error(s) ⇒ Analysis reports no error(s)
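On a finite set of programs where the ground truth is known, both properties can be checked directly. This is only a sketch under that assumption (in general the ground truth is exactly what is undecidable); `SoundnessCheck` and `Case` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

/** Checks soundness and completeness of analysis verdicts against known ground truth. */
final class SoundnessCheck {
    /** One program: whether it really has an error, and whether the analysis reports one. */
    static final class Case {
        final boolean reallyHasError;
        final boolean reportsError;
        Case(boolean reallyHasError, boolean reportsError) {
            this.reallyHasError = reallyHasError;
            this.reportsError = reportsError;
        }
    }

    /** Sound: every real error is reported (no report implies no error). */
    static boolean isSound(List<Case> cases) {
        return cases.stream().allMatch(c -> !c.reallyHasError || c.reportsError);
    }

    /** Complete: every report is a real error (no false alarms). */
    static boolean isComplete(List<Case> cases) {
        return cases.stream().allMatch(c -> !c.reportsError || c.reallyHasError);
    }

    public static void main(String[] args) {
        // A conservative analysis: it catches the real error but also raises one false alarm.
        List<Case> conservative = Arrays.asList(
                new Case(true, true), new Case(false, true), new Case(false, false));
        System.out.println(isSound(conservative));     // prints "true"
        System.out.println(isComplete(conservative));  // prints "false"
    }
}
```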
26Example: Type Checking
- Will this program have a type error (when run)?
void f() {
    var b;
    if (<EXP>) { b = 42; } else { b = true; }
    /* some code */
    if (b) ... // error if b is '42'
}
- Undecidable (because of reduction)
- Type error ⇔ <EXP> evaluates to true
Undecidable (in all cases), i.e., what <EXP> evaluates to (when run)
27Example: Type Checking
- Hence, languages use static requirements
- All variables must be declared
- And have only one type (throughout the program)
- This is (very) easy to check (i.e., "type-checking")
void f() {
    bool b; // instead of "var b"
    /* some code */
    if (<EXP>) { b = 42; } else { b = true; }
    /* some more code */
}
Static compiler error: regardless of what <EXP> evaluates to when run
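A minimal sketch of why this check is easy: the checker looks at every assignment in both branches and never evaluates <EXP>. The class `TinyTypeChecker`, its `Assign` record of "variable = literal", and the error wording are hypothetical and cover only this toy case, not a real language.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Toy static check: every assignment must match the variable's single declared type. */
final class TinyTypeChecker {
    enum Type { BOOL, INT }

    /** An assignment "variable = literal", wherever it occurs in the program. */
    static final class Assign {
        final String variable;
        final Object literal;
        Assign(String variable, Object literal) {
            this.variable = variable;
            this.literal = literal;
        }
    }

    static Type typeOf(Object literal) {
        if (literal instanceof Boolean) return Type.BOOL;
        if (literal instanceof Integer) return Type.INT;
        throw new IllegalArgumentException("unsupported literal: " + literal);
    }

    /** Returns one message per ill-typed assignment; branch conditions are never evaluated. */
    static List<String> check(Map<String, Type> declared, List<Assign> assignments) {
        List<String> errors = new ArrayList<>();
        for (Assign a : assignments) {
            Type expected = declared.get(a.variable);
            if (expected == null) {
                errors.add("undeclared variable " + a.variable);
            } else if (expected != typeOf(a.literal)) {
                errors.add("cannot assign " + typeOf(a.literal) + " to " + expected
                        + " variable " + a.variable);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        Map<String, Type> declared = Collections.singletonMap("b", Type.BOOL);
        // Both branches of the if are checked, whatever <EXP> would evaluate to.
        List<Assign> bothBranches = Arrays.asList(new Assign("b", 42), new Assign("b", true));
        System.out.println(check(declared, bothBranches));
        // prints "[cannot assign INT to BOOL variable b]"
    }
}
```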
295 Crash Course on Data-Flow Analysis
30Data-Flow Analysis
- IDEA
- "Simulate runtime execution at compile-time using abstract values"
- We (only) need 3 things
- A control-flow graph
- A lattice
- Transfer functions
- Example: (integer) constant propagation
31Control-flow graph
- We (only) need 3 things
- A control-flow graph
- A lattice
- Transfer functions
Given program:
int x = 1;
int y = 3;
if (...) x = x+2;
else x <-> y;
print(x,y);
[Figure: control-flow graph: "int x = 1" → "int y = 3" → "..."; the true edge goes to "x = x+2", the false edge to "x <-> y"; both branches join at "print(x,y)"]
32A Lattice
- We (only) need 3 things
- A control-flow graph
- A lattice
- Transfer functions
- Lattice L of abstract values of interest and their relationships (i.e. ordering ⊑)
- Induces least-upper-bound operator ⊔
- for combining information
[Figure: flat lattice with top ("could be anything") above the integers ..., -3, -2, -1, 0, 1, 2, 3, ..., and bottom ("we haven't analyzed yet") below them]
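The flat lattice and its least-upper-bound can be sketched in a few lines. This is an illustrative encoding, not the lecture's own: `ConstValue`, `Kind`, `of`, `join`, and `leq` are assumed names.

```java
/** An element of the flat constant lattice: BOTTOM, a single integer, or TOP. */
final class ConstValue {
    enum Kind { BOTTOM, CONST, TOP }

    static final ConstValue BOTTOM = new ConstValue(Kind.BOTTOM, 0);  // not analyzed yet
    static final ConstValue TOP = new ConstValue(Kind.TOP, 0);        // could be anything

    final Kind kind;
    final int value;  // meaningful only when kind == CONST

    private ConstValue(Kind kind, int value) {
        this.kind = kind;
        this.value = value;
    }

    static ConstValue of(int n) {
        return new ConstValue(Kind.CONST, n);
    }

    /** Least upper bound: BOTTOM is neutral, equal constants stay, anything else is TOP. */
    ConstValue join(ConstValue other) {
        if (kind == Kind.BOTTOM) return other;
        if (other.kind == Kind.BOTTOM) return this;
        if (kind == Kind.CONST && other.kind == Kind.CONST && value == other.value) return this;
        return TOP;
    }

    /** The ordering, derived from join: a is below b exactly when join(a, b) equals b. */
    boolean leq(ConstValue other) {
        return join(other).equals(other);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ConstValue)) return false;
        ConstValue c = (ConstValue) o;
        return kind == c.kind && (kind != Kind.CONST || value == c.value);
    }

    @Override
    public int hashCode() {
        return kind == Kind.CONST ? Integer.hashCode(value) : kind.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(of(3).join(of(3)).equals(of(3)));  // prints "true"
        System.out.println(of(1).join(of(3)).equals(TOP));    // prints "true"
        System.out.println(BOTTOM.leq(of(7)));                // prints "true"
    }
}
```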
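33Data-Flow Analysis
- We (only) need 3 things
- A control-flow graph
- A lattice
- Transfer functions
Transfer functions (one per statement):
- int x = 1: λE . E[x ↦ 1]
- int y = 3: λE . E[y ↦ 3]
- x = x+2: λE . E[x ↦ E(x) ⊕ 2]
- x <-> y: λE . E[x ↦ E(y), y ↦ E(x)]
[Figure: environments (x, y) propagated through the control-flow graph: x = 1 after "int x = 1"; x = 1, y = 3 after "int y = 3" and after "..."; x = 3, y = 3 after "x = x+2"; x = 3, y = 1 after "x <-> y"; at "print(x,y)" the two branches are joined, giving x = 3, y = top]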
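The whole example can be run as a small program. The sketch below hard-codes the example's control flow (two straight-line statements, then a two-way branch that is joined), so it needs no fixed-point iteration; a general analysis would iterate over an arbitrary graph. The names `ConstPropDemo`, `Abs`, `update`, `join`, and `analyze` are assumptions made for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Constant propagation for: int x = 1; int y = 3; if (...) x = x+2; else x <-> y; */
final class ConstPropDemo {
    /** Abstract value: BOTTOM (not analyzed yet), a known constant, or TOP (could be anything). */
    static final class Abs {
        static final Abs BOTTOM = new Abs(null, "bottom");
        static final Abs TOP = new Abs(null, "top");
        final Integer constant;  // non-null only for known constants
        final String label;
        private Abs(Integer constant, String label) { this.constant = constant; this.label = label; }
        static Abs of(int n) { return new Abs(n, Integer.toString(n)); }
        boolean isConst() { return constant != null; }

        Abs join(Abs o) {
            if (this == BOTTOM) return o;
            if (o == BOTTOM) return this;
            if (isConst() && o.isConst() && constant.equals(o.constant)) return this;
            return TOP;
        }

        /** Abstract addition: exact on constants, TOP if either side is unknown. */
        Abs plus(Abs o) {
            if (this == BOTTOM || o == BOTTOM) return BOTTOM;
            if (isConst() && o.isConst()) return of(constant + o.constant);
            return TOP;
        }

        @Override public String toString() { return label; }
    }

    /** E[var ↦ value]: a copy of env with one binding replaced. */
    static Map<String, Abs> update(Map<String, Abs> env, String var, Abs value) {
        Map<String, Abs> copy = new HashMap<>(env);
        copy.put(var, value);
        return copy;
    }

    /** Pointwise join of two environments over the same variables. */
    static Map<String, Abs> join(Map<String, Abs> a, Map<String, Abs> b) {
        Map<String, Abs> out = new HashMap<>();
        for (String var : a.keySet()) out.put(var, a.get(var).join(b.get(var)));
        return out;
    }

    /** Returns the environment just before print(x,y). */
    static Map<String, Abs> analyze() {
        Map<String, Abs> entry = new HashMap<>();
        entry.put("x", Abs.BOTTOM);
        entry.put("y", Abs.BOTTOM);
        UnaryOperator<Map<String, Abs>> initX = e -> update(e, "x", Abs.of(1));
        UnaryOperator<Map<String, Abs>> initY = e -> update(e, "y", Abs.of(3));
        UnaryOperator<Map<String, Abs>> addTwo = e -> update(e, "x", e.get("x").plus(Abs.of(2)));
        UnaryOperator<Map<String, Abs>> swap = e -> update(update(e, "x", e.get("y")), "y", e.get("x"));
        Map<String, Abs> beforeIf = initY.apply(initX.apply(entry));
        return join(addTwo.apply(beforeIf), swap.apply(beforeIf));
    }

    public static void main(String[] args) {
        Map<String, Abs> atPrint = analyze();
        System.out.println("x = " + atPrint.get("x"));  // prints "x = 3"
        System.out.println("y = " + atPrint.get("y"));  // prints "y = top"
    }
}
```

So x is known to be the constant 3 at the print statement on either path, while y is not a constant.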
35Control Structures
- Control Structures
- Statements (or Exprs) that affect "flow of control"
- if-else
- syntax: if ( Exp ) Stm1 else Stm2
- semantics: The expression must be of type boolean; if it evaluates to true, Statement-1 is executed, otherwise Statement-2 is executed.
- if
- syntax: if ( Exp ) Stm
- semantics: The expression must be of type boolean; if it evaluates to true, the given statement is executed, otherwise not.
36Control Structures (cont'd)
- while
- syntax: while ( Exp ) Stm
- semantics: The expression must be of type boolean; if it evaluates to false, the given statement is skipped, otherwise it is executed and afterwards the expression is evaluated again. If it is still true, the statement is executed again. This is continued until the expression evaluates to false.
- for
- syntax: for (Exp1; Exp2; Exp3) Stm
- semantics: Equivalent to { Exp1; while ( Exp2 ) { Stm Exp3; } }
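37Control-flow graph
Given program:
int x = 1;
int y = 3;
if (a>b) x = x+2;
else x <-> y;
print(x,y);
[Figure: control-flow graph: "int x = 1" → "int y = 3" → "a>b"; the true edge goes to "x = x+2", the false edge to "x <-> y"; both branches join at "print(x,y)"]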
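One possible in-memory representation of this graph is a node per statement with a list of successors. The sketch below is an assumption about representation, not the lecture's own data structure; `CfgSketch`, `Node`, `to`, `build`, and `reachable` are hypothetical names, and edge labels (true/false) are kept only as comments.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Control-flow graph for the example program, as nodes with successor lists. */
final class CfgSketch {
    static final class Node {
        final String label;
        final List<Node> successors = new ArrayList<>();
        Node(String label) { this.label = label; }

        /** Adds an edge to next and returns next, so edges can be chained. */
        Node to(Node next) {
            successors.add(next);
            return next;
        }
    }

    /** Builds the graph and returns its entry node. */
    static Node build() {
        Node declX = new Node("int x = 1");
        Node declY = new Node("int y = 3");
        Node cond = new Node("a>b");
        Node thenBranch = new Node("x = x+2");
        Node elseBranch = new Node("x <-> y");
        Node print = new Node("print(x,y)");
        declX.to(declY).to(cond);
        cond.to(thenBranch).to(print);  // true edge, then join
        cond.to(elseBranch).to(print);  // false edge, then join
        return declX;
    }

    /** Labels of all nodes reachable from start, depth-first, each node once. */
    static List<String> reachable(Node start) {
        List<String> out = new ArrayList<>();
        visit(start, new HashSet<>(), out);
        return out;
    }

    private static void visit(Node node, Set<Node> seen, List<String> out) {
        if (!seen.add(node)) return;  // already visited (e.g. the join node)
        out.add(node.label);
        for (Node next : node.successors) visit(next, seen, out);
    }

    public static void main(String[] args) {
        System.out.println(reachable(build()));
        // prints "[int x = 1, int y = 3, a>b, x = x+2, print(x,y), x <-> y]"
    }
}
```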
38Exercise: Draw a Control-Flow Graph for
public static void main ( String[] args ) {
    int mi, ma;
    if (args.length == 0)
        System.out.println("No numbers");
    else {
        mi = ma = Integer.parseInt(args[0]);
        for (int i=1; i < args.length; i++) {
            int obs = Integer.parseInt(args[i]);
            if (obs > ma) ma = obs;
            else if (mi < obs) mi = obs;
        }
        System.out.println("min=" + mi + "," + "max=" + ma);
    }
}
Control structures involved: if / else, for, if / else if
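39Control-Flow Graph
[Figure: control-flow graph of the exercise program, with these nodes and edges]
- int mi, ma → args.length == 0
- args.length == 0, true → System.out.println("No numbers")
- args.length == 0, false → mi = ma = Integer.parseInt(args[0]) → int i=1 → i < args.length
- i < args.length, true → int obs = Integer.parseInt(args[i]) → obs > ma
- i < args.length, false → System.out.println("min=" + mi + "," + "max=" + ma)
- obs > ma, true → ma = obs → i++
- obs > ma, false → mi < obs
- mi < obs, true → mi = obs → i++
- mi < obs, false → i++
- i++ → i < args.length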
40Control Structures (cont'd 2)
Exercise
- do-while
- do Stm while ( Exp );
- "?:" conditional expression
- Exp1 ? Exp2 : Exp3
- "||" lazy disjunction (aka., short-cut ∨)
- Exp1 || Exp2
- "&&" lazy conjunction (aka., short-cut ∧)
- Exp1 && Exp2
- switch
- switch ( Exp ) { Swb }
- Swb: case Exp : Stm break;
- Swb: default : Stm break;
41Control Structures (cont'd 3)
- try-catch-finally (exceptions)
- try Stm1 catch ( Exp ) Stm2 finally Stm3
- return / break / continue
- return; return Exp; break; continue;
- method invocation
- e.g. f(x)
- recursive method invocation
- e.g. f(x)
- virtual dispatching
- e.g. f(x)
42Control Structures (cont'd 4)
- function pointers
- e.g. (*f)(x)
- higher-order functions
- e.g. λf.λx.(f x)
- dynamic evaluation
- e.g. eval(some-string-which-has-been-dynamically-computed)
- Some constructions (and thus languages) require a separate control-flow analysis for determining control-flow in order to do data-flow analysis