Static Analysis: Data-Flow Analysis I

1
Static Analysis: Data-Flow Analysis I
( AMP'09: Advanced Models & Programs, 2009 )
  • Claus Brabrand
  • IT University of Copenhagen
  • ( brabrand_at_itu.dk )

2
Agenda (23/3, '09)
  • Introduction
    • Undecidability, Reduction, and Approximation
  • Data-flow Analysis
    • Quick tour & running example
  • Control-Flow Graphs
    • Control-flow, data-flow, and confluence
  • "Science-Fiction Math"
    • Lattice theory, monotonicity, and fixed-points
  • Putting it all together
    • Example revisited

3
Breaks?
  • Breaks, Breaks, ...?

I'll aim for more (but shorter) breaks
4
Funny Point
  • PS told you that arrays are just ptrs (in C)
    • E1[E2] gets turned into *(E1+E2)
    • e.g., a[7] ≡ *(a+7)
  • But we also know that '+' is commutative
    • i.e., ∀x,y: x + y = y + x
  • Thus: a[7] ≡ *(a+7) ≡ *(7+a) ≡ 7[a]
    • i.e., you may as well write 7[a] instead of a[7]
    • which is the exact same thing; try it out! :-)

5
'C' is a really weird lang...

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        int *a = (int *) malloc(3 * sizeof(int));
        if (a == NULL) return 1;
        a[2] = 7;
        int weird = 2[a];   /* same as a[2], i.e. *(2 + a) */
        printf("you'd think there'd be a type error, but this program compiles "
               "w/o warning and prints out seven (%d) just fine :-)\n", weird);
        free(a);
        return 0;
    }
6
Notes on Static Analysis
  • Lecture Notes on Static Analysis
  • by Michael I. Schwartzbach
  • (BRICS, Uni. Aarhus)
  • Chapters 1, 2, 4, 5, 6 (until p. 19)
  • (Excl. pointers)

Claims to be "not overly formal", but the math
involved can be quite challenging (at times)
7
Static Analysis
  • Purpose (of static analysis)
    • Gather information (on running behavior of program)
    • ∀ program points
  • Usage (of static analysis)
    • Basis for subsequent:
      • Error detection
      • Optimization

[Diagram: Static Analysis → information ∀ program points → Error detection / Optimization]
8
Analyses for Optimization
  • Example Analyses
    • Constant Propagation Analysis
      • Precompute constants (e.g., replace '5*x+z' by '42')
    • Live Variables Analysis
      • Dead-code elimination (e.g., get rid of unused variable 'z')
    • Available Expressions Analysis
      • Avoid recomputing already computed exprs (cache results)

9
Analyses for Finding Errors
  • Example Analyses
    • Symbol Checking
      • Catch (dynamic) symbol errors
    • Type Checking
      • Catch (dynamic) type errors
    • Initialized Variable Analysis
      • Catch uninitialized variables

10
Quizzzzz: Optimization?
  • If you want a fast C-program, should you use:
    • LOOP 1
    • LOOP 2 (optimized by programmer)
  • i.e., array-version or "optimized" pointer-version?

LOOP 1:

    for (i = 0; i < N; i++) {
      a[i] = a[i] * 2000;
      a[i] = a[i] / 10000;
    }

...or LOOP 2:

    b = a;
    for (i = 0; i < N; i++) {
      *b = *b * 2000;
      *b = *b / 10000;
      b++;
    }
11
Answer
You'll learn how to make similar analyses in the next two weeks.
  • Results (of running the programs)
  • Compilers use highly sophisticated static analyses for optimization!
  • Recommendation: focus on writing clear code (for people and for compilers to understand)

13
Conceptual Motivation
  • Undecidability
  • Reduction principle
  • Approximation

14
Rice's Theorem (1953)
  • Examples:
    • does program P always halt?
    • is the value of integer variable x always positive?
    • does variable x always have the same value?
    • which variables can pointer p point to?
    • does expression E always evaluate to true?
    • what are the possible outputs of program P?

"Any interesting problem about the runtime behavior of a program *) is undecidable" -- Rice's Theorem paraphrased (1953)
*) written in a Turing-complete language
15
Undecidability (self-referentiality)
  • Consider "The Book-of-all-Books":
    • This book contains the titles of all books that do not have a self-reference (i.e. don't contain their title inside)
    • Finitely many books; i.e.:
      • We can sit down & figure out whether to include or not...
    • Q: What about "The Book-of-all-Books"?
      • Should it be included or not?
  • "Self-referential paradox" (many guises):
    • e.g. "This sentence is false"

[Diagram: The Book-of-all-Books, listing "The Bible", "War and Peace", "Programming Languages, An Interp.-Based Approach", ... and, with a question mark, "The Book-of-all-Books" itself]
16
Termination Undecidable!
  • Assume termination is decidable (in Java);
    • i.e. ∃ some program, halts: Program → bool

        bool halts(Program p) { ... }

  • Q: Does P0 loop or terminate...?

        -- P0.java --
        bool halts(Program p) { ... }
        Program p0 = read_program("P0.java");
        if (halts(p0)) loop(); else halt();

    • If halts(p0) is true, P0 loops; if it is false, P0 halts. Either way, halts is wrong about P0.
  • Hence: halts cannot exist!
    • i.e., "Termination is undecidable" *)

*) for Turing-complete languages
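The contradiction can be replayed in a small toy model. This is only a sketch under loud assumptions: a "program" is reduced to a single fact (whether it actually halts), and the names `HaltingDiagonal`, `Program`, `buildP0` and `correctOnP0` are hypothetical, not part of the lecture. Whatever a claimed decider answers about P0, P0 does the opposite.

```java
import java.util.function.Predicate;

/** Toy model of the P0 argument: a "program" is reduced to whether it halts. */
final class HaltingDiagonal {

    /** Hypothetical program abstraction: all we record is its real behaviour. */
    interface Program {
        boolean actuallyHalts();
    }

    /** Builds P0 for a claimed decider: P0 does the opposite of the prediction. */
    static Program buildP0(Predicate<Program> halts) {
        return new Program() {
            @Override
            public boolean actuallyHalts() {
                // P0.java: if (halts(p0)) loop(); else halt();
                return !halts.test(this);
            }
        };
    }

    /** True iff the claimed decider predicts P0's behaviour correctly. */
    static boolean correctOnP0(Predicate<Program> halts) {
        Program p0 = buildP0(halts);
        return halts.test(p0) == p0.actuallyHalts();
    }

    public static void main(String[] args) {
        System.out.println(correctOnP0(p -> true));   // decider always says "halts"
        System.out.println(correctOnP0(p -> false));  // decider always says "loops"
    }
}
```

Both calls print `false`: each candidate decider is wrong about its own P0. The model only covers deciders that answer without running the program; one that called `actuallyHalts()` on P0 would recurse forever, which is the same paradox in another guise.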
17
Rice's Theorem (1953)
  • Examples:
    • does program P always halt?
    • is the value of integer variable x always positive?
    • does variable x always have the same value?
    • which variables can pointer p point to?
    • does expression E always evaluate to true?
    • what are the possible outputs of program P?

"Any interesting problem about the runtime behavior of a program *) is undecidable" -- Rice's Theorem paraphrased (1953)
*) written in a Turing-complete language
reduction
18
solve always-pos ⇒ solve halts
  • Assume x-is-always-pos(P) is decidable
  • Given P (here's how we could solve halts(P)):
    • Construct (clever) reduction program R:

        -- R.java --
        int x = 1;
        P   /* insert program P here :-) */
        x = -1;

    • Run supposedly decidable analysis:
      • res = x-is-always-positive(R)
    • Deduce from result:
      • if (res) then P loops! else P halts :-)
  • THUS: x-is-always-pos(P) must be undecidable!
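The same reduction can be written as a function that turns any always-positive analysis into a halting decider. This is a sketch only: programs are plain source strings, P is assumed not to mention x itself, and the names `AlwaysPosReduction`, `buildR` and `haltsVia` are hypothetical. No real analysis can be plugged in, which is exactly the point of the argument.

```java
import java.util.function.Predicate;

/** Sketch of "solve always-pos => solve halts"; programs are plain source strings. */
final class AlwaysPosReduction {

    /** Wraps P exactly as in R.java: x only becomes -1 if P terminates. */
    static String buildR(String programP) {
        return "int x = 1;\n" + programP + "\nx = -1;";
    }

    /** Turns a (hypothetical) always-positive analysis into a halting decider. */
    static Predicate<String> haltsVia(Predicate<String> xIsAlwaysPositive) {
        // res == true  means x never becomes -1, i.e. P loops;
        // res == false means "x = -1" is reached, i.e. P halts.
        return programP -> !xIsAlwaysPositive.test(buildR(programP));
    }

    public static void main(String[] args) {
        // Show the constructed reduction program R for a sample P.
        System.out.println(buildR("while (true) { }"));
    }
}
```

Since `haltsVia` would decide termination, and termination is undecidable, no correct `xIsAlwaysPositive` can exist.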
19
Reduction Principle
  • Reduction principle (in short):

      φ(P) undecidable  ∧  (solve ψ(P) ⇒ solve φ(P))      [the second premise is the reduction]
      ------------------------------------------------
      ψ(P) undecidable

  • Example:

      halts(P) undecidable  ∧  (solve x-is-always-pos(P) ⇒ solve halts(P))
      ------------------------------------------------
      x-is-always-pos(P) undecidable

  • Exercise:
    • Carry out reduction + whole explanation for:
      • "which variables can pointer p point to?"
20
Answer
  • Assume which-var-q-points-to(P) is decidable
  • Given P (here's how to (cleverly) decide halts(P)):
    • Construct (clever) reduction program R:

        -- R.java --
        ptr q = 0xffff;
        P   /* insert program P */
        q = null;

    • Run res = which-var-q-points-to(R)
    • If (null ∈ res) P halts! else P loops! :-)
  • THUS:
    • which-var-q-points-to(P) must be undecidable!
21
Undecidability
  • Undecidability means that...
    • ...no-one can decide this line (for all programs)!
  • However(!)...

[Diagram: the set of all programs, divided by a line marked "?" into "loops" and "halts"]
22
Side-Stepping Undecidability
  • Compilers use safe approximations (computed via static analyses) such that:

However, just because it's undecidable doesn't mean there aren't (good) approximations! Indeed, the whole area of static analysis works on side-stepping undecidability.

[Diagrams: two safe approximations of the line between "error" and "ok". In the first, the analysis answers "Okay!" or "Dunno?"; in the second, it answers "Error!" or "Dunno?".]
23
Side-Stepping Undecidability
  • Unsafe approximation
  • For testing it may be okay to abandon safety
    and use unsafe approximations

[Diagram: an unsafe approximation drawn across the line between "loops"/"error" and "halts"/"ok"; the analysis reports "Here are some programs for you to (manually) consider!"]
24
Slack
  • Undecidability means there'll always be a slack
  • However, still useful (possible interpretations of "Dunno?"):
    • Treat as error (i.e., reject program)
      • "Sorry, program not accepted!"
    • Treat as warning (i.e., warn programmer)
      • "Here are some potential problems:"

[Diagram: the line between "loops" and "halts" with a safe approximation beside it; programs in the slack between the two get "Dunno?", the rest get "Okay!"]
25
Soundness & Completeness
  • Soundness:
    • Analysis reports no errors ⇒ Really are no errors
  • Completeness:
    • Analysis reports an error ⇒ Really is an error

[Diagrams: a sound analysis and a complete analysis, each drawn as an approximation of the line between "error" and "ok"]

...or alternative (equivalent) formulation, via contra-position: P ⇒ Q ≡ ¬Q ⇒ ¬P
  • Soundness: Really are error(s) ⇒ Analysis reports error(s)
  • Completeness: Really no error(s) ⇒ Analysis reports no error(s)
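On a finite test suite where the truth is known, both properties reduce to set inclusions. The following is a minimal sketch; the class `SoundComplete` and the program names p1..p3 are made up for illustration, and for real programs the set of truly erroneous ones is not computable in general.

```java
import java.util.Set;

/** Checks soundness/completeness of an analysis on a finite, known test suite. */
final class SoundComplete {

    /** Sound: every program that really has an error is reported. */
    static boolean isSound(Set<String> reallyErroneous, Set<String> reported) {
        return reported.containsAll(reallyErroneous);
    }

    /** Complete: every reported program really has an error. */
    static boolean isComplete(Set<String> reallyErroneous, Set<String> reported) {
        return reallyErroneous.containsAll(reported);
    }

    public static void main(String[] args) {
        Set<String> bad = Set.of("p2", "p3");               // programs with real errors
        Set<String> overApprox = Set.of("p1", "p2", "p3");  // "Dunno?" treated as error
        System.out.println(isSound(bad, overApprox));       // true
        System.out.println(isComplete(bad, overApprox));    // false
    }
}
```

An analysis that treats every "Dunno?" as an error stays sound but gives up completeness, which is the trade-off a compiler typically accepts.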

26
Example Type Checking
  • Will this program have type error (when run)?

      void f() {
        var b;
        if (<EXP>) {
          b = 42;
        } else {
          b = true;
        }
        /* some code */
        if (b) ... // error if b is '42'
      }

  • Undecidable (because of reduction):
    • Type error ⇔ <EXP> evaluates to true
    • i.e., what <EXP> evaluates to (when run) is undecidable (in all cases)
27
Example Type Checking
  • Hence, languages use static requirements:
    • All variables must be declared
    • And have only one type (throughout the program)
  • This is (very) easy to check (i.e., "type-checking")

      void f() {
        bool b; // instead of 'var b'
        /* some code */
        if (<EXP>) {
          b = 42;    // static compiler error, regardless of what <EXP> evaluates to when run
        } else {
          b = true;
        }
        /* some more code */
      }

29
5 Crash Course on Data-Flow Analysis
( AMP'09: Advanced Models & Programs, 2009 )
  • Claus Brabrand
  • IT University of Copenhagen
  • ( brabrand_at_itu.dk )

30
Data-Flow Analysis
  • IDEA:
    • "Simulate runtime execution at compile-time using abstract values"
  • We (only) need 3 things:
    • A control-flow graph
    • A lattice
    • Transfer functions
  • Example: (integer) constant propagation
31
Control-flow graph
  • We (only) need 3 things:
    • A control-flow graph
    • A lattice
    • Transfer functions

Given program:

    int x = 1;
    int y = 3;
    if (...) x = x+2;
    else x <-> y;
    print(x,y);

Control-flow graph:

    [int x = 1] → [int y = 3] → (... ?)
    (... ?) --true--→ [x = x+2] → [print(x,y)]
    (... ?) --false-→ [x <-> y] → [print(x,y)]
32
A Lattice
  • We (only) need 3 things:
    • A control-flow graph
    • A lattice
    • Transfer functions
  • Lattice L of abstract values of interest and their relationships (i.e. ordering "⊑")
  • Induces least-upper-bound operator "⊔"
    • for combining information

                    ⊤          (top: "could be anything")
      ... -3 -2 -1  0  1  2  3 ...
                    ⊥          (bottom: "we haven't analyzed yet")
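One possible encoding of this flat lattice is sketched below. The class name `ConstLattice` and its methods are assumptions made for illustration, not code from the lecture: bottom is below every constant, every constant is below top, and two different constants are unrelated, so their least upper bound is top.

```java
import java.util.Objects;

/** Flat lattice for integer constant propagation: bottom, the constants, top. */
final class ConstLattice {
    enum Kind { BOTTOM, CONST, TOP }

    static final ConstLattice BOTTOM = new ConstLattice(Kind.BOTTOM, 0);
    static final ConstLattice TOP = new ConstLattice(Kind.TOP, 0);

    final Kind kind;
    final int value; // only meaningful when kind == CONST (0 otherwise)

    private ConstLattice(Kind kind, int value) {
        this.kind = kind;
        this.value = value;
    }

    static ConstLattice of(int n) {
        return new ConstLattice(Kind.CONST, n);
    }

    /** The ordering: bottom is below everything, everything is below top. */
    boolean leq(ConstLattice other) {
        return kind == Kind.BOTTOM || other.kind == Kind.TOP || equals(other);
    }

    /** Least upper bound, used to combine information from several paths. */
    ConstLattice join(ConstLattice other) {
        if (this.leq(other)) return other;
        if (other.leq(this)) return this;
        return TOP; // two different constants
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ConstLattice)) return false;
        ConstLattice c = (ConstLattice) o;
        return kind == c.kind && value == c.value;
    }

    @Override
    public int hashCode() {
        return Objects.hash(kind, value);
    }

    @Override
    public String toString() {
        return kind == Kind.CONST ? Integer.toString(value) : kind.name();
    }

    public static void main(String[] args) {
        System.out.println(of(1).join(of(3)));   // TOP
        System.out.println(of(3).join(of(3)));   // 3
        System.out.println(BOTTOM.join(of(7)));  // 7
    }
}
```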
33
Data-Flow Analysis
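  • We (only) need 3 things:
    • A control-flow graph
    • A lattice
    • Transfer functions

Transfer functions and the abstract environment E (one abstract value per variable) after each node:

    Node          Transfer function                     x    y
    int x = 1     λE . E[x ↦ 1]                         1
    int y = 3     λE . E[y ↦ 3]                         1    3
    ...           (branch condition, E unchanged)       1    3
    x = x+2       λE . E[x ↦ E(x) ⊕ 2]                  3    3
    x <-> y       λE . E[x ↦ E(y), y ↦ E(x)]            3    1
    print(x,y)    (input is the ⊔ of both branches)     3    ⊤

Here ⊕ is addition on abstract values. At print(x,y), x is the constant 3 on both paths, whereas y is 3 on one path and 1 on the other, so y becomes ⊤ ("could be anything").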
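The running example can be replayed mechanically. The sketch below is one possible encoding, not the lecture's implementation: `Val`, `update`, `join` and `analyze` are hypothetical names, variables are assumed to start at bottom, and because this control-flow graph has no loops a single pass suffices (a graph with loops would need iteration to a fixed point).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Constant propagation on the slide's example, one transfer function per node. */
final class ConstPropExample {

    /** Abstract value: BOTTOM, a known constant, or TOP. */
    static final class Val {
        static final Val BOTTOM = new Val(null, "bottom");
        static final Val TOP = new Val(null, "top");
        final Integer constant; // null for BOTTOM and TOP
        final String name;

        private Val(Integer constant, String name) {
            this.constant = constant;
            this.name = name;
        }

        static Val of(int n) { return new Val(n, Integer.toString(n)); }

        boolean isConst() { return constant != null; }

        /** Least upper bound of two abstract values. */
        Val join(Val o) {
            if (this == BOTTOM) return o;
            if (o == BOTTOM) return this;
            if (isConst() && o.isConst() && constant.equals(o.constant)) return this;
            return TOP;
        }

        /** Abstract addition of a literal; BOTTOM and TOP are left unchanged. */
        Val plus(int n) { return isConst() ? of(constant + n) : this; }

        @Override public String toString() { return name; }
    }

    /** E[var -> v]: a copy of the environment with one variable updated. */
    static Map<String, Val> update(Map<String, Val> env, String var, Val v) {
        Map<String, Val> out = new HashMap<>(env);
        out.put(var, v);
        return out;
    }

    /** Pointwise join of two environments over the same variables. */
    static Map<String, Val> join(Map<String, Val> a, Map<String, Val> b) {
        Map<String, Val> out = new HashMap<>();
        for (String var : a.keySet()) out.put(var, a.get(var).join(b.get(var)));
        return out;
    }

    /** Returns the abstract environment that reaches print(x,y). */
    static Map<String, Val> analyze() {
        Map<String, Val> entry = new HashMap<>();
        entry.put("x", Val.BOTTOM); // assumption: nothing known at entry
        entry.put("y", Val.BOTTOM);

        UnaryOperator<Map<String, Val>> setX = e -> update(e, "x", Val.of(1));            // int x = 1
        UnaryOperator<Map<String, Val>> setY = e -> update(e, "y", Val.of(3));            // int y = 3
        UnaryOperator<Map<String, Val>> addTwo = e -> update(e, "x", e.get("x").plus(2)); // x = x+2
        UnaryOperator<Map<String, Val>> swap =                                            // x <-> y
            e -> update(update(e, "x", e.get("y")), "y", e.get("x"));

        Map<String, Val> beforeIf = setY.apply(setX.apply(entry));
        Map<String, Val> thenOut = addTwo.apply(beforeIf); // x = 3, y = 3
        Map<String, Val> elseOut = swap.apply(beforeIf);   // x = 3, y = 1
        return join(thenOut, elseOut);                     // x = 3, y = top
    }

    public static void main(String[] args) {
        Map<String, Val> atPrint = analyze();
        System.out.println("x = " + atPrint.get("x") + ", y = " + atPrint.get("y"));
    }
}
```

Running it prints `x = 3, y = top`: a compiler could safely replace x by 3 in print(x,y), but must leave y alone.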

35
Control Structures
  • Control Structures:
    • Statements (or Exprs) that affect "flow of control"
  • if-else:
    • syntax: if ( Exp ) Stm1 else Stm2
    • semantics: The expression must be of type boolean; if it evaluates to true, Statement-1 is executed, otherwise Statement-2 is executed.
  • if:
    • syntax: if ( Exp ) Stm
    • semantics: The expression must be of type boolean; if it evaluates to true, the given statement is executed, otherwise not.
36
Control Structures (cont'd)
  • while:
    • syntax: while ( Exp ) Stm
    • semantics: The expression must be of type boolean; if it evaluates to false, the given statement is skipped, otherwise it is executed and afterwards the expression is evaluated again. If it is still true, the statement is executed again. This is continued until the expression evaluates to false.
  • for:
    • syntax: for (Exp1; Exp2; Exp3) Stm
    • semantics: Equivalent to { Exp1; while ( Exp2 ) { Stm Exp3; } }
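A concrete instance of the for/while equivalence, as a small sketch (the class `ForAsWhile` and the summing loop are made up for illustration): both methods compute the same result, so a control-flow graph only ever needs to model the while form.

```java
/** Shows the for-loop desugaring on one concrete loop. */
final class ForAsWhile {

    static int sumWithFor(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    static int sumWithWhile(int n) {
        int sum = 0;
        {
            int i = 0;          // Exp1
            while (i < n) {     // Exp2
                sum += i;       // Stm
                i++;            // Exp3
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithFor(5));   // 10
        System.out.println(sumWithWhile(5)); // 10
    }
}
```

One caveat in Java: if Stm contains `continue`, the for loop still executes Exp3 while the naive while translation would skip it, so the equivalence holds only for bodies without `continue`.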
37
Control-flow graph
Given program:

    int x = 1;
    int y = 3;
    if (a>b) x = x+2;
    else x <-> y;
    print(x,y);

Control-flow graph:

    [int x = 1] → [int y = 3] → (a>b ?)
    (a>b ?) --true--→ [x = x+2] → [print(x,y)]
    (a>b ?) --false-→ [x <-> y] → [print(x,y)]
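In code, such a graph is often just a map from each node to its successors. The sketch below assumes string labels as node identities and uses hypothetical names (`CfgExample`, `build`, `confluencePoints`); it also finds the confluence points, i.e. the nodes where several control-flow paths merge and information has to be combined.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** The example control-flow graph as successor lists keyed by node label. */
final class CfgExample {

    static Map<String, List<String>> build() {
        Map<String, List<String>> succ = new LinkedHashMap<>();
        succ.put("int x = 1", List.of("int y = 3"));
        succ.put("int y = 3", List.of("a > b"));
        succ.put("a > b", List.of("x = x+2", "x <-> y")); // true edge, false edge
        succ.put("x = x+2", List.of("print(x,y)"));
        succ.put("x <-> y", List.of("print(x,y)"));
        succ.put("print(x,y)", List.of());
        return succ;
    }

    /** Nodes with more than one incoming edge are confluence points. */
    static List<String> confluencePoints(Map<String, List<String>> succ) {
        Map<String, Integer> incoming = new LinkedHashMap<>();
        for (List<String> targets : succ.values()) {
            for (String target : targets) {
                incoming.merge(target, 1, Integer::sum);
            }
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : incoming.entrySet()) {
            if (entry.getValue() > 1) result.add(entry.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(confluencePoints(build())); // [print(x,y)]
    }
}
```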
38
Exercise: Draw a Control-Flow Graph for:

    public static void main ( String[] args ) {
      int mi, ma;
      if (args.length == 0)
        System.out.println("No numbers");
      else {
        mi = ma = Integer.parseInt(args[0]);
        for (int i=1; i < args.length; i++) {
          int obs = Integer.parseInt(args[i]);
          if (obs > ma) ma = obs;
          else if (obs < mi) mi = obs;
        }
        System.out.println("min=" + mi + "," + "max=" + ma);
      }
    }
39
Control-Flow Graph
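  • CFG (statements in [ ], conditions in ( )):

    [int mi, ma] → (args.length == 0 ?)
    (args.length == 0 ?) --true--→ [System.out.println("No numbers")] → exit
    (args.length == 0 ?) --false-→ [mi = ma = Integer.parseInt(args[0])] → [int i=1] → (i < args.length ?)
    (i < args.length ?) --true--→ [int obs = Integer.parseInt(args[i])] → (obs > ma ?)
    (i < args.length ?) --false-→ [System.out.println("min=" + mi + "," + "max=" + ma)] → exit
    (obs > ma ?) --true--→ [ma = obs] → [i++]
    (obs > ma ?) --false-→ (obs < mi ?)
    (obs < mi ?) --true--→ [mi = obs] → [i++]
    (obs < mi ?) --false-→ [i++]
    [i++] → (i < args.length ?)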
40
Control Structures (cont'd 2)
  • do-while: do Stm while ( Exp );
  • "?:" conditional expression: Exp1 ? Exp2 : Exp3
  • "||" lazy disjunction (aka., short-cut "∨"): Exp1 || Exp2
  • "&&" lazy conjunction (aka., short-cut "∧"): Exp1 && Exp2
  • switch: switch ( Exp ) { Swb }
    • Swb: case Exp : Stm break;
    • Swb: default : Stm break;

(Exercise)
41
Control Structures (cont'd 3)
  • try-catch-finally (exceptions): try Stm1 catch ( Exp ) Stm2 finally Stm3
  • return / break / continue: return; | return Exp; | break; | continue;
  • method invocation
    • e.g. f(x)
  • recursive method invocation
    • e.g. f(x)
  • virtual dispatching
    • e.g. f(x)
42
Control Structures (cont'd 4)
  • function pointers
    • e.g. (*f)(x)
  • higher-order functions
    • e.g. λf.λx.(f x)
  • dynamic evaluation
    • e.g. eval(some-string-which-has-been-dynamically-computed)
  • Some constructions (and thus languages) require a separate control-flow analysis for determining control-flow in order to do data-flow analysis