Title: Control Flow Analysis
1. Control Flow Analysis
- Mooly Sagiv
- http://www.math.tau.ac.il/sagiv/courses/pa.html
- Tel Aviv University
- 640-6706
- Sunday 18-21 Schreiber 8
- Monday 10-12 Schreiber 317
- Textbook Chapter 3 (Simplified OO)
2. Goals
- Understand the problem of Control Flow Analysis
  - In Functional Languages
  - In Object Oriented Languages
  - Function Pointers
- Learn Constraint Based Program Analysis Technique
  - General
  - Usage for Control Flow Analysis
  - Algorithms
  - Systems
- Similarities between Problems and Techniques
3. Outline
- A Motivating Example (OO)
- The Control Flow Analysis Problem
- A Formal Specification
- Set Constraints
- Solving Constraints
- Adding Dataflow information
- Adding Context Information
- Back to the Motivating Example
- Conclusions
4. A Motivating Example
    class Vehicle : Object { int position = 10;
      void move(x1 : int) {
        position = position + x1; } }
    class Car extends Vehicle { int passengers;
      void await(v : Vehicle) {
        if (v.position < position)
        then v.move(position - v.position)
        else self.move(10); } }
    class Truck extends Vehicle {
      void move(x2 : int) {
        if (x2 < 55) position = position + x2; } }
    void main { Car c; Truck t; Vehicle v1;
      new c;
      new t;
      v1 = c;
      c.passengers = 2;
      c.move(60);
      v1.move(70);
      c.await(t); }
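A minimal Python rendering of the program above, to make the dispatch question concrete. It is a sketch under assumptions: the operators lost from the slide are taken to be `=`, `+` and `<`, `passengers` is given a default of 0, `await` is renamed `await_` because it is reserved in Python, and the `log` list is a hypothetical addition that records which `move` body actually runs.

```python
log = []  # hypothetical trace of the move() bodies that are executed


class Vehicle:
    def __init__(self):
        self.position = 10

    def move(self, x1):
        log.append("Vehicle.move")
        self.position = self.position + x1


class Car(Vehicle):
    def __init__(self):
        super().__init__()
        self.passengers = 0  # assumed default; the slide leaves it uninitialized

    def await_(self, v):
        if v.position < self.position:
            # Which move() runs here depends on the run-time class of v.
            v.move(self.position - v.position)
        else:
            self.move(10)


class Truck(Vehicle):
    def move(self, x2):
        log.append("Truck.move")
        if x2 < 55:
            self.position = self.position + x2


def main():
    c, t = Car(), Truck()
    v1 = c
    c.passengers = 2
    c.move(60)
    v1.move(70)
    c.await_(t)
    return c, t


c, t = main()
print(log)
```

Under these assumptions the only call whose target is not evident from the declared type is `v.move(...)` inside `await`: the parameter is declared as a `Vehicle`, but the body that runs is the one in `Truck`.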
5. The Control Flow Analysis (CFA) Problem
- Given a program in a functional programming language with higher-order functions (functions can serve as parameters and return values)
- Find out, for each function invocation, which functions may be applied
- Obvious in C without function pointers
- Difficult in C, Java and ML
- The Dynamic Dispatch Problem
6. An ML Example
    let f = fn x => x 1;
        g = fn y => y + 2;
        h = fn z => z + 3
    in (f g) + (f h)
7. An ML Example
    let f = fn x => /* g, h */ x 1;
        g = fn y => y + 2;
        h = fn z => z + 3
    in (f g) + (f h)
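The annotation on the call `x 1` can be observed by running the program. The following Python sketch assumes the lost operators are `=>` and `+`; the `applied` list is a hypothetical instrument added here to record which function reaches the call site.

```python
applied = []  # hypothetical log of the functions applied at the call `x 1`


def f(x):
    applied.append(x.__name__)  # record the callee before applying it
    return x(1)


def g(y):
    return y + 2


def h(z):
    return z + 3


result = f(g) + f(h)
print(applied, result)
```

A control flow analysis has to predict the contents of `applied` without running the program, and it may over-approximate.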
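8. The Language FUN
- Notations
  - $e \in \mathbf{Exp}$ // expressions (or labeled terms)
  - $t \in \mathbf{Term}$ // terms (or unlabeled terms)
  - $f, x \in \mathbf{Var}$ // variables
  - $c \in \mathbf{Const}$ // constants
  - $op \in \mathbf{Op}$ // binary operators
  - $l \in \mathbf{Lab}$ // labels
- Abstract Syntax
  - $e ::= t^l$
  - $t ::= c \mid x$
  - $\quad \mid \mathtt{fn}\ x \Rightarrow e$ // function definition
  - $\quad \mid \mathtt{fun}\ f\ x \Rightarrow e$ // recursive function definition
  - $\quad \mid e_1\ e_2$ // function application
  - $\quad \mid \mathtt{if}\ e_0\ \mathtt{then}\ e_1\ \mathtt{else}\ e_2$
  - $\quad \mid \mathtt{let}\ x = e_1\ \mathtt{in}\ e_2$
  - $\quad \mid e_1\ op\ e_2$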
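One possible machine representation of labeled FUN expressions, sketched in Python. The encoding (nested tuples whose last component is the label) and the constructor names are assumptions made for illustration, not part of the slides.

```python
# Labeled FUN expressions as nested tuples; the last component is the label.
def const(c, l):          return ("const", c, l)
def var(x, l):            return ("var", x, l)
def fn(x, body, l):       return ("fn", x, body, l)
def fun(f, x, body, l):   return ("fun", f, x, body, l)
def app(e1, e2, l):       return ("app", e1, e2, l)
def if_(e0, e1, e2, l):   return ("if", e0, e1, e2, l)
def let(x, e1, e2, l):    return ("let", x, e1, e2, l)
def op(o, e1, e2, l):     return ("op", o, e1, e2, l)


def subexprs(e):
    """Yield e and every labeled subexpression of e."""
    yield e
    for part in e[1:-1]:
        if isinstance(part, tuple):  # names, constants and operators are not tuples
            yield from subexprs(part)


def labels(e):
    """Sorted list of the labels occurring in e."""
    return sorted(s[-1] for s in subexprs(e))


# ((fn x => x^1)^2 (fn y => y^3)^4)^5
example = app(fn("x", var("x", 1), 2), fn("y", var("y", 3), 4), 5)
print(labels(example))
```

Tuples are hashable, so a function term such as `fn("x", var("x", 1), 2)` can be stored directly in the sets that an analysis manipulates.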
9. A Simple Example
    ((fn x => x^1)^2 (fn y => y^3)^4)^5
10. An Example which Loops
    (let g = (fun f x => (f^1 (fn y => y^2)^3)^4)^5
     in (g^6 (fn z => z^7)^8)^9)^10
11. The 0-CFA Problem
- Compute for every program a pair $(C, \rho)$ where
  - $C$ is the abstract cache associating abstract values with labeled program points
  - $\rho$ is the abstract environment associating abstract values with variables
- Formally
  - $v \in \mathbf{Val} = \mathcal{P}(\mathbf{Term})$ // abstract values
  - $\rho \in \mathbf{Env} = \mathbf{Var} \to \mathbf{Val}$ // abstract environment
  - $C \in \mathbf{Cache} = \mathbf{Lab} \to \mathbf{Val}$ // abstract cache
- For a function application $(t_1^{l_1}\ t_2^{l_2})^l$, $C(l_1)$ determines the functions that can be applied
- These maps are finite for a given program
- No context is considered for parameters
12. Possible Solutions for ((fn x => x^1)^2 (fn y => y^3)^4)^5
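13. (let g = (fun f x => (f^1 (fn y => y^2)^3)^4)^5 in (g^6 (fn z => z^7)^8)^9)^10
- Shorthand: sf = fun f x => (f^1 (fn y => y^2)^3)^4, idy = fn y => y^2, idz = fn z => z^7
- $C(1) = \{sf\}$, $C(2) = \emptyset$, $C(3) = \{idy\}$, $C(4) = \emptyset$
- $C(5) = \{sf\}$, $C(6) = \{sf\}$, $C(7) = \emptyset$, $C(8) = \{idz\}$
- $C(9) = \emptyset$, $C(10) = \emptyset$, $\rho(x) = \{idy, idz\}$
- $\rho(y) = \emptyset$, $\rho(z) = \emptyset$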
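The least solution for this program can be recomputed mechanically. The sketch below is a naive fixed-point loop, not the efficient algorithm: it assumes expressions are encoded as nested tuples with the label as last component (an encoding chosen here for illustration) and re-applies the 0-CFA inclusions until nothing changes.

```python
from collections import defaultdict


def analyse(program):
    """Least (C, rho) for program, by re-applying the 0-CFA inclusions to a fixed point."""
    C, rho = defaultdict(set), defaultdict(set)
    changed = True

    def include(src, dst):
        nonlocal changed
        if not src <= dst:
            dst |= src
            changed = True

    def visit(e):
        tag, l = e[0], e[-1]
        if tag == "var":                      # ("var", x, l)
            include(rho[e[1]], C[l])
        elif tag == "fn":                     # ("fn", x, body, l)
            include({e}, C[l])
            visit(e[2])
        elif tag == "fun":                    # ("fun", f, x, body, l)
            include({e}, C[l])
            include({e}, rho[e[1]])
            visit(e[3])
        elif tag == "app":                    # ("app", e1, e2, l)
            e1, e2 = e[1], e[2]
            visit(e1)
            visit(e2)
            for t in list(C[e1[-1]]):         # every function that may reach l1
                param, body = t[-3], t[-2]
                include(C[e2[-1]], rho[param])
                include(C[body[-1]], C[l])
        elif tag == "if":                     # ("if", e0, e1, e2, l)
            for sub in e[1:4]:
                visit(sub)
            include(C[e[2][-1]], C[l])
            include(C[e[3][-1]], C[l])
        elif tag == "let":                    # ("let", x, e1, e2, l)
            visit(e[2])
            visit(e[3])
            include(C[e[2][-1]], rho[e[1]])
            include(C[e[3][-1]], C[l])
        elif tag == "op":                     # ("op", o, e1, e2, l)
            visit(e[2])
            visit(e[3])
        # constants contribute nothing in plain 0-CFA

    while changed:
        changed = False
        visit(program)
    return C, rho


idy = ("fn", "y", ("var", "y", 2), 3)
idz = ("fn", "z", ("var", "z", 7), 8)
sf = ("fun", "f", "x", ("app", ("var", "f", 1), idy, 4), 5)
prog = ("let", "g", sf, ("app", ("var", "g", 6), idz, 9), 10)

C, rho = analyse(prog)
for label in range(1, 11):
    print(label, [t[0] + " " + t[-3] for t in C[label]])
```

Labels 2, 4, 7, 9 and 10 stay empty because the recursive function never returns, so no value ever reaches the body of `idy`, the body of `idz`, or the result positions.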
14. Relationship to Dataflow Analysis
- Expressions are side effect free
  - no entry/exit
- A single environment
  - Represents information at different points via maps
  - A single value for all occurrences of a variable
- Function applications act similarly to assignments
  - Definition: function abstraction is created
  - Use: function is applied
15. A Formal Specification of 0-CFA
- A Boolean function $\models$ defines when a solution is acceptable
- $(C, \rho) \models e$ means that $(C, \rho)$ is acceptable for the expression $e$
- Define $\models$ by structural induction on $e$
- Every function is analyzed once
- Every acceptable solution is sound (conservative)
- Many acceptable solutions
- Generate a set of constraints
- Obtain the least acceptable solution by solving the constraints
16. Syntax Directed 0-CFA (Simple Expressions)
- [const] $(C, \rho) \models c^l$ always
- [var] $(C, \rho) \models x^l$ if $\rho(x) \subseteq C(l)$
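17. Syntax Directed 0-CFA: Function Abstraction
- [fn] $(C, \rho) \models (\mathtt{fn}\ x \Rightarrow e)^l$ if $(C, \rho) \models e$ and $\{\mathtt{fn}\ x \Rightarrow e\} \subseteq C(l)$
- [fun] $(C, \rho) \models (\mathtt{fun}\ f\ x \Rightarrow e)^l$ if $(C, \rho) \models e$ and $\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq C(l)$ and $\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \rho(f)$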
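18. Syntax Directed 0-CFA: Function Application
- [app] $(C, \rho) \models (t_1^{l_1}\ t_2^{l_2})^l$ if
  - $(C, \rho) \models t_1^{l_1}$ and $(C, \rho) \models t_2^{l_2}$
  - for all $(\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in C(l_1)$: $C(l_2) \subseteq \rho(x)$ and $C(l_0) \subseteq C(l)$
  - for all $(\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in C(l_1)$: $C(l_2) \subseteq \rho(x)$ and $C(l_0) \subseteq C(l)$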
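19. Syntax Directed 0-CFA: Other Constructs
- [if] $(C, \rho) \models (\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l$ if $(C, \rho) \models t_0^{l_0}$ and $(C, \rho) \models t_1^{l_1}$ and $(C, \rho) \models t_2^{l_2}$ and $C(l_1) \subseteq C(l)$ and $C(l_2) \subseteq C(l)$
- [let] $(C, \rho) \models (\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l$ if $(C, \rho) \models t_1^{l_1}$ and $(C, \rho) \models t_2^{l_2}$ and $C(l_1) \subseteq \rho(x)$ and $C(l_2) \subseteq C(l)$
- [op] $(C, \rho) \models (t_1^{l_1}\ op\ t_2^{l_2})^l$ if $(C, \rho) \models t_1^{l_1}$ and $(C, \rho) \models t_2^{l_2}$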
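The acceptability relation can be read directly as a recursive checker. The following Python sketch assumes expressions are encoded as nested tuples with the label as last component (an illustrative encoding) and that `C` and `rho` are dictionaries in which a missing key stands for the empty set.

```python
def acceptable(C, rho, e):
    """True iff (C, rho) is acceptable for e under the syntax-directed 0-CFA clauses."""
    get = lambda m, k: m.get(k, set())
    tag, l = e[0], e[-1]
    if tag == "const":
        return True
    if tag == "var":                                   # ("var", x, l)
        return get(rho, e[1]) <= get(C, l)
    if tag == "fn":                                    # ("fn", x, body, l)
        return acceptable(C, rho, e[2]) and e in get(C, l)
    if tag == "fun":                                   # ("fun", f, x, body, l)
        return (acceptable(C, rho, e[3])
                and e in get(C, l) and e in get(rho, e[1]))
    if tag == "app":                                   # ("app", e1, e2, l)
        e1, e2 = e[1], e[2]
        return (acceptable(C, rho, e1) and acceptable(C, rho, e2)
                and all(get(C, e2[-1]) <= get(rho, t[-3])       # argument to parameter
                        and get(C, t[-2][-1]) <= get(C, l)      # body result to call result
                        for t in get(C, e1[-1])))
    if tag == "if":                                    # ("if", e0, e1, e2, l)
        return (all(acceptable(C, rho, s) for s in e[1:4])
                and get(C, e[2][-1]) <= get(C, l)
                and get(C, e[3][-1]) <= get(C, l))
    if tag == "let":                                   # ("let", x, e1, e2, l)
        return (acceptable(C, rho, e[2]) and acceptable(C, rho, e[3])
                and get(C, e[2][-1]) <= get(rho, e[1])
                and get(C, e[3][-1]) <= get(C, l))
    if tag == "op":                                    # ("op", o, e1, e2, l)
        return acceptable(C, rho, e[2]) and acceptable(C, rho, e[3])
    raise ValueError(tag)


# ((fn x => x^1)^2 (fn y => y^3)^4)^5
idx = ("fn", "x", ("var", "x", 1), 2)
idy = ("fn", "y", ("var", "y", 3), 4)
e = ("app", idx, idy, 5)

least = ({1: {idy}, 2: {idx}, 3: set(), 4: {idy}, 5: {idy}}, {"x": {idy}, "y": set()})
too_small = ({1: {idy}, 2: {idx}, 3: set(), 4: {idy}, 5: set()}, {"x": {idy}, "y": set()})
everything = ({l: {idx, idy} for l in range(1, 6)}, {"x": {idx, idy}, "y": {idx, idy}})

for name, (C, rho) in [("least", least), ("too_small", too_small), ("everything", everything)]:
    print(name, acceptable(C, rho, e))
```

The three guesses illustrate the bullets on acceptability: `least` and `everything` are both acceptable (the latter is sound but imprecise), while `too_small` is rejected because the result of the body at label 1 does not flow to label 5.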
20. Possible Solutions for ((fn x => x^1)^2 (fn y => y^3)^4)^5
21. Set Constraints
- A set of rules of the form
  - $lhs \subseteq rhs$
  - $\{t\} \subseteq rhs' \implies lhs \subseteq rhs$ (conditional constraint)
- $lhs$, $rhs$, $rhs'$ are
  - terms
  - $C(l)$
  - $\rho(x)$
- The least solution $(C, \rho)$ can be found iteratively
  - start with empty sets
  - add terms when needed
- Efficient cubic graph-based solution
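The iterative idea (start with empty sets, add terms when needed) can be sketched as a worklist loop. This is a simplified illustration, not necessarily the cubic algorithm the slide refers to; the constraint encoding (`elem`, `sub`, `cond` tuples) and the node names are assumptions made here. The demonstration uses hand-written constraints for ((fn x => x^1)^2 (fn y => y^3)^4)^5.

```python
from collections import defaultdict, deque


def solve(constraints):
    """Least solution of set constraints given as tuples:
    ("elem", t, p)           {t} ⊆ p
    ("sub", p1, p2)          p1 ⊆ p2
    ("cond", t, p, p1, p2)   {t} ⊆ p  =>  p1 ⊆ p2
    """
    D = defaultdict(set)    # D[p]: terms currently known to be in p
    E = defaultdict(list)   # E[p]: constraints to revisit whenever D[p] grows
    W = deque()             # nodes whose set grew and still need processing

    def add(q, terms):
        if not terms <= D[q]:
            D[q] |= terms
            W.append(q)

    for c in constraints:
        if c[0] == "elem":
            add(c[2], {c[1]})
        elif c[0] == "sub":
            E[c[1]].append(c)
        else:
            _, t, p, p1, p2 = c
            E[p].append(c)      # may fire when the condition becomes true
            E[p1].append(c)     # or when the source set grows afterwards

    while W:
        q = W.popleft()
        for c in E[q]:
            if c[0] == "sub":
                add(c[2], D[q])
            else:
                _, t, p, p1, p2 = c
                if t in D[p]:
                    add(p2, D[p1])
    return D


idx, idy = "fn x => x^1", "fn y => y^3"
constraints = [
    ("elem", idx, "C(2)"), ("sub", "rho(x)", "C(1)"),
    ("elem", idy, "C(4)"), ("sub", "rho(y)", "C(3)"),
    ("cond", idx, "C(2)", "C(4)", "rho(x)"), ("cond", idx, "C(2)", "C(1)", "C(5)"),
    ("cond", idy, "C(2)", "C(4)", "rho(y)"), ("cond", idy, "C(2)", "C(3)", "C(5)"),
]
solution = solve(constraints)
print({p: sorted(v) for p, v in solution.items()})
```

Sets only ever grow, and each node is re-queued only when its set grows, so the loop terminates on any finite constraint set.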
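22. Syntax Directed Constraint Generation (Part I)
- $\mathcal{C}[\![c^l]\!] = \emptyset$
- $\mathcal{C}[\![x^l]\!] = \{\rho(x) \subseteq C(l)\}$
- $\mathcal{C}[\![(\mathtt{fn}\ x \Rightarrow e)^l]\!] = \mathcal{C}[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq C(l)\}$
- $\mathcal{C}[\![(\mathtt{fun}\ f\ x \Rightarrow e)^l]\!] = \mathcal{C}[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq C(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \rho(f)\}$
- $\mathcal{C}[\![(t_1^{l_1}\ t_2^{l_2})^l]\!] = \mathcal{C}[\![t_1^{l_1}]\!] \cup \mathcal{C}[\![t_2^{l_2}]\!]$
  - $\cup\ \{\{t\} \subseteq C(l_1) \implies C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}\}$
  - $\cup\ \{\{t\} \subseteq C(l_1) \implies C(l_0) \subseteq C(l) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}\}$
  - $\cup\ \{\{t\} \subseteq C(l_1) \implies C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}\}$
  - $\cup\ \{\{t\} \subseteq C(l_1) \implies C(l_0) \subseteq C(l) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}\}$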
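23. Syntax Directed Constraint Generation (Part II)
- $\mathcal{C}[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l]\!] = \mathcal{C}[\![t_0^{l_0}]\!] \cup \mathcal{C}[\![t_1^{l_1}]\!] \cup \mathcal{C}[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq C(l)\} \cup \{C(l_2) \subseteq C(l)\}$
- $\mathcal{C}[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l]\!] = \mathcal{C}[\![t_1^{l_1}]\!] \cup \mathcal{C}[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq \rho(x)\} \cup \{C(l_2) \subseteq C(l)\}$
- $\mathcal{C}[\![(t_1^{l_1}\ op\ t_2^{l_2})^l]\!] = \mathcal{C}[\![t_1^{l_1}]\!] \cup \mathcal{C}[\![t_2^{l_2}]\!]$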
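The generation rules of Parts I and II translate almost line by line into a recursive function. The sketch below assumes expressions are nested tuples with the label last, and emits constraints as `elem`, `sub` and `cond` tuples over nodes `("C", l)` and `("rho", x)`; all of these encodings are illustrative choices, not taken from the slides.

```python
def functions(e):
    """All fn/fun terms occurring in e (the function terms of the program)."""
    found = {e} if e[0] in ("fn", "fun") else set()
    for part in e[1:-1]:
        if isinstance(part, tuple):
            found |= functions(part)
    return found


def gen(e, fns):
    """Constraints for e: ("elem", t, p), ("sub", p1, p2), ("cond", t, p, p1, p2)."""
    C = lambda lab: ("C", lab)
    r = lambda x: ("rho", x)
    tag, l = e[0], e[-1]
    if tag == "const":
        return set()
    if tag == "var":
        return {("sub", r(e[1]), C(l))}
    if tag == "fn":
        return {("elem", e, C(l))} | gen(e[2], fns)
    if tag == "fun":
        return {("elem", e, C(l)), ("elem", e, r(e[1]))} | gen(e[3], fns)
    if tag == "app":
        e1, e2 = e[1], e[2]
        out = gen(e1, fns) | gen(e2, fns)
        for t in fns:  # one pair of conditional constraints per function term
            out.add(("cond", t, C(e1[-1]), C(e2[-1]), r(t[-3])))
            out.add(("cond", t, C(e1[-1]), C(t[-2][-1]), C(l)))
        return out
    if tag == "if":
        out = set().union(*(gen(s, fns) for s in e[1:4]))
        return out | {("sub", C(e[2][-1]), C(l)), ("sub", C(e[3][-1]), C(l))}
    if tag == "let":
        return (gen(e[2], fns) | gen(e[3], fns)
                | {("sub", C(e[2][-1]), r(e[1])), ("sub", C(e[3][-1]), C(l))})
    if tag == "op":
        return gen(e[2], fns) | gen(e[3], fns)
    raise ValueError(tag)


# ((fn x => x^1)^2 (fn y => y^3)^4)^5
idx = ("fn", "x", ("var", "x", 1), 2)
idy = ("fn", "y", ("var", "y", 3), 4)
example = ("app", idx, idy, 5)

constraints = gen(example, functions(example))
print(len(constraints))
```

Each application contributes two conditional constraints per function term in the program, so the number of constraints grows quadratically with program size in this sketch.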
24. Set Constraints for ((fn x => x^1)^2 (fn y => y^3)^4)^5
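Applying the constraint generation rules to this expression gives eight constraints. The following is a worked sketch; $id_x$ and $id_y$ are shorthand introduced here for fn x => x^1 and fn y => y^3.

```latex
\begin{align*}
\{id_x\} &\subseteq C(2) \\
\rho(x) &\subseteq C(1) \\
\{id_y\} &\subseteq C(4) \\
\rho(y) &\subseteq C(3) \\
\{id_x\} \subseteq C(2) &\implies C(4) \subseteq \rho(x) \\
\{id_x\} \subseteq C(2) &\implies C(1) \subseteq C(5) \\
\{id_y\} \subseteq C(2) &\implies C(4) \subseteq \rho(y) \\
\{id_y\} \subseteq C(2) &\implies C(3) \subseteq C(5)
\end{align*}
```

The first four come from the two abstractions and their variable occurrences; the four conditional constraints come from the single application at label 5, one pair per function term in the program.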
25. Iterative Solution to the Set Constraints for ((fn x => x^1)^2 (fn y => y^3)^4)^5
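One possible order of iteration, starting from empty sets and adding terms only when a constraint forces them. This is a worked sketch; $id_x$ and $id_y$ are shorthand introduced here for fn x => x^1 and fn y => y^3, and other processing orders reach the same final row.

```latex
\[
\begin{array}{c|ccccccc}
\text{step} & C(1) & C(2) & C(3) & C(4) & C(5) & \rho(x) & \rho(y) \\ \hline
0 & \emptyset & \emptyset & \emptyset & \emptyset & \emptyset & \emptyset & \emptyset \\
1 & \emptyset & \{id_x\} & \emptyset & \{id_y\} & \emptyset & \emptyset & \emptyset \\
2 & \emptyset & \{id_x\} & \emptyset & \{id_y\} & \emptyset & \{id_y\} & \emptyset \\
3 & \{id_y\} & \{id_x\} & \emptyset & \{id_y\} & \emptyset & \{id_y\} & \emptyset \\
4 & \{id_y\} & \{id_x\} & \emptyset & \{id_y\} & \{id_y\} & \{id_y\} & \emptyset
\end{array}
\]
```

Step 1 applies the unconditional constraints on $C(2)$ and $C(4)$. Step 2 fires $\{id_x\} \subseteq C(2) \implies C(4) \subseteq \rho(x)$, step 3 applies $\rho(x) \subseteq C(1)$, and step 4 fires $\{id_x\} \subseteq C(2) \implies C(1) \subseteq C(5)$. The constraints conditioned on $id_y \in C(2)$ never fire, so $C(3)$ and $\rho(y)$ remain empty.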
26. Adding Data Flow Information
- Dataflow values can affect control flow analysis
- Example:
    (let f = (fn x => (if (x^1 > 0^2)^3
                       then (fn y => y^4)^5
                       else (fn z => 5^6)^7
                      )^8
             )^9
     in ((f^10 3^11)^12 0^13)^14)^15
27. Adding Data Flow Information
- Add a finite set of abstract values per program: $\mathbf{Data}$
- Update $\mathbf{Val} = \mathcal{P}(\mathbf{Term} \cup \mathbf{Data})$
  - $\rho \in \mathbf{Env} = \mathbf{Var} \to \mathbf{Val}$ // abstract environment
  - $C \in \mathbf{Cache} = \mathbf{Lab} \to \mathbf{Val}$ // abstract cache
- Generate extra constraints for data
- Obtain a more precise solution
- A special case of a product domain (4.4)
- The combination of two analyses may be more precise than both
- For some programs it may even be more efficient
28. Adding Dataflow Information (Sign Analysis)
- Sign analysis
- Add a finite set of abstract values per program: $\mathbf{Data} = \{P, N, TT, FF\}$
- Update $\mathbf{Val} = \mathcal{P}(\mathbf{Term} \cup \mathbf{Data})$
- $d_c$ is the abstract value that represents a constant $c$
  - $d_3 = P$
  - $d_{-7} = N$
  - $d_{true} = TT$
  - $d_{false} = FF$
- Every operator is conservatively interpreted
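A small Python sketch of what "conservatively interpreted" can mean for the four abstract values P, N, TT, FF. It rests on assumptions not stated on the slide: P and N are read as strictly positive and strictly negative, zero is left outside the domain, and only `*` and `>` are modelled. The function names `d`, `abs_mul` and `abs_gt` are hypothetical.

```python
SIGNS = {"P", "N"}


def d(c):
    """Abstract value d_c of a constant c (zero is outside this four-value sketch)."""
    if c is True:
        return "TT"
    if c is False:
        return "FF"
    if c > 0:
        return "P"
    if c < 0:
        return "N"
    raise ValueError("0 has no abstract value in Data = {P, N, TT, FF}")


def abs_mul(a, b):
    """Conservative '*' on sets of abstract values; non-sign elements are ignored."""
    return {"P" if x == y else "N"
            for x in a & SIGNS for y in b & SIGNS}


def abs_gt(a, b):
    """Conservative '>' on sets of abstract values; non-sign elements are ignored."""
    out = set()
    for x in a & SIGNS:
        for y in b & SIGNS:
            if x == "P" and y == "N":
                out.add("TT")            # positive > negative always holds
            elif x == "N" and y == "P":
                out.add("FF")            # negative > positive never holds
            else:
                out |= {"TT", "FF"}      # same sign: either outcome is possible
    return out


print(d(3), d(-7), d(True), d(False))
print(abs_mul({"P"}, {"N"}), abs_gt({"P"}, {"P"}))
```

Whenever the signs alone do not decide the outcome, the abstract operator returns every possible result, which is what keeps the analysis sound.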
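29. Syntax Directed Constraint Generation (Part I)
- $\mathcal{C}[\![c^l]\!] = \{\{d_c\} \subseteq C(l)\}$
- $\mathcal{C}[\![x^l]\!] = \{\rho(x) \subseteq C(l)\}$
- $\mathcal{C}[\![(\mathtt{fn}\ x \Rightarrow e)^l]\!] = \mathcal{C}[\![e]\!] \cup \{\{\mathtt{fn}\ x \Rightarrow e\} \subseteq C(l)\}$
- $\mathcal{C}[\![(\mathtt{fun}\ f\ x \Rightarrow e)^l]\!] = \mathcal{C}[\![e]\!] \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq C(l)\} \cup \{\{\mathtt{fun}\ f\ x \Rightarrow e\} \subseteq \rho(f)\}$
- $\mathcal{C}[\![(t_1^{l_1}\ t_2^{l_2})^l]\!] = \mathcal{C}[\![t_1^{l_1}]\!] \cup \mathcal{C}[\![t_2^{l_2}]\!]$
  - $\cup\ \{\{t\} \subseteq C(l_1) \implies C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}\}$
  - $\cup\ \{\{t\} \subseteq C(l_1) \implies C(l_0) \subseteq C(l) \mid t = (\mathtt{fn}\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}\}$
  - $\cup\ \{\{t\} \subseteq C(l_1) \implies C(l_2) \subseteq \rho(x) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}\}$
  - $\cup\ \{\{t\} \subseteq C(l_1) \implies C(l_0) \subseteq C(l) \mid t = (\mathtt{fun}\ f\ x \Rightarrow t_0^{l_0}) \in \mathbf{Term}\}$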
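30. Syntax Directed Constraint Generation (Part II)
- $\mathcal{C}[\![(\mathtt{if}\ t_0^{l_0}\ \mathtt{then}\ t_1^{l_1}\ \mathtt{else}\ t_2^{l_2})^l]\!] = \mathcal{C}[\![t_0^{l_0}]\!] \cup \mathcal{C}[\![t_1^{l_1}]\!] \cup \mathcal{C}[\![t_2^{l_2}]\!]$
  - $\cup\ \{\{d_{true}\} \subseteq C(l_0) \implies C(l_1) \subseteq C(l)\}$
  - $\cup\ \{\{d_{false}\} \subseteq C(l_0) \implies C(l_2) \subseteq C(l)\}$
- $\mathcal{C}[\![(\mathtt{let}\ x = t_1^{l_1}\ \mathtt{in}\ t_2^{l_2})^l]\!] = \mathcal{C}[\![t_1^{l_1}]\!] \cup \mathcal{C}[\![t_2^{l_2}]\!] \cup \{C(l_1) \subseteq \rho(x)\} \cup \{C(l_2) \subseteq C(l)\}$
- $\mathcal{C}[\![(t_1^{l_1}\ op\ t_2^{l_2})^l]\!] = \mathcal{C}[\![t_1^{l_1}]\!] \cup \mathcal{C}[\![t_2^{l_2}]\!] \cup \{C(l_1)\ op\ C(l_2) \subseteq C(l)\}$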
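The refined rule for `if` lets a branch contribute to the result only when the condition may take the corresponding truth value. A minimal Python sketch of one application of those two conditional constraints follows; the function name `if_flow`, the string encoding of abstract values, and the sample cache contents are all assumptions for illustration.

```python
def if_flow(C, l0, l1, l2, l):
    """Apply the two conditional constraints of the refined `if` rule once.

    C maps labels to sets of abstract values. Returns True if C[l] grew.
    """
    before = len(C[l])
    if "TT" in C[l0]:       # condition may be true: the then-branch flows to the result
        C[l] |= C[l1]
    if "FF" in C[l0]:       # condition may be false: the else-branch flows to the result
        C[l] |= C[l2]
    return len(C[l]) != before


# Labels follow the conditional (if (...)^3 then (...)^5 else (...)^7)^8.
# Suppose the condition at label 3 can only be true:
C_precise = {3: {"TT"}, 5: {"fn y => y^4"}, 7: {"fn z => 5^6"}, 8: set()}
grew = if_flow(C_precise, l0=3, l1=5, l2=7, l=8)

# If the condition may go either way, both branches flow, as in plain 0-CFA:
C_both = {3: {"TT", "FF"}, 5: {"fn y => y^4"}, 7: {"fn z => 5^6"}, 8: set()}
if_flow(C_both, l0=3, l1=5, l2=7, l=8)

print(C_precise[8], C_both[8])
```

In a full solver these two constraints would be re-examined whenever `C[l0]`, `C[l1]` or `C[l2]` grows, in the same way as the conditional constraints for application.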
31. Adding Context Information
- The analysis does not distinguish between different occurrences of a variable (monovariant analysis)
- Example: (let f = (fn x => x^1)^2 in ((f^3 f^4)^5 (fn y => y^6)^7)^8)^9
- Source-to-source transformation can help (but may lead to code explosion)
- Example rewritten: let f1 = fn x1 => x1 in let f2 = fn x2 => x2 in (f1 f2) (fn y => y)
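32. Simplified k-CFA
- Records the last k dynamic calls (for some fixed k)
- Similar to the call string approach
- Remember the context in which an expression is evaluated
- $\mathbf{Val}$ is now $\mathcal{P}(\mathbf{Term} \times \mathbf{Contexts})$
- $\rho \in \mathbf{Env} = \mathbf{Var} \times \mathbf{Contexts} \to \mathbf{Val}$
- $C \in \mathbf{Cache} = \mathbf{Lab} \times \mathbf{Contexts} \to \mathbf{Val}$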
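Recording "the last k dynamic calls" amounts to appending the label of the current application and truncating to length k. A minimal Python sketch, assuming contexts are tuples of application labels (the name `push` and the tuple encoding are illustrative):

```python
EMPTY = ()  # the empty context


def push(context, label, k):
    """Context for a call made at `label` from within `context`, keeping the last k labels."""
    if k == 0:
        return EMPTY            # 0-CFA: every call shares the single empty context
    return (context + (label,))[-k:]


# With k = 1, calls made at labels 5 and 8 from the empty context are kept apart:
ctx5 = push(EMPTY, 5, k=1)
ctx8 = push(EMPTY, 8, k=1)

# The cache and environment are then indexed by (label, context) and (variable, context):
cache = {(1, ctx5): set(), (1, ctx8): set()}
env = {("x", ctx5): set(), ("x", ctx8): set()}
print(ctx5, ctx8, push(ctx5, 8, k=2), push((5, 8), 3, k=2))
```

Because contexts are bounded by k and labels are finite, the number of (label, context) pairs stays finite, although it grows exponentially in k.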
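33. 1-CFA
- (let f = (fn x => x^1)^2 in ((f^3 f^4)^5 (fn y => y^6)^7)^8)^9
- Contexts
  - $\Lambda$: the empty context
  - 5: the application at label 5
  - 8: the application at label 8
- Polyvariant control flow
  - $C(1, 5) = \rho(x, 5) = C(2, \Lambda) = C(3, \Lambda) = \rho(f, \Lambda) = \{((\mathtt{fn}\ x \Rightarrow x^1), \Lambda)\}$
  - $C(1, 8) = \rho(x, 8) = C(7, \Lambda) = C(8, \Lambda) = C(9, \Lambda) = \{((\mathtt{fn}\ y \Rightarrow y^6), \Lambda)\}$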
34. The Motivating Example
    class Vehicle : Object { int position = 10;
      void move(x1 : int) {
        position = position + x1; } }
    class Car extends Vehicle { int passengers;
      void await(v : Vehicle) {
        if (v.position < position)
        then v.move(position - v.position)
        else self.move(10); } }
    class Truck extends Vehicle {
      void move(x2 : int) {
        if (x2 < 55) position = position + x2; } }
    void main { Car c; Truck t; Vehicle v1;
      new c;
      new t;
      v1 = c;
      c.passengers = 2;
      c.move(60);
      v1.move(70);
      c.await(t); }
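The same inclusion-based reasoning can be applied to this program: propagate sets of classes through assignments and parameter passing, then resolve each `move` call against the classes that may reach its receiver. The Python sketch below is a simplification under stated assumptions: the flows into `await` are added unconditionally (a full analysis would make them conditional on `Car` reaching the receiver), and the names `CLASSES`, `flows` and `self@await` are hypothetical.

```python
from collections import defaultdict

# class -> (superclass, methods it defines), read off the program above
CLASSES = {
    "Vehicle": (None, {"move"}),
    "Car": ("Vehicle", {"await"}),
    "Truck": ("Vehicle", {"move"}),
}


def lookup(cls, method):
    """Method body selected by dynamic dispatch for a receiver of class cls."""
    while cls is not None:
        parent, defined = CLASSES[cls]
        if method in defined:
            return f"{cls}.{method}"
        cls = parent
    raise KeyError(method)


# {Class} ⊆ var for each `new`; src ⊆ dst for assignments and parameter passing
news = [("Car", "c"), ("Truck", "t")]
flows = [("c", "v1"),            # v1 = c
         ("t", "v"),             # c.await(t): actual t flows to formal v
         ("c", "self@await")]    # receiver of c.await(t)

types = defaultdict(set)
for cls, variable in news:
    types[variable].add(cls)

changed = True
while changed:                   # iterate the inclusions to a fixed point
    changed = False
    for src, dst in flows:
        if not types[src] <= types[dst]:
            types[dst] |= types[src]
            changed = True

calls = {"c.move(60)": "c", "v1.move(70)": "v1",
         "v.move(...)": "v", "self.move(10)": "self@await"}
targets = {site: {lookup(cls, "move") for cls in types[receiver]}
           for site, receiver in calls.items()}
print(targets)
```

Under these assumptions every call site has exactly one possible target, so each dynamic dispatch in the program could be replaced by a direct call.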
35. Missing Material
- Efficient Cubic Solution to Set Constraints www.cs.berkeley.edu/Research/Aiken/bane.html
- Experimental results for OO www.cs.washington.edu/research/projects/cecil
- Operational Semantics for FUN (3.2.1)
- Defining acceptability without structural induction
  - More precise treatment of termination (3.2.2)
  - Needs Co-Induction (greatest fixed point)
- Using general lattices as Dataflow values instead of powersets (3.5.2)
- Lower bounds
  - Decidability of JOP
  - Polynomiality
36. Conclusions
- Set constraints are quite useful
  - A uniform syntax
  - Can even deal with pointers
- But the semantic foundation is still based on abstract interpretation
- Techniques used in functional and imperative (OO) programming are similar
- Control and data flow analysis are related