Title: CS 791: Code Analysis Techniques
1CS 791 Code Analysis Techniques
- Slide Set 4
- C. M. Overstreet
- Old Dominion University
- Spring 2005
2Old Assignment
- For Monday: look for free code analysis tools, environments.
- Example: SUIF
3Objective: understand system dependency graphs
- One of many code representation forms
- Attempt to subdivide problem
- Parsing: typically source code → parse tree or abstract syntax tree
- Code analysis: find a form that facilitates analysis
- Faster
- Easier
4Horwitz, Reps, Binkley Example 1
- Program Main
- sum := 0
- i := 1
- while i < 11 do
- call A( sum, i )
- od
- end
- Procedure A(x, y)
- call Add( x, y )
- call Inc( y )
- return
- Procedure Add( a, b )
- a := a + b
- return
- Procedure Inc( z )
- call Add( z, 1 )
- return
- Slice criterion
- < return in Inc, z >
- What's the slice with Weiser?
5More precise slice
- Program Main
- i := 1
- while i < 11 do
- call A( i )
- od
- end
- Procedure A( y )
- call Inc( y )
- return
- Procedure Add( a, b )
- a := a + b
- return
- Procedure Inc ( z )
- call Add( z, 1 )
- return
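The claim that the smaller program is a valid slice can be checked by running both versions and comparing the values of z at the return in Inc. The sketch below is a hypothetical Python translation, not code from the paper: value-result parameter passing is imitated by returning the updated parameters, and the `trace` list is an added device for recording z.

```python
# Value-result parameters are imitated by returning the updated values.
def add(a, b):
    a = a + b
    return a, b

def inc(z, trace):
    z, _ = add(z, 1)
    trace.append(z)          # value of z at "return" in Inc (the slice criterion)
    return z

def a_full(x, y, trace):     # Procedure A from Example 1
    x, y = add(x, y)
    y = inc(y, trace)
    return x, y

def main_full():
    trace = []
    total, i = 0, 1          # "total" stands for the slide's variable sum
    while i < 11:
        total, i = a_full(total, i, trace)
    return trace

def a_slice(y, trace):       # Procedure A from the more precise slice
    return inc(y, trace)

def main_slice():
    trace = []
    i = 1
    while i < 11:
        i = a_slice(i, trace)
    return trace
```

Both `main_full()` and `main_slice()` record the same sequence of values for z, which is what a slice on that criterion has to preserve.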
6Prog. Dependency Graph, $G_P$: Nodes
- Assignment statements
- Control predicates
- The entry vertex
- For possibly undefined vars, a new "initial defn of var" node
- A "final use of var" node
7Edges--Control Dependencies
- Control dependencies
- Source is Entry node or Predicate
- Labeled true or false
- $v_1 \to_c v_2$ means: when the value of v1 matches the edge label, v2 will eventually be executed if the program terminates
- For the simple language of the paper, control dependencies represent the program's nesting.
- If v1 is (think reachability)
- entry, label is true for any v2 that is not nested.
- predicate of while, label is true
- predicate of if, one edge is true, the other false
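Because control dependencies in this language follow the nesting, they can be read straight off a nested program representation. The sketch below assumes a hypothetical tuple encoding of statements (`("assign", text)`, `("while", text, body)`, `("if", text, then_body, else_body)`), which is an illustration only and not a format used in the paper.

```python
def control_edges(body, parent="ENTRY", label=True):
    """Return (source, label, target) control dependence edges for a statement list."""
    edges = []
    for stmt in body:
        kind, text = stmt[0], stmt[1]
        # Every statement depends on the entry vertex or on its enclosing predicate.
        edges.append((parent, label, text))
        if kind == "while":
            edges += control_edges(stmt[2], text, True)
        elif kind == "if":
            edges += control_edges(stmt[2], text, True)    # then-branch
            edges += control_edges(stmt[3], text, False)   # else-branch
    return edges

main = [
    ("assign", "sum := 0"),
    ("assign", "i := 1"),
    ("while", "while i < 11", [("assign", "call A(sum, i)")]),
]
cond = [("if", "if p", [("assign", "x := 1")], [("assign", "x := 2")])]
```

Calling `control_edges(main)` gives three edges out of `ENTRY` and one out of the while predicate, all labeled true; `control_edges(cond)` shows the true and false labels on the two branches of an if.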
8Edges--Data Dependencies
- Edge between v1 and v2 if the program's behavior depends on their order of execution.
- Two kinds
- Flow dependency
- Def-order dependency
9Edges--flow dependencies
- Edge from v1 to v2 iff all
- 1. v1 defines var x
- 2. v2 uses x
- 3. Control can reach v2 after v1 with no intervening definition of x
- Symbol: $\to_f$
- Flow dependencies are either
- Loop carried, $\to_{lc(L)}$, if also
- 4. There's a path back to predicate of L with prop. 3.
- 5. Both v1 and v2 are in L
- Loop independent, $\to_{li}$, if besides 1, 2, and 3,
- There's a path with no backedge to the predicate that also satisfies 3.
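Conditions 1 to 3 amount to a reaching-definitions computation. The sketch below is one way to get flow edges from a control-flow graph, assuming a hypothetical encoding in which each node carries at most one defined variable and a set of used variables. It does not separate loop-carried from loop-independent edges, and the numbered example program is made up for illustration.

```python
def flow_edges(nodes, succ):
    """nodes: {id: (defined_var_or_None, set_of_used_vars)}; succ: {id: [successor ids]}."""
    preds = {n: [] for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            preds[t].append(n)
    reach_in = {n: set() for n in nodes}    # (defining_node, var) pairs reaching n
    reach_out = {n: set() for n in nodes}
    changed = True
    while changed:                          # iterate to a fixed point
        changed = False
        for n in nodes:
            new_in = set().union(*[reach_out[p] for p in preds[n]])
            defined, _ = nodes[n]
            # A definition of x at n kills every other definition of x.
            new_out = {(m, x) for (m, x) in new_in if x != defined}
            if defined is not None:
                new_out.add((n, defined))
            if new_in != reach_in[n] or new_out != reach_out[n]:
                reach_in[n], reach_out[n] = new_in, new_out
                changed = True
    return sorted((m, n) for n in nodes
                  for (m, x) in reach_in[n] if x in nodes[n][1])

# 1: sum := 0   2: i := 1   3: while i < 11   4: sum := sum + i   5: i := i + 1   6: final use of sum
nodes = {1: ("sum", set()), 2: ("i", set()), 3: (None, {"i"}),
         4: ("sum", {"sum", "i"}), 5: ("i", {"i"}), 6: (None, {"sum"})}
succ = {1: [2], 2: [3], 3: [4, 6], 4: [5], 5: [3]}
```

In the result, edges such as `(5, 3)` and `(4, 4)` exist only because of the path around the loop, so they correspond to the loop-carried kind, while `(2, 3)` needs no backedge.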
10def-order dependency
- Edge from v1 to v2 iff all
- 1. v1 and v2 both define the same variable
- 2. v1 and v2 are in the same branch of any conditional
- 3. There's a component v3 such that
- $v_1 \to_f v_3$ and $v_2 \to_f v_3$
- 4. v1 occurs to the left of v2 in the program's abstract syntax tree
- This def-order dependence is witnessed by v3, denoted $v_1 \to_{do(v_3)} v_2$
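The four conditions can be checked mechanically once flow edges are known. The sketch below makes several assumptions that are not in the slides: textual order stands in for "left of in the abstract syntax tree", `branches` is a hypothetical map from a vertex to the conditionals enclosing it and the branch taken, and condition 2 is read as applying only to conditionals that enclose both vertices.

```python
def def_order_edges(order, defines, branches, flow):
    """Return (v1, witness, v2) triples for def-order dependences."""
    edges = []
    for i, v1 in enumerate(order):
        for v2 in order[i + 1:]:                      # condition 4: v1 comes first
            x = defines.get(v1)
            if x is None or x != defines.get(v2):     # condition 1: same variable
                continue
            b1, b2 = branches.get(v1, {}), branches.get(v2, {})
            # Condition 2: same branch of every conditional enclosing both.
            if any(b1[p] != b2[p] for p in b1.keys() & b2.keys()):
                continue
            for v3 in order:                          # condition 3: common flow target
                if (v1, v3) in flow and (v2, v3) in flow:
                    edges.append((v1, v3, v2))
    return edges

order = ["x := 1", "if p", "x := 2", "y := x"]
defines = {"x := 1": "x", "x := 2": "x", "y := x": "y"}
branches = {"x := 2": {"if p": True}}
flow = {("x := 1", "y := x"), ("x := 2", "y := x")}
```

For this small program both definitions of x can reach `y := x`, so the only def-order edge runs from `x := 1` to `x := 2` with `y := x` as witness.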
11Misc
- We deal with a very simple language
- Scalar vars
- Assignment statements
- Conditional statements
- While loops
- Restricted output statement called end
12Consider fig. 1 from paper
13Consider fig. 3 from paper
14Simple slices (no ftn calls)
- G/s: slice of G with respect to s
- Set defns
- $V(G/s) = \{\, w \mid w \in V(G) \land w \to^{*}_{c,f} s \,\}$
- (that is, all vertices of G that can reach s by flow and/or control edges)
- $V(G/S) = V(G/(\bigcup_i s_i)) = \bigcup_i V(G/s_i)$
15Worklist algorithm
- procedure MarkVerticesOfSlice(G, S)
- declare
- G: a program dependence graph
- S: a set of vertices of G
- WorkList: a set of vertices in G
- v, w: vertices in G
- begin
- WorkList := S
- while WorkList $\neq \emptyset$ do
- select and remove vertex v from WorkList
- mark v
- for each unmarked vertex w such that $w \to_f v$ or $w \to_c v$ is in $E(G)$ do
- add w into WorkList
- od
- od
- end
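The worklist procedure maps almost line for line onto Python. The sketch below assumes the graph is given simply as a list of `(source, target)` pairs covering both flow and control edges. The example graph is a hand-built dependence graph for a loop that sums 1 to 10, used here for illustration only.

```python
def mark_vertices_of_slice(edges, S):
    """Return the set of vertices that can reach some vertex of S along the given edges."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, set()).add(src)
    marked = set()
    worklist = set(S)
    while worklist:
        v = worklist.pop()            # select and remove vertex v
        marked.add(v)
        for w in preds.get(v, ()):
            if w not in marked:       # only unmarked predecessors are added
                worklist.add(w)
    return marked

edges = [
    # control edges
    ("ENTRY", "sum := 0"), ("ENTRY", "i := 1"), ("ENTRY", "while i < 11"),
    ("while i < 11", "sum := sum + i"), ("while i < 11", "i := i + 1"),
    # flow edges
    ("sum := 0", "sum := sum + i"), ("sum := sum + i", "sum := sum + i"),
    ("i := 1", "while i < 11"), ("i := 1", "sum := sum + i"), ("i := 1", "i := i + 1"),
    ("i := i + 1", "while i < 11"), ("i := i + 1", "sum := sum + i"),
    ("i := i + 1", "i := i + 1"),
]
```

Slicing on `"i := i + 1"` leaves out both statements that assign to sum, since nothing that i depends on reads sum.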
16Edges in slice
- $E(G/S) = \{ (v \to_f w) \mid (v \to_f w) \in E(G) \land v, w \in V(G/S) \}$
- $\cup\ \{ (v \to_c w) \mid (v \to_c w) \in E(G) \land v, w \in V(G/S) \}$
- $\cup\ \{ (v \to_{do(u)} w) \mid (v \to_{do(u)} w) \in E(G) \land v, u, w \in V(G/S) \}$
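In other words, the slice keeps exactly the edges whose endpoints (and, for def-order edges, whose witness) all survive. A minimal sketch, assuming edges are stored as Python sets of tuples and def-order edges as hypothetical `(v, u, w)` triples with `u` the witness:

```python
def slice_edges(flow, control, def_order, V):
    """Restrict each edge set to the vertex set V of the slice."""
    kept_flow = {(v, w) for (v, w) in flow if v in V and w in V}
    kept_control = {(v, w) for (v, w) in control if v in V and w in V}
    # A def-order edge also needs its witness u to be in the slice.
    kept_def_order = {(v, u, w) for (v, u, w) in def_order
                      if v in V and u in V and w in V}
    return kept_flow, kept_control, kept_def_order

flow = {("i := 1", "while i < 11"), ("sum := 0", "sum := sum + i")}
control = {("ENTRY", "i := 1"), ("ENTRY", "sum := 0")}
def_order = {("d1", "u", "d2")}
V = {"ENTRY", "i := 1", "while i < 11", "d1", "d2"}
```

Here the def-order triple is dropped even though both of its endpoints are in `V`, because its witness is not.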
17Now add procedures
- New language consists of
- a Main and several procedures
- procedures end with a return with no values returned
- parameter passing by value-result
- and
- no passing of globals
- no repeated vars in parm list
- i.e., no P(x,x)
18System dependency graph
- One program dependency graph for main
- Several procedure dependency graphs, one per procedure
- Two new edge types to
- represent direct dependency between a call site and a procedure
- represent transitive dependencies due to call sites
19Pesky parameters add new vertices. Pretend:
- For input
- Calling procedure copies actual parms into a new tmp variable (a node, actual-in) with control edge to it from call site
- Called procedure copies that into formal parms in called procedure (a node, formal-in)
- For output, do it again
- Called procedure copies formals into tmp (a node, formal-out)
- Calling procedure copies value from tmp into actuals (a node, actual-out)
- Also call-site node, so 5 new node types in graph
20New edges
- Control edge from call node to both actual-in and actual-out nodes
- Formal-in and formal-out nodes have control edge from called proc's entry node
- Parameter-in edge from each actual-in node to formal-in node
- Parameter-out edge from each formal-out node to actual-out node
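The four edge rules can be generated per call site. The sketch below is illustrative only: the function name `link_call`, the string labels for vertices, and the `<proc> ENTRY` naming are assumptions made here, and constant actuals such as the 1 in `Add(z, 1)` are not handled.

```python
def link_call(call_site, callee, actuals, formals):
    """Build the control, parameter-in, and parameter-out edges for one call site."""
    control, param_in, param_out = [], [], []
    entry = f"{callee} ENTRY"
    for a, f in zip(actuals, formals):
        actual_in = f"{call_site}: {f}_in := {a}"
        actual_out = f"{call_site}: {a} := {f}_out"
        formal_in = f"{callee}: {f} := {f}_in"
        formal_out = f"{callee}: {f}_out := {f}"
        # The call node controls the actual-in and actual-out vertices.
        control += [(call_site, actual_in), (call_site, actual_out)]
        # The callee's entry node controls the formal-in and formal-out vertices.
        control += [(entry, formal_in), (entry, formal_out)]
        param_in.append((actual_in, formal_in))
        param_out.append((formal_out, actual_out))
    return control, param_in, param_out

control, param_in, param_out = link_call("call Inc(y)", "Inc", ["y"], ["z"])
```

For the call `Inc(y)` this gives one parameter-in edge into `Inc: z := z_in` and one parameter-out edge back from `Inc: z_out := z`.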
21Consider fig. 4, pg 38
22Infeasible paths in graph: consider the path
- Main: x_in := sum
- A: x := x_in
- A: a_in := x
- Add: a := a_in
- Add: a := a + b
- Add: a_out := a
- Inc: z := a_out
- Inc: z_out := z
- A: y := z_out
- A: y_out := y
- Main: i := y_out
- Why isn't this a feasible path?
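One way to see the problem is to label each interprocedural step with its call site and check that returns match calls. In the path above, Add is entered from the call in A but left through the call in Inc. The sketch below is a simple stack check written for this example; the step encoding and the site names are hypothetical, and unmatched returns with an empty stack are allowed because a path may start inside a called procedure.

```python
def is_feasible(steps):
    """steps: list of ("in" | "out" | "flow", call_site_or_None) along a path."""
    stack = []
    for kind, site in steps:
        if kind == "in":                 # parameter-in edge: entering a callee
            stack.append(site)
        elif kind == "out" and stack:    # parameter-out edge: returning to a caller
            if stack[-1] != site:
                return False             # returned to a site other than the one entered from
            stack.pop()
    return True

slide_path = [
    ("in", "Main calls A"),      # Main: x_in := sum  ->  A: x := x_in
    ("flow", None),              # A: x := x_in       ->  A: a_in := x
    ("in", "A calls Add"),       # A: a_in := x       ->  Add: a := a_in
    ("flow", None),              # Add: a := a_in     ->  Add: a := a + b
    ("flow", None),              # Add: a := a + b    ->  Add: a_out := a
    ("out", "Inc calls Add"),    # Add: a_out := a    ->  Inc: z := a_out
    ("flow", None),              # Inc: z := a_out    ->  Inc: z_out := z
    ("out", "A calls Inc"),      # Inc: z_out := z    ->  A: y := z_out
    ("flow", None),              # A: y := z_out      ->  A: y_out := y
    ("out", "Main calls A"),     # A: y_out := y      ->  Main: i := y_out
]
matched_path = [("in", "A calls Inc"), ("in", "Inc calls Add"), ("flow", None),
                ("out", "Inc calls Add"), ("out", "A calls Inc")]
```

`is_feasible(slide_path)` is false at the sixth step, while a path that enters and leaves Add through the same call site passes the check.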
23Goal
- Determine data flows which occur through procedure calls
- For now, change language so that all data flow occurs through proc. parameters
- Find the transitive flow from proc. inputs to proc. outputs
- If possible, recast all questions as a reachability problem
- Easily solved by tracing edges in a directed graph
24Grammar Graphs
- Horwitz et al. use attribute grammar
- Productions
- calling tree
- Attributes
- inputs are inherited attributes
- outputs are synthesized attributes
- Consider figs. 5 & 6
- What do the edges in the graph mean?
- Look at fig. 4 again.
25Find real data flows through proc. calls
- Using the Procedure Dependency Graph
- From the proc call node, follow the paths from the proc. inputs to wherever
- But the only paths that matter are the ones that lead back to the same proc. outputs
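For a procedure that makes no further calls, this reduces to plain reachability inside its own dependence graph. The sketch below is a simplification and not the attribute-grammar construction of Horwitz et al.: it takes the intraprocedural edges of one procedure and reports which formal-out vertices each formal-in vertex can reach. The edge list for Add is written by hand from the example program.

```python
def transitive_flows(edges, formal_ins, formal_outs):
    """Map each formal-in vertex to the sorted list of formal-out vertices it reaches."""
    succ = {}
    for src, dst in edges:
        succ.setdefault(src, []).append(dst)
    result = {}
    for start in formal_ins:
        seen, stack = {start}, [start]
        while stack:                      # depth-first search from the formal-in vertex
            v = stack.pop()
            for w in succ.get(v, []):
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        result[start] = sorted(seen & set(formal_outs))
    return result

add_edges = [
    ("a := a_in", "a := a + b"),
    ("b := b_in", "a := a + b"),
    ("a := a + b", "a_out := a"),
    ("b := b_in", "b_out := b"),
]
flows = transitive_flows(add_edges,
                         ["a := a_in", "b := b_in"],
                         ["a_out := a", "b_out := b"])
```

The result says that the output a depends on both inputs while the output b depends only on the input b, which is the information a caller needs without walking into Add.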