Data Flow Analysis ctd' - PowerPoint PPT Presentation


Provided by: barbara179


1
Data Flow Analysis ctd.

Data Flow Problems
Formal Data Flow Systems

2
Dead Code Elimination

Code is useless if it can never be executed.
If we cannot reach a node from the start of the flowgraph, it is useless.
It won't be executed, but removing it reduces the size of the object code.
Code is dead if it does not contribute to the results of the program in any way.
Dead code may be inadvertently written, produced as a result of code modification (upgrades), or generated by the compiler during optimizations.
Dead code elimination attempts to recognize and eliminate dead code in a program.
We look at a strategy for doing so in the next slides.

3
Dead Code Elimination

Goal: Remove all statements that do not contribute to the output (or interact with the environment) in any way. We only consider WRITE statements in the following.
Strategy: mark useful statements.
Let STAT be the set of statements, OUTPUT be the set of WRITE statements in STAT, and let the sets UD(S, v) be available for all S ∈ STAT and v ∈ VAR with v ∈ USE(S).
Let KEEP be the set of useful statements.
Set mark to FALSE for each statement.
Initialize KEEP := OUTPUT.

Note: This is a somewhat simplistic definition of dead code.
4
Dead Code Elimination

While KEEP is not empty, select and remove an arbitrary statement S from KEEP
    mark(S) := TRUE
    for all v in USE(S) do
        for all S' in UD(S, v) do
            if mark(S') = FALSE, add S' to KEEP
        end for
    end for
end while

Idea: work backwards from data output and keep all calculations that contribute to their computation.
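The marking strategy can be sketched in Python as follows. This is only an illustration under assumptions: the function name `mark_useful`, the dictionary encodings of USE and UD, and the four-statement program are all hypothetical and not part of the original slides.

```python
def mark_useful(output, use, ud):
    """Return the set of statements that contribute to some output.

    output: iterable of WRITE statements (the set OUTPUT)
    use:    dict statement -> set of variables it uses (USE(S))
    ud:     dict (statement, variable) -> set of defining statements (UD(S, v))
    """
    marked = set()
    keep = set(output)           # worklist, initialised to OUTPUT
    while keep:
        s = keep.pop()           # select and remove an arbitrary statement
        marked.add(s)
        for v in use.get(s, ()):
            for d in ud.get((s, v), ()):
                if d not in marked:
                    keep.add(d)
    return marked


# Hypothetical program (assumed for illustration):
#   s1: a = 1
#   s2: b = 2        (never used)
#   s3: c = a + 1
#   s4: write c
use = {"s1": set(), "s2": set(), "s3": {"a"}, "s4": {"c"}}
ud = {("s3", "a"): {"s1"}, ("s4", "c"): {"s3"}}
useful = mark_useful({"s4"}, use, ud)
dead = set(use) - useful
```

Statements left unmarked at the end (here `s2`) are the candidates for removal.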
5
Live Variables Problem

Recall: A variable v is live at the exit of a basic block n iff there is a path from n to some basic block n' such that there is an outward exposed use of v in n', and the path is definition free for v.
[Figure: a statement assigning v, followed later by a statement using v]
The live variables problem is to determine, for
each node n in the flowgraph, the set of live
variables at the exit of n.

6
Live Variables Problem

If a variable is live, then it may be used subsequently.
This supports register allocation: if a variable is not live, then its value does not need to be saved in a register.
We require a single-exit flowgraph to solve this problem, since we need to track variable uses back to their definitions.
So we start at the exit node.
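A minimal sketch of how live variables might be computed by backward propagation, assuming per-block sets of exposed uses and definitions are already known. The function `live_out`, the block names, and the example flowgraph are hypothetical.

```python
def live_out(succ, use, defs):
    """Iteratively compute the variables live at the exit of each block.

    succ: dict block -> list of successor blocks
    use:  dict block -> variables with an exposed use in the block
          (used before any definition inside the block)
    defs: dict block -> variables defined in the block
    """
    live = {n: set() for n in succ}
    changed = True
    while changed:
        changed = False
        for n in succ:
            new = set()
            for s in succ[n]:
                # v is live out of n if s uses it, or if it is live out of s
                # and s does not redefine it.
                new |= use[s] | (live[s] - defs[s])
            if new != live[n]:
                live[n] = new
                changed = True
    return live


# Hypothetical flowgraph: B1 -> B2 -> B3, plus a loop edge B2 -> B2.
#   B1: x = 1; y = 2
#   B2: x = x + 1
#   B3: write x
succ = {"B1": ["B2"], "B2": ["B2", "B3"], "B3": []}
use = {"B1": set(), "B2": {"x"}, "B3": {"x"}}
defs = {"B1": {"x", "y"}, "B2": {"x"}, "B3": set()}
result = live_out(succ, use, defs)
```

In this example `y` is never live, so its value would not need to occupy a register after B1.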

7
Modeling Data Flow Problems

We can save considerable programming effort by
using a framework to solve all of the dataflow
problems needed in a compiler.
Dataflow frameworks require a representation of
the program and a strategy for saving data flow
information associated with this representation.
We assume that a program is represented by the
flowgraph (as well as the IR ?).
Since there is a flowgraph for each individual
procedure, data flow information is gathered on a
per-procedure basis.

8
Modeling Data Flow Problems

Our general strategy for modeling data flow problems expects a flowgraph and the following specification of the data flow problem to be solved:
availability of the initial information required,
specification of the effect of individual basic blocks on the information, and
the effect of a join in the flowgraph (i.e. what happens when several paths in the flowgraph meet).

The idea: we gather data flow information systematically for all the statements in the procedure.
9
Modeling Data Flow Problems

Given these, we can propagate information through the flowgraph.
[Figure: flowgraph annotated with the labels "start info", "updated info", and "what info?"]

10
Example: Reaching Definitions Problem
RD(n) = { S | DEF(S) is not empty and S reaches n }, a set of statements that define variables.

Start info: begin with empty sets for each basic block.
Effect of a basic block: if S1 is the set of reaching definitions that enters n, then any statement in S1 that is preserved in n will be a reaching definition for successor blocks.
In addition, all outward exposed definitions in n will be reaching definitions for successor blocks.
[Figure: effect of a basic block on the incoming reaching definitions S1, with definitions of v and a]

11
Example: Reaching Definitions Problem
RD(n) = { S | DEF(S) is not empty and S reaches n }, a set of statements that define variables.

Effect of a join: if two nodes n1 and n2 join at n, where S1 is the set of reaching definitions of n1 and S2 is the set of reaching definitions of n2, then the definitions in S1 and the definitions in S2 all reach n. So we form S1 ∪ S2.
[Figure: join at n of two paths carrying definitions of a and b]

12
Example Reaching Definitions Problem

Solution of the problem:
Propagate information through the flowgraph iteratively.
Terminate when no new information is created for any node in the graph.
This means we have found all definitions in the program that reach a given node.
The longest path from the start node to the end node is an upper bound on the number of iterations required.
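The iterative solution can be sketched as follows. This assumes the common gen/kill encoding of a basic block's effect (gen = outward exposed definitions, kill = definitions not preserved), which the slides describe in words but do not name. The function and the example flowgraph are hypothetical.

```python
def reaching_definitions(pred, gen, kill):
    """Iterate to a fixpoint of
        RD_in(n)  = union of RD_out(p) over all predecessors p of n   (join)
        RD_out(n) = gen[n] | (RD_in(n) - kill[n])                     (block effect)
    """
    rd_in = {n: set() for n in pred}    # start info: empty sets
    rd_out = {n: set() for n in pred}
    changed = True
    while changed:                      # stop when no new information appears
        changed = False
        for n in pred:
            rd_in[n] = set().union(*(rd_out[p] for p in pred[n]))
            new_out = gen[n] | (rd_in[n] - kill[n])
            if new_out != rd_out[n]:
                rd_out[n] = new_out
                changed = True
    return rd_in, rd_out


# Hypothetical flowgraph: B1 -> B2 -> B3, plus a loop edge B2 -> B2.
#   B1: d1: a = 1;  d2: b = 2
#   B2: d3: a = a + b
#   B3: d4: b = a
pred = {"B1": [], "B2": ["B1", "B2"], "B3": ["B2"]}
gen = {"B1": {"d1", "d2"}, "B2": {"d3"}, "B3": {"d4"}}
kill = {"B1": {"d3", "d4"}, "B2": {"d1"}, "B3": {"d2"}}
rd_in, rd_out = reaching_definitions(pred, gen, kill)
```

Both d1 and d3 reach the entry of B2 because of the loop edge, while only d3 survives to B3.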

13
Monotone Data Flow System

A uniform framework for modeling almost all data flow problems is a Monotone Data Flow System (MDS).
It unifies and simplifies the implementation of a variety of data flow problems:
construct a monotone data flow system,
use an iterative algorithm to propagate information in the flowgraph.
To use this framework we must define:
functions that describe the effect of a basic block on the solution,
the effect of a join in the flowgraph.

14
Semi-Lattices
MDS is based on a semi-lattice. Actually, a bounded semi-lattice with 0 and 1 elements.

A set L with a binary meet operation ∧ such that for all a, b, c ∈ L:
a ∧ a = a (idempotent)
a ∧ b = b ∧ a (commutative)
a ∧ (b ∧ c) = (a ∧ b) ∧ c (associative)
A semilattice has a
zero element iff for some element 0, a ∧ 0 = 0 for all a ∈ L
one element iff for some element 1, a ∧ 1 = a for all a ∈ L
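For a small finite set, the semi-lattice laws and the 0 and 1 elements can be checked by brute force. The sketch below is illustrative only. The helper names `is_semilattice` and `zero_and_one` and the example set are assumptions, not part of the slides.

```python
from itertools import product


def is_semilattice(elements, meet):
    """Brute-force check of the three meet laws on a finite set."""
    elems = list(elements)
    idem = all(meet(a, a) == a for a in elems)
    comm = all(meet(a, b) == meet(b, a) for a, b in product(elems, repeat=2))
    assoc = all(meet(a, meet(b, c)) == meet(meet(a, b), c)
                for a, b, c in product(elems, repeat=3))
    return idem and comm and assoc


def zero_and_one(elements, meet):
    """Return (zero, one); either is None if no such element exists."""
    elems = list(elements)
    zero = next((z for z in elems if all(meet(a, z) == z for a in elems)), None)
    one = next((o for o in elems if all(meet(a, o) == a for a in elems)), None)
    return zero, one


def intersect(a, b):
    return a & b


# Hypothetical example: all subsets of {x, y} with intersection as meet.
subsets = [frozenset(s) for s in ([], ["x"], ["y"], ["x", "y"])]
```

With intersection as meet, the empty set plays the role of 0 and the full set the role of 1.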

15
Partial Order in a Semi-Lattice

We may define a partial order on a semi-lattice
as follows
Given a semi-lattice (L, ∧) and arbitrary elements a, b ∈ L:
a ≤ b ⇔ a ∧ b = a
≤ is a partial order on L.
We use > and < in the usual way.
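A short sketch of the induced order, assuming the definition a ≤ b iff a ∧ b = a. The helpers `leq` and `is_partial_order` are hypothetical names, and the check is a brute-force test over a small finite set rather than a proof.

```python
def leq(meet, a, b):
    """a <= b in the order induced by meet, i.e. meet(a, b) == a."""
    return meet(a, b) == a


def is_partial_order(elements, meet):
    """Check reflexivity, antisymmetry and transitivity of the induced order."""
    elems = list(elements)
    reflexive = all(leq(meet, a, a) for a in elems)
    antisymmetric = all(a == b for a in elems for b in elems
                        if leq(meet, a, b) and leq(meet, b, a))
    transitive = all(leq(meet, a, c)
                     for a in elems for b in elems for c in elems
                     if leq(meet, a, b) and leq(meet, b, c))
    return reflexive and antisymmetric and transitive


def union(a, b):
    return a | b


small = frozenset({"x"})
large = frozenset({"x", "y"})
```

With `min` as meet the induced order is the usual one on numbers. With `union` as meet the larger set is the smaller element, so `leq(union, large, small)` holds.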

16
Bounded Semi-Lattice

Let a_1, a_2, ..., a_n be a sequence of elements from semilattice L.
This sequence is a chain iff a_j > a_(j+1) for 1 ≤ j ≤ n-1.
A semi-lattice is bounded iff for every a ∈ L, there is some natural number c_a such that the length of every chain beginning with a is at most c_a.
Thus each chain is of finite length.

17
Bounded Semi-Lattices

We may extend the meet operation to an arbitrary number of elements of a semi-lattice:
⋀ (1 ≤ j ≤ m) a_j = a_1 ∧ a_2 ∧ ... ∧ a_m
We may further extend it to countably infinite sets.
If L is bounded, the limit exists and is equal to that of a finite set of elements.

18
Bounded Semi-Lattices Example

For a finite set M, (P(M), ∩) is a bounded semi-lattice with a 0 and a 1 element.
In this case, ≤ is the set-theoretic relation ⊆.
For a finite set M, (P(M), ∪) is a bounded semi-lattice with a 0 and a 1 element.
In this case, ≤ is the set-theoretic relation ⊆⁻¹ (that is, ⊇).
19
Bounded Semi-Lattices Example

For (P(M), ∩): A ≤ B iff A ∩ B = A, that is, A ⊆ B.
For (P(M), ∪): A ≤ B iff A ∪ B = A, that is, A ⊇ B.
[Figure: Venn diagrams of sets A and B illustrating each ordering]
20
Bounded Semi-Lattices Example

The natural numbers with the meet operation given by min also form a bounded semi-lattice.
We can use the usual definition of ≤, since a ≤ b iff min(a, b) = a.
If we use max as the meet operation, then we would need to use the ordering ≥ instead.

21
Monotone Data Flow System

A monotone data flow system is founded on a bounded semi-lattice (L, ∧) with a 0 and a 1 element.
It has functions that model the effect of a basic
block on data flow information.
Most data flow problems operate on semi-lattices
where sets (of statements, variables) are the
elements, and
the meet operation is the union or intersection
of such sets.
We use the properties of such semi-lattices to
write algorithms with known behavior.

22
Effects of a Basic Block

These are modeled by a function f: L → L. We require such functions in a monotone data flow system to be monotonic.
A total function f: L → L is monotonic if and only if
for all a, b ∈ L, f(a ∧ b) ≤ f(a) ∧ f(b)
Alternatively, a function is monotonic iff
for all a, b ∈ L, a ≤ b ⇒ f(a) ≤ f(b)
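As an illustration, monotonicity can be tested exhaustively on a small powerset semi-lattice. The sketch assumes union as the meet and a gen/kill style transfer function. All names here (`powerset`, `is_monotonic`, `transfer`, `complement`) are hypothetical.

```python
from itertools import combinations


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_monotonic(elements, meet, f):
    """Check a <= b implies f(a) <= f(b), where x <= y means meet(x, y) == x."""
    def leq(x, y):
        return meet(x, y) == x
    return all(leq(f(a), f(b)) for a in elements for b in elements if leq(a, b))


def union(a, b):
    return a | b


universe = frozenset({"d1", "d2", "d3"})
L = powerset(universe)
gen, kill = frozenset({"d3"}), frozenset({"d1"})


def transfer(x):
    # Typical basic-block effect: remove killed definitions, add generated ones.
    return gen | (x - kill)


def complement(x):
    # Deliberately non-monotonic counterexample.
    return universe - x
```

Here `is_monotonic(L, union, transfer)` holds, while the set complement reverses the order and fails the check.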

23
Effects of a Basic Block

The iterative algorithm will produce precise results if the functions are distributive. Otherwise, they may not be precise.
A total function f: L → L is distributive if and only if
for all a, b ∈ L, f(a ∧ b) = f(a) ∧ f(b)

If we know that our problem is distributive, the algorithm will find all the available data flow information. Otherwise, it might not.
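The difference between monotonic and distributive functions can be seen on a tiny example. The sketch below assumes a powerset semi-lattice with union as meet. The function `all_or_nothing` is a made-up function that is monotonic but not distributive, and all names are hypothetical.

```python
from itertools import combinations


def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_distributive(elements, meet, f):
    """Check f(a meet b) == f(a) meet f(b) for all pairs."""
    return all(f(meet(a, b)) == meet(f(a), f(b))
               for a in elements for b in elements)


def union(a, b):
    return a | b


universe = frozenset({"p", "q"})
L = powerset(universe)


def gen_kill(x):
    # gen = {p}, kill = {q}: functions of this form distribute over union.
    return frozenset({"p"}) | (x - frozenset({"q"}))


def all_or_nothing(x):
    # Yields everything only when given everything. It preserves the order,
    # but f({p} | {q}) = {p, q} while f({p}) | f({q}) is empty.
    return universe if x == universe else frozenset()
```

Applying the function after the join and joining after applying it give different answers for `all_or_nothing`, which is exactly the situation in which the iterative result may lose precision.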
24
Fixpoint of Monotone Data Flow System

Our plan is to repeatedly apply monotone functions to update the effect of a basic block upon the data flow information associated with nodes.
So we must be sure that this process terminates.
This is guaranteed by the greatest fixpoint theorem.
A fixpoint of a monotonic function f: L → L is a value a ∈ L such that f(a) = a.

25
Greatest Fixpoint Theorem

The theorem:
let L be a bounded semi-lattice with 0 and 1 elements, and
let f: L → L be a monotonic function.
Then there is a t ≥ 0 such that f^(t+1)(1) = f^t(1).
f^t(1) is the greatest fixpoint of f.
The proof: see next slide.

26
Greatest Fixpoint Theorem

1 ≥ f(1), since 1 ≥ x for all x ∈ L.
Since f is monotonic, f(1) ≥ f(f(1)).
So 1 ≥ f(1) ≥ f(f(1)) ≥ ... is a chain which is bounded, and for some t, f(f^t(1)) = f^t(1).
Hence f^t(1) is a fixpoint of f.
Now assume a is an arbitrary fixpoint of f. Then f(a) = a.
Since a ≤ 1, f(a) ≤ f(1); repeatedly applying the monotonicity definition gives a = f^t(a) ≤ f^t(1).
Thus f^t(1) is the greatest fixpoint of f.

27
Computing Greatest Fixpoint

Given a bounded semi-lattice (L, ∧) with 0 and 1, and a monotonic function f: L → L, the greatest fixpoint of f is computed as follows:
a := 1
while f(a) < a
    a := f(a)
end while
fixpoint := a
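A direct Python rendering of this loop, given as a sketch. It tests `f(a) != a` instead of `f(a) < a`, which is equivalent here because the iterates starting from 1 can only descend. The example functions and lattices are assumptions chosen for illustration.

```python
def greatest_fixpoint(f, one):
    """Iterate one, f(one), f(f(one)), ... until the value stops changing.

    Assumes f is monotonic on a bounded semi-lattice whose 1 element is `one`;
    termination is not guaranteed otherwise.
    """
    a = one
    while f(a) != a:
        a = f(a)
    return a


# Hypothetical example 1: the numbers 0..10 with min as meet, so 1 is 10.
def shrink(a):
    return min(a, a // 2 + 3)


# Hypothetical example 2: subsets of {e1, e2, e3} with intersection as meet,
# so 1 is the full set. The function drops e3 and always keeps e1.
full = frozenset({"e1", "e2", "e3"})


def block_effect(x):
    return frozenset({"e1"}) | (x - frozenset({"e3"}))


numeric_result = greatest_fixpoint(shrink, 10)
set_result = greatest_fixpoint(block_effect, full)
```

In the numeric example the iterates are 10, 8, 7, 6, and 6 is the largest value a with a ≤ a // 2 + 3.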
