1
CMPUT680 - Winter 2006
  • Topic G: Static Single-Assignment Form
  • José Nelson Amaral
  • http://www.cs.ualberta.ca/amaral/courses/680

2
Reading Material
  • Chapter 19 of the Tiger book (with a grain of salt!!).
  • Bilardi, G., Pingali, K., The Static Single Assignment Form and its Computation, unpublished? (citeseer).
  • Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., Zadeck, F. K., An Efficient Method of Computing Static Single Assignment Form, ACM Symposium on Principles of Programming Languages (PoPL), pp. 25-35, Austin, TX, Jan., 1989.
  • Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N., Zadeck, F. K., Efficiently Computing Static Single Assignment Form and the Control Dependence Graph, ACM Transactions on Programming Languages and Systems (TOPLAS), Vol. 13, No. 4, October, 1991, pp. 451-490.
  • Sreedhar, V. C., Gao, G. R., A Linear Time Algorithm for Placing φ-Nodes, ACM Symposium on Principles of Programming Languages (PoPL), pp. 62-73, 1995.
3
Static Single-Assignment Form
Each variable has only one definition in the
program text.
This single static definition can be in a loop
and may be executed many times. Thus even in
a program expressed in SSA, a variable can
be dynamically defined many times.
4
Advantages of SSA
  • Simpler dataflow analysis
  • No need to use use-def/def-use chains, which
    require N×M space for N uses and M definitions
  • SSA form relates in a useful way to dominance
    structures. SSA simplifies algorithms that
    construct interference graphs.

5
SSA Form in Control-Flow Path Merges
Is this code in SSA form?
  B1: b ← M[x]; a ← 0
  B2: if b < 4
  B3: a ← b
  B4: c ← a + b
No, two definitions of a appear in the code (in B1 and B3).
How can we transform this code into SSA form?
We can create two versions of a, one for B1 and another for B3.
6
SSA Form in Control-Flow Path Merges
But which version should we use in B4 now?
  B1: b ← M[x]; a1 ← 0
  B2: if b < 4
  B3: a2 ← b
  B4: c ← a? + b
7
SSA Form in Control-Flow Path Merges
But which version should we use in B4 now?
  B1: b ← M[x]; a1 ← 0
  B2: if b < 4
  B3: a2 ← b
  B4: a3 ← φ(a2, a1); c ← a3 + b
8
A Loop Example
Original code:
  Entry block: a ← 0
  Loop block:  b ← a + 1; c ← c + b; a ← b * 2; if a < N
  Exit block:  return
SSA form:
  Entry block: a1 ← 0; b0 ← undef; c0 ← undef
  Loop block:  a3 ← φ(a1, a2); b1 ← φ(b0, b2); c2 ← φ(c0, c1);
               b2 ← a3 + 1; c1 ← c2 + b2; a2 ← b2 * 2; if a2 < N
  Exit block:  return
φ(b0, b2) is not necessary because b1 is never
used. But the phase that generates φ functions
does not know it. Unnecessary φ functions are
later eliminated by dead code elimination.
9
The φ Function
How can we implement a φ function that
knows which control path was taken?
Answer 1: We don't!! The φ function is used
only to connect uses to definitions
during optimization, but is never implemented.
Answer 2: If we must execute the φ function, we
can implement it by inserting MOVE
instructions in all control paths.
10
Criteria for Inserting φ Functions
We could insert one φ function for each
variable at every join point (a point in the CFG
with more than one predecessor). But that would
be wasteful.
What criteria should we use to insert a φ
function for a variable a at node z of the CFG?
Intuitively, we should add a φ function if there
are two definitions of a that can reach the point
z through distinct paths.
11
Path Convergence Criterion (Cytron-Ferrante/89)
Insert a φ function for a variable a at a node z
if all the following conditions are true:
  1. There is a block x that defines a.
  2. There is a block y ≠ x that defines a.
  3. There are non-empty paths x→z and y→z.
  4. Paths x→z and y→z don't have any nodes in common other than z.
  5. The node z does not appear within both x→z and y→z prior to the end,
     but it may appear in one or the other.
Note: The start node contains an implicit
definition of every variable.
12
φ-Candidates are Join Nodes
Notice that according to the path
convergence criterion, the node z that will
receive the φ function must be a join node. z is
the first node that joins the paths Pxz and Pyz.
13
Iterated Path-Convergence Criterion
The φ function itself is a definition of a.
Therefore the path-convergence criterion is a
set of equations that must be satisfied.
while there are nodes x, y, z satisfying
      conditions 1-5 and z does not contain a
      φ function for a
do insert a ← φ(a0, a1, …, an) at node z
14
The SSA Conversion Problem
For each variable x defined in a CFG G(V,E),
given the set of nodes S ⊆ V that contain a
definition for x, find the minimal set, J(S), of
nodes that require a φ(xi, xj) function.
By definition, the START node defines all the
variables, therefore ∀ S ⊆ V, START ∈ S.
If we need to compute φ nodes for
several variables, it may be efficient to
precompute data structures based on the CFG.
15
Processing Time for SSA Conversion
The performance of an SSA conversion
algorithm should be measured by the preprocessing
time Tp, the preprocessing space Sp, and the
query time Tq.
  • (Shapiro and Saint, 1970) outline an algorithm.
  • (Reif and Tarjan, 1981) extend the Lengauer-Tarjan dominator algorithm to compute φ-nodes.
  • (Cytron et al., 1991) show that SSA conversion can use the idea of dominance frontiers, resulting in an O(|V|²) algorithm.
  • (Sreedhar and Gao, 1995) give an O(|E|) algorithm, but in a private communication with Pingali in 1996 admit that it is in practice 5 times slower than Cytron et al.
16
Processing Time for SSA Conversion
(Bilardi and Pingali, 1999) present a generalized
framework and a parameterized Augmented
Dominator Tree (ADT) algorithm that allows for a
space-time tradeoff. They show that the algorithms
of Cytron et al. and Sreedhar-Gao are special cases
of the ADT algorithm.
Bilardi and Pingali describe three strategies to
compute φ-placement:
  • Two-Phase Algorithms
  • Lock-Step Algorithms
  • Lazy Algorithms

17
Two-Phase Algorithms
First build the entire Dominance Frontier Graph,
then find the nodes reachable from S.
Simple, but the DF Graph may be quite large.
18
Lock-Step Algorithms
Perform the reachability computation
incrementally while the DF relation is computed.
  • Avoid storing the DF Graph.
  • Perform computations at all nodes of the graph, even though most are irrelevant.
  • Inefficient when computing the φ-nodes for many variables.

[Diagram labels: CFG and S feed a combined DF Computation and Reachability step, which produces J(S)]
19
Lazy Algorithms
Lazily compute only the portion of the DF Graph
that is needed. Carefully select a portion of the
DF Graph to compute eagerly (before it is
needed).
A Two-Phase Algorithm is an extreme case of a
lazy algorithm.
[Diagram labels: CFG, DF Computation, DF Graph SubGraph, Reachability, S, J(S)]
20
Computing a Dominator Tree
(n = number of nodes, m = number of edges)
  • (Lowry and Medlock, 1969) introduce the problem and give an O(n⁴) algorithm.
  • (Lengauer and Tarjan, 1979) give a complicated O(m α(m, n)) algorithm, where α(m, n) is the inverse of Ackermann's function.
  • (Harel, 1985) gives a linear-time algorithm.
  • (Alstrup, Harel and Thorup, 1997) give a simpler version of Harel's algorithm.
21
Dominance Property of the SSA Form
In SSA form, definitions dominate uses, i.e.:
  1. If x is used in a φ function in block n, then the definition of x
     dominates every predecessor of n.
  2. If x is used in a non-φ statement in block n, then the definition of
     x dominates n.
22
The Dominance Frontier
A node x dominates a node w if every path from
the start node to w must go through x.
A node x strictly dominates a node w if x
dominates w and x ≠ w.
The dominance frontier of a node x is the set
of all nodes w such that x dominates a
predecessor of w, but x does not strictly
dominate w.
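The definitions above translate almost word for word into code. The sketch below is a minimal illustration, not the algorithm used later in the course: it computes dominator sets with the simple iterative data-flow method and then applies the dominance-frontier definition directly. The edge list `CFG` is an assumption, reconstructed to be consistent with the sets quoted in the following slides rather than copied from the original figure, and the function names are hypothetical.

```python
# ASSUMPTION: this edge list is reconstructed to agree with the sets quoted in
# the slides (DF(5), DF[6], DF[7], DF[8]); it is not copied from the figure.
CFG = {
    1: [2, 5, 9], 2: [3], 3: [4], 4: [13],
    5: [6, 7], 6: [4, 8], 7: [8, 12], 8: [5, 13],
    9: [10, 11], 10: [12], 11: [12], 12: [13], 13: [],
}


def predecessors(cfg):
    """Invert the successor map."""
    preds = {n: set() for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    return preds


def dominators(cfg, start):
    """dom[n] = {n} | intersection of dom[p] over all predecessors p of n."""
    preds = predecessors(cfg)
    dom = {n: set(cfg) for n in cfg}
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for n in cfg:
            if n == start:
                continue
            # Assumes every non-start node has at least one predecessor.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def dominance_frontier(cfg, start, x):
    """All w such that x dominates a predecessor of w but does not strictly dominate w."""
    dom, preds = dominators(cfg, start), predecessors(cfg)
    return {
        w for w in cfg
        if any(x in dom[p] for p in preds[w])   # x dominates a predecessor of w
        and not (x != w and x in dom[w])        # x does not strictly dominate w
    }


print(sorted(dominance_frontier(CFG, 1, 5)))
```

This direct approach recomputes everything per query, so it is only meant to make the definition concrete; the slides that follow describe cheaper ways to obtain the same sets.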
23
Example
[Figure: control flow graph with nodes 1 to 13]
What is the dominance frontier of node 5?
24
Example
[Figure: control flow graph with nodes 1 to 13]
First we must find all nodes that node 5
dominates.
25
Example
[Figure: control flow graph with nodes 1 to 13]
A node w is in the dominance frontier of node 5
if 5 dominates a predecessor of w, but 5 does
not strictly dominate w itself. What is the
dominance frontier of 5?
27
Example
[Figure: control flow graph with nodes 1 to 13]
A node w is in the dominance frontier of node 5
if 5 dominates a predecessor of w, but 5 does
not strictly dominate w itself.
DF(5) = {4, 5, 12, 13}
28
Dominance Frontier Criterion
Dominance Frontier Criterion: If a node x
contains a definition of variable a, then any
node z in the dominance frontier of x needs a φ
function for a.
Can you think of an intuitive explanation for
why a node in the dominance frontier of another
node must be a join node?
29
Example
[Figure: control flow graph with nodes 1 to 13]
If a node (12) is in the dominance frontier of
another node (5), then there must be at least
two paths converging to (12).
These paths must be non-intersecting, and one of
them (5, 7, 12) must contain a node
strictly dominated by (5).
30
Dominator Tree
To compute the dominance frontiers, we first
compute the dominator tree of the CFG.
There is an edge from node x to node y in the
dominator tree if node x immediately dominates
node y.
I.e., x strictly dominates y (x dominates y and
y ≠ x), and x does not dominate any other strict
dominator of y.
Dominator trees can be computed using the
Lengauer-Tarjan algorithm (1979). See Sec. 19.2 of
Appel.
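As a rough illustration of the immediate-dominator definition, the sketch below derives the dominator tree from full dominator sets. It deliberately uses the simple iterative method instead of Lengauer-Tarjan, so it shows what is being computed, not how to compute it efficiently. The edge list `CFG` is an assumption reconstructed from the sets quoted in these slides, and the function names are hypothetical.

```python
# ASSUMPTION: edge list reconstructed to agree with the sets quoted in the
# slides; it is not copied from the original figure.
CFG = {
    1: [2, 5, 9], 2: [3], 3: [4], 4: [13],
    5: [6, 7], 6: [4, 8], 7: [8, 12], 8: [5, 13],
    9: [10, 11], 10: [12], 11: [12], 12: [13], 13: [],
}


def dominators(cfg, start):
    """Iterative data-flow solution (not Lengauer-Tarjan)."""
    preds = {n: set() for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    dom = {n: set(cfg) for n in cfg}
    dom[start] = {start}
    changed = True
    while changed:
        changed = False
        for n in cfg:
            if n == start:
                continue
            # Assumes every non-start node has at least one predecessor.
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n], changed = new, True
    return dom


def immediate_dominators(cfg, start):
    """Map each non-start node to its parent in the dominator tree."""
    dom = dominators(cfg, start)
    idom = {}
    for n in cfg:
        strict = dom[n] - {n}
        if strict:
            # The strict dominators of n form a chain. The closest one is
            # dominated by all the others, so it has the largest dom set.
            idom[n] = max(strict, key=lambda d: len(dom[d]))
    return idom


idom = immediate_dominators(CFG, 1)
print(sorted(n for n, parent in idom.items() if parent == 5))
```

Under the assumed edge list, the children of node 5 come out as 6, 7 and 8, which agrees with the idom values quoted in the later slides.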
31
Example Dominator Tree
[Figure: the control flow graph (nodes 1 to 13) and its dominator tree]
32
Local Dominance Frontier
Cytron-Ferrante define the local dominance
frontier of a node n as:
DFlocal[n] = successors of n in the CFG that are not
strictly dominated by n
33
Example Local Dominance Frontier
In the example, what are the local
dominance frontiers of nodes 5, 6 and 7?
[Figure: control flow graph]
  DFlocal[5] = ∅
  DFlocal[6] = {4, 8}
  DFlocal[7] = {8, 12}
34
Dominance Frontier Inherited From Its Children
The dominance frontier of a node n is formed by
its local dominance frontier plus nodes that are
passed up by the children of n in the dominator
tree.
The contribution of a node c to its parent's
dominance frontier is defined as (Cytron et al., 1991):
DFup[c] = nodes in the dominance frontier of c that are
not strictly dominated by the immediate dominator of c
35
Example Local Dominance Frontier
[Figure: control flow graph]
In the example, what are the contributions of
nodes 6, 7, and 8 to their parent's dominance
frontier?
First we compute the DF and the
immediate dominator of each node:
  DF[6] = {4, 8},  idom(6) = 5
  DF[7] = {8, 12}, idom(7) = 5
  DF[8] = {5, 13}, idom(8) = 5
36
Example Local Dominance Frontier
First we compute the DF and the
immediate dominator of each node:
  DF[6] = {4, 8},  idom(6) = 5
  DF[7] = {8, 12}, idom(7) = 5
  DF[8] = {5, 13}, idom(8) = 5
[Figure: control flow graph]
Now we check for the DFup condition:
  DFup[6] = {4}
  DFup[7] = {12}
  DFup[8] = {5, 13}
37
A note on implementation
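We want to represent these sets efficiently:
  DF[6] = {4, 8}
  DF[7] = {8, 12}
  DF[8] = {5, 13}
If we use bitvectors to represent these sets
(bit i, counting from the right and starting at 0, stands for node i):
  DF[6] = 0000 0001 0001 0000
  DF[7] = 0001 0001 0000 0000
  DF[8] = 0010 0000 0010 0000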
38
Strictly Dominated Sets
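We can also represent the strictly dominated sets
as bitvectors:
  SD[1] = 0011 1111 1111 1100
  SD[2] = 0000 0000 0000 1000
  SD[5] = 0000 0001 1100 0000
  SD[9] = 0000 1100 0000 0000
[Figure: dominator tree. Node 1 is the root with children 2, 4, 5, 9, 12 and 13; node 2 has child 3; node 5 has children 6, 7 and 8; node 9 has children 10 and 11.]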
39
A note on implementation
If we use bitvectors to represent these sets:
  DF[6] = 0000 0001 0001 0000
  DF[7] = 0001 0001 0000 0000
  DF[8] = 0010 0000 0010 0000
  SD[5] = 0000 0001 1100 0000
then, for c = 6 with idom(6) = 5:
  DFup[6] = DF[6] ∧ ¬SD[5]
DFup[c] = nodes in the dominance frontier
of c that are not strictly dominated by
the immediate dominator of c
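A small sketch of the bitvector idea, assuming Python integers as bitvectors with bit i standing for node i (the encoding that matches the 16-bit patterns quoted on these slides). The helper names `bits` and `show` are hypothetical.

```python
def bits(nodes):
    """Encode a set of node numbers as an integer bitvector (bit i = node i)."""
    v = 0
    for n in nodes:
        v |= 1 << n
    return v


def show(v, width=16):
    """Format a bitvector in groups of four bits, most significant bit first."""
    s = format(v, f"0{width}b")
    return " ".join(s[i:i + 4] for i in range(0, width, 4))


DF6 = bits({4, 8})       # dominance frontier of node 6
SD5 = bits({6, 7, 8})    # nodes strictly dominated by idom(6) = 5

# DFup[6]: members of DF[6] that are NOT strictly dominated by node 5.
DFup6 = DF6 & ~SD5

print(show(DF6))    # 0000 0001 0001 0000
print(show(SD5))    # 0000 0001 1100 0000
print(show(DFup6))  # 0000 0000 0001 0000
```

Only bit 4 survives the mask, which corresponds to DFup[6] = {4} in the set notation used earlier.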
40
Dominance Frontier Inherited From Its Children
The dominance frontier of a node n is formed by
its local dominance frontier plus nodes that are
passed up by the children of n in the dominator
tree. Thus the dominance frontier of a node n is
defined as (Cytron et al., 1991):
$$DF[n] = DF_{local}[n] \cup \bigcup_{c \in DTchildren[n]} DF_{up}[c]$$
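The bottom-up computation described here can be sketched as a post-order walk of the dominator tree that combines DFlocal and DFup. This is a minimal sketch under stated assumptions: the edge list `CFG` and the immediate-dominator table `IDOM` are reconstructed to agree with the values quoted on these slides, and all function names are hypothetical.

```python
# ASSUMPTIONS: CFG and IDOM are reconstructed to agree with the sets quoted
# in the slides; they are not copied from the original figures.
CFG = {
    1: [2, 5, 9], 2: [3], 3: [4], 4: [13],
    5: [6, 7], 6: [4, 8], 7: [8, 12], 8: [5, 13],
    9: [10, 11], 10: [12], 11: [12], 12: [13], 13: [],
}
IDOM = {2: 1, 3: 2, 4: 1, 5: 1, 6: 5, 7: 5, 8: 5,
        9: 1, 10: 9, 11: 9, 12: 1, 13: 1}


def strictly_dominates(a, b):
    """True if a is a proper ancestor of b in the dominator tree."""
    while b in IDOM:
        b = IDOM[b]
        if b == a:
            return True
    return False


def df_local(n):
    """Successors of n in the CFG that n does not strictly dominate."""
    return {s for s in CFG[n] if not strictly_dominates(n, s)}


def df_up(c, df):
    """Members of DF[c] not strictly dominated by idom(c)."""
    return {w for w in df[c] if not strictly_dominates(IDOM[c], w)}


def all_dominance_frontiers(root=1):
    children = {n: [] for n in CFG}
    for c, p in IDOM.items():
        children[p].append(c)
    df = {}

    def visit(n):
        # Post-order: each child's DF is complete before it is passed up.
        result = df_local(n)
        for c in children[n]:
            visit(c)
            result |= df_up(c, df)
        df[n] = result

    visit(root)
    return df


DF = all_dominance_frontiers()
print(sorted(DF[5]))
```

With the assumed tables, DF[5] is the union of DFlocal[5] = ∅ with DFup[6] = {4}, DFup[7] = {12} and DFup[8] = {5, 13}.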
41
Example Local Dominance Frontier
What is DF[5]?
Remember that:
  DFlocal[5] = ∅
  DFup[6] = {4}
  DFup[7] = {12}
  DFup[8] = {5, 13}
  DTchildren[5] = {6, 7, 8}
[Figure: control flow graph]
42
Example Local Dominance Frontier
What is DF[5]?
Remember that:
  DFlocal[5] = ∅
  DFup[6] = {4}
  DFup[7] = {12}
  DFup[8] = {5, 13}
  DTchildren[5] = {6, 7, 8}
[Figure: control flow graph]
Thus, DF[5] = {4, 5, 12, 13}
43
Join Sets
In order to insert φ-nodes for a variable x that
is defined in a set of nodes S = {n1, n2, …, nk}
we need to compute the iterated set of join nodes
of S.
Given a set of nodes S of a control flow graph G,
the set of join nodes of S, J(S), is defined as
follows:
J(S) = { z ∈ G | ∃ two paths Pxz and Pyz in G that have z as their
         first common node, x ∈ S and y ∈ S }
44
Iterated Join Sets
Because a φ-node is itself a definition of a
variable, once we insert φ-nodes in the join set
of S, we need to find out the join set of S ∪
J(S). Thus, Cytron et al. define the iterated
join set of a set of nodes S, J⁺(S), as the
limit of the sequence
$$J_1 = J(S), \qquad J_{i+1} = J(S \cup J_i)$$
45
Iterated Dominance Frontier
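We can extend the concept of dominance
frontier to define the dominance frontier of a
set of nodes as
$$DF(S) = \bigcup_{x \in S} DF(x)$$
The iterated dominance frontier (IDF) of S, DF⁺(S), is the
limit of the sequence
$$DF_1 = DF(S), \qquad DF_{i+1} = DF(S \cup DF_i)$$
Exercise: Find an example in which the IDF of a
set S is different from the DF of the set!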
46
Location of φ-Nodes
Given a variable x that is defined in a set of
nodes S = {n1, n2, …, nk}, the set of nodes that
must receive φ-nodes for x is J⁺(S).
Cytron et al. show that J⁺(S) = DF⁺(S). Thus we
are mostly interested in computing the
iterated dominance frontier of a set of nodes.
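The iterated dominance frontier can be sketched as a worklist closure over a precomputed DF table: every node that receives a φ-node becomes a new definition site and is processed in turn. The table `DF` below is an assumption, derived from an edge list reconstructed to agree with the sets quoted in these slides, and the function name is hypothetical.

```python
# ASSUMPTION: DF table derived from a reconstructed edge list that agrees
# with the frontier sets quoted in the slides (DF[5], DF[6], DF[7], DF[8]).
DF = {
    1: set(), 2: {4}, 3: {4}, 4: {13},
    5: {4, 5, 12, 13}, 6: {4, 8}, 7: {8, 12}, 8: {5, 13},
    9: {12}, 10: {12}, 11: {12}, 12: {13}, 13: set(),
}


def iterated_dominance_frontier(df, defs):
    """Nodes that need a phi-node for a variable defined in the nodes `defs`."""
    result = set()
    worklist = list(defs)
    while worklist:
        x = worklist.pop()
        for w in df[x]:
            if w not in result:
                result.add(w)
                # The phi-node placed at w is itself a definition.
                worklist.append(w)
    return result


print(sorted(DF[6]))                                   # plain DF of {6}
print(sorted(iterated_dominance_frontier(DF, {6})))    # iterated DF of {6}
```

Each node enters the worklist at most once after the initial set, so the loop does work proportional to the size of the DF sets it touches. Under the assumed table, the plain and the iterated frontier of {6} differ, because the φ-node placed at node 8 in turn forces φ-nodes at the members of DF[8].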
47
Algorithms to Compute Dominance Frontier
The algorithm to insert φ-nodes, due to Cytron
et al. (1991), computes the dominance
frontier of each node in the set S before
computing the iterated dominance frontier of the
set.
In the worst case, the combined size of the
dominance frontiers can be quadratic
in the number of nodes in the CFG. Thus,
Cytron et al.'s algorithm has a complexity of
O(N²).
In 1994, Sreedhar and Gao proposed a
simple, linear algorithm for the insertion of
φ-nodes.
48
Sreedhar and Gao's DJ Graph
[Figure: the control flow graph (nodes 1 to 13) and its dominator tree]
49
Sreedhar and Gao's DJ Graph
[Figure: the control flow graph (nodes 1 to 13) and its dominator tree; legend: D nodes]
50
Sreedhar and Gao's DJ Graph
[Figure: the control flow graph (nodes 1 to 13) and its dominator tree; legend: D nodes, J nodes]
51
Sreedhar-Gao's Dominance Frontier Algorithm
DominanceFrontier(x)
0:  DF[x] = ∅
1:  foreach y ∈ SubTree(x) do
2:    if ((y → z is a J-edge) and
3:        (z.level ≤ x.level))
4:    then DF[x] = DF[x] ∪ {z}
[Figure: DJ graph]
What is DF[5]?
52
Sreedhar-Gao's Dominance Frontier Algorithm
Initialization: DF[5] = ∅
SubTree(5) = {5, 6, 7, 8}
[Figure: DJ graph, with the DominanceFrontier(x) pseudocode of the previous slide]
53
Sreedhar-Gao's Dominance Frontier Algorithm
Initialization: DF[5] = ∅
SubTree(5) = {5, 6, 7, 8}
There are three edges originating in 5: 5→6, 5→7, 5→8,
but they are all D-edges.
[Figure: DJ graph, with the DominanceFrontier(x) pseudocode]
54
Sreedhar-Gao's Dominance Frontier Algorithm
Initialization: DF[5] = ∅
After visiting 6: DF = {4}
SubTree(5) = {5, 6, 7, 8}
There are two edges originating in 6: 6→4 and 6→8,
but 8.level > 5.level.
[Figure: DJ graph, with the DominanceFrontier(x) pseudocode]
55
Sreedhar-Gao's Dominance Frontier Algorithm
Initialization: DF[5] = ∅
After visiting 6: DF = {4}
After visiting 7: DF = {4, 12}
SubTree(5) = {5, 6, 7, 8}
There are two edges originating in 7: 7→8 and 7→12;
again 8.level > 5.level.
[Figure: DJ graph, with the DominanceFrontier(x) pseudocode]
56
Sreedhar-Gao's Dominance Frontier Algorithm
Initialization: DF[5] = ∅
After visiting 6: DF = {4}
After visiting 7: DF = {4, 12}
After visiting 8: DF = {4, 12, 5, 13}
SubTree(5) = {5, 6, 7, 8}
There are two edges originating in 8: 8→5 and 8→13;
both satisfy the condition in steps 2-3.
[Figure: DJ graph, with the DominanceFrontier(x) pseudocode]
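The walk-through above can be reproduced with a short sketch of the DominanceFrontier(x) pseudocode. It treats a CFG edge y→z as a J-edge when y is not the immediate dominator of z, and uses depth in the dominator tree as the level. The tables `CFG` and `IDOM` are assumptions reconstructed to agree with the edges and sets quoted in these slides, and the function names are hypothetical.

```python
# ASSUMPTIONS: CFG and IDOM are reconstructed to agree with the edges and
# sets quoted in the slides; they are not copied from the original figures.
CFG = {
    1: [2, 5, 9], 2: [3], 3: [4], 4: [13],
    5: [6, 7], 6: [4, 8], 7: [8, 12], 8: [5, 13],
    9: [10, 11], 10: [12], 11: [12], 12: [13], 13: [],
}
IDOM = {2: 1, 3: 2, 4: 1, 5: 1, 6: 5, 7: 5, 8: 5,
        9: 1, 10: 9, 11: 9, 12: 1, 13: 1}


def level(n):
    """Depth of n in the dominator tree (the root has level 0)."""
    depth = 0
    while n in IDOM:
        n = IDOM[n]
        depth += 1
    return depth


def subtree(x):
    """x and all its descendants in the dominator tree."""
    nodes = [x]
    for n in nodes:  # the list grows while we iterate (breadth-first)
        nodes.extend(c for c, p in IDOM.items() if p == n)
    return nodes


def dominance_frontier(x):
    df = set()
    for y in subtree(x):
        for z in CFG[y]:
            is_j_edge = IDOM.get(z) != y   # otherwise y→z is a D-edge
            if is_j_edge and level(z) <= level(x):
                df.add(z)
    return df


print(sorted(dominance_frontier(5)))
```

This sketch recomputes levels and subtrees on every call for brevity; an implementation aiming at the stated complexity would store them once per node.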
57
Sreedhar-Gao's φ-Node Insertion Algorithm
Using the DJ graph, Sreedhar and Gao propose a
linear-time algorithm to compute the
iterated dominance frontier of a set of
nodes. An important intuition in Sreedhar-Gao's
algorithm is: if two nodes x and y are in S,
and y is an ancestor of x in the dominator
tree, then if we compute DF[x] first, we do
not need to recompute DF[x] when computing
DF[y].
58
Sreedhar-Gao's φ-Node Insertion Algorithm
Sreedhar-Gao's algorithm also uses a work list of
nodes hashed by their level in the dominator tree
and a visited flag to avoid visiting the same
node more than once. The basic operation of the
algorithm is similar to their dominance-frontier
algorithm, but it requires a careful
implementation to deliver the linear-time complexity.
59
Dead-Code Elimination in SSA Form
Because there is only one definition for
each variable, if the list of uses of the
variable is empty, the definition is dead.
When a statement v ← x ⊕ y is eliminated because v
is dead, this statement must be removed from the
list of uses of x and y, which might cause those
definitions to become dead. Thus we need to
iterate the dead code elimination algorithm.
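The iteration described above can be sketched with use counts and a worklist. The representation is an assumption made for illustration: each SSA variable maps to the list of variables its defining statement reads, statements are assumed to have no side effects, and `live_out` names the variables that are observed outside (for example, returned). The function name is hypothetical.

```python
def eliminate_dead_code(defs, live_out):
    """Remove definitions whose variables are never used.

    defs:     var -> list of variables read by the statement defining var
    live_out: variables that must be kept (e.g. returned values)
    ASSUMPTION: statements have no side effects, so an unused definition
    can always be deleted.
    """
    defs = dict(defs)  # work on a copy
    uses = {v: 0 for v in defs}
    for operands in defs.values():
        for u in operands:
            if u in uses:
                uses[u] += 1
    for v in live_out:
        uses[v] += 1

    worklist = [v for v, count in uses.items() if count == 0]
    while worklist:
        v = worklist.pop()
        if v not in defs:
            continue
        # Deleting v's statement removes one use of each of its operands,
        # which may in turn make their definitions dead.
        for u in defs.pop(v):
            if u in uses:
                uses[u] -= 1
                if uses[u] == 0:
                    worklist.append(u)
    return defs


program = {"x1": [], "y1": ["x1"], "v1": ["x1", "y1"], "w1": ["x1"]}
print(sorted(eliminate_dead_code(program, {"w1"})))
```

In this example v1 is dead from the start; removing it makes y1 dead as well, while x1 survives because w1 still uses it.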
60
Simple Constant Propagation in SSA
If there is a statement v ← c, where c is a
constant, then all uses of v can be replaced by
c.
A φ function of the form v ← φ(c1, c2, …, cn)
where all ci are identical can be replaced by
v ← c.
Using a work-list algorithm in a program in
SSA form, we can perform constant propagation in
linear time.
In the next slide we assume that x, y, z are
variables and a, b, c are constants.
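A minimal sketch of the two rules above as a worklist algorithm. The statement encoding is an assumption for illustration: each SSA variable maps to a pair (kind, args), where kind is "const", "phi" or any other operation name, and args holds integer constants or variable names. For brevity the sketch scans all statements to find the uses of a variable; an implementation meeting the linear-time claim would keep explicit use lists instead.

```python
def propagate_constants(stmts):
    """Simple constant propagation over SSA statements.

    stmts: var -> (kind, args); args are ints (constants) or strs (variables).
    ASSUMPTION: this encoding is hypothetical, chosen only for illustration.
    """
    stmts = {v: (kind, list(args)) for v, (kind, args) in stmts.items()}
    worklist = list(stmts)
    while worklist:
        v = worklist.pop()
        kind, args = stmts[v]
        # Rule 2: a phi whose arguments are all the same constant
        # becomes v <- c.
        if (kind == "phi" and all(isinstance(a, int) for a in args)
                and len(set(args)) == 1):
            kind, args = "const", [args[0]]
            stmts[v] = (kind, args)
        # Rule 1: for v <- c, replace every use of v by c.
        if kind == "const":
            c = args[0]
            for u, (_, uargs) in stmts.items():
                if v in uargs:
                    uargs[:] = [c if a == v else a for a in uargs]
                    worklist.append(u)  # u changed, so look at it again
    return stmts


example = {
    "a1": ("const", [3]),
    "b1": ("const", [3]),
    "c1": ("phi", ["a1", "b1"]),
    "d1": ("add", ["c1", "x0"]),
}
result = propagate_constants(example)
print(result["c1"], result["d1"])
```

Here c1 collapses to the constant 3 once both of its arguments are known, and that constant then flows into d1. The now unused definitions a1, b1 and c1 are left for dead-code elimination to remove.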
61
Linear Time Optimizations in SSA form
Copy propagation: The statement x ← φ(y) or the
statement x ← y can be deleted and y can
substitute every use of x.
Constant folding: If we have the statement
x ← a ⊕ b, we can evaluate c ← a ⊕ b at compile
time and replace the statement by x ← c.
Constant conditions: The conditional
if a < b goto L1 else L2 can be replaced by
goto L1 or goto L2, according to the compile-time
evaluation of a < b, and the CFG and use lists
are adjusted accordingly.
Unreachable code: eliminate unreachable blocks.
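Constant folding and constant conditions can be sketched as two small rewriting functions. The tuple encodings of statements and branches, the operator table `OPS`, and the function names are all assumptions made for illustration.

```python
import operator

# ASSUMPTION: a small, hypothetical operator table standing in for ⊕.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "<": operator.lt}


def fold(stmt):
    """Constant folding: (x, op, a, b) with constant a, b becomes (x, 'const', c)."""
    dest, op, left, right = stmt
    if isinstance(left, int) and isinstance(right, int):
        return (dest, "const", OPS[op](left, right))
    return stmt  # at least one operand is a variable: leave unchanged


def fold_branch(cond, then_label, else_label):
    """Constant conditions: decide the branch at compile time when possible."""
    op, left, right = cond
    if isinstance(left, int) and isinstance(right, int):
        taken = then_label if OPS[op](left, right) else else_label
        return ("goto", taken)
    return ("if", cond, then_label, else_label)


print(fold(("x", "+", 2, 3)))                  # ('x', 'const', 5)
print(fold(("x", "+", "y", 3)))                # ('x', '+', 'y', 3)
print(fold_branch(("<", 1, 20), "L1", "L2"))   # ('goto', 'L1')
```

After a branch is replaced by a goto, the edge to the label that is no longer targeted disappears from the CFG, which is what can expose unreachable blocks.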
62
Single Assignment Form
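Source program:
  i = 1; j = 1; k = 0;
  while (k < 100) {
    if (j < 20) { j = i; k = k + 1; }
    else        { j = k; k = k + 2; }
  }
  return j;
Control flow graph:
  B1: i ← 1; j ← 1; k ← 0
  B2: if k < 100
  B3: if j < 20
  B4: return j
  B5, B6, B7: (contents not shown in the transcript)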
63
Single Assignment Form
[Same source program as on the previous slide]
Control flow graph:
  B1: i ← 1; j ← 1; k1 ← 0
  B2: if k < 100
  B3: if j < 20
  B4: return j
  B5, B6, B7: (contents not shown in the transcript)
64
Single Assignment Form
[Same source program as on the previous slide]
Control flow graph:
  B1: i ← 1; j ← 1; k1 ← 0
  B2: if k < 100
  B3: if j < 20
  B4: return j
  B5, B6: (contents not shown in the transcript)
  B7: k4 ← φ(k3, k5)
65
Single Assignment Form
[Same source program as on the previous slide]
Control flow graph:
  B1: i ← 1; j ← 1; k1 ← 0
  B2: k2 ← φ(k4, k1); if k < 100
  B3: if j < 20
  B4: return j
  B5, B6: (contents not shown in the transcript)
  B7: k4 ← φ(k3, k5)
66
Single Assignment Form
[Same source program as on the previous slide]
Control flow graph:
  B1: i ← 1; j ← 1; k1 ← 0
  B2: k2 ← φ(k4, k1); if k2 < 100
  B3: if j < 20
  B4: return j
  B5, B6: (contents not shown in the transcript)
  B7: k4 ← φ(k3, k5)
67
Single Assignment Form
  i = 1; j = 1; k = 0;
  while (k < 100) {
    if (j < 20) { j = i; k = k + 1; }
    else        { j = k; k = k + 2; }
  }
  return j;
68
Example Constant Propagation
69
Example Dead-code Elimination
70
Constant Propagation and Dead Code Elimination
71
Example Is this the end?
But block 6 is never executed! How can we find
this out, and simplify the program?
SSA conditional constant propagation finds the
least fixed point for the program and allows
further elimination of dead code.
See algorithm on pp. 454-455 of Appel.
  B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
  B3: if j2 < 20
  B4: return j2
  B5, B6: (contents not shown in the transcript)
  B7: j4 ← φ(1, j5); k4 ← φ(k3, k5)
72
Example Dead code elimination
  B2: j2 ← φ(j4, 1); k2 ← φ(k4, 0); if k2 < 100
  B3: if j2 < 20
  B4: return j2
  B5, B6: (contents not shown in the transcript)
  B7: j4 ← φ(1, j5); k4 ← φ(k3, k5)
73
Example Single Argument φ-Function Elimination
74
Example Constant and Copy Propagation
75
Example Dead Code Elimination
76
Example φ-Function Simplification
77
Example Constant Propagation
78
Example Dead Code Elimination