Interprocedural Optimization

Slides: 55

Provided by: barbara179

1
Interprocedural Optimization

Interprocedural Data Flow Analysis
Optimizations

2
Interprocedural Analysis and Optimization

Take effects of procedure calls into account when
analysis is performed
Improve code across procedure boundaries
Interprocedural analysis gathers information
about entire program instead of single procedure
Interprocedural optimizations are transformations
that involve more than one procedure in a program

3
Why do we need IP Optimizations?

int a, b
main() {
a = read()
b = a
f()
print(a, b)
}
f() {
a = a + b
}
How is the data flow affected by this function
call?

4
Goal of Interprocedural Dataflow Analysis

So far, we have seen how to obtain dataflow
information within a procedure.
It is based on outward-exposed definitions and
uses, as well as the definitions preserved by a
statement
We used worst case information for procedure and
function calls
We assumed that they use and define all arguments
passed to them, as well as all global variables
This may be highly inaccurate

One goal of interprocedural analysis is to extend
our previous intraprocedural analysis to deal
with call statements and function calls.
5
Information obtained from IPA

Interprocedural analysis may answer these
questions
Which procedures invoke a given procedure?
What variables may be modified as the side
effects of a procedure call?
What parameters of a procedure have constant
values when it is called from a particular call
site?
What range of arguments are associated with a
formal parameter?

IPA can also help us to understand and improve
the overall program structure, possibly by
creating new procedures or extracting (hoisting)
code from an existing one.
6
Interprocedural Optimizations

Interprocedural analysis enables a variety of
optimizations
Inline substitution (inlining)
Procedure cloning, specialization
Interprocedural register allocation
Interprocedural constant propagation
Interprocedural code motion
Interprocedural dead code elimination
Array privatization and other techniques that are
important for parallelizing code

7
Interprocedural Analysis

Requires saving information on compilation
units, their relationships and information about
them
An expensive addition to the compiler
Can slow down compilation by as much as a factor
of 10
Does not always provide corresponding benefit
Very useful for parallelizing compilers

8
Topics in Interprocedural Analysis

Callgraph construction to represent function and
procedure invocations
Dataflow analysis
Interprocedural side effect analysis
Array region analysis to summarize results
Interprocedural constant propagation
Interprocedural alias analysis
Symbolic analysis

9
Interprocedural Dataflow Analysis

Goal: Take effects of procedure calls into
account when dataflow analysis is performed
We create a flowgraph for each procedure and a
call graph to represent the flow between
procedures
The call graph captures some information on the
flow of control and data through the program.
Formal arguments and global variables may be
defined and used in a procedure. The results are
visible in the calling procedure.

We must associate the formals with actuals
We would like to gather definitions, uses and
preserved definitions for a statement that makes
a procedure call.
10
Call Graph

Each procedure is represented once in call
graph. Individual invocations (call sites) are
not represented.

program main
call a ( x, n)
call b (y, m)
call a (y, n)
end program main
subroutine a ( xx, nn)
call b ( xx, nn)
end subroutine a
subroutine b ( yy, mm)
end subroutine b

[Figure: call graph for this program, with nodes main, a and b]
More detailed representations may be required to
solve certain problems.
11
The Call Graph

More formally, let P be the set of procedures and
functions in a program
The Call Graph for the program is
a directed graph G = (N, E)
where each node corresponds to a subprogram in P,
so there is a 1-1 correspondence between n ∈ N
and p ∈ P,
and (p, q) ∈ E iff p contains a statement which
may directly activate q.
Easy to construct unless there are formal
procedure parameters.

[Figure: example call graph with nodes p1, p2, p3 and p4]
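
The definition above can be sketched in a few lines of Python. This is only an illustration for the case without formal procedure parameters; the dictionary `direct_calls` and the function name `build_call_graph` are assumptions made for the example, and the input mirrors the small program with procedures main, a and b from the Call Graph slide.

```python
def build_call_graph(direct_calls):
    """Return (nodes, edges) with one node per procedure.

    direct_calls maps each procedure to the callees named at its call
    sites. Repeated call sites collapse into a single edge, because the
    call graph does not represent individual invocations.
    """
    nodes = set(direct_calls)
    edges = set()
    for caller, callees in direct_calls.items():
        for callee in callees:
            edges.add((caller, callee))
    return nodes, edges


# main calls a twice and b once; a calls b; b calls nothing.
direct_calls = {"main": ["a", "b", "a"], "a": ["b"], "b": []}
nodes, edges = build_call_graph(direct_calls)
print(sorted(edges))
```

The two calls from main to a produce one edge, which is why more detailed representations (such as a call multi-graph) are needed when call sites must be told apart.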
12
Practical Strategy for Building Call Graph

Compile program units separately
Gather information needed to construct call graph
(and call chains)
Details of call sites and the arguments
Information required to deal with any formal
procedure parameters
Create call graph after all procedures are parsed
Enables us to avoid reloading procedures during
this process
Some optimizations may be performed prior to the
construction of the call graph
But worst case assumptions will be applied at
call sites
So usually only transformations that standardize
aspects of the code

13
Arguments at Call Sites

Each procedure invocation is a call site.
A call site involves actual arguments and formal
arguments (parameters):
an ordered list of actual arguments is a
procedure vector, and
an ordered list of formal arguments is a (formal)
parameter vector.
At run time the list of formal arguments is
dynamically bound to a list of actual arguments
when the procedure is invoked.
The arguments may have a different binding each
time the procedure is executed.

program main
call a ( x, n)
call b (y, m)
subroutine a ( xx, nn)
call b ( xx, nn)
end subroutine a
14
Terminology, Notation

The set of all calls in procedure p is calls(p).
formal(p) is the set of formal parameters (formal
arguments) of p.
The (formal) parameter vector of p is the list of
formal parameters in the order of their
occurrence in the declaration; it is denoted by
pv_p.
Many different combinations of actual arguments
(parameters) may be associated with the formal
arguments of a procedure

15
Example
calls(main) = { demo(3,4), proc(1) }

main
call demo ( 3, 4 )
call proc (1)
end
subroutine demo ( x, y )
call proc ( x )
call proc ( y )
end
subroutine proc ( z )
end

[Figure: call graph with nodes main, demo and proc]
calls(demo) = { proc(x), proc(y) }
calls(proc) = ∅
16
Example

main
call demo ( 3, 4 )
call proc (1)
end
subroutine demo ( x, y )
call proc ( x )
call proc ( y )
end
subroutine proc ( z )
end

main: formal(main) = ∅
demo: formal(demo) = { x, y }, pv_demo = (x, y)
proc: formal(proc) = { z }, pv_proc = (z)
procedure_vector(proc) = (1), (x), (y)
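
As a rough illustration of this notation, the sets for the example can be written down and queried in Python. The names `calls`, `pv`, `formal` and `procedure_vectors` are choices made for this sketch, not part of the slides.

```python
# Each call site is (callee, actual arguments); values mirror the example.
calls = {
    "main": [("demo", (3, 4)), ("proc", (1,))],
    "demo": [("proc", ("x",)), ("proc", ("y",))],
    "proc": [],
}

# pv[p] is the formal parameter vector of p, in declaration order.
pv = {"main": (), "demo": ("x", "y"), "proc": ("z",)}


def formal(p):
    """The set of formal parameters of p."""
    return set(pv[p])


def procedure_vectors(q):
    """All actual-argument lists with which q is called in the program."""
    return [args for sites in calls.values()
            for callee, args in sites if callee == q]


print(formal("demo"))
print(procedure_vectors("proc"))
```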
17
Procedure Parameters

Now for the hard case.
Many languages permit procedures (and functions)
to be passed to other procedures as arguments
If these are present, we must be careful to
consider all possible bindings of the formal
procedure parameters when constructing the call
graph

subroutine aproc ( a, b, c, n, bproc)
real a(n), b(n)
call myproc ( a, n )
call bproc (a, n)
call myproc ( b, n )
bproc is a formal parameter; there is no
procedure with that name. Instead, it will bind
to a real procedure, which will be invoked with
arguments a, n.
18
Binding of Arguments

subroutine aproc ( a, b, c, n, bproc)
real a(n), b(n)
call myproc ( a, n )
call bproc (b, n)
Let aproc be called with procedure vectors
( x, y, z, 100, oneproc ) and ( q, r, z, 250,
twoproc )
The formal parameters correspond 1-1 to actual
arguments
So in the first procedure vector, oneproc
corresponds to bproc and in the second twoproc
does.
So procedure calls in aproc have the following
arguments, (first two are direct, last two are
not)
myproc ( x, 100 ), myproc ( q, 250), oneproc ( y,
100), twoproc (r, 250)

[Figure: call graph with edges from aproc to myproc, oneproc and twoproc]
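
The binding step can be sketched as a substitution of actuals for formals. The helper names `bind` and `resolve_calls` are assumptions for this example; the data is the aproc example from this slide.

```python
def bind(pv, actuals):
    """Map each formal parameter to the actual bound to it (1-1, by position)."""
    return dict(zip(pv, actuals))


def resolve_calls(body_calls, pv, actuals):
    """Rewrite the calls in a procedure body for one procedure vector.

    A callee name that is a formal procedure parameter is replaced by the
    real procedure bound to it; argument names are replaced the same way.
    """
    env = bind(pv, actuals)
    return [(env.get(callee, callee), tuple(env.get(a, a) for a in args))
            for callee, args in body_calls]


pv_aproc = ("a", "b", "c", "n", "bproc")
body = [("myproc", ("a", "n")), ("bproc", ("b", "n"))]

first = resolve_calls(body, pv_aproc, ("x", "y", "z", 100, "oneproc"))
second = resolve_calls(body, pv_aproc, ("q", "r", "z", 250, "twoproc"))
print(first)
print(second)
```

The direct call keeps its callee name myproc in both cases, while the call through bproc resolves to oneproc or twoproc depending on the procedure vector.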
19
Example

subroutine anotherproc ( a, b, fproc, n )
real a(n), b(n)
call fproc ( a, n )
call fproc ( b, n )
.
If the calls to anotherproc are as follows
call anotherproc ( x, y, this, 100), and
call anotherproc ( q, r, that, 250)
What actual procedure calls are made by
anotherproc ?

fproc is a formal procedure parameter!
20
Example

subroutine anotherproc ( a, b, fproc, n )
real a(n), b(n)
call fproc ( a, n )
call fproc ( b, n )
.
If the calls to anotherproc are as follows
call anotherproc ( x, y, this, 100), and
call anotherproc ( q, r, that, 250)
anotherproc makes the following calls

fproc is a formal procedure parameter!
call this (x,100)
call this (y,100)
call that (q,250)
call that (r,250)
[Figure: call graph with edges from anotherproc to this and that]
21
Construction of the Call Graph

Main idea: Create one node for each procedure.
Whenever procedure p calls procedure q, enter a
directed edge (p, q) into graph.
If there are no procedure parameters, we are
done. Otherwise, we must add edges for those
procedures that are associated with formal
procedure parameters.
To do so, we must find all procedures which may
be associated with a given formal procedure
parameter, i.e. all bindings of that parameter.

22
Kinds of Procedure Calls

main
call suba ( 1, 2, oneproc )
call suba ( 1, 2, otherproc )
end
subroutine suba ( x, y, proc)
call subb ( x, proc )
call subb ( y, proc )
end
subroutine subb (a, myproc )
call myproc ( a )
end
subroutine oneproc ( x )
subroutine otherproc ( y )

Whenever we find a real procedure passed as an
argument, we must record that information. We
record two of them for main: oneproc and
otherproc. suba and subb do not pass real
procedures as arguments. The procedure calls in
main and in suba are direct references; these are
real procedures. However, subb has no direct
references.
23
Procedure Parameters
When we build the call graph, we have to be very
careful about the order in which we treat
procedures so that we find all the bindings for a
formal procedure parameter.

main
call suba ( 1, 2, oneproc )
call suba ( 1, 2, otherproc )
end
subroutine suba (x,y,proc)
call subb ( x, proc )
call subb ( y, proc )
end
subroutine subb (a,myproc)
call myproc ( a )
subroutine oneproc ( x )
subroutine otherproc ( y )

main: suba ∈ P
suba: subb ∈ P
subb: myproc ∈ formal(subb)
oneproc
otherproc
24
Processing Formal Procedure Calls

main
call suba ( 1, 2, oneproc )
call suba ( 1, 2, otherproc )
end
subroutine suba ( x, y, proc)
call subb ( x, proc )
call subb ( y, proc )
end
subroutine subb (a, myproc )
call myproc ( a )
subroutine oneproc ( x )
subroutine otherproc ( y )

We want to visit procedures in the call graph
after we have dealt with the procedures that call
them. So in our algorithm, we have to make sure
that procedures passed as arguments are not
processed too soon. We manage this by assigning
levels to procedures based on their calling
distance from the main program. This is the
only tricky part of the algorithm.
25
Construction of Call Graph
Our algorithm cannot handle irreducible graphs.
If we detect cycles in the graph we have to clone
a node.

Initialization
Nodes: P, the set of procedures in the program.
Edges: Initially empty.
Mark nodes as not visited.
Initialize level to 0.
level(p) is the length of the longest path from
MAIN to procedure p at any stage during the
construction of the callgraph.
We require that the procedures called by a given
procedure will have a higher number than their
caller.
We add temporary edges during the construction of
the graph in order to ensure that these orderings
are respected.
We need to check that the ordering is maintained
any time we add an edge.

26
Construction of Call Graph

Step 1
Construct the set calls(p) for each p in P, and set
up procvectors.
For each q in calls(p), if q ∈ P, the call is
not to a formal procedure. Create a permanent edge
from p to q in the callgraph.
Update levels as needed when an edge is entered.
procvector(MAIN) = ∅
Whenever call p ( a ) is a call to p ∈ P and a
does not contain formal arguments, add a to
procvectors(p)

27
Example
[Figure: call graph so far; main has level 0]

program main
call aproc ( x, y, z, m, cproc)
subroutine aproc ( a, b, c, n, bproc)
real a(n), b(n)
call myproc ( a, n )
call bproc (a, n)
call myproc ( b, n )
subroutine myproc ( d, q)
call cproc ( d, q )
subroutine cproc ( d, q)

pv(aproc) = (x,y,z,m,cproc)
[Figure: aproc has level 1]
28
Construction of Call Graph

Step 2
For each procedure p, construct the set
referrals(p): the set of procedures that are
passed as arguments and not actually called
For each q ∈ referrals(p), enter a temporary
edge (p,q) into the call graph. Update the levels
if necessary
This ensures we visit procedures in an
appropriate order
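
A minimal sketch of referrals(p) and of the level computation, assuming the edge set is acyclic (the loop below would not terminate on a cycle). The function names and the dictionary layout are assumptions for this example; the program is the aproc/myproc/cproc example used on the surrounding slides, with all permanent and temporary edges entered at once rather than step by step.

```python
def referrals(p, calls, procs):
    """Procedures that p passes as arguments rather than calls."""
    return {a for _callee, args in calls[p] for a in args if a in procs}


def longest_path_levels(edges, root="main"):
    """level(p) = length of the longest path from root to p (acyclic input)."""
    level = {root: 0}
    changed = True
    while changed:
        changed = False
        for p, q in edges:
            if p in level and level.get(q, -1) < level[p] + 1:
                level[q] = level[p] + 1
                changed = True
    return level


procs = {"main", "aproc", "myproc", "cproc"}
calls = {
    "main": [("aproc", ("x", "y", "z", "m", "cproc"))],
    "aproc": [("myproc", ("a", "n")), ("bproc", ("a", "n")),
              ("myproc", ("b", "n"))],
    "myproc": [("cproc", ("d", "q"))],
    "cproc": [],
}

# Permanent edges: direct calls to real procedures (Step 1).
permanent = {(p, q) for p in calls for q, _args in calls[p] if q in procs}
# Temporary edges: from p to every procedure it passes along (Step 2).
temporary = {(p, q) for p in calls for q in referrals(p, calls, procs)}

level = longest_path_levels(permanent | temporary)
print(referrals("main", calls, procs), level)
```

The temporary edge from main to cproc guarantees that cproc never gets a lower level than a procedure that may end up calling it through a formal parameter.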

29
Example
[Figure: call graph so far; main has level 0]

program main
call aproc ( x, y, z, m, cproc)
subroutine aproc ( a, b, c, n, bproc)
real a(n), b(n)
call myproc ( a, n )
call bproc (a, n)
call myproc ( b, n )
subroutine myproc ( d, q)
call cproc ( d, q )
subroutine cproc ( d, q)

pv(aproc) = (x,y,z,m,cproc)
[Figure: aproc has level 1; cproc has level 1]
30
Construction of Call Graph

Step 3
While there are procedures marked as not visited,
select one, p, with minimal level
for each call x(y) in procedure p, where y =
(y1,...,yk)
for each a = (a1,...,an) in procvectors(p)
Substitute each formal parameter in x(y) by the
corresponding ai to obtain q(b). This no longer
contains formal parameters.
If x is a formal parameter, enter a permanent
edge (p,q) in callgraph.
Update levels if needed.
If there is a procedure in the argument list b,
enter a temporary edge (q,b) into callgraph
Add new procedure vectors for q to procvectors(q)
Remove any temporary edges from p and mark p as
visited
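
The substitution at the heart of Step 3 can be sketched with a simple worklist. Note that this is a simplified fixed-point variant, not the level-ordered algorithm of the slides: it omits levels and temporary edges and instead reprocesses a procedure whenever a new procedure vector reaches it. All names (`build_call_graph`, `procvectors`, `root`) are assumptions for this example, and the input is the suba/subb program from the earlier slides.

```python
from collections import deque


def build_call_graph(pv, calls, root="main"):
    """pv: procedure -> formal parameter tuple.
    calls: procedure -> list of (callee name, argument tuple)."""
    procs = set(pv)
    edges = set()
    procvectors = {p: set() for p in procs}
    procvectors[root].add(())
    work = deque([(root, ())])
    while work:
        p, actuals = work.popleft()
        env = dict(zip(pv[p], actuals))      # formal -> actual binding
        for callee, args in calls[p]:
            q = env.get(callee, callee)      # resolve a formal procedure
            b = tuple(env.get(a, a) for a in args)
            if q not in procs:
                continue                     # unresolved name: skip it
            edges.add((p, q))
            if b not in procvectors[q]:      # new binding for q: revisit q
                procvectors[q].add(b)
                work.append((q, b))
    return edges, procvectors


pv = {"main": (), "suba": ("x", "y", "proc"), "subb": ("a", "myproc"),
      "oneproc": ("x",), "otherproc": ("y",)}
calls = {
    "main": [("suba", (1, 2, "oneproc")), ("suba", (1, 2, "otherproc"))],
    "suba": [("subb", ("x", "proc")), ("subb", ("y", "proc"))],
    "subb": [("myproc", ("a",))],
    "oneproc": [],
    "otherproc": [],
}
edges, procvectors = build_call_graph(pv, calls)
print(sorted(edges))
```

The worklist terminates because only finitely many distinct procedure vectors can be formed from the constants and names in the program; the level ordering of the slides serves to reach the same bindings while visiting each procedure once.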

31
Example
[Figure: call graph so far; main has level 0]

program main
call aproc ( x, y, z, m, cproc)
subroutine aproc ( a, b, c, n, bproc)
real a(n), b(n)
call myproc ( a, n )
call bproc (a, n)
call myproc ( b, n )
subroutine myproc ( d, q)
call cproc ( d, q )
subroutine cproc ( d, q)

referrals(main) = { cproc }
[Figure: aproc has level 1; myproc and cproc have level 2]
32
Example
[Figure: call graph so far; main has level 0]

program main
call aproc ( x, y, z, m, cproc)
subroutine aproc ( a, b, c, n, bproc)
real a(n), b(n)
call myproc ( a, n )
call bproc (a, n)
call myproc ( b, n )
pv(aproc) = (x,y,z,m,cproc)
myproc(a,n) → myproc(x,m)
bproc(a,n) → cproc(x,m)
myproc(b,n) → myproc(y,m)

[Figure: aproc has level 1, myproc level 2, cproc level 3]
33
Example 2

main
call suba ( 1, 2, oneproc )
call suba ( 1, 2, otherproc )
end
subroutine suba ( x,y,proc)
call subb ( x, proc )
call subb ( y, proc )
end
subroutine subb (a,myproc)
call myproc ( a )
subroutine oneproc ( x )
subroutine otherproc ( y )

main (level 0): procvector(main) = ∅
suba (level 1): procvector(suba) = (1,2,oneproc),
(1,2,otherproc)
subb (level 2): procvector(subb) = ?
oneproc (level 0)
otherproc (level 0)
34
Interprocedural Definitions and Uses

We do not want to gather this information each
time we encounter a procedure call
So we gather information once for each procedure
and then use it each time we encounter a call to
it
To do this systematically, we must handle
procedures in a certain order (reverse invocation
order)
Unfortunately, in general it is not possible to
save complete information on all accesses made in
a procedure for use elsewhere.

How do we summarize set of data referenced in
procedure?
Different strategies imply different levels of
accuracy.

35
Kinds of Interprocedural Analysis

Summary analysis: Any strategy that determines
and stores an approximation to the set of data
defined and the set of data used in a procedure.
There are several alternative methods for
deriving this information.
Flow-sensitive analyses (MUST problems) take all
control flow paths in the procedure into account.
Flow-insensitive analyses (MAY problems) do not.
We can also classify the task of creating the
call graph
Construction of the call graph is a
flow-insensitive problem.

Most of the time, a compiler will gather
flow-insensitive information.
36
MAY_DEF/USE Information

MAY_DEF(p): the set of all global variables and
formal parameters of p that may be defined as a
result of execution of procedure p.
MAY_USE(p): the set of all global variables and
formal parameters of p whose value may be used in
p.
Control flow in p is not taken into account in
this definition.
Information is computed and made available to
procedures that invoke this one.

37
Example
main: may define and use a and b; may define and use c

main
call demo ( a, b )
call proc (c)
end
subroutine demo ( x, y )
call proc ( x )
call proc ( y )
end
subroutine proc ( z )
if (..) z = .. z
end

[Figure: call graph with nodes main, demo and proc]
demo: may define and use x and y
proc: may define and use z
38
Computation of MAY_DEF Information
NB. Call chains are not explicitly in call graph.

To form the set of all variables that may be
modified (or used) as the result of the
invocation of a procedure, we must visit the
procedures called by a given routine in advance.
This is a bottom-up process: information is
propagated from a called procedure to its
callers.
The algorithm to compute this information for
each procedure must traverse the call chains in
reverse of invocation order
It first determines the set of direct definitions
DIR_DEF(p) for each procedure p. Each element of
this set is visible to the calling procedures.
Details depend on the programming language. E.g.
in Fortran, these are the global and formal
variables that may be defined (used) in
statements of p that are not procedure calls.

39
Direct Definitions
glob is global variable

main
glob = 3
read val, zval
call suba ( val, k )
call suba ( zval, k )
.. = glob
end
subroutine suba ( x, y)
.. = x
call subb ( x )
y = glob .. x
call subb ( y )
end
subroutine subb ( p )
glob = p

Dir_def(main) = { glob, val, zval }
Dir_def(suba) = { y }
Dir_def(subb) = { glob }
40
Propagating MAY Definitions
Glob is global variable

main
glob = 3
read val, zval
call suba ( val, k )
call suba ( zval, k )
.. = glob
end
subroutine suba ( x, y)
.. = x
call subb ( x )
y = glob .. x
call subb ( y )
end
subroutine subb ( p )
glob = p

Dir_def(main) = { glob, val, zval }
Dir_def(suba) = { y }
May_def(subb) = Dir_def(subb) = { glob }
subb is a leaf of the call chain
41
Propagating MAY Definitions
Glob is global variable

main
glob = 3
read val, zval
call suba ( val, k )
call suba ( zval, k )
end
subroutine suba ( x, y)
.. = x
call subb ( x )
y = glob .. x
call subb ( y )
end
subroutine subb ( p )
glob = p

Dir_def(main) = { glob, val, zval }
Dir_def(suba) = { y }, May_def(suba) = { y, glob }
May_def(subb) = Dir_def(subb) = { glob }
subb is a leaf of the call chain
42
Propagating MAY Definitions
Glob is global variable

main
glob = 3
read val, zval
call suba ( val, k )
call suba ( zval, k )
end
subroutine suba ( x, y)
.. = x
call subb ( x )
y = x
call subb ( y )
end
subroutine subb ( p )
glob = p

Dir_def(main) = { glob, val, zval }, May_def(main) =
{ glob, val, zval, k }
Dir_def(suba) = { y }, May_def(suba) = { y, glob }
May_def(subb) = Dir_def(subb) = { glob }
subb is a leaf of the call chain
43
Computation of MAY_DEF Information

Input: Callgraph (N, E); calls(p), DIR_DEF(p) for each
p ∈ N.
Output: MAY_DEF(p) for each p.
Initialize MAY_DEF(p) = DIR_DEF(p) for p ∈ N.
Iterate until algorithm terminates:
do for each p in N in reverse invocation order
  do for each q such that (p, q) ∈ E
    MAY_DEF(p) = MAY_DEF(p) ∪ (GLOBAL ∩ MAY_DEF(q))
    if MAY_DEF(q) ∩ formal(q) ≠ ∅ then
      do for every x(y) ∈ calls(p) such that x = q
        do for every element y(k) of procedure vector y
          if pv_q(k) ∈ MAY_DEF(q) and y(k) ∈ GLOBAL ∪ FORMAL
            then MAY_DEF(p) = MAY_DEF(p) ∪ { y(k) }
          end if
        end do
      end do ..

Notes on the algorithm:
Initialization: variables defined in p and outwardly visible
Traversal: start at leaves of graph
First update: add global variables defined in procedure called
Second update: add actual params corresponding to modified
formals
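
The propagation can be sketched in Python on the suba/subb example from the preceding slides. This is an illustration under assumptions: the function name `compute_may_def` and the data layout are invented for the sketch, the loop simply iterates to a fixed point instead of following reverse invocation order, and main's variables val, zval and k are treated as globally visible so that the result for main matches the one shown on the slides.

```python
def compute_may_def(pv, calls, dir_def, global_vars):
    """Propagate definitions from callees to callers until nothing changes."""
    may_def = {p: set(defs) for p, defs in dir_def.items()}
    changed = True
    while changed:
        changed = False
        for p in calls:
            visible = global_vars | set(pv[p])   # GLOBAL ∪ formal(p)
            for q, actuals in calls[p]:
                # Globals that the callee may define.
                new = global_vars & may_def[q]
                # Actuals bound to formals that the callee may define.
                for formal_k, actual_k in zip(pv[q], actuals):
                    if formal_k in may_def[q] and actual_k in visible:
                        new.add(actual_k)
                if not new <= may_def[p]:
                    may_def[p] |= new
                    changed = True
    return may_def


# Assumption: val, zval and k are visible program-wide, like glob.
global_vars = {"glob", "val", "zval", "k"}
pv = {"main": (), "suba": ("x", "y"), "subb": ("p",)}
calls = {
    "main": [("suba", ("val", "k")), ("suba", ("zval", "k"))],
    "suba": [("subb", ("x",)), ("subb", ("y",))],
    "subb": [],
}
dir_def = {"main": {"glob", "val", "zval"}, "suba": {"y"}, "subb": {"glob"}}

may_def = compute_may_def(pv, calls, dir_def, global_vars)
print(may_def)
```

Visiting procedures in reverse invocation order, as the slides prescribe, lets a non-recursive program converge in a single pass; the fixed-point loop here trades that efficiency for brevity.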
44
Interprocedural Side Effect Analysis

Flow-insensitive side effect analysis
Determines, for each call site, which variables
(array regions) may be modified as a safe
approximation of side effect of a procedure call.
MAYDEF and MAYUSE
Flow-sensitive side effect analysis
A context-sensitive analysis that identifies a side
effect if and only if the variable modification
occurs on every path through the called procedure
and, in turn, all procedures that it calls. The array
region described must also be exact.
MUSTDEF: the set of variables (array regions)
that must be defined
Kill(p) is the set of all variables (array
regions) that must be modified on every path
through procedure p.
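
The difference between MAY and MUST information can be illustrated on a single structured procedure body. The sketch below is an assumption-laden simplification: statements are nested tuples invented for this example, calls and array regions are not modelled, and a loop is assumed to execute zero or more times.

```python
def must_def(stmt):
    """Variables defined on every path through stmt."""
    kind = stmt[0]
    if kind == "assign":                      # ("assign", variable)
        return {stmt[1]}
    if kind == "seq":                         # ("seq", [statements])
        return set().union(*(must_def(s) for s in stmt[1]))
    if kind == "if":                          # ("if", then_stmt, else_stmt)
        return must_def(stmt[1]) & must_def(stmt[2])
    if kind == "loop":                        # ("loop", body), may run 0 times
        return set()
    raise ValueError(kind)


def may_def(stmt):
    """Variables defined on at least one path through stmt."""
    kind = stmt[0]
    if kind == "assign":
        return {stmt[1]}
    if kind == "seq":
        return set().union(*(may_def(s) for s in stmt[1]))
    if kind == "if":
        return may_def(stmt[1]) | may_def(stmt[2])
    if kind == "loop":
        return may_def(stmt[1])
    raise ValueError(kind)


body = ("seq", [
    ("assign", "a"),
    ("if", ("assign", "b"), ("seq", [("assign", "b"), ("assign", "c")])),
    ("loop", ("assign", "d")),
])
print(must_def(body), may_def(body))
```

Only a and b are defined on every path, so only they would belong to a Kill-style set for this body; c and d appear in the MAY set alone.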

45
Interprocedural Optimizations

IPA provides information needed to support
procedure integration, e.g.
The number of call sites
Constant valued parameters
Site-independent constant-propagation analysis to
optimize bodies of procedures that are always
called with the same constant parameter(s)
Replace the constant valued parameters with their
values and perform global optimization

46
Interprocedural Optimizations

Site-specific constant-propagation analysis to
clone copies of a procedure body and optimize
them for specific call sites
Clone a copy of intermediate code of the
procedure then perform the same optimization
Such as eliminate unnecessary bounds checking
within procedure
Side-effect information to tailor the calling
conventions for a specific call site to optimize
caller versus callee register saving
No need to pass or receive constant parameters

47
Interprocedural Optimizations

Optimize call-by-value parameter passing: large
arguments that are not modified can be passed by
reference instead
Arrays, records
Use data flow analysis to improve intraprocedural
data flow information for procedure entries and
exits and for calls and returns
E.g. Loop invariant motion can be improved if we
can prove a call in a loop has no side effect on
an expression determined to be loop invariant
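
A small sketch of the loop-invariance test, assuming side-effect sets are already available per call site and expressed in terms of the caller's variables. The function name `can_hoist`, the call-site label `site1` and the variable names are hypothetical.

```python
def can_hoist(expr_vars, assigned_in_loop, call_sites_in_loop, may_def_at):
    """True if nothing in the loop may modify an operand of the expression."""
    modified = set(assigned_in_loop)
    for site in call_sites_in_loop:
        modified |= may_def_at[site]
    return modified.isdisjoint(expr_vars)


expr_vars = {"x", "y"}            # operands of a candidate expression
assigned_in_loop = {"i", "s"}     # variables assigned directly in the loop

# Without interprocedural information: assume the call may define every
# argument and every global variable.
worst_case = {"site1": {"x", "y", "glob"}}
# With side-effect analysis: the callee is known to modify only glob.
with_ipa = {"site1": {"glob"}}

print(can_hoist(expr_vars, assigned_in_loop, ["site1"], worst_case))
print(can_hoist(expr_vars, assigned_in_loop, ["site1"], with_ipa))
```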

48
Interprocedural Code Motion

Moving loops across a call

Before:
do i = 1, 100
do j = 1, 100
call foo(a, b, c, i, j)
enddo
enddo
foo(x,y,z,i,j)
do k = 1, 100
s(i,j) = x(i,j) + y(i,k) * z(k,j)
enddo

After:
call foo(a, b, c, i, j)
foo(x,y,z,i,j)
do i = 1, 100
do j = 1, 100
do k = 1, 100
s(i,j) = x(i,j) + y(i,k) * z(k,j)
enddo
enddo
enddo
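
The effect of moving the loops into the callee can be checked on a small Python model. This is only a sketch: the operators in the loop body (`+` and `*`) are an assumption, since they are not legible in the transcript, and the function names and the 3x3 test data are invented for the example.

```python
def foo_before(s, x, y, z, i, j, n):
    """Callee before the transformation: only the k loop is inside."""
    for k in range(n):
        s[i][j] = x[i][j] + y[i][k] * z[k][j]   # operators are assumed


def run_before(x, y, z, n):
    s = [[0] * n for _ in range(n)]
    call_count = 0
    for i in range(n):
        for j in range(n):
            foo_before(s, x, y, z, i, j, n)
            call_count += 1
    return s, call_count


def foo_after(s, x, y, z, n):
    """Callee after the transformation: the i and j loops moved inside."""
    for i in range(n):
        for j in range(n):
            for k in range(n):
                s[i][j] = x[i][j] + y[i][k] * z[k][j]


def run_after(x, y, z, n):
    s = [[0] * n for _ in range(n)]
    foo_after(s, x, y, z, n)
    return s, 1


n = 3
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
y = [[1, 0, 2], [0, 1, 0], [3, 0, 1]]
z = [[2, 1, 0], [1, 2, 1], [0, 1, 2]]
s_before, calls_before = run_before(x, y, z, n)
s_after, calls_after = run_after(x, y, z, n)
print(calls_before, calls_after, s_before == s_after)
```

Both versions compute the same array, but the transformed version pays the call overhead once instead of n*n times and exposes the whole loop nest to intraprocedural optimization.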
49
Flow-Insensitive Side Effect Analysis
Call multi-graph version

Algorithm considers aliased and alias-free
flow-insensitive side effects on parameters
passed by reference on the call multi-graph.
We first concentrate on alias-free
flow-insensitive side effect analysis.
Distinguish between side effects via parameter
passing and side effects on global variables.
Algorithm uses functions DEF, MOD, REF, USE.
Instructions are represented by <procedure
name, instruction number> pairs.
D implies non-aliased set, e.g. DMOD

50
Binding Graph

A data structure called the binding graph is used
to compute the set of formal parameters of p that
may be modified as side effects of invoking
procedure p.
Binding graph for a program P: B = <N, E>
nodes N represent the formal parameters in P
edges E represent the bindings of a caller's
formal parameters to a callee's parameters
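
One way to read the binding graph is as a reachability problem: a formal parameter may be modified if it is modified directly or if it is bound to a callee's formal that may be modified. The sketch below follows that reading; the function name `compute_rmod`, the procedures f, g, h and their parameters are hypothetical, and `imod` stands for the variables each procedure modifies directly.

```python
def compute_rmod(formals, imod, binding_edges):
    """Mark formals that may be modified, propagating backwards along
    binding edges (caller formal -> callee formal) to a fixed point."""
    marked = {(p, f) for p in formals for f in formals[p] if f in imod[p]}
    changed = True
    while changed:
        changed = False
        for src, dst in binding_edges:
            if dst in marked and src not in marked:
                marked.add(src)
                changed = True
    return {p: {f for f in formals[p] if (p, f) in marked} for p in formals}


# Hypothetical program: f(a, b) calls g(a); g(c) calls h(c); h(d) assigns d.
formals = {"f": ("a", "b"), "g": ("c",), "h": ("d",)}
imod = {"f": set(), "g": set(), "h": {"d"}}
binding_edges = [(("f", "a"), ("g", "c")), (("g", "c"), ("h", "d"))]

rmod = compute_rmod(formals, imod, binding_edges)
print(rmod)
```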

51
Flow-Insensitive Side Effect Analysis

MOD(p, i): the set of variables that may be
modified by executing the ith instruction in
procedure p.
REF(p, i): the set of variables that may be
referenced by executing the ith instruction in
procedure p.
USE(p, i): the set of variables that may be
referenced by the ith instruction in procedure p
before being defined by it.
GMOD(p) is the set of all variables that may be
modified by an invocation of procedure p.
RMOD(p) is the set of formal parameters of p that
may be modified as side effects of invoking
procedure p.
LMOD(p, i) is the set of variables that may be
modified locally by executing the ith instruction
in procedure p (excluding the effects of
procedure calls that appear in that
instruction).
IMOD(p) is the set of variables that may be
modified by executing procedure p without
executing any calls within it.
IMOD+(p) is the set of variables that may be
either directly modified by procedure p or passed
by reference to another procedure and that may be
modified as a side effect of it.

52
Flow-Insensitive Side Effect Analysis
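$$\mathrm{DMOD}(p, i) = \mathrm{LMOD}(p, i) \cup \bigcup_{q \in \mathrm{callset}(p, i)} b_{p,i,q}(\mathrm{GMOD}(q))$$

$$\mathrm{GMOD}(p) = \mathrm{IMOD}^{+}(p) \cup \bigcup_{1 \le i \le \mathrm{numinsts}(p)} \; \bigcup_{q \in \mathrm{callset}(p, i)} \bigl(\mathrm{GMOD}(q) \cap \mathrm{Nonlocal}(q)\bigr)$$

$$\mathrm{IMOD}^{+}(p) = \mathrm{IMOD}(p) \cup \bigcup_{1 \le i \le \mathrm{callsites}(p)} \; \bigcup_{q \in \mathrm{callset}(p, i)} b_{p,i,q}(\mathrm{RMOD}(q))$$

$$\mathrm{IMOD}(p) = \bigcup_{1 \le i \le \mathrm{numinsts}(p)} \mathrm{LMOD}(p, i)$$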
53
Computing of GMOD

Initial value of GMOD( ) for each procedure x:
GMOD(x) = IMOD+(x)
x ∈ { main, f, g, n, m, h } here
Final value of GMOD( ):
GMOD(g) = IMOD+(g) ∪ (GMOD(h) ∩ Nonlocal(h))
GMOD(f) = IMOD+(f) ∪ (GMOD(g) ∩ Nonlocal(g))
GMOD(main) = IMOD+(main) ∪ (GMOD(f) ∩ Nonlocal(f))

[Figure: call graph with nodes main, f, g, h, m and n]
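
The GMOD equations can be evaluated by simple iteration. The sketch below assumes a call chain main, f, g, h as in the final-value equations on this slide (the nodes m and n are left out because their edges are not recoverable from the transcript); the variable names and the contents of the IMOD+ and Nonlocal sets are hypothetical.

```python
def compute_gmod(imod_plus, callees, nonlocal_vars):
    """GMOD(p) = IMOD+(p) ∪ ⋃ over callees q of (GMOD(q) ∩ Nonlocal(q)),
    solved by iterating until no set grows."""
    gmod = {p: set(v) for p, v in imod_plus.items()}
    changed = True
    while changed:
        changed = False
        for p in gmod:
            for q in callees[p]:
                new = gmod[q] & nonlocal_vars[q]
                if not new <= gmod[p]:
                    gmod[p] |= new
                    changed = True
    return gmod


# Hypothetical data: u, v, w are global; t, lg, lh are local variables.
globals_ = {"u", "v", "w"}
imod_plus = {"main": {"t"}, "f": {"u"}, "g": {"lg"}, "h": {"v", "lh"}}
callees = {"main": ["f"], "f": ["g"], "g": ["h"], "h": []}
nonlocal_vars = {p: globals_ for p in imod_plus}

gmod = compute_gmod(imod_plus, callees, nonlocal_vars)
print(gmod)
```

The intersection with Nonlocal(q) is what keeps the callee's local variables (lg, lh) from leaking into its callers' sets.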
54
Flow-Insensitive Alias Analysis

Steps
Construct extended formal parameters Pair Binding
Graph
For each formal parameter fp, compute the set
A(fp) of variables v that fp may be aliased to by
a call chain that binds v to fp (do not include
formal parameters)
Compute the formal parameters that may be aliased
to each other using a marking algorithm on the
Pair Binding Graph
Combine above results of formal parameter alias
and nonlocal variable alias
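
The second step, computing A(fp), can be sketched as a propagation along call chains. This covers only that step, not the Pair Binding Graph marking; the function name `alias_sets` and the data layout are assumptions, and the program is the suba/subb example from the MAY_DEF slides.

```python
def alias_sets(pv, calls):
    """A[(q, f)] = non-formal variables that may be bound to formal f of q
    through some chain of calls."""
    formals = {(p, f) for p in pv for f in pv[p]}
    A = {fp: set() for fp in formals}
    changed = True
    while changed:
        changed = False
        for p in calls:
            for q, actuals in calls[p]:
                for formal_k, actual_k in zip(pv[q], actuals):
                    if (p, actual_k) in formals:
                        # Actual is itself a formal of the caller:
                        # pass on whatever it may be bound to.
                        new = A[(p, actual_k)]
                    else:
                        new = {actual_k}
                    if not new <= A[(q, formal_k)]:
                        A[(q, formal_k)] |= new
                        changed = True
    return A


pv = {"main": (), "suba": ("x", "y"), "subb": ("p",)}
calls = {
    "main": [("suba", ("val", "k")), ("suba", ("zval", "k"))],
    "suba": [("subb", ("x",)), ("subb", ("y",))],
    "subb": [],
}
A = alias_sets(pv, calls)
print(A)
```

Formal parameters never appear in the resulting sets, matching the instruction on the slide to leave them out of A(fp).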