Title: ECE1724F Compiler Primer
1 ECE1724F Compiler Primer
- http://www.eecg.toronto.edu/voss/ece1724f-04
- 2004 September 21
2 What's in an optimizing compiler?
- Input: high-level language, HLL (C, C++, Java)
- Output: low-level language, LLL (mc68000, ia32, etc.)
- Pipeline: HLL -> Front End -> IR (usually very naive) -> Optimizer -> IR (better, we hope) -> Code Generator -> LLL
- Related courses: CSC488 (467), ECE540
3 What are compiler optimizations?
Optimization: the transformation of a program P into a program P' that has the same input/output behavior, but is somehow "better".
- "better" means
  - faster
  - or smaller
  - or uses less power
  - or whatever you care about
- P' is not optimal, and may even be worse than P
4 An optimization must
- Preserve correctness
  - the speed of an incorrect program is irrelevant
- On average, improve performance
  - P' is not optimal, but it should usually be better
- Be worth the effort
  - 1 person-year of work, a 2x increase in compilation time, and a 0.1% improvement in speed?
  - Find the bottlenecks
  - 90/10 rule: 90% of the gain for 10% of the work
5 Compiler Phases (Passes)
[Figure: the compiler phases, with tokens, AST, and IR as the forms passed from one phase to the next]
6 We'll talk about
- Lexing and Parsing
- Control Flow Analysis
- Data Flow Analysis
- Some optimizations
7 Lexing, Parsing and Intermediate Representations
8 Lexers and Parsers
- The lexer identifies tokens in a program
- The parser identifies grammatical phrases, or constructs, in the program
- There are freely available lexer and parser generators
- The parser usually constructs some intermediate form of the program as output
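As a minimal sketch of what the lexer stage does, the following Python tokenizer splits a small C-like statement into (kind, lexeme) pairs. The token classes and the `tokenize` name are assumptions made for illustration; a real compiler would more likely use a generated lexer.

```python
import re

# Hypothetical token classes for a tiny C-like language (illustrative only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=<>]"),
    ("PUNCT",  r"[();{},]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise on an unknown character."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":          # whitespace separates tokens but is dropped
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

The parser would then consume this token list and build a tree or another intermediate form from it.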
9 Intermediate Representation
- The representation or language on which the compiler performs its optimizations
- As many IRs as compiler suites
  - 2x as many IRs as compiler suites (Muchnick)
- Some IRs are better for some optimizations
  - different information is maintained
  - easier to find certain types of information
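10 Why Use an IR?
- Source languages: C, C++, Java, Fortran, Voss
- Targets: MIPS, Sun SPARC, IA32 / Pentium, IA64 / Itanium, PowerPC
- Every source language is translated to the IR, and every target is generated from the IR
- Good Software Engineering
  - Portability
  - Reuse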
11 Example
float a[20][10];
a[i][j+2]
The array reference is shown at three IR levels:
(a) High-Level
(b) Medium-Level
(c) Low-Level
12 High-Level: Abstract Syntax Tree (AST)
int f(a,b)
  int a, b;
{ int c;
  c = a + 2;
  print(b,c);
}
13 Control Flow Analysis
14 Purpose of Control Flow Analysis
- Determine the control structure of a program
  - determine possible control flow paths
  - find basic blocks and loops
- Intraprocedural: within a procedure
- Interprocedural: across procedures
  - Whole program
  - Maybe just within the same file
cc -c file1.c
cc -c file2.c
cc -o myprogram file1.o file2.o -l mylib
15 All about control flow analysis
- Finding basic blocks
- Creating a control flow graph
- Finding dominators
- dominators, proper dominators, direct dominators
- Finding post-dominators
- Finding loops
16 Basic Blocks
- A Basic Block (BB) is a maximal section of straight-line code which can only be entered via the first instruction and can only be exited via the last instruction.
S1:  read L
S2:  n = 0
S3:  k = 0
S4:  m = 1
S5:  k = k + m
S6:  c = k > L
S7:  if (c) goto S11
S8:  n = n + 1
S9:  m = m + 2
S10: goto S5
S11: write n
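One common way to find basic blocks is to mark "leaders": the first instruction, every branch target, and every instruction that follows a branch. The sketch below applies that rule to the S1 to S11 listing. The tuple encoding and the `basic_blocks` name are assumptions for illustration, and the operators shown in the instruction text are assumed as well; they do not affect the result.

```python
# Each instruction is (label, text, branch_target); branch_target is None
# for instructions that simply fall through.
PROGRAM = [
    ("S1", "read L", None),
    ("S2", "n = 0", None),
    ("S3", "k = 0", None),
    ("S4", "m = 1", None),
    ("S5", "k = k + m", None),
    ("S6", "c = k > L", None),
    ("S7", "if (c) goto S11", "S11"),
    ("S8", "n = n + 1", None),
    ("S9", "m = m + 2", None),
    ("S10", "goto S5", "S5"),
    ("S11", "write n", None),
]

def basic_blocks(program):
    """Partition the program into basic blocks, returned as lists of labels."""
    labels = [label for label, _, _ in program]
    leaders = {labels[0]}                      # rule 1: first instruction
    for idx, (_, _, target) in enumerate(program):
        if target is not None:
            leaders.add(target)                # rule 2: target of a branch
            if idx + 1 < len(program):
                leaders.add(labels[idx + 1])   # rule 3: instruction after a branch
    blocks = []
    for label in labels:
        if label in leaders:
            blocks.append([])                  # a leader starts a new block
        blocks[-1].append(label)
    return blocks
```

For this listing the leaders are S1, S5, S8, and S11, which gives four basic blocks.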
17 Control Flow Graphs
- The Control Flow Graph (CFG) of a program is a directed graph G = (N, E) whose nodes N represent the basic blocks in the program and whose edges E represent transfers of control between basic blocks.
18 Control Flow Graphs (continued)
- Given G = (N, E) and a basic block b ∈ N.
- The successors of b, denoted by succ(b), is the set of basic blocks that can be reached from b by traversing one edge: succ(b) = { n ∈ N | (b, n) ∈ E }
- The predecessors of b, denoted by pred(b), is the set of basic blocks that can reach b by traversing one edge: pred(b) = { m ∈ N | (m, b) ∈ E }
- An entry node in G is one which has no predecessors.
- An exit node in G is one which has no successors.
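The definitions above translate almost directly into code. The following sketch computes succ and pred from an edge list and then picks out entry and exit nodes; the four-block graph and all names here are hypothetical, chosen only to exercise the definitions.

```python
def succ_pred(nodes, edges):
    """Build succ(b) and pred(b) for every node from a list of (from, to) edges."""
    succ = {n: set() for n in nodes}
    pred = {n: set() for n in nodes}
    for a, b in edges:
        succ[a].add(b)
        pred[b].add(a)
    return succ, pred

# Hypothetical CFG: B1 -> B2, B2 -> B3, B2 -> B4, B3 -> B2 (a loop between B2 and B3).
NODES = ["B1", "B2", "B3", "B4"]
EDGES = [("B1", "B2"), ("B2", "B3"), ("B2", "B4"), ("B3", "B2")]

succ, pred = succ_pred(NODES, EDGES)
entries = [n for n in NODES if not pred[n]]   # no predecessors
exits = [n for n in NODES if not succ[n]]     # no successors
```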
19 Dominators
- Let G = (N, E) denote a CFG. Let n, n' ∈ N.
- n is said to dominate n', denoted n dom n', iff every path from Entry to n' contains n.
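Dominator sets can be computed with a simple iterative scheme: start from "every node dominates every node" and repeatedly shrink each set to the node itself plus whatever dominates all of its predecessors. This is a sketch of that textbook approach, not necessarily the algorithm presented in the course; the graph and the `dominators` name are hypothetical, and every node is assumed to be reachable from the entry.

```python
def dominators(nodes, edges, entry):
    """Return dom[n], the set of nodes that dominate n (including n itself)."""
    pred = {n: set() for n in nodes}
    for a, b in edges:
        pred[b].add(a)
    dom = {n: set(nodes) for n in nodes}   # optimistic initial guess
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            pred_doms = [dom[p] for p in pred[n]]
            # n dominates itself; anything else must dominate every predecessor.
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

# Hypothetical CFG with a loop between B2 and B3.
NODES = ["B1", "B2", "B3", "B4"]
EDGES = [("B1", "B2"), ("B2", "B3"), ("B2", "B4"), ("B3", "B2")]
dom = dominators(NODES, EDGES, "B1")
```

Running the same function on the reversed graph, starting from the exit node, would give post-dominators.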
20 Post-Dominators
- Let G = (N, E) denote a CFG. Let n, n' ∈ N. Then
- n is said to post-dominate n', denoted n pdom n', iff every path from n' to Exit contains n.
21 Loops
- Goal: find loops in the CFG irrespective of input syntax
  - DO, while, for, goto, etc.
- Intuitively, a loop is the set of nodes in a CFG that form a cycle.
- However, not every cycle is a loop.
- A natural loop has a single entry node h ∈ N and a tail node t ∈ N, such that
  - (t, h) ∈ E
  - the loop can be entered only through h
  - the loop contains h and all nodes that can reach t without going through h.
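Given a back edge (t, h), the body of the natural loop can be collected by walking predecessor edges backwards from t and stopping at h. The sketch below does exactly that. The function name and the sample graph are hypothetical, and the caller is assumed to have already checked that (t, h) is a back edge, that is, that h dominates t.

```python
def natural_loop(edges, tail, header):
    """Return the node set of the natural loop for the back edge (tail, header)."""
    pred = {}
    for a, b in edges:
        pred.setdefault(b, set()).add(a)
    loop = {header}                # the header is always part of the loop
    stack = []
    if tail not in loop:           # tail == header would be a self-loop
        loop.add(tail)
        stack.append(tail)
    while stack:
        n = stack.pop()
        for p in pred.get(n, ()):
            if p not in loop:      # the header is already in the set, so the walk stops there
                loop.add(p)
                stack.append(p)
    return loop

# Hypothetical CFG: B3 -> B2 is the back edge, B2 is the loop header.
EDGES = [("B1", "B2"), ("B2", "B3"), ("B2", "B4"), ("B3", "B2")]
```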
22 Loop Pre-Headers
- Several optimizations require that code be moved before the header.
- It is convenient to create a new block called the pre-header.
- The pre-header has only the header as successor.
- All edges that formerly entered the header instead enter the pre-header, with the exception of edges from inside the loop.
23 Data Flow Analysis
24 Data Flow Analysis
- Goal: make assertions about the data usage in a program
- Use these assertions to determine if and when optimizations are legal
- Local: within a single basic block
  - Analyze the effect of each instruction
  - Compose the effects at the beginning/end of the BB
- Global: within a procedure (across BBs)
  - Consider the effect of control flow
- Inter-procedural: across procedures
- References
  - Muchnick, Chapter 8.
  - Dragon Book, 608-611, 624-627, 631.
25 Data Flow Analysis
- Compile-time reasoning about the run-time flow of values in the program
- Represent facts about the run-time behavior
- Represent the effect of executing each basic block
- Propagate facts around the control flow graph
26 Data Flow Analysis
- Formulated as a set of simultaneous equations
  - Sets attached to the nodes and edges
  - A lattice describes the relation between values
  - Usually represented as a bit or a bit vector
- Solve the equations using an iterative framework
  - Start with an initial guess of the facts at each node
  - Propagate until the solution stabilizes at the maximal fixed point
  - Would like the meet-over-all-paths (MOP) solution
27 Basic Approach
Must be conservative!
28 Example: Reaching Definitions
- Problem statement: for each basic block b, find which of all definitions in the program reach the boundaries of b.
- Definition: a definition of a variable x is an instruction that assigns (or may assign) a value to x.
- Reaches: a definition d of variable x reaches a point p in the program if there exists a path from the point immediately following d to p such that d is not killed by another definition of x along this path.
29 Reaching Definitions: Gen Set
Gen(b): the set of definitions that appear in a basic block b and reach its end.
[Figure: example CFG with Entry, BB 1, BB 2, BB 3, and Exit, whose blocks define the variables a and c]
Finding Gen(b) is doing local reaching definitions analysis.
30 Reaching Definitions: Kill Set
Kill(b): the set of definitions in other basic blocks that are killed in b (i.e., by instructions in b). For each variable v defined in b, the kill set contains all definitions of v in other basic blocks.
[Figure: the same example CFG with Entry, BB 1, BB 2, BB 3, and Exit]
31 Reaching Definitions: Data Flow Equations
- RDin(b): the set of definitions that reach the beginning of b.
- RDout(b): the set of definitions that reach the end of b.
32 Reaching Definitions: Solving the Data Flow Equations
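In the usual textbook formulation, the reaching-definitions equations are RDin(b) = union of RDout(p) over all p in pred(b), and RDout(b) = Gen(b) ∪ (RDin(b) - Kill(b)). The sketch below solves them by iterating until nothing changes. The three-block CFG, the definition names d1 to d5, and the function name are hypothetical and are not taken from the figure on the earlier slides.

```python
def reaching_definitions(blocks, edges, gen, kill):
    """Iteratively solve RDin/RDout for every block; gen and kill map block -> set."""
    pred = {b: [] for b in blocks}
    for a, b in edges:
        pred[b].append(a)
    rd_in = {b: set() for b in blocks}
    rd_out = {b: set(gen[b]) for b in blocks}     # initial guess
    changed = True
    while changed:
        changed = False
        for b in blocks:
            # RDin(b): everything that leaves any predecessor.
            rd_in[b] = set().union(*(rd_out[p] for p in pred[b]))
            # RDout(b): what b generates, plus what enters and survives b.
            new_out = gen[b] | (rd_in[b] - kill[b])
            if new_out != rd_out[b]:
                rd_out[b] = new_out
                changed = True
    return rd_in, rd_out

# Hypothetical program:
#   B1: d1: a = 5     d2: c = 1
#   B2: d3: c = c + c            (B2 loops back to itself)
#   B3: d4: a = c - a d5: c = 0
BLOCKS = ["B1", "B2", "B3"]
EDGES = [("B1", "B2"), ("B2", "B2"), ("B2", "B3")]
GEN = {"B1": {"d1", "d2"}, "B2": {"d3"}, "B3": {"d4", "d5"}}
KILL = {"B1": {"d3", "d4", "d5"}, "B2": {"d2", "d5"}, "B3": {"d1", "d2", "d3"}}

rd_in, rd_out = reaching_definitions(BLOCKS, EDGES, GEN, KILL)
```

Because the sets only ever grow and there are finitely many definitions, the loop terminates. In a production compiler the sets would typically be bit vectors.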
33 Other data flow problems
- Reaching definitions
- Live variables
- Available expressions
- Very busy expressions
34 Optimizations
35 Simple Optimizations
- Constant Folding
- Algebraic Simplifications
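Both simple optimizations can be illustrated on a small expression tree. In this sketch an expression is an integer constant, a variable name, or a tuple (op, lhs, rhs). The representation and the `simplify` name are assumptions for illustration, and only integer arithmetic is considered (overflow and floating-point subtleties are ignored).

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def simplify(expr):
    """Fold constant subexpressions and apply a few algebraic identities."""
    if not isinstance(expr, tuple):
        return expr                      # an int constant or a variable name (str)
    op, lhs, rhs = expr
    lhs, rhs = simplify(lhs), simplify(rhs)
    if isinstance(lhs, int) and isinstance(rhs, int):
        return OPS[op](lhs, rhs)         # constant folding
    if op == "+" and rhs == 0:
        return lhs                       # x + 0 = x
    if op == "+" and lhs == 0:
        return rhs                       # 0 + x = x
    if op == "*" and rhs == 1:
        return lhs                       # x * 1 = x
    if op == "*" and lhs == 1:
        return rhs                       # 1 * x = x
    if op == "*" and 0 in (lhs, rhs):
        return 0                         # x * 0 = 0 (integers only)
    return (op, lhs, rhs)
```

For example, `simplify(("+", "x", ("*", 3, 4)))` folds the multiplication and leaves `("+", "x", 12)`.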
36 Redundancy optimizations
- Common subexpression elimination
- Forward substitution (reverse of CSE)
- Copy propagation
- Loop-invariant code motion
37 Common Subexpression Elimination
- An occurrence of an expression is a common subexpression if there is another occurrence that always precedes it in execution order and whose operands remain unchanged between these evaluations.
  - i.e. the expression has already been computed and the result is still valid.
- Common Subexpression Elimination replaces the recomputations with a saved value.
  - reduces the number of computations
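Within a single basic block, the idea can be sketched by remembering which variable currently holds each computed expression and forgetting an entry as soon as one of its operands (or the holding variable) is reassigned. The quadruple format (dest, op, a, b), the "copy" pseudo-operation, and the `local_cse` name are assumptions for illustration; commutativity and global (cross-block) CSE are not handled.

```python
def local_cse(block):
    """Replace recomputed expressions in one basic block with copies of a saved value."""
    available = {}     # (op, a, b) -> variable that currently holds that value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        holder = available.get(key)
        if holder is not None:
            out.append((dest, "copy", holder, None))   # reuse the saved value
        else:
            out.append((dest, op, a, b))
        # dest is redefined: expressions that read dest, or are held in dest, are stale.
        available = {k: v for k, v in available.items()
                     if dest not in (k[1], k[2]) and v != dest}
        if dest not in (a, b):
            available.setdefault(key, dest)
    return out

BLOCK = [
    ("x", "+", "a", "b"),
    ("y", "+", "a", "b"),   # common subexpression: a and b are unchanged
    ("a", "*", "x", "y"),   # redefines a, so a + b is no longer available
    ("z", "+", "a", "b"),   # must be recomputed
]
```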
38 Forward Substitution
- Replace a copy by a re-evaluation of the expression
- Why?
  - the saved value may hold a register too long and cause spills
- Look for a store of an expression to a temporary, followed by an assignment of that temporary to a variable. If the expression's operands are not changed before the point of substitution, replace the temporary with the expression.
Before:
  t1 = b + 2
  a = t1
  c = t1
  d = a * b
After:
  a = b + 2
  c = b + 2
  d = a * b
39 Copy Propagation
- A copy instruction is an instruction of the form x = y.
- Copy propagation replaces later uses of x with uses of y, provided intervening instructions do not change the value of either x or y.
- Benefit: saves computations, reduces space, enables other transformations.
40 Copy Propagation (cont)
- To propagate a copy statement of the form s: x = y, we must
  - Determine all places where this definition of x is used.
  - For each such use, u
    - s must be the only definition of x reaching u, and
    - on every path from s to u, there are no assignments to y.
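Restricted to a single basic block, the two conditions above reduce to "neither x nor y has been reassigned since the copy". The following sketch implements that local version only. The quadruple format (dest, op, a, b), the "copy" operation name, and `local_copy_propagation` are assumptions for illustration; the global version would need reaching-definitions information.

```python
def local_copy_propagation(block):
    """Replace uses of x by y after a copy x = y, within one basic block."""
    copies = {}        # x -> y for every copy x = y that is still valid
    out = []
    for dest, op, a, b in block:
        a = copies.get(a, a)           # rewrite operands through valid copies
        b = copies.get(b, b)
        out.append((dest, op, a, b))
        # Assigning dest invalidates any copy that mentions it on either side.
        copies = {x: y for x, y in copies.items() if dest not in (x, y)}
        if op == "copy" and a != dest:
            copies[dest] = a
    return out

BLOCK = [
    ("x", "copy", "y", None),
    ("z", "+", "x", 1),      # x can be replaced by y
    ("y", "+", "y", 2),      # y changes: the copy x = y is no longer valid
    ("w", "+", "x", 3),      # x must stay
]
```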
41 Loop-Invariant Code Motion
- A computation inside a loop is said to be loop-invariant if its execution produces the same value as long as control stays within the loop.
- Loop-invariant code motion moves such computations outside the loop (into the loop pre-header).
- Benefit: eliminates redundant computations.
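The effect of the transformation can be shown with a hand-written before/after pair. Both functions are hypothetical examples, not compiler output: `a * b` does not depend on anything that changes inside the loop, so it can be computed once in the position a pre-header would occupy.

```python
def before_licm(a, b, n):
    out = []
    for i in range(n):
        t = a * b            # loop-invariant: a and b never change inside the loop
        out.append(t + i)
    return out

def after_licm(a, b, n):
    t = a * b                # hoisted: computed once, before the loop is entered
    out = []
    for i in range(n):
        out.append(t + i)
    return out
```

The two versions produce the same result, but the second performs one multiplication instead of n.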
42 [Figure: CFG of a doubly nested loop with nodes Entry, i = 0, (i < n)?, j = 0, (j < m)?, i = i + 1, Exit]
43 Removing Dead Code
44 Dead Code Elimination
- A variable is dead if it is not used on any path from the location in the code where it is defined to the exit point of the routine in question.
- An instruction is dead if it computes a dead variable.
- A local variable is dead if it is not used before the procedure exit.
- A variable with wider visibility may require interprocedural analysis, unless it is reassigned on every possible path to the procedure exit.
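For straight-line code, dead instructions can be found with one backward pass that tracks which variables are still needed. This is a sketch under simplifying assumptions: a single basic block, instructions of the form (dest, op, a, b) with no side effects, and a caller-supplied set of variables that are live at the end of the block. All names are hypothetical.

```python
def eliminate_dead_code(block, live_out):
    """Drop instructions whose result is never used, scanning the block backwards."""
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(block):
        if dest in live:
            kept.append((dest, op, a, b))
            live.discard(dest)                       # this definition satisfies the later uses
            live.update(v for v in (a, b) if isinstance(v, str))   # its operands are now needed
        # otherwise dest is dead at this point and the instruction is dropped
    kept.reverse()
    return kept

BLOCK = [
    ("t", "+", "a", "b"),    # dead: t is overwritten before it is needed
    ("u", "*", "t", 2),      # dead: u is never used
    ("t", "+", "a", 1),
    ("r", "+", "t", "c"),
]
```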
46 Loop Optimizations
47 Well-behaved loops
- In Fortran and Pascal, all loops are well-behaved
- For C, only a subset are well-behaved, defined as
    for (exp1; exp2; exp3)
      stmt
  - where exp1 assigns a value to an integer variable i
  - exp2 compares i to a loop constant
  - exp3 increments or decrements i by a loop constant
- Similar if-goto loops can also be considered well-behaved
48 Induction-Variable Optimizations
- Induction variables are variables whose successive values form an arithmetic progression over some part of a program, usually a loop.
- A loop's iterations are usually counted by an integer variable that increases/decreases by a constant amount each iteration.
- Other variables, e.g. subscripts, often follow patterns similar to the loop-control variables.
49 Induction Variables Example
Original:
  INTEGER A(100)
  DO I = 1,100
    A(I) = 202 - 2*I
  ENDDO
After strength reduction:
  INTEGER A(100)
  T1 = 202
  DO I = 1,100
    T1 = T1 - 2
    A(I) = T1
  ENDDO
- I has an initial value 1, increments by 1, and ends as 100
- A(I) is initially assigned 200, decreases by 2, and ends as 2
- The address of A(I) is initially addr a, increases by 4 each iteration, and ends as (addr a) + 396
- addr a(i) = (addr a) + 4 * i - 4
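The claims on this slide can be checked mechanically. The sketch below mirrors the two loops in Python, assuming the loop body is A(I) = 202 - 2*I (which matches the stated values, 200 down to 2), and also evaluates the address formula. The function names and the base address used in the check are hypothetical.

```python
def original_loop():
    a = [0] * 101                 # index 0 unused, to mirror Fortran's A(1..100)
    for i in range(1, 101):
        a[i] = 202 - 2 * i        # one multiplication per iteration
    return a[1:]

def strength_reduced_loop():
    a = [0] * 101
    t1 = 202
    for i in range(1, 101):
        t1 = t1 - 2               # the multiplication is replaced by a running subtraction
        a[i] = t1
    return a[1:]

def element_address(base, i, elem_size=4):
    """addr a(i) = (addr a) + 4 * i - 4 for 1-based indexing and 4-byte elements."""
    return base + elem_size * i - elem_size
```

Both loops fill the array with the same values, and the address of the last element is 396 bytes past the first.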
50 Induction Variables Example
    t1 = 202
    i = 1
L1: t2 = i > 100
    if t2 goto L2
    t1 = t1 - 2
    t3 = addr a
    t4 = t3 - 4
    t5 = 4 * i
    t6 = t4 + t5
    *t6 = t1
    i = i + 1
    goto L1
L2:
- i is used to count iterations and calculate A(I)
- Induction variable optimizations improve if preceded by constant propagation
51 Register Allocation
52 Register Allocation
- Register allocation improves code
  - accessing faster memory
  - fewer instructions
- But
  - There are a limited number of machine registers
  - Some registers can only hold certain types of data
  - So, which variables do we allocate to registers?
- Register allocation is extremely important
  - has a huge impact on performance
  - it's NP-complete (no polynomial-time algorithm is known)
  - Use heuristics
53 Approaches to Register Allocation
- Global Register Allocation Using Usage Counts
  - Assume R registers are available. For each loop nest, allocate registers to the R variables which show the largest estimated benefit from being kept in a register. Little or no cross-nest allocation is done.
- Register Allocation by Graph Coloring
  - currently the most common method
  - known about since 1971, but was impractical in early compilers
  - Chaitin came up with the first implementation in 1981
  - Briggs proposed an optimistic extension to it around 1989
  - express the overlap of the lifetimes of variables with an interference graph
  - try to color this graph with R colors
  - generate spill code when necessary to make the graph R-colorable
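The coloring step can be sketched with a simplify-and-select loop in the spirit of Chaitin's allocator: repeatedly remove a node with fewer than R neighbours and push it on a stack, pick a spill candidate when no such node exists, then pop the stack and give each node a color its neighbours do not use. This is an illustrative sketch, not Chaitin's or Briggs's actual implementation. The spill heuristic (highest remaining degree) and all names are assumptions, and a real allocator would insert spill code and rebuild the graph rather than simply setting the node aside.

```python
def color_graph(interference, num_regs):
    """Return (colors, spilled) for a symmetric interference graph {var: set of neighbours}."""
    graph = {v: set(ns) for v, ns in interference.items()}
    work = {v: set(ns) for v, ns in graph.items()}   # shrinking copy of the graph
    stack, spilled = [], []
    while work:
        trivial = [v for v in work if len(work[v]) < num_regs]
        if trivial:
            v = min(trivial)                  # deterministic pick; guaranteed colorable later
            stack.append(v)
        else:
            # Assumed spill heuristic: the node with the most remaining neighbours.
            v = max(work, key=lambda n: (len(work[n]), n))
            spilled.append(v)
        for n in work.pop(v):                 # remove v from the remaining graph
            work[n].discard(v)
    colors = {}
    while stack:
        v = stack.pop()
        used = {colors[n] for n in graph[v] if n in colors}
        colors[v] = next(c for c in range(num_regs) if c not in used)
    return colors, spilled

# Hypothetical interference graph: a, b, c are mutually live; d overlaps only with a.
INTERFERENCE = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

With three registers every variable gets a color; with two, the triangle a, b, c cannot be colored and one of its nodes is set aside as a spill.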
54 Other important optimizations
- Instruction scheduling
  - what types of ops to use on an architecture and how to order them in a BB
- Parallelization (Dependence Analysis)
- Locality Optimizations
  - change the ordering of loops and instructions to benefit cache behavior
55 Places to look for more info
- Compilers: Principles, Techniques, and Tools, Aho, Sethi and Ullman, Addison Wesley (!!! 1986 !!!), the "Dragon Book"
- Advanced Compiler Design and Implementation, Steven S. Muchnick, Morgan Kaufmann (1997)
- And take ECE540 next semester
56 First Topic: Empirical Optimization
- An Overview of Empirical Optimization
- A Comparison of Empirical and Model-Driven Optimization (volunteer)
- A Case Study Using Empirical Optimization for a Large Engineering Application (volunteer)
- High-Level Adaptive Program Optimization with ADAPT