Compilers Modern Compiler Design - PowerPoint PPT Presentation

1
Compilers: Modern Compiler Design
  • 5. Code Generation

Interpretation and Code Generation
NCYU C. H. Wang
2
Overview
3
Interpretation
  • An interpreter is a program that considers the
    nodes of the AST in the correct order and
    performs the actions prescribed for those nodes
    by the semantics of the language.
  • Two varieties
  • Recursive
  • Iterative

4
Interpretation
  • Recursive interpretation
  • operates directly on the AST attribute grammar
  • simple to write
  • thorough error checks
  • very slow: about 1000x slower than compiled code
  • Iterative interpretation
  • operates on intermediate code
  • good error checking
  • slow: about 100x slower than compiled code

5
Recursive Interpretation
6
Self-identifying data
  • must handle user-defined data types
  • value: pointer to type descriptor
  • array of subvalues
  • example: complex number with re = 3.0 and im = 4.0

7
Complex number representation
8
Iterative interpretation
  • Operates on threaded AST
  • Active node pointer
  • Flat loop over a case statement

9
Sketch of the main loop
10
Example for demo compiler
11
Code Generation
  • Compilation produces object code from the
    intermediate code tree through a process called
    code generation
  • Tree rewriting
  • Replace nodes and subtrees of the AST by target
    code segments
  • Produce a linear sequence of instructions from
    the rewritten AST

12
Example of code generation
  • a := (b[4*c + d] * 2) + 9

13
Machine instructions
  • Load_Addr M[Ri], C, Rd
  • Loads the address of the Ri-th element of the
    array at M into Rd, where the size of the
    elements of M is C bytes
  • Load_Byte (M+Ro)[Ri], C, Rd
  • Loads the byte contents of the Ri-th element of
    the array at M plus offset Ro into Rd, where the
    other parameters have the same meanings as above

14
Two sample instructions with their ASTs
15
Code generation
  • Main issues
  • Code selection: which template?
  • Register allocation: too few!
  • Instruction ordering
  • Optimal code generation is NP-complete
  • Consider small parts of the AST
  • Simplify target machine
  • Use conventions

16
Object code sequence
  • Load_Byte (b+Rd)[Rc], 4, Rt
  • Load_Addr 9[Rt], 2, Ra

17
Trivial code generation
18
Code for (7*(1+5))
19
Partial evaluation
20
New Code
21
Simple code generation
  • Consider one AST node at a time
  • Two simplistic target machines
  • Pure register machine
  • Pure stack machine

Figure labels: stack, SP, vars, BP
22
Pure stack machine
  • Instructions

23
Example of p := p + 5
  • Push_Local p
  • Push_Const 5
  • Add_Top2
  • Store_Local p
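
A small simulator can make the effect of these four instructions concrete. This is a sketch under assumptions: the tuple encoding of instructions and the function name `run_stack_machine` are invented for illustration, and only the instructions that appear in this transcript are handled.

```python
def run_stack_machine(program, local_vars):
    """Execute (opcode, operand) pairs on a pure stack machine."""
    stack = []
    for opcode, operand in program:
        if opcode == "Push_Local":
            stack.append(local_vars[operand])
        elif opcode == "Push_Const":
            stack.append(operand)
        elif opcode == "Store_Local":
            local_vars[operand] = stack.pop()
        elif opcode in ("Add_Top2", "Sub_Top2", "Mul_Top2"):
            right, left = stack.pop(), stack.pop()
            if opcode == "Add_Top2":
                stack.append(left + right)
            elif opcode == "Sub_Top2":
                stack.append(left - right)
            else:
                stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return local_vars


program = [
    ("Push_Local", "p"),
    ("Push_Const", 5),
    ("Add_Top2", None),
    ("Store_Local", "p"),
]
final_vars = run_stack_machine(program, {"p": 10})
```

Starting from p = 10, the stack holds 10, then 10 and 5, then 15, and the final store writes 15 back to p.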

24
Pure register machine
  • Instructions

25
Example of p := p + 5
  • Load_Mem p, R1
  • Load_Const 5, R2
  • Add_Reg R2, R1
  • Store_Reg R1, p

26
Simple code generation for a stack machine
  • The AST for b*b - 4*(a*c)

27
The ASTs for the stack machine instructions
28
The AST for b*b - 4*(a*c) rewritten
29
Simple code generation for a stack machine (demo)
  • example: b*b - 4*a*c
  • threaded AST

Figure: AST with - at the root, b*b as its left operand and 4*(a*c) as its right operand
30
Simple code generation for a stack machine (demo)
  • example: b*b - 4*a*c
  • threaded AST

Figure: the AST with each node annotated with its instruction
  • -: Sub_Top2
  • *: Mul_Top2 (three nodes)
  • b: Push_Local b (two nodes)
  • 4: Push_Const 4
  • a: Push_Local a
  • c: Push_Local c
31
Simple code generation for a stack machine (demo)
  • Push_Local b
  • Push_Local b
  • Mul_Top2
  • Push_Const 4
  • Push_Local a
  • Push_Local c
  • Mul_Top2
  • Mul_Top2
  • Sub_Top2
  • example: b*b - 4*a*c
  • rewritten AST
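
The instruction sequence for this example can be produced by a depth-first (postorder) walk of the AST. The sketch below assumes a hypothetical tuple encoding `(op, left, right)` for operator nodes, `str` for local variables and `int` for constants; `gen_stack_code` is an illustrative name.

```python
OPCODES = {"+": "Add_Top2", "-": "Sub_Top2", "*": "Mul_Top2"}


def gen_stack_code(node):
    """Emit code for the left child, then the right child, then the operator."""
    if isinstance(node, int):
        return [f"Push_Const {node}"]
    if isinstance(node, str):
        return [f"Push_Local {node}"]
    op, left, right = node
    return gen_stack_code(left) + gen_stack_code(right) + [OPCODES[op]]


# b*b - 4*(a*c)
tree = ("-", ("*", "b", "b"), ("*", 4, ("*", "a", "c")))
code = gen_stack_code(tree)
```

Because every operator consumes its operands from the top of the stack, no register allocation is needed for this target.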

32
Depth-first code generation
33
Stack configurations
34
Simple code generation for a register machine
  • The ASTs for the register machine instructions

35
Code generation with register allocation
36
Code generation with register numbering
37
Register machine code for b*b - 4*(a*c)
38
Register contents
39
Weighted register allocation
  • It is advantageous to generate the code for the
    child that requires the most registers first
  • Weight: the number of registers required by a node
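
A sketch of how such weights can be computed for a pure register machine, assuming every leaf must be loaded into a register (weight 1) and using the same hypothetical tuple encoding `(op, left, right)` for operator nodes. When the two children have different weights, evaluating the heavier one first lets the lighter one reuse its freed registers; when they are equal, one extra register is needed to hold the first result.

```python
def register_weight(node):
    """Number of registers needed to evaluate node with the heavier child first."""
    if not isinstance(node, tuple):
        return 1                      # a leaf is loaded into one register
    _, left, right = node
    weight_left = register_weight(left)
    weight_right = register_weight(right)
    if weight_left == weight_right:
        return weight_left + 1        # one register must hold the first result
    return max(weight_left, weight_right)


# b*b - 4*(a*c)
tree = ("-", ("*", "b", "b"), ("*", 4, ("*", "a", "c")))
root_weight = register_weight(tree)
```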

40
Register weight of a node
41
AST for b*b - 4*(a*c) with register weights
42
Weighted register machine code
43
Example
  • Parameter number N:                            2  3  1
  • Stored weight:                                 4  2  1
  • Registers occupied when starting parameter N:  0  1  2
  • Maximum per parameter:                         4  3  3
  • Overall maximum:                               4
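
The table can be reproduced with a few lines, assuming parameters are evaluated in order of decreasing weight and each finished parameter keeps one register occupied. `plan_parameters` is a hypothetical helper name.

```python
def plan_parameters(weights):
    """weights maps parameter number to its register weight."""
    order = sorted(weights, key=lambda n: weights[n], reverse=True)
    # Parameter i in the evaluation order starts with i registers occupied.
    per_parameter = [weights[n] + occupied for occupied, n in enumerate(order)]
    return order, per_parameter, max(per_parameter)


order, per_parameter, overall = plan_parameters({1: 1, 2: 4, 3: 2})
```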

44
Example Tree representation
45
Register spilling
  • Too few registers?
  • Spill registers to memory, to be retrieved later
  • Heuristic: select a subtree that uses all
    registers, and replace it by a temporary
  • example: b*b - 4*a*c with 2 registers

Figure: AST annotated with register weights; the root has weight 3
46
Register spilling
  • Load_Mem b, R1
  • Load_Mem b, R2
  • Mul_Reg R2, R1
  • Store_Mem R1, T1
  • Load_Mem a, R1
  • Load_Mem c, R2
  • Mul_Reg R2, R1
  • Load_Const 4, R2
  • Mul_Reg R1, R2
  • Load_Mem T1, R1
  • Sub_Reg R2, R1
47
Another example
Figure: AST annotated with register weights 3, 2, 2, 2, 2, 1, 1, 1
48
Algorithm
49
Machines with register-memory operations
  • An instruction such as Add_Mem X, R1
  • adds the contents of memory location X to R1

50
Register-weighted tree for a memory-register
machine
51
Code generation for basic blocks
  • Finding the optimal rewriting of the AST with
    available instruction templates is NP-complete.
  • Three techniques
  • Basic blocks
  • Bottom-up tree rewriting
  • Register allocation by graph coloring

52
Basic block
  • Improve quality of code emitted by simple
    code generation
  • Consider multiple AST nodes at a time
  • Generate code for maximal basic blocks that
    cannot be extended by including adjacent AST nodes

Basic block: a part of the control flow graph that
contains no splits (jumps) or combines (labels)
53
Example of basic block
  • A basic block consists of expressions and
    assignments
  • Fixed sequence (;) limits code generation
  • An AST is too restrictive

54
From AST to dependency graph
  • AST for the simple basic block

55
Simple algorithm to convert AST to a data
dependency graph
  • Replace arcs by downwards arrows (upwards for
    destination under assignment)
  • Insert data dependencies from use of V to
    preceding assignment to V
  • Insert data dependencies from the assignment to a
    variable V to the previous assignment to V
  • Add roots to the graph (output variables)
  • Remove ;-nodes and connecting arrows

56
Simple data dependency graph
57
Cleaned-up graph
58
Exercise
int n n a1 x (bc) n n
n1 y (bc) n
Convert the above code to a data dependency graph
59
Answer
60
Common subexpression elimination
  • Simple example
  • x = a*a + 2*a*b + b*b
  • y = a*a - 2*a*b + b*b
  • Three common subexpressions
  • double quads = a*a + b*b
  • double cross_prod = 2*a*b
  • x = quads + cross_prod
  • y = quads - cross_prod
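
One way to spot candidates is to count structurally identical subtrees across the expressions of a basic block. The sketch below assumes a hypothetical tuple encoding `(op, left, right)` and left-associative parsing; it only finds equal subexpressions and does not check whether an operand is reassigned between the two uses, so its results are candidates rather than proven common subexpressions.

```python
from collections import Counter


def subexpressions(node):
    """Yield every operator subtree of a tuple-encoded expression."""
    if isinstance(node, tuple):
        yield node
        _, left, right = node
        yield from subexpressions(left)
        yield from subexpressions(right)


def equal_subexpressions(expressions):
    """Return the subtrees that occur more than once in the given expressions."""
    counts = Counter(sub for expr in expressions for sub in subexpressions(expr))
    return {sub for sub, n in counts.items() if n > 1}


aa = ("*", "a", "a")
bb = ("*", "b", "b")
two_ab = ("*", ("*", 2, "a"), "b")
x = ("+", ("+", aa, two_ab), bb)   # a*a + 2*a*b + b*b
y = ("+", ("-", aa, two_ab), bb)   # a*a - 2*a*b + b*b
candidates = equal_subexpressions([x, y])
```

With this encoding the nested product 2*a is reported as well, since it is part of 2*a*b.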

61
Common subexpression
  • Equal subexpressions in a basic block are not
    necessarily common subexpressions
  • x = a*a + 2*a*b + b*b
  • a = b = 0
  • y = a*a - 2*a*b + b*b

62
Common subexpression example (1/3)
63
Common subexpression example (2/3)
64
Common subexpression example (3/3)
65
From dependency graph to code
  • Rewrite nodes with machine instruction templates,
    and linearize the result
  • Instruction ordering: ladder sequences
  • Register allocation: graph coloring

66
Linearization of the data dependency graph
  • Example
  • ((a+b)*c) - d
  • Definition of a ladder sequence
  • Each root node is a ladder sequence
  • A ladder sequence S ending in operator node N can
    be extended with the left operand of N
  • If operator N is commutative then S may also be
    extended with the right operand of N

  • Load_Mem a, R1
  • Add_Mem b, R1
  • Mul_Mem c, R1
  • Sub_Mem d, R1
67
Code generated for a given ladder sequence
  • Load_Mem b, R1
  • Add_Reg I1, R1
  • Add_Mem c, R1
  • Store_Reg R1, x
68
Heuristic ordering algorithm
  • To delay the issues of register allocation, use
    pseudo-registers during the linearization
  • 1. Select a ladder sequence S in which no node has
    more than one incoming dependency
  • 2. Introduce temporary (pseudo-) registers for
    non-leaf operands, which become additional roots
  • 3. Generate code for S, using R1 as the ladder
    register
  • 4. Remove S from the graph
  • 5. Repeat steps 1 through 4 until the entire data
    dependency graph has been consumed and rewritten
    to code

69
Example of linearization
X1
70
The code for y, *, +
  • Load_Reg X1, R1
  • Add_Const 1, R1
  • Mult_Mem d, R1
  • Store_Reg R1, y

71
Remove the ladder sequence y, *, +
72
The code for x, +, +, *
  • Load_Reg X1, R1
  • Mult_Reg X1, R1
  • Add_Mem b, R1
  • Add_Mem c, R1
  • Store_Reg R1, x

73
The Last step
  • Load_Mem a, R1
  • Add_Const 1, R1
  • Load_Reg R1, X1

74
The results of code generation
75
Exercise
  • Generate code for the following dependency graph

Figure: dependency graph with roots x and y
76
Answers
R4
R2
R3
77
Register allocation for the linearized code
  • Map the pseudo-registers to memory locations or
    real registers

gcc compiler
78
Code optimization in the presence of pointers
  • Pointers cause two different problems for the
    dependency graph
  • a = x * y
  • *p = 3
  • b = x * y
  • a = *p * q
  • b = 3
  • c = *p * q

x * y is not a common subexpression if p
happens to point to x or y
*p * q is not a common subexpression if p
happens to point to b
79
Example (1/4)
  • Assignment under a pointer

80
Example (2/4)
Data dependency graph with an assignment under a
pointer
81
Example (3/4)
Cleaned-up graph
82
Example (4/4)
xR1
Target code
83
BURS code generation
  • In practice, machines often have a great variety
    of instructions, simple ones and complicated
    ones, and better code can be generated if all
    available instructions are utilized.
  • Machines often have several hundred different
    machine instructions, often each with ten or more
    addressing modes, and it would be very advantageous
    if code generators for such machines could be
    derived from a concise machine description rather
    than written by hand.

84
BURS code generation
  • Simple instruction patterns (1/2)

85
BURS code generation
  • Simple instruction patterns (2/2)

86
Example Input tree
87
Naïve rewrite
  • Its cost is 17 units
  • 1 + 3 + 4 + 1 + 4 + 3 + 1 = 17

88
Code resulting
89
Top-down largest-fit rewrite
90
Discussions
  • How do we find all possible rewrites, and how do
    we represent them? It will be clear that we do
    not fancy listing them all!!
  • How do we find the best/cheapest rewrite among
    all possibilities, preferably in time linear in
    the size of the expression to be translated?

91
Bottom-up pattern matching
  • The dotted trees

92
Outline code for bottom-up pattern matching
93
Label set resulting
94
Instruction selection by dynamic programming
  • Bottom-up pattern matching with costs

5->reg 6->reg 7.1 8.1
Instruction selection
95
Cost evaluation
  • Lower
  • 5->reg@7
  • 6->reg@8 (1+3+4)
  • Higher
  • 6->reg@12 (1+7+4)
  • 8->reg@9 (1+3+5)
  • Top (?)
  • Exercise

96
Code generation by bottom-up matching
97
Code generation by bottom-up matching, using
commutativity
98
Pattern matching and instruction selection
combined
  • Two basic operands
  • State S1
  • ->cst@0
  • 1->reg@1
  • State S2
  • ->mem@0
  • 2->reg@3

99
States of the BURS
100
Creating the cost-conscious next-state table
  • The triplet +, S1, S1 = S3
  • S3
  • 4->reg@3 (1+1+1)
  • +, S1, S2 = S5
  • S5
  • 3->reg@1+0+3=4
  • 4->reg@1+3+1=5
  • Exercise: , S1, S5
  • Exercise: *, S1, S2
  • 5->reg@1+0+6=7 (4)
  • 6->reg@1+3+4=8
  • 7.1@0+3+0=3 (0)
  • 8.1@0+3+0=3 (0)

101
Cost conscious next table
102
Code generation using cost-conscious next-state
table
103
Register allocation by graph coloring
  • Procedure-wide register allocation
  • Only live variables require register storage
  • Two variables (values) interfere when their live
    ranges overlap

Dataflow analysis: a variable is live at node N
if the value it holds is used on some path
further down the control-flow graph; otherwise it
is dead
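
For straight-line code, liveness can be computed in one backward pass. The sketch below is restricted to that case, with no branches or loops, and assumes each statement is encoded as a hypothetical pair `(defined_variable, used_variables)`; a procedure-wide analysis would iterate over the control-flow graph instead.

```python
def live_before_each(statements, live_out):
    """Return the set of live variables just before each statement."""
    live = set(live_out)
    result = []
    for target, uses in reversed(statements):
        # The assignment kills its target, then its operands become live.
        live = (live - {target}) | set(uses)
        result.append(set(live))
    result.reverse()
    return result


statements = [
    ("a", ["b", "c"]),   # a = b + c
    ("d", ["a", "b"]),   # d = a + b
]
live_sets = live_before_each(statements, live_out={"d"})
```

Here a is live only between the two statements, so its live range does not overlap that of d and the two could share a register.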
104
A program segment for live analysis
105
Live range of the variables
106
Graph coloring
  • NP-complete problem
  • Heuristic: color easy nodes last
  • Find node N with lowest degree
  • Remove N from the graph
  • Color the simplified graph
  • Set color of N to the first color that is not
    used by any of N's neighbors
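
The heuristic translates almost line by line into a recursive function. This is a sketch under assumptions: the interference graph is a hypothetical dict mapping each node to the set of its neighbors, colors are numbered from 0, and there is no limit on the number of colors, so spilling is not modeled.

```python
def color_graph(graph):
    """graph: dict node -> set of neighbors (symmetric). Returns node -> color."""
    if not graph:
        return {}
    # Find node N with lowest degree.
    node = min(graph, key=lambda n: len(graph[n]))
    # Remove N from the graph.
    rest = {n: nbrs - {node} for n, nbrs in graph.items() if n != node}
    # Color the simplified graph.
    colors = color_graph(rest)
    # Give N the first color not used by any of its neighbors.
    used = {colors[n] for n in graph[node]}
    color = 0
    while color in used:
        color += 1
    colors[node] = color
    return colors


# a, b, c interfere pairwise; d interferes only with a.
interference = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
coloring = color_graph(interference)
```

The triangle a, b, c forces three colors; d can reuse a color of b or c.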

107
Coloring process
3 registers
108
Preprocessing the intermediate code
  • Preprocessing of expressions
  • char lower_case_from_capital(char ch) {
  • return ch + ('a' - 'A'); }
  • Constant expression evaluation
  • char lower_case_from_capital(char ch) {
  • return ch + 32; }
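
Constant expression evaluation can be sketched as a bottom-up rewrite of the expression tree. The tuple encoding `(op, left, right)` and the name `fold_constants` are assumptions for illustration; character constants are represented by their integer codes.

```python
def fold_constants(node):
    """Replace operator nodes whose operands are both constants by their value."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, int) and isinstance(right, int):
        if op == "+":
            return left + right
        if op == "-":
            return left - right
        if op == "*":
            return left * right
    return (op, left, right)


# ch + ('a' - 'A')
expr = ("+", "ch", ("-", ord("a"), ord("A")))
folded = fold_constants(expr)
```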

109
Arithmetic simplification
  • Transformations that replace an operation by a
    simpler one are called strength reductions.
  • Operations that can be removed completely are
    called null sequences.
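
A few such rewrites, sketched on a hypothetical tuple encoding `(op, left, right)` and assuming integer operands. The three rules shown (multiplication by a power of two becomes a shift, multiplication by 1 and addition of 0 disappear) are common illustrations and are not claimed to be the table from the slides.

```python
def simplify(node):
    """Apply a few arithmetic simplifications bottom-up."""
    if not isinstance(node, tuple):
        return node
    op, left, right = node
    left, right = simplify(left), simplify(right)
    if isinstance(right, int):
        if op == "*" and right == 1:
            return left                            # E * 1  =>  E
        if op == "+" and right == 0:
            return left                            # E + 0  =>  E
        if op == "*" and right > 1 and (right & (right - 1)) == 0:
            # E * 2**n  =>  E << n  (strength reduction)
            return ("<<", left, right.bit_length() - 1)
    return (op, left, right)


reduced = simplify(("+", ("*", "x", 8), 0))
```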

110
Some transformations for arithmetic simplification
111
Preprocessing of if-statements and goto statements
  • When the condition in an if-then-else statement
    turns out to be constant, we can delete the code
    of the branch that will never be executed. This
    process is called dead code elimination.
  • If a goto or return statement is followed by code
    that has no incoming data flow, that code is
    dead and can be eliminated.

112
Stack representations
113
Stack representations (details)
Figure labels: IF, condition, ELSE, >, x, 7, y, 0, FI
114
Preprocessing of routines
  • In-lining method

115
In-lining result
Advanced example:
int n = 3; printf("squared\n", n*n);
=> int n = 3; printf("squared\n", 3*3);
=> int n = 3; printf("squared\n", 9);
=> Load_par "squared\n"; Load_par 9; Call printf
116
Cloning
  • Example
  • double power_series(int n, double a[], double x) {
  • int p;
  • for (p = 0; p < n; p++) result += a[p] * (x ** p);
  • return result; }
  • Is called with x set to 1.0

double power_series(int n, double a[]) {
    int p;
    for (p = 0; p < n; p++) result += a[p] * (1.0 ** p);
    return result;
}
double power_series(int n, double a[]) {
    int p;
    for (p = 0; p < n; p++) result += a[p];
    return result;
}
117
Postprocessing the target code
  • Stupid instruction sequences
  • Load_Reg R1, R2
  • Load_Reg R2, R1
  • or
  • Store_Reg R1, n
  • Load_Mem n, R1

118
Creating replacement patterns
  • Example
  • Load_Reg Ra, Rb; Load_Reg Rc, Rd
  • Ra = Rd, Rb = Rc => Load_Reg Ra, Rb
  • Load_Const 1, Ra; Add_Reg Rb, Rc
  • Ra = Rb, is_last_use(Rb) => Increment Rc
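
A minimal peephole pass over adjacent instruction pairs, assuming instructions are encoded as hypothetical `(opcode, source, destination)` tuples. It covers the redundant register copy and the store followed by a reload of the same location; the Increment pattern is left out because it needs last-use information that this sketch does not track.

```python
def peephole(code):
    """Drop an instruction when the previous one already made it redundant."""
    out = []
    for instr in code:
        if out:
            prev = out[-1]
            # Load_Reg Ra, Rb; Load_Reg Rb, Ra  =>  Load_Reg Ra, Rb
            if prev[0] == "Load_Reg" and instr == ("Load_Reg", prev[2], prev[1]):
                continue
            # Store_Reg Ra, n; Load_Mem n, Ra  =>  Store_Reg Ra, n
            if prev[0] == "Store_Reg" and instr == ("Load_Mem", prev[2], prev[1]):
                continue
        out.append(instr)
    return out


code = [
    ("Load_Reg", "R1", "R2"),
    ("Load_Reg", "R2", "R1"),
    ("Store_Reg", "R1", "n"),
    ("Load_Mem", "n", "R1"),
]
optimized = peephole(code)
```

A production pass would match many patterns at once, for example with the FSA approach on the next slide, rather than testing each pair by hand.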

119
Locating and replacing instructions
  • Multiple pattern matching
  • Using FSA
  • Dotted items

120
Homework
  • Study sections
  • 4.2.13 Machine code generation
  • 4.3 Assemblers, linkers and loaders