Title: Compiler Run-time Organization Lecture 7
What we have covered so far
- We have covered the front-end phases
- Lexical analysis
- Parsing
- Semantic analysis
- Next are the back-end phases
- Optimization
- Code generation
- Let's take a look at code generation . . .
Run-time environments
- Before discussing code generation, we need to
understand what we are trying to generate
- There are a number of standard techniques for
structuring executable code that are widely used
- Management of run-time resources
- Correspondence between static (compile-time) and
dynamic (run-time) structures
- Storage organization
Run-time Resources
- Execution of a program is initially under the
control of the operating system
- When a program is invoked
- The OS allocates space for the program
- The code is loaded into part of the space
- The OS jumps to the entry point (i.e., main)
Memory Layout
[Figure: memory layout, with one region labeled "Other Space"]
- By tradition, pictures of machine organization
have
- Low address at the top
- High address at the bottom
- Lines delimiting areas for different kinds of
data
- These pictures are simplifications
- E.g., not all memory need be contiguous
What is Other Space?
- Holds all data for the program
- Other Space = Data Space
- Compiler is responsible for
- Generating code
- Orchestrating use of the data area
Code Generation Goals
- Two goals
- Correctness
- Speed
- Most complications in code generation come from
trying to be fast as well as correct
Assumptions about Execution
- (1) Execution is sequential; control moves from one
point in a program to another in a well-defined
order
- (2) When a procedure is called, control eventually
returns to the point immediately after the call
- Do these assumptions always hold?
- An invocation of procedure P is an activation of
P
- The lifetime of an activation of P is
- All the steps to execute P
- Including all the steps in procedures P calls
Lifetimes of Variables
- The lifetime of a variable x is the portion of
execution in which x is defined
- Note that
- Lifetime is a dynamic (run-time) concept
- Scope is a static concept
Activation Trees
- Assumption (2) requires that when P calls Q, then
Q returns before P does
- Lifetimes of procedure activations are properly
nested
- Activation lifetimes can be depicted as a tree
- Example:
    class Main {
      int g() { return 1; }
      int f() { return g(); }
      void main() { g(); f(); }
    }
- The activation tree depends on run-time behavior
- The activation tree may be different for every
program input
- Since activations are properly nested, a stack
can track currently active procedures
[Figure: the same Main/g/f example, with the stack of active procedures shown as Main, f, g]
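The claim that a stack can track the currently active procedures can be checked with a small sketch. It mirrors the Main/g/f example in Python; the `activation` decorator and the `active` and `trace` lists are hypothetical helpers introduced only for this illustration.

```python
# Sketch: record how the stack of active procedures evolves for the
# example program (main calls g, then f; f calls g).

active = []   # stack of currently active procedures (top is last)
trace = []    # snapshot of the stack taken at each call

def activation(proc):
    """Wrap proc so that each activation is a push/pop on `active`."""
    def wrapper(*args):
        active.append(proc.__name__)
        trace.append(list(active))
        try:
            return proc(*args)
        finally:
            active.pop()   # the activation ends when the procedure returns
    return wrapper

@activation
def g():
    return 1

@activation
def f():
    return g()

@activation
def main():
    g()
    f()

main()
for snapshot in trace:
    print(" > ".join(snapshot))
```

Each printed line is one path from the root of the activation tree to a node, which is exactly the stack contents at the moment of that call.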
Revised Memory Layout
[Figure]
Activation Records
- The information needed to manage one procedure
activation is called an activation record (AR) or
frame
- If procedure F calls G, then G's activation
record contains a mix of info about F and G.
What is in G's AR when F calls G?
- F is suspended until G completes, at which
point F resumes. G's AR contains information
needed to resume execution of F.
- G's AR may also contain
- G's return value (needed by F)
- Actual parameters to G (supplied by F)
- Space for G's local variables
The Contents of a Typical AR for G
- Space for G's return value
- Actual parameters
- Pointer to the previous activation record
- The control link points to AR of caller of G
- Machine status prior to calling G
- Contents of registers and program counter
- Local variables
- Other temporary values
- Example:
    class Main {
      int g() { return 1; }
      int f(int x) {
        if (x == 0) return g();
        else return f(x - 1);   (**)
      }
      void main() { f(3);   (*) }
    }
- AR for f includes a control link and a return address
Stack After Two Calls to f
[Figure]
- Main has no argument or local variables and its
result is never used; its AR is uninteresting
- (*) and (**) are return addresses of the
invocations of f
- The return address is where execution resumes
after a procedure call finishes
- This is only one of many possible AR designs
- Would also work for C, Pascal, FORTRAN, etc.
The Main Point
- The compiler must determine, at compile-time, the
layout of activation records and generate code
that correctly accesses locations in the
activation record
- Thus, the AR layout and the code generator must
be designed together!
- The picture shows the state after the call to the 2nd
invocation of f returns
- The advantage of placing the return value 1st in
a frame is that the caller can find it at a fixed
offset from its own frame
- There is nothing magic about this organization
- Can rearrange order of frame elements
- Can divide caller/callee responsibilities
differently
- An organization is better if it improves
execution speed or simplifies code generation
Discussion (Cont.)
- Real compilers hold as much of the frame as
possible in registers
- Especially the method result and arguments
- All references to a global variable point to the
same object
- Can't store a global in an activation record
- Globals are assigned a fixed address once
- Variables with fixed address are statically
allocated
- Depending on the language, there may be other
statically allocated values
Memory Layout with Static Data
[Figure]
Heap Storage
- A value that outlives the procedure that creates
it cannot be kept in the AR
- Class foo() { return new Class; }
- The Class value must survive deallocation of
foo's AR
- Languages with dynamically allocated data use a
heap to store dynamic data
- The code area contains object code
- For most languages, fixed size and read only
- The static area contains data (not code) with
fixed addresses (e.g., global data)
- Fixed size, may be readable or writable
- The stack contains an AR for each currently
active procedure
- Each AR usually fixed size, contains locals
- Heap contains all other data
- In C, heap is managed by malloc and free
Notes (Cont.)
- Both the heap and the stack grow
- Must take care that they don't grow into each
other
- Solution: start heap and stack at opposite ends
of memory and let them grow towards each other
Memory Layout with Heap
[Figure]
Data Layout
- Low-level details of machine architecture are
important in laying out data for correct code and
maximum performance
- Chief among these concerns is alignment
- Most modern machines are (still) 32 bit
- 8 bits in a byte
- 4 bytes in a word
- Machines are either byte or word addressable
- Data is word aligned if it begins at a word
boundary
- Most machines have some alignment restrictions
- Or performance penalties for poor alignment
Alignment (Cont.)
- Example: A string
- "Hello"
- Takes 5 characters (without a terminating \0)
- To word align next datum, add 3 padding
characters to the string
- The padding is not part of the string, it's just
unused memory
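The padding arithmetic in the "Hello" example can be written down directly. This is a minimal sketch assuming the 4-byte word used in these slides; the names `WORD_SIZE`, `padding` and `aligned_size` are made up for the illustration.

```python
WORD_SIZE = 4  # bytes per word, as on the 32-bit machine assumed in the slides

def padding(size, alignment=WORD_SIZE):
    """Bytes of padding needed after `size` bytes to reach the next boundary."""
    return (-size) % alignment

def aligned_size(size, alignment=WORD_SIZE):
    """Size of a datum once it is padded out to a whole number of words."""
    return size + padding(size, alignment)

print(padding(len("Hello")))        # 3
print(aligned_size(len("Hello")))   # 8
```

A datum whose size is already a multiple of the word size needs no padding, so `padding(8)` is 0.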
Code Generation Overview
- Stack machines
- The MIPS assembly language
- A simple source language
- Stack-machine implementation of the simple
language
Stack Machines
- A simple evaluation model
- No variables or registers
- A stack of values for intermediate results
- Each instruction
- Takes its operands from the top of the stack
- Removes those operands from the stack
- Computes the required operation on them
- Pushes the result on the stack
Example of Stack Machine Operation
- The addition operation on a stack machine
[Figure]
Example of a Stack Machine Program
- Consider two instructions
- push i - place the integer i on top of the
stack
- add - pop two elements, add them and put
the result back on the stack
- A program to compute 7 + 5
- push 7
- push 5
- add
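The two instructions are enough to write a tiny interpreter. The sketch below is illustrative only; the textual instruction format and the `run` function are assumptions of this example, not part of the lecture.

```python
def run(program):
    """Execute a list of 'push i' / 'add' instructions and return the final stack."""
    stack = []
    for instr in program:
        op, *args = instr.split()
        if op == "push":
            stack.append(int(args[0]))    # place the integer on top of the stack
        elif op == "add":
            right = stack.pop()           # operands come from the top of the stack
            left = stack.pop()
            stack.append(left + right)    # the result goes back on the stack
        else:
            raise ValueError(f"unknown instruction: {instr}")
    return stack

print(run(["push 7", "push 5", "add"]))   # [12]
```

Note that `add` names no operands and no destination; both are implied by the stack.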
Why Use a Stack Machine?
- Each operation takes operands from the same place
and puts results in the same place
- This means a uniform compilation scheme
- And therefore a simpler compiler
Why Use a Stack Machine?
- Location of the operands is implicit
- Always on the top of the stack
- No need to specify operands explicitly
- No need to specify the location of the result
- Instruction "add" as opposed to "add r1, r2"
- ⇒ Smaller encoding of instructions
- ⇒ More compact programs
- This is one reason why Java bytecode uses a
stack evaluation model
Optimizing the Stack Machine
- The add instruction does 3 memory operations
- Two reads and one write to the stack
- The top of the stack is frequently accessed
- Idea: keep the top of the stack in a register
(called accumulator)
- Register accesses are faster
- The add instruction is now
- acc ← acc + top_of_stack
- Only one memory operation!
Stack Machine with Accumulator
- Invariants
- The result of computing an expression is always
in the accumulator
- For an operation op(e1,…,en) push the accumulator
on the stack after computing each of e1,…,en-1
- After the operation pop n-1 values
- After computing an expression the stack is as
before
Stack Machine with Accumulator. Example
- Compute 7 + 5 using an accumulator
A Bigger Example: 3 + (7 + 5)
Code                        Acc   Stack
acc ← 3                     3     <init>
push acc                    3     3, <init>
acc ← 7                     7     3, <init>
push acc                    7     7, 3, <init>
acc ← 5                     5     7, 3, <init>
acc ← acc + top_of_stack    12    7, 3, <init>
pop                         12    3, <init>
acc ← acc + top_of_stack    15    3, <init>
pop                         15    <init>
- It is very important that the stack is preserved
across the evaluation of a sub-expression
- Stack before the evaluation of 7 + 5 is 3, <init>
- Stack after the evaluation of 7 + 5 is 3, <init>
- The first operand is on top of the stack
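The accumulator invariants can be exercised with a short simulation that reproduces the trace for 3 + (7 + 5). It is a sketch under assumptions: expressions are encoded as Python ints or tuples `("+", e1, e2)`, and `evaluate` is a hypothetical helper. The stack is a list whose top is the last element, so it prints in the reverse order of the table above.

```python
def evaluate(expr, stack, trace):
    """Return the accumulator after evaluating expr; the stack is left as found."""
    if isinstance(expr, int):
        acc = expr
        trace.append((f"acc <- {expr}", acc, list(stack)))
        return acc
    op, e1, e2 = expr
    assert op == "+", "only addition is modelled in this sketch"
    acc = evaluate(e1, stack, trace)
    stack.append(acc)                       # push acc
    trace.append(("push acc", acc, list(stack)))
    acc = evaluate(e2, stack, trace)
    acc = stack[-1] + acc                   # acc <- acc + top_of_stack (one read)
    trace.append(("acc <- acc + top_of_stack", acc, list(stack)))
    stack.pop()                             # pop the saved first operand
    trace.append(("pop", acc, list(stack)))
    return acc

stack, trace = [], []
result = evaluate(("+", 3, ("+", 7, 5)), stack, trace)
for step, acc, snapshot in trace:
    print(f"{step:28} acc={acc:<3} stack={snapshot}")
```

The recursion makes the preservation property visible: every call pushes exactly once and pops exactly once around its second operand.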
From Stack Machines to MIPS
- The compiler generates code for a stack machine
with accumulator
- We want to run the resulting code on the MIPS
processor (or simulator)
- We simulate stack machine instructions using MIPS
instructions and registers
Simulating a Stack Machine
- The accumulator is kept in MIPS register $a0
- The stack is kept in memory
- The stack grows towards lower addresses
- Standard convention on the MIPS architecture
- The address of the next location on the stack is
kept in MIPS register $sp
- The top of the stack is at address $sp + 4
MIPS Assembly
- MIPS architecture
- Prototypical Reduced Instruction Set Computer
(RISC) architecture
- Arithmetic operations use registers for operands
and results
- Must use load and store instructions to use
operands and results in memory
- 32 general purpose registers (32 bits each)
- We will use $sp, $a0 and $t1 (a temporary
register)
A Sample of MIPS Instructions
- lw reg1 offset(reg2)
- Load 32-bit word from address reg2 + offset into
reg1
- add reg1 reg2 reg3
- reg1 ← reg2 + reg3
- sw reg1 offset(reg2)
- Store 32-bit word in reg1 at address reg2 +
offset
- addiu reg1 reg2 imm
- reg1 ← reg2 + imm
- "u" means overflow is not checked
- li reg imm
- reg ← imm
MIPS Assembly. Example.
- The stack-machine code for 7 + 5 in MIPS
    acc ← 7                     li $a0 7
    push acc                    sw $a0 0($sp)
                                addiu $sp $sp -4
    acc ← 5                     li $a0 5
    acc ← acc + top_of_stack    lw $t1 4($sp)
                                add $a0 $a0 $t1
    pop                         addiu $sp $sp 4
- We now generalize this to a simple language
A Small Language
- A language with integers and integer operations
- P → D P | D
- D → def id(ARGS) = E
- ARGS → id, ARGS | id
- E → int | id | if E1 = E2 then E3 else E4
    | E1 + E2 | E1 - E2 | id(E1,…,En)
A Small Language (Cont.)
- The first function definition f is the main
routine
- Running the program on input i means computing
f(i)
- Program for computing the Fibonacci numbers
    def fib(x) = if x = 1 then 0 else
                 if x = 2 then 1 else
                 fib(x - 1) + fib(x - 2)
Code Generation Strategy
- For each expression e we generate MIPS code that
- Computes the value of e in $a0
- Preserves $sp and the contents of the stack
- We define a code generation function cgen(e)
whose result is the code generated for e
Code Generation for Constants
- The code to evaluate a constant simply copies it
into the accumulator
- cgen(i) = li $a0 i
- Note that this also preserves the stack, as
required
Code Generation for Add
- cgen(e1 + e2) =
    cgen(e1)
    sw $a0 0($sp)
    addiu $sp $sp -4
    cgen(e2)
    lw $t1 4($sp)
    add $a0 $t1 $a0
    addiu $sp $sp 4
- Possible optimization: Put the result of e1
directly in register $t1?
Code Generation for Add. Wrong!
- Optimization: Put the result of e1 directly in
$t1?
- cgen(e1 + e2) =
    cgen(e1)
    move $t1 $a0
    cgen(e2)
    add $a0 $t1 $a0
- Try to generate code for 3 + (7 + 5)
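To see why the shortcut fails, it helps to run both schemes on 3 + (7 + 5). The sketch below models only the registers and the stack rather than generating assembly; the tuple encoding `("+", e1, e2)` and the helpers `eval_stack` and `eval_t1` are assumptions made for this illustration.

```python
def eval_stack(e, regs, stack):
    """Correct scheme: e1 is saved on the stack while e2 is evaluated."""
    if isinstance(e, int):
        regs["a0"] = e                          # li $a0 i
        return
    _, e1, e2 = e
    eval_stack(e1, regs, stack)
    stack.append(regs["a0"])                    # sw $a0 0($sp); addiu $sp $sp -4
    eval_stack(e2, regs, stack)
    regs["t1"] = stack[-1]                      # lw $t1 4($sp)
    regs["a0"] = regs["t1"] + regs["a0"]        # add $a0 $t1 $a0
    stack.pop()                                 # addiu $sp $sp 4

def eval_t1(e, regs):
    """Proposed shortcut: e1 is kept in $t1, which a nested addition overwrites."""
    if isinstance(e, int):
        regs["a0"] = e                          # li $a0 i
        return
    _, e1, e2 = e
    eval_t1(e1, regs)
    regs["t1"] = regs["a0"]                     # move $t1 $a0
    eval_t1(e2, regs)                           # may clobber $t1
    regs["a0"] = regs["t1"] + regs["a0"]        # add $a0 $t1 $a0

expr = ("+", 3, ("+", 7, 5))
good = {"a0": 0, "t1": 0}
eval_stack(expr, good, [])
bad = {"a0": 0, "t1": 0}
eval_t1(expr, bad)
print(good["a0"], bad["a0"])                    # 15 19
```

The inner addition stores 7 in $t1, so the outer addition computes 7 + 12 instead of 3 + 12. The shortcut happens to work when the right operand is a constant, which is what makes the bug easy to miss.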
Code Generation Notes
- The code for + is a template with "holes" for
code for evaluating e1 and e2
- Stack machine code generation is recursive
- Code for e1 + e2 consists of code for e1 and e2
glued together
- Code generation can be written as a
recursive-descent of the AST
- At least for expressions
Code Generation for Sub and Constants
- New instruction: sub reg1 reg2 reg3
- Implements reg1 ← reg2 - reg3
- cgen(e1 - e2) =
    cgen(e1)
    sw $a0 0($sp)
    addiu $sp $sp -4
    cgen(e2)
    lw $t1 4($sp)
    sub $a0 $t1 $a0
    addiu $sp $sp 4
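The templates for constants, + and - translate almost line for line into a recursive function. The sketch below is one possible rendering, not the course's implementation: expressions are Python ints or tuples `(op, e1, e2)`, and `run` is a toy interpreter for just the instructions emitted here (it ignores 32-bit wraparound and uses an arbitrary starting $sp of 1000).

```python
def cgen(e):
    """Return a list of MIPS instructions that leave the value of e in $a0."""
    if isinstance(e, int):
        return [f"li $a0 {e}"]
    op, e1, e2 = e
    mnemonic = {"+": "add", "-": "sub"}[op]
    return (cgen(e1)
            + ["sw $a0 0($sp)", "addiu $sp $sp -4"]      # push value of e1
            + cgen(e2)
            + ["lw $t1 4($sp)",                          # reload value of e1
               f"{mnemonic} $a0 $t1 $a0",
               "addiu $sp $sp 4"])                       # pop

def run(code, sp=1000):
    """Interpret the instructions emitted by cgen; memory is a dict of words."""
    regs = {"$a0": 0, "$t1": 0, "$sp": sp}
    mem = {}
    for line in code:
        op, *args = line.split()
        if op == "li":
            regs[args[0]] = int(args[1])
        elif op in ("sw", "lw"):
            offset, base = args[1].rstrip(")").split("(")
            addr = regs[base] + int(offset)
            if op == "sw":
                mem[addr] = regs[args[0]]
            else:
                regs[args[0]] = mem[addr]
        elif op == "addiu":
            regs[args[0]] = regs[args[1]] + int(args[2])
        elif op in ("add", "sub"):
            x, y = regs[args[1]], regs[args[2]]
            regs[args[0]] = x + y if op == "add" else x - y
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return regs

regs = run(cgen(("+", 3, ("+", 7, 5))))
print(regs["$a0"], regs["$sp"])    # 15 1000
```

The final $sp equals the initial one, which is the stack-preservation invariant that every cgen template has to maintain.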
Code Generation for Conditional
- We need flow control instructions
- New instruction: beq reg1 reg2 label
- Branch to label if reg1 = reg2
- New instruction: b label
- Unconditional jump to label
Code Generation for If (Cont.)
- cgen(if e1 = e2 then e3 else e4) =
    cgen(e1)
    sw $a0 0($sp)
    addiu $sp $sp -4
    cgen(e2)
    lw $t1 4($sp)
    addiu $sp $sp 4
    beq $a0 $t1 true_branch
  false_branch:
    cgen(e4)
    b end_if
  true_branch:
    cgen(e3)
  end_if:
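One practical detail the template leaves implicit is that the labels must be unique for each conditional, otherwise nested ifs would collide. A minimal sketch of that, assuming a tuple encoding `("if", e1, e2, e3, e4)` and a numeric label suffix (both are choices made for this illustration, not something the slides specify):

```python
import itertools

_label_ids = itertools.count()   # source of fresh label suffixes

def cgen(e):
    """Emit code for integer constants and for `if e1 = e2 then e3 else e4`."""
    if isinstance(e, int):
        return [f"li $a0 {e}"]
    tag, e1, e2, e3, e4 = e
    assert tag == "if", "only constants and conditionals are handled here"
    n = next(_label_ids)         # unique per conditional, so nesting is safe
    return (cgen(e1)
            + ["sw $a0 0($sp)", "addiu $sp $sp -4"]
            + cgen(e2)
            + ["lw $t1 4($sp)", "addiu $sp $sp 4",
               f"beq $a0 $t1 true_branch{n}",
               f"false_branch{n}:"]
            + cgen(e4)           # the else-branch comes first, as in the template
            + [f"b end_if{n}", f"true_branch{n}:"]
            + cgen(e3)
            + [f"end_if{n}:"])

print("\n".join(cgen(("if", 1, 2, 10, 20))))
```

Unlike the templates for + and -, the pop happens before the branch, so both branches start with the stack already restored.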
The Activation Record
- Code for function calls and function definitions
depends on the layout of the activation record
- A very simple AR suffices for this language
- The result is always in the accumulator
- No need to store the result in the AR
- The activation record holds actual parameters
- For f(x1,…,xn) push xn,…,x1 on the stack
- These are the only variables in this language
The Activation Record (Cont.)
- The stack discipline guarantees that on function
exit $sp is the same as it was on function entry
- No need for a control link
- We need the return address
- It's handy to have a pointer to the current
activation
- This pointer lives in register $fp (frame pointer)
The Activation Record
- Summary: For this language, an AR with the
caller's frame pointer, the actual parameters,
and the return address suffices
- Picture: Consider a call to f(x,y). The AR will
be:
[Figure: AR of f, from higher to lower addresses: old fp, y, x]
Code Generation for Function Call
- The calling sequence is the instructions (of both
caller and callee) to set up a function
invocation
- New instruction: jal label
- Jump to label, save address of next instruction
in $ra
- On other architectures the return address is
stored on the stack by the "call" instruction
Code Generation for Function Call (Cont.)
- cgen(f(e1,…,en)) =
    sw $fp 0($sp)
    addiu $sp $sp -4
    cgen(en)
    sw $a0 0($sp)
    addiu $sp $sp -4
    …
    cgen(e1)
    sw $a0 0($sp)
    addiu $sp $sp -4
    jal f_entry
- The caller saves its value of the frame pointer
- Then it saves the actual parameters in reverse
order
- The caller saves the return address in register
$ra
- The AR so far is 4*n + 4 bytes long
Code Generation for Function Definition
- New instruction: jr reg
- Jump to address in register reg
- cgen(def f(x1,…,xn) = e) =
    move $fp $sp
    sw $ra 0($sp)
    addiu $sp $sp -4
    cgen(e)
    lw $ra 4($sp)
    addiu $sp $sp z
    lw $fp 0($sp)
    jr $ra
- Note: The frame pointer points to the top, not
bottom of the frame
- The callee pops the return address, the actual
arguments and the saved value of the frame
pointer
- z = 4*n + 8
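The caller and callee halves of the calling sequence can be sketched as two emitters. The encoding `("call", name, args)`, the function `cgen_def`, and the explicit `f_entry:` label line are assumptions of this sketch; the slides show the `jal f_entry` target but not where the label is emitted.

```python
def cgen(e):
    """Emit code for integer constants and calls ("call", name, [e1, ..., en])."""
    if isinstance(e, int):
        return [f"li $a0 {e}"]
    tag, name, args = e
    assert tag == "call", "only constants and calls are handled here"
    code = ["sw $fp 0($sp)", "addiu $sp $sp -4"]      # caller saves its frame pointer
    for arg in reversed(args):                        # push en, ..., e1
        code += cgen(arg) + ["sw $a0 0($sp)", "addiu $sp $sp -4"]
    return code + [f"jal {name}_entry"]               # return address goes to $ra

def cgen_def(name, params, body):
    """Emit the callee side for `def name(params) = body`."""
    z = 4 * len(params) + 8        # return address + n arguments + saved fp
    return ([f"{name}_entry:",     # label placement is an assumption of this sketch
             "move $fp $sp",
             "sw $ra 0($sp)",
             "addiu $sp $sp -4"]
            + cgen(body)
            + ["lw $ra 4($sp)",
               f"addiu $sp $sp {z}",
               "lw $fp 0($sp)",
               "jr $ra"])

print("\n".join(cgen(("call", "f", [1, 2]))))
print("\n".join(cgen_def("f", ["x", "y"], 0)))
```

For a two-argument function z is 16: the callee's single `addiu` undoes the caller's three pushes and its own push of $ra.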
Calling Sequence. Example for f(x,y).
[Figure: stack contents at four points: before call, on entry, before exit, after call]
Code Generation for Variables
- Variable references are the last construct
- The variables of a function are just its
parameters
- They are all in the AR
- Pushed by the caller
- Problem: Because the stack grows when
intermediate results are saved, the variables are
not at a fixed offset from $sp
Code Generation for Variables (Cont.)
- Solution: use a frame pointer
- Always points to the return address on the stack
- Since it does not move it can be used to find the
variables
- Let xi be the ith (i = 1,…,n) formal parameter of
the function for which code is being generated
- cgen(xi) = lw $a0 z($fp)   (z = 4*i)
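Code Generation for Variables (Cont.)
- Example: For a function def f(x,y) = e the
activation and frame pointer are set up as
follows:
[Figure: stack, from higher to lower addresses: old fp, y, x, return address; $fp points to the return address]
- x is at $fp + 4
- y is at $fp + 8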
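The offset rule z = 4*i is simple enough to state as code. A minimal sketch, where `cgen_var` and the list of formal parameter names are hypothetical helpers for this example:

```python
def cgen_var(name, params):
    """Emit the load for a parameter, given the function's formal parameter list."""
    i = params.index(name) + 1          # 1-based position of the formal parameter
    return [f"lw $a0 {4 * i}($fp)"]     # z = 4*i bytes above the return address

print(cgen_var("x", ["x", "y"]))        # ['lw $a0 4($fp)']
print(cgen_var("y", ["x", "y"]))        # ['lw $a0 8($fp)']
```

Because the offset is taken from $fp and not from $sp, it stays valid however many intermediate results the body has pushed.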
- The activation record must be designed together
with the code generator
- Code generation can be done by recursive
traversal of the AST
- Production compilers do different things
- Emphasis is on keeping values (esp. current stack
frame) in registers
- Intermediate results are laid out in the AR, not
pushed and popped from the stack
End of Lecture
- Next Lecture: Chapter 5
- Names
- Bindings
- Type Checking
- Scopes