Title: Activation Records
1 Activation Records
2 The Compiler So Far
- Lexical analysis
- Detects inputs with illegal tokens
- Syntactic analysis
- Detects inputs with ill-formed parse trees
- Semantic analysis
- Tries to catch all remaining errors
3 Goals of a Semantic Analyzer
- Find remaining errors that would make the program invalid
- undefined variables, types
- type errors that can be caught statically
- Figure out useful information for later phases
- types of all expressions
- data layout
- Terminology
- Static: checks done by the compiler
- Dynamic: checks done at run time
4 Scoping
- The scope rules of a language
- Determine which declaration of a named object corresponds to each use of the object
- C and Java use static scoping
- Mapping from uses to declarations is made at compile time
- Early Lisp dialects, APL, and Snobol use dynamic scoping
- Mapping from uses to declarations is made at run time
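The difference between the two rules can be sketched in a few lines of Python. Python itself is statically scoped, so the dynamic rule is simulated here with an explicit stack of bindings. The names dynamic_env, lookup_dynamic, and the caller_* functions are hypothetical and exist only for this illustration.

```python
x = "global"

def show():
    # Static scoping: x is resolved by where show() is written.
    return x

def caller_static():
    x = "local to caller"    # a different x; show() cannot see it
    return show()

# Dynamic scoping, simulated: one dictionary of bindings per activation.
dynamic_env = [{"x": "global"}]

def lookup_dynamic(name):
    # Search the most recent activation first, then its callers.
    for bindings in reversed(dynamic_env):
        if name in bindings:
            return bindings[name]
    raise NameError(name)

def show_dynamic():
    return lookup_dynamic("x")

def caller_dynamic():
    dynamic_env.append({"x": "local to caller"})   # binding made at run time
    try:
        return show_dynamic()
    finally:
        dynamic_env.pop()                          # binding dies with the call
```

With static scoping the use of x in show() always maps to the global declaration. With the simulated dynamic rule, the same use maps to whichever binding is newest when the call happens.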
5 Symbol Tables
- Purpose
- keep track of names declared in the program
- Symbol table entry
- associates a name with a set of attributes
- kind of name (variable, class, field, method, ...)
- type (int, float, ...)
- nesting level
- memory location (where it will be found at run time)
6 An Implementation
- Symbol table can consist of
- a hash table for all names, and
- a stack to keep track of scope
[Figure: symbol table with entries for the names x and y]
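One possible way to combine a hash table with a scope stack is sketched below. It is an illustration under assumptions, not the layout shown in the figure: the class name SymbolTable, its methods, and the attribute dictionary are all hypothetical.

```python
class SymbolTable:
    def __init__(self):
        self.table = {}       # hash table: name -> list of entries, innermost last
        self.scopes = [[]]    # scope stack: names declared in each open scope

    def enter_scope(self):
        self.scopes.append([])

    def exit_scope(self):
        # Undo every declaration made in the scope being closed.
        for name in self.scopes.pop():
            self.table[name].pop()
            if not self.table[name]:
                del self.table[name]

    def declare(self, name, kind, type_):
        if name in self.scopes[-1]:
            raise KeyError(f"{name} already declared in this scope")
        entry = {"kind": kind, "type": type_, "level": len(self.scopes) - 1}
        self.table.setdefault(name, []).append(entry)
        self.scopes[-1].append(name)

    def lookup(self, name):
        # The innermost visible declaration is the last entry for the name.
        entries = self.table.get(name)
        return entries[-1] if entries else None
```

Lookups cost one hash probe, and leaving a scope costs time proportional to the number of names that scope declared.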
7 Type Systems
- A language's type system specifies which operations are valid for which types
- A set of values
- A set of operations allowed on those values
- The goal of type checking is to ensure that operations are used with the correct types
- Enforces intended interpretation of values
- Type inference is the process of filling in missing type information
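A minimal sketch of static type checking over a toy expression language follows. The tuple-based expression encoding, the three type names, and the function type_of are assumptions made for this example only.

```python
def type_of(expr, env):
    """Return the type of expr, or raise TypeError if an operation is misused.

    expr is an int or bool literal, a variable name (str), or a tuple
    (op, left, right). env maps variable names to their declared types.
    """
    if isinstance(expr, bool):          # test bool first: bool is a subclass of int
        return "bool"
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):
        if expr not in env:
            raise TypeError(f"undefined variable {expr}")
        return env[expr]
    op, left, right = expr
    lt, rt = type_of(left, env), type_of(right, env)
    if op in ("+", "*") and lt == rt == "int":
        return "int"
    if op == "<" and lt == rt == "int":
        return "bool"
    if op == "and" and lt == rt == "bool":
        return "bool"
    raise TypeError(f"operator {op} is not valid for {lt} and {rt}")
```

The checker runs before the program does, so an expression such as n + ok with n an int and ok a bool is rejected without ever being evaluated.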
8 Intermediate Code Generation
- Lexical Analysis: reports lexical errors, produces tokens
- Syntactic Analysis: reports syntax errors, produces AST
- Semantic Analysis: reports semantic errors, produces AST
- Intermediate Code Gen: produces IR
9 Run time vs. Compile time
- The compiler must generate code to handle issues that arise at run time
- Representation of various data types
- Procedure linkage
- Storage organization
- Big issue #1: Allow separate compilation
- Without it we can't build large systems
- Saves compile time
- Saves development time
- We must establish conventions on memory layout, calling sequences, procedure entries and exits, interfaces, etc.
10 Activation Records
- A procedure is a control abstraction
- it associates a name with a chunk of code
- that piece of code is regarded in terms of its purpose and not of its implementation
- A procedure creates its own name space
- It can declare local variables
- Local declarations may hide non-local ones
- Local names cannot be seen from outside
11 Control Abstraction
- Procedures must have a well-defined call mechanism
- In many languages
- a call creates an instance (activation) of the procedure
- on exit, control returns to the call site, to the point right after the call
- Use a call graph to see the set of potential calls
12 Handling Control Abstractions
- Generated code must be able to
- preserve current state
- save variables that cannot be saved in registers
- save specific register values
- establish procedure environment on entry
- map actual to formal parameters
- create storage for locals
- restore previous state on exit
13 Local Variables
- Functions have local variables
- created upon entry
- Several invocations may exist
- Each invocation has an instantiation
- Local variables are (often) destroyed upon function exit
- Happens in a LIFO manner
- What else operates in a LIFO manner?
14 Stack
- Last In, First Out (LIFO) data structure
- Stack grows down
main () { a(0); }
void a (int m) { b(1); }
void b (int n) { c(2); }
void c (int o) { d(3); }
void d (int p) { }
[Figure: the stack]
15 Stack Frames
- Basic operations: push, pop
- Happens too frequently!
- Local variables can be pushed/popped in large batches (on function entry/exit)
- Instead, use a big array with a stack pointer (sp)
- Anything beyond the sp is garbage
- A frame pointer (fp) indicates the start of the frame for this procedure
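The "big array plus stack pointer" idea can be sketched as follows. This is a simplified model under assumptions: the class FrameStack and its methods are hypothetical, a slot holds one value of any size, and the old frame pointer is handed back to the caller of enter() instead of being stored in the frame.

```python
class FrameStack:
    def __init__(self, size=16):
        self.mem = [None] * size   # one big array
        self.sp = size             # stack grows down, so sp starts past the last slot
        self.fp = size

    def enter(self, num_locals):
        """Allocate a whole frame with one subtraction; return the old fp."""
        saved_fp = self.fp
        self.fp = self.sp          # the new frame starts where the old one ended
        self.sp -= num_locals      # batch "push" of all locals
        if self.sp < 0:
            raise OverflowError("stack overflow")
        return saved_fp

    def leave(self, saved_fp):
        self.sp = self.fp          # batch "pop"; old contents stay behind as garbage
        self.fp = saved_fp

    def set_local(self, i, value):
        self.mem[self.fp - 1 - i] = value   # locals live at fixed offsets from fp

    def get_local(self, i):
        return self.mem[self.fp - 1 - i]
```

After leave(), the slots of the dead frame still hold their old values, which is what "garbage beyond the sp" means: the data is there, but nothing may rely on it.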
16 Stack Frames
- Activation record or stack frame stores
- local vars
- parameters
- return address
- temporaries
- (etc)
- (Frame size not known until late in the compilation process)
[Figure: stack frame layout, from higher to lower addresses]
previous frame: Arg n ... Arg 2, Arg 1, Static link
fp points here
current frame: Local vars, Ret address, Temporaries, Saved regs, Arg m ... Arg 1, Static link
sp points here
next frame
17 The Frame Pointer
- Keeps track of the bottom of the current activation record
- g() { ... f(a1, ..., an) ... }
- g is the caller
- f is the callee
- What if f calls two functions?
[Figure: stack frame layout, from higher to lower addresses]
g's frame: Arg n ... Arg 2, Arg 1, Static link
fp points here
f's frame: Local vars, Ret address, Temporaries, Saved regs, Arg m ... Arg 1, Static link
sp points here
next frame
18 Stacks
- Work with languages that have nested functions
- Functions declared inside other functions
- Inner functions can use outer functions' local vars
- Doesn't happen in C
- Does happen in Pascal
- Work with languages that support function pointers
- ML, Scheme have higher-order functions
- nested functions AND
- functions as returnable values
- (ML, Scheme cannot use stacks for local vars!)
19 Handling Nested Procedures
- Some languages allow nested procedures
- Example
proc A()
  proc B()
    call C()
  proc C()
    proc D()
      proc E()
        call B()
      call E()
    call D()
  call B()
- Call sequence: A, B, C, D, E, B
- Can B call C? Can B call D? Can B call E? Can E access C's locals? Can C access B's locals?
20 Handling Nested Procedures
- In order to implement the "closest nested scope" rule we need access to the frame of the lexically enclosing procedure
- Solution: static links
- Reference to the frame of the lexically enclosing procedure
- Static chains of such links are created.
- How do we use them to access non-locals?
- The compiler knows the nesting level s of the scope that declares a variable
- The compiler knows the nesting level t of the current scope
- Follow t - s links (counting levels so that deeper scopes have larger numbers)
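Non-local access through a static chain can be sketched as below. The sketch assumes nesting levels grow with depth (outermost procedure at level 0), so the number of links to follow is the current level minus the declaring level. The class Frame and the function load_nonlocal are hypothetical names.

```python
class Frame:
    def __init__(self, level, static_link, local_vars):
        self.level = level               # nesting level of the procedure
        self.static_link = static_link   # frame of the lexically enclosing procedure
        self.vars = dict(local_vars)

def load_nonlocal(current, decl_level, name):
    """Fetch a variable declared at nesting level decl_level.

    The hop count is a compile-time constant: both levels are known
    to the compiler, only the frames themselves exist at run time.
    """
    frame = current
    for _ in range(current.level - decl_level):
        frame = frame.static_link
    return frame.vars[name]
```

For a procedure at level 2 reading a variable declared at level 0, the generated code would follow exactly two links, regardless of how many other frames sit between them on the stack.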
21 Handling Nested Procedures
- Setting the links
- if callee is nested directly within caller
- set its static link to point to the caller's frame pointer
- example: proc B() is declared inside proc A(), and A calls B
- if callee has the same nesting level as the caller
- set its static link to point to wherever the caller's static link points
- example: proc A() and proc B() are declared side by side, and A calls B
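Both cases are instances of one rule: starting from the caller's frame, follow (caller level - callee level + 1) static links, and the frame reached is the callee's static link. The sketch below applies that rule to the A, B, C, D, E example from slide 19, assuming A is at level 0, B and C at level 1, D at level 2, and E at level 3. Frame and call are hypothetical names.

```python
class Frame:
    def __init__(self, name, level, static_link):
        self.name = name
        self.level = level
        self.static_link = static_link

def call(caller, callee_name, callee_level):
    """Create the callee's frame and set its static link.

    0 hops: callee is nested directly in the caller (link = caller's frame).
    1 hop:  callee is at the caller's level (link = caller's static link).
    More hops: callee is declared further out than the caller.
    """
    target = caller
    for _ in range(caller.level - callee_level + 1):
        target = target.static_link
    return Frame(callee_name, callee_level, target)
```

In the call sequence A, B, C, D, E, B, the last call needs three hops (E to D to C to A), so the second activation of B gets A's frame as its static link, the same as the first.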
22 Activation Records
- Handling nested procedures
- We must keep a static link (vs. dynamic link)
- Registers vs. memory
- Registers are faster. Why? Cycles?
- Operands in memory: ld ebx, 0[fp]; ld ecx, 4[fp]; add eax, ebx, ecx
- Operands in registers: add eax, ebx, ecx
23 Registers
- Depending on the architecture, there may be more or fewer registers (typically 32)
- Always faster to use registers
- (remember the memory hierarchy)
- Want to keep local variables in registers
- (when possible; why can't we decide now?)
24 Caller-save vs. Callee-save Registers
- What if f wants to use a reg? Who must save/restore that register?
- g() { ... f(a1, ..., an) ... }
- If g saves before call, restores after call => caller-save
- If f saves before using a reg, restores after => callee-save
- What are the tradeoffs?
[Figure: stack frame layout with g's frame above fp and f's frame between fp and sp, as on slide 17]
25 Caller-save vs. Callee-save Registers
- Usually some registers are marked caller-save and some are marked callee-save
- e.g. MIPS: r16-23 are callee-save
- all others caller-save
- Optimization: if g knows it will not need a value after a call, it may put it in a caller-save register, but not save it
- Deciding on the register will be the task of the register allocator (still a hot research area).
26 Parameter Passing
- By value
- actual parameter is copied
- By reference
- address of actual parameter is stored
- By value-result
- call by value, AND
- the values of the formal parameters are copied back into the actual parameters
- Typical Convention
- Usually 4 parameters placed in registers
- Rest on stack
- Why?
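The three passing modes give different results when the same variable is passed twice. The sketch below models memory as a Python list and runs the procedure body "a = a + 1; b = b + 1" under each mode. All names here (body, call_by_value, and so on) are hypothetical and chosen for illustration.

```python
def body(read, write):
    """The callee's code, written against formal parameters a and b."""
    write("a", read("a") + 1)
    write("b", read("b") + 1)

def call_by_value(mem, addr_a, addr_b):
    formals = {"a": mem[addr_a], "b": mem[addr_b]}   # actuals are copied in
    body(formals.__getitem__, formals.__setitem__)
    # nothing is copied back: the caller's memory is untouched

def call_by_reference(mem, addr_a, addr_b):
    addrs = {"a": addr_a, "b": addr_b}               # formals hold addresses

    def read(formal):
        return mem[addrs[formal]]

    def write(formal, value):
        mem[addrs[formal]] = value

    body(read, write)

def call_by_value_result(mem, addr_a, addr_b):
    formals = {"a": mem[addr_a], "b": mem[addr_b]}   # copy in, as in by-value
    body(formals.__getitem__, formals.__setitem__)
    mem[addr_a] = formals["a"]                       # copy back on return
    mem[addr_b] = formals["b"]
```

Calling each version with both addresses pointing at one cell that holds 0 leaves that cell at 0 by value, 2 by reference, and 1 by value-result, since the two formals are aliases only in the by-reference case.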
27 Parameter Passing
- We often put 4/6 parameters in registers
- What happens when we call another function?
- Leaf procedures don't call other procedures
- Non-leaf procedures may have dead variables
- Interprocedural register allocation
- Register windows
[Figure: register windows for A(), B(), and C(), each showing registers r32-r35]
- Outgoing args of A become incoming args to B
28 Return Address
- Return address: after f calls g, we must know where to return
- Old machines: return address always pushed on the stack
- Newer machines: return address is placed in a special register (often called the link register, lr) automatically during the call instruction
- Non-leaf procedures must save lr on the stack
- Sidenote: Itanium has 8 branch registers
29 Stack Maintenance
- Calling sequence
- code executed by the caller before and after a call
- code executed by the callee at the beginning
- code executed by the callee at the end
30 Stack Maintenance
- A typical calling sequence
- Caller assembles arguments and transfers control
- evaluate arguments
- place arguments in stack frame and/or registers
- save caller-saved registers
- save return address
- jump to callee's first instruction
31 Stack Maintenance
- A typical calling sequence
- Callee saves info on entry
- allocate memory for stack frame, update stack pointer
- save callee-saved registers
- save old frame pointer
- update frame pointer
- Callee executes
32 Stack Maintenance
- A typical calling sequence
- Callee restores info on exit and returns control
- place return value in appropriate location
- restore callee-saved registers
- restore frame pointer
- pop the stack frame
- jump to return address
- Caller restores info
- restore caller-saved registers
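The whole calling sequence from slides 30 to 32 can be traced in a small simulation. It is a sketch under several assumptions: the stack is a Python list that grows by appending, the registers fp, t0 (caller-saved), s0 (callee-saved), and rv (return value) are invented for the example, and the caller's saved registers are kept in a Python local that stands in for slots in the caller's own frame.

```python
CALLER_SAVED = ("t0",)
CALLEE_SAVED = ("s0",)

def callee_sum(regs, stack, nargs):
    # --- callee, on entry ---
    stack.append(regs["fp"])                  # save old frame pointer
    regs["fp"] = len(stack) - 1               # update frame pointer
    for r in CALLEE_SAVED:                    # save callee-saved registers
        stack.append(regs[r])
    # --- callee body: deliberately clobbers s0 and t0 ---
    fp = regs["fp"]
    regs["s0"] = sum(stack[fp - 1 - nargs:fp - 1])   # arguments sit below the return address
    regs["t0"] = -1
    # --- callee, on exit ---
    regs["rv"] = regs["s0"]                   # place return value
    for i, r in enumerate(CALLEE_SAVED):      # restore callee-saved registers
        regs[r] = stack[fp + 1 + i]
    regs["fp"] = stack[fp]                    # restore frame pointer
    del stack[fp:]                            # pop the stack frame
    return stack.pop()                        # "jump" to the return address

def call(regs, stack, callee, args, return_address):
    # --- caller, before the call ---
    stack.extend(args)                        # place (already evaluated) arguments
    saved = {r: regs[r] for r in CALLER_SAVED}    # save caller-saved registers
    stack.append(return_address)              # save return address
    returned_to = callee(regs, stack, len(args))  # jump to callee
    # --- caller, after the call ---
    assert returned_to == return_address
    del stack[len(stack) - len(args):]        # discard the arguments
    regs.update(saved)                        # restore caller-saved registers
    return regs["rv"]
```

After the call returns, the stack and every register except rv are back to their pre-call values, even though the callee overwrote both t0 and s0. Each side restored only the registers it was responsible for.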
33 Handling Variable Storage
- Static allocation
- object is allocated an address at compile time
- location is retained during execution
- Stack allocation
- objects are allocated in LIFO order
- Heap allocation
- objects may be allocated and deallocated at any time
34 Review: Normal C Memory Management
- A program's address space contains 4 regions
- stack: local variables, grows downward
- heap: space requested for pointers via malloc(); resizes dynamically, grows upward
- static data: variables declared outside main, does not grow or shrink
- code: loaded when program starts, does not change
[Figure: address space from FFFF FFFF hex at the top to 0 hex at the bottom: stack, heap, static data, code]
35 Intel x86 C Memory Management
- A C program's x86 address space
- heap: space requested for pointers via malloc(); resizes dynamically, grows upward
- static data: variables declared outside main, does not grow or shrink
- code: loaded when program starts, does not change
- stack: local variables, grows downward
[Figure: x86 address space, top to bottom: heap, static data, code, stack]
36 Static Allocation
- Objects that are allocated statically include
- globals
- explicitly declared static variables
- instructions
- string literals
- compiler-generated tables used during run time
37 Stack Allocation
- Follows stack model for procedure activation
- What can we determine at compile time?
- We cannot determine the address of the stack frame
- But we can determine the size of the stack frame and the offsets of various objects within a frame
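What the compiler can compute ahead of time is sketched below: given the declared locals, it assigns each one a fixed offset from the frame pointer and derives the frame size. The type sizes, the 16-byte frame alignment, and the function names are assumptions for this example, not the rules of any particular ABI.

```python
SIZES = {"char": 1, "int": 4, "double": 8}   # assumed sizes in bytes
FRAME_ALIGN = 16                             # assumed frame alignment

def round_up(n, multiple):
    return (n + multiple - 1) // multiple * multiple

def layout_frame(local_decls):
    """Map each (name, type) local to an offset from fp; return (offsets, size).

    The stack grows down, so locals get negative offsets. Each slot is
    aligned to its own size, assuming fp itself is suitably aligned.
    """
    offsets = {}
    size = 0
    for name, type_ in local_decls:
        width = SIZES[type_]
        size = round_up(size + width, width)
        offsets[name] = -size                # address at run time is fp + offset
    return offsets, round_up(size, FRAME_ALIGN)
```

For the locals char c, int i, double d this yields offsets -1, -8, and -16 and a 16-byte frame. The absolute address of any of them is only known at run time, once fp has a value.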
38 Heap Allocation
- Used for dynamically allocated/resized objects
- Managed by special algorithms
- General model
- maintain list of free blocks
- allocate block of appropriate size
- handle fragmentation
- handle garbage collection
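The first three steps of the general model can be sketched with a first-fit allocator that splits blocks on allocation and merges adjacent free blocks on release. First fit is just one possible policy, and the class Heap with its malloc and free_block methods is hypothetical. Garbage collection is not modeled here.

```python
class Heap:
    def __init__(self, size):
        self.free = [(0, size)]   # free list: sorted (start, length) pairs
        self.used = {}            # start -> length of each allocated block

    def malloc(self, n):
        for i, (start, length) in enumerate(self.free):
            if length >= n:                       # first block that is big enough
                if length == n:
                    del self.free[i]
                else:
                    self.free[i] = (start + n, length - n)   # split the block
                self.used[start] = n
                return start
        return None                               # no single block is large enough

    def free_block(self, start):
        length = self.used.pop(start)
        self.free.append((start, length))
        self.free.sort()
        # Coalesce neighbours so small free blocks can serve large requests again.
        merged = [self.free[0]]
        for s, l in self.free[1:]:
            prev_start, prev_len = merged[-1]
            if prev_start + prev_len == s:
                merged[-1] = (prev_start, prev_len + l)
            else:
                merged.append((s, l))
        self.free = merged
```

Fragmentation shows up as soon as blocks are freed out of order: a 100-byte heap with 40 bytes free can still fail a 40-byte request if those bytes are split into a 30-byte and a 10-byte hole.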
39 Summary
- Stack frames
- Keep track of run-time data
- Define the interface between procedures
- Both language and architectural features will affect the stack frame layout and contents
- Started to see hints of the back-end optimizer, register allocator, etc.
- Next week: Intermediate Representations