MIT 6.035 Introduction to Compilation

Provided by: martin49
1
MIT 6.035 Introduction to Compilation
  • Martin Rinard
  • Laboratory for Computer Science
  • Massachusetts Institute of Technology

2
Programming Language Dilemma
  • Stored program computer
  • How to instruct computer what to do?
  • Need a program that computer can execute
  • Must be written in machine language
  • Unproductive to code in machine language
  • Design a higher level language
  • Implement higher level language
  • Alternative 1: Interpreter
  • Alternative 2: Compiler

3
Compilation As Translation
  • Starting Point: Source Program in Some Programming Language
  • Compiler
  • Ending Point: Generated Program in Machine Language

4
Starting Point
  • Standard imperative language (Java, C, C++)
  • State
  • Variables
  • Structures
  • Arrays
  • Computation
  • Expressions (arithmetic, logical, etc.)
  • Assignment statements
  • Control flow (conditionals, loops)
  • Procedures

5
Ending Point State (SPARC)
  • Memory (32-bit addresses, byte addressable)
  • 32 Integer Registers
  • %g0-%g7 global registers
  • %g0 reads as 0, writes have no effect
  • %o0-%o7 output registers
  • %l0-%l7 local registers
  • %i0-%i7 input registers
  • Condition Codes
  • Indicate results of integer operations
  • Used in branch instructions

6
Ending Point Computation (SPARC)
  • ld <addr>, <reg>
  • st <reg>, <addr>
  • <binary op> <src1>, <src2>, <dst>
  • <comp op> <src1>, <src2>
  • <branch op> <address>
  • Conditional
  • Unconditional

7
Exploring Compiler Behavior
  • Start with sample input programs
  • Compile to assembler using cc -S
  • Try to match up source code with generated code
  • State Translation
  • Variables (global, local, parameters)
  • Structures and arrays
  • Computation Translation
  • Expression evaluation and assignment
  • Flow of control constructs
  • Procedure call and return

8
Implementing Global Variables
  • Allocate memory location for variable
  • When program accesses variable, compiler
    generates load and store instructions

int a, b, c;

Memory: one location each for a, b, and c

9
Implementing Global Variables
int a, b, c;
proc() { a = b + c; }

sethi %hi(b),%l0       // load value of b into %l2
or %l0,%lo(b),%l0
ld [%l0+0],%l2
sethi %hi(c),%l0       // load value of c into %l1
or %l0,%lo(c),%l0
ld [%l0+0],%l1
add %l2,%l1,%l1        // add %l2 and %l1
sethi %hi(a),%l0       // store result into a
or %l0,%lo(a),%l0
st %l1,[%l0+0]

.align 8               // allocate storage for a, b, and c
.common a,4,4
.common b,4,4
.common c,4,4

10
Sethi Instruction
  • Encode parts of address in instruction stream
  • Machine code format of sethi instruction: 00 | 5-bit <reg> | 100 | 22-bit <immediate>
  • Effect of sethi instruction
  • Replace top 22 bits of <reg> with <immediate>
  • Set bottom 10 bits of <reg> to zero
  • Example of general theme: store constant values in immediate fields of instructions

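The split of a 32-bit address into a 22-bit and a 10-bit part can be sketched in C. This is only an illustrative model, not compiler output: the function names `hi22`, `lo10`, `sethi`, and `load_address` are hypothetical, chosen to mirror the %hi() and %lo() operators and the sethi/or instruction pair.

```c
#include <assert.h>
#include <stdint.h>

/* Upper 22 bits of a 32-bit value, the part %hi() selects. */
uint32_t hi22(uint32_t value) { return value >> 10; }

/* Lower 10 bits of a 32-bit value, the part %lo() selects. */
uint32_t lo10(uint32_t value) { return value & 0x3FFu; }

/* Model of sethi: put imm22 in the top 22 bits, zero the bottom 10. */
uint32_t sethi(uint32_t imm22) {
    assert(imm22 < (1u << 22));   /* the immediate field holds 22 bits */
    return imm22 << 10;
}

/* Model of the sethi/or pair that builds a full 32-bit address. */
uint32_t load_address(uint32_t addr) {
    uint32_t reg = sethi(hi22(addr));   /* sethi %hi(addr),reg    */
    return reg | lo10(addr);            /* or reg,%lo(addr),reg   */
}
```

Under this model, `load_address(addr)` returns `addr` unchanged for any 32-bit value, which is why the two-instruction sequence is enough to name any global variable.
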
11
Implementing Local Variables
  • Concept of procedure call stack
  • Each procedure invocation has state
  • Local variables
  • Return address
  • Stores state in frame
  • New frame allocated for each call
  • Frames usually allocated on call stack
  • Call stack allocated at top of memory
  • Call stack grows down

12
Implementing Local Variables
  • Frame pointer register points to current frame
  • Decreases at calls (stack grows down)
  • Increases on returns
  • Local variables allocated in frame

proc() { int a, b, c; }

Memory
  Frame for caller of proc
  Frame for invocation of proc: local variables of proc (a, b, c), return addr
Registers
  fp points to the frame for the invocation of proc

13
Implementing Local Variables
proc() { int a, b, c; a = b + c; }
  • ld [%fp-12],%l0
  • ld [%fp-16],%l1
  • add %l0,%l1,%l0
  • st %l0,[%fp-8]

Note: %fp is the same as %i6. It points to the frame for the procedure.

14
Implementing Structures
  • Structures contain several fields
  • Each structure typically stored in a contiguous
    block of memory

Memory
typedef struct { int x, y, z; } foo;
foo *p;

Memory (as drawn, top to bottom): z, y, x, with p pointing to the structure

15
Implementing Structures
typedef struct { int x, y, z; } foo;
proc() { foo *f; f->x = f->y + f->z; }

ld [%fp-8],%l0         // compute address of f->y
add %l0,4,%l0
ld [%l0+0],%l2         // load f->y
ld [%fp-8],%l0         // compute address of f->z
add %l0,8,%l0
ld [%l0+0],%l1         // load f->z
add %l2,%l1,%l1        // add values
ld [%fp-8],%l0         // store into f->x
st %l1,[%l0+0]

16
Optimized Version
typedef struct { int x, y, z; } foo;
proc() { foo *f; f->x = f->y + f->z; }

ld [%fp-4],%o0
ld [%o0+4],%o1
ld [%o0+8],%o2
add %o1,%o2,%o1
st %o1,[%o0]

17
Alignment, Padding, and Packing
  • Machines often have alignment requirements
  • Integers (4 bytes) must start at 4-byte aligned
    address (bottom 2 bits 0)
  • Shorts (2 bytes) must start at 2-byte aligned
    address (bottom bit 0)
  • Alignment requirements raise issues
  • Padding between fields to ensure alignment
  • Field packing to minimize memory usage

18
Padding and Packing Example
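typedef struct { int w; char x; int y; char z; } foo;
foo *p;

Naïve Layout (memory, top to bottom): z, y, x, w, with p pointing to the structure
Packed Layout (4 byte savings) (memory, top to bottom): y, x and z sharing one word, w, with p pointing to the structure
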
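The 4-byte savings can be reproduced with a small layout calculator. This is a sketch under stated assumptions, not the rule any particular compiler uses: it assumes each field is aligned to its own size and the whole structure to its largest field, and the helper names `align_up` and `layout_size` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of align (a power of two). */
size_t align_up(size_t offset, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Total size of a structure whose fields have the given sizes, laid
   out in order. Assumption: a field's alignment equals its size. */
size_t layout_size(const size_t *field_sizes, size_t count) {
    size_t offset = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < count; i++) {
        size_t align = field_sizes[i];
        offset = align_up(offset, align);   /* insert padding if needed */
        offset += field_sizes[i];           /* place the field */
        if (align > max_align) max_align = align;
    }
    return align_up(offset, max_align);     /* trailing padding */
}
```

With field sizes {4, 1, 4, 1} (w, x, y, z in declaration order) the model yields 16 bytes; reordering to {4, 1, 1, 4} so that x and z share a word yields 12 bytes.
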
19
Implementing Arrays
  • Allocate memory locations for array elements
  • Elements stored contiguously

int a[4];

Memory (as drawn, top to bottom): a[3], a[2], a[1], a[0]

20
Implementing Arrays
int a[4];
proc() { int i, j; i = a[j]; }

ld [%fp-12],%l0        // compute address of a[j]
sll %l0,2,%l0
sethi %hi(a),%l1
or %l1,%lo(a),%l1
add %l0,%l1,%l0
ld [%l0+0],%l0         // load a[j] into %l0
st %l0,[%fp-8]         // store %l0 into i

address of a[j] = address of a[0] + (4 * j) = a + (4 * j)

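The address arithmetic can be written out as a short C function. This is a minimal sketch assuming 4-byte integers and 32-bit addresses as on the slides; the name `element_address` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Address of a[j] for an array of 4-byte ints that starts at base.
   The shift by 2 plays the role of the sll instruction: j << 2 == 4 * j. */
uint32_t element_address(uint32_t base, uint32_t j) {
    assert(j < (1u << 30));   /* keep 4 * j within 32 bits */
    return base + (j << 2);
}
```

For example, with a base address of 0x1000, element 3 sits at 0x100C.
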
21
Expression Evaluation
  • Evaluate subexpressions, combine to get value of
    outer expression
  • Must always have values of operands in registers
  • Final result placed in register

22
Implementing Expression Evaluation
int x;
proc() { int a, b; a = 3; b = 2; x = (a+b)-(a|b); }

mov 3,%l0              // initialize a and b
st %l0,[%fp-8]
mov 2,%l0
st %l0,[%fp-12]
ld [%fp-8],%l0         // load a and b
ld [%fp-12],%l1
add %l0,%l1,%l2        // add a and b to %l2
ld [%fp-8],%l0         // load a and b
ld [%fp-12],%l1
or %l0,%l1,%l1         // or a and b to %l1
sub %l2,%l1,%l1        // compute %l1 = %l2 - %l1
sethi %hi(x),%l0       // load address of x
or %l0,%lo(x),%l0
st %l1,[%l0+0]         // store result in x

23
Implementation Issues
  • Generating a linear sequence of instructions to
    compute a nested expression
  • Allocate storage for temporary values
  • Typically registers, but there are a limited
    number of registers for machine
  • May need to store temporaries in memory
  • Expression evaluation order affects number of
    values you need to keep around
  • In many cases, may be able to statically compute
    value of subexpressions

24
Optimized Implementation
int x;
proc() { int a, b; a = 3; b = 2; x = (a+b)-(a|b); }

or %g0,2,%g2
sethi %hi(x),%g1
st %g2,[%g1+%lo(x)]

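The optimized code stores the constant 2 because the whole expression can be computed at compile time. A minimal sketch of that idea follows, using a tiny expression tree; the type `Expr` and the functions `fold` and `fold_example` are hypothetical names for illustration, not the structure of any real compiler.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { K_CONST, K_ADD, K_SUB, K_OR } Kind;

typedef struct Expr {
    Kind kind;
    int value;                  /* used when kind == K_CONST */
    const struct Expr *left;    /* operands for K_ADD, K_SUB, K_OR */
    const struct Expr *right;
} Expr;

/* Evaluate a tree built only from constants and operators. */
int fold(const Expr *e) {
    assert(e != NULL);
    switch (e->kind) {
    case K_CONST: return e->value;
    case K_ADD:   return fold(e->left) + fold(e->right);
    case K_SUB:   return fold(e->left) - fold(e->right);
    case K_OR:    return fold(e->left) | fold(e->right);
    }
    return 0;   /* not reached for valid kinds */
}

/* Build (a+b)-(a|b) with a and b replaced by their known values. */
int fold_example(int a, int b) {
    Expr ca   = { K_CONST, a, NULL, NULL };
    Expr cb   = { K_CONST, b, NULL, NULL };
    Expr sum  = { K_ADD, 0, &ca, &cb };
    Expr bits = { K_OR,  0, &ca, &cb };
    Expr diff = { K_SUB, 0, &sum, &bits };
    return fold(&diff);
}
```

`fold_example(3, 2)` evaluates (3+2)-(3|2) = 5 - 3 = 2, matching the immediate in `or %g0,2,%g2`.
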
25
Flow of Control
  • Convert structured flow of control to branch
    statements
  • Two pervasive shapes

if C then A else B
  Code to evaluate C
  Code to execute A
  Code to execute B
  Code after if statement

while C A
  Code to evaluate C
  Code to execute A
  Code after while statement

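The two shapes can be mimicked in C by replacing the structured statements with explicit tests and goto. This is only a sketch of the idea; the function names and the loop body (summing i) are hypothetical examples, not taken from the slides.

```c
#include <assert.h>

/* if (c) r = 0; else r = 1;  written with explicit branches */
int lowered_if(int c) {
    int r;
    if (c == 0) goto else_part;   /* code to evaluate C; branch if false */
    r = 0;                        /* code to execute A */
    goto after_if;                /* branch over the else part */
else_part:
    r = 1;                        /* code to execute B */
after_if:
    return r;                     /* code after if statement */
}

/* while (i < n) { sum += i; i++; }  written with explicit branches */
int lowered_while(int n) {
    int i = 0;
    int sum = 0;
loop_top:
    if (!(i < n)) goto after_while;   /* code to evaluate C */
    sum += i;                         /* code to execute A */
    i++;
    goto loop_top;                    /* branch back to the test */
after_while:
    assert(i >= n);                   /* the loop condition is now false */
    return sum;                       /* code after while statement */
}
```

Each label corresponds to one box in the shapes above, and each goto corresponds to a conditional or unconditional branch instruction.
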
26
Conditional Example
int a, b;
proc() { if (a) b = 0; else b = 1; }

sethi %hi(a),%g1
ld [%g1+%lo(a)],%g1
cmp %g1,0
be .L1
sethi %hi(b),%g1
br .L2
st %g0,[%g1+%lo(b)]
.L1:
or %g0,1,%g2
st %g2,[%g1+%lo(b)]
.L2:
retl
nop

27
Optimized Conditional Example
int a, b;
proc() { if (a) b = 0; else b = 1; }

sethi %hi(a),%g1
ld [%g1+%lo(a)],%g1
cmp %g1,0
be .L1
sethi %hi(b),%g1
retl
st %g0,[%g1+%lo(b)]
.L1:
or %g0,1,%g2
retl
st %g2,[%g1+%lo(b)]

28
Apparent Anomaly in Code
int a, b;
proc() { if (a) b = 0; else b = 1; }

sethi %hi(a),%g1
ld [%g1+%lo(a)],%g1
cmp %g1,0
be .L1
sethi %hi(b),%g1
br .L2                 // branch over else part
st %g0,[%g1+%lo(b)]    // store value into b for then part
.L1:
or %g0,1,%g2
st %g2,[%g1+%lo(b)]
.L2:
retl
nop

  • Branch appears before store
  • Why will b get correct value?

29
Concept of Branch Delay Slots
  • In SPARC architecture, instruction after branch
    executes even if branch is taken!
  • be .L1
  • sethi %hi(b),%g1
  • Why do this?
  • It improved the performance of the initial
    version of the processor
  • Compiler could handle the complexity
  • What if there is no instruction to execute? nop!

The sethi instruction in the delay slot executes even if the branch is taken!

30
Instruction Scheduling
  • Branch delay slots are special case of
    instruction scheduling
  • Instruction scheduling packs instructions
    together for concurrent/pipelined execution
  • Sophisticated part of compilation
  • Moves work from hardware to compiler
  • Illustrates rarity of direct assembly coding
  • Required for IA-64 to work well

31
Implementation of Loops
  • Initialize i to 0 and n to 10
  • Load i and n
  • Branch to end if i >= n
  • Compute address of a[i]
  • Load a[i]
  • Increment value
  • Store back into a[i]
  • Increment i
  • Branch back to top if i < n

int a[10];
proc() { int n = 10; int i; i = 0; while (i < n) { a[i]++; i++; } }

32
Implementation
int a[10];
proc() { int n = 10; int i; i = 0; while (i < n) { a[i]++; i++; } }

mov 10,%l0
st %l0,[%fp-8]
mov 0,%l0
st %l0,[%fp-12]
ld [%fp-12],%l1
ld [%fp-8],%l0
cmp %l1,%l0
bge .L18
nop
.L16:
ld [%fp-12],%l0
sll %l0,2,%l0
sethi %hi(a),%l1
or %l1,%lo(a),%l1
add %l0,%l1,%l0
st %l0,[%fp-16]
ld [%fp-16],%l0
ld [%l0+0],%l0
add %l0,1,%l1
ld [%fp-16],%l0
st %l1,[%l0+0]
ld [%fp-12],%l0
add %l0,1,%l0
st %l0,[%fp-12]
ld [%fp-12],%l1
ld [%fp-8],%l0
cmp %l1,%l0
bl .L16
nop
.L18:

33
Optimizations
  • Keep i and address of a[i] in registers
  • Compute address of a[0] before loop body,
    increment by 4 in loop body
  • Don't store n in memory or register, just use 10
    whenever you see it
  • Omit initial branch at top of loop
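The listed optimizations can be applied by hand at the source level to see their effect. The sketch below assumes the loop body increments each element, as the preceding slides describe; the function names and the constant `ARRAY_LEN` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { ARRAY_LEN = 10 };

/* Straightforward version: index i, bound n, test at top of loop. */
void increment_all_naive(int *a) {
    assert(a != NULL);
    int n = ARRAY_LEN;
    int i = 0;
    while (i < n) {
        a[i]++;
        i++;
    }
}

/* Hand-optimized version mirroring the listed transformations. */
void increment_all_optimized(int *a) {
    assert(a != NULL);
    int *p = &a[0];            /* address of a[0] computed before the loop */
    int i = 0;                 /* i stays in a local, never reloaded */
    do {                       /* initial test omitted: 0 < 10 is known */
        *p += 1;               /* load, increment, store through pointer */
        p++;                   /* pointer advances by sizeof(int) bytes */
        i++;
    } while (i < ARRAY_LEN);   /* constant bound used directly */
}
```

Both functions leave the array in the same state; the second simply does less work per iteration, which is what the optimizing compiler arranges automatically.
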

34
Optimized Implementation
int a[10];
proc() { int n = 10; int i; i = 0; while (i < n) { a[i]++; i++; } }

sethi %hi(a),%g1       // load base address of a
add %g1,%lo(a),%g1
or %g0,0,%g2           // init i
ld [%g1],%g3           // load a[i]
.L900000106:
add %g3,1,%g3          // increment and store a[i]
st %g3,[%g1]
add %g2,1,%g2          // update i and ptr to a[i]
add %g1,4,%g1
cmp %g2,10             // loop back
bl,a .L900000106
ld [%g1],%g3           // load a[i]
.L77000006:

35
More Optimizations
int a[10];
proc() { int n = 10; int i; i = 0; while (i < n) { a[i]++; i++; } }

sethi %hi(a),%g1       // load base address of a
add %g1,%lo(a),%g1
or %g0,0,%g2           // init i
ld [%g1],%g3           // load a[i]
.L900000106:
add %g3,1,%g3          // increment and store a[i]
st %g3,[%g1]
add %g2,1,%g2          // update i and ptr to a[i]
add %g1,4,%g1
cmp %g1,a+40           // loop back (pointer compared against a+40 in place of cmp %g2,10)
bl,a .L900000106
ld [%g1],%g3           // load a[i]
.L77000006:

36
Procedure Call
  • Protocol between caller and callee
  • Heavily architecture dependent
  • SPARC concepts
  • Caller actions
  • Store parameters in %o0-%o5
  • Jump to callee, storing PC in %o7
  • Get return result in %o0
  • Callee actions
  • Get parameters in %i0-%i5, PC from caller in %i7
  • Put return result in %i0
  • Use PC from caller to return back to caller

37
Register Windows
  • Parameter issue
  • Caller puts parameters in %o0-%o5
  • Callee expects parameters in %i0-%i5
  • Return result issue
  • Callee puts return result in %i0
  • Caller expects result in %o0
  • Why? Register windows!
  • Conceptually, have an overlapping stack of
    register windows

38
Visual Register Windows
%g0-%g7 (global registers)
Prev: %i0-%i7, %l0-%l7, %o0-%o7
Current: %i0-%i7, %l0-%l7, %o0-%o7
Next: %i0-%i7, %l0-%l7, %o0-%o7
  • Have current window
  • %i0-%i7 of current window are same as %o0-%o7 of
    previous window
  • %o0-%o7 of current window are same as %i0-%i7 of next
    window

39
Register Window Instructions
  • save %sp, <num>, %sp
  • Pushes current window on stack
  • Allocates new window (%o0-%o7 become %i0-%i7, new
    %l0-%l7 and %o0-%o7)
  • Sets %sp (%o6) in new window to %sp in old window
    plus <num>
  • Note that %o6 in old window becomes %i6 in new
    window (%sp becomes %fp)
  • restore
  • Pops current window
  • (%i0-%i7 become %o0-%o7)
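The overlap between a caller's output registers and a callee's input registers can be modeled with one flat array and a current-window index. This is a deliberately simplified sketch: it ignores the stack pointer update, window overflow and spilling, and the direction in which real hardware moves its window pointer. The type `RegFile` and all function names are hypothetical.

```c
#include <assert.h>

enum { WINDOWS = 8, REGS_PER_WINDOW = 16 };

typedef struct {
    int regs[(WINDOWS + 1) * REGS_PER_WINDOW];
    int cwp;   /* index of the current window in this model */
} RegFile;

/* Input register k of the current window. */
int *in_reg(RegFile *rf, int k) {
    assert(k >= 0 && k < 8);
    return &rf->regs[rf->cwp * REGS_PER_WINDOW + k];
}

/* Local register k of the current window. */
int *local_reg(RegFile *rf, int k) {
    assert(k >= 0 && k < 8);
    return &rf->regs[rf->cwp * REGS_PER_WINDOW + 8 + k];
}

/* Output register k of the current window: the same storage as
   input register k of the next window. */
int *out_reg(RegFile *rf, int k) {
    assert(k >= 0 && k < 8);
    return &rf->regs[(rf->cwp + 1) * REGS_PER_WINDOW + k];
}

/* Model of save: the old outputs become the new inputs. */
void save_window(RegFile *rf) {
    assert(rf->cwp + 1 < WINDOWS);
    rf->cwp++;
}

/* Model of restore: the inputs become the caller's outputs again. */
void restore_window(RegFile *rf) {
    assert(rf->cwp > 0);
    rf->cwp--;
}

/* Callee: parameter arrives in input 0, result leaves in input 0. */
void callee_add_one(RegFile *rf) {
    save_window(rf);
    *local_reg(rf, 0) = *in_reg(rf, 0) + 1;
    *in_reg(rf, 0) = *local_reg(rf, 0);
    restore_window(rf);
}

/* Caller: parameter goes into output 0, result is read from output 0. */
int call_add_one(int n) {
    RegFile rf = { {0}, 0 };
    *out_reg(&rf, 0) = n;
    callee_add_one(&rf);
    return *out_reg(&rf, 0);
}
```

In this model no value is ever copied between caller and callee: the callee's write to its input register 0 is directly visible as the caller's output register 0 after the window is popped.
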

40
Stack and Frame Pointers
save %sp, -12, %sp (In practice, need at least 12+64 bytes to leave space for reg saves)

Memory
Old Reg Win: fp = %i6, sp = %o6
New Reg Win: fp = %i6 (the old sp), sp = %o6

41
Procedure Call Example
int n;
bar() { foo(n); }

sethi %hi(n),%l0       // load n
or %l0,%lo(n),%l0
ld [%l0+0],%l0
mov %l0,%o0            // set up parameter
call foo               // call foo (stores PC of call instruction into %o7)
nop

42
Procedure Example
int foo(int n) { return n+1; }

save %sp,-104,%sp      // standard prologue: new reg win, frame
st %i0,[%fp+68]        // store param
ld [%fp+68],%l0        // compute result
add %l0,1,%l0
st %l0,[%fp-4]
ba .L13
nop
.L13:
ld [%fp-4],%l0         // load result
mov %l0,%i0
jmp %i7+8              // standard epilogue: return to caller
restore                // restore reg win, frame

43
Optimized Leaf Procedure
  • Punt register windows completely
  • Just compute in window from caller

int foo(int n) { return n+1; }

jmp %o7+8              // return to caller
add %o0,1,%o0          // compute result

44
Complex Design
  • Need for separate compilation
  • Must be standard call/return protocol
  • Need for performance
  • Parameters/return value passed in registers
  • Supports efficient caller/callee linkage
  • Protocol supports tailored code generation
  • Caller does not set up register window, frame
    pointer, or stack pointer for callee
  • Enables leaf procedure optimizations
  • Compiler hides all this from programmer!

45
Broader View
  • Compilation is a specific instance of language
    processing and translation
  • Technical world is littered with small languages
  • Scripting languages
  • Configuration languages
  • Domain-specific languages
  • Language processing crucial skill that you can
    apply in many areas to improve productivity
  • Key aspects
  • Developer representation (text)
  • Internal representation (data structures)
  • Parsing, analysis, transformation, code
    generation
  • Studying compilers gives you skills you need to
    do language processing

46
Summary
  • Compiler responsibilities
  • Data layout and access
  • Global and local variables, parameters
  • Structures, arrays, and objects
  • Expression evaluation
  • Flow of control
  • Procedure and method calls
  • Hide low-level machine complexities
  • Optimizations