Title: Lecture 8: Instruction Set Architecture
1. Lecture 8: Instruction Set Architecture
Pipelining
- Computer Engineering 585
- Fall 2001
2. Compiler Structure
- Front end
- High-level optimizations
- Global optimizations
- Code generator
3. Some Optimizations
- Common subexpression elimination: A[I+1] = B[I+1] becomes J = I+1; A[J] = B[J].
- Common subexpression elimination: Y = 500 + Z/X; ...; YY = Z/X becomes TMP = Z/X; Y = 500 + TMP; YY = TMP.
- Constant propagation: A = 5; B = A + X becomes B = 5 + X.
- Copy propagation: A = X; ...; Y = A / 2 becomes Y = X / 2.
- Code motion: any loop-invariant code can be moved out of the loop, or loop code can be rearranged so that the loop length decreases.
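The three expression-level rewrites above can be checked with a small before/after sketch. The function names are hypothetical, and the `+` operators are an assumed reading of the slide examples; only the shape of each transformation matters here.

```python
# Hypothetical before/after pairs; names are illustrative only.

def cse_before(z, x):
    y = 500 + z / x      # z / x is computed here ...
    yy = z / x           # ... and recomputed here
    return y, yy

def cse_after(z, x):
    tmp = z / x          # common subexpression computed once
    y = 500 + tmp
    yy = tmp
    return y, yy

def const_prop_before(x):
    a = 5
    b = a + x
    return b

def const_prop_after(x):
    b = 5 + x            # the known constant 5 replaces a
    return b

def copy_prop_before(x):
    a = x
    y = a / 2
    return y

def copy_prop_after(x):
    y = x / 2            # the copy a = x is bypassed
    return y
```

Each pair returns the same result for the same inputs, which is the property an optimizer has to preserve.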
4. Compiler Optimizations
- Code motion example: for (I = 1; I < 5; I++) { X = 5; Y = Y + 5*I; } becomes X = 5; for (I = 1; I < 5; I++) Y = Y + 5*I;
- Strength reduction, high level: in PL/I, L = LENGTH(S1 || S2) can be replaced by L = LENGTH(S1) + LENGTH(S2).
- Strength reduction, machine level: replace multiplication or division by a power of 2 with a shift.
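A minimal sketch of code motion and strength reduction, written in Python rather than the slide's notation. The function names are hypothetical, and the loop body `y = y + 5 * i` is an assumed reading of the slide example.

```python
def loop_before():
    y = 0
    for i in range(1, 5):
        x = 5            # loop-invariant: same value on every iteration
        y = y + 5 * i
    return x, y

def loop_after():
    x = 5                # hoisted out of the loop
    y = 0
    for i in range(1, 5):
        y = y + 5 * i
    return x, y

def length_before(s1, s2):
    return len(s1 + s2)          # builds the concatenated string first

def length_after(s1, s2):
    return len(s1) + len(s2)     # no concatenation needed

def times_eight(n):
    return n << 3        # same as n * 2**3

def div_by_four(n):
    return n >> 2        # same as n // 2**2 (floor division)
```

In Python the right shift floors toward negative infinity, so it matches `//` for negative integers as well. On a real machine the compiler has to account for how signed division rounds before substituting a shift.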
5. Memory Allocation by Compilers
Memory layout, from the top of the address space down:
- Stack: local scalars
- Heap data: dynamic data, lists in LISP
- Global static data: arrays, constants
- User program
- OS (low address: 0)
6. Register Allocation
- Most effective for stack scalars.
- Aliased variables, whether stack or global, are hard to allocate to registers. Example: p = &A; *p = 50; p = p + X; *p = 30; A[32] = 10.
- Call by reference (or var parameters) also creates aliasing: the actual parameter is an alias for the formal parameter.
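Why aliasing makes register allocation hard can be seen in a toy model where memory is a Python list and a "register" is a local variable. Everything here (the memory size, the address of A, the function name `run`) is a hypothetical setup for illustration.

```python
def run(x, keep_a_in_register):
    mem = [0] * 8        # tiny hypothetical memory
    addr_a = 2           # assumed address of A
    mem[addr_a] = 7
    reg_a = mem[addr_a]  # value of A cached in a "register"
    p = addr_a           # p = &A
    mem[p] = 50          # *p = 50 changes A behind the register's back
    p = p + x            # p = p + X
    mem[p] = 30          # *p = 30 hits A again only when X == 0
    # Reading A from the register gives the stale value 7;
    # reading it from memory gives whatever the stores left there.
    return reg_a if keep_a_in_register else mem[addr_a]
```

With `x = 0` the memory copy of A ends up as 30, with `x = 1` it ends up as 50, and the register copy stays at 7 in both cases. Since the compiler generally cannot know X, it has to keep A in memory or reload it after each store through p.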
7. Register Allocation
- How many registers suffice? SPEC92 profiling suggests 16 to 32 registers.
- The RTL-level code assumes infinitely many virtual registers.
- Register allocation maps virtual registers to physical registers.
8. Register Allocation
X = 5; Z = 2*Y; Z = X + Z; X = 3*X
- Figure: live range of X, live range of Y.
- Live range: the span from a definition of a variable to its last use.
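The definition of a live range can be turned into a short sketch for straight-line code. The representation below (a list of `(defined_var, used_vars)` pairs) and the function name are assumptions for illustration; the example program reads the slide's four statements as defining X, Z, Z, X and using Y, then X and Z, then X. Only the definitions and uses matter for liveness, not the operators.

```python
def live_ranges(code):
    """Return var -> list of [def_index, last_use_index] for straight-line code."""
    ranges = {}
    for i, (dest, uses) in enumerate(code):
        for v in uses:
            # A use extends the most recent live range of v.
            # Variables never defined here (live on entry) are not tracked.
            if v in ranges:
                ranges[v][-1][1] = i
        # Each definition starts a new live range for dest.
        ranges.setdefault(dest, []).append([i, i])
    return ranges

program = [
    ("X", []),          # 0: X defined from a constant
    ("Z", ["Y"]),       # 1: Z defined from Y
    ("Z", ["X", "Z"]),  # 2: Z redefined from X and Z
    ("X", ["X"]),       # 3: X redefined from X
]

print(live_ranges(program))
# {'X': [[0, 3], [3, 3]], 'Z': [[1, 2], [2, 2]]}
```

Uses are processed before the definition in each statement, so a statement that both reads and writes a variable ends the old range and starts a new one at the same index.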
9. Register Allocation
- Register assignment: R1 holds X1, X5, X6; R2 holds X2, X3, X4.
- Interference graph (figure): nodes X1, X2, X3, X4, X5, X6.
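Register assignment from an interference graph can be sketched as greedy graph coloring. The edges of the slide's graph did not survive, so the edge list below is purely hypothetical, chosen only to be consistent with the assignment R1: X1, X5, X6 and R2: X2, X3, X4. Greedy coloring is also a swapped-in simple technique, not necessarily the allocator the lecture had in mind.

```python
NODES = ["X1", "X2", "X3", "X4", "X5", "X6"]
# Hypothetical interference edges (two live ranges that overlap).
EDGES = [("X1", "X2"), ("X1", "X3"), ("X1", "X4"),
         ("X2", "X5"), ("X4", "X5"), ("X3", "X6")]

def build_graph(nodes, edges):
    graph = {n: set() for n in nodes}
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def is_valid(graph, assignment):
    # No two interfering live ranges may share a register.
    return all(assignment[a] != assignment[b]
               for a in graph for b in graph[a])

def greedy_color(graph, registers):
    assignment = {}
    for node in sorted(graph):
        taken = {assignment[n] for n in graph[node] if n in assignment}
        free = [r for r in registers if r not in taken]
        if not free:
            raise ValueError(f"{node} must be spilled")
        assignment[node] = free[0]
    return assignment

graph = build_graph(NODES, EDGES)
assignment = greedy_color(graph, ["R1", "R2"])
```

With these assumed edges the greedy pass puts X1, X5, X6 in R1 and X2, X3, X4 in R2. A different visiting order can force a spill even when a two-register coloring exists, which is why production allocators use more careful ordering heuristics.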
10. DLX ISA
- DLX (DELUXE): a load/store architecture, similar to a typical RISC processor such as MIPS.
- Simple load/store instruction set.
- Designed for pipelining efficiency.
- Easily decoded instructions.
- Efficiency as a compiler target.
11. DLX Architecture
- Thirty-two 32-bit general-purpose registers (GPRs); R0 is always 0.
- FP registers: 32 single-precision registers F0, F1, ..., F31, or 16 double-precision registers formed from the pairs F0-F1, F2-F3, ..., F30-F31.
- FP status register: modified by FP operations, like condition codes (CCs); can exchange data with the integer GPRs.
- Byte addressable, big-endian. Loads and stores are the only instructions that access memory.
- Data types: byte, half-word, and word integer data; single-precision and double-precision FP data.
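Byte-addressable, big-endian memory means the most significant byte of a word sits at the lowest address. A minimal sketch using a `bytearray` as memory; the helper names are hypothetical and the loads are unsigned for simplicity.

```python
def store_word(mem, addr, value):
    # Big-endian: most significant byte goes to the lowest address.
    mem[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")

def load_byte(mem, addr):
    return mem[addr]

def load_half(mem, addr):
    return int.from_bytes(mem[addr:addr + 2], "big")

def load_word(mem, addr):
    return int.from_bytes(mem[addr:addr + 4], "big")

mem = bytearray(8)
store_word(mem, 0, 0x12345678)
print(hex(load_byte(mem, 0)))   # 0x12
print(hex(load_half(mem, 2)))   # 0x5678
```

On a little-endian machine the byte at address 0 would be 0x78 instead.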
12. DLX Architecture, Contd.
- Addressing modes for data operands: register, 16-bit displacement, 16-bit immediate.
- Branch address operands are PC-relative: unconditional jump, 26-bit offset; conditional jump, 16-bit offset.
- Absolute addressing: displacement with R0 as the base register.
- Larger absolute addresses (32 bits) can be built with LHI.
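The arithmetic behind these limits can be sketched as follows. The helper names are hypothetical. Two assumptions are made: offsets are signed two's-complement values, and the low half of an address is merged in with a zero-extended OR after LHI has set the upper half.

```python
def signed_range(bits):
    # Range of a two's-complement field of the given width.
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def split_address(addr):
    # Upper and lower 16-bit halves of a 32-bit address.
    return (addr >> 16) & 0xFFFF, addr & 0xFFFF

def lhi(imm16):
    # LHI: immediate goes to the upper half, lower half is 0.
    return (imm16 & 0xFFFF) << 16

def build_address(addr):
    upper, lower = split_address(addr)
    reg = lhi(upper)
    return reg | lower   # assumed: zero-extended OR of the low half

print(signed_range(16))               # (-32768, 32767)
print(signed_range(26))               # (-33554432, 33554431)
print(hex(build_address(0x12348ABC))) # 0x12348abc
```

If the low half were added with a sign-extending add instead, the upper half would need adjusting whenever bit 15 of the low half is set.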
13. DLX Instruction Formats
- I-type: ALU with immediate operand, Load/Store, Branch
- R-type: ALU instructions with register operands
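A sketch of how the two formats pack into a 32-bit word. The field layout is an assumption based on the commonly documented DLX encoding (6-bit opcode, 5-bit register fields, 16-bit immediate for the immediate format, 11-bit function field for the register format); the opcode and function values used below are hypothetical.

```python
def encode_i(opcode, rs1, rd, imm):
    # opcode(6) | rs1(5) | rd(5) | immediate(16)
    return (opcode << 26) | (rs1 << 21) | (rd << 16) | (imm & 0xFFFF)

def encode_r(opcode, rs1, rs2, rd, func):
    # opcode(6) | rs1(5) | rs2(5) | rd(5) | func(11)
    return (opcode << 26) | (rs1 << 21) | (rs2 << 16) | (rd << 11) | (func & 0x7FF)

def decode_i(word):
    imm = word & 0xFFFF
    if imm & 0x8000:         # sign-extend the 16-bit immediate
        imm -= 0x10000
    return (word >> 26) & 0x3F, (word >> 21) & 0x1F, (word >> 16) & 0x1F, imm

word = encode_i(8, 0, 3, -4)   # hypothetical opcode 8
print(decode_i(word))           # (8, 0, 3, -4)
```

Fixed field positions are what make the instructions easy to decode: the register numbers can be read out before the opcode has been fully interpreted.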
14. DLX Operations
- ALU: ADD, SUB, MULT, DIV, AND, OR, XOR, and shifts. FP ADD, SUB, MULT, and DIV. Conversion instructions between integer, single-precision FP, and double-precision FP. Compare instructions set a register to 0 or 1 on LT, GT, LE, GE, EQ, NE.
- Control: jumps (unconditional), including an address-linking kind for procedure call/return. Branches test only Equal to Zero or Not Equal to Zero. One branch tests the comparison bit in the FP status register.
- Data move: LOAD/STORE for byte, half-word, and word-sized data. LOAD/STORE for SP FP (F) and DP FP (D) data. LHI loads the upper half of a register, setting the lower half to 0. LI Rd, 10 is shorthand for ADDI Rd, R0, 10. MOV Rd, Rs is shorthand for ADD Rd, R0, Rs.
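A few of these operations can be exercised in a toy interpreter. This is a sketch under simplifying assumptions, not a DLX simulator: instructions are Python tuples, registers hold unbounded Python integers, and branch targets are absolute indices into the program list rather than PC-relative offsets. The helpers `li` and `mov` are hypothetical names for the two shorthands.

```python
def li(rd, imm):
    return ("ADDI", rd, 0, imm)      # LI Rd, imm  ->  ADDI Rd, R0, imm

def mov(rd, rs):
    return ("ADD", rd, 0, rs)        # MOV Rd, Rs  ->  ADD Rd, R0, Rs

def execute(program):
    regs = [0] * 32
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "BNEZ":             # branch if register is not zero
            rs, target = args
            if regs[rs] != 0:
                pc = target
            continue
        if op == "ADD":
            rd, rs1, rs2 = args
            result = regs[rs1] + regs[rs2]
        elif op == "ADDI":
            rd, rs1, imm = args
            result = regs[rs1] + imm
        elif op == "LHI":
            rd, imm = args
            result = (imm & 0xFFFF) << 16   # lower half set to 0
        elif op == "SLT":
            rd, rs1, rs2 = args
            result = 1 if regs[rs1] < regs[rs2] else 0
        else:
            raise ValueError(f"unknown op {op}")
        if rd != 0:                  # R0 is always 0: writes are dropped
            regs[rd] = result
    return regs

program = [
    li(1, 10),            # 0: R1 = 10
    mov(2, 1),            # 1: R2 = R1
    ("LHI", 3, 0x1234),   # 2: R3 = 0x12340000
    ("SLT", 4, 0, 1),     # 3: R4 = 1 because 0 < 10
    ("BNEZ", 4, 6),       # 4: taken, skips instruction 5
    ("ADDI", 2, 0, 99),   # 5: skipped
    ("ADDI", 0, 0, 7),    # 6: write to R0 is ignored
]
regs = execute(program)
```

After the run, R1 and R2 both hold 10, R3 holds 0x12340000, R4 holds 1, and R0 is still 0.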
15. Load/Store Instructions
16. ALU Instructions
17. Control Instructions