Title: CS 2200 Lecture 03a Instruction Set Architectures ISAs
1. CS 2200 Lecture 03a: Instruction Set Architectures (ISAs)
- (Lectures based on the work of Jay Brockman,
Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
Ken MacKenzie, Richard Murphy, and Michael
Niemier)
2. Types of instructions and how they're used
3. Instruction Classes
- We've talked a lot about instructions and instruction counts, but what are they really?
- Let's look at some basic instruction classes:
  - Arithmetic/Logical: AND, OR, add, sub
  - Data Transfer: loads/stores, move instructions (on machines with addressable memory)
  - Control: branch, jump, procedure call and return, traps
  - System: OS calls, virtual memory management
  - Floating Point: floating-point ops (multiply and divide)
  - Decimal: decimal add, multiply, decimal-to-character conversions
  - String: string move, compare, search
  - Graphics: pixel ops, compression/decompression
4. What instructions are really used?
- Usually the most commonly executed instructions are the simple ones
- For example, consider this Intel 80x86 mix
These 10 instructions make up 96% of all those executed!
One could consider this to be more than 1 instruction
5. How does code translate to instructions?
- g = h + A[8]
- load s0, 8(s3)
- actually:
- lw s0, 8(s3)        # lw dest, offset(base)
- add s1, s2, s0      # add dst, src1, src2
- Notice: registers contain data or addresses!!
6. Perception Check
- What exactly are we doing?
  - HLL (C) -> Assembler
  - Assembler -> Machine Instructions
- Goals
  - See how HLLs translate to Assembler
  - Understand hardware designs, constraints, goals, etc.
7. Slightly more complex
- A[12] = h + A[8]
- Compile it?
- lw s0, 8(s3)
- add s0, s2, s0
- sw s0, 12(s3)
Can reuse s0, though
(It's a good thing to save registers. In HLL code you may not, but the compiler will translate it for you.)
8. Historical Note
- Early machines often had a register called an index register
  - load addr(index)
  - Address space was small
- Today
  - lw s1, offset(base)
  - Base addresses are much larger... they need to be in a register
  - Often offsets are small, but if not...
9. Variable Array Index?
- g = h + A[i]
- add a0, s2, s3      # a0 = i + addr(A)
- lw a1, 0(a0)        # a1 <- A[i]
- add s0, s1, a1      # g <- h + A[i]
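The three instructions above can be mirrored in C as a minimal sketch. The function name `load_indexed` is hypothetical and only illustrates the address arithmetic. Note that C pointer arithmetic scales the index by the element size, so on a byte-addressed machine the index would first be multiplied by the word size, which the slide's code does not need to show.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the three instructions above:
 * form the address, load the element, then add. */
int load_indexed(const int *A, size_t i, int h)
{
    const int *addr = A + i;   /* add a0, s2, s3 : i + addr(A)   */
    int elem = *addr;          /* lw  a1, 0(a0)  : a1 <- A[i]    */
    return h + elem;           /* add s0, s1, a1 : g <- h + A[i] */
}
```

For example, with `A = {10, 20, 30, 40}`, `load_indexed(A, 2, 5)` computes `5 + A[2]`.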
10. Flashback: How many registers?
- Early machines had about 1 (Accumulator)
- PDP-11 series had 8
- Typical 32-bit processors today have 32.
- Why not more?
- What happens when there are more variables than registers?
  - Spillage: putting less commonly used variables back into memory.
11. Factoid
- accumulator: "Archaic term for a register. On-line use of it as a synonym for 'register' is a fairly reliable indication that the user has been around for a while."
- Eric Raymond, The New Hacker's Dictionary, 1991
12. Note the speed
- Register ops access two regs and perform an operation
- Memory ops access one location and don't do anything to it!
- What do you think about the speed of memory versus registers?
13. MIPS Registers
Name    R#      Usage
zero    0       The constant value 0
at      1       Reserved for assembler
v0-v1   2-3     Values for results and expression evaluation
a0-a3   4-7     Arguments
t0-t7   8-15    Temporaries
s0-s7   16-23   Saved
t8-t9   24-25   More temporaries
k0-k1   26-27   Reserved for use by operating system
gp      28      Global pointer
sp      29      Stack pointer
fp      30      Frame pointer
ra      31      Return address
14. LC2200 Registers
Name    R#      Usage
zero    0       The constant value 0
at      1       Reserved for assembler
v0      2       Return value
a0-a4   3-7     Argument or temporary
s0-s3   8-11    Saved general purpose registers
k0      12      Reserved for OS/traps
sp      13      Stack pointer
fp      14      Frame pointer
ra      15      Return address
15. Decisions, decisions...
- "What distinguishes a computer from a simple calculator is its ability to make decisions." - Some smart guy
- The effects of branch instructions are a big source of research (read: headaches) in computer architecture
- In this class:
  - Jump will refer to unconditional changes in control
  - Branch will refer to conditional changes in control
- And there are 4 main types of control-flow change:
  - Conditional branches (most frequent)
  - Jumps
  - Procedure calls
  - Procedure returns
A function call is just like in an HLL, but we have to get to it, get back from it, and preserve state
16. ...which break down like this
Percentages (%) for a benchmark suite on a load/store machine
Side note, if you didn't know: the same expression on x and y results in different types of instructions for int x, int y than for double x, double y. Why?
17. Conditional branch instructions
18. Where jumps and branches go
- Always to some specified destination address
- Most often, the address is specified in the instruction
  - Procedure return is an exception, however
  - (Return target is not known at compile time)
- Usually, the address is specified relative to the PC
  - PC: the Program Counter indexes executing instructions
  - Control instructions specified this way are PC-relative
  - Good because the target is often near the current instruction (indexed by the PC) and can be specified with fewer bits
  - Allows for position independence: the program can run independently of where it's loaded
19. Target Unknown
- So what about these unknown addresses?
  - Target NOT known at compile time
- Can't use PC-relative: must specify the target dynamically so we can change it at runtime
- Some options:
  - Put the target address in a register
  - Let the jump permit any addressing mode to supply the target address
- Register indirect jumps are useful for the following constructs:
  - Case/Switch: select among one of several alternatives
  - Dynamically shared libraries: library loaded when invoked
  - No compile-time target: load from memory with register indirect jump
20. Some basic branch facts
- Branches usually use PC-relative addressing
- But how far is the target from the instruction?
- The answer to this question will tell us:
  - Which branch offsets to support
  - How long an instruction is/how it should be encoded
  - Note the gory interdependencies!!!
- Most branches are in the forward direction and only 4-7 instructions away
  - Short displacement should suffice
  - Gives us increased code density w/shorter instructions
- What would you want to design a datapath to handle?
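Whether a short displacement "suffices" comes down to whether it fits in the instruction's offset field. The sketch below checks that for a signed two's-complement field. The function name `fits_in_field` and the field widths in the usage note are illustrative assumptions, not the encoding of any particular ISA.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a signed displacement fits in a two's-complement field
 * that is `bits` wide. Assumes 1 <= bits <= 31. */
bool fits_in_field(int32_t disp, unsigned bits)
{
    int32_t lo = -(INT32_C(1) << (bits - 1));     /* most negative value */
    int32_t hi = (INT32_C(1) << (bits - 1)) - 1;  /* most positive value */
    return disp >= lo && disp <= hi;
}
```

A branch 7 instructions forward fits in a hypothetical 4-bit field (range -8 to 7), while one 8 instructions forward would need a wider field.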
21. How do we know where to go?
Often the branch is a simple inequality test or a compare with 0. Architectures make this a special case and make it fast! (because you want to spend time computing, not comparing)
22. Let's look at some branch examples and how they map to statements from a high-level language (using LC2200 register conventions)
23. LC2200 Registers
Recall
24. if statement
- if (i == j) goto L1
- f = g + h
- L1: f = f - i
- beq s3, a4, L1        # if i == j goto L1
- add s0, s1, s2        # f = g + h
- L1: sub s0, s0, s3    # f = f - i
If i != j, we want to add g + h and store it in f before we calculate f = f - i
How is this compare actually done?
25. Recall: example MIPS machine
26. And... did we just use a goto???
27. if statement
- if (i != j)
-     f = g + h
- f = f - i
- beq s3, a4, L1
- add s0, s1, s2
- L1: sub s0, s0, s3
28. if-then-else
- if (i == j)
-     f = g + h
- else
-     f = g
- beq s3, a4, Then         # if i == j we want to add
- add s0, s1, zero         # f = g (the else part)
- beq zero, zero, Exit
- Then: add s0, s1, s2     # f = g + h
- Exit:
Note: The LC2200 has no BNE
reg   contents
s0    f
s1    g
s2    h
s3    i
a4    j
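To see that the branch-and-label form computes the same thing as the structured if-then-else, here is a minimal C sketch with both versions side by side. It assumes the condition is `i == j` and the else part is `f = g`, as the assembly suggests. The function names are hypothetical.

```c
/* Structured version, as it would be written in the HLL. */
int select_structured(int g, int h, int i, int j)
{
    int f;
    if (i == j)
        f = g + h;
    else
        f = g;
    return f;
}

/* Same logic in branch-and-label form, one statement per instruction. */
int select_lowered(int g, int h, int i, int j)
{
    int f;
    if (i == j) goto Then;   /* beq s3, a4, Then     */
    f = g;                   /* add s0, s1, zero     */
    goto Exit;               /* beq zero, zero, Exit */
Then:
    f = g + h;               /* add s0, s1, s2       */
Exit:
    return f;
}
```

The unconditional `goto Exit` plays the role of `beq zero, zero, Exit`: comparing the zero register with itself always succeeds, which is how an unconditional jump is written without a dedicated instruction.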
29. Loop with Variable Array Index
- Loop: g = g + A[i]
- i = i + j
- if (i != h) goto Loop
reg   contents
s0    g
s1    h
s2    i
s3    j
a0    addr of A
Plus some temps
Loop: add a1, s2, a0
      lw  a1, 0(a1)
      add s0, s0, a1
      add s2, s2, s3
      beq s2, s1, Exit
      beq zero, zero, Loop
Exit:
30. While Loop
- while (sav[i] == k)
-     i = i + j
- Loop: add a1, s1, s0
- lw a2, 0(a1)
- beq a2, s3, Skip
- beq zero, zero, Exit
- Skip: add s1, s1, s2
- beq zero, zero, Loop
- Exit:
reg   contents
s0    addr(sav)
s1    i
s2    j
s3    k
a1    temp
a2    temp
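The while loop's control flow can be traced in C by writing it once in structured form and once with the same labels as the assembly. This is only a sketch: the function names are hypothetical, and the condition is assumed to be `sav[i] == k`, matching the `beq a2, s3, Skip` test.

```c
/* Structured version: advance i by j while sav[i] equals k. */
int scan_structured(const int *sav, int i, int j, int k)
{
    while (sav[i] == k)
        i = i + j;
    return i;
}

/* Label-for-label version of the assembly. */
int scan_lowered(const int *sav, int i, int j, int k)
{
Loop:
    if (sav[i] == k) goto Skip;  /* add a1,s1,s0; lw a2,0(a1); beq a2,s3,Skip */
    goto Exit;                   /* beq zero, zero, Exit */
Skip:
    i = i + j;                   /* add s1, s1, s2 */
    goto Loop;                   /* beq zero, zero, Loop */
Exit:
    return i;
}
```

The extra hop through `Skip` is needed because the machine has only branch-if-equal: the loop continues on equality, so the exit path is the fall-through.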
31-33. Case/Switch
- switch (k) {
-   case 0: f = i + j; break;
-   case 1: f = g + h; break;
-   case 2: f = g - h; break;
-   case 3: f = i - j; break;
- }
MIPS:
- slt t3, s5, zero      # if k < 0
- bne t3, zero, Exit    #   exit
- slt t3, s5, t2        # if k >= 4 (assuming t2 holds 4)
- beq t3, zero, Exit    #   exit
- add t1, s5, s5        # multiply k by 4
- add t1, t1, t1
Jmptbl: Address(L0), Address(L1), Address(L2), Address(L3)
- add t1, t1, t4        # t1 = address of Jmptbl[k]
- lw t0, 0(t1)
- jr t0                 # jump based on reg t0
- L0: add s0, s3, s4
- j Exit
- etc...
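The jump-table idea can be sketched in C with an array of function pointers standing in for the table of code addresses. This is an illustration under assumptions, not what a compiler literally emits: the names `jmptbl`, `case0`..`case3`, and `switch_via_table` are hypothetical, and each case body is modeled as a small function rather than a label.

```c
typedef int (*case_fn)(int g, int h, int i, int j);

/* One function per case body (standing in for labels L0..L3). */
static int case0(int g, int h, int i, int j) { (void)g; (void)h; return i + j; }
static int case1(int g, int h, int i, int j) { (void)i; (void)j; return g + h; }
static int case2(int g, int h, int i, int j) { (void)i; (void)j; return g - h; }
static int case3(int g, int h, int i, int j) { (void)g; (void)h; return i - j; }

/* The jump table: one code address per case, indexed by k. */
static const case_fn jmptbl[4] = { case0, case1, case2, case3 };

/* Returns the new value of f; f is unchanged when k is out of range. */
int switch_via_table(int k, int f, int g, int h, int i, int j)
{
    if (k < 0 || k >= 4)             /* the two slt range checks      */
        return f;                    /* out of range: go to Exit      */
    return jmptbl[k](g, h, i, j);    /* load table entry, then jump   */
}
```

The range check matters: indexing the table with an unchecked k would jump to whatever address happens to sit in memory next to the table.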
34. Procedures (the previous examples were just if statements, case statements, and loops; what about something like a function call?)
35. More detail (lots more detail, actually!)
36-38. Procedures
- Procedure abstraction
  - What is the programmer's model?
  - What does the compiler have to do?
    - Remember: functions are not compiled at the same time
  - Simple hardware to support procedures?
39. What do we need?
- What do we need to support procedure calls in assembly language?
  - Nested modules/recursion
  - Pass values to modules
  - Return value(s) from module
  - Asynchronous compilation
    - Need to do things in a uniform way
  - Continue execution after module finishes
40. Another way to look at it
- What does a programmer expect?
  - 1. arguments bound to formal parameters
  - 2. space for local variables
  - 3. means to return a value (or values)
  - 4. arbitrary nesting of procedure invocations
    - e.g. for recursion
foo()                  bar(int a)
{                      {
    bar(42);               int temp = 3;
    ...                    return (temp + a);
}                      }
41. Procedure Issues
- Hardware instructions to support this model?
- Call/return
  - Remember where we are in the program. Why?
  - Program counter (PC)
  - What should happen on every instruction execution?
  - Would we need a PC if there were no procedure abstraction?
- Load/store
- Stack
  - Push and pop
  - Stack pointer (sp)
  - Stack frames
  - What do we store and restore on call/return?
42. More procedure issues
- Software conventions (Why?)
  - Reserve some number of registers for parameters, return values, and return address
  - e.g. LC2200
    - 5 for params, 1 for return values, 1 for return address
    - JALR <proc-addr in reg>, ra      # ra is return-addr
    - JALR ra, zero                    # Where does this go?
  - What if we have more params or return values?
    - Common: use stack/memory
- Registers used in procedures
  - Temporary registers
    - Caller does not expect value to be preserved upon return
    - LC2200: a0 to a4
  - Saved registers
    - Caller does expect value to be preserved on return
    - LC2200: s0 to s3 (simplifies amount of state to be saved)
43. MIPS Registers
FYI
44. LC2200 Registers
Recall
45. A simple example
foo:  addi a0, zero, 42     # constant
      addi at, zero, bar
      jalr at, ra           # jump-and-link
      ...
      halt
bar:  addi s0, zero, 3      # temp = 3
      add  v0, a0, s0       # a + temp
      jalr ra, zero         # return!
46-48. A Tale of Two Closets
[Figure: two closets, labeled A and S, captioned "My Closet" and "Renter's Closet"]
49. Procedure calls
- Now it's not just a matter of going somewhere else, but we also may need to save state
  - At least the return address must be saved (the PC)
- Some architectures provide a mechanism to save registers, others make the compiler do it
- Two basic conventions used:
  - Caller saving: the calling procedure must save the registers it wants to use after the return
  - Callee saving: the called procedure must save the registers it wants to use
- Most compilers conservatively caller-save any variable that might be accessed during a call
50. Procedure call issues to think about
- Assume procedure P1 calls procedure P2
- Both procedures need to manipulate global variable x
- If P1 put x in a register, it needs to tell P2 about it
- But what if P2 calls P3, which uses the register where x was put by P1? And P3 shouldn't touch x.
- Some programs may work more efficiently with the caller-saving convention, but others might benefit from the callee-saving convention.
- Most sophisticated compilers use a combination of both for maximum efficiency
51. In detail
- How do we handle an arbitrary nesting of procedure invocations?
- Hmmm... this means we need to save a bunch of registers somewhere so we can restore state upon return.
52. Questions to consider
- Compiling simple procedures
  - Just need to save/restore registers used by called procedure
  - Who should do this? Caller? Callee?
  - What regs in the LC2200 example?
53. Question
- Caller has values in s1 and a4
- Callee will need (to destroy) s1 and a4 to perform its operations.
- Who should save s1?
  - How many people say Caller?
  - How many people say Callee?
  - How many people say either?
54. Another question
- Caller has values in s1 and a4
- Callee will need (to destroy) s1 and a4 to perform its operations.
- Who should save a4?
  - How many people say Caller?
  - How many people say Callee?
  - How many people say either?
55. Stacks
56. Using Stacks
- Basic stack definition
- grows up, grows down
- next empty or last full
- Who saves (caller or callee?)
- Mechanics
- caller at call time
- callee at entry
- callee at exit
- caller after the return
57. Stacks
- Stack grows up? down?
- sp points to next empty? or last full?
- Example
  - stack grows up (in addresses)
  - sp points to next empty slot
- PUSH(x): ADDI sp, sp, 1
           SW   x, -1(sp)
- POP(x):  LW   x, -1(sp)
           ADDI sp, sp, -1
Memory picture (sp = 1003):
1000   full item
1001   full item
1002   full item
1003   <- sp (next empty)
1004
1005
1006
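The PUSH and POP sequences for this convention (stack grows up, sp points to the next empty slot) can be simulated in C with an array standing in for memory. This is a minimal sketch: `mem`, `STACK_WORDS`, and the bounds checks are assumptions added for illustration and have no counterpart in the two-instruction sequences.

```c
#include <assert.h>

#define STACK_WORDS 16

static int mem[STACK_WORDS];  /* stand-in for the stack region of memory */
static int sp = 0;            /* index of the next empty slot            */

void push(int x)
{
    assert(sp < STACK_WORDS); /* illustrative overflow check  */
    sp = sp + 1;              /* ADDI sp, sp, 1               */
    mem[sp - 1] = x;          /* SW   x, -1(sp)               */
}

int pop(void)
{
    assert(sp > 0);           /* illustrative underflow check */
    int x = mem[sp - 1];      /* LW   x, -1(sp)               */
    sp = sp - 1;              /* ADDI sp, sp, -1              */
    return x;
}
```

Because sp always names the next empty slot, the most recently pushed item sits at offset -1 from sp, which is why both sequences use `-1(sp)`.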
58-59. Stack Conventions
Unix memory layout: stack grows down, heap grows up
4GB-1
    stack
    unused
    heap
    data
    code
0
LC-2200 layout? No heap. Stack grows up. Keeps memory together.
64KB-1
    unused
    stack
    data
    code
0
60. Stack Usage
- RISC machines don't have PUSH/POP instructions; you have to do it manually.
- Usually, you don't synthesize PUSH/POP but rather do alloc-use-dealloc, e.g.
- note: this is a feature...
foo:  addi sp, sp, 2
      sw   ra, -2(sp)
      sw   s0, -1(sp)
      ...
      lw   s0, -1(sp)
      lw   ra, -2(sp)
      addi sp, sp, -2
      jalr ra, zero
61. Aside: Leaf Procedures
- If a procedure calls no subroutines, it's a leaf in the call tree.
- Leaf procedures don't need to save ra. In fact, they may not need to save anything or even allocate stack space!
- another optimization enabled by RISC
62-64. Example Leaf Procedure
- int leaf_example(int g, int h, int i, int j)
- {
-     int f;
-     f = (g + h) - (i + j);
-     return f;
- }
What will be where? (i.e., how the heck do we do this?)
For educational purposes, we'll use the s registers for temporary storage of values. Where can we put the existing values that are in the s registers?
65. Example Leaf Procedure
leaf_example:
      addi sp, sp, -3       # Make room for 3 items
      sw   s0, 2(sp)        # Push them onto stack
      sw   s1, 1(sp)
      sw   s2, 0(sp)
      add  s2, a0, a1       # Do the calc... g + h
      add  s1, a2, a3       # i + j
      sub  s0, s2, s1       # (g + h) - (i + j)
      add  v0, s0, zero     # Move!
      lw   s2, 0(sp)        # Pop everything off
      lw   s1, 1(sp)        # stack and put back
      lw   s0, 2(sp)        # into regs
      addi sp, sp, 3        # Adjust stack pointer
      jalr ra, zero         # return
66. Question
- Class question
- What just happened on the last slide?
- Hint
- Think about information and conventions that
were discussed in previous slides
67-68. Trivial procedure call revisited
foo:  addi a0, zero, 42     # constant
      addi at, zero, bar
      jalr at, ra           # jump-and-link
      ...
      halt
bar:  addi sp, sp, 2        # alloc
      sw   ra, -1(sp)       # save RA
      sw   s0, -2(sp)       # save temp
      addi s0, zero, 3      # temp = 3
      add  v0, a0, s0       # a + temp
      lw   s0, -2(sp)       # restore temp
      lw   ra, -1(sp)       # restore RA
      addi sp, sp, -2       # dealloc
      jalr ra, zero         # return!
69. A-type vs. S-type Temporaries
- s0..s3: temporaries
  - "s" for saved
  - callee-saved: gotta save 'em before you can use 'em
- a0..a4: arguments or temporaries
  - caller-saved: they don't survive a procedure call, so you save them (as a caller) if you want them to survive.
- Why two types? When would you use each type?
70. A-type vs. S-type
- Crudely
  - S-type for long-lifetime temporaries
  - A-type for short-lifetime temporaries
- Aside: MIPS has T-type as well
- Ex:
int baz()
{
    int i, total = 0;
    for (i = 0; i < 1000; i++)
        total += qux(i);
    return (total);
}
int qux(int x)
{
    return (2 * x + 1);
}
71-72. Trivial procedure call revisited
(Same code as slide 67.)
S-type was a poor choice!
73. Sidebar on "IN" parameters
- The alert student may note that in the preceding example the values of g, h, i and j were in the a0 through a3 registers.
- "If we made a change to any of those values, wouldn't that change persist?" (I hear you say).
- Well, sort of, but not really...
74. What Really Happens?
- The calling program would be required to take the values from memory and copy them into the a registers
- The called module does its thing.
- Upon return, the values in the a registers are ignored! They are absolutely not guaranteed to be valid!!!
75. Questions? If not, a long example
76. Caller/Callee Mechanics
Who does what when?
foo()                  bar(int a)
{                      {
    ...                    int temp = 3;
    bar(42);               ...
    ...                    return (temp + a);
}                      }
1. caller at call time
2. callee at entry
3. callee at exit
4. caller after return
77-79. But first, strategy (Good Strategy)
Do most work at callee entry/exit
- Caller at call time
  - put arguments in a0..a4
  - save any caller-save temporaries
  - jalr ..., ra
- Callee at entry (most of the work)
  - allocate all stack space
  - save ra and s0..s3 if necessary
- Callee at exit (most of the work)
  - restore ra and s0..s3 if used
  - deallocate all stack space
  - put return value in v0
- Caller after return
  - retrieve return value from v0
  - restore any caller-save temporaries
80. MIPS Registers
Recall
81. Example: Factorial(!)
- int fact(int n)
- {
-     if (n < 1)
-         return 1;
-     else
-         return (n * fact(n - 1));
- }
82. Factorial Again!
(Same fact code as slide 81.)
Stack of frames during fact(4), most recent call first:
fact, n = 0
fact, n = 1
fact, n = 2
fact, n = 3
fact, n = 4
caller of fact(4)
83. Factorial Again!
fact: sub sp, sp, 8       # adjust stack for 2 items
      sw  ra, 4(sp)       # save the return addr
      sw  a0, 0(sp)       # save arg n
      slt t0, a0, 1       # test for n < 1
      beq t0, zero, L1    # if n >= 1, goto L1
      add v0, zero, 1     # return 1
      add sp, sp, 8       # pop 2 off stack
      jr  ra              # return to caller
L1:   sub a0, a0, 1       # n >= 1: arg gets n-1
      jal fact            # call fact w/ (n-1)
      lw  a0, 0(sp)       # return from jal: restore n
      lw  ra, 4(sp)       # restore ret addr
      add sp, sp, 8       # adjust stack ptr (pop 2)
      mul v0, a0, v0      # return n * fact(n-1)
      jr  ra              # return to caller
(Assembly code that we REALLY have to go through in detail)
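One way to go through the factorial code is to separate its two phases: the chain of calls that pushes a frame per invocation, and the chain of returns that pops each frame and multiplies. The C sketch below shows the recursive function next to a hypothetical variant, `fact_explicit_stack`, that manages the saved values of n by hand. Only n is modeled (the return address has no C equivalent here), and the limit of 12 is an assumption chosen so the result fits in a 32-bit int.

```c
#include <assert.h>

/* Recursive version, matching the C source on the earlier slide. */
int fact(int n)
{
    if (n < 1)
        return 1;
    return n * fact(n - 1);
}

/* Hypothetical variant: same computation with a hand-managed stack. */
int fact_explicit_stack(int n)
{
    int saved_n[16];           /* one slot per pending call        */
    int depth = 0;             /* number of frames currently saved */

    assert(n <= 12);           /* assumed limit: 13! overflows a 32-bit int */

    /* Call phase: save n, then "call" with n - 1 until n < 1. */
    while (n >= 1) {
        saved_n[depth++] = n;  /* sw a0, 0(sp)            */
        n = n - 1;             /* sub a0, a0, 1; jal fact */
    }

    int v0 = 1;                /* base case: add v0, zero, 1 */

    /* Return phase: restore each saved n and multiply. */
    while (depth > 0)
        v0 = saved_n[--depth] * v0;  /* lw a0, 0(sp); mul v0, a0, v0 */

    return v0;
}
```

For fact(4) the call phase saves 4, 3, 2, 1 and the return phase multiplies them back in the order 1, 2, 3, 4, which matches the frame picture with n = 0 as the most recent call.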