CS 2200 Lecture 03a Instruction Set Architectures ISAs

Slides: 80

Provided by: michaelt8

1
CS 2200 Lecture 03a: Instruction Set
Architectures (ISAs)

(Lectures based on the work of Jay Brockman,
Sharon Hu, Randy Katz, Peter Kogge, Bill Leahy,
Ken MacKenzie, Richard Murphy, and Michael
Niemier)

2
Types of instructions and how they're used
3
Instruction Classes

We've talked lots about instructions, instruction
counts, but what are they really?
Let's look at some basic instruction classes:
Arithmetic/Logical: AND, OR, add, sub
Data Transfer: Loads/Stores, move inst. (machines
w/addressable memory)
Control: Branch, jump, procedure call and return,
traps
System: Op. sys. calls, virtual memory management
Floating Point: Floating point ops, multiply and
divide
Decimal: Dec. add, multiply, dec.-to-char.
conversions
String: String move, compare, search
Graphics: Pixel ops, compression/decompression

4
What instructions are really used?

Usually the most commonly executed instructions
are the simple ones
For example, consider this Intel 80x86 mix:

These 10 instructions make up 96% of all those
executed!
One could consider this to be more than 1
instruction
5
How does code translate to instructions?

g = h + A[8]
load s0, 8(s3)
actually:
lw s0, 8(s3)        # lw dest, offset(base)
add s1, s2, s0      # add dst, src1, src2
Notice: registers contain data or addresses!

6
Perception Check

What exactly are we doing?
HLL (C) -> Assembler
Assembler -> Machine Instructions
Goals:
See how HLLs translate to assembler
Understand hardware designs, constraints, goals,
etc.

7
Slightly more complex

A[12] = h + A[8]
Compile it?
lw s0, 8(s3)
add s0, s2, s0
sw s0, 12(s3)

The register can be reused, though
(it's a good thing to save registers; in HLL
code you may not, but the compiler will do it
for you)
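The load/add/store sequence above can be traced with a small sketch. This is not an LC-2200 simulator: the `run` helper, the tuple encoding of instructions, the base address 100, and the sample values are all assumptions made for illustration. Memory is treated as word-addressed, so `8(s3)` refers to `A[8]`.

```python
def run(program, regs, mem):
    """Execute (op, a, b, c) tuples against a register dict and a word-addressed memory dict."""
    for op, a, b, c in program:
        if op == "lw":        # lw dest, offset(base)
            regs[a] = mem[regs[c] + b]
        elif op == "sw":      # sw src, offset(base)
            mem[regs[c] + b] = regs[a]
        elif op == "add":     # add dst, src1, src2
            regs[a] = regs[b] + regs[c]
    return regs, mem

# A[12] = h + A[8], with h in s2 and the base address of A in s3.
# Hypothetical data: A starts at address 100 and A[i] holds 10 * i.
mem = {100 + i: 10 * i for i in range(16)}
regs = {"s0": 0, "s2": 7, "s3": 100}
program = [
    ("lw", "s0", 8, "s3"),        # s0 = A[8]
    ("add", "s0", "s2", "s0"),    # s0 = h + A[8], reusing s0
    ("sw", "s0", 12, "s3"),       # A[12] = s0
]
run(program, regs, mem)
```

Only one temporary register (`s0`) is needed because the loaded value is consumed by the add that overwrites it.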
8
Historical Note

Early machines often had a register called an
index register
load addr(index)
Address space was small
Today
lw s1, offset(base)
Base addresses are much larger... they need to be
in a register
Often offsets are small, but if not...

9
Variable Array Index?

g = h + A[i]
add a0, s2, s3      # a0 = i + addr(A)
lw a1, 0(a0)        # a1 <- A[i]
add s0, s1, a1      # g <- h + A[i]

10
Flashback: How many registers?

Early machines had about 1 (Accumulator)
PDP-11 series had 8
Typical 32 bit processors today have 32.
Why not more?
What happens when there are more variables than
registers?
Spillage: putting less commonly used variables
back into memory.

11
Factoid

"accumulator: Archaic term for a register. On-line
use of it as a synonym for 'register' is a fairly
reliable indication that the user has been around
for a while."
Eric Raymond, The New Hacker's Dictionary, 1991

12
Note the speed

Register ops access two regs and perform an
operation
Memory ops access one location and don't do
anything to it!
What do you think about the speed of memory
versus registers?

13
MIPS Registers
Name    R       Usage
zero    0       The constant value 0
at      1       Reserved for assembler
v0-v1   2-3     Values for results and expression evaluation
a0-a3   4-7     Arguments
t0-t7   8-15    Temporaries
s0-s7   16-23   Saved
t8-t9   24-25   More temporaries
k0-k1   26-27   Reserved for use by operating system
gp      28      Global pointer
sp      29      Stack pointer
fp      30      Frame pointer
ra      31      Return address
14
LC2200 Registers
Name    R       Usage
zero    0       The constant value 0
at      1       Reserved for assembler
v0      2       Return value
a0-a4   3-7     Argument or temporary
s0-s3   8-11    Saved general purpose registers
k0      12      Reserved for OS/traps
sp      13      Stack pointer
fp      14      Frame pointer
ra      15      Return address
15
Decisions, decisions...

"What distinguishes a computer from a simple
calculator is its ability to make decisions."
(Some smart guy)
The effects of branch instructions are a big
source of research (read: headaches) in computer
architecture
In this class:
Jump will refer to unconditional changes in
control
Branch will refer to conditional changes in
control
And there are 4 main types of control-flow
change:
Conditional branches (most frequent)
Jumps
Procedure calls
Procedure returns

A function call is just like in an HLL, but we have
to get to it, get back from it, and preserve state
16
...which break down like this
(%s for benchmark suite on load/store machine)
Side note: if you didn't know, int x, int y, xy
vs. double x, double y, xy results in different
types of instructions. Why?
17
Conditional branch instructions
18
Where jumps and branches go

Always to some specified destination address
Most often, address is specified in instruction
Procedure return is an exception, however
(Return target is not known at compile time)
Usually, address is specified relative to the PC
PC = Program Counter: indexes executing
instructions
Control instructions specified as such are
PC-relative
Good b/c target is often near current instruction
(indexed by PC) and specified by fewer bits
Allows for position independence: program can
run independently of where it's loaded
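The position-independence point can be checked with a little arithmetic. The sketch below assumes a word-addressed machine where the offset is taken relative to the instruction after the branch (PC + 1); that convention, the function names, and the load addresses are assumptions for illustration, not something these slides specify.

```python
def encode_offset(branch_addr, target_addr):
    """Offset stored in the branch, relative to the next instruction (assumed convention)."""
    return target_addr - (branch_addr + 1)

def branch_target(branch_addr, offset):
    """Address the hardware would jump to for a taken PC-relative branch."""
    return branch_addr + 1 + offset

# The same loop (branch at word 4 jumping back to word 0),
# loaded at two different base addresses.
offsets = []
for load_base in (0, 0x4000):
    branch_addr = load_base + 4
    target_addr = load_base + 0
    offsets.append(encode_offset(branch_addr, target_addr))
```

Both entries of `offsets` are the same small number, so the encoded instruction does not change when the program is loaded elsewhere, and the offset needs far fewer bits than a full address.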

19
Target Unknown

So what about these unknown addresses?
Target NOT known at compile time
Can't use PC-relative; must specify target
dynamically so we can change it at runtime
Some options:
Put target address in a register
Let jump permit any addr. mode to supply target
address
Register-indirect jumps are useful for the
following constructs:
Case/Switch: select among one of several
alternatives
Dynamically shared libraries: library loaded
when invoked
No compile-time target: load from memory with
register-indirect jump

20
Some basic branch facts

Branches usually use PC-relative addressing
But how far is target from instruction?
The answer to this question will tell us
Which branch offsets to support
How long an instruction is/how it should be
encoded
Note the gory interdependencies!!!
Most branches are in the forward direction and
only 4-7 instructions away
Short displacement should suffice
Gives us increased code density w/shorter
instructions
What would you want to design a datapath to
handle?
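To make the offset-size trade-off concrete, here is a rough sketch that assumes the displacement is stored as a two's-complement signed field. That encoding and the helper names are assumptions; the slides do not fix a particular format.

```python
def offset_range(bits):
    """Smallest and largest offset representable in a signed field of the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

def bits_needed(offset):
    """Narrowest signed field that can hold the given branch offset."""
    bits = 1
    while True:
        low, high = offset_range(bits)
        if low <= offset <= high:
            return bits
        bits += 1
```

Under this assumption a forward branch of 4 to 7 instructions fits in a 4-bit field, which is why short displacements cover most branches and leave instruction bits for other uses.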

21
How do we know where to go?
Often the branch is a simple inequality test or a
compare with 0; architectures make this a
special case and make it fast! (b/c you want to
spend time computing, not comparing)
22
Let's look at some branch examples and how they
map to statements from a high-level
language (using LC2200 register conventions)
23
LC2200 Registers (recall)
Name    R       Usage
zero    0       The constant value 0
at      1       Reserved for assembler
v0      2       Return value
a0-a4   3-7     Argument or temporary
s0-s3   8-11    Saved general purpose registers
k0      12      Reserved for OS/traps
sp      13      Stack pointer
fp      14      Frame pointer
ra      15      Return address
24
if statement

    if (i == j) goto L1
    f = g + h
L1: f = f - i

    beq s3, a4, L1      # if i == j goto L1
    add s0, s1, s2      # f = g + h
L1: sub s0, s0, s3      # f = f - i

If i != j, we want to add g + h and store it in f
before we calculate f = f - i
How is this compare actually done?
25
Recall example MIPS machine
26
And... did we just use a goto???
27
if statement

if (i != j)
    f = g + h
f = f - i

    beq s3, a4, L1
    add s0, s1, s2
L1: sub s0, s0, s3

28
if-then-else

if (i == j)
    f = g + h
else
    f = g

      beq s3, a4, Then      # if i == j we want to add
      add s0, s1, zero      # f = g (the else part)
      beq zero, zero, Exit
Then: add s0, s1, s2        # f = g + h
Exit:

Note: The LC2200 has no BNE

reg     contents
s0      f
s1      g
s2      h
s3      i
a4      j
29
Loop with Variable Array Index

Loop: g = g + A[i]
      i = i + j
      if (i != h) goto Loop

reg     contents
s0      g
s1      h
s2      i
s3      j
a0      addr of A
Plus some temps

Loop: add a1, s2, a0
      lw a1, 0(a1)
      add s0, s0, a1
      add s2, s2, s3
      beq s2, s1, Exit
      beq zero, zero, Loop
Exit:
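One way to convince yourself that the two-beq pattern really implements "if (i != h) goto Loop" is to step through it. The sketch below is a toy interpreter, not the LC-2200 datapath: the tuple encoding, the label table, and the sample data (A at address 100, A[k] = k + 1, h = 6, j = 2) are assumptions for illustration.

```python
def run(program, labels, regs, mem, max_steps=1000):
    """Step through (op, a, b, c) tuples; beq jumps to a label when its two registers are equal."""
    pc = 0
    steps = 0
    while pc < len(program) and steps < max_steps:
        op, a, b, c = program[pc]
        pc += 1
        steps += 1
        if op == "add":
            regs[a] = regs[b] + regs[c]
        elif op == "lw":
            regs[a] = mem[regs[c] + b]
        elif op == "beq" and regs[a] == regs[b]:
            pc = labels[c]
    return regs

labels = {"Loop": 0, "Exit": 6}
program = [
    ("add", "a1", "s2", "a0"),        # a1 = i + addr(A)
    ("lw", "a1", 0, "a1"),            # a1 = A[i]
    ("add", "s0", "s0", "a1"),        # g = g + A[i]
    ("add", "s2", "s2", "s3"),        # i = i + j
    ("beq", "s2", "s1", "Exit"),      # if i == h, leave the loop
    ("beq", "zero", "zero", "Loop"),  # otherwise, unconditional jump back
]
mem = {100 + k: k + 1 for k in range(8)}
regs = {"zero": 0, "s0": 0, "s1": 6, "s2": 0, "s3": 2, "a0": 100, "a1": 0}
run(program, labels, regs, mem)
```

With these values the loop visits A[0], A[2], and A[4], then exits when i reaches h.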
30
While Loop
while (sav[i] == k)
    i = i + j

Loop: add a1, s1, s0
      lw a2, 0(a1)
      beq a2, s3, Skip
      beq zero, zero, Exit
Skip: add s1, s1, s2
      beq zero, zero, Loop
Exit:

reg     contents
s0      addr(sav)
s1      i
s2      j
s3      k
a1      temp
a2      temp
31
Case/Switch

switch (k) {
    case 0: f = i + j; break;
    case 1: f = g + h; break;
    case 2: f = g - h; break;
    case 3: f = i - j; break;
}

slt t3, s5, zero        # if k < 0
bne t3, zero, Exit      #   Exit
slt t3, s5, t2          # if k >= 4 (t2 holds 4)
beq t3, zero, Exit      #   Exit
add t1, s5, s5          # multiply k by 4
add t1, t1, t1

(MIPS)
32
Case/Switch continued

switch (k) {
    case 0: f = i + j; break;
    case 1: f = g + h; break;
    case 2: f = g - h; break;
    case 3: f = i - j; break;
}

Jmptbl: Address(L0), Address(L1), Address(L2),
Address(L3)
33
Case/Switch continued

switch (k) {
    case 0: f = i + j; break;
    case 1: f = g + h; break;
    case 2: f = g - h; break;
    case 3: f = i - j; break;
}

    add t1, t1, t4      # t1 = address of Jmptab[k]
    lw t0, 0(t1)
    jr t0               # jump based on reg t0
L0: add s0, s3, s4
    j Exit
etc...
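The same idea (bounds check, index into a table of targets, indirect jump) can be sketched in a high-level language. Here a Python list of functions stands in for the table of code addresses; the function name `switch` and the default value `f` are hypothetical, and this only illustrates the technique rather than what any particular compiler emits.

```python
def switch(k, g, h, i, j, f=0):
    """Dispatch on k through a jump table; out-of-range k leaves f unchanged."""
    jump_table = [
        lambda: i + j,   # L0: case 0
        lambda: g + h,   # L1: case 1
        lambda: g - h,   # L2: case 2
        lambda: i - j,   # L3: case 3
    ]
    # The two slt/branch pairs in the assembly: reject k < 0 and k >= 4.
    if k < 0 or k >= len(jump_table):
        return f
    # Load the target from the table and jump through it (the lw + jr pair).
    return jump_table[k]()
```

The cost of the dispatch is the same for every case, which is the advantage over a chain of compare-and-branch tests.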
34
Procedures (the previous examples were just
if statements, case statements, and loops; what
about something like a function call?)
35
More detail (lots more detail actually!)
36
Procedures

Procedural abstraction
What is the programmer's model?

37
Procedures

Procedure abstraction
What is the programmer's model?
What does the compiler have to do?
Remember: functions are not compiled at the same
time

38
Procedures

Procedure abstraction
What is the programmer's model?
What does the compiler have to do?
Remember: functions are not compiled at the same
time
Simple hardware to support procedures?

39
What do we need?

What do we need to support procedure calls in
assembly language?
Nested modules/Recursion
Pass values to modules
Return value(s) from module
Asynchronous compilation
Need to do things in a uniform way
Continue execution after module finishes

40
Another way to look at it

What does a programmer expect?
1. arguments bound to formal parameters
2. space for local variables
3. means to return a value (or values)
4. arbitrary nesting of procedure invocations
e.g. for recursion

foo() {
    bar(42);
    ...
}

bar(int a) {
    int temp = 3;
    return(temp + a);
}

41
Procedure Issues

Hardware instructions to support this model?
Call/return
Remember where we are in the program. Why?
Program counter (PC)
What should happen on every instruction
execution?
Would we need a PC if there were no procedure
abstraction?
Load/store
Stack
Push and pop
Stack pointer (sp)
Stack frames
What do we store and restore on call/return?

42
More procedure issues

Software conventions (Why?)
Reserve some number of registers for parameters,
return values, and return address
e.g. LC2200:
5 for params, 1 for return values, 1 for return
address
JALR <proc-addr in reg>, ra     # ra is return-addr
JALR ra, zero                   # Where does this go?
What if we have more params or return values?
Common: use stack/memory
Registers used in procedures
Temporary registers
Caller does not expect value to be preserved upon
return
LC2200: a0 to a4
Saved registers
Caller does expect value to be preserved on
return
LC2200: s0 to s3 (simplifies amount of state to
be saved)

43
MIPS Registers
FYI
44
LC2200 Registers
Recall
45
A simple example
foo: addi a0, zero, 42      # constant
     addi at, zero, bar
     jalr at, ra            # jump-and-link
     ...
     halt
bar: addi s0, zero, 3       # temp = 3
     add v0, a0, s0         # a + temp
     jalr ra, zero          # return!
46
A Tale of Two Closets
[Figure: two closets, "My Closet" and "Renter's Closet"; labels A and S]
49
Procedure calls

Now it's not just a matter of going somewhere else,
but we also may need to save state
At least the return address must be saved (the PC)
Some architectures provide a mechanism to save
registers, others make the compiler do it
Two basic conventions used:
Caller saving: the calling procedure must save
the registers it wants to use after the return
Callee saving: the called procedure must save the
registers it wants to use
Most compilers conservatively caller-save any
variable that might be accessed during a call

50
Procedure call issues to think about

Assume procedure P1 calls procedure P2
Both procedures need to manipulate global
variable x
If P1 put x in a register, it needs to tell P2
about it
But what if P2 calls P3, which uses a register
where x was put by P1? And P3 shouldn't touch
x.
Some programs may work more efficiently with the
caller saved procedure but others might benefit
from the callee saving procedure.
Most sophisticated compilers use a combination of
both for maximum efficiency

51
In detail

how do we handle an arbitrary nesting of
procedure invocations?
Hmmm this means we need to save a bunch of
registers somewhere so we can restore state upon
return.

52
Questions to consider

Compiling simple procedures
Just need to save/restore registers used by
called procedure
Who should do this? caller? callee?
What regs in the LC2200 example?

53
Question

Caller has values in s1 and a4
Callee will need (to destroy) s1 and a4 to
perform its operations.
Who should save s1?
How many people say Caller?
How many people say Callee?
How many people say either?

54
Another question

Caller has values in s1 and a4
Callee will need (to destroy) s1 and a4 to
perform its operations.
Who should save a4?
How many people say Caller?
How many people say Callee?
How many people say either?

55
Stacks

Why a stack?

56
Using Stacks

Basic stack definition
grows up, grows down
next empty or last full
Who saves (caller or callee?)
Mechanics
caller at call time
callee at entry
callee at exit
caller after the return

57
Stacks

Stack grows up? down?
sp points to next empty? or last full?
Example:
stack grows up (in addresses)
sp points to next empty slot

1000    full items
1001    full items
1002    full items
1003    <- sp = 1003
1004
1005
1006

PUSH(x): ADDI sp, sp, 1
         SW x, -1(sp)
POP(x):  LW x, -1(sp)
         ADDI sp, sp, -1
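The PUSH/POP sequences for this convention (stack grows up, sp points to the next empty slot) can be mirrored line for line. The class below is a sketch with hypothetical names; the starting sp of 1003 is taken from the diagram, and memory is modeled as a plain dictionary.

```python
class UpwardStack:
    """Stack that grows toward higher addresses; sp names the next empty slot."""

    def __init__(self, sp=1003):
        self.mem = {}
        self.sp = sp

    def push(self, x):
        self.sp += 1                 # ADDI sp, sp, 1
        self.mem[self.sp - 1] = x    # SW x, -1(sp)

    def pop(self):
        x = self.mem[self.sp - 1]    # LW x, -1(sp)
        self.sp -= 1                 # ADDI sp, sp, -1
        return x

stack = UpwardStack()
stack.push(5)
stack.push(6)
top = stack.pop()
```

After the two pushes and one pop, the value 5 is still stored at address 1003 and sp is 1004, the next empty slot.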
58
Stack Conventions
Unix memory layout: stack grows down, heap grows
up
4GB-1
stack
unused
heap
data
code
0
59
Stack Conventions
Unix layout     LC-2200 layout?
4GB-1           64KB-1
stack           unused
unused          stack
heap            data
data            code
code            0
0
LC-2200 layout? No heap. Stack grows up. Keeps
memory together.
60
Stack Usage

RISC machines don't have PUSH/POP instructions;
you have to do it manually.
Usually, you don't synthesize PUSH/POP but rather
do alloc-use-dealloc, e.g.:
(note: this is a feature...)

foo: addi sp, sp, 2
     sw ra, -2(sp)
     sw s0, -1(sp)
     ...
     lw s0, -1(sp)
     lw ra, -2(sp)
     addi sp, sp, -2
     jalr ra, zero
61
Aside Leaf Procedures

If a procedure calls no subroutines, it's a
leaf in the call tree.
Leaf procedures don't need to save ra. In fact,
they may not need to save anything or even
allocate stack space!
Another optimization enabled by RISC

62
Example Leaf Procedure

int leaf_example(int g, int h, int i, int j)
{
    int f;
    f = (g + h) - (i + j);
    return f;
}

63
Example Leaf Procedure

int leaf_example(int g, int h, int i, int j)
{
    int f;
    f = (g + h) - (i + j);
    return f;
}

What will be where?
(i.e. how the heck do we do this?)
64
Example Leaf Procedure

int leaf_example(int g, int h, int i, int j)
{
    int f;
    f = (g + h) - (i + j);
    return f;
}

For educational purposes, we'll use the s
registers for temporary storage of values. Where
can we put the existing values that are in the s
registers?
65
Example Leaf Procedure

leaf_example:
    addi sp, sp, -3     # Make room for 3 items
    sw s0, 2(sp)        # Push them onto stack
    sw s1, 1(sp)
    sw s2, 0(sp)
    add s2, a0, a1      # Do the calc... g + h
    add s1, a2, a3      # i + j
    sub s0, s2, s1      # (g + h) - (i + j)
    add v0, s0, zero    # Move!
    lw s2, 0(sp)        # Pop everything off
    lw s1, 1(sp)        # stack and put back
    lw s0, 2(sp)        # into regs
    addi sp, sp, 3      # Adjust stack pointer
    jalr ra, zero       # return
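As a sanity check on the save/compute/restore pattern, here is a line-by-line Python mirror of the routine. It is only a sketch: registers and memory are dictionaries, the starting values (sp = 200, the s-register contents, the arguments) are made up, and the stack grows down here because that is what this particular slide does.

```python
def leaf_example(regs, mem):
    """Mirror of the assembly: save s0-s2, compute (g + h) - (i + j), restore, return in v0."""
    regs["sp"] -= 3                          # addi sp, sp, -3
    mem[regs["sp"] + 2] = regs["s0"]         # sw s0, 2(sp)
    mem[regs["sp"] + 1] = regs["s1"]         # sw s1, 1(sp)
    mem[regs["sp"] + 0] = regs["s2"]         # sw s2, 0(sp)
    regs["s2"] = regs["a0"] + regs["a1"]     # g + h
    regs["s1"] = regs["a2"] + regs["a3"]     # i + j
    regs["s0"] = regs["s2"] - regs["s1"]     # (g + h) - (i + j)
    regs["v0"] = regs["s0"]                  # move result to the return register
    regs["s2"] = mem[regs["sp"] + 0]         # lw s2, 0(sp)
    regs["s1"] = mem[regs["sp"] + 1]         # lw s1, 1(sp)
    regs["s0"] = mem[regs["sp"] + 2]         # lw s0, 2(sp)
    regs["sp"] += 3                          # addi sp, sp, 3

# Hypothetical caller state: the s registers hold values the caller still needs.
regs = {"a0": 1, "a1": 2, "a2": 3, "a3": 4,
        "s0": 11, "s1": 22, "s2": 33, "sp": 200, "v0": 0}
mem = {}
leaf_example(regs, mem)
```

From the caller's point of view only v0 changed: the s registers and sp come back exactly as they were, which is the callee-saved contract.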

66
Question

Class question
What just happened on the last slide?
Hint
Think about information and conventions that
were discussed in previous slides

67
Trivial procedure call revisited
foo: addi a0, zero, 42      # constant
     addi at, zero, bar
     jalr at, ra            # jump-and-link
     ...
     halt
bar: addi sp, sp, 2         # alloc
     sw ra, -1(sp)          # save RA
     sw s0, -2(sp)          # save temp
     addi s0, zero, 3       # temp = 3
     add v0, a0, s0         # a + temp
     lw s0, -2(sp)          # restore temp
     lw ra, -1(sp)          # restore RA
     addi sp, sp, -2        # dealloc
     jalr ra, zero          # return!
68
Trivial procedure call revisited
2
foo: addi a0, zero, 42      # constant
     addi at, zero, bar
     jalr at, ra            # jump-and-link
     ...
     halt
bar: addi sp, sp, 2         # alloc
     sw ra, -1(sp)          # save RA
     sw s0, -2(sp)          # save temp
     addi s0, zero, 3       # temp = 3
     add v0, a0, s0         # a + temp
     lw s0, -2(sp)          # restore temp
     lw ra, -1(sp)          # restore RA
     addi sp, sp, -2        # dealloc
     jalr ra, zero          # return!
69
A-type vs. S-type Temporaries

s0..s3: temporaries
s for saved
callee-saved: gotta save 'em before you can use
'em
a0..a4: arguments or temporaries
caller-saved: they don't survive a procedure call,
so you save them (as a caller) if you want them
to survive.
Why two types? When would you use each type?

70
A-type vs. S-type

Crudely:
S-type for long-lifetime temporaries
A-type for short-lifetime temporaries
Aside: MIPS has T-type as well
Ex:

int baz() {
    int i, total = 0;
    for (i = 0; i < 1000; i++)
        total += qux(i);
    return(total);
}

int qux(int x) {
    return(2 * x + 1);
}
71
Trivial procedure call revisited ?
2
foo: addi a0, zero, 42      # constant
     addi at, zero, bar
     jalr at, ra            # jump-and-link
     ...
     halt
bar: addi sp, sp, 2         # alloc
     sw ra, -1(sp)          # save RA
     sw s0, -2(sp)          # save temp
     addi s0, zero, 3       # temp = 3
     add v0, a0, s0         # a + temp
     lw s0, -2(sp)          # restore temp
     lw ra, -1(sp)          # restore RA
     addi sp, sp, -2        # dealloc
     jalr ra, zero          # return!
S-type was a poor choice!
72
Trivial procedure call revisited ?
3
foo: addi a0, zero, 42      # constant
     addi at, zero, bar
     jalr at, ra            # jump-and-link
     ...
     halt
bar: addi sp, sp, 2         # alloc
     sw ra, -1(sp)          # save RA
     sw s0, -2(sp)          # save temp
     addi s0, zero, 3       # temp = 3
     add v0, a0, s0         # a + temp
     lw s0, -2(sp)          # restore temp
     lw ra, -1(sp)          # restore RA
     addi sp, sp, -2        # dealloc
     jalr ra, zero          # return!
S-type was a poor choice!
73
Sidebar on "IN" parameters

The alert student may note that in the preceding
example the values of g, h, i and j were in the
a0 thru a3 registers.
If we made a change to any of those values,
wouldn't that change persist? (I hear you say.)
Well, sort of, but not really...

74
What Really Happens?

The calling program would be required to take the
values from memory and copy them into the a
registers
The called module does its thing.
Upon return the values in the a registers are
ignored! They are absolutely not guaranteed to be
valid!!!
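A small sketch of why "IN" parameters behave like copies: the caller loads the value from memory into an a register, the callee is free to clobber that register, and nothing ever writes it back. The dictionary model and the names `callee`, `memory`, and `g` are illustrative assumptions.

```python
def callee(regs):
    """Called module: free to overwrite its argument register while it works."""
    regs["a0"] = regs["a0"] * 2      # clobbers the incoming argument
    regs["v0"] = regs["a0"] + 1      # result goes in the return-value register

memory = {"g": 5}                    # the caller's variable lives in memory
regs = {"a0": memory["g"], "v0": 0}  # caller copies g into a0 before the call
callee(regs)
result = regs["v0"]                  # only v0 is meaningful after the return
```

After the call, `memory["g"]` is still 5 even though a0 was changed, and the caller should treat whatever is left in a0 as garbage.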

75
Questions? If not, a long example
76
Caller/Callee Mechanics
who does what when?

Four places:

foo() {
    ...
    bar(42);
    ...
}

bar(int a) {
    int temp = 3;
    ...
    return(temp + a);
}

1. caller at call time
2. callee at entry
3. callee at exit
4. caller after return
77
But first, strategy
do most work at callee entry/exit