INSTRUCTION SET DESIGN

Title: COSC3330/6308 Computer Architecture. Author: Jehan-François Pâris. Last modified by: Jehan-François Pâris. Created Date: 8/29/2001 4:04:21 AM. PowerPoint PPT presentation.

Slides: 108

Provided by: Jehan76

Learn more at: https://www2.cs.uh.edu

1
INSTRUCTION SET DESIGN

Jehan-François Pâris
jparis_at_uh.edu

2
Chapter Organization

General Overview
Objectives
Realizations
The MIPS instruction set

3
Importance

The instruction set of a processor is its
interface with the outside world
Defined by the hardware
Used by assemblers, compilers and interpreters
Remained very visible to the users up to the 80s
Earlier PC programs were written in assembler

4
GENERAL OVERVIEW
5
Common features

A machine instruction normally consists of
An operation code
One, two or three operand addresses
Various flags
Some operands can be immediate
Address field contains the value of the operand
instead of its address

6
Common features

One or more operands can be in high-speed
registers
Dedicated registers
Very old solution
Register address can be specified in the opcode
General purpose registers

7
Common features

Memory operand addresses are represented in a
compact form
Base displacement
The address is specified by the contents of a
base register plus a displacement
Saves space because displacement is generally
small

8
Objectives

IS should be
Expressive
Powerful instructions
Designed for speed
Should be able to run fast and allow extensive
prefetching
Compact
Faster fetches from disk and from main memory

9
Objectives

User friendly
Very important when people were expected to
program in assembler
Manufacturers loved that because
Instruction sets were mostly proprietary (IBM
360/370 was the exception)
Programs could not be ported to a different
architecture

10
The story of Gene Amdahl

Raised on a farm without electricity until he
went to high school
PhD from U Wisconsin-Madison
Became one of the top architects of the IBM/360
series
Had his next design rejected by IBM
Started his own company (Amdahl) building big
mainframes and selling them at a much lower cost
than comparable IBM machines

11
How could they do that? (II)

Amdahl could undersell IBM by focusing on larger
"mainframes"
Amdahl's computers were air-cooled while IBM's
were water-cooled
"Decreased installation costs by $50,000 to
$250,000."
http://www.fundinguniverse.com/company-histories/amdahl-corporation-history/

12
How could they do that? (I)

IBM 360 series was first series of computers
with
Very different capacities
Same instruction set
IBM pricing policy was keeping computer prices
proportional to their capacity
Did not reflect proportionally lower
manufacturing cost of high-end machines

13
The end of the story

Left Amdahl in 1979 to pursue several ventures,
none of them successful
Amdahl Computers is now part of Fujitsu and
focuses on services

14
Inherent conflicts

Expressiveness vs. Speed
CISC instructions were powerful but microcoded
Compactness vs. Speed
Many instruction sets had instructions of
different length
Cannot know start of next instruction before
decoding the opcode of current one

15
What is microcode?

Some machine language below the instruction set
Invisible to the programmer/compiler
Each instruction corresponded to one or more
microinstructions
Some architectures allowed the user to program
new instructions or a whole new instruction set

16
AN EXAMPLE: IBM 360/370
17
The 360 architecture (I)

Developed in the 60s but kept almost unchanged
for 30 years
Thirty-two bit words and 8-bit bytes
Memory was byte-addressable
Had 24-bit addresses restricting main memory size
to 16 MB (enormous at that time!)
Later extended to 31 then 64 bits

18
The 360 architecture (II)

Instruction set included
32-bit operations for scientific and engineering
computing (FORTRAN)
Byte-oriented operations for character processing
and decimal arithmetic then judged essential for
business applications
Name of series referred to wide range of
applications that could run on the machine

19
IBM 360 instruction set (I)

Had multiple instruction formats all with
Mostly 8-bit opcodes
16 general purpose registers
RR (register to register)
16 bits

20
IBM 360 instruction set (II)

RX (register to/from indexed storage)
32 bits
Address of memory operand was contents of base
and index registers plus 12-bit displacement D

21
IBM 360 instruction set (III)

SI (storage and immediate)
32 bits
Has an 8-bit wide immediate field I
Address of memory operand was contents of base
register B plus 12-bit displacement D

22
IBM 360 instruction set (IV)

SS (storage to storage)
48 bits
Two memory operands
Each specified by the first address of a field of length L

23
IBM 360 instruction set (V)

S (storage)
32 bits with a 16-bit opcode
Mostly for privileged instructions

24
Discussion (I)

Flexible and compact
Multiple instruction sizes
Must decode current instruction to know start of
the next one
Regular design
Many operations can be RR, RS, RX, SI and SS
(character manipulation and decimal arithmetic)

25
Discussion (II)

RX format
Memory address is indexed by base register and
index register
Address of array element a[i] can be decomposed into
Current base register
Offset of a[0] relative to base register
Index i multiplied by size of array element in
index register

26
Discussion (III)

Why such a complex addressing format?
Index register was used to access arrays
Base register allowed for a much shorter address
field
4 bits for base register + 12 bits for
displacement
vs.
24 bits for a full address
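
The RX address computation discussed above can be sketched in a few lines of Python. This is only an illustration: the function name `rx_effective_address` and the register contents are hypothetical, and the 24-bit wrap-around reflects the original 24-bit address size mentioned earlier.

```python
def rx_effective_address(base: int, index: int, displacement: int) -> int:
    """Effective address of an RX operand: base + index + 12-bit displacement."""
    if not 0 <= displacement < 2**12:
        raise ValueError("displacement must fit in 12 bits")
    # Original 360 addresses were 24 bits wide, so the sum wraps at 2**24.
    return (base + index + displacement) & 0xFFFFFF


# Hypothetical values: the base register holds 0x010000, a[0] sits 0x100 bytes
# past the base, and we want element i = 5 of an array of 4-byte words.
base = 0x010000
offset_of_a0 = 0x100
i, element_size = 5, 4
address = rx_effective_address(base, i * element_size, offset_of_a0)
print(hex(address))
```

The instruction only has to carry 4 + 4 + 12 bits of addressing information; the remaining address bits come from the registers.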

27
THE MIPS INSTRUCTION SET
28
MIPS (I)

Originally stood for Microprocessor without
Interlocked Pipeline Stages
First RISC microprocessor
Development started in 1981 under John Hennessy
at Stanford University
Started a company: MIPS Computer Systems, Inc.

29
MIPS (II)

Owned by SGI from 1992 to 1998
Until SGI switched to the Intel Itanium
architecture
Used by DEC, NEC, Pyramid Technology,
Siemens Nixdorf, Tandem and others during the
late 80s and 90s
Until Intel Pentium took over
Now primarily used in embedded systems

30
Overview

Two versions
MIPS32 with 32-bit addresses (discussed here)
MIPS64 with 64-bit addresses
Both MIPS architectures have
Thirty-two registers (32 bits on MIPS32)
A byte-addressable memory
Byte, half-word and word-oriented operations

31
Bit ordering

All examples assume that byte-ordering
is little-endian
Bits are numbered from right to left

[Figure: a 32-bit word with bit 31 on the left and bit 0 on the right]
32
Number representation (I)

MIPS uses two's complement representation for
negative numbers
00...0 represents 0
00...1 represents 1
01...1 represents 2^(n-1) - 1
10...0 represents -2^(n-1)
11...1 represents -1

33
Two's complement representation

Assume n-bit integers
All positive integers have first bit equal to
zero
All negative integers have first bit equal to one
To negate an integer, we compute its complement
to 2^n in unsigned arithmetic

34
Example

Assume n = 4
0000, 0001, 0010, 0011, 0100, 0101, 0110 and 0111
represent integers 0 to 7
To find the representation of -3, we do
16 - 3 = 13, that is, 1101
More generally, 1000, 1001, 1010, 1011, 1100,
1101, 1110, 1111 represent negative integers -8
to -1
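
The same computation can be checked with a short Python sketch. The helper names `to_twos_complement` and `from_twos_complement` are illustrative only and are not part of the course material.

```python
def to_twos_complement(value: int, n: int) -> str:
    """Return the n-bit two's complement bit string of value."""
    if not -(1 << (n - 1)) <= value < (1 << (n - 1)):
        raise ValueError("value does not fit in n bits")
    # Reducing modulo 2**n maps a negative value v to 2**n + v (16 - 3 = 13).
    return format(value % (1 << n), f"0{n}b")


def from_twos_complement(bits: str) -> int:
    """Interpret a bit string as a signed two's complement integer."""
    unsigned = int(bits, 2)
    # A leading 1 marks a negative number: subtract 2**n to recover it.
    return unsigned - (1 << len(bits)) if bits[0] == "1" else unsigned


print(to_twos_complement(-3, 4))
print(from_twos_complement("1000"))
```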

35
Another way to look at it (I)
Unsigned  Bits    Bits  Unsigned
0         0000    1000  8
1         0001    1001  9
2         0010    1010  10
3         0011    1011  11
4         0100    1100  12
5         0101    1101  13
6         0110    1110  14
7         0111    1111  15
36
Another way to look at it (II)
U and S  Bits    Bits  Unsigned      Signed
0        0000    1000  8 = 16 - 8    -8
1        0001    1001  9 = 16 - 7    -7
2        0010    1010  10 = 16 - 6   -6
3        0011    1011  11 = 16 - 5   -5
4        0100    1100  12 = 16 - 4   -4
5        0101    1101  13 = 16 - 3   -3
6        0110    1110  14 = 16 - 2   -2
7        0111    1111  15 = 16 - 1   -1
37
Number representation (II)

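Can create problems when we fetch a byte or
half-word into a 32-bit register
If we fetch a 16-bit half-word whose leading bit is 1,
we have two possible outcomes

[Figure: the half-word extended to 32 bits, either with leading zeroes (unsigned) or with copies of its leading bit (signed)]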
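
A minimal Python sketch of the two outcomes, assuming the usual zero extension for unsigned loads and sign extension for signed loads. The function names and the sample half-word 0x9001 are made up for the illustration.

```python
def zero_extend_16(halfword: int) -> int:
    """Unsigned load: the 16 upper bits of the register are cleared."""
    return halfword & 0xFFFF


def sign_extend_16(halfword: int) -> int:
    """Signed load: the 16 upper bits are copies of bit 15 of the half-word."""
    halfword &= 0xFFFF
    if halfword & 0x8000:
        halfword |= 0xFFFF0000
    return halfword


sample = 0x9001  # hypothetical half-word whose leading bit is 1
print(f"{zero_extend_16(sample):08X}")
print(f"{sign_extend_16(sample):08X}")
```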
38
MIPS instruction set

Designed for speed and prefetching ease
All instructions are 32 bits long
Five instruction formats
Three basic formats
R, I and J
Two floating point formats
FR and FI

39
The R format

Six-bit opcode
R instructions have three operands
Five bits per register → 32 registers
Shamt specifies a shift amount (5 bits)
Funct selects the specific variant of operation
defined in opcode (6 bits)
Many R instructions have an all-zero opcode
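
As an illustration of how the six R format fields share one 32-bit word, here is a small Python sketch. The function name `encode_r` is hypothetical; the field widths (6, 5, 5, 5, 5, 6) and the register numbers used in the example ($t0 = 8, $s1 = 17, $s2 = 18, funct 0x20 for add) follow the standard MIPS32 conventions.

```python
def encode_r(opcode: int, rs: int, rt: int, rd: int, shamt: int, funct: int) -> int:
    """Pack the six R format fields into one 32-bit instruction word."""
    fields = ((opcode, 6), (rs, 5), (rt, 5), (rd, 5), (shamt, 5), (funct, 6))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit in its width")
        word = (word << width) | value  # append the field on the right
    return word


# add $t0, $s1, $s2: all-zero opcode, operation selected by funct = 0x20
print(f"{encode_r(0, 17, 18, 8, 0, 0x20):08X}")
```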

40
Register naming conventions

$s0, $s1, ..., $s7 for saved registers
Saved when we do a procedure call
$t0, ..., $t9 for temporary registers
Not saved when you do a procedure call
$0 is the zero register
Always contains zeroes
Other conventions are used for registers used in
procedure calls

41
R format instructions (I)

Arithmetic instructions
add $sa, $sb, $sc   # a = b + c
sub $sd, $se, $sf   # d = e - f
Logical instructions
and $sa, $sb, $sc   # a = b & c
or $sd, $se, $sf    # d = e | f
nor $sg, $sh, $si   # g = ~(h | i)
xor $sj, $sk, $sl   # j = (k & ~l) | (~k & l)

42
Notes

MIPS logical instructions are bitwise operations
Implement the bitwise & and | operations of C
MIPS has no negation instructions
Use NOR and specify register $0 as one of the two
input registers
$0 is hard-wired to contain zero
nor $sk, $sl, $0   # k = ~l
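
The NOR trick can be verified with a few lines of Python that model 32-bit registers as masked integers. The helper name `nor` and the sample value are illustrative only.

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a MIPS register


def nor(a: int, b: int) -> int:
    """Bitwise NOR of two 32-bit values."""
    return ~(a | b) & MASK32


l = 0x0000FFFF       # hypothetical register contents
k = nor(l, 0)        # NOR with the zero register yields the bitwise NOT of l
print(f"{k:08X}")
```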

43
R format instructions (II)

More arithmetic instructions
addu $s1, $s2, $s3
subu $s1, $s2, $s3
Unsigned versions of add and sub (they do not trap on overflow)
Multiply and divide instructions will be covered
later

44
R format instructions (III)

Shift instructions
sll $s1, $s0, n
srl $s1, $s0, n
Shift contents of source register $s0 by n bits
to the left (sll) or to the right (srl)
Fill the emptied bits with zero
Store results in destination register s1

45
Notes (II)

A right shift followed by a logical and can be
used to extract some specific bits
If we are interested in bits 14-15 of register
$s0
We set up a register $s2 containing 011 (binary)
We do
srl $t0, $s0, 14    # use temporary register $t0
and $s1, $t0, $s2   # answer is in $s1

46
Notes (III)

Want to extract bits xy in positions 15 and 14
srl $t0, $s0, 14
and $s1, $t0, $s2   # $s2 contains mask 011

$s0:  ????????????????xy??????????????
$t0:  00000000000000????????????????xy
$s1:  000000000000000000000000000000xy
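
The shift-then-mask idiom can be tried out in Python. The helper `srl` below mimics a logical right shift on a 32-bit value; the register contents are made up for the illustration.

```python
MASK32 = 0xFFFFFFFF


def srl(value: int, n: int) -> int:
    """Logical right shift of a 32-bit value: vacated bits are filled with zeroes."""
    return (value & MASK32) >> n


s0 = 0x1234B678      # hypothetical contents: bit 15 is 1 and bit 14 is 0
s2 = 0b011           # mask keeping the two rightmost bits
t0 = srl(s0, 14)     # bits 15 and 14 are now in positions 1 and 0
s1 = t0 & s2         # everything else is cleared
print(format(s1, "02b"))
```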
47
R format instructions (IV)

Register comparison instructions
slt $t0, $s3, $s4
sltu $t0, $s3, $s4
Sets register $t0 to 1 if $s3 < $s4 and to 0
otherwise
slt does a signed comparison
sltu does an unsigned comparison
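
The difference between the two comparisons only shows up when the leading bit of an operand is set. A hedged Python sketch, with helper names chosen for the illustration:

```python
MASK32 = 0xFFFFFFFF


def as_signed(x: int) -> int:
    """Interpret a 32-bit pattern as a two's complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def slt(a: int, b: int) -> int:
    """Signed comparison: 1 if a < b, 0 otherwise."""
    return 1 if as_signed(a) < as_signed(b) else 0


def sltu(a: int, b: int) -> int:
    """Unsigned comparison of the same bit patterns."""
    return 1 if (a & MASK32) < (b & MASK32) else 0


s3, s4 = 0xFFFFFFFF, 0x00000001   # -1 when signed, 4294967295 when unsigned
print(slt(s3, s4), sltu(s3, s4))
```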

48
R format instructions (V)
for big jumps

Jump instructions
jr $s0
Jump to address contained in register $s0
Since $s0 can contain a 32-bit address, the jump
can go anywhere
jalr $s0, $s1
Jump to address contained in register $s0 and
save address of next instruction in register $s1
(defaults to register 31)

49
The I format
constant/address

Last field can be
a 16-bit displacement: address of memory operand
is the sum of the contents of register rs and
this displacement
a 16-bit constant: register rt can then specify
a second register operand

50
Discussion

I format contains instructions involving
One register and a memory location
Two registers and an immediate value
Two registers and a jump address relative to the
current value of the program counter (PC)
MIPS instruction set uses the same format for
three very different instruction styles
Simplifies decoding hardware

51
Register and memory instructions

Transfer data
To a register load
From a register store
No arithmetic instructions on memory operands, unlike the IBM 360 IS
All arithmetic and logical operations are either
register to register or immediate to register

52
Register and memory instructions

Load instructions
lbu $s1, a($s2)
Load unsigned byte into register $s1 from memory
address obtained by adding the contents of
register $s2 to the address field a
lhu $s1, a($s2)
Load half-word unsigned

53
Register and memory instructions

Load instructions (contd)
lw $s1, a($s2)
Load word
ll $t1, a($s2)
Load linked: used with store conditional to
implement atomic locks
The famous spinlocks

Wait for COSC 4330
54
Register and memory instructions

Store instructions
sb $s1, a($s2)
Store least significant byte
sh $s1, a($s2)
Store least significant half-word
sw $s1, a($s2)
Store word

55
Register and memory instructions

Store instructions (contd)
sc $t1, a($s2)
Store conditional
Always follows a load linked
Fails if value in memory has changed since the load
linked instruction
Saves contents of $t1 in memory and sets $t1 to
one if store conditional succeeds or to zero
otherwise

Wait for COSC 4330
56
Immediate instructions

Immediate value is 16 bits wide
addi $s0, $s1, n
Store in register $s0 the sum of the contents of
register $s1 and the decimal value n
addiu $s0, $s1, n
Unsigned version of addi
andi $s0, $s1, n
ori $s0, $s1, n

57
Three missing instructions

No subi or subiu
Instead of subtracting an immediate value n, we
can add the negative value -n
No load immediate
Replaced by ori with the zero register
ori $s0, $0, n   # load value n into register $s0

? Typo in textbook?
58
Loading 32-bit constants (I)

lui $s0, n
load upper immediate
Loads the decimal value n into the 16 most
significant bits of register $s0
59
Loading 32-bit constants (II)

Works in combination with ori
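
A minimal Python sketch of the lui/ori combination, assuming that lui clears the 16 rightmost bits of its destination and that ori zero-extends its immediate. The helper functions and the constant 0x12345678 are illustrative only.

```python
MASK32 = 0xFFFFFFFF


def lui(n: int) -> int:
    """Place the 16-bit value n in the upper half of a register; lower half is zero."""
    return (n & 0xFFFF) << 16


def ori(register: int, n: int) -> int:
    """OR a register with a zero-extended 16-bit immediate."""
    return (register | (n & 0xFFFF)) & MASK32


constant = 0x12345678            # hypothetical 32-bit constant to load
s0 = lui(constant >> 16)         # upper 16 bits first
s0 = ori(s0, constant & 0xFFFF)  # then fill in the lower 16 bits
print(f"{s0:08X}")
```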

60
Other immediate instructions

Comparison instructions
slti $t0, $s3, n
sltiu $t0, $s3, n
Both set register $t0
To one if $s3 < sign-extended value of n
To zero otherwise
slti does a signed comparison
sltiu does an unsigned comparison

61
Notes

We should use immediate instructions whenever
possible
They are faster and use one register less than
the equivalent R instructions
To isolate bits 14-15 of register $s0, use
srl $t0, $s0, 14
andi $s1, $t0, 3   # decimal 3 = binary 011

62
Immediate branch instructions (I)

All immediate jump and branch instructions use a
very specific addressing scheme
Use the program counter as base register
Makes sense
Both register fields can be used for conditional
branch instructions

63
Immediate branch instructions (II)

Immediate field contains a signed number
Can jump ahead or behind the address indicated by
the current value of the PC + 4
Address is multiplied by four before being added
to the value of the PC

New PC = Old PC + 4 + 4n
64
Why PC + 4?

Because
By the time the MIPS CPU adds 4n to the PC
register, it will contain the address of the next
instruction

65
Immediate branch instructions (III)

Works well because
PC value is a multiple of four as instructions
are properly aligned at addresses that are
multiple of 4
Solution extends the range of the jump from ±
2^15 B = 32 KB to ± 2^17 B = 128 KB
Can use J or JR for bigger jumps
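
The PC-relative computation can be sketched in Python as follows. The function name `branch_target` and the addresses are hypothetical; the formula is the one used for immediate branches, New PC = Old PC + 4 + 4n, with n a signed 16-bit value.

```python
MASK32 = 0xFFFFFFFF


def branch_target(pc: int, immediate: int) -> int:
    """Target of a taken branch located at address pc."""
    immediate &= 0xFFFF
    # Interpret the 16-bit field as a signed number n.
    n = immediate - 0x10000 if immediate & 0x8000 else immediate
    return (pc + 4 + 4 * n) & MASK32


pc = 0x00400000                              # hypothetical branch address
print(f"{branch_target(pc, 3):08X}")         # three instructions past PC + 4
print(f"{branch_target(pc, 0xFFFF):08X}")    # n = -1 branches back to itself
```

Because n counts instructions rather than bytes, the 16-bit field reaches roughly 128 KB in either direction.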

66
Immediate branch instructions (IV)

beq $s0, $s1, a
Branch on equal
Jump to address computed by adding 4a to the
current value of the PC + 4 if the values of
registers $s0 and $s1 are equal
bne $s0, $s1, a
Branch on not equal

67
The J format (I)

Sole operand is a 26-bit address
Jump instructions
j a
Unconditionally jump to a new address
jal a
Unconditionally jump to a new address and store
address of next instruction in ra

68
The J format (II)

Note that jal has an implicit operand
Register ra (stands for return address)
Always register 31
In general, implicit operands
Allow more compact instruction formats
Complicate register management

69
Computing the new address

Obtained as follows
Bits 1:0 are zero (address is multiple of 4)
Bits 27:2 come from jump operand (26 bits)
Bits 31:28 come from PC + 4

Allows jumps to anywhere in the current 256 MB
region of memory
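
A short Python sketch of the J format address computation, assuming the standard MIPS32 layout in which the 26-bit operand supplies address bits 27 to 2 and the four leftmost bits come from PC + 4. The function name `jump_target` and the addresses are illustrative.

```python
def jump_target(pc: int, operand: int) -> int:
    """Target address of a j or jal instruction located at address pc."""
    upper = (pc + 4) & 0xF0000000          # four leftmost bits of PC + 4
    middle = (operand & 0x03FFFFFF) << 2   # 26-bit operand; two rightmost bits are zero
    return upper | middle


print(f"{jump_target(0x00400000, 0x0100000):08X}")
print(f"{jump_target(0x90000010, 1):08X}")
```

Since the four leftmost bits are inherited from the PC, a single j or jal stays inside the 256 MB region containing the next instruction; jr covers the remaining cases.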
70
Observations

The overall philosophy used to design the MIPS
instruction set was
Favor simplicity and regularity
Only three instruction formats
Optimize for the more frequent cases
Immediate to register instructions
Make good compromises

71
Comparison with IBM 360 IS

IBM 360                                MIPS
Variable length instructions           Fixed-size instructions
Sixteen GP registers                   Thirty-two GP registers
RR format has two register operands    R format has three register operands
RS format                              An option of I format
SI format                              No, but the equivalent of a non-existing RI format
SS format                              No equivalent

72
The stack

Used to store saved registers
LIFO structure starting at a high address value
and growing downwards
MIPS software reserves register 29 for the stack
pointer (sp)
We can push registers to the stack and pop them
later

73
The stack
[Figure: the stack in memory, with the stack pointer $sp]
74
Handling procedure calls

MIPS software uses the following conventions
$a0, ..., $a3 are the four argument registers that
can be used to pass parameters
$v0 and $v1 are the two value registers that can
be used to return values
$ra is the register containing the return address

75
Simple procedure call (I)

Before starting the new procedure, we must save
the registers used by the calling procedures
Not all of them
Save the eight registers $s0, $s1, ..., $s7
Do not save the ten temporary registers $t0, ...,
$t9
Must also restore these registers when we exit
the procedure

76
Simple procedure call (II)

At call time
addi $sp, $sp, -32   # eight times four bytes
sw $s7, 28($sp)
sw $s6, 24($sp)
sw $s5, 20($sp)
sw $s4, 16($sp)
sw $s3, 12($sp)
sw $s2, 8($sp)
sw $s1, 4($sp)
sw $s0, 0($sp)

Reality check: we will only save the registers
that will be reused by the callee
77
Simple procedure call (III)

At return time
lw $s0, 0($sp)       # restore the registers $s0 to $s7
lw $s1, 4($sp)
lw $s2, 8($sp)
lw $s3, 12($sp)
lw $s4, 16($sp)
lw $s5, 20($sp)
lw $s6, 24($sp)
lw $s7, 28($sp)
addi $sp, $sp, 32    # eight times four bytes
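
The save and restore sequences can be simulated in Python with a dictionary standing for the registers and another for memory. This is only a sketch: the initial register values and the starting stack pointer are made up, and `save_registers` and `restore_registers` are hypothetical names.

```python
def save_registers(regs: dict, memory: dict) -> None:
    """Mimic addi $sp, $sp, -32 followed by eight sw instructions."""
    regs["$sp"] -= 32                      # the stack grows downwards
    for i in range(8):
        memory[regs["$sp"] + 4 * i] = regs[f"$s{i}"]


def restore_registers(regs: dict, memory: dict) -> None:
    """Mimic eight lw instructions followed by addi $sp, $sp, 32."""
    for i in range(8):
        regs[f"$s{i}"] = memory[regs["$sp"] + 4 * i]
    regs["$sp"] += 32                      # shrink the stack


regs = {f"$s{i}": 100 + i for i in range(8)}   # made-up register contents
regs["$sp"] = 0x7FFFFFFC                       # made-up initial stack pointer
memory = {}
before = dict(regs)

save_registers(regs, memory)
for i in range(8):
    regs[f"$s{i}"] = 0                         # the callee overwrites $s0 to $s7
restore_registers(regs, memory)
print(regs == before)
```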

78
Nested procedures (I)

Procedures that call other procedures
Including themselves: recursive procedures
Likely to reuse argument registers and temporary
registers
Caller will save the argument registers and
temporary registers it will need after the call
Callee will save the saved registers of the
caller before reusing them

79
Nested procedures (II)

All saved registers are restored when the
procedure returns and the stack is shrunk

80
The assembler (I)

Helps the programmer with
Symbolic names
Arguments of jump and branch instructions are
labels rather than numerical constants
beq $s0, $s1, done
...
done:

81
The assembler (II)

Pseudo-instructions
To reserve memory locations for data
To create user-friendly specialized versions of
very general instructions
bzero $s0, address
for
beq $s0, $0, address

82
A review question

Which MIPS instructions involve
Three explicit operands?
Two explicit operands?
One explicit operand?

83
Answer (I)

Three explicit operands
All R format instructions but jr and jalr
All I format instructions involving an immediate
value but lui
All I format instructions involving a
conditional branch

84
Answer (II)

Two explicit operands
All I format instructions involving a load or a
store
Load upper immediate (lui)
Jump and link to register (jalr)

85
Answer (III)

One explicit operand
Jump register instruction (jr)
Both J format instructions
jal has an implicit second operand that it uses
to store the return address (ra)

86
OTHER EXAMPLES
87
Another RISC IS: ARM (I)

ARM stands for Advanced RISC Machine
Originally developed as a CPU architecture for a
desktop computer manufactured by Acorn in
Britain
Now dominates the embedded system market
Company has no foundry
It licenses its products to manufacturers

88
Another RISC IS: ARM (II)

Used in cell phones, embedded systems, ...
Same 32-bit instruction formats
Only 16 registers
Nine addressing modes
Including autoincrement
Branches use condition codes of arithmetic unit

89
Another RISC IS: ARM (III)

All instructions have a 4-bit condition code
allowing their conditional execution
Claimed to be faster than using a branch
NO zero register
Includes instructions like logical not, load
immediate, move (R to R)

91
The x86 instruction set (I)

Not the result of a well-thought design process
Evolved over the years
8086 was the first x86 processor architecture
Announced in 1978
Sixteen-bit architecture
Assembly language-compatible extension to Intel
8080 (8-bit microprocessor)

92
The x86 instruction set (II)

New instructions were added over the years,
mostly by Intel, but also by AMD
Floating-point instructions
Multimedia instructions
More floating-point instructions
Vector computing
80386 was the first 32-bit processor
Introduced more interesting instructions

93
The x86 instruction set (III)

AMD extended the x86 architecture to 64-bit
address space and registers in 2003
Intel followed AMD's lead the next year

94
The x86 instruction set (IV)

The result is a huge instruction set combining
Old instructions kept to ensure backward
compatibility
Golden handcuff
Newer 32-bit instructions that are in use today

95
A curious tradeoff

Complexity of x86 instruction set
Complicates the design of x86 microprocessors
More work for Intel and AMD architects
Effectively prevents other manufacturers from
designing and selling x86-compatible microprocessors
A mixed blessing for the duopoly Intel/AMD

96
Stack-oriented architectures (I)

A rarity in microprocessor architecture
Bad solution
Used in many interpreted languages
Java virtual machine, Python
Big exceptions are Dalvik and Lua
Register-based virtual machines

97
Interpreted vs. Compiled (I)

Compiled language
Machine code is directly executable

Source code → Compiler → Machine code
98
Interpreted vs. Compiled (II)

Interpreted language

Source code → Compiler → Bytecode
Bytecode → Interpreter → Results
99
Stack-oriented architectures (II)

Named registers replaced by a stack of registers

Two basic operations
PUSH <address>: push on stack contents of memory
address
POP <address>: pop top of stack and store it at
specified memory address

100
Stack-oriented architecture (III)

Binary arithmetic and logical operations
Have both operands on the top of the stack
Replace them by the result
Example

AB
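
A toy stack machine in Python makes the mechanism concrete. This is a sketch under stated assumptions: only PUSH and POP come from the slides, while the ADD operation, the `run` function, and the variable names A, B and C are invented for the illustration.

```python
def run(program: list, memory: dict) -> dict:
    """Execute a list of stack-machine instructions against a memory dictionary."""
    stack = []
    for op, *args in program:
        if op == "PUSH":                 # push contents of a memory address
            stack.append(memory[args[0]])
        elif op == "POP":                # pop top of stack into a memory address
            memory[args[0]] = stack.pop()
        elif op == "ADD":                # replace the two top operands by their sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown operation {op}")
    return memory


memory = {"A": 2, "B": 3, "C": 0}
run([("PUSH", "A"), ("PUSH", "B"), ("ADD",), ("POP", "C")], memory)
print(memory["C"])
```

The arithmetic instruction names no operand at all, which is why the object code is compact, but every value has to travel through memory.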
101
Tradeoff

Advantages
Simple compilers and interpreters
Very compact object code
Main disadvantage
More memory references
Cannot save partial results into a register

102
CONCLUSION
103
Good design principles (I)

Simplicity favors regularity
As few instruction formats as possible for easier
decoding
Smaller is faster
No complicated instructions

104
Good design principles (II)

We should make the common case faster
PC-relative addressing for branches
Good design requires good compromises
Limiting address displacements and immediate
values to 16 bits ensures that all instructions
can fit in a 32-bit word
Weird J format addressing mode

105
Myths (I)

Adding more powerful instructions will increase
performance
IBM 360 has an instruction saving multiple
registers
DEC VAX has an instruction computing polynomial
terms
These complex instructions limit pipelining
options

106
Myths (II)

Writing programs in assembly language produces
faster code than that generated by a compiler
Compilers are now better than humans for register
allocations
In addition, programs written in assembly
language cannot be ported to other CPU
architectures
Think of Macintosh programs