Lecture 3 Instruction Set Architecture - PowerPoint PPT Presentation

Slides: 59
Provided by: pradondet

1
Lecture 3: Instruction Set Architecture
  • Pradondet Nilagupta
  • Fall 2000
  • (original notes from Prof. Mike Schulte)

2
Overview ISA (I)
  • Concentrate on the ISA
  • Introduce a wide variety of design alternatives for
    instruction set architectures
  • Focus on four topics
  • Classify instruction set alternatives
  • Give a qualitative assessment of the advantages
    and disadvantages of the various approaches
  • Present and analyze instruction set
    measurements that are largely independent of a
    specific instruction set

3
Overview ISA (II)
  • Address the issue of languages and compilers and
    their bearing on the ISA
  • Show how these ideas are reflected in the DLX
    instruction set, which is typical of recent
    instruction set architectures
  • Examine a wide variety of architectural
    measurements
  • Measurements depend on the programs measured and
    on the compiler used in making these measurements

4
Hot Topics in Computer Architecture
  • 1950s and 1960s
  • Computer Arithmetic
  • 1970s and 1980s
  • Instruction Set Design
  • ISA Appropriate for Compilers
  • 1990s
  • Design of CPU
  • Design of memory system
  • Design of I/O system
  • Multiprocessors
  • Instruction Set Extensions

5
Instruction Set Architecture
  • Instruction set architecture is the structure of
    a computer that a machine language programmer
    must understand to write a correct (timing
    independent) program for that machine.
  • The instruction set architecture is also the
    machine description that a hardware designer must
    understand to design a correct implementation of
    the computer.

6
Instruction Set Architecture
  • The instruction set architecture serves as the
    interface between software and hardware

[Figure: layers from top to bottom: software, instruction set, hardware]

7
Interface Design
  • A good interface
  • Lasts through many implementations (portability,
    compatibility)
  • Is used in many different ways (generality)
  • Provides convenient functionality to higher
    levels
  • Permits an efficient implementation at lower
    levels

8
What Are the Components of an ISA?
  • Sometimes known as the Programmer's Model of the
    machine
  • Storage cells
  • General and special purpose registers in the CPU
  • Many general purpose cells of same size in memory
  • Storage associated with I/O devices
  • The machine instruction set
  • The instruction set is the entire repertoire of
    machine operations
  • Makes use of storage cells, formats, and results
    of the fetch/execute cycle
  • i.e., register transfers

9
What Are the Components of an ISA?
  • The instruction format
  • Size and meaning of fields within the instruction
  • The nature of the fetch-execute cycle
  • Things that are done before the operation code is
    known

10
Programmer's Models of Various Machines
[Figure: programmer's models of four machines, the M6800, VAX11, PPC601, and I8086 (introduced between 1975 and 1993), comparing their register sets, main memory capacities, and instruction counts]

11
What Must an Instruction Specify? (I)
Data Flow
  • Which operation to perform: add r0, r1, r3
  • Answer: the op code (add, load, branch, etc.)
  • Where to find the operand or operands: add r0, r1,
    r3
  • In CPU registers, memory cells, I/O locations, or
    as part of the instruction
  • Place to store the result: add r0, r1, r3
  • Again, a CPU register or memory cell

12
What Must an Instruction Specify? (II)
  • Location of the next instruction: add r0, r1, r3
    br endloop
  • Almost always the memory cell pointed to by the
    program counter (PC)
  • Sometimes there is no operand, or no result, or
    no next instruction. Can you think of examples?

13
Instructions Can Be Divided into 3 Classes (I)
  • Data movement instructions
  • Move data from a memory location or register to
    another memory location or register without
    changing its form
  • Load: source is memory and destination is register
  • Store: source is register and destination is
    memory
  • Arithmetic and logic (ALU) instructions
  • Change the form of one or more operands to
    produce a result stored in another location
  • Add, Sub, Shift, etc.

14
Instructions Can Be Divided into 3 Classes (II)
  • Branch instructions (control flow instructions)
  • Alter the normal flow of control from executing
    the next instruction in sequence
  • Br Loc, Brz Loc2: unconditional or conditional
    branches

15
Examples of Data Movement Instructions
Instruction       Meaning                                             Machine
MOV A, B          Move 16 bits from memory location A to location B   VAX11
LDA A, Addr       Load accumulator A with the byte at memory          M6800
                  location Addr
lwz R3, A         Move 32-bit data from memory location A to          PPC601
                  register R3
li $3, 455        Load the 32-bit integer 455 into register $3        MIPS R3000
mov R4, dout      Move 16-bit data from R4 to output port dout        DEC PDP11
IN, AL, KBD       Load a byte from in port KBD to accumulator         Intel Pentium
LEA.L (A0), A2    Load the address pointed to by A0 into A2           M68000
  • Lots of variation, even with one instruction type

16
Examples of ALU Instructions
Instruction       Meaning                                             Machine
MULF A, B, C      Multiply the 32-bit floating point values at        VAX11
                  mem locns. A and B, store at C
nabs r3, r1       Store the negative absolute value of r1 in r3       PPC601
ori $2, $1, 255   Store logical OR of reg $1 with 255 into reg $2     MIPS R3000
DEC R2            Decrement the 16-bit value stored in reg R2         DEC PDP11
SHL AX, 4         Shift the 16-bit value in reg AX left by 4 bit      Intel 8086
                  posns.
  • Notice again the complete dissimilarity of both
    syntax and semantics.

17
Examples of Branch Instructions
Instruction       Meaning                                             Machine
BLBS A, Tgt       Branch to address Tgt if the least significant      VAX11
                  bit of mem locn. A is set (i.e. = 1)
bun r2            Branch to location in R2 if result of previous      PPC601
                  floating point computation was Not a Number (NaN)
beq $2, $1, 32    Branch to location (PC + 4 + 32) if contents        MIPS R3000
                  of $1 and $2 are equal
SOB R4, Loop      Decrement R4 and branch to Loop if R4 ≠ 0           DEC PDP11
JCXZ Addr         Jump to Addr if contents of register CX = 0         Intel 8086

18
ISA Metrics
  • Orthogonality
  • No special registers, few special cases, all
    operand modes available with any data type or
    instruction type
  • Completeness
  • Support for a wide range of operations and target
    applications
  • Regularity
  • No overloading for the meanings of instruction
    fields
  • Streamlined
  • Resource needs easily determined
  • Ease of compilation (programming?), Ease of
    implementation, Scalability

19
Instruction Set Design Issues
  • Instruction set design issues include
  • Where are operands stored?
  • registers, memory, stack, accumulator
  • How many explicit operands are there?
  • 0, 1, 2, or 3
  • How is the operand location specified?
  • register, immediate, indirect, . . .
  • What types and sizes of operands are supported?
  • byte, int, float, double, string, vector. . .
  • What operations are supported?
  • add, sub, mul, move, compare . . .

20
Evolution of Instruction Sets
Single Accumulator (EDSAC 1950)
  → Accumulator + Index Registers
    (Manchester Mark I, IBM 700 series 1953)
  → Separation of Programming Model from Implementation
      • High-level Language Based (B5000 1963)
      • Concept of a Family (IBM 360 1964)
  → General Purpose Register Machines
      • Complex Instruction Sets (Vax, Intel 8086 1977-80)
      • Load/Store Architecture (CDC 6600, Cray 1 1963-76)
          → RISC (Mips, Sparc, 88000, IBM RS6000, . . . 1987)

21
Evolution of Instruction Sets
  • Major advances in computer architecture are
    typically associated with landmark instruction
    set designs
  • Ex.: Stack vs. GPR (System 360)
  • Design decisions must take into account
  • technology
  • machine organization
  • programming languages
  • compiler technology
  • operating systems
  • The design decisions in turn influence these.

22
Classifying ISAs
  • Accumulator (before 1960)
  • 1 address: add A    acc ← acc + mem[A]
  • Stack (1960s to 1970s)
  • 0 address: add    tos ← tos + next
  • Memory-Memory (1970s to 1980s)
  • 2 address: add A, B    mem[A] ← mem[A] + mem[B]
  • 3 address: add A, B, C    mem[A] ← mem[B] + mem[C]
  • Register-Memory (1970s to present)
  • 2 address: add R1, A    R1 ← R1 + mem[A]
  • load R1, A    R1 ← mem[A]
  • Register-Register (Load/Store) (1960s to
    present)
  • 3 address: add R1, R2, R3    R1 ← R2 + R3
  • load R1, R2    R1 ← mem[R2]
  • store R1, R2    mem[R1] ← R2

23
Ex. Expression Evaluation for 3-, 2-, 1-, and
0-Address Machines

  • The number of instructions and the number of
    addresses both vary
  • Discuss, as examples, the size of the code in each case

24
Stack Architectures
  • Instruction set
  • add, sub, mult, div, . . .
  • push A, pop A
  • Example: A*B - (A + C*B)
  • push A
  • push B
  • mul
  • push A
  • push C
  • push B
  • mul
  • add
  • sub

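Stack contents after each instruction (top of stack on the right):
  push A    A
  push B    A, B
  mul       A*B
  push A    A*B, A
  push C    A*B, A, C
  push B    A*B, A, C, B
  mul       A*B, A, C*B
  add       A*B, A + C*B
  sub       result
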
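The stack program for A*B - (A + C*B) can be checked with a small interpreter. This is only a sketch: the function name `run_stack`, the tuple encoding of instructions, and the sample values for A, B, and C are assumptions made for illustration, and `sub` is taken to compute next minus top of stack.

```python
def run_stack(program, memory):
    """Run a 0-address program; operands come implicitly from the stack."""
    stack = []
    for op, *operand in program:
        if op == "push":                 # push mem[X] onto the stack
            stack.append(memory[operand[0]])
        elif op == "pop":                # pop the top of stack into mem[X]
            memory[operand[0]] = stack.pop()
        elif op in ("add", "sub", "mul"):
            tos = stack.pop()            # top of stack
            nxt = stack.pop()            # element below the top
            if op == "add":
                stack.append(nxt + tos)
            elif op == "sub":
                stack.append(nxt - tos)
            else:
                stack.append(nxt * tos)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack


# Hypothetical sample values, chosen only for the illustration.
memory = {"A": 2, "B": 3, "C": 4}
program = [
    ("push", "A"), ("push", "B"), ("mul",),
    ("push", "A"), ("push", "C"), ("push", "B"), ("mul",),
    ("add",), ("sub",),
]
final_stack = run_stack(program, memory)
print(final_stack)  # [-8], i.e. 2*3 - (2 + 4*3)
```

Only `push` and `pop` name a memory location; the arithmetic instructions carry no address at all, which is where the good code density comes from.
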
25
The 0-Address, or Stack, Machine and Instruction
Format
26
Stacks: Pros and Cons
  • Pros
  • Good code density (implicit top of stack)
  • Low hardware requirements
  • Easy to write a simpler compiler for stack
    architectures
  • Cons
  • Stack becomes the bottleneck
  • Little ability for parallelism or pipelining
  • Data is not always at the top of the stack when
    needed, so additional instructions like TOP and
    SWAP are needed
  • Difficult to write an optimizing compiler for
    stack architectures

27
Accumulator Architectures
  • Instruction set
  • add A, sub A, mult A, div A, . . .
  • load A, store A
  • Example: A*B - (A + C*B)
  • load B
  • mul C
  • add A
  • store D
  • load A
  • mul B
  • sub D

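Accumulator contents after each instruction:
  load B    B
  mul C     B*C
  add A     A + B*C
  store D   A + B*C (copied to D)
  load A    A
  mul B     A*B
  sub D     result
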
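The accumulator program for A*B - (A + C*B) can be traced the same way. In this sketch, the name `run_accumulator`, the instruction encoding, and the sample values are assumptions made for illustration. The function also counts data memory accesses, since every instruction in this style names exactly one memory operand.

```python
def run_accumulator(program, memory):
    """Run a 1-address program; the accumulator is the implicit operand."""
    acc = 0
    accesses = 0                      # data memory reads and writes
    for op, addr in program:
        accesses += 1                 # every instruction touches memory once
        if op == "load":
            acc = memory[addr]
        elif op == "store":
            memory[addr] = acc
        elif op == "add":
            acc = acc + memory[addr]
        elif op == "sub":
            acc = acc - memory[addr]
        elif op == "mul":
            acc = acc * memory[addr]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return acc, accesses


# Hypothetical sample values, chosen only for the illustration.
memory = {"A": 2, "B": 3, "C": 4}
program = [
    ("load", "B"), ("mul", "C"), ("add", "A"), ("store", "D"),
    ("load", "A"), ("mul", "B"), ("sub", "D"),
]
result, accesses = run_accumulator(program, memory)
print(result, accesses)  # -8 7
```

Seven instructions produce seven data memory accesses, and the temporary D has to live in memory because there is nowhere else to keep it.
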
28
1-Address Machine and Instruction Format
[Figure: 1-address machine. The instruction carries one
memory address (Op1Addr) for operand 1; the accumulator in
the CPU is where to find operand 2 and where to put the
result; the program counter holds the address of the next
instruction.]
  • Need instructions to load and store operands: LDA
    OpAddr, STA OpAddr
  • Special CPU register, the accumulator, supplies 1
    operand and stores result
  • One memory address used for other operand

29
Accumulators: Pros and Cons
  • Pros
  • Very low hardware requirements
  • Easy to design and understand
  • Cons
  • Accumulator becomes the bottleneck
  • Little ability for parallelism or pipelining
  • High memory traffic

30
Memory-Memory Architectures
  • Instruction set
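  • (3 operands) add A, B, C; sub A, B, C; mul A, B, C
  • (2 operands) add A, B; sub A, B; mul A, B
  • Example: A*B - (A + C*B)
  • 3 operands:
      mul D, A, B    /* D = A*B */
      mul E, C, B    /* E = C*B */
      add E, A, E    /* E = A + C*B */
      sub E, D, E    /* E = A*B - (A + C*B) */
  • 2 operands:
      mov D, A
      mul D, B       /* D = A*B */
      mov E, C
      mul E, B       /* E = C*B */
      add E, A       /* E = A + C*B */
      sub D, E       /* D = A*B - (A + C*B) */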

31
The 2-Address Machine and Instruction Format
[Figure: 2-address machine. The instruction carries two
memory addresses, Op1Addr and Op2Addr; operand 2's location
also receives the result; the program counter in the CPU
holds the address of the next instruction.]
  • Result overwrites Operand 2
  • Needs only 2 addresses in instruction but less
    choice in placing data

32
Memory-Memory: Pros and Cons
  • Pros
  • Requires fewer instructions (especially if 3
    operands)
  • Easy to write compilers for (especially if 3
    operands)
  • Cons
  • Very high memory traffic (especially if 3
    operands)
  • Variable number of clocks per instruction
  • With two operands, more data movements are
    required

33
Register-Memory Architectures
  • Instruction set
  • add R1, A; sub R1, A; mul R1, B
  • load R1, A; store R1, A
  • Example: A*B - (A + C*B)
  • load R1, C
  • mul R1, B /* C*B */
  • add R1, A /* A + C*B */
  • store R1, D
  • load R2, A
  • mul R2, B /* A*B */
  • sub R2, D /* A*B - (A + C*B) */

34
Register-Memory: Pros and Cons
  • Pros
  • Some data can be accessed without loading first
  • Instruction format easy to encode
  • Good code density
  • Cons
  • Operands are not equivalent (poor orthogonality)
  • Variable number of clocks per instruction
  • May limit number of registers

35
Load-Store Architectures
  • Instruction set
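  • add R1, R2, R3; sub R1, R2, R3; mul R1, R2, R3
  • load R1, R4; store R1, R4
  • Example: A*B - (A + C*B)
  • load R1, &A /* R1 = address of A */
  • load R2, &B /* R2 = address of B */
  • load R3, &C /* R3 = address of C */
  • load R4, R1 /* R4 = A */
  • load R5, R2 /* R5 = B */
  • load R6, R3 /* R6 = C */
  • mul R7, R6, R5 /* C*B */
  • add R8, R7, R4 /* A + C*B */
  • mul R9, R4, R5 /* A*B */
  • sub R10, R9, R8 /* A*B - (A + C*B) */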
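The load-store sequence for A*B - (A + C*B) can be simulated as follows. This is a sketch under stated assumptions: the mnemonic `loadaddr` (place a variable's address in a register), the addresses in `symbols`, and the sample values are all hypothetical. The point it illustrates is that only `load` and `store` touch memory, while the arithmetic instructions work on registers alone.

```python
def run_load_store(program, memory, symbols):
    """Run a register-register program; only load/store reference memory."""
    regs = {}
    for op, *args in program:
        if op == "loadaddr":          # rd <- address of a variable (hypothetical mnemonic)
            rd, name = args
            regs[rd] = symbols[name]
        elif op == "load":            # rd <- mem[rs]
            rd, rs = args
            regs[rd] = memory[regs[rs]]
        elif op == "store":           # mem[ra] <- rs
            ra, rs = args
            memory[regs[ra]] = regs[rs]
        elif op in ("add", "sub", "mul"):
            rd, rs1, rs2 = args       # rd <- rs1 op rs2, registers only
            a, b = regs[rs1], regs[rs2]
            if op == "add":
                regs[rd] = a + b
            elif op == "sub":
                regs[rd] = a - b
            else:
                regs[rd] = a * b
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# Hypothetical addresses and values, chosen only for the illustration.
symbols = {"A": 100, "B": 104, "C": 108}
memory = {100: 2, 104: 3, 108: 4}
program = [
    ("loadaddr", "R1", "A"), ("loadaddr", "R2", "B"), ("loadaddr", "R3", "C"),
    ("load", "R4", "R1"), ("load", "R5", "R2"), ("load", "R6", "R3"),
    ("mul", "R7", "R6", "R5"),    # C*B
    ("add", "R8", "R7", "R4"),    # A + C*B
    ("mul", "R9", "R4", "R5"),    # A*B
    ("sub", "R10", "R9", "R8"),   # A*B - (A + C*B)
]
regs = run_load_store(program, memory, symbols)
print(regs["R10"])  # -8
```

The instruction count is higher than in the other styles, but A and B are each read from memory once and then reused from registers.
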

36
The 3-Address Machine and Instruction format
[Figure: CPU and memory of a 3-address machine]
  • Address of next instruction kept in a processor
    state register, the PC (except for explicit
    branches/jumps)
  • Rest of addresses in instruction
  • Discuss savings in instruction word size

37
Load-Store: Pros and Cons
  • Pros
  • Simple, fixed length instruction encoding
  • Instructions take similar number of cycles
  • Relatively easy to pipeline
  • Cons
  • Higher instruction count
  • Not all instructions need three operands
  • Dependent on good compiler

38
Registers: Advantages and Disadvantages
  • Advantages
  • Faster than cache (no addressing mode or tags)
  • Deterministic (no misses)
  • Can replicate (multiple read ports)
  • Short identifier (typically 3 to 8 bits)
  • Reduce memory traffic
  • Disadvantages
  • Need to save and restore on procedure calls and
    context switch
  • Can't take the address of a register (for
    pointers)
  • Fixed size (can't store strings or structures
    efficiently)
  • Compiler must manage

39
General Register Machine and Instruction Formats
40
General Register Machine and Instruction Formats
  • It is the most common choice in today's
    general-purpose computers
  • The register is specified by a small address (3
    to 6 bits for 8 to 64 registers)
  • Load and store have one long and one short
    address: 1½ addresses
  • Arithmetic instructions have 3 half addresses
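The "3 to 6 bits for 8 to 64 registers" figure is just the base-2 logarithm of the register count. A minimal sketch, with the hypothetical helper name `register_field_bits`:

```python
def register_field_bits(num_registers):
    """Bits needed to name one of num_registers registers."""
    if num_registers < 1:
        raise ValueError("need at least one register")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1
    return (num_registers - 1).bit_length()


for count in (8, 16, 32, 64):
    bits = register_field_bits(count)
    # A 3-register arithmetic instruction spends 3 such fields on operands.
    print(count, "registers:", bits, "bits per field,", 3 * bits, "bits for 3 fields")
```

With 32 registers, three register fields take 15 bits, far less than three full memory addresses would, which is why each register specifier is described as a "half address".
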

41
Real Machines Are Not So Simple
  • Most real machines have a mixture of 3-, 2-, 1-, 0-,
    and 1½-address instructions
  • A distinction can be made on whether arithmetic
    instructions use data from memory
  • If ALU instructions only use registers for
    operands and result, machine type is load-store
  • Only load and store instructions reference memory
  • Other machines have a mix of register-memory and
    memory-memory instructions

42
Big Endian Addressing
  • With Big Endian addressing, the byte with binary
    address
  • x . . . x00
  • is in the most significant position (big end) of
    a 32-bit word (IBM, Motorola, Sun, HP).

43
Little Endian Addressing
  • With Little Endian addressing, the byte with binary
    address
  • x . . . x00
  • is in the least significant position (little
    end) of a 32-bit word (DEC, Intel).
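The difference between the two byte orders can be seen by laying out one 32-bit word both ways. This sketch uses Python's built-in `int.to_bytes`; the sample value `0x0A0B0C0D` and the helper name `byte_at` are assumptions made for illustration. Offset 0 plays the role of the byte at address x...x00.

```python
word = 0x0A0B0C0D  # 32-bit value; 0x0A is the most significant byte

# Memory images of the word, lowest address first.
big = word.to_bytes(4, byteorder="big")
little = word.to_bytes(4, byteorder="little")


def byte_at(image, offset):
    """Return the byte stored at the given offset from the word's address."""
    return image[offset]


print(hex(byte_at(big, 0)))     # 0xa: most significant byte at the lowest address
print(hex(byte_at(little, 0)))  # 0xd: least significant byte at the lowest address
```

Reading the bytes back with the same byte order recovers the original word; reading them with the opposite order does not, which is the usual source of trouble when data moves between machines of different endianness.
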

44
Operand Alignment
  • An access to an operand of size s bytes at byte
    address A is said to be aligned if
  • A mod s = 0
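The alignment condition is a single modulo test. A minimal sketch, with the hypothetical helper name `is_aligned`:

```python
def is_aligned(address, size):
    """True if an access of `size` bytes at byte `address` is aligned."""
    return address % size == 0


print(is_aligned(8, 4))  # True: a 4-byte word at address 8
print(is_aligned(6, 4))  # False: a 4-byte word at address 6
print(is_aligned(6, 2))  # True: a 2-byte halfword at address 6
```

When the size is a power of two, the same test is often written as a check that the low-order address bits are zero.
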

45
Unrestricted Alignment
  • If the architecture does not restrict memory
    accesses to be aligned then
  • Software is simple
  • Hardware must detect misalignment and make 2
    memory accesses
  • Expensive detection logic is required
  • All references can be made slower
  • Sometimes unrestricted alignment is required for
    backwards compatibility
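The cost of a misaligned access can be estimated by counting how many memory words it touches. This sketch assumes a memory that delivers one aligned 4-byte word per access; the function name `word_accesses` and the default word size are assumptions made for illustration.

```python
def word_accesses(address, size, word_bytes=4):
    """Number of aligned memory words touched by an access.

    Assumes the memory system transfers one aligned word of
    `word_bytes` bytes per access.
    """
    first_word = address // word_bytes
    last_word = (address + size - 1) // word_bytes
    return last_word - first_word + 1


print(word_accesses(4, 4))  # 1: aligned word
print(word_accesses(6, 4))  # 2: word straddling a word boundary
print(word_accesses(3, 2))  # 2: halfword straddling a word boundary
```

An access that crosses a word boundary needs two memory accesses plus logic to merge the pieces, which is the extra hardware work the unrestricted-alignment case has to pay for.
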

46
Restricted Alignment
  • If the architecture restricts memory accesses to
    be aligned then
  • Software must guarantee alignment
  • Hardware detects misaligned accesses and traps
  • No extra time is spent when data is aligned
  • Since we want to make the common case fast,
    having restricted alignment is often a better
    choice, unless compatibility is an issue.

47
Types of Addressing Modes (VAX)
  • 1. Register direct: Ri
  • 2. Immediate (literal): n
  • 3. Displacement: M[Ri + n]
  • 4. Register indirect: M[Ri]
  • 5. Indexed: M[Ri + Rj]
  • 6. Direct (absolute): M[n]
  • 7. Memory indirect: M[M[Ri]]
  • 8. Autoincrement: M[Ri++]
  • 9. Autodecrement: M[Ri--]
  • 10. Scaled: M[Ri + Rj*d + n]
  • Studies by Clark and Emer indicate that modes
    1-4 account for 93% of all operands on the VAX.

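The ten addressing modes differ only in how the operand is located, which a small lookup function makes concrete. This is a sketch, not a model of real VAX encodings: the function name `fetch_operand`, the mode names, and the sample register and memory contents are assumptions. Autoincrement is taken to use the register and then add d; autodecrement subtracts d first and then uses the register.

```python
def fetch_operand(mode, regs, mem, ri=None, rj=None, n=0, d=1):
    """Return the operand selected by an addressing mode."""
    if mode == "register":              # Ri
        return regs[ri]
    if mode == "immediate":             # n
        return n
    if mode == "displacement":          # M[Ri + n]
        return mem[regs[ri] + n]
    if mode == "register_indirect":     # M[Ri]
        return mem[regs[ri]]
    if mode == "indexed":               # M[Ri + Rj]
        return mem[regs[ri] + regs[rj]]
    if mode == "direct":                # M[n]
        return mem[n]
    if mode == "memory_indirect":       # M[M[Ri]]
        return mem[mem[regs[ri]]]
    if mode == "autoincrement":         # use M[Ri], then Ri <- Ri + d
        value = mem[regs[ri]]
        regs[ri] += d
        return value
    if mode == "autodecrement":         # Ri <- Ri - d, then use M[Ri]
        regs[ri] -= d
        return mem[regs[ri]]
    if mode == "scaled":                # M[Ri + Rj*d + n]
        return mem[regs[ri] + regs[rj] * d + n]
    raise ValueError(f"unknown addressing mode: {mode}")


# Hypothetical register and memory contents for the illustration.
regs = {"R1": 100, "R2": 2}
mem = {100: 7, 102: 9, 104: 11, 108: 13, 7: 42}

print(fetch_operand("displacement", regs, mem, ri="R1", n=4))         # 11
print(fetch_operand("memory_indirect", regs, mem, ri="R1"))           # 42
print(fetch_operand("scaled", regs, mem, ri="R1", rj="R2", d=4))      # 13
```

Modes 1 to 4 are the first four branches of the function; the remaining modes mostly save an explicit add or load instruction, which is consistent with how rarely they are used.
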
48
Frequency of Immediate Addressing on DLX
  • Not all instructions can take advantage of
    immediate addressing.

49
Types of Operations
  • Arithmetic and Logic AND, ADD
  • Data Transfer MOVE, LOAD, STORE
  • Control BRANCH, JUMP, CALL
  • System OS CALL, VM
  • Floating Point ADDF, MULF, DIVF
  • Decimal ADDD, CONVERT
  • String MOVE, COMPARE
  • Graphics (DE)COMPRESS

50
80x86 Instruction Frequency
51
Relative Frequency of Control Instructions
  • Design hardware to handle branches quickly,
    since these occur most frequently

52
Frequency of Operand Sizes on 32-bit Load-Store
Machine
  • For floating-point operations, we want good
    performance for 64-bit operands.
  • For integer operations, we want good performance
    for 32-bit operands.

53
Encoding an Instruction set
  • The encoding must balance several competing forces
  • The desire to have as many registers and addressing
    modes as possible
  • The impact of the size of the register and addressing
    mode fields on the average instruction size and
    hence on the average program size
  • The desire to have instructions encode into lengths
    that will be easy to handle in the implementation

54
Three choices for encoding the instruction set
  • Variable
  • Instruction length varies based on opcode and
    address specifiers
  • For example, VAX instructions vary between 1 and
    53 bytes
  • Good code density, but difficult to decode
  • Fixed
  • Only a single size for all instructions
  • For example, DLX, MIPS, PowerPC, and SPARC all have
    32-bit instructions
  • Not as good code density, but easier to decode
  • Hybrid
  • Have multiple format lengths specified by the
    opcode
  • For example, IBM 360/370 and Intel 80x86
  • Compromise between code density and ease of decode
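A fixed-length encoding is easy to decode because every field sits at a known bit position. The sketch below packs a register-register instruction into one 32-bit word. The field layout (6-bit opcode, three 5-bit register fields, 11-bit function code) and the names `encode_r` and `decode_r` are assumptions made for illustration, in the style of a 32-bit load-store machine; they are not taken from the slides.

```python
# Assumed layout, most significant bits first:
# opcode(6) | rs1(5) | rs2(5) | rd(5) | func(11)  = 32 bits
FIELDS = (("opcode", 6), ("rs1", 5), ("rs2", 5), ("rd", 5), ("func", 11))


def encode_r(opcode, rs1, rs2, rd, func):
    """Pack the five fields into a single 32-bit instruction word."""
    word = 0
    for (name, bits), value in zip(FIELDS, (opcode, rs1, rs2, rd, func)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
        word = (word << bits) | value   # append the field below the previous ones
    return word


def decode_r(word):
    """Unpack a 32-bit word into (opcode, rs1, rs2, rd, func)."""
    values = []
    shift = 32
    for _, bits in FIELDS:
        shift -= bits
        values.append((word >> shift) & ((1 << bits) - 1))
    return tuple(values)


word = encode_r(opcode=0, rs1=2, rs2=3, rd=1, func=0x20)
print(hex(word))       # 0x430820
print(decode_r(word))  # (0, 2, 3, 1, 32)
```

Decoding needs only fixed shifts and masks, with no dependence on earlier bytes of the instruction. A variable-length format has to examine the opcode and each address specifier before it even knows where the instruction ends.
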

55
Compilers and ISA
  • Compiler Goals
  • All correct programs compile correctly
  • Most compiled programs execute quickly
  • Most programs compile quickly
  • Achieve small code size
  • Provide debugging support
  • Multiple Source Compilers
  • Same compiler can compile different languages
  • Multiple Target Compilers
  • Same compiler can generate code for different
    machines

56
Compiler Phases
  • Compilers use phases to manage complexity
  • Front end
  • Convert language to intermediate form
  • High level optimizer
  • Procedure inlining and loop transformations
  • Global optimizer
  • Global and local optimization, plus register
    allocation
  • Code generator (and assembler)
  • Dependency elimination, instruction selection,
    pipeline scheduling

57
Allocation of Variables
  • Stack
  • used to allocate local variables
  • grown and shrunk on procedure calls and returns
  • register allocation works best for
    stack-allocated objects
  • Global data area
  • used to allocate global variables and constants
  • many of these objects are arrays or large data
    structures
  • impossible to allocate to registers if they are
    aliased
  • Heap
  • used to allocate dynamic objects
  • heap objects are accessed with pointers
  • never allocated to registers

58
Designing ISA to Improve Compilation
  • Provide enough general purpose registers to ease
    register allocation (more than 16).
  • Provide regular instruction sets by keeping the
    operations, data types, and addressing modes
    orthogonal.
  • Provide primitive constructs rather than trying
    to map to a high-level language.
  • Simplify trade-off among alternatives.
  • Allow compilers to help make the common case fast.