Kein Folientitel

About This Presentation

Title: Kein Folientitel ("No slide title")
Author: Udi
Last modified by: brinks
Created Date: 7/4/2002 8:33:44 PM
Document presentation format: On-screen presentation (PowerPoint PPT presentation)

Slides: 57

Provided by: udi7

Transcript and Presenter's Notes

1
Computer Architecture Slide Sets, WS 2011/2012
Prof. Dr. Uwe Brinkschulte, Prof. Dr. Klaus Waldschmidt
Part 7: Instruction Set Architecture (ISA)
2
Programming model
The Instruction Set Architecture (ISA) is the programming model needed to program a processor.
Details of the processor's implementation are outside the scope of the ISA.
The ISA can therefore be regarded as an abstract interface between the compiler and the microarchitecture of the processor.
3
Programming model

The following key questions lead us to the specification of this interface:
- How is data represented?
- Where is data stored?
- How is data accessed?
- How are instructions coded?
- Which instructions are available to process data?

4
Programming model

Therefore, the ISA defines:
- machine data types
- address space organisation
- register model
- addressing modes
- machine instruction set

5
Programming model
Since the programming model abstracts from implementation details, it can be realized either in hardware (real processors) or in software (virtual processors).
The implementation still has to provide everything the model promises. For instance, if the instruction set includes an instruction for multiplication, the CPU of the processor needs a digital combinational circuit for multiplication.
In this sense, a relation between the abstract ISA and the microarchitecture exists.
6
Machine data types

A data type is a pair consisting of a set of values and the operations which can be performed on these values.
The operations are implemented by the machine instructions.
Machine data types (like data types in high level languages) are classified into structured and unstructured data types.
The primitive data types form an additional class.

7
Primitive machine data types

Bit
- value set: {0, 1}
- operations: AND, OR, XOR, negation, compare
Byte
- value set: bit pattern (8 bit)
- normally the smallest addressable unit
- operations: same as for bit, additionally ADD, SUB, MUL, DIV, SHIFT, ROTATE, ...
Word
- value set: normally a multiple of bytes
- largest addressable unit (in a single operation)
- operations: same as for byte
(Sometimes the following convention is used: Half-Word = 16 bit, Word = 32 bit, Double Word = 64 bit)
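The byte operations listed above can be illustrated with a minimal Python sketch. The function names (add8, shl8, rol8) are made up for this illustration and do not come from the slides; the point is only that results are confined to 8 bits.

```python
BYTE_MASK = 0xFF  # a byte holds 8 bits


def add8(a: int, b: int) -> int:
    """ADD on bytes: the result wraps around modulo 2**8."""
    return (a + b) & BYTE_MASK


def shl8(a: int, n: int) -> int:
    """SHIFT left: bits shifted out at the top are lost, zeros enter at the bottom."""
    return (a << n) & BYTE_MASK


def rol8(a: int, n: int) -> int:
    """ROTATE left: bits leaving at the top re-enter at the bottom."""
    n %= 8
    return ((a << n) | (a >> (8 - n))) & BYTE_MASK


print(hex(add8(0xF0, 0x20)))  # 0x10, the carry out of bit 7 is dropped
print(hex(shl8(0x81, 1)))     # 0x2, the top bit is lost
print(hex(rol8(0x81, 1)))     # 0x3, the top bit re-enters at bit 0
```

The only difference between shift and rotate is what happens to the bit that leaves the byte, which is why both usually exist as separate machine instructions.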

8
Examples for more data types
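(taken from MC680x0)
- bit vector: bits n-1 ... 0, with n = 8, 16, 32
- BCD number (binary coded decimal): one decimal digit per 4 bits, i.e. 2 digits (1, 0) in 8 bit, 4 digits (3 ... 0) in 16 bit, 8 digits (7 ... 0) in 32 bit
- unsigned binary number: bits n-1 (MSB) ... 0 (LSB), with n = 8, 16, 32
- two's complement number: bits n-1 (MSB = sign bit) ... 0 (LSB), with n = 8, 16, 32
- floating point number (32 bit): sign s in bit 31, biased exponent in bits 30 ... 23, fraction in bits 22 ... 0
- string: a sequence of elements of n bits each, with n = 8, 16, 32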
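As a rough illustration of how the same bit pattern is read under different data types, the following Python sketch decodes a two's complement number, a packed BCD number and the fields of a 32 bit floating point number. It assumes that the floating point layout is IEEE 754 single precision (sign in bit 31, exponent in bits 30 to 23, fraction in bits 22 to 0); the function names are hypothetical.

```python
import struct


def twos_complement(value: int, n: int) -> int:
    """Interpret an n bit pattern as a two's complement number."""
    sign_bit = 1 << (n - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def bcd_to_int(value: int, digits: int) -> int:
    """Interpret a packed BCD pattern (4 bits per decimal digit)."""
    result = 0
    for i in reversed(range(digits)):
        result = result * 10 + ((value >> (4 * i)) & 0xF)
    return result


def float32_fields(x: float):
    """Split a 32 bit float into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


print(twos_complement(0xFF, 8))   # -1, while the unsigned reading is 255
print(bcd_to_int(0x1234, 4))      # 1234
print(float32_fields(-1.5))       # (1, 127, 4194304)
```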
9
Address space organisation
Physical organisation: depends on the processor
[Figure: memory drawn as a column of physical words at physical addresses 0 ... n, once for each of three cases: 8 bit processor (word bits 7 ... 0), 16 bit processor (word bits 15 ... 0) and 32 bit processor (word bits 31 ... 0).]
n: physical address, n = 2^(address bus width)
10
Address space organisation
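Logical organisation: byte oriented access for most processor types
[Figure: memory drawn as a column of bytes (bits 7 ... 0) at logical addresses 0, 1, 2, 3, ... m; one byte forms a physical word on an 8 bit processor, two consecutive bytes on a 16 bit processor and four consecutive bytes on a 32 bit processor.]
m: logical address, m = n * bit width / 8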
11
Address space organisation
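Physical to logical mapping
[Figure: a 32 bit wide memory (bits 31 ... 0) with physical addresses 0 ... n. Each physical word holds four logical (byte) addresses: the first word holds logical addresses 0 to 3, the following words hold 4 to 7, 8 to 11 and 12 to 15, and the last word holds m-3 to m.]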
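The mapping between logical (byte) addresses and physical words can be sketched as a division with remainder. The helper below is an illustration only; its name and return convention are assumptions, and which bit lanes of the word a given byte offset occupies depends on the byte order.

```python
def logical_to_physical(logical_address: int, word_width_bits: int):
    """Return (physical word address, byte offset inside that word)."""
    bytes_per_word = word_width_bits // 8
    return divmod(logical_address, bytes_per_word)


for width in (8, 16, 32):
    print(width, logical_to_physical(13, width))
# 8 (13, 0)
# 16 (6, 1)
# 32 (3, 1)
```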
12
Address space organisation
Aligned access: the accessed word is aligned according to its length in the physical address space:
(logical address mod length) = 0
- bytes to byte boundaries
- half-words to half-word boundaries
- words to word boundaries
[Figure: a 32 bit wide memory (bits 31 ... 0, physical addresses 0 ... n) in which one physical word holds either four bytes, two half-words or one word.]
13
Address space organisation
Unaligned (misaligned) access: the accessed word is not aligned according to its length in the physical address space:
(logical address mod length) ≠ 0
[Figure: a 32 bit wide memory (bits 31 ... 0, physical addresses 0 ... n) in which a half-word and a word each straddle the boundary between two physical words.]
Some processors do not support unaligned access (e.g. SPARC).
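A small Python sketch of the alignment rule, assuming a 32 bit wide physical memory (4 bytes per physical word). The second helper, whose name is made up here, counts how many physical words an access touches, which is the reason why unaligned accesses are slower or unsupported.

```python
def is_aligned(logical_address: int, length: int) -> bool:
    """Aligned means: (logical address mod length) = 0."""
    return logical_address % length == 0


def physical_words_touched(logical_address: int, length: int,
                           bytes_per_word: int = 4) -> int:
    """Number of physical words covered by an access of `length` bytes."""
    first = logical_address // bytes_per_word
    last = (logical_address + length - 1) // bytes_per_word
    return last - first + 1


print(is_aligned(8, 4), physical_words_touched(8, 4))  # True 1
print(is_aligned(6, 4), physical_words_touched(6, 4))  # False 2
print(is_aligned(3, 2), physical_words_touched(3, 2))  # False 2
```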
14
Byte order in words
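Two different formats:

Big endian byte ordering
- 8 bit byte: address N
- 16 bit word: bits 15 ... 8 at address N, bits 7 ... 0 at address N+1
- 32 bit word: bits 31 ... 24 at address N, bits 23 ... 16 at N+1, bits 15 ... 8 at N+2, bits 7 ... 0 at N+3
The word address is the address of the most significant byte (used e.g. in MC680x0 or SPARC).

Little endian byte ordering
- 8 bit byte: address N
- 16 bit word: bits 15 ... 8 at address N+1, bits 7 ... 0 at address N
- 32 bit word: bits 31 ... 24 at address N+3, bits 23 ... 16 at N+2, bits 15 ... 8 at N+1, bits 7 ... 0 at N
The word address is the address of the least significant byte (used e.g. in the Pentium family).
N: least significant byte, N+3: most significant byte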
15
Byte order in words
Logical (byte oriented) memory organization of a 32 bit word
[Figure: the four bytes of a 32 bit word at byte addresses N, N+1, N+2 and N+3, shown once for big endian byte ordering (most significant byte at N) and once for little endian byte ordering (least significant byte at N).]
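The two byte orders can be made visible with Python's built-in int.to_bytes and int.from_bytes. The value 0x12345678 is an arbitrary example; index 0 of each byte string corresponds to byte address N.

```python
value = 0x12345678

big = value.to_bytes(4, "big")        # most significant byte at address N
little = value.to_bytes(4, "little")  # least significant byte at address N

print(big.hex())     # 12345678
print(little.hex())  # 78563412

# Reading memory with the wrong byte order yields a different number.
misread = int.from_bytes(big, "little")
print(hex(misread))  # 0x78563412
```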
16
Register model

The number of registers in a processor varies between about 20 and 200. Compared with DRAM or SRAM memory, storing data in registers has the following advantages:
- faster access time
- register addresses are shorter, which saves bits in the instruction format
An ISA is called a load-store ISA if all machine instructions except the register load and store instructions operate on the register file only.

17

Register model

Registers are classified into hidden registers and programmer-visible registers.
The visible registers are the workplace of the programmer and are often organized as register files.
Hidden registers are auxiliary registers needed for the internal functionality of the processing unit (CPU).
Both visible and hidden registers are designed for various purposes and functions.

18

Register model

A register model defines which processor registers are visible (addressable) to the programmer.
Usually these are the working registers and the state register.
The state register reflects the state of the processor through condition flags.
It shows, for example, whether the processor operates in system or user mode.
The state register is mostly read-only.
Commonly existing hidden registers are the instruction register and the memory interface registers.

19
Register implementation
[Figure: a 32 bit register built from 32 D-latches with data inputs D0 ... D31, outputs Q0 ... Q31 and a common clock clk, shown next to its block symbol.]
[Figure: an asynchronous counter with outputs Q0 ... Q3 built from four D-latches driven by clk.]
20

Common visible registers

- program counter (PC): contains the address of the next instruction
- state register (SR): reflects the state of the processor
- stack pointer (SP): stores the address of the top of the stack
- accumulator (ACCU): stores computation results (in older or simple processors)
- data registers (DXi): store operands for computations
- address registers (AXi): store operand addresses
- general purpose registers (GPi): store either operands or operand addresses

21

Common hidden registers

- instruction register (IR): contains the currently processed instruction
- instruction queue (IQ): contains the next instructions to be processed
- memory address register (MAR): buffers the address of a memory access (e.g. to save or load a general purpose register)
- memory data register (MDR): buffers the content of a memory access (e.g. to save or load a general purpose register)

22

Program counter register

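Pointer to the next instruction to be executed
Normally incremented
Set by a jump, jump subroutine, interrupt, return or return from interrupt instruction

[Figure: the 32 bit program counter (bits 31 ... 0) pointing into a memory with instruction addresses N-4, N, N+4 and M; the instructions shown are an Add with operands A and B and a Jump to address M, which sets the program counter to M.]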

23

Stackpointer register

Addresses a location in the memory which is organized as a stack (LIFO).
Elements can be pushed (written) and popped (read) only at the top of the stack.
Consequence: data are stored in consecutive order.
Used e.g. to save and restore the PC for jump subroutine/return operations.

[Figure: the 32 bit stack pointer (bits 31 ... 0) pointing into a memory with addresses N-4, N and N+4; Push X and Pop move the stack pointer.]

Some processors distinguish between a user stack pointer (e.g. for jump subroutine/return) and a supervisor stack pointer (e.g. for interrupt/return from interrupt).
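How the program counter and the stack pointer cooperate on a subroutine call can be sketched with a toy model. This is not a model of any particular processor: the class name TinyCpu, the 4 byte instruction size and the stack growing towards lower addresses are assumptions made for the illustration.

```python
class TinyCpu:
    """Toy model of PC and SP; all design choices are illustrative assumptions."""

    def __init__(self, stack_top: int):
        self.pc = 0
        self.sp = stack_top  # assumption: stack grows towards lower addresses
        self.memory = {}

    def push(self, value: int) -> None:
        self.sp -= 4
        self.memory[self.sp] = value

    def pop(self) -> int:
        value = self.memory[self.sp]
        self.sp += 4
        return value

    def step(self) -> None:
        self.pc += 4  # normal case: PC is incremented to the next instruction

    def jump_subroutine(self, target: int) -> None:
        self.push(self.pc + 4)  # return address = instruction after the call
        self.pc = target

    def return_from_subroutine(self) -> None:
        self.pc = self.pop()


cpu = TinyCpu(stack_top=0x1000)
cpu.pc = 0x100
cpu.jump_subroutine(0x400)
print(hex(cpu.pc), hex(cpu.sp))  # 0x400 0xffc
cpu.step()
cpu.return_from_subroutine()
print(hex(cpu.pc), hex(cpu.sp))  # 0x104 0x1000
```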
24

Sample CISC register set
Intel Pentium
25

Sample RISC register set
PowerPC (extract)

The register file of RISC processors has to be
much bigger compared to CISC processors.
A RISC needs more registers, because the register
file is source and destination of all arithmetic
or logic instructions.

26
Multiple register sets
27
Multiple register sets
Processors with multiple register sets: a step towards multithreaded processors

Processor with multiple register sets
- Each register set can store the program counter (PC) and the state register (SR)
- PC and SR exist only once
=> several contexts can be stored, fast context switching

Multithreaded processor
- multiple PCs and SRs exist
- instructions from several threads can be executed at the same time in the pipeline
=> several contexts can be processed

28
Multiple overlapping register sets, register windows

The registers of a register file are grouped into blocks called windows.
These overlapping windows are used by the subroutines of a program.
MORS (multiple overlapping register set)

[Figure: the overall register set divided into register windows 1, 2, 3, ... n; a jump subroutine moves to the next window, a return moves back to the previous one.]
29
Multiple overlapping register sets, register windows

Simplifies parameter passing on jumping to subroutines
Each subroutine has its own working space within the register file
Parameters can be passed directly, with no need to copy registers or pass parameters through memory
=> mainly used in RISC processors
Two possible approaches:
- Fixed size register window
- Variable size register window

30
Fixed size window
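[Figure, based on the SPARC architecture: three consecutive register windows. The preceding window (In i+1, Local i+1, Out i+1), the current window (In i, Local i, Out i) selected by the current window pointer CWP, and the succeeding window (In i-1, Local i-1, Out i-1). Within a window, In occupies r31 ... r24, Local r23 ... r16 and Out r15 ... r8. The Out registers of one window are the In registers of the next (Out i+1 = In i, Out i = In i-1). The global registers r7 ... r0 are shared by all windows. save moves to the succeeding window, restore moves back to the preceding one.]
Alternative register naming: r31 = i7 ... r24 = i0, r23 = l7 ... r16 = l0, r15 = o7 ... r8 = o0, r7 = g7 ... r0 = g0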
31
Fixed size window

In the SPARC architecture, 32 registers are addressable at a time: the 8 global registers and a window of 24 registers, of which the 8 in registers also belong to the preceding window and the 8 out registers also belong to the succeeding window.
The registers are addressed relative to the current window pointer (CWP).
A subroutine call is performed by moving the CWP to the succeeding window and saving the PC.
The parameters are passed through the overlapping registers of the two windows.
The content of the program counter (the return address) is saved into one of these registers.
A time-consuming save and reload of registers is avoided.
In case of an overflow of the MORS, window contents have to be saved to a stack.
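The overlap of fixed size windows can be sketched with a small model of a windowed register file. Several things here are assumptions for illustration: the number of windows (8), the class and method names, and the convention that save decrements the CWP as on SPARC. Window overflow handling and the special role of g0 are not modeled.

```python
NWINDOWS = 8  # assumption: number of windows in the physical register file


class WindowedRegisterFile:
    def __init__(self):
        self.global_regs = [0] * 8           # r0..r7, shared by all windows
        self.windowed = [0] * (NWINDOWS * 16)  # 16 new registers per window
        self.cwp = 0

    def _slot(self, r: int) -> int:
        if r < 24:   # outs r8..r15 and locals r16..r23 of the current window
            offset = self.cwp * 16 + (r - 8)
        else:        # ins r24..r31 alias the outs of window cwp + 1
            offset = (self.cwp + 1) * 16 + (r - 24)
        return offset % (NWINDOWS * 16)

    def read(self, r: int) -> int:
        return self.global_regs[r] if r < 8 else self.windowed[self._slot(r)]

    def write(self, r: int, value: int) -> None:
        if r < 8:
            self.global_regs[r] = value
        else:
            self.windowed[self._slot(r)] = value

    def save(self) -> None:      # subroutine entry: switch to succeeding window
        self.cwp = (self.cwp - 1) % NWINDOWS

    def restore(self) -> None:   # subroutine exit: switch back
        self.cwp = (self.cwp + 1) % NWINDOWS


rf = WindowedRegisterFile()
rf.write(8, 42)     # caller puts a parameter into o0 (r8)
rf.write(16, 5)     # caller's local l0 (r16)
rf.save()
print(rf.read(24))  # 42: the callee sees the parameter in i0 (r24)
print(rf.read(16))  # 0: the callee has its own locals
rf.write(24, 7)     # callee returns a result through i0
rf.restore()
print(rf.read(8))   # 7: the caller finds the result in o0
```

No register contents are copied on the call; only the window pointer changes, which is the time saving the slide refers to.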

32
Variable size window
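[Figure, based on the AMD 29000 architecture: the register file consists of global registers (gr0, gr1, ... gr63) and local registers. Windows of variable size (In, Local, Out) lie in the local registers; the preceding window starts at the previous register stack pointer (RSP) and the current window at the current RSP. Within a window the registers are named r0, r1, ... relative to the RSP.]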
33
Register size of processors with 3-address architecture

processor/architecture (vendor)   # of general purpose registers   bit width
                                  overall   directly accessible    register   register address   immediate operands   instr.
Alpha 21364 (Compaq)              32        32                     64 bit     5 bit              8 bit                32 bit
Am29000 (AMD)                     192       192                    32 bit     8 bit              8 bit                32 bit
ARM7TDMI (ARM)                    16        16                     32 bit     4 bit              8 bit                32 bit
Crusoe TM5800 (Transmeta)         64        64                     32 bit     6 bit              -                    -
PA-8700 (HP)                      32        32                     64 bit     5 bit              11 bit               32 bit
Itanium 2 (Intel, HP)             128       128                    64 bit     7 bit              8 bit                41 bit
MC88100 (Motorola)                32        32                     32 bit     5 bit              16 bit               32 bit
MIPS64 20Kc (MIPS)                32        32                     64 bit     5 bit              16 bit               32 bit
Nemesis C (TU Berlin)             96        16                     32 bit     4 bit              1 bit                16 bit
PowerPC 970 (IBM)                 32        32                     64 bit     5 bit              16 bit               32 bit
UltraSPARC III Cu (SUN)           160       32                     64 bit     5 bit              13 bit               32 bit
34
Register size of processors with 2-address architecture

processor (vendor)            # of general purpose registers   bit width
                              overall   directly accessible    register   register address   immediate operands   smallest instr.
Athlon (AMD x86-64)           16        16                     64 bit     4 bit              8 - 32 bit           8 bit
ColdFire MCF5206 (Motorola)   8 + 8     8 + 8                  32 bit     3 bit              8 - 32 bit           16 bit
MC680xx (Motorola)            8 + 8     8 + 8                  32 bit     3 bit              8 - 32 bit           16 bit
Pentium X (Intel x86)         8         8                      32 bit     3 bit              8 - 32 bit           8 bit
35
Addressing modes

Machine instructions normally hold information
about the operand addresses.
This can either be a physical address, e.g. a
register number or the address of a memory
location, or it can be an address specification.
An address specification defines how to calculate
the address.
Thus, the address information determines the
location of the operand(s) belonging to the
instruction using one of many addressing modes.

36
Addressing modes

Instruction format, e.g. of an arithmetic instruction:

opcode | target | source | source

The opcode defines the operands needed for the execution.
Each operand field contains either the operand itself, a register number, a memory location or an address specification (dynamic address calculation).
The result of the dynamic address calculation is called the effective address.
37

Addressing modes

immediate: The operand is part of the instruction.
memory direct and register direct: The instruction contains the operand address.
register indirect: The instruction contains a register number pointing to a register which holds the address of the operand. In assembler code this addressing mode is typically denoted by (register name).

38

Addressing modes

memory indirect: A register addressed in the instruction contains the address of a memory cell which holds the operand address.
register offset: The instruction contains a register number and an offset. The operand address is the sum of the register's content and the offset.
implicit: The instruction implicitly targets a single register (like the ACCU).
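The addressing modes described so far differ only in how many lookups separate the instruction from its operand. The following sketch makes this explicit; the register names, memory contents and function names are invented for the example.

```python
# Hypothetical machine state for the illustration.
registers = {"r2": 2000, "r3": 2004}
memory = {2000: 42, 2004: 3000, 2008: 7, 3000: 99}


def operand_immediate(value):
    return value                          # the operand is in the instruction


def operand_register_direct(reg):
    return registers[reg]                 # the operand is in the register


def operand_memory_direct(address):
    return memory[address]                # the operand is at the given address


def operand_register_indirect(reg):
    return memory[registers[reg]]         # the register holds the operand address


def operand_register_offset(reg, offset):
    return memory[registers[reg] + offset]  # address = register content + offset


def operand_memory_indirect(reg):
    return memory[memory[registers[reg]]]   # register -> memory cell -> operand address


print(operand_immediate(8))               # 8
print(operand_register_direct("r2"))      # 2000
print(operand_memory_direct(2000))        # 42
print(operand_register_indirect("r2"))    # 42
print(operand_register_offset("r2", 8))   # 7
print(operand_memory_indirect("r3"))      # 99
```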

39

Effective address
The address is calculated at runtime from several parts found in the instruction and in registers or memory cells (dynamic address calculation). The calculated address is called the effective address.

Reasons for using dynamic address calculation:
- The address of a data structure element is composed of the start address of the data structure and the offset of the element from the start. Often this offset is unknown at compile time, therefore the effective address has to be calculated at runtime.
- Repeated execution of the same instruction, e.g. in a loop, often accesses successive memory addresses which have to be calculated at runtime.

40

Effective address (cont.)

An operand address is often unknown at compile time, because it is calculated during program execution.
Partitioning an address into a base address stored in a register and an offset simplifies the handling of relocatable variables and relocatable program code.

41
Addressing modes 1
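immediate: the instruction contains the operand itself.
e.g. LOAD 8, r1
memory direct: the instruction contains the effective address of the operand in memory.
e.g. LOAD (2000), r1
register direct: the instruction contains the effective address (register number) of the register holding the operand.
e.g. LOAD r2, r1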
42
Addressing modes 2
register indirect: the instruction contains a register address; the register holds the effective address of the operand in memory.
e.g. LOAD (r2), r1
register indirect with predecrement: the instruction contains a register address; the memory address held in the register is decremented first, and the decremented value is the effective address of the operand in memory.
e.g. LOAD -(r2), r1
43
Addressing modes 3
register indirect with displacement (indexed): the instruction contains a register address and a displacement; the effective address of the operand in memory is the sum of the memory address held in the register, the displacement and the content of an index register, which is scaled by 1, 2 or 4.
e.g. LOAD.B 126(r3)(r2), r1
44
Addressing modes 4
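memory indirect: the instruction contains a register address and two displacements; the memory address held in the register plus displacement1 selects a memory cell which holds the indirect memory address; the indirect memory address plus displacement2 is the effective address of the operand in memory.
e.g. LOAD 28(126(r2)), r1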
45
Addressing modes 5
memory indirect (post indexed): as memory indirect, but the content of an index register (scaled by 1, 2 or 4) is added after the memory indirection: the effective address is the indirect memory address plus displacement2 plus the scaled index.
e.g. LOAD.B 28(r3)(126(r2)), r1
46
Addressing modes 6
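memory indirect (preindexed): as memory indirect, but the content of an index register (scaled by 1, 2 or 4) is added before the memory indirection: the memory address held in the register plus displacement1 plus the scaled index selects the memory cell which holds the indirect memory address; the indirect memory address plus displacement2 is the effective address.
e.g. LOAD.B 28(126(r3)(r2)), r1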
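The effective address calculations of the indexed and memory indirect modes can be compared side by side in a short sketch. It assumes, for the example instructions, that r2 is the base register, r3 the index register, 126 is displacement1 and 28 is displacement2; the slides do not state this mapping explicitly, and the function names and memory contents are invented.

```python
regs = {"r2": 1000, "r3": 4}
mem = {1126: 5000, 1130: 6000}  # cells holding indirect memory addresses


def ea_indexed(base, disp, index, scale=1):
    return regs[base] + disp + regs[index] * scale


def ea_memory_indirect(base, disp1, disp2):
    return mem[regs[base] + disp1] + disp2


def ea_post_indexed(base, disp1, disp2, index, scale=1):
    # index is added after the memory indirection
    return mem[regs[base] + disp1] + disp2 + regs[index] * scale


def ea_pre_indexed(base, disp1, disp2, index, scale=1):
    # index is added before the memory indirection
    return mem[regs[base] + disp1 + regs[index] * scale] + disp2


print(ea_indexed("r2", 126, "r3"))           # 1130
print(ea_memory_indirect("r2", 126, 28))     # 5028
print(ea_post_indexed("r2", 126, 28, "r3"))  # 5032
print(ea_pre_indexed("r2", 126, 28, "r3"))   # 6028
```

Post indexing and preindexing use the same ingredients but read different memory cells, so they generally yield different effective addresses.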
47
Access to branch target table by PC relative addressing
JMP disp (PC)(rn)
[Figure: branch target table access through program counter relative addressing: the table with the entries target 0, target 1, target 2, ... lies in memory at (PC) + displacement, and the index register rn selects the entry.]
48
Machine instruction set

The machine instruction set of a computer normally includes instructions of different formats, e.g. 0-address, 1-address, 2-address and 3-address instructions.
An instruction is divided into so-called fields.
For a constant instruction length, the more address fields an instruction contains, the fewer bits remain for each address field and for the opcode field, i.e. the smaller the number of addressable memory cells and/or the number of operations that can be encoded.
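The trade-off between address fields and opcode bits is simple arithmetic, sketched below for a constant 32 bit instruction with 5 bit register addresses (values as they appear for several processors in the 3-address table). The helper name is hypothetical, and real formats also reserve bits for immediates and flags, which this sketch ignores.

```python
def bits_left_for_opcode(instruction_bits: int, address_fields: int,
                         bits_per_address: int) -> int:
    """Bits that remain once all address fields are reserved."""
    remaining = instruction_bits - address_fields * bits_per_address
    if remaining < 0:
        raise ValueError("address fields do not fit into the instruction")
    return remaining


for fields in range(4):
    print(fields, "address fields:", bits_left_for_opcode(32, fields, 5), "bits left")
# 0 address fields: 32 bits left
# 1 address fields: 27 bits left
# 2 address fields: 22 bits left
# 3 address fields: 17 bits left
```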

49
Variable length vs. constant length instruction format

Variable length (e.g. 16 - 256 bit), mostly used in CISC architectures:
- flexible instruction format
- high code density
- long immediate and displacement values
Constant length (e.g. 32 bit), mostly used in RISC architectures:
- simple and fast fetch
- simple and fast decode
- simplified pipelining

50
Scheme of basic operations of common processors
[Figure: classification tree of the basic operations, with the categories conditional and unconditional operations; combinatorial operations, consisting of transport operations (load, store) and arithmetic logic operations (arithmetic, logic and shift); control flow operations, consisting of simple branches, subroutine branches (call, return) and system branches (call, return); semaphore operations; state and control operations.]
51

Instruction classes

Instruction sets are divided into groups which combine instructions with similar functionality.
Typical instruction groups:
- transport instructions
- arithmetic instructions
- logic instructions
- shift and rotate instructions
- bitwise instructions
- string and array instructions
- branch instructions
- system instructions
- synchronization instructions

52
Load store architecture

All instructions, except load and store instructions, address registers only.
Load and store instructions are needed to transfer data to and from main memory.
Mainly used in RISC ISAs; combined with pipelining, it allows most instructions to complete in one cycle.
Furthermore, the address fields of instructions become shorter, as they only have to hold a register number instead of a memory address.
A load-store ISA accelerates a machine if there are only small caches or no caches at all and a big register file is available.

53
Two examples for an instruction format
Example: an arithmetic instruction
SUBc r3, r7, r21
binary code: 11010 10101 00111 00011 1 00000000000
hex code: D54E3800
instruction format (bit positions): OP 31-27 | TR 26-22 | SR1 21-17 | SR2 16-12 | c/x 11 | bits 10-0 are zero in this example
OP: opcode, TR: target register, SRn: source register, c/x: set/do not set condition code

Example: a store instruction
STORE r24, 126(r5)
binary code: 00111 11000 00101 00000000001111110
hex code: 3E0A007E
instruction format (bit positions): OP 31-27 | SR 26-22 | BR 21-17 | DP 16-0
OP: opcode, SR: source register, BR: base register, DP: displacement (signed)
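The two encodings can be checked by assembling the fields with shifts and masks. The field positions and opcode values are taken from the examples on this slide; the function names are made up, and treating the 17 bit displacement as two's complement is an assumption based on the remark that it is signed.

```python
def encode_arithmetic(op: int, tr: int, sr1: int, sr2: int, set_cc: bool) -> int:
    """OP 31-27 | TR 26-22 | SR1 21-17 | SR2 16-12 | c/x 11 | zeros 10-0."""
    return (op << 27) | (tr << 22) | (sr1 << 17) | (sr2 << 12) | (int(set_cc) << 11)


def encode_store(op: int, sr: int, br: int, displacement: int) -> int:
    """OP 31-27 | SR 26-22 | BR 21-17 | DP 16-0 (17 bit two's complement)."""
    return (op << 27) | (sr << 22) | (br << 17) | (displacement & 0x1FFFF)


def decode_store(word: int):
    """Return (op, sr, br, signed displacement)."""
    dp = word & 0x1FFFF
    if dp & 0x10000:  # sign bit of the 17 bit field
        dp -= 0x20000
    return word >> 27, (word >> 22) & 0x1F, (word >> 17) & 0x1F, dp


# SUBc r3, r7, r21: target r21, sources r7 and r3, condition code set
print(f"{encode_arithmetic(0b11010, 21, 7, 3, True):08X}")  # D54E3800
# STORE r24, 126(r5)
print(f"{encode_store(0b00111, 24, 5, 126):08X}")           # 3E0A007E
print(decode_store(0x3E0A007E))                             # (7, 24, 5, 126)
```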
54
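State register of a RISC processor (based on SPARC architecture)
[Figure: layout of the 32 bit state register SR (bits 31 ... 0, with a boundary between bits 16 and 15).]
Fields:
- S: supervisor/user
- PS: previous S-bit
- IE: interrupt enable
- IM: interrupt mask
- N, Z, V, C: conditional bits (negative, zero, overflow, carry)
- CWP: current window pointer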
55
Conditional codes dependent on the conditional bits Z (zero), N (negative), C (carry) and V (overflow).
Mnemonics according to Motorola's ColdFire MCF5206 processor.

conditional value         mnemonic   operation   expression                  operand type
equal                     eq         =           Z                           independent
not equal                 ne         ≠           NOT Z                       independent
higher than               hi         >           NOT C AND NOT Z             unsigned
higher than or same       hs         ≥           NOT C                       unsigned
lower than                lo         <           C                           unsigned
lower than or same        ls         ≤           C OR Z                      unsigned
greater than              gt         >           NOT (N XOR V) AND NOT Z     signed
greater than or equal     ge         ≥           NOT (N XOR V)               signed
less than                 lt         <           N XOR V                     signed
less than or equal        le         ≤           (N XOR V) OR Z              signed
arithmetic overflow       vs                     V                           signed
no arithmetic overflow    vc                     NOT V                       signed
negative                  mi                     N                           signed
positive                  pl                     NOT N                       signed
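The expressions in the table can be tried out by computing the four conditional bits of a comparison (a subtraction a - b whose result is discarded) and evaluating the conditions on them. This is a sketch for 8 bit operands; the function and dictionary names are hypothetical, and the carry bit is modeled as the borrow of the subtraction.

```python
def flags_after_compare(a: int, b: int, bits: int = 8) -> dict:
    """Conditional bits N, Z, V, C after computing a - b on `bits` bit operands."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "N": bool(result & sign),
        "Z": result == 0,
        "C": a < b,  # borrow: b is larger in the unsigned interpretation
        "V": bool((a ^ b) & (a ^ result) & sign),  # signed overflow
    }


CONDITIONS = {
    "eq": lambda f: f["Z"],
    "ne": lambda f: not f["Z"],
    "hi": lambda f: not f["C"] and not f["Z"],
    "hs": lambda f: not f["C"],
    "lo": lambda f: f["C"],
    "ls": lambda f: f["C"] or f["Z"],
    "gt": lambda f: f["N"] == f["V"] and not f["Z"],
    "ge": lambda f: f["N"] == f["V"],
    "lt": lambda f: f["N"] != f["V"],
    "le": lambda f: f["N"] != f["V"] or f["Z"],
    "vs": lambda f: f["V"],
    "vc": lambda f: not f["V"],
    "mi": lambda f: f["N"],
    "pl": lambda f: not f["N"],
}

# 200 and 100 as 8 bit patterns: unsigned 200 > 100, signed -56 < 100
f = flags_after_compare(200, 100)
print(CONDITIONS["hi"](f), CONDITIONS["lt"](f))  # True True
```

The same bit patterns compare differently depending on whether the unsigned or the signed condition is used, which is why both families of mnemonics exist.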
56
Multimedia instructions

Typical SIMD instructions perform a single operation on a set of data (e.g. changing the brightness of image pixels).
Operations can be on packed integers (e.g. MMX on Pentium) or packed floats (e.g. SSE2 on Pentium).
Typical operations: arithmetic (saturating or with overflow), logic, compare, pack, unpack
Example
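The difference between saturating and overflowing arithmetic on packed data can be sketched for the brightness case. The code below is a scalar emulation in plain Python, not real SIMD code: a hardware SIMD instruction would process all packed bytes at once. The function names and pixel values are invented for the illustration.

```python
def packed_add_saturated(pixels: bytes, delta: int) -> bytes:
    """Add delta to every 8 bit pixel, clamping the result to 0..255."""
    return bytes(min(255, max(0, p + delta)) for p in pixels)


def packed_add_wraparound(pixels: bytes, delta: int) -> bytes:
    """Add delta to every 8 bit pixel, letting the result overflow modulo 256."""
    return bytes((p + delta) & 0xFF for p in pixels)


pixels = bytes([10, 200, 250, 255])
print(list(packed_add_saturated(pixels, 20)))   # [30, 220, 255, 255]
print(list(packed_add_wraparound(pixels, 20)))  # [30, 220, 14, 19]
```

With overflow, bright pixels suddenly turn dark, which is why saturating arithmetic is the usual choice for image data.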
