Instruction Set Principles - PowerPoint PPT Presentation

About This Presentation

Title:

Instruction Set Principles

Slides: 42

Provided by: DavidPa61

Learn more at: http://cse.unl.edu

Transcript and Presenter's Notes

Title: Instruction Set Principles

1
Instruction Set Principles

ISA should reflect application characteristics:
Desktop computing is compute-intensive, thus
focusing on features favoring integer and FP ops.
Server computing is data-intensive, focusing on
integers and character strings (yet FP ops are
still standard in servers).
Embedded computing is time-sensitive and memory-
and power-conscious, thus focusing on code density,
real-time behavior, and media data streams.

2
Instruction Set Principles

Taxonomy of ISA
Stack: both operands are implicit on the top of
the stack, a data structure in which items are
accessed in a last-in, first-out fashion.
Accumulator: one operand is implicit in the
accumulator, a special-purpose register.
General Purpose Register: all operands are
explicit in specified registers or memory
locations. Depending on where operands are
specified and stored, there are three different
ISA groups:
Register-Memory: one operand in a register and one
in memory. Examples: IBM 360/370, Intel 80x86
family, Motorola 68000.
Memory-Memory: both operands are in memory.
Example: VAX.
Register-Register (load-store): all operands,
except for those in load and store instructions,
are in registers. Examples: SPARC (Sun
Microsystems), MIPS, Precision Architecture (HP),
PowerPC (IBM), Alpha (DEC).

3
Instruction Set Principles

Taxonomy of ISA Examples

Code sequences for C = A + B in each ISA class
(each diagram shows operands flowing between
memory, the ALU, and the named storage):
(a) Stack (top of stack, TOS):
Push A; Push B; Add; Pop C
(b) Accumulator:
Load A; Add B; Store C
(c) Register-Memory (register set):
Load R1,A; Add R1,B; Store R1,C
(d) Reg-Reg/Load-Store (register set):
Load R1,A; Load R2,B; Add R3,R1,R2; Store R3,C
(e) Memory-Memory:
Add C,A,B or Add A,B
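
The stack and load-store sequences for C = A + B can be traced with a toy interpreter. This is only an illustrative sketch: the helper names `run_stack` and `run_load_store`, the tuple-based instruction format, and the dictionary used as memory are assumptions made for this example, not part of the slides.

```python
# Toy interpreters for two ISA classes (hypothetical helpers, for illustration only).

def run_stack(program, memory):
    """Stack machine: operands are implicit on the top of the stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "pop":
            memory[args[0]] = stack.pop()
        else:
            raise ValueError(f"unknown op {op}")
    return memory


def run_load_store(program, memory):
    """Load-store machine: only load/store touch memory; add uses registers."""
    regs = {}
    for op, *args in program:
        if op == "load":
            regs[args[0]] = memory[args[1]]
        elif op == "add":
            dst, src1, src2 = args
            regs[dst] = regs[src1] + regs[src2]
        elif op == "store":
            memory[args[1]] = regs[args[0]]
        else:
            raise ValueError(f"unknown op {op}")
    return memory


stack_prog = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
ls_prog = [("load", "R1", "A"), ("load", "R2", "B"),
           ("add", "R3", "R1", "R2"), ("store", "R3", "C")]

stack_result = run_stack(stack_prog, {"A": 2, "B": 3})
ls_result = run_load_store(ls_prog, {"A": 2, "B": 3})
```

Both programs leave the same value in C; the load-store version needs the same number of instructions here but names every operand explicitly.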
4
Instruction Set Principles

Comparisons

ISA Type (# of memory addresses, max # of operands)
Register-register (0,3)
Advantages: Simple, fixed-length instruction
encoding; simple code generation model;
instructions take similar numbers of cycles to
execute.
Disadvantages: Higher IC than ISAs with memory
references in instructions. Higher instruction
count and lower instruction density lead to
larger programs.
Register-memory (1,2)
Advantages: Data can be accessed without a
separate load instruction first. Instruction
format tends to be easy to encode and yields good
code density.
Disadvantages: Operands are not equivalent since a
source operand in a binary operation is destroyed.
Encoding a register number and a memory address in
each instruction may restrict the number of
registers. CPI varies by operand location.
Memory-memory (2,2) or (3,3)
Advantages: Most compact. Does not waste registers
for temporaries.
Disadvantages: Large variation in instruction
size, especially for three-operand instructions.
In addition, large variation in CPI. Memory
accesses create a memory bottleneck (no longer
used today).
5
Instruction Set Principles

Addressing Memory: how to specify and interpret
memory addresses is important since all data are
initially in memory.
Interpreting Memory Addresses
All computers, except DSPs, are byte-addressed,
providing access to bytes, half-words (2 bytes),
words (4 bytes), and double words (8 bytes)
Ordering bytes within a larger object (8 bytes in
a double word):
Little Endian
Big Endian
Byte ordering can be a problem when exchanging
data between computers with different ordering
conventions
Alignment of bytes: an access to an object of
size s bytes at byte address A is aligned if A
mod s = 0. Memory is typically aligned on a
multiple of a word or double-word boundary
Misalignment causes extra memory accesses and HW
costs

Byte numbering within a double word:
Little Endian: 7 6 5 4 3 2 1 0
Big Endian: 0 1 2 3 4 5 6 7
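
Byte ordering and the alignment rule (A mod s = 0) can be checked directly. The sketch below uses Python's standard `struct` module; the sample value and the helper name `is_aligned` are assumptions chosen for illustration.

```python
import struct

value = 0x0102030405060708  # an 8-byte double word

# "<Q" packs an unsigned 64-bit integer little-endian, ">Q" big-endian.
little = struct.pack("<Q", value)  # least significant byte at the lowest address
big = struct.pack(">Q", value)     # most significant byte at the lowest address

# Reading little-endian bytes with the wrong convention scrambles the value.
misread = struct.unpack(">Q", little)[0]


def is_aligned(address, size):
    """An access of `size` bytes at `address` is aligned if address mod size == 0."""
    return address % size == 0
```

The `misread` value shows why exchanging raw data between machines with different conventions is a problem: the same eight bytes decode to a different integer.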
6
Instruction Set Principles

Addressing Modes: how an ISA specifies the address
of an object to be accessed (fig. 2.6-2.7)
Operands: they can be found in registers, memory
locations, and the instructions themselves
(instruction stream)
Effective Address: specifies the actual memory
address when a memory location is used for an
operand
PC-Relative Addressing: addressing modes that
depend on the program counter
Immediates/Literals: considered memory
addressing modes, even though the value they
access is in the instruction stream
Displacement Mode: must determine the range of
displacement judiciously (via quantitative
studies, fig. 2.8)
Immediate/Literal Mode: must decide the level of
support (all ops or a subset) and the range of
values (fig. 2.9-2.10)
Modulo/Circular Mode: for DSPs, handling an
infinite, continuous stream of data relies on
circular buffers
Bit-Reverse Mode: used exclusively for FFTs
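
The two DSP addressing modes above amount to simple index arithmetic. A minimal sketch, assuming hypothetical helper names `circular_next` and `bit_reverse` (real DSPs perform these updates in address-generation hardware):

```python
def circular_next(index, step, length):
    """Modulo/circular addressing: wrap the index at the end of the buffer."""
    return (index + step) % length


def bit_reverse(index, bits):
    """Bit-reverse addressing: reverse the low `bits` bits of the index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Access order for an 8-point FFT (3 address bits).
fft_order = [bit_reverse(i, 3) for i in range(8)]
```

With these definitions, stepping past the last slot of an 8-entry buffer returns to slot 0, and `fft_order` lists the shuffled element order an 8-point FFT works with.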

7
Instruction Set Principles

Type and Size of Operands: encoding in the opcode
designates operand types in all modern-day
computers, while tags were used to indicate types
in old machines
Desktop and Server architectures
Character: 8-bit, usually in ASCII
16-bit Unicode, used in Java, is gaining
popularity
Integers are almost universally represented as
two's complement binary numbers: short integer
(half-word), integer (word), long integer
(double-word)
Single-precision (1-word) and double-precision
(2-word) floating point: the IEEE floating-point
standard, IEEE standard 754
Architectures supporting business applications
Packed decimal/binary-coded decimal: 4 bits are
used to encode the values 0-9 and two decimal
digits are packed into each byte, for getting
results that exactly match decimal numbers (some
decimal fractions do not have an exact
representation in binary)
Frequency of access to types helps determine what
types are most important to support efficiently
(fig. 2.12)
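
Packed decimal can be illustrated in a few lines. This is a sketch under stated assumptions: the helper names `pack_bcd` and `unpack_bcd` are hypothetical, and placing the first digit in the high nibble is one possible convention, not something the slides specify.

```python
def pack_bcd(digits):
    """Pack a string of decimal digits two per byte (high nibble first, by assumption)."""
    if len(digits) % 2:
        digits = "0" + digits  # pad to an even number of digits
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))


def unpack_bcd(data):
    """Recover the decimal digit string from packed BCD bytes."""
    return "".join(f"{byte >> 4}{byte & 0x0F}" for byte in data)


packed = pack_bcd("1234")

# Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly,
# which is the motivation for decimal formats in business applications.
binary_is_inexact = (0.1 + 0.2 != 0.3)
```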

8
Instruction Set Principles

Operands for Media and Signal Processing
Graphics applications deal with 2D and 3D images
Vertex: a data structure with four components,
usually 32-bit floating-point values, for
representing 3D images: x-coordinate,
y-coordinate, z-coordinate, and w-coordinate
(color or hidden surfaces)
Pixel: consists of four 8-bit channels: R (red),
G (green), B (blue), and A (transparency of the
surface or pixel)
DSPs add a unique data type:
Fixed point: a binary point just to the right of
the sign bit, thus representing a fraction
between -1 and +1
Blocked floating point: so called because the
exponent variable is often shared among many
fixed-point variables (a fixed-point word does
not include an exponent, so the DSP programmer
must keep the exponent in a separate variable and
ensure that each result is shifted left or right
to keep alignment).
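
The DSP fixed-point format, with the binary point just right of the sign bit, can be sketched as follows. The 16-bit word size (often called Q15) and the helper names are assumptions for illustration; the slides do not fix a width.

```python
Q15_SCALE = 1 << 15  # 15 fraction bits after the sign bit (16-bit word assumed)


def to_q15(x):
    """Convert a real number in [-1, 1) to a 16-bit fixed-point integer, clamping at the ends."""
    n = int(round(x * Q15_SCALE))
    return max(-Q15_SCALE, min(Q15_SCALE - 1, n))


def from_q15(n):
    """Interpret a fixed-point integer as a fraction between -1 and +1."""
    return n / Q15_SCALE


def q15_mul(a, b):
    """Multiply two Q15 values; the shift restores the binary point."""
    return (a * b) >> 15


half = to_q15(0.5)
quarter = q15_mul(half, half)
```

Because the product of two Q15 numbers has 30 fraction bits, the right shift in `q15_mul` is the kind of realignment the programmer must manage by hand on a DSP.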

9
Instruction Set Principles

Operations in the Instruction Set (fig. 2.15)
Rule of thumb: the most widely executed
instructions are the simple operations of an
instruction set (fig. 2.16)
Operations for Media and Signal Processing: less
precision and narrower data widths suffice due to
the tolerance of human perception
Partitioned add: four 16-bit adds performed on a
single 64-bit ALU in a single cycle (SIMD or
vector instructions, fig. 2.17)
Paired operations: one instruction can launch two
32-bit operations on operands found side by side
in a double-precision register
Saturated arithmetic: due to real-time
requirements, a DSP does not allow exception
handling and must tolerate overflow by
substituting the largest representable number
for the result
Multiply-accumulate (MAC): key to dot-product
operations for vector and matrix multiplies
(MACs/second is the primary peak performance
metric for DSPs)
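
The three media operations above can be modeled with plain integer arithmetic. A minimal sketch, assuming 16-bit signed lanes for saturation and hypothetical helper names; actual hardware performs each operation in a single instruction.

```python
INT16_MAX = 32767
INT16_MIN = -32768


def saturating_add16(a, b):
    """Saturated arithmetic: clamp on overflow instead of raising an exception."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


def partitioned_add16(x, y):
    """Partitioned add: four independent 16-bit adds inside 64-bit words.

    Carries are not propagated between lanes (each lane wraps around).
    """
    result = 0
    for lane in range(4):
        shift = 16 * lane
        a = (x >> shift) & 0xFFFF
        b = (y >> shift) & 0xFFFF
        result |= ((a + b) & 0xFFFF) << shift
    return result


def dot_product(xs, ys):
    """Dot product as a chain of multiply-accumulate (MAC) steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y  # one MAC
    return acc


lanes = partitioned_add16(0x0001_FFFF_0003_0004, 0x0001_0001_0001_0001)
```

In `lanes`, the lane holding 0xFFFF wraps to 0 without disturbing its neighbor, which is what distinguishes a partitioned add from an ordinary 64-bit add.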

10
Instruction Set Principles

Instructions for Control Flow
There are four different types of control flow
change (fig. 2.19)
Conditional branch: 75% of integer and 82% of FP
control flow instructions
How to specify branch conditions? (fig. 2.21-2.22)
Jump (or unconditional branch): 6% integer and
10% FP
Procedure calls and procedure returns: 19% and 8%
Caller saving vs. callee saving
Addressing Modes for Control Flow Instructions
PC-relative: advantageous where targets are near
the branch instruction, and has the desirable
property of position independence (fig. 2.20)
Register indirect jumps: if the target is not
known at compile time, PC-relative addressing
cannot be used; rather, a location (typically a
register) is used to specify the target
dynamically
Case or switch statements in most languages
Virtual functions or methods in OO languages
Higher-order functions or function pointers in C
or C++
Dynamically shared libraries
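
Both control-flow addressing modes can be mimicked in a few lines. This sketch makes two assumptions for simplicity: the PC-relative offset is measured from the branch instruction's own address (many ISAs measure from the next instruction), and a Python dictionary of functions stands in for a jump table read by a register-indirect jump.

```python
def branch_target(pc, offset):
    """PC-relative addressing: the instruction encodes only a signed offset."""
    return pc + offset


# The same encoded offset works wherever the code is loaded (position independence).
OFFSET = 0x30
target_at_low_base = branch_target(0x1010, OFFSET)
target_at_high_base = branch_target(0x8010, OFFSET)


def dispatch(jump_table, selector, default):
    """Register-indirect jump: the target is looked up at run time."""
    handler = jump_table.get(selector, default)
    return handler()


table = {0: lambda: "case 0", 1: lambda: "case 1"}
chosen = dispatch(table, 1, lambda: "default")
fallback = dispatch(table, 7, lambda: "default")
```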

11
Instruction Set Principles

Encoding an Instruction Set: there are three
choices
Variable: allows virtually all addressing modes
to be used with all operations, enabling the
smallest code representation
Examples: VAX and Intel 80x86 (1-5 operands, each
with 10 addressing modes)
Fixed: load-store ISA, with only one memory
operand and only one or two addressing modes,
thus being able to encode the addressing mode as
part of the opcode
Examples: Alpha, ARM, MIPS, PowerPC, SPARC,
SuperH
Largest code size
Hybrid: IBM 360/370, MIPS16, Thumb, TI TMS320C54x
(fig. 2.23)
Competing forces: the number of registers and
addressing modes, their impact on code size, and
ease of handling in a pipelined implementation

Variable format: Operation and no. of operands |
Address specifier 1 | Address field 1 | ... |
Address specifier n | Address field n
Fixed format: Operation | Address field 1 |
Address field 2 | Address field 3
12
Instruction Set Principles

The Role of Compilers
The Structure of Recent Compilers: multi-phased
(fig. 2.24)
Difficulties: each compiler phase makes gross
assumptions about the abilities of later phases,
hence the phase-ordering problem. For instance,
an early phase cannot guarantee that registers
will be allocated where they are most desirable.
Example: global common subexpression elimination,
which replaces multiple computations of the same
expression with a single computation and a
temporary location for storing the value. If this
temporary is not allocated a register, the slow
access to memory may actually negate the gain
from the optimization!
Register Allocation plays a central role in
compiler optimization, both in speeding up the
code and in making other optimizations useful.
Graph coloring (with at least 16 general-purpose
registers) for simple cases and heuristics for
more complicated cases
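
Register allocation by graph coloring can be sketched with a simple greedy heuristic. This is a simplified stand-in, not the algorithm used by any particular compiler: the function name `color_graph`, the highest-degree-first ordering, and the sample interference graph are all assumptions for illustration. Nodes are variables, an edge means two variables are live at the same time, and a color is a register.

```python
def color_graph(interference, num_registers):
    """Greedy coloring: map each variable to a register index, or None to spill."""
    assignment = {}
    # Visit highest-degree variables first (one common heuristic among several).
    order = sorted(interference, key=lambda v: (-len(interference[v]), v))
    for var in order:
        used = {assignment[n] for n in interference[var]
                if assignment.get(n) is not None}
        free = [r for r in range(num_registers) if r not in used]
        assignment[var] = free[0] if free else None  # None means spill to memory
    return assignment


# Hypothetical interference graph: a conflicts with b, c, d; b conflicts with c.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

with_three = color_graph(graph, 3)
with_two = color_graph(graph, 2)
```

With three registers every variable gets one; with two, one variable of the a-b-c triangle has to be spilled, which is the situation where a temporary ends up in slow memory.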

13
Instruction Set Principles

Impact of Optimizations on Performance
Major types of optimizations and examples in each
class
Change in instruction count for the programs
lucas and mcf from SPEC2000 as compiler
optimization levels vary:
Level 0: unoptimized
Level 1: local optimizations, code scheduling,
and local register allocation
Level 2: global optimizations, loop
transformations, and global register allocation
Level 3: procedure integration

14
Instruction Set Principles

The Impact of Compiler Technology on the
Architect's Decisions
How are variables allocated and addressed?
How many registers are needed to allocate
variables appropriately?
Stack: grows on procedure calls and shrinks on
returns; holds activation records; register
allocation is most effective here
Global data area: statically declared objects,
such as arrays or aggregate data structures;
difficult, if not impossible, to allocate to
registers if the objects are aliased
Heap: dynamic objects, accessed through
pointers and typically non-scalar; register
allocation is almost impossible due to
pointers
Because of aliasing, a compiler must be
conservative, for it is impossible to know what a
pointer may refer to or, inversely, what an
object is referred to by.
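
The aliasing problem can be demonstrated with two references that may or may not name the same object. In this sketch, one-element Python lists stand in for pointers, and the function name `read_store_read` is hypothetical; the point is that the second load cannot be replaced by a register copy unless the compiler can prove the two pointers differ.

```python
def read_store_read(p, q):
    """Load through p, store through q, then load through p again."""
    first = p[0]       # a compiler would like to keep this value in a register
    q[0] = first + 1   # store through a possibly aliasing pointer
    second = p[0]      # must be reloaded from memory if q may alias p
    return first, second


x, y = [5], [5]
no_alias = read_store_read(x, y)   # p and q are distinct objects

z = [5]
aliased = read_store_read(z, z)    # p and q refer to the same object
```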

15
Instruction Set Principles

How the Architect Can Help the Compiler Writer
Guiding principle for the compiler designer: make
the frequent cases fast and the rare cases correct.
Other guidelines
Regularity: orthogonality (independence among the
three components of an ISA: operation, data type,
and addressing mode) helps the compiler make
decisions early and correctly
Provide primitives, not solutions: support for
HLLs should be provided in ways that are not
language-dependent
Simplify trade-offs among alternatives
(optimizing objectives): help the compiler writer
understand the costs of the various alternatives
Provide instructions that bind the quantities
known at compile time as constants
It is better to err on the side of simplicity:
less is more!

16
Instruction Set Principles

The MIPS Architecture
MIPS is a simple 64-bit load-store architecture.
32 64-bit general-purpose registers
R0, R1, ..., R31: integer registers. The value of
R0 is always 0.
32 64-bit floating-point registers
F0, F1, ..., F31: floating-point registers
Data types
8-bit bytes, 16-bit half words, 32-bit words, and
64-bit double words for integers
32-bit single precision and 64-bit double
precision for floating point.
Addressing modes
Register, immediate, and displacement, with a
16-bit field.
Byte-addressable memory, with a mode bit to allow
software to select either Big Endian or Little
Endian
Instruction encoding: fixed

17
Instruction Set Principles

The MIPS Instruction Format
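
The figure for this slide is not in the transcript. As a hedged illustration, the sketch below packs the standard 32-bit MIPS R-type and I-type field layouts (6-bit opcode, 5-bit register fields, 16-bit immediate); the helper names `encode_r` and `encode_i` are hypothetical, and the `add` and `lw` opcodes used are the classic MIPS values.

```python
def encode_r(opcode, rs, rt, rd, shamt, funct):
    """R-type: opcode(6) | rs(5) | rt(5) | rd(5) | shamt(5) | funct(6)."""
    return ((opcode << 26) | (rs << 21) | (rt << 16)
            | (rd << 11) | (shamt << 6) | funct)


def encode_i(opcode, rs, rt, imm):
    """I-type: opcode(6) | rs(5) | rt(5) | immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


# add r8, r17, r18  (opcode 0, funct 0x20)
add_word = encode_r(0, 17, 18, 8, 0, 0x20)

# lw r8, 32(r19)    (opcode 0x23, base register in rs, destination in rt)
lw_word = encode_i(0x23, 19, 8, 32)
```

Every instruction fits in one 32-bit word, which is what "Instruction encoding: fixed" on the architecture slide refers to.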

18
Instruction Set Principles

The MIPS Operations
Load and store instructions

19
Instruction Set Principles

The MIPS Operations
ALU instructions

20
Instruction Set Principles

The MIPS Operations
Control flow instructions

21-25
Instruction Set Principles MIPS Example
(slide 22: MIPS/DLX Example)
26-40
Instruction Set Principles
41
Instruction Set Principles MIPS/DLX Example
