Instruction Set Principles

Slides: 42
Provided by: DavidPa61
Learn more at: http://cse.unl.edu

1
Instruction Set Principles
  • ISA should reflect application characteristics
  • Desktop computing is compute-intensive, thus
    focusing on features favoring Integer and FP ops
  • Server computing is data-intensive, focusing on
    integers and char-strings (yet FP ops are still
    standard in them)
  • Embedded computing is time-sensitive and memory-
    and power-conscious, thus focusing on code
    density, real-time behavior, and media data streams.

2
Instruction Set Principles
  • Taxonomy of ISA
  • Stack: both operands are implicit on the top of
    the stack, a data structure in which items are
    accessed in a last-in, first-out fashion.
  • Accumulator: one operand is implicit in the
    accumulator, a special-purpose register.
  • General-Purpose Register: all operands are
    explicit in specified registers or memory
    locations. Depending on where operands are
    specified and stored, there are three different
    ISA groups:
  • Register-Memory: one operand in a register and
    one in memory. Examples: IBM 360/370, Intel 80x86
    family, Motorola 68000.
  • Memory-Memory: both operands are in memory.
    Example: VAX.
  • Register-Register (load-store): all operands,
    except for those in load and store instructions,
    are in registers. Examples: SPARC (Sun
    Microsystems), MIPS, Precision Architecture (HP),
    PowerPC (IBM), Alpha (DEC).

3
Instruction Set Principles
  • Taxonomy of ISA: Examples

Code sequences for C = A + B in each class (figure: the ALU of
each class takes its operands from the top of stack (TOS), the
accumulator, the register set, or memory):
(a) Stack: Push A; Push B; Add; Pop C
(b) Accumulator: Load A; Add B; Store C
(c) Register-Memory: Load R1,A; Add R1,B; Store R1,C
(d) Reg-Reg/Load-Store: Load R1,A; Load R2,B; Add R3,R1,R2; Store R3,C
(e) Memory-Memory: Add C,A,B or Add A,B
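
The code sequences above can be traced with a small simulation. This is a minimal sketch, not part of the original slides: the memory values (A = 3, B = 4) and all function names are made up for illustration, and each function simply mirrors one of the instruction sequences and returns how many instructions it executed.

```python
# Toy traces of the instruction sequences for C = A + B.
# Memory contents and function names are illustrative assumptions.

def run_stack(mem):
    stack = []
    stack.append(mem["A"])                   # Push A
    stack.append(mem["B"])                   # Push B
    stack.append(stack.pop() + stack.pop())  # Add: pop two operands, push the sum
    mem["C"] = stack.pop()                   # Pop C
    return 4                                 # instructions executed

def run_accumulator(mem):
    acc = mem["A"]                           # Load A
    acc += mem["B"]                          # Add B
    mem["C"] = acc                           # Store C
    return 3

def run_register_memory(mem):
    regs = {}
    regs["R1"] = mem["A"]                    # Load R1,A
    regs["R1"] += mem["B"]                   # Add R1,B (one operand comes from memory)
    mem["C"] = regs["R1"]                    # Store R1,C
    return 3

def run_load_store(mem):
    regs = {}
    regs["R1"] = mem["A"]                    # Load R1,A
    regs["R2"] = mem["B"]                    # Load R2,B
    regs["R3"] = regs["R1"] + regs["R2"]     # Add R3,R1,R2 (registers only)
    mem["C"] = regs["R3"]                    # Store R3,C
    return 4

def run_all():
    """Run every variant on fresh memory; return {name: (C, instruction count)}."""
    variants = {
        "stack": run_stack,
        "accumulator": run_accumulator,
        "register_memory": run_register_memory,
        "load_store": run_load_store,
    }
    results = {}
    for name, run in variants.items():
        mem = {"A": 3, "B": 4, "C": 0}
        count = run(mem)
        results[name] = (mem["C"], count)
    return results

if __name__ == "__main__":
    for name, (c, count) in run_all().items():
        print(f"{name}: C = {c} in {count} instructions")
```

All four variants store the same result; they differ only in where the operands live and in how many instructions are needed.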
4
Instruction Set Principles
  • Comparisons

ISA type (# of memory addresses, max # of operands), with advantages and disadvantages:

Register-register (0,3)
  Advantages: Simple, fixed-length instruction encoding; simple code generation model; instructions take similar numbers of cycles to execute.
  Disadvantages: Higher instruction count (IC) than ISAs with memory references in instructions; the higher instruction count and lower instruction density lead to larger programs.
Register-memory (1,2)
  Advantages: Data can be accessed without a separate load instruction first; instruction format tends to be easy to encode and yields good code density.
  Disadvantages: Operands are not equivalent, since a source operand in a binary operation is destroyed; encoding a register number and a memory address in each instruction may restrict the number of registers; CPI_i varies by operand location.
Memory-memory (2,2) or (3,3)
  Advantages: Most compact; does not waste registers for temporaries.
  Disadvantages: Large variation in instruction size, especially for three-operand instructions; large variation in CPI_i as well; memory accesses create a memory bottleneck (no longer used today).
5
Instruction Set Principles
  • Addressing Memory: how to specify and interpret
    memory addresses is important since all data are
    initially in memory.
  • Interpreting Memory Addresses
  • Almost all computers, except some DSPs, are
    byte-addressed, providing access for bytes,
    half-words (2 bytes), words (4 bytes), and double
    words (8 bytes)
  • Ordering bytes within a larger object (8 bytes in
    a double word):
  • Little Endian
  • Big Endian
  • Byte ordering can be a problem when exchanging
    data between computers with different ordering
    conventions
  • Alignment of bytes: an access to an object of
    size s bytes at byte address A is aligned if A
    mod s = 0. Memory is typically aligned on a
    multiple of a word or double-word boundary
  • Misalignment causes extra memory accesses and HW
    costs
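
The alignment rule (A mod s = 0) and the cost of misalignment can be checked with a few lines. This is a sketch under one stated assumption: memory is accessed in aligned 8-byte units, which is a hypothetical width chosen for illustration. The function names are also illustrative.

```python
def is_aligned(address, size):
    """An access of `size` bytes at byte address `address` is aligned if address mod size == 0."""
    return address % size == 0

def memory_accesses(address, size, unit_bytes=8):
    """Count the aligned `unit_bytes`-wide memory units that the object touches.

    The 8-byte unit width is an assumption for this sketch.
    """
    first_unit = address // unit_bytes
    last_unit = (address + size - 1) // unit_bytes
    return last_unit - first_unit + 1

if __name__ == "__main__":
    for addr in (8, 6):
        print(addr, is_aligned(addr, 4), memory_accesses(addr, 4))
```

A 4-byte word at address 8 is aligned and fits in one unit, while the same word at address 6 is misaligned and straddles two units, so it needs two memory accesses.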

Byte numbering within a double word (figure): Little Endian 7 6 5 4 3 2 1 0; Big Endian 0 1 2 3 4 5 6 7
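
The two byte orders, and the problem that arises when data is exchanged between machines that disagree, can be shown with Python's standard `struct` module. The sample value is arbitrary and chosen only so that each byte is recognizable.

```python
import struct

value = 0x0102030405060708  # arbitrary 8-byte (double-word) sample value

little = struct.pack("<Q", value)  # least-significant byte at the lowest address
big = struct.pack(">Q", value)     # most-significant byte at the lowest address

# A big-endian reader that receives little-endian bytes sees a different number.
misread = struct.unpack(">Q", little)[0]

if __name__ == "__main__":
    print(little.hex())   # byte at lowest address first
    print(big.hex())
    print(hex(misread))
```

The byte at the lowest address is 0x08 in the little-endian layout and 0x01 in the big-endian layout, which is why the sender and receiver must agree on a convention.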
6
Instruction Set Principles
  • Addressing Modes: how an ISA specifies the address
    of an object to be accessed (fig. 2.6-2.7)
  • Operands: they can be found in registers, memory
    locations, and the instructions themselves
    (instruction stream)
  • Effective Address: the actual memory address
    specified when a memory location is used for an
    operand
  • PC-Relative Addressing: addressing modes that
    depend on the program counter
  • Immediates/Literals: considered memory
    addressing modes, even though the value they
    access is in the instruction stream
  • Displacement Mode: must determine the range of
    displacement judiciously (via quantitative
    studies, fig. 2.8)
  • Immediate/Literal Mode: must decide the level of
    support (all or a subset of ops) and the range of
    values (fig. 2.9-2.10)
  • Modulo/Circular Mode: DSPs handling an infinite,
    continuous stream of data rely on circular
    buffers
  • Bit-Reverse Mode: used exclusively for FFTs
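
How an effective address is formed in several of these modes can be sketched as plain arithmetic. The function names and the sample buffer are illustrative assumptions, not the definitions of any particular ISA.

```python
# Effective-address sketches for some of the addressing modes listed above.
# Names and sample values are illustrative assumptions.

def displacement(base_register, offset):
    """Displacement mode: register contents plus a constant held in the instruction."""
    return base_register + offset

def pc_relative(pc, offset):
    """PC-relative mode: the address is formed relative to the program counter."""
    return pc + offset

def circular_next(pointer, buffer_start, buffer_length):
    """Modulo/circular mode: post-increment that wraps at the end of the buffer."""
    return buffer_start + (pointer - buffer_start + 1) % buffer_length

def bit_reverse(index, bits):
    """Bit-reverse mode: reverse the low `bits` bits of an index (FFT data ordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result

if __name__ == "__main__":
    print(displacement(1000, 16))
    pointer = 100
    for _ in range(5):                      # walk a 4-entry circular buffer at 100..103
        pointer = circular_next(pointer, 100, 4)
        print(pointer)
    print([bit_reverse(i, 3) for i in range(8)])
```

The circular pointer returns to the start of the buffer after the last entry, and the bit-reversed index sequence for an 8-point FFT comes out as 0, 4, 2, 6, 1, 5, 3, 7.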

7
Instruction Set Principles
  • Type and Size of Operands: the opcode encoding
    designates operand types in modern computers,
    whereas tags were used to indicate types in old
    machines
  • Desktop and Server architectures
  • Character: 8-bit, usually in ASCII
  • 16-bit Unicode, used in Java, is gaining
    popularity
  • Integers are almost universally represented as
    two's complement binary numbers: short integer
    (half-word), integer (word), long integer
    (double-word)
  • Single-precision (1-word) and double-precision
    (2-word) floating point: the IEEE floating-point
    standard, IEEE standard 754
  • Architectures supporting business applications
  • Packed decimal/binary-coded decimal: 4 bits are
    used to encode the values 0-9 and two decimal
    digits are packed into each byte, for getting
    results that exactly match decimal numbers (some
    decimal fractions do not have an exact
    representation in binary)
  • Frequency of access to types helps determine what
    types are most important to support efficiently
    (fig. 2.12)
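
The packed-decimal encoding and the reason it exists can be illustrated briefly. This is a sketch with hypothetical helper names; Python's `decimal` module is used here only as a stand-in for exact decimal arithmetic and is not itself a packed-BCD implementation.

```python
from decimal import Decimal

def pack_bcd(digits):
    """Pack a string of decimal digits two per byte, 4 bits per digit."""
    if len(digits) % 2:
        digits = "0" + digits              # pad to an even number of digits
    return bytes((int(digits[i]) << 4) | int(digits[i + 1])
                 for i in range(0, len(digits), 2))

def unpack_bcd(data):
    """Recover the digit string from packed bytes."""
    return "".join(f"{byte >> 4}{byte & 0xF}" for byte in data)

# Some decimal fractions have no exact binary representation.
binary_exact = (0.1 + 0.2 == 0.3)
decimal_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

if __name__ == "__main__":
    print(pack_bcd("1234").hex())
    print(unpack_bcd(pack_bcd("1234")))
    print(binary_exact, decimal_exact)
```

The digits 1234 pack into the two bytes 0x12 and 0x34. The comparison at the end shows why business applications care: the binary floating-point sum of 0.1 and 0.2 is not exactly 0.3, while the decimal sum is.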

8
Instruction Set Principles
  • Operands for Media and Signal Processing
  • Graphics applications deal with 2D and 3D images
  • Vertex: a data structure, usually of 32-bit
    floating-point values, with four components for
    representing 3D images: x-coordinate,
    y-coordinate, z-coordinate, and w-coordinate
    (color or hidden surfaces)
  • Pixel: consists of four 8-bit channels: R (red),
    G (green), B (blue), and A (transparency of the
    surface or pixel)
  • DSPs add a unique data type
  • Fixed point: a binary point just to the right of
    the sign bit, thus representing a fraction
    between -1 and +1
  • Blocked floating point: so called because the
    exponent variable is often shared among many
    fixed-point variables (a fixed-point word does
    not include an exponent, so the DSP programmer
    must keep the exponent in a separate variable and
    ensure that each result is shifted left or right
    to keep alignment).
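
The fixed-point format described above can be sketched for a 16-bit word, where one sign bit is followed by 15 fraction bits. The 16-bit width and the function names are assumptions made for this example; real DSPs use various word sizes.

```python
FRACTION_BITS = 15                 # assumed 16-bit word: 1 sign bit + 15 fraction bits
SCALE = 1 << FRACTION_BITS         # 32768
MIN_RAW, MAX_RAW = -SCALE, SCALE - 1

def to_fixed(x):
    """Convert a fraction in [-1, 1) to its raw integer form, clamping at the range limits."""
    return max(MIN_RAW, min(MAX_RAW, round(x * SCALE)))

def to_float(raw):
    """Interpret a raw integer as a fraction with the binary point after the sign bit."""
    return raw / SCALE

def fixed_mul(a, b):
    """Multiply two fixed-point values; the product is shifted right to realign the binary point."""
    return (a * b) >> FRACTION_BITS

if __name__ == "__main__":
    half, quarter = to_fixed(0.5), to_fixed(0.25)
    print(half, quarter, to_float(fixed_mul(half, quarter)))
```

The shift in `fixed_mul` is the same kind of realignment the slide mentions for blocked floating point: the hardware stores no exponent, so software is responsible for keeping the binary point in place.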

9
Instruction Set Principles
  • Operations in the Instruction Set (fig. 2.15)
  • Rule of thumb: the most widely executed
    instructions are the simple operations of an
    instruction set (fig. 2.16)
  • Operations for Media and Signal Processing: less
    precision and narrower data width due to the
    tolerance of human perception
  • Partitioned add: four 16-bit adds performed on a
    single 64-bit ALU in a single cycle (SIMD or
    vector instructions, fig. 2.17)
  • Paired operations: one instruction can launch two
    32-bit operations on operands found side by side
    in a double-precision register
  • Saturated arithmetic: due to real-time
    requirements, DSPs do not allow exception
    handling and must tolerate overflow by
    substituting the largest representable number of
    the appropriate sign
  • Multiply-accumulate (MAC): key to dot-product
    operations for vector and matrix multiplies
    (MACs/second is the primary peak performance
    metric for DSPs)
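
Three of these operations are easy to model in software. The sketch below is illustrative only: the function names are hypothetical, the lanes are treated as unsigned in the partitioned add, and the saturating add assumes signed 16-bit operands.

```python
MASK16 = 0xFFFF

def partitioned_add(x, y):
    """Add four independent 16-bit lanes packed into 64-bit words.

    Carries are dropped at lane boundaries, so one lane never affects its neighbor.
    """
    result = 0
    for lane in range(4):
        shift = 16 * lane
        a = (x >> shift) & MASK16
        b = (y >> shift) & MASK16
        result |= ((a + b) & MASK16) << shift
    return result

def saturating_add16(a, b):
    """Signed 16-bit add that clamps to the representable range instead of overflowing."""
    return max(-32768, min(32767, a + b))

def dot_product(xs, ys):
    """Dot product built from one multiply-accumulate (MAC) step per element pair."""
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

if __name__ == "__main__":
    print(hex(partitioned_add(0x0001_FFFF_0002_0003, 0x0001_0001_0001_0001)))
    print(saturating_add16(30000, 10000), saturating_add16(-30000, -10000))
    print(dot_product([1, 2, 3], [4, 5, 6]))
```

In the partitioned add, the lane holding 0xFFFF wraps to 0 without carrying into the lane above it, which is what distinguishes it from an ordinary 64-bit add.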

10
Instruction Set Principles
  • Instructions for Control Flow
  • There are four different types of control flow
    change (fig. 2.19), shown with their share of
    control flow instructions in integer and FP
    programs:
  • Conditional branch: 75% integer and 82% FP
  • How to specify branch conditions? (fig. 2.21-2.22)
  • Jump (or unconditional branch): 6% integer and
    10% FP
  • Procedure calls and procedure returns: 19%
    integer and 8% FP
  • Caller saving vs. callee saving
  • Addressing Modes for Control Flow Instructions
  • PC-relative: advantageous where targets are near
    the branch instruction, and has the desirable
    property of position independence (fig. 2.20)
  • Register indirect jumps: if the target is not
    known at compile time, PC-relative addressing
    cannot be used; rather, a location (such as a
    register) is used to dynamically specify the
    target
  • Case or switch statements in most languages
  • Virtual functions or methods in OO languages
  • High-order functions or function pointers in C
    or C++
  • Dynamically shared libraries
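
Both addressing styles for control flow can be sketched in a few lines. The addresses, the relocation amount, and the function names are made up for illustration, and the dictionary of handlers stands in for a jump table whose entry is selected at run time.

```python
def branch_offset(branch_pc, target_pc):
    """Offset stored in a PC-relative branch."""
    return target_pc - branch_pc

def branch_target(branch_pc, offset):
    """Target computed at run time from the current PC and the stored offset."""
    return branch_pc + offset

def indirect_jump(selector, jump_table, default):
    """Register-indirect style dispatch: the target is looked up at run time."""
    handler = jump_table.get(selector, default)
    return handler()

# Position independence: the same offset works after the code is loaded elsewhere.
offset = branch_offset(0x1000, 0x1020)
relocated_target = branch_target(0x1000 + 0x4000, offset)

# A switch statement modeled as a table of targets.
table = {0: lambda: "case 0", 1: lambda: "case 1"}

if __name__ == "__main__":
    print(hex(offset), hex(relocated_target))
    print(indirect_jump(1, table, lambda: "default"))
```

After moving the code by 0x4000 bytes, the unchanged offset still reaches the relocated target, whereas an absolute address would have needed patching.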

11
Instruction Set Principles
  • Encoding an Instruction Set: there are three
    choices
  • Variable: allows virtually all addressing modes
    to be used with all operations, enabling the
    smallest code representation
  • Examples: VAX and Intel 80x86 (1-5 operands, each
    with 10 addressing modes)
  • Fixed: load-store ISA, with only one memory
    operand and only one or two addressing modes,
    so the addressing mode can be encoded as part
    of the opcode
  • Examples: Alpha, ARM, MIPS, PowerPC, SPARC,
    SuperH
  • Largest code size
  • Hybrid: IBM 360/370, MIPS16, Thumb, TI TMS320C54x
    (fig. 2.23)
  • Competing forces: number of registers and
    addressing modes, code size, and ease of
    pipelining

Instruction formats (figure):
(a) Variable: Operation and no. of operands | Address specifier 1 | Address field 1 | ... | Address specifier n | Address field n
(b) Fixed: Operation | Address field 1 | Address field 2 | Address field 3
12
Instruction Set Principles
  • The Role of Compilers
  • The Structure of Recent Compilers: multi-phased
    (fig. 2.24)
  • Difficulties: the compiler makes gross assumptions
    about the abilities of later phases, hence the
    phase-ordering problem. For instance, it cannot
    guarantee allocation of registers where they are
    most desirable.
  • Example: global common subexpression elimination,
    which replaces multiple computations of the same
    expression with a single computation and a
    temporary location for storing the value. If this
    temporary is not allocated a register, the slow
    access to memory may actually negate the gain
    from the optimization!
  • Register Allocation: plays a central role in
    compiler optimization, both in speeding up the
    code and in making other optimizations useful.
  • Graph coloring (effective with 16 or more
    general-purpose registers) for simple cases and
    heuristics for more complicated cases
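
Register allocation by graph coloring can be sketched with a simple greedy heuristic. This is an illustrative assumption, not the algorithm of any particular compiler: variables that are live at the same time are connected in an interference graph, each variable receives a register not used by its neighbors, and a variable with no free register is marked as spilled to memory.

```python
def color_graph(interference, num_registers):
    """Greedy coloring of an interference graph.

    `interference` maps each variable to the set of variables live at the same time.
    Returns {variable: register index}, with None for variables that must be spilled.
    """
    assignment = {}
    # Handle the most constrained variables (most neighbors) first.
    order = sorted(interference, key=lambda v: len(interference[v]), reverse=True)
    for var in order:
        used = {assignment[n] for n in interference[var]
                if assignment.get(n) is not None}
        free = [r for r in range(num_registers) if r not in used]
        assignment[var] = free[0] if free else None
    return assignment

# Hypothetical interference graph: a, b, c are mutually live; d overlaps only with c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

if __name__ == "__main__":
    print(color_graph(graph, 3))
    print(color_graph(graph, 2))
```

With three registers every variable gets one; with two, one of the three mutually interfering variables has to be spilled, which is the situation in which an optimization such as common subexpression elimination can lose its benefit.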

13
Instruction Set Principles
  • Impact of Optimizations on Performance
  • Major types of optimizations and examples in each
    class
  • Change in instruction count for the programs
    lucas and mcf from SPEC2000 as compiler
    optimization levels vary:
  • Level 0: unoptimized
  • Level 1: local optimizations, code scheduling,
    and local register allocation
  • Level 2: global optimizations, loop
    transformations, and global register allocation
  • Level 3: procedure integration

14
Instruction Set Principles
  • The Impact of Compiler Technology on the
    Architect's Decisions
  • How are variables allocated and addressed?
  • How many registers are needed to allocate
    variables appropriately?
  • Stack: grows on procedure calls and shrinks on
    returns; holds activation records; register
    allocation is most effective here
  • Global data area: statically declared objects,
    such as arrays or aggregate data structures;
    difficult, if not impossible, to allocate to
    registers if objects are aliased
  • Heap: dynamic objects, accessed through pointers
    and typically non-scalar; register allocation is
    almost impossible because of pointers
  • Because of aliasing, a compiler must be
    conservative, for it is impossible to know what a
    pointer may refer to or, conversely, which
    pointers refer to a given object.
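
Why aliasing forces the compiler to be conservative can be shown with a small model in which a list stands in for memory and a local variable stands in for a register. The function names and addresses are hypothetical; both functions model the statement sequence "load x; store 10 through pointer p; return x + 1".

```python
def with_register_caching(memory, x_addr, p_addr):
    """Keeps x in a 'register' across the store through p. Wrong if p aliases x."""
    reg = memory[x_addr]       # load x into a register
    memory[p_addr] = 10        # store through pointer p
    return reg + 1             # uses the register copy, which may be stale

def conservative(memory, x_addr, p_addr):
    """Reloads x from memory after the store, because p might point to x."""
    memory[p_addr] = 10        # store through pointer p
    return memory[x_addr] + 1  # reload x from memory

if __name__ == "__main__":
    print(with_register_caching([5, 0], 0, 1), conservative([5, 0], 0, 1))  # no alias
    print(with_register_caching([5, 0], 0, 0), conservative([5, 0], 0, 0))  # p aliases x
```

When p and x refer to different locations both versions agree, but when they alias, the cached version returns a result based on the old value of x. Since the compiler usually cannot prove which case holds, it has to generate the reload.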

15
Instruction Set Principles
  • How the Architect Can Help the Compiler Writer
  • Guiding principle for the compiler designer: make
    the frequent cases fast and the rare cases correct.
  • Other guidelines:
  • Regularity: orthogonality (independence among the
    3 components of an ISA: operation, data type, and
    addressing mode) helps the compiler make
    decisions early and correctly
  • Provide primitives, not solutions: support for
    HLLs should be provided in ways that are not
    language dependent
  • Simplify trade-offs among alternatives
    (optimizing objectives): help the compiler writer
    understand the costs of various alternatives
  • Provide instructions that bind the quantities
    known at compile time as constants
  • It is better to err on the side of simplicity:
    less is more!

16
Instruction Set Principles
  • The MIPS Architecture
  • MIPS is a simple 64-bit load-store architecture.
  • 32 64-bit general-purpose registers
  • R0, R1, ..., R31: integer registers. The value of
    R0 is always 0.
  • 32 64-bit floating-point registers
  • F0, F1, ..., F31: floating-point registers
  • Data types
  • 8-bit bytes, 16-bit half words, 32-bit words, and
    64-bit double words for integers
  • 32-bit single precision and 64-bit double
    precision for floating point.
  • Addressing modes
  • Register, immediate, and displacement, with
    16-bit fields.
  • Byte-addressable memory, with a mode bit that
    allows software to select either Big Endian or
    Little Endian
  • Instruction encoding: fixed
17
Instruction Set Principles
  • The MIPS Instruction Format
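
The slide's figure did not survive in this transcript, so the sketch below relies on the commonly documented 32-bit MIPS field layouts rather than on the slide itself: R-type instructions use fields of 6, 5, 5, 5, 5, and 6 bits (opcode, rs, rt, rd, shamt, funct), and I-type instructions use 6, 5, 5, and 16 bits (opcode, rs, rt, immediate). The helper names are hypothetical, and the two sample instructions are taken from the 32-bit integer instruction set.

```python
def encode_r(opcode, rs, rt, rd, shamt, funct):
    """Pack an R-type instruction: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return ((opcode & 0x3F) << 26) | ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16) \
        | ((rd & 0x1F) << 11) | ((shamt & 0x1F) << 6) | (funct & 0x3F)

def encode_i(opcode, rs, rt, immediate):
    """Pack an I-type instruction: opcode(6) rs(5) rt(5) immediate(16)."""
    return ((opcode & 0x3F) << 26) | ((rs & 0x1F) << 21) | ((rt & 0x1F) << 16) \
        | (immediate & 0xFFFF)

def decode_opcode(word):
    """The opcode always occupies the top 6 bits, which is what makes decoding simple."""
    return (word >> 26) & 0x3F

if __name__ == "__main__":
    add_word = encode_r(0, 1, 2, 3, 0, 0x20)   # add $3, $1, $2
    lw_word = encode_i(0x23, 1, 2, 8)          # lw $2, 8($1)
    print(f"{add_word:08x} {lw_word:08x}")
```

Every instruction occupies exactly one 32-bit word, and the 16-bit immediate field is the same field used for displacements and immediates in the addressing modes.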

18
Instruction Set Principles
  • The MIPS Operations
  • Load and store instructions

19
Instruction Set Principles
  • The MIPS Operations
  • ALU instructions

20
Instruction Set Principles
  • The MIPS Operations
  • Control flow instructions

21
Instruction Set Principles MIPS Example
22
Instruction Set Principles MIPS/DLX Example
23
Instruction Set Principles MIPS Example
24
Instruction Set Principles MIPS Example
25
Instruction Set Principles MIPS Example
41
Instruction Set Principles MIPS/DLX Example