Comparing IA-64 and HPL-PD

Provided by: IgorPech
Learn more at: https://cs.nyu.edu
1
Comparing IA-64 and HPL-PD
2
Overview
  • IA-64 has a number of novel features for
    supporting ILP:
      • Predication
      • Data Speculation
      • Control Speculation
      • Software Pipelining
      • Compiler-directed Caching
  • These features all exist in HPL-PD!
  • The two ISAs are also very similar in their basic
    operations (arithmetic, logic, etc.).
  • IA-64 adds a few extensions:
      • Multimedia Instructions
      • Semaphore Instructions

3
Predication Support
  • IA-64 and HPL-PD (the architecture targeted by
    Trimaran) both support conditional execution of
    instructions through predicate registers, plus
    instructions to manipulate those registers.
  • Both support parallel compare operations,
    i.e. assigning to two predicate registers
    simultaneously:
      • through a modifier in HPL-PD
      • through a completer in IA-64
      • both offer wired-and and wired-or forms
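
The predication bullets above can be illustrated with a small software model. This is only a sketch of the semantics, not of either ISA's encoding: the function names (`compare_lt`, `predicated_abs_diff`, `wired_or`) are hypothetical, and Python booleans stand in for predicate registers.

```python
def compare_lt(a, b):
    """Model a compare that writes a predicate and its complement at once."""
    p_true = a < b
    return p_true, not p_true


def predicated_abs_diff(a, b):
    """If-converted form of: if a < b: r = b - a, else: r = a - b."""
    p1, p2 = compare_lt(a, b)
    r = 0
    # Both operations are issued; each commits only if its predicate is true.
    r = (b - a) if p1 else r      # (p1) sub r = b, a
    r = (a - b) if p2 else r      # (p2) sub r = a, b
    return r


def wired_or(conditions):
    """Wired-or: the predicate starts false and any true compare sets it.

    No compare ever clears the predicate, so the compares can be issued in
    any order (or in the same cycle) and still give the same result.
    """
    p = False
    for cond in conditions:
        if cond:
            p = True
    return p
```

The branch disappears from `predicated_abs_diff`: control dependence has been turned into data dependence on `p1` and `p2`, which is what lets a wide machine schedule both arms together.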

4
Control Speculation
  • Control speculation is supported in both IA-64
    and HPL-PD with the same semantics.
  • IA-64
      • Each GPR includes a 1-bit speculation tag (the
        NaT bit).
      • FPRs use a special encoding called NaTVal, so
        no extra bit is needed.
      • Only the load instruction has a
        control-speculative version.
      • A verification instruction is needed for
        exception handling.
  • HPL-PD
      • Both GPRs and FPRs have a speculation tag, an
        extra bit like the NaT bit in IA-64.
      • All integer and floating-point instructions
        have control-speculative versions.
      • Exceptions are tracked automatically by the
        hardware.
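
A rough software model of the tag mechanism described above may help. It is a sketch under assumptions: `MEMORY`, `Reg`, `load_speculative`, `check`, and `hoisted` are hypothetical names, a missing dictionary key stands in for a faulting address, and the model follows the IA-64 style in which an explicit check triggers recovery.

```python
from dataclasses import dataclass

MEMORY = {0x100: 41}          # hypothetical memory: one mapped address


@dataclass(frozen=True)
class Reg:
    value: int = 0
    tag: bool = False         # deferred-exception tag (NaT-like bit)


def load(addr):
    """Ordinary load: an unmapped address faults immediately."""
    if addr not in MEMORY:
        raise MemoryError(f"fault at {addr:#x}")
    return Reg(MEMORY[addr])


def load_speculative(addr):
    """Speculative load: a fault is deferred by setting the tag."""
    if addr not in MEMORY:
        return Reg(0, tag=True)
    return Reg(MEMORY[addr])


def add(a, b):
    """The tag propagates through operations that consume a tagged value."""
    return Reg(a.value + b.value, a.tag or b.tag)


def check(reg, recovery):
    """Verification step: run the recovery code only if the tag is set."""
    return recovery() if reg.tag else reg


def hoisted(addr, guard):
    """Original code: if guard: r = load(addr) + 1, with the load hoisted."""
    r = add(load_speculative(addr), Reg(1))      # executed before the branch
    if not guard:
        return None                              # deferred fault never surfaces
    # Recovery re-executes the chain non-speculatively, which raises the fault.
    return check(r, lambda: add(load(addr), Reg(1))).value
```

With a mapped address, `hoisted(0x100, True)` returns 42. With an unmapped address the fault appears only when the guarded path is actually taken, which is the property that makes hoisting the load safe.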

5
Control Speculation: IA-64 Example
6
Control Speculation: HPL-PD Example
7
Data Speculation
  • Data speculation is supported in both IA-64 and
    HPL-PD in a similar manner,
    i.e. moving a load above a store that may write
    to the same address.
  • IA-64
      • Supports load checking (ld.c) as well as
        checking with recovery (chk.a).
      • With chk.a, the compiler can move up not only
        the load itself but also one or more of its
        uses.
  • HPL-PD
      • Also supports recovery in load checking (BRDV).
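
The idea of hoisting a load above a possibly conflicting store can be modeled in a few lines. The sketch below assumes a simple address-tracking table, similar in spirit to IA-64's advanced load address table. The class and method names are hypothetical, and real hardware matches overlapping address ranges rather than exact addresses.

```python
class Machine:
    """Toy machine with memory and a table of speculatively loaded addresses."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.tracked = {}                 # register name -> tracked address

    def load_advanced(self, reg, addr):
        """Advanced load: load early and remember which address was read."""
        self.tracked[reg] = addr
        return self.memory[addr]

    def store(self, addr, value):
        """A store invalidates every tracked entry for the same address."""
        self.memory[addr] = value
        self.tracked = {r: a for r, a in self.tracked.items() if a != addr}

    def load_check(self, reg, addr, early_value):
        """Check load: keep the early value if the entry survived, else reload."""
        if self.tracked.get(reg) == addr:
            return early_value
        return self.memory[addr]
```

For example, after `v = m.load_advanced("r1", 8)` a store to address 16 leaves the entry intact and `m.load_check("r1", 8, v)` returns the early value, while a store to address 8 removes the entry and forces a reload. When uses of the load are hoisted as well, a reload is not enough and the check has to branch to recovery code that redoes those uses.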

8
Data Speculation: Examples
IA-64
HPL-PD
9
Data Speculation: Recovery Examples (IA-64)
10
Data Speculation: Recovery Examples (HPL-PD)
11
Compiler Directed Cache
  • The memory hierarchy is visible to the compiler
    in both HPL-PD and IA-64.
  • IA-64
      • The compiler can supply hints in store, load,
        and prefetch instructions on where in the
        cache hierarchy the data will be found or left.
      • For prefetching, the lfetch instruction
        requests that cache lines be moved between
        different levels of the memory hierarchy.
      • lfetch maintains cache coherence.
  • HPL-PD
      • The compiler can also supply hints in store and
        load instructions.
      • Prefetching is simply a load to R0.

12
Compiler Directed Cache: IA-64
13
Compiler Directed Cache: HPL-PD
14
Support for Software Pipelining
  • Both IA-64 and Trimaran implement rotating
    registers, loop counters, and epilogue counters
    in combination with predication.
  • Used to implement modulo scheduling of loops.

15
Software Pipelining Example: HPL-PD
Example of software pipelining in Trimaran
A slice executed as a single VLIW instruction.
Taken from the Trimaran Tutorial
16
Software Pipelining: IA-64
Software pipelining on the IA-64
loop:
  (p14) ld1 r32 = [r12], 1
  (p15) add r34 = 1, r33
  (p16) st1 [r13] = r35, 1
  br.ctop loop
C source: for (i = 0; i < n; i++) y[i] = x[i] + 1;
Taken from the Intel web tutorial
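
To see how rotating registers, stage predicates, a loop counter, and an epilogue counter cooperate, the loop above (y[i] = x[i] + 1) can be simulated in software. This is a simplified sketch: `pipelined_increment` is a hypothetical name, list indices 0..3 stand in for four rotating registers, and the counter handling is only an approximation of a counted loop-closing branch such as br.ctop.

```python
def pipelined_increment(x):
    """Simulate a 3-stage software pipeline (load, add, store) for y[i] = x[i] + 1."""
    n = len(x)
    if n == 0:
        return []
    y = [None] * n
    regs = [0] * 4                 # rotating registers
    preds = [True, False, False]   # stage predicates: load, add, store
    lc, ec = n - 1, 3              # loop counter, epilogue counter (3 stages)
    load_ptr = store_ptr = 0

    while True:
        # Kernel: one pass models one wide instruction; each stage is
        # guarded by its own predicate and works on a different iteration.
        if preds[0]:
            regs[0] = x[load_ptr]          # stage 1: load
            load_ptr += 1
        if preds[1]:
            regs[2] = regs[1] + 1          # stage 2: add
        if preds[2]:
            y[store_ptr] = regs[3]         # stage 3: store
            store_ptr += 1

        # Loop-closing branch: decide whether a new iteration enters the pipe.
        if lc > 0:
            lc -= 1
            new_pred = True                # prologue / steady state
        elif ec > 1:
            ec -= 1
            new_pred = False               # epilogue: drain the pipeline
        else:
            break

        # Rotation: every register and predicate moves up by one name.
        regs = [regs[-1]] + regs[:-1]
        preds = [new_pred] + preds[:-1]

    return y
```

The kernel body never changes between prologue, steady state, and epilogue. Only the predicates do, so a single copy of the loop body suffices and no separate prologue or epilogue code has to be generated.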
17
Differences
  • Multimedia Instructions
  • Semaphore Instructions
  • Register Stack Engine

18
Register Stack Engine
  • IA-64 implements a mechanism called a register
    stack engine (RSE) that manages the dynamic
    allocation of stack frames using registers
    gpr32-gpr127.
  • The operations of the RSE are transparent to the
    software.
  • It saves and restores register contents to and
    from memory as needed, so that the contents of
    the registers are always available.

19
Multimedia Instructions
  • IA-64 has multimedia instructions that treat a
    GPR as a concatenation of eight 8-bit, four
    16-bit, or two 32-bit elements and operate on
    each element independently and in parallel.
  • Inspired by MMX
  • The instructions include
      • parallel addition and subtraction
      • parallel average
      • parallel shift left and add
      • parallel compare
      • parallel multiply right
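
The "independently and in parallel" behaviour can be shown for the eight 8-bit case with ordinary integer arithmetic. The sketch below models a wrap-around parallel add on a 64-bit word. The helper names are hypothetical, and the saturating variants that such instruction sets typically also provide are not modeled.

```python
MASK64 = (1 << 64) - 1
LOW7 = 0x7F7F7F7F7F7F7F7F      # low seven bits of every byte lane
HIGH = 0x8080808080808080      # top bit of every byte lane


def pack_bytes(lanes):
    """Pack eight 8-bit values into one 64-bit word, lane 0 in the low byte."""
    if len(lanes) != 8:
        raise ValueError("expected exactly eight lanes")
    word = 0
    for i, v in enumerate(lanes):
        word |= (v & 0xFF) << (8 * i)
    return word


def unpack_bytes(word):
    """Split a 64-bit word back into its eight 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(8)]


def parallel_add_bytes(a, b):
    """Add eight 8-bit lanes at once; each lane wraps, no carry crosses lanes."""
    # Adding only the low 7 bits of each lane cannot overflow into the
    # neighbouring lane; the top bit of each lane is then fixed up with XOR.
    return (((a & LOW7) + (b & LOW7)) ^ ((a ^ b) & HIGH)) & MASK64
```

For instance, adding lanes `[250, 1, 2, 3, 4, 5, 6, 255]` and `[10, 1, 1, 1, 1, 1, 1, 1]` gives `[4, 2, 3, 4, 5, 6, 7, 0]`: the first and last lanes wrap around without disturbing their neighbours.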

20
Semaphore Instructions
  • IA-64 has semaphore instructions that atomically
    load a general register from memory, perform an
    operation, and then store a result to the same
    memory location.
  • The instructions include
      • exchange (xchg)
      • compare and exchange (cmpxchg)
      • fetch and add (fetchadd)
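
The load, operate, and store-back pattern of the three operations listed above can be modeled as follows. This is a sketch of the semantics only: `AtomicCell` and `parallel_count` are hypothetical names, and a lock stands in for the atomicity that the hardware provides in a single instruction.

```python
import threading


class AtomicCell:
    """One memory word supporting atomic read-modify-write operations."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def exchange(self, new):
        """Store `new` and return the previous contents."""
        with self._lock:
            old, self._value = self._value, new
            return old

    def compare_exchange(self, expected, new):
        """Store `new` only if the cell holds `expected`; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old

    def fetch_add(self, increment):
        """Add `increment` to the cell and return the value it held before."""
        with self._lock:
            old = self._value
            self._value = old + increment
            return old


def parallel_count(threads=4, increments=1000):
    """Several threads bump one shared counter without losing any update."""
    cell = AtomicCell(0)

    def worker():
        for _ in range(increments):
            cell.fetch_add(1)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return cell.load()
```

In every case the caller receives the value that was in memory before the update, which is what makes these operations usable as building blocks for locks, counters, and other synchronization primitives.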