EPIC Architecture (Explicitly Parallel Instruction Computing) - PowerPoint PPT Presentation

Slides: 29
Provided by: csU73

Transcript and Presenter's Notes

Title: EPIC Architecture (Explicitly Parallel Instruction Computing)


1
EPIC Architecture (Explicitly Parallel
Instruction Computing)
  • Yangyang Wen
  • CDA5160--Advanced Computer Architecture I
  • University of Central Florida

2
Outline
  • What is EPIC?
  • EPIC Philosophy
  • Architectural Features Supporting EPIC
  • Intel's IA-64 Architectural Features
  • IA-64's Key Technologies
  • Summary and Reference

3
Traditional Architectures: Limited Parallelism
Today's processors are often 60% idle
4
EPIC Architecture: Explicit Parallelism
Better parallel machine code
Increases parallel execution
5
What is EPIC?
  • EPIC stands for Explicitly Parallel Instruction
    Computing. An EPIC architecture provides
    features that allow compilers to take a proactive
    role in enhancing instruction-level parallelism
    (ILP) without unacceptable hardware complexity.

6
EPIC's Performance
7
EPIC Design Philosophy
  • EPIC gives the compiler advanced features for
    enhancing ILP, such as predication and speculation.
  • The compiler designs the plan of execution (POE)
    at compile time and communicates the POE to the
    hardware.
  • EPIC needs massive hardware resources for
    parallel execution.

8
Introducing IA-64
  • IA-64 comes from Intel and is the first 64-bit
    architecture for Intel.
  • The first instance of a commercially available
    EPIC ISA.
  • The first architecture to bring ILP features to
    general-purpose microprocessors.

9
IA-64's Architectural Basics
  • Explicit Parallelism
  • Enhanced ILP
  • Compiler-oriented
  • Extremely large physical memory
  • A huge virtual address space for applications
  • 64-bit computation
  • Extremely large register files

10
(No Transcript)
11
IA-64's Key Technologies
  • Instruction Bundling
  • Predication
  • Control Speculation
  • Data Speculation
  • Software pipelining

12
Instruction Bundling
128-bit bundle layout (bit 127 down to bit 0):
Instruction 2 (41 bits) | Instruction 1 (41 bits) | Instruction 0 (41 bits) | Template (5 bits)
  • Uses a form of VLIW architecture
  • Three instructions are combined into a 128-bit
    bundle
  • Parallel Instructions are executed in groups
  • Template bits decode and route instructions and
    mark the end of groups of parallel instructions.
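
The bundle layout can be sketched as plain bit packing. This is a minimal illustration, assuming the template occupies the low 5 bits and the three 41-bit slots follow in order; the function names `pack_bundle` and `unpack_bundle` are hypothetical, and the slot contents are treated as opaque integers rather than real IA-64 encodings.

```python
SLOT_BITS = 41        # each instruction slot is 41 bits wide
TEMPLATE_BITS = 5     # 3 * 41 + 5 = 128 bits in total
SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template, slot0, slot1, slot2):
    """Pack a 5-bit template and three 41-bit slots into one 128-bit integer."""
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template must fit in 5 bits")
    bundle = template
    for index, slot in enumerate((slot0, slot1, slot2)):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError("each slot must fit in 41 bits")
        # Slot 0 sits just above the template, slot 2 ends at bit 127.
        bundle |= slot << (TEMPLATE_BITS + index * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Return (template, [slot0, slot1, slot2]) for a 128-bit bundle."""
    template = bundle & TEMPLATE_MASK
    slots = [
        (bundle >> (TEMPLATE_BITS + index * SLOT_BITS)) & SLOT_MASK
        for index in range(3)
    ]
    return template, slots
```

Packing and then unpacking returns the original fields, which is a quick way to check that the widths add up to exactly 128 bits.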

13
ILP Bottlenecks
  • Branches
  • To deal with branches, use predication.
  • Branch mispredictions cause a 20% to 30% loss in
    processor performance.
  • Memory latency
  • Latency is the time it takes to get data from
    memory. The longer it takes you to access memory
    to get code and data, the longer the CPU sits
    idle.
  • For memory latency, it's the loads that are the
    big problem, not the stores.

14
Predication
If A > B then SA else SB end if
(a) Traditional branch prediction: the processor predicts SA. If the prediction is wrong, SA is thrown away and SB is executed.
(b) IA-64 predication: SA and SB are both issued, each guarded by a predicate.

Branching is a major cause of lost performance.
15
EPIC Predication Process
16
Predication Benefits
  • Reduce branches
  • Reduce misprediction penalties
  • Reduce critical paths
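
The idea behind predication can be sketched in a few lines. This is only a model, assuming a compare that produces a complementary predicate pair and instructions whose writes take effect only when their predicate is true; `predicated_write` and the bodies chosen for SA and SB are hypothetical stand-ins, not IA-64 instructions.

```python
def with_branch(a, b):
    """Branching form: only one of the two paths runs."""
    if a > b:
        return a - b    # SA
    return b - a        # SB


def predicated_write(pred, old_value, new_value):
    """Model '(p) dest = new_value': the write takes effect only if p is true."""
    return new_value if pred else old_value


def predicated(a, b):
    """If-converted form: no branch, both paths are issued under predicates."""
    p1 = a > b          # the compare sets a predicate pair
    p2 = not p1
    result = 0
    result = predicated_write(p1, result, a - b)   # (p1) SA
    result = predicated_write(p2, result, b - a)   # (p2) SB
    return result
```

Both forms return the same value for every input; the predicated form trades a possible misprediction for the cost of issuing both paths.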

17
Control Speculation
Traditional Architectures:
  instr 1
  instr 2
  . . .
  br            (barrier)
  load a
  use
Elevating the load above a branch is not possible.

IA-64 Architectures:
  ld.s r8=a
  instr 1
  instr 2
  br
  chk.s r8
  use
Allows elevation of the load, even above a branch.
Memory latency is a major performance bottleneck
18
Introducing the Token Bit
IA-64:
  ld.s r8=a     (exception detection)
  instr 1
  instr 2       (propagate exception)
  br
  chk.s r8      (exception delivery)
  use
  • When a load is elevated, exception detection
    happens at the ld.s.
  • If the load address is valid, the load completes
    normally.
  • If the load address is invalid, the hardware sets
    the token bit on the target register instead of
    raising the exception immediately.
  • If execution reaches chk.s and chk.s detects the
    token bit, it jumps to fix-up code, which
    re-executes the load.
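
The token-bit mechanism can be modelled with a small simulation. This is a sketch under stated assumptions: memory is a dictionary, an address missing from it counts as invalid, and the names `Reg`, `LoadFault`, `ld_s`, `chk_s`, and `run` are hypothetical helpers for illustration rather than anything defined by the architecture.

```python
from dataclasses import dataclass


class LoadFault(Exception):
    """Stands in for the exception a non-speculative load would raise."""


@dataclass
class Reg:
    value: int = 0
    nat: bool = False    # the token bit from the slide


def ld(memory, address):
    """Ordinary load: faults immediately on an invalid address."""
    if address not in memory:
        raise LoadFault(hex(address))
    return Reg(memory[address])


def ld_s(memory, address):
    """Speculative load: defers the fault by setting the token bit."""
    if address not in memory:
        return Reg(nat=True)
    return Reg(memory[address])


def add(x, y):
    """Arithmetic propagates the token bit from either source."""
    return Reg(x.value + y.value, x.nat or y.nat)


def chk_s(reg, fix_up):
    """Check: run the fix-up code only if the token bit is set."""
    return fix_up() if reg.nat else reg


def run(memory, address, branch_taken):
    r8 = ld_s(memory, address)          # load hoisted above the branch
    if branch_taken:                    # br: r8 is never used on this path
        return None
    r8 = chk_s(r8, lambda: ld(memory, address))
    return add(r8, Reg(1)).value        # use
```

With an invalid address, `run` raises `LoadFault` only when the branch falls through to the use; when the branch is taken, the deferred exception is never delivered.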

19
Data Speculation
Traditional Architectures:
  instr 1
  instr 2
  . . .
  store         (barrier)
  load
  use
The load can't be elevated above the store, which
prevents the instructions from being reordered.

IA-64:
  ld.a          (entry recorded in the ALAT)
  instr 1
  instr 2
  store
  ld.c or chk.a
  use
Allows the compiler to elevate the load even if it
isn't sure whether the memory references overlap.
20
Advanced Load Address Table ALAT
ALAT entries:
  reg | Address
  reg | Address
  reg | Address
  ...
  • When a load is elevated as ld.a, an entry is
    inserted into the ALAT.
  • On a store, ALAT entries whose addresses overlap
    the store are removed.
  • On chk.a, if no entry is found, there was a
    conflict, and execution jumps to fix-up code that
    re-executes the load.
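
The three ALAT rules can be sketched as a small table keyed by register. This is a simplified model, assuming memory is a dictionary of 8-byte cells and that an entry records an address and a size; the class `ALAT` and the helper `load_above_store` are hypothetical names for illustration.

```python
class ALAT:
    """Toy Advanced Load Address Table: register name -> (address, size)."""

    def __init__(self):
        self.entries = {}

    def ld_a(self, memory, reg, address, size=8):
        """Advanced load: read memory and record the address range."""
        self.entries[reg] = (address, size)
        return memory[address]

    def store(self, memory, address, value, size=8):
        """Store: write memory and drop every entry whose range overlaps."""
        memory[address] = value
        self.entries = {
            reg: (a, s)
            for reg, (a, s) in self.entries.items()
            if a + s <= address or address + size <= a
        }

    def chk_a(self, reg):
        """Return True if the entry survived, i.e. speculation succeeded."""
        return reg in self.entries


def load_above_store(memory, load_addr, store_addr, store_value):
    alat = ALAT()
    r8 = alat.ld_a(memory, "r8", load_addr)      # load hoisted above the store
    alat.store(memory, store_addr, store_value)
    if not alat.chk_a("r8"):                     # conflict: run the fix-up
        r8 = memory[load_addr]                   # re-execute the load
    return r8
```

When the store hits a different address, the hoisted value is used as is; when it hits the same address, the check fails and the reloaded value is used instead.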
21
Speculation Benefits
  • Reduces impact of memory latency
  • Study demonstrates performance improvement of 80%
    when combined with predication
  • Greatest improvement to code with many cache
    accesses
  • Scheduling flexibility enables new levels of
    performance headroom

22
Software Pipelining
  • Overlap the execution of different loop
    iterations
  • Get more iterations in same amount of time

23
Software Pipelining Example
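for (i = 0; i < 1000; i++) x[i] = x[i] + s;

Original loop:
Loop: Ld   f0,0(r1)
      Add  f0,f0,f1
      Sd   f0,0(r1)
      Add  r1,r1,8
      Subi r2,r2,1
      Bnez r2,loop

Software-pipelined loop (steady state; start-up and wind-down code omitted):
Loop: Sd   f2,-8(r1)    ; store the result for x[i-1]
      Add  f2,f0,f1     ; compute x[i] + s
      Ld   f0,8(r1)     ; load x[i+1]
      Add  r1,r1,8
      Subi r2,r2,1
      Bnez r2,loop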
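
The same transformation can be written out at a higher level, including the start-up and wind-down code the assembly listing leaves out. This is a minimal sketch assuming a three-stage schedule (load, add, store); the function names are hypothetical, and Python executes the statements sequentially, so it only shows the reordering, not the parallel issue.

```python
def add_scalar(x, s):
    """Reference loop: each iteration loads, adds, and stores one element."""
    out = list(x)
    for i in range(len(out)):
        loaded = out[i]
        total = loaded + s
        out[i] = total
    return out


def add_scalar_pipelined(x, s):
    """Software-pipelined form: one kernel pass mixes three iterations."""
    out = list(x)
    n = len(out)
    if n < 2:
        return add_scalar(out, s)   # too short to fill the pipeline
    # Prologue: start the first two iterations.
    loaded = out[0]                 # load x[0]
    total = loaded + s              # add for x[0]
    loaded = out[1]                 # load x[1]
    # Kernel: store x[i-1], add for x[i], load x[i+1].
    for i in range(1, n - 1):
        out[i - 1] = total
        total = loaded + s
        loaded = out[i + 1]
    # Epilogue: drain the last two iterations.
    out[n - 2] = total
    total = loaded + s
    out[n - 1] = total
    return out
```

The kernel body has the same three operations as the original loop, but none of them depends on another operation in the same pass, which is what lets the hardware issue them together.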
24
Software Pipelining Advantages
  • Overlapping iterations is traditionally achieved
    through loop unrolling
  • Less code compared with loop unrolling, and
    increased regularity
  • Smaller code means fewer cache misses
  • Especially useful for integer code with small
    number of loop iterations

25
Software Pipelining Disadvantages
  • Requires many additional instructions to manage
    the loop
  • Without hardware support the overhead may greatly
    increase code size
  • Typically only used in special technical
    computing applications

26
IA-64 Features Supporting Software Pipelining
  • Full predication
  • Circular Buffer of General and FP Registers
  • Loop Branches Decrement RRBs (register rename
    bases)
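
Register rotation can be sketched as a circular buffer indexed through a rename base. This is a simplified model, assuming a single register file and a single RRB; the class name `RotatingRegisterFile` and its methods are hypothetical, and the real architecture's register numbering and separate rename bases are not modelled.

```python
class RotatingRegisterFile:
    """Toy circular register file addressed through a register rename base."""

    def __init__(self, size):
        self.physical = [None] * size
        self.rrb = 0

    def _index(self, logical):
        # The logical register number is offset by the RRB, modulo the size.
        return (logical + self.rrb) % len(self.physical)

    def read(self, logical):
        return self.physical[self._index(logical)]

    def write(self, logical, value):
        self.physical[self._index(logical)] = value

    def loop_branch(self):
        """Model the loop branch decrementing the RRB."""
        self.rrb = (self.rrb - 1) % len(self.physical)
```

A value written to logical register 0 in one iteration is visible as logical register 1 after the next loop branch, so each pipeline stage can use a fixed register name without copying values between iterations.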

27
Summary
  • Predication removes branches
  • Parallel compares increase parallelism
  • Benefits complex control flow: large databases
  • Speculation reduces memory latency impact
  • IA-64 removes recovery from critical path
  • Benefits applications with poor cache locality:
    server applications, OS
  • S/W pipelining support with minimal overhead
    enables broad usage
  • Performance for small integer loops with unknown
    trip counts as well as monster FP loops

28
Reference
  • M. S. Schlansker, "EPIC: Explicitly Parallel
    Instruction Computing", Computer, vol. ?, No. ?,
    pp. 37-45, 2000.
  • Jerry Huck et al., "Introducing the IA-64
    Architecture", Sept.-Oct. 2000, pp. 12-23.
  • Carole Dulong, "The IA-64 Architecture at
    Work", Computing Practices.