Loop Unrolling - PowerPoint PPT Presentation

About This Presentation
Title:

Loop Unrolling

Description:

Poison bit: applied to destination register. set upon exception. raise exception upon access to poisoned register. Michigan State University ... – PowerPoint PPT presentation

Slides: 30
Provided by: richard139

Transcript and Presenter's Notes

Title: Loop Unrolling


1
Loop Unrolling and Predication
  • CSE 820

2
Software Pipelining
  • With software pipelining a reorganized loop
    contains instructions from different iterations
    of the original loop.
  • Sometimes called symbolic loop unrolling.

3
Software Pipelined Loop
4
Unrolled Loop: select a subset of each iteration
(bold in the original slide, marked "selected" here)
  • Iteration 1:
    L.D F0,0(R1)
    ADD.D F4,F0,F2
    S.D F4,0(R1) (selected)
  • Iteration 2:
    L.D F0,0(R1)
    ADD.D F4,F0,F2 (selected)
    S.D F4,0(R1)
  • Iteration 3:
    L.D F0,0(R1) (selected)
    ADD.D F4,F0,F2
    S.D F4,0(R1)

5
Software Pipelining
  • Loop: S.D F4,16(R1) ; stores into M[i]
  • ADD.D F4,F0,F2 ; adds to M[i-1]
  • L.D F0,0(R1) ; loads M[i-2]
  • DADDUI R1,R1,-8
  • BNE R1,R2,Loop
  • Requires start-up and clean-up code.
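
The need for start-up and clean-up code can be seen in a small simulation. What follows is a minimal Python sketch, not the slides' code: the function names are hypothetical, a Python list stands in for memory, and the index counts upward rather than downward as R1 does in the assembly. Each pass of the kernel stores the result for element i, adds for element i+1, and loads element i+2.

```python
def add_scalar_plain(a, s):
    """Reference loop: one load, one add, one store per iteration."""
    out = list(a)
    for i in range(len(out)):
        f0 = out[i]          # L.D
        f4 = f0 + s          # ADD.D
        out[i] = f4          # S.D
    return out


def add_scalar_pipelined(a, s):
    """Software-pipelined form: each kernel pass mixes three iterations."""
    out = list(a)
    n = len(out)
    if n < 3:                # too short to fill the pipeline
        return add_scalar_plain(out, s)
    # Start-up: load and add for element 0, load for element 1.
    f4 = out[0] + s
    f0 = out[1]
    # Kernel: store for i, add for i+1, load for i+2.
    for i in range(n - 2):
        out[i] = f4
        f4 = f0 + s
        f0 = out[i + 2]
    # Clean-up: finish the two iterations still in flight.
    out[n - 2] = f4
    out[n - 1] = f0 + s
    return out
```

Both functions return the same list for any input; the pipelined version simply moves the first loads and add before the loop and the last add and stores after it.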

6
Symbolic Loop Unrolling
  • Software pipelining can be thought of as symbolic
    loop unrolling, but has the advantage of
    generating less code.

7
Software Pipelining has less overhead
8
Global Code Scheduling
  • Allows moving instructions across branches.
  • Most techniques concentrate on determining a
    straight-line code segment representing the most
    frequently executed code.

9
Trace Scheduling
  • Concept
  • Guess the likely path through the branches (called
    the trace)
  • The trace now contains long stretches of code without
    taken branches (as predicted)
  • Schedule the trace, allowing movement across
    branches
  • Add code off the trace to undo the effects of the
    movement
  • The increased ability to move across branches
    should improve scheduling

10
Movement Undo
  • Consider:
  • if (cond) then x = x + 5 // likely
    else // unlikely
  • After movement:
  • x = x + 5
  • if (cond) then // likely
    else x = x - 5 // unlikely: undo
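
A minimal Python sketch of the same transformation follows. It assumes the moved statement is x = x + 5 and that the compensation code is x = x - 5; the function names are hypothetical.

```python
def original(cond, x):
    """Before movement: the add sits on the likely path only."""
    if cond:              # likely
        x = x + 5
    return x


def scheduled(cond, x):
    """After movement: the add is hoisted above the branch."""
    x = x + 5             # executed on both paths now
    if not cond:          # unlikely path, off the trace
        x = x - 5         # compensation code undoes the move
    return x
```

The two functions agree for every input; the likely path pays nothing for the move, and only the unlikely path executes the extra undo instruction.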

11
Select a trace
12
Trace showing jumps off the trace
13
Superblocks
  • Avoid the multiple entries and exits of traces.
  • A superblock has one entry and multiple exits, which
    makes scheduling easier.
  • The one-entry, multiple-exit form is achieved by
    duplicating code where the unlikely path exits
    the trace, so that no reentry is needed.

14
Superblock one entry and multiple exits
15
Predicated Instructions
  • Requires:
  • hardware support
  • ISA modification
  • Predicated instructions eliminate branches,
    converting a control dependence into a data
    dependence.
  • IA-64 has predicated instructions, but many
    existing ISAs contain at least one (the
    conditional move).

16
Conditional Move
  • if (R1 == 0) R2 = R3
  • Branch:
    BNEZ R1,L
    ADDU R2,R3,R0
    L:
  • Conditional move:
    CMOVZ R2,R3,R1
  • In a pipeline, the control dependence at the
    beginning of the pipeline is transformed into a
    data dependence at the end of the pipeline.
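
The equivalence of the two sequences can be checked with a minimal Python sketch. The function names are hypothetical and plain integers stand in for registers; this models only the register-level effect, not pipeline timing.

```python
def with_branch(r1, r2, r3):
    """BNEZ R1,L / ADDU R2,R3,R0 / L: returns the final R2."""
    if r1 != 0:           # BNEZ R1,L: skip the move when R1 is nonzero
        return r2
    r2 = r3 + 0           # ADDU R2,R3,R0 (R0 is always zero)
    return r2


def with_cmovz(r1, r2, r3):
    """CMOVZ R2,R3,R1: copy R3 into R2 only when R1 is zero."""
    return r3 if r1 == 0 else r2
```

The branch version decides which instructions execute; the conditional-move version always executes one instruction whose result depends on R1 as data.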

17
Full Predication
  • Every instruction has a predicate; if the
    predicate is false, the instruction becomes a NOP.
  • It is particularly useful for global scheduling,
    since non-loop branches (the harder ones to
    schedule) can be eliminated.
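
As an illustration, here is a minimal Python sketch of a toy predicated machine. Everything in it is hypothetical: the tuple encoding of instructions, the predicate names p1 and p2, and the example conditional (if a == 0 then a = b, else a = a + 4) are assumptions made for the sketch, not an actual ISA.

```python
def run_predicated(program, regs, preds):
    """Run (predicate, dest, fn, sources) tuples.

    An instruction whose predicate is false is skipped, i.e. acts as a NOP.
    """
    regs = dict(regs)
    for pred, dest, fn, srcs in program:
        if not preds[pred]:
            continue                          # annulled
        regs[dest] = fn(*(regs[s] for s in srcs))
    return regs


def if_converted(a, b):
    """Branch-free form of: if a == 0 then a = b, else a = a + 4."""
    preds = {"p1": a == 0, "p2": a != 0}      # complementary predicates
    program = [
        ("p1", "A", lambda vb: vb, ("B",)),       # (p1) A = B
        ("p2", "A", lambda va: va + 4, ("A",)),   # (p2) A = A + 4
    ]
    return run_predicated(program, {"A": a, "B": b}, preds)["A"]
```

Both instructions are issued on every run and exactly one takes effect, so the program contains no branch at all.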

18
Exceptions and Predication
  • A predicated instruction must not be allowed to
    generate an exception if its predicate is false.

19
Implementation
  • Predicated instructions can be annulled early in
    the pipeline, but annulling at commit delays the
    decision, so data hazards have an opportunity to
    be resolved.
  • The disadvantage of late annulment is that
    annulled instructions still use resources such as
    functional units and registers (rename or other).

20
Predication is good for
  • Short alternative control flow
  • Eliminating some unpredictable branches
  • Reducing the overhead of global scheduling
  • But the precise rules for compilation are still
    being determined.

21
Limitations
  • Annulled instructions waste resources: registers,
    functional units, and cache and memory bandwidth.
  • If the predicate condition cannot be evaluated
    early (separated from the predicated instruction),
    a branch might have performed better, provided it
    could have been accurately predicted.

22
Limitations (cont)
  • Predication across multiple branches can
    complicate control and is undesirable unless
    hardware supports it (as in IA-64).
  • Predicated instructions may have a speed
    penalty; this is not the case when all
    instructions are predicated.

23
Example
  • if (A == 0) A = B; else A = A + 4;
  • LD R1,0(R3) ; load A
  • BNEZ R1,L1 ; test A
  • LD R1,0(R2) ; then clause
  • J L2 ; skip else
  • L1: DADDI R1,R1,4 ; else clause
  • L2: SD R1,0(R3) ; store A

24
Hoist Load
  • if (A == 0) A = B; else A = A + 4;
  • LD R1,0(R3) ; load A
  • LD R14,0(R2) ; speculative load of B
  • BEQZ R1,L3 ; other branch of the if
  • DADDI R14,R1,4 ; else clause
  • L3: SD R14,0(R3) ; store A
  • What if the speculative load raises an exception?

25
Guard
  • if (A == 0) A = B; else A = A + 4;
  • LD R1,0(R3) ; load A
  • sLD R14,0(R2) ; speculative load
  • BNEZ R1,L1 ; test A
  • SPECCK 0(R2) ; speculation check
  • J L2 ; skip else
  • L1: DADDI R14,R1,4 ; else clause
  • L2: SD R14,0(R3) ; store A
  • sLD does not raise certain exceptions; it leaves
    them for SPECCK (as in IA-64).
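
The guarded sequence can be modelled with a minimal Python sketch. This is a toy model under stated assumptions, not IA-64 semantics: the class ToyMemory, the exception Fault, and the choice to record a deferred fault per address are all hypothetical, and an address missing from the dictionary stands for one that would raise an exception.

```python
class Fault(Exception):
    """Stands in for a memory exception."""


class ToyMemory:
    def __init__(self, cells):
        self.cells = dict(cells)
        self.deferred = set()        # addresses with a deferred fault

    def load(self, addr):
        """Ordinary load (LD): faults immediately on a bad address."""
        if addr not in self.cells:
            raise Fault(addr)
        return self.cells[addr]

    def spec_load(self, addr):
        """Speculative load (sLD): records the fault instead of raising."""
        if addr not in self.cells:
            self.deferred.add(addr)
            return 0                 # placeholder value, never used safely
        return self.cells[addr]

    def spec_check(self, addr):
        """Speculation check (SPECCK): raises the deferred fault, if any."""
        if addr in self.deferred:
            raise Fault(addr)


def update_a(mem, addr_a, addr_b):
    r1 = mem.load(addr_a)            # LD     R1,0(R3)
    r14 = mem.spec_load(addr_b)      # sLD    R14,0(R2)
    if r1 == 0:
        mem.spec_check(addr_b)       # SPECCK 0(R2), then path only
    else:
        r14 = r1 + 4                 # DADDI  R14,R1,4
    mem.cells[addr_a] = r14          # SD     R14,0(R3)
    return r14
```

When A is nonzero, a bad address for B causes no fault, because the check sits only on the path that actually uses the loaded value.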

26
Other exception techniques
  • Poison bit:
  • applied to the destination register
  • set upon exception
  • an exception is raised upon access to a poisoned
    register
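
A minimal Python sketch of the poison-bit idea follows. The class and method names are hypothetical, a missing dictionary key stands for a faulting address, and the rule that an ordinary write clears the poison bit is an assumption of this toy model.

```python
class PoisonFault(Exception):
    """Raised when a poisoned register is read."""


class RegisterFile:
    def __init__(self):
        self.value = {}
        self.poison = set()              # registers with the poison bit set

    def write(self, reg, val):
        """Ordinary write; assumed here to clear the poison bit."""
        self.value[reg] = val
        self.poison.discard(reg)

    def spec_load(self, reg, memory, addr):
        """Speculative load: on a fault, poison the destination register."""
        if addr in memory:
            self.write(reg, memory[addr])
        else:
            self.value[reg] = 0          # placeholder value
            self.poison.add(reg)         # set upon exception

    def read(self, reg):
        """Reading a poisoned register raises the deferred exception."""
        if reg in self.poison:
            raise PoisonFault(reg)
        return self.value[reg]
```

The exception is tied to the register rather than to the load, so it surfaces only if some later instruction actually consumes the speculative value.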

27
Hoist Load above Store
  • If memory addresses are known, a load can be
    hoisted above a store.
  • If not, add a special instruction to check the
    addresses before the loaded value is used.
    (It is similar to the SPECCK shown earlier; IA-64.)
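
The address check can be sketched in a few lines of Python. This is a toy model with hypothetical function names: a dictionary stands for memory, and the fix-up simply redoes the load when the store turned out to hit the same address.

```python
def store_then_load(mem, store_addr, store_val, load_addr):
    """Original order: the store executes before the load."""
    mem = dict(mem)
    mem[store_addr] = store_val
    return mem[load_addr]


def load_hoisted(mem, store_addr, store_val, load_addr):
    """Load moved above the store, with an address check before use."""
    mem = dict(mem)
    loaded = mem[load_addr]          # hoisted load
    mem[store_addr] = store_val      # the store it was moved above
    if store_addr == load_addr:      # check: did the store alias the load?
        loaded = mem[load_addr]      # fix-up: redo the load
    return loaded
```

Both orders return the same value whether or not the two addresses alias; the check costs one comparison in the common non-aliasing case.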

28
Speculation: software vs. hardware
  • To speculate, memory references must be
    disambiguated (to hoist loads past stores), but
    the information available at compile time is
    insufficient.
  • Hardware works best when control flow is
    unpredictable and when hardware branch prediction
    is superior to compile-time prediction.
  • Exception handling is easier in hardware.
  • Trace techniques require compensation code.
  • Compilers see further ahead, allowing better
    scheduling.

29
IA-64