Inter-Iteration Scalar Replacement in the Presence of Control-Flow

Slides: 23
Provided by: mbu63
Learn more at: http://www.cs.cmu.edu

1
Inter-Iteration Scalar Replacement in the
Presence of Control-Flow
  • Mihai Budiu, Microsoft Research, Silicon Valley
  • Seth Copen Goldstein, Carnegie Mellon University
  • ODES 2005

2
Summary
  • What: compiler optimization
  • Where: dense, regular matrix codes
    • FORTRAN
    • some media processing
  • Goal: reduce the number of memory accesses
  • How: allocate array elements to registers
  • New: optimal algorithm based on predication

3
Outline
  • Scalar Replacement
  • Predicated PRE
  • Combining the two
  • Results

4
Scalar Replacement
Front-end
    Before:
        a[i] = a[i] + 2;
        a[i] <<= 4;
    After:
        tmp = a[i];
        tmp += 2;
        tmp <<= 4;
        a[i] = tmp;
Back-end
    Before: ld a[i], arith, ... st a[i], ld a[i], arith, st a[i]
    After:  ld a[i], arith, arith, st a[i]
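The transformation on this slide can be sketched in C as follows. This is only an illustration: the function names are hypothetical, the element type is assumed to be `unsigned` (so the shift is well defined), and the first update is assumed to be an addition.

```c
#include <stddef.h>

/* Original form: each statement reads a[i] from memory and writes it back. */
void update_in_memory(unsigned *a, size_t n) {
    for (size_t i = 0; i < n; i++) {
        a[i] = a[i] + 2;   /* load, add, store */
        a[i] <<= 4;        /* load, shift, store */
    }
}

/* Scalar-replaced form: one load and one store per element,
 * with the intermediate value kept in a scalar. */
void update_scalarized(unsigned *a, size_t n) {
    for (size_t i = 0; i < n; i++) {
        unsigned tmp = a[i];   /* single load */
        tmp += 2;
        tmp <<= 4;
        a[i] = tmp;            /* single store */
    }
}
```

Both functions compute the same values; the second one simply makes explicit that the intermediate result never needs to live in memory.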
5
Inter-Iteration Scalar Replacement
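Before:
    for (i = 0; i < N; i++)
        a[i] += a[i+1];
After:
    tmp0 = a[0];
    for (i = 0; i < N; i++) {
        tmp1 = a[i+1];
        a[i] = tmp0 + tmp1;
        tmp0 = tmp1;
    }
Runtime
    Before:  i=0: ld a[0], ld a[1], st a[0]    i=1: ld a[1], ld a[2], st a[1]
    After:   ld a[0]    i=0: ld a[1], st a[0]    i=1: ld a[2], st a[1]
    (a[1] is loaded only once, into tmp1)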
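A minimal C sketch of the loop on this slide, before and after inter-iteration scalar replacement. The function names are hypothetical, the combining operator is assumed to be addition, and an empty-loop guard is added so that a[0] is not read when the loop body would never run.

```c
#include <stddef.h>

/* Original loop: two loads and one store per iteration.
 * a must have n + 1 elements, because a[i + 1] is read. */
void sum_next_original(int *a, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] += a[i + 1];
}

/* Scalar-replaced loop: the a[i + 1] loaded in iteration i is the a[i]
 * needed in iteration i + 1, so it is carried across in tmp0. */
void sum_next_scalarized(int *a, size_t n) {
    if (n == 0)
        return;                /* nothing to do; avoid touching a[0] */
    int tmp0 = a[0];
    for (size_t i = 0; i < n; i++) {
        int tmp1 = a[i + 1];   /* the only load in the loop body */
        a[i] = tmp0 + tmp1;
        tmp0 = tmp1;           /* hand the value to the next iteration */
    }
}
```

The carry is valid because iteration i writes only a[i], so a[i + 1] still holds its original value when the next iteration needs it.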
6
Rotating Scalars
Before:
    for (i = 0; i < N; i++)
        a[i] += a[i+3];
After:
    for (...) {
        ...
        tmp0 = tmp1;
        tmp1 = tmp2;
        tmp2 = tmp3;
        tmp3 = a[i+4];
    }
Invariant: tmp0 = a[i+0], tmp1 = a[i+1], tmp2 = a[i+2], tmp3 = a[i+3]
Itanium has hardware support for rotating registers.
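The rotation scheme can be sketched in C roughly as below. The function names are hypothetical and the combining operator is assumed to be addition. One deliberate difference from the slide's schematic: the new element is loaded at the top of each iteration rather than at the bottom, so the sketch never reads one element past what the original loop touches.

```c
#include <stddef.h>

/* Original loop: a must have n + 3 elements. */
void sum_dist3_original(int *a, size_t n) {
    for (size_t i = 0; i < n; i++)
        a[i] += a[i + 3];
}

/* Rotating scalars. At the top of iteration i:
 *   tmp0 == a[i], tmp1 == a[i + 1], tmp2 == a[i + 2]. */
void sum_dist3_rotating(int *a, size_t n) {
    if (n == 0)
        return;
    int tmp0 = a[0], tmp1 = a[1], tmp2 = a[2];   /* prologue loads */
    for (size_t i = 0; i < n; i++) {
        int tmp3 = a[i + 3];   /* the only load per iteration */
        a[i] = tmp0 + tmp3;
        tmp0 = tmp1;           /* rotate the window by one element */
        tmp1 = tmp2;
        tmp2 = tmp3;
    }
}
```

On a machine with rotating registers the three copies at the end of the body would presumably cost nothing; in plain C they are ordinary moves that a register allocator or unroller may remove.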
7
Control-Flow
    for (i = 0; i < N; i++)
        if (i & 1)
            a[i] += a[i+3];
8
Outline
  • Scalar Replacement
  • Predicated PRE
  • Combining the two
  • Results

9
Availability
    y = a[i];
    ...
    if (x) {
        ...
    }
    ... = a[i];      (a[i] is available: use y)

10
Conservative Analysis
    if (x) {
        ...
        y = a[i];
        ...
    }
    ... = a[i];      (use y?)
11
Predicated PRE
    flag = false;
    if (x) {
        ...
        y = a[i];
        flag = true;
        ...
    }
    ... = flag ? y : a[i];
Invariant: flag == true implies y == a[i]
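A small C sketch of the flag idea on this slide. The `load` helper and its counter are hypothetical instrumentation added only to make the saved memory access observable, and the sketch assumes nothing between the two uses writes to a[i].

```c
#include <stdbool.h>

static int load_count = 0;                      /* instrumentation only */
static int load(const int *p) { load_count++; return *p; }

/* Original: the second use reloads a[i] even when y already holds it. */
int reuse_original(const int *a, int i, bool x) {
    int y = 0;
    if (x)
        y = load(&a[i]);
    return y + load(&a[i]);
}

/* Predicated PRE: flag records at run time whether y holds a[i].
 * Invariant: flag == true implies y == a[i]. */
int reuse_predicated(const int *a, int i, bool x) {
    bool flag = false;
    int y = 0;
    if (x) {
        y = load(&a[i]);
        flag = true;
    }
    int z = flag ? y : load(&a[i]);   /* load only if not already available */
    return y + z;
}
```

When `x` is true the predicated version performs one load instead of two; when `x` is false both versions perform exactly one.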
12
Outline
  • Scalar Replacement
  • Predicated PRE
  • Combining the two
  • Results

13
Scalars and Flags
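    for (i = 0; i < N; i++)
        if (i & 1)
            a[i] += a[i+3];
Invariant:
    (valid0 == true) implies tmp0 == a[i+0]
    (valid1 == true) implies tmp1 == a[i+1]
    (valid2 == true) implies tmp2 == a[i+2]
    (valid3 == true) implies tmp3 == a[i+3]
tmp0..tmp3: scalar; valid0..valid3: bool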
14
Scalar Replacement Algorithm
Loads:
    ld a[i+k]       becomes   if (!validk) { tmpk = a[i+k]; validk = true; }
    Can be implemented with predication or conditional moves.
Stores:
    st a[i+k], v    becomes   tmpk = v; validk = true;
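The load rule can be sketched in C on a guarded loop as follows. Everything named here is an assumption for illustration: the `cond` array is a hypothetical stand-in for control flow the compiler cannot analyze, `load` and its counter are instrumentation, and the store is kept write-through (it updates both the scalar and memory) to keep the sketch short.

```c
#include <stdbool.h>
#include <stddef.h>

static unsigned long loads_from_a = 0;          /* instrumentation only */
static int load(const int *p) { loads_from_a++; return *p; }

/* Original loop: a must have n + 3 elements, cond must have n. */
void guarded_original(int *a, const bool *cond, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (cond[i])
            a[i] = load(&a[i]) + load(&a[i + 3]);
}

/* Scalars plus valid flags.
 * Invariant: valid[k] implies tmp[k] == a[i + k]. */
void guarded_scalarized(int *a, const bool *cond, size_t n) {
    int  tmp[4]   = { 0, 0, 0, 0 };
    bool valid[4] = { false, false, false, false };

    for (size_t i = 0; i < n; i++) {
        if (cond[i]) {
            /* Each load becomes a load guarded by its valid flag. */
            if (!valid[0]) { tmp[0] = load(&a[i]);     valid[0] = true; }
            if (!valid[3]) { tmp[3] = load(&a[i + 3]); valid[3] = true; }
            tmp[0] += tmp[3];
            a[i] = tmp[0];      /* write-through store (simplification) */
        }
        /* Rotate scalars and flags so the invariant holds for i + 1. */
        for (int k = 0; k < 3; k++) {
            tmp[k]   = tmp[k + 1];
            valid[k] = valid[k + 1];
        }
        valid[3] = false;
    }
}
```

Whether a load is skipped depends only on what earlier iterations actually executed, which is why no static reasoning about `cond` is needed.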
15
Optimality
  • No scalarized memory location is read or written
    two times
  • The resulting program touches exactly the same
    memory locations as the original program
  • Proof: trivial, based on the valid-flags invariant

(Given perfect dependence analysis and enough registers.)
16
Additional Details
(see paper)
  • Initialize validk to false
  • Rotate scalars and valid flags
  • Use dirtyk flags to avoid extra stores
  • Postlude for missing stores:
    if (validk) a[N+k] = tmpk
  • Lift loop-invariant accesses
    (finding loop-invariant predicates)
  • Hardware support
    (for rotating registers and flags)
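One possible reading of the dirty-flag and postlude bullets, sketched in C. This is not taken from the paper: the loop, the `cond` array, and the `ld`/`st` counting helpers are all hypothetical, and the postlude here tests the dirty flag rather than the valid flag so that values that were only read are not written back. It also assumes `cond` and `a` do not overlap.

```c
#include <stdbool.h>
#include <stddef.h>

static unsigned long n_loads = 0, n_stores = 0;   /* instrumentation only */
static int  ld(const int *p)  { n_loads++;  return *p; }
static void st(int *p, int v) { n_stores++; *p = v; }

/* Original loop: a must have n + 1 elements, cond must have n.
 * a[i + 1] is stored in iteration i and may be overwritten in i + 1. */
void chain_original(int *a, const bool *cond, size_t n) {
    for (size_t i = 0; i < n; i++) {
        st(&a[i + 1], ld(&a[i]) + 1);
        if (cond[i])
            st(&a[i], 0);
    }
}

/* Deferred stores: a store only updates the scalar and sets valid/dirty;
 * memory is written when the element rotates out of the window. */
void chain_scalarized(int *a, const bool *cond, size_t n) {
    int  tmp[2]   = { 0, 0 };
    bool valid[2] = { false, false };   /* initialized to false */
    bool dirty[2] = { false, false };

    for (size_t i = 0; i < n; i++) {
        if (!valid[0]) { tmp[0] = ld(&a[i]); valid[0] = true; }
        tmp[1] = tmp[0] + 1; valid[1] = true; dirty[1] = true;
        if (cond[i]) { tmp[0] = 0; valid[0] = true; dirty[0] = true; }

        if (dirty[0])
            st(&a[i], tmp[0]);          /* a[i] leaves the window */
        tmp[0] = tmp[1]; valid[0] = valid[1]; dirty[0] = dirty[1];
        valid[1] = false; dirty[1] = false;
    }
    /* Postlude: after the last rotation only slot 0 can still be dirty. */
    if (dirty[0])
        st(&a[n], tmp[0]);
}
```

In this sketch every element is written at most once and only the first element is ever loaded, which matches the spirit of the optimality claim under the stated assumptions.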
17
Outline
  • Scalar Replacement
  • Predicated PRE
  • Combining the two
  • Results

18
Redundant Stores
[Chart: reduction in stores]
19
Redundant Loads
[Chart: reduction in loads]
20
Performance Impact
Target: Spatial Computation
Removed accesses tend to be cache hits, so they make only a small
contribution to running time.
[Chart: reduction in running time]
21
Conclusions
  • Use predicates to dynamically detect redundant
    memory accesses
  • Simple algorithm gives optimal result even with
    un-analyzable control flow
  • Can dramatically reduce memory accesses

22
Related Work
  • Carr & Kennedy, PLDI 1990: Scalar Replacement. Arrays, no control flow.
  • Carr & Kennedy, SPE 1994: Generalized Scalar Replacement. Restricted
    control-flow.
  • Morel & Renvoise, CACM 1979: Partial Redundancy Elimination. Not across
    remote iterations.
  • Scholz, Europar 2003: Predicated PRE. Single iteration, no writes.
  • This work, ODES 2005: PPRE across iterations. Optimal.
Non-speculative promotion
Speculative promotion