Load-Reuse Analysis: design and evaluation

1
Load-Reuse Analysis: design and evaluation
  • Rastislav Bodík, Rajiv Gupta, Mary Lou Soffa

2
Partial Redundancy Elimination (PRE)
  • Partially redundant: computed on some, but not all, incoming paths
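
A minimal Python sketch of partial redundancy, not taken from the slides. The names `CountingAdd`, `before_pre`, and `after_pre` are hypothetical. In `before_pre`, a + b is computed on only one incoming path, so the computation after the merge is partially redundant. `after_pre` inserts the computation on the other path and reuses a temporary after the merge.

```python
class CountingAdd:
    """Adds two numbers and counts how many additions were performed."""

    def __init__(self):
        self.calls = 0

    def __call__(self, a, b):
        self.calls += 1
        return a + b


def before_pre(cond, a, b, add):
    if cond:
        x = add(a, b)   # a + b is computed on this path only
    else:
        x = 0
    y = add(a, b)       # partially redundant: recomputed even when cond is true
    return x, y


def after_pre(cond, a, b, add):
    if cond:
        t = add(a, b)
        x = t
    else:
        x = 0
        t = add(a, b)   # inserted on the path where a + b was missing
    y = t               # the merge point reuses t instead of recomputing
    return x, y
```

On the path where `cond` is true the transformed version performs one addition instead of two; on the other path the count stays at one.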

5
Register promotion = PRE of loads
store a1, x
load a2
store a3
load a4
  • Three steps
  • load-reuse analysis: find loads that can reuse prior loads/stores
  • alias analysis: which stores may kill reuse?
  • transformation: remove the redundancy with PRE [PLDI 98]
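
As a rough illustration of what removing loads buys, here is a Python sketch with a hypothetical `Memory` class that counts loads. It is not the authors' transformation. It assumes no other store can alias `addr`, which is exactly what the alias-analysis step would have to establish.

```python
class Memory:
    """Toy memory that counts how many loads were executed."""

    def __init__(self):
        self.cells = {}
        self.loads = 0

    def store(self, addr, value):
        self.cells[addr] = value

    def load(self, addr):
        self.loads += 1
        return self.cells[addr]


def accumulate(mem, addr, values):
    mem.store(addr, 0)
    for v in values:
        # Each load could reuse the value written by the previous store.
        mem.store(addr, mem.load(addr) + v)
    return mem.load(addr)


def accumulate_promoted(mem, addr, values):
    reg = 0                   # the value now lives in a "register"
    mem.store(addr, reg)
    for v in values:
        reg = reg + v
        mem.store(addr, reg)  # stores are kept; only the loads are removed
    return reg
```

For three input values the first version executes four loads and the promoted version none, while memory ends in the same state.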

6
Load-reuse analysis
  • Design goal
  • completeness: find all reuse
  • To approach completeness, the analysis is
  • uniform: analyzes scalar, array, and pointer loads
  • path-sensitive: a different source of reuse on each path
  • Evaluation goal
  • how complete?
  • compare with an ideal analysis
  • Detecting all reuse is undecidable
  • no ideal algorithm exists
  • instead, use simulation

7
Experimental framework
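  • 1. load-reuse analysis: program → data-flow solution
  • 2. simulator: program + input → profile, reuse level
  • 3. estimator: data-flow solution + profile → weighted solution (input to the transformation [PLDI 98])
  • 4. comparison: weighted solution vs. reuse level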
8
1. Load-reuse analysis
  • It's a data-flow analysis
  • on a reuse-aware representation
  • Value Name Graph (VNG) [POPL 98]
  • What's new?
  • Sparse version of the VNG
  • up to 30 times smaller than the non-sparse version
  • Analyzing indirect loads/stores
  • also, model killing stores

9
Naming the value
y = b+c
a = c-1
x = a+b+1
10
names for the value in x: x, a+b+1, b+c
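
The renaming can be checked with a small symbolic substitution. This is a Python sketch, assuming the three statements are y = b+c, a = c-1, and x = a+b+1. The dictionary encoding of linear expressions and the names `substitute` and `CONST` are hypothetical and are not the VNG data structure.

```python
CONST = "1"  # dictionary key used for the constant term of an expression


def substitute(expr, var, definition):
    """Replace `var` in the linear expression `expr` by `definition`.

    Expressions map variable names (or CONST) to integer coefficients.
    """
    coeff = expr.get(var, 0)
    result = {name: c for name, c in expr.items() if name != var}
    for name, c in definition.items():
        result[name] = result.get(name, 0) + coeff * c
    # Drop terms that cancelled out.
    return {name: c for name, c in result.items() if c != 0}


x_name = {"a": 1, "b": 1, CONST: 1}  # a + b + 1, the value stored into x
a_def = {"c": 1, CONST: -1}          # a = c - 1
y_name = {"b": 1, "c": 1}            # b + c, the value stored into y

# Moving the name a+b+1 backwards across "a = c-1" yields b+c.
renamed = substitute(x_name, "a", a_def)
```

Here `renamed` equals `y_name`, so under these assumptions x holds the same value that was computed for y.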
11
(Figure: GEN for the value names x, b+c, and a+b+1)
12
Naming the value across loads
fields: f at offset 0, next at offset 4
.. p->f
.. p->next->f
*r = ...
p = p->next
(Figure: value names p and *(p+4), with GEN at *(p+4))
13
kill if r = p+4 or r = *(p+4)
KILL ?
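
A Python sketch of that kill test, assuming the condition reads r = p+4 or r = *(p+4), that `next` sits at offset 4 and `f` at offset 0, and that memory is a plain address-to-value dictionary. The function name `store_kills_reuse` and the addresses are made up for illustration.

```python
def store_kills_reuse(r, p, memory, next_offset=4):
    """Return True if a store through address r can change p->next->f.

    The value p->next->f is named *(*(p+4)): it changes if the store
    overwrites p->next (address p+4) or the field f of the next node
    (address *(p+4), since f is at offset 0).
    """
    next_field = p + next_offset
    return r == next_field or r == memory[next_field]


# Hypothetical layout: node at 100, its successor at 200.
memory = {100: 7, 104: 200, 200: 9, 204: 0}
```

With this layout a store to address 104 or 200 kills the reuse, while a store to 100 (the field p->f) does not.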
14
Sparse representation
for I = 1, N  .. A[I] .. A[I-1]
a1 = &A[I]
load a1
a2 = &A[I-1]
load a2
I = I+1
15
(Figure: sparse representation of the loop, showing GEN, load a1, and load a2)
16
2. The simulator algorithm
for I = 1, N  .. A[I] .. A[I-1]
memory access history (history length 1 to 4)
load a1   A[I]     103  102  101  100
load a2   A[I-1]   102  101  100   99
Simulator detects all PRE-exploitable reuse (up to the given history length), but also some noise, e.g. due to hash table accesses
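
The history idea can be sketched in a few lines of Python. This is not the authors' simulator: it tracks addresses only, ignores stores, and counts a load as reusing whenever its address appears in the recent history of any static load. The names `reuse_fraction` and `loop_trace` are hypothetical, and so is the way the history is indexed (here a load's history includes its own most recent access).

```python
from collections import deque


def reuse_fraction(trace, history_length):
    """Fraction of dynamic loads whose address is found in some load's history.

    trace: sequence of (static_load_id, address) pairs in execution order.
    """
    history = {}  # static load id -> last `history_length` addresses
    total = reused = 0
    for load_id, address in trace:
        total += 1
        if any(address in recent for recent in history.values()):
            reused += 1
        history.setdefault(load_id, deque(maxlen=history_length)).append(address)
    return reused / total if total else 0.0


def loop_trace(base, n):
    """Trace of the loop that loads A[I] (a1) and then A[I-1] (a2)."""
    trace = []
    for i in range(1, n + 1):
        trace.append(("a1", base + i))
        trace.append(("a2", base + i - 1))
    return trace
```

Under this indexing, `load a2` finds its address in the history of `load a1` only when the history holds at least two entries, so `reuse_fraction(loop_trace(100, 4), 2)` reports reuse while a history of length 1 reports none.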
17
Ideal amount of load reuse
% of all dynamic loads
go, m88ksim, gcc, compress, li, ijpeg, vortex, tomcatv, swim, su2cor, hydro
history length: 1, 4
65% of executed loads have reuse exploitable by PRE (intra-procedural reuse, history = 1)
18
3. How frequent is the reuse?
  • Edge profile
  • + cheap and available
  • - cannot reconstruct frequencies of reuse paths
(Figure: control flow graph annotated with edge frequencies, with three "load x" nodes and one "kill x" node)
19
  • Path profile
  • + precise
  • - more expensive
  • Use edge profile instead, but
  • bound its inherent error
  • compute lower and upper bounds on reuse

20
Hierarchy of estimators
Estimator: data-flow solution + edge profile → weighted data-flow solution
PRE, CMP1, CMPc, CMPr, CMPf: each has a smaller error than the one before (but is more complex)
Hierarchy: a practical approach. Is a simple estimator not precise enough? Use the next better one!
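
One way to read the hierarchy is as a fallback loop: try the cheapest estimator first and move on only while its bounds are too wide. The Python sketch below illustrates that reading. The function name `pick_estimate`, the `max_width` threshold, and the numbers in the usage note are hypothetical; only the estimator names come from the slides.

```python
def pick_estimate(estimators, max_width):
    """Run estimators from simplest to most complex until one is tight enough.

    estimators: list of (name, function) pairs; each function returns a
    (lower, upper) bound on the reuse frequency.
    Returns (name, bounds) of the first estimator whose interval is at most
    max_width wide, or of the last estimator if none qualifies.
    """
    chosen_name, chosen_bounds = None, None
    for name, estimate in estimators:
        bounds = estimate()
        chosen_name, chosen_bounds = name, bounds
        lower, upper = bounds
        if upper - lower <= max_width:
            break  # precise enough: skip the more expensive estimators
    return chosen_name, chosen_bounds
```

For example, with stub estimators returning (30, 60) for PRE, (42, 50) for CMP1, and (45, 47) for CMPc, a `max_width` of 10 stops at CMP1 and never runs CMPc.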
21
The algorithms
1. The bounds
  • generators: points generating reuse
  • stealers: points with no reuse
  • upper bound: all reuse consumed
  • lower bound: all reuse stolen
(Figure: control flow graph annotated with edge frequencies, with "load x" and "kill x" nodes)
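
For a single merge-and-split point, the two bounds follow from flow conservation alone. The Python sketch below is a simplified, hypothetical setting rather than the authors' estimators: one incoming edge carries reuse (`gen_freq`), one does not (`other_freq`), and the flow leaves either to the consuming load (`load_freq`) or to a stealer (`steal_freq`).

```python
def reuse_bounds(gen_freq, other_freq, load_freq, steal_freq):
    """Bounds on how often the load receives reuse, from edge frequencies only.

    The edge profile fixes the four frequencies but not how incoming flow is
    paired with outgoing flow, hence an interval instead of a single number.
    """
    if gen_freq + other_freq != load_freq + steal_freq:
        raise ValueError("edge frequencies violate flow conservation")
    # Upper bound: all reuse is consumed, i.e. as much reuse-carrying flow
    # as possible continues to the load.
    upper = min(gen_freq, load_freq)
    # Lower bound: all reuse is stolen, i.e. the stealer takes as much
    # reuse-carrying flow as it can and the load gets the remainder.
    lower = max(0, gen_freq - steal_freq)
    return lower, upper
```

With made-up frequencies of 60 (reuse), 40 (no reuse), 70 (to the load), and 30 (to the stealer), the load is guaranteed reuse on at least 30 and at most 60 executions.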
22
  • 2. Separating uncertainty
  • using the CMP region
  • defined for PRE [PLDI 98]
  • CMP = code-motion preventing
  • all error is contained in the CMP region!

23
Improving precision
one region
connected regions
control flow reachability
network flow reachability
24
Estimators: precision
(Chart: error of the PRE, CMP1, CMPc, CMPr, and CMPf estimators for INT and FP; the error shrinks from PRE to CMPf)
25
4. Analysis: how close to ideal?
100% = reuse seen by simulator
(Chart: reuse found with ideal alias info, and with reuse killed by calls; by array and pointer stores and calls; by all stores and calls)
26
Related Work
  • Load-Reuse Analysis
  • makes value numbering path-sensitive
  • Steffen, Knoop, Rüthing: Value Flow Graph [ESOP 90]
  • we show how to analyze indirect loads, via symbolic evaluation
  • Simulation-based analysis evaluation
  • Diwan, McKinley, Moss [PLDI 98]
  • Type-based alias analysis: how powerful does it need to be?
  • Estimators
  • Ramalingam: Frequency Analysis [PLDI 96]
  • returns a single estimate, not its bounds

27
Summary
  • Load-reuse analysis
  • reuse across indirect memory references
  • sparse representation
  • Estimators: three principles
  • confidence: bound the edge-profile error
  • separation of uncertainty: inside/outside the CMP region
  • hierarchy: increasing precision and complexity
  • Evaluation
  • about 65% of loads are amenable to PRE
  • our analysis can find about 80% of those

28
Combine three removal methods [PLDI 98]
control speculation (S)
code motion (M)
restructuring (R)
29
Example
(Figure: control flow graph with edge frequencies 10 and 50 and three occurrences of a+b)
30
Relative removal power
(Chart: loads removed, dynamic count, normalized, for INT and FP; includes global CSE (path-insensitive))