Title: Potential for Dynamic code elimination
1Potential for Dynamic code elimination
- presented by
- Martin Labrecque
- Peter Yiannacouras
2Target Instructions (1)
- Dead instructions
- their output is overwritten without ever being used
- ADD r1 r2 r3
- .
- .
- .
- .
- MOV r1 0x0001
- always unused, or unused only on some control paths
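The definition above can be sketched as a single pass over a straight-line trace. This is a minimal illustration, not the presenters' simulator code: the `(dest, sources)` trace encoding and the name `find_dead` are assumptions made for the example.

```python
def find_dead(trace):
    """Return indices of instructions whose output is overwritten before any read.

    trace: list of (dest, sources) tuples; dest is a register name or None.
    An output still unread at the end of the trace is not reported as dead,
    because a later instruction could still read it.
    """
    pending = {}  # register -> index of the last writer whose value is still unread
    dead = []
    for idx, (dest, sources) in enumerate(trace):
        # Reading a register makes its pending writer live.
        for reg in sources:
            pending.pop(reg, None)
        if dest is not None:
            # Overwriting an unread value proves the previous writer dead.
            if dest in pending:
                dead.append(pending[dest])
            pending[dest] = idx
    return dead


# Mirrors the slide: ADD writes r1, nothing reads r1, then MOV overwrites r1.
trace = [
    ("r1", ["r2", "r3"]),  # ADD r1 r2 r3
    ("r4", ["r2"]),        # unrelated work that never reads r1 (hypothetical)
    ("r1", []),            # MOV r1 0x0001
]
dead_indices = find_dead(trace)
```

Here only the ADD (index 0) is reported; the final MOV stays unresolved because nothing in the trace has read or overwritten its result yet.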
3Target Instructions (2)
- Silent instructions
- they don't change the value at their output location
- vect contains numbers < 100
- for (i .. N)
- r1 vect[i]
- <silent> r3 (r1 < 100) ? 0 : 1
- <silent> r5 r5 r3
- <silent> r6 r5 4
dependence path
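A small sketch of the loop on this slide, run on a toy register file that counts writes leaving the destination unchanged. The operators on r5 and r6 are not recoverable from the slide, so addition and multiplication are assumed here; the function name and the zero-initialized registers are also assumptions.

```python
def count_silent_writes(vect, limit=100):
    """Run the slide's loop and count silent writes per register.

    A write is silent when the new value equals the value already held
    by the destination register. Registers start at 0 (assumption).
    """
    regs = {"r1": 0, "r3": 0, "r5": 0, "r6": 0}
    silent = {"r3": 0, "r5": 0, "r6": 0}

    def write(reg, value):
        if reg in silent and regs[reg] == value:
            silent[reg] += 1
        regs[reg] = value

    for x in vect:
        write("r1", x)
        write("r3", 0 if regs["r1"] < limit else 1)
        write("r5", regs["r5"] + regs["r3"])  # assumed operator: add
        write("r6", regs["r5"] * 4)           # assumed operator: multiply
    return silent


all_small = count_silent_writes([5, 42, 99])
one_large = count_silent_writes([5, 150, 7])
```

When every element is below the limit, r3 never changes, so the dependent writes to r5 and r6 are silent as well, which is the dependence path the slide points at.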
4Target Instructions (3)
- Value predicted instructions
- their output is known in advance
- they may modify the state of the machine
- for (i .. N)
- MOV r1 r1 8
- ADD r2 cos(r1)
dependence path
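The slides do not say which predictor is used. As one possible illustration, the sketch below uses a last-value predictor with a confidence counter, a common scheme chosen here as an assumption; the class name, the threshold, and the example program counter are hypothetical.

```python
class LastValuePredictor:
    """Predict that an instruction produces the same value as last time."""

    def __init__(self, threshold=2):
        self.table = {}  # pc -> (last_value, confidence)
        self.threshold = threshold

    def predict(self, pc):
        """Return the predicted value, or None if confidence is too low."""
        value, conf = self.table.get(pc, (None, 0))
        return value if conf >= self.threshold else None

    def update(self, pc, actual):
        """Record the real outcome; a repeat raises confidence, a change resets it."""
        value, conf = self.table.get(pc, (None, 0))
        if value == actual:
            self.table[pc] = (actual, conf + 1)
        else:
            self.table[pc] = (actual, 0)


predictor = LastValuePredictor(threshold=2)
pc = 0x40  # hypothetical instruction address
before_training = predictor.predict(pc)
for _ in range(3):
    predictor.update(pc, 8)
after_training = predictor.predict(pc)
predictor.update(pc, 9)
after_mispredict = predictor.predict(pc)
```

An instruction whose output is predicted with enough confidence becomes a removal candidate, while a changed value resets confidence and ends the prediction.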
5Plan of talk
- Background about the target instructions
- How to deal with them
- A safe removal framework
- Result from removals
- Conclusions
6How many are there?
- Silent: 29%
- Dead: 5%
- Predicting stores
- 33% less writeback
- Predicting loads
- 12% speedup
- how long to confirm a dead instruction?
- are the 3 types of instructions chained?
- if so, can we take advantage of that?
7How removal works: dead instructions
- D: feeds only C -> removed
- C: feeds no instruction, dead -> removed
- B: overwrites unused value
- Speculation ends
- when output is killed
- when assumptions fail
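The backward propagation on this slide (C is dead, D feeds only C, so D can go too) can be sketched as a fixed-point over a dependence graph. The graph encoding and the name `removable_set` are assumptions, and the sketch also assumes that each producer's output is eventually killed, which a real implementation would have to verify.

```python
def removable_set(consumers, dead_seeds):
    """Return every instruction removable by following dead chains backward.

    consumers: dict mapping a producer to the set of instructions that read
               its output.
    dead_seeds: instructions already known dead (output overwritten, unread).
    Assumption: the output of every producer is eventually overwritten.
    """
    removed = set(dead_seeds)
    changed = True
    while changed:
        changed = False
        for producer, readers in consumers.items():
            if producer in removed or not readers:
                continue
            # A producer that feeds only removed instructions is itself useless.
            if readers <= removed:
                removed.add(producer)
                changed = True
    return removed


# The slide's example: D feeds only C, C feeds no instruction, B overwrites C's output.
consumers = {"D": {"C"}, "C": set(), "B": set()}
removed = removable_set(consumers, dead_seeds={"C"})
```

B is kept because it is not a seed and has no removed readers; D would also be kept if it fed any live instruction.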
8How removal works: silent and value predictable
- [Figure: dependence chain over instructions A, B, C, D; three instructions are marked "silent / predicted" and one "stays for verification"]
9Infrastructure
- SimpleScalar simulator
- Monitor all instructions
- Dependence graph around target instructions
- Set some history and speculation limits
- Don't remove
- branches
- instructions with no output (e.g., NOP)
- squashed instructions
- instructions that only output to the ZERO register(s)
- trap instructions (ignored)
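The exclusion list above amounts to a simple candidate filter. The sketch below is illustrative only: the `Inst` record, its field names, and the `$zero` register name are assumptions, not SimpleScalar data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Inst:
    """Hypothetical record of one dynamic instruction."""
    opcode: str
    dests: list = field(default_factory=list)  # output register names
    is_branch: bool = False
    is_trap: bool = False
    squashed: bool = False


ZERO_REGS = {"$zero"}  # assumed name of the hard-wired zero register


def is_candidate(inst):
    """Return True if the instruction may be considered for removal."""
    if inst.is_branch or inst.is_trap or inst.squashed:
        return False
    if not inst.dests:  # no output, e.g. NOP
        return False
    if all(d in ZERO_REGS for d in inst.dests):  # writes only the zero register
        return False
    return True


results = [
    is_candidate(Inst("add", ["r1"])),
    is_candidate(Inst("nop")),
    is_candidate(Inst("beq", is_branch=True)),
    is_candidate(Inst("add", ["$zero"])),
    is_candidate(Inst("add", ["r1"], squashed=True)),
]
```

Only the plain ADD with a real destination register passes the filter.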
10Some results
11Results: Percent Dead/Silent/Predicted
- FP benchmarks do well on silent/value predicted
- mgrid has 45% on both
12Number of Candidates
- Only a few instructions are candidates
- On average, 64 are live at a time (408)
13Removal Reoccurrence
Distribution of Percent of Time a Candidate is Removed
- Means at 62%, 64%, and 60% respectively
14Predictor Size for Candidates
- Given that an instruction was removed on the last iteration
15Percent Memory Operations
- Usually not much
- Harder to track memory operations
16Removals Per Chain
- Dead very rarely has more than one!
- Silent up to 12
17Removals Per Chain Combined
- 5x more removals in total
- Up to 50 in a chain
18Speculation Failure
- Unbounded speculation for dead
- Unbounded Rollback cost
19Parameters for Dead Removal
- Exponentially decreasing chance of verifying dead
- Rollback cost increases
- Choose parameters to optimize
Distance Till Dead Speculation is confirmed
20Parameters for Silent/Value Predicted Removal
- Choose so as to not hinder chaining
- too big -> wasted overhead
21Conclusions
- Lots of reuse -> predictors needn't be large
- Can ignore memory operations with little loss
- Dead
- Safe recursive dead elimination is worthless
- Balance speculation and rollback for single removals
- Silent/Value Predicted
- Chains of 12 can be removed
- Much more effective when combined (chains of 50)
22Applications
- Helper threads
- Don't need guaranteed correctness
- Can use removals to speed ahead of real thread
- Program Phase Analysis
- Dynamic Specialization at instruction level
23Extra slides
24Three Parameters for Dead Removal
- backward distance
- forward distance
- predecessor distance
25One Parameter for Silent and Predicted
- backward distance
- 0 forward distance