Title: CS 7810 Lecture 13
1. CS 7810 Lecture 13
Pipeline Gating: Speculation Control for Energy Reduction
S. Manne, A. Klauser, D. Grunwald
Proceedings of ISCA-25, June 1998
2. Cost of Speculation
[Figure: chart not recovered; surviving data labels: 9.9, 12.2, 23.9, 10.4, 6.9, 4.6, 11.3, 1.7]
Mispredict rates?
3. Pipeline Gating
- Low-confidence branches throttle instruction fetch until they are resolved
- Pipeline gating usually lasts for fewer than five cycles
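The gating mechanism can be sketched as a counter of unresolved low-confidence branches that stalls fetch once a threshold is reached. This is a toy model, not the paper's hardware: the class name, method names, and the `gate_threshold` parameter are assumptions made for illustration.

```python
class PipelineGate:
    """Toy model: stall fetch while too many low-confidence branches are unresolved."""

    def __init__(self, gate_threshold=1):
        # gate_threshold (assumed parameter): number of in-flight
        # low-confidence branches needed before fetch is gated.
        self.gate_threshold = gate_threshold
        self.low_conf_in_flight = 0

    def on_branch_fetched(self, low_confidence):
        # Called when a branch enters the pipeline.
        if low_confidence:
            self.low_conf_in_flight += 1

    def on_branch_resolved(self, low_confidence):
        # Called when the branch outcome becomes known.
        if low_confidence and self.low_conf_in_flight > 0:
            self.low_conf_in_flight -= 1

    def fetch_allowed(self):
        # Fetch proceeds only while the count is below the threshold.
        return self.low_conf_in_flight < self.gate_threshold
```

With `gate_threshold=1`, a single unresolved low-confidence branch is enough to stall fetch; a larger threshold gates less often.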
4. Metrics
- SPEC (specificity): fraction of all mispredicted branches detected as low-confidence by the confidence estimator (coverage)
- PVN (predictive value of a negative test): probability of a low-confidence branch being mispredicted (accuracy)
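Both metrics follow directly from per-branch counts. The sketch below assumes each branch is recorded as a `(mispredicted, low_confidence)` pair of booleans; the function name and the sample data are made up for illustration.

```python
def spec_and_pvn(outcomes):
    """Return (SPEC, PVN) for an iterable of (mispredicted, low_confidence) pairs."""
    mispredicted = low_conf = both = 0
    for was_mispredicted, was_low_conf in outcomes:
        mispredicted += bool(was_mispredicted)
        low_conf += bool(was_low_conf)
        both += bool(was_mispredicted and was_low_conf)
    # SPEC: of all mispredicted branches, the fraction flagged low-confidence.
    spec = both / mispredicted if mispredicted else 0.0
    # PVN: of all low-confidence branches, the fraction actually mispredicted.
    pvn = both / low_conf if low_conf else 0.0
    return spec, pvn


# Made-up sample: 3 mispredictions (2 flagged), 5 low-confidence branches.
sample = ([(True, True)] * 2 + [(True, False)] * 1 +
          [(False, True)] * 3 + [(False, False)] * 4)
spec, pvn = spec_and_pvn(sample)
```

On this sample, SPEC is 2/3 and PVN is 2/5, showing how an estimator can cover most mispredictions while still flagging many correctly predicted branches.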
5. Confidence Estimators
- Perfect: to gauge potential benefits
- Static: branches that have low prediction rates
- JRS: if a branch has yielded N successive correct predictions, it has high confidence
- Saturating counters: an unbiased counter value or a disagreement between two predictors → low confidence
- Distance: mispredictions are clustered, hence the first 4 branches after a mispredict have low confidence
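Two of these estimators are simple enough to sketch in a few lines. The code below is a simplified software model under stated assumptions: the JRS-style estimator keeps one streak counter per branch PC in a dictionary (real hardware would use a finite indexed table), and the distance estimator learns of a misprediction immediately. Class and parameter names are illustrative.

```python
class JRSEstimator:
    """High confidence after n successive correct predictions for a branch."""

    def __init__(self, n):
        self.n = n
        self.streak = {}  # branch PC -> consecutive correct predictions

    def is_high_confidence(self, pc):
        return self.streak.get(pc, 0) >= self.n

    def update(self, pc, prediction_correct):
        # A misprediction resets the streak; a correct prediction extends it.
        if prediction_correct:
            self.streak[pc] = self.streak.get(pc, 0) + 1
        else:
            self.streak[pc] = 0


class DistanceEstimator:
    """Low confidence for the first `window` branches after a misprediction."""

    def __init__(self, window=4):
        self.window = window
        # Assumption: start in the high-confidence state.
        self.branches_since_mispredict = window

    def is_high_confidence(self):
        return self.branches_since_mispredict >= self.window

    def update(self, prediction_correct):
        if prediction_correct:
            self.branches_since_mispredict += 1
        else:
            self.branches_since_mispredict = 0
```

The distance estimator needs only one global counter, which is why it is cheap compared with per-branch schemes.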
6. SPEC and PVN
SPEC (coverage): % of mispredicted branches detected by the low-confidence estimator
PVN (accuracy): % of low-confidence branches that are branch mispredictions
- It is easier to achieve a high SPEC value than a high PVN value
- A high PVN value can be achieved by using N low-confidence branches to invoke gating: if PVN is 30%, re-defining low-confidence as two low-confidence branches increases PVN to 51%
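The 30 to 51 figure is consistent with one simple reading: if each low-confidence branch is mispredicted independently with probability PVN, the chance that at least one of N such branches is mispredicted is 1 - (1 - PVN)^N. The independence assumption and the function name are not from the slides.

```python
def pvn_with_n_branches(pvn, n):
    """Probability that at least one of n low-confidence branches is mispredicted,
    assuming independent mispredictions (an illustrative assumption)."""
    return 1.0 - (1.0 - pvn) ** n


two_branch_pvn = pvn_with_n_branches(0.30, 2)  # 1 - 0.7**2
```

For PVN = 0.30 and N = 2 this gives 1 - 0.49 = 0.51, matching the numbers on the slide.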
7. Perfect
8. Gating Results
9. Results
- Can gating improve performance? Only if cache pollution is significant
- Less than 1% performance loss and up to 38% reduction in extra work
- Energy consumption could go up: some work is independent of the number of executed instructions (clock distribution), so increased execution time can increase energy
- Pipeline gating should reduce power consumption
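The power versus energy distinction can be illustrated with a toy model in which energy has a per-cycle component (for example clock distribution) and a per-instruction component. The model and every number in it are hypothetical; they are chosen only to show that fewer executed instructions does not guarantee lower energy.

```python
def total_energy(cycles, instructions, fixed_energy_per_cycle, energy_per_instruction):
    """Toy model: energy = per-cycle fixed cost + per-instruction cost."""
    return cycles * fixed_energy_per_cycle + instructions * energy_per_instruction


# Hypothetical run: gating removes 200 wrong-path instructions but adds 1% cycles.
baseline = dict(cycles=1000, instructions=1500)
gated = dict(cycles=1010, instructions=1300)

# When per-instruction energy dominates, gating saves energy.
saves = total_energy(**gated, fixed_energy_per_cycle=1, energy_per_instruction=1) \
    < total_energy(**baseline, fixed_energy_per_cycle=1, energy_per_instruction=1)

# When fixed per-cycle energy dominates, the extra cycles outweigh the savings.
costs = total_energy(**gated, fixed_energy_per_cycle=30, energy_per_instruction=1) \
    > total_energy(**baseline, fixed_energy_per_cycle=30, energy_per_instruction=1)
```

In both cases average power (energy divided by cycles) drops for the gated run, which is consistent with the slide's last bullet.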
10. Results
11. CS 7810 Lecture 13
Cache Decay: Exploiting Generational Behavior to Reduce Cache Leakage Power
S. Kaxiras, Z. Hu, M. Martonosi
Proceedings of ISCA-28, July 2001
12. Leakage Power Trends
- Circuit delay ∝ 1/(V − Vth)
- Leakage ∝ number of transistors (increasing), supply voltage (decreasing), and, exponentially, low threshold voltage (increasing)
- L1 and L2 caches are the biggest contributors (high transistor budgets)
13. Vdd-Gating
- Leakage can be reduced by gating off the supply voltage to the circuit
- When applied to a cache, the contents of the SRAM cell are lost
- Cache decay: apply Vdd-gating when you do not care about cache contents
14. Lifetime of a Cache Line
15. Overheads
- Hardware to determine when to decay
- Introduces additional cache misses
- Normalized cache leakage power = active ratio (fraction of cache that is powered on) + (counter overhead / leak) × activity + (L2 access energy / leak) × num-misses
- Increased execution time (< 0.7%)
- L2 access/leakage ratio is 9
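The normalized leakage expression on this slide can be evaluated as a sum of three terms, assuming the operators lost in extraction are "=", "+", and "/". In the sketch below only the L2 access/leakage ratio of 9 comes from the slide; the function name and all other input values are hypothetical.

```python
def normalized_leakage(active_ratio, overhead_to_leak, overhead_activity,
                       l2_access_to_leak, extra_misses):
    """Leakage relative to an always-on cache (1.0 means no savings).

    active_ratio      : fraction of the cache that is powered on
    overhead_to_leak  : counter overhead energy relative to leakage
    overhead_activity : how often the counter hardware is active
    l2_access_to_leak : L2 access energy relative to leakage
    extra_misses      : decay-induced misses, in matching normalized units
    """
    return (active_ratio
            + overhead_to_leak * overhead_activity
            + l2_access_to_leak * extra_misses)


# Hypothetical operating point; only the ratio 9 is taken from the slide.
example = normalized_leakage(active_ratio=0.30,
                             overhead_to_leak=0.01, overhead_activity=1.0,
                             l2_access_to_leak=9, extra_misses=0.005)
```

With these inputs the result is 0.30 + 0.01 + 0.045 = 0.355, so decay would pay off as long as the extra-miss term stays small relative to the powered-off fraction.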
16. Skier's Dilemma
New skis: $400
Ski rentals: $20
Heuristic: buy skis after rental cost = purchase price

Ski trips    5    10   15   20   25   50
Optimal      100  200  300  400  400  400
Heuristic    100  200  300  800  800  800

Likewise, decay a cache line when the cost of an additional miss equals the leakage dissipated so far
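The table can be reproduced with two small functions, using the slide's prices of 400 for new skis and 20 per rental. The function names are illustrative, and the heuristic is modeled as renting until cumulative rental cost equals the purchase price and then buying.

```python
PURCHASE_PRICE = 400  # new skis (from the slide)
RENTAL_PRICE = 20     # one rental (from the slide)


def optimal_cost(trips):
    """Cost with perfect knowledge of the number of trips."""
    return min(trips * RENTAL_PRICE, PURCHASE_PRICE)


def heuristic_cost(trips):
    """Rent until rental spending equals the purchase price, then buy."""
    breakeven_trips = PURCHASE_PRICE // RENTAL_PRICE  # 20 rentals
    if trips < breakeven_trips:
        return trips * RENTAL_PRICE
    return breakeven_trips * RENTAL_PRICE + PURCHASE_PRICE


table = [(n, optimal_cost(n), heuristic_cost(n)) for n in (5, 10, 15, 20, 25, 50)]
```

The heuristic never pays more than twice the optimal cost (800 versus 400), which is the property that carries over to choosing a decay interval without knowing future accesses.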
17. Tracking Dead Time
- Each line has a 2-bit counter that gets reset on every access and gets incremented every 2500 cycles through a global signal (negligible overhead)
- After 10,000 clock cycles, the counter reaches the max value and triggers a decay
- Adaptive decay: start with a short decay period; if you have a quick miss, double the period; if there is no miss, halve the period
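A minimal software model of the per-line counter and the adaptive rule is sketched below. It assumes the line decays on the fourth global tick after its last access (4 × 2500 = 10,000 cycles) and that the adaptive period is floored at 1 cycle; the class and function names, and that floor, are assumptions rather than details from the slides.

```python
TICK_CYCLES = 2500  # global tick interval (from the slide)


class DecayingLine:
    """One cache line with a 2-bit decay counter (values 0..3)."""

    def __init__(self):
        self.counter = 0
        self.powered = True

    def access(self):
        """Reset the counter; return True if the line had decayed (extra miss)."""
        was_decayed = not self.powered
        self.counter = 0
        self.powered = True
        return was_decayed

    def global_tick(self):
        """Called every TICK_CYCLES cycles for every line."""
        if not self.powered:
            return
        if self.counter == 3:
            # Counter already saturated: fourth tick since last access, so decay.
            self.powered = False
        else:
            self.counter += 1


def next_decay_period(period, quick_miss):
    """Adaptive rule: double after a quick miss, otherwise halve (floor of 1 assumed)."""
    return period * 2 if quick_miss else max(1, period // 2)
```

Because the tick is global, a line accessed just before a tick decays slightly sooner than 10,000 cycles after its last access; the coarse granularity is what keeps the per-line counter at 2 bits.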
18. Results
19. Overheads
20. Other Results
- L2 cache is equally suitable to decay techniques: lifetimes are scaled by a factor of 10, and an extra miss also costs a lot more
- For their experiments, there is little interference from multiprogramming
- Some instructions can easily be identified as last touches to a cache block: potential for early cache decay
- Can this apply to bpred, register file?