1
Power Efficient Data Cache Designs
  • Jaume Abella1, Antonio González1,2
  • {jabella, antonio}@ac.upc.es

1 Computer Architecture Dept., UPC-Barcelona
2 Intel Barcelona Research Center, Intel Labs, UPC-Barcelona
2
Contents
  • Motivation
  • Exploiting criticality for saving power in
    Dcaches
  • Dynamic and static power tradeoff
  • Instruction classification: critical or
    non-critical
  • Results
  • Conclusions
  • Future work

4
Motivation
  • Thermal limitations
  • Battery life
  • Cooling is expensive
  • Not only dynamic power, but static power is
    crucial

5
Data Cache
  • Caches are the largest structures in a chip
  • Significant dynamic power per access
  • High static power
  • Idea: classify loads and spend less power on
    non-critical ones

7
Basic Idea
  • Idea:
  • Classify loads as critical or non-critical.
  • Use two caches:
  • a fast, high-power cache for critical loads
  • a slow, low-power cache for non-critical loads
  • Non-critical loads do not interfere with critical
    ones.
  • Use different Vdd and Vth for each cache.
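The two-cache idea can be sketched in a few lines of Python (a toy model, not the authors' simulator; the latency and energy numbers are illustrative assumptions):

```python
# Toy sketch: route each load to one of two D-cache banks based on a
# criticality flag.  Latencies and energies are illustrative assumptions.

FAST = {"latency": 1, "energy_per_access": 1.0}   # high Vdd, low Vth
SLOW = {"latency": 2, "energy_per_access": 0.5}   # low Vdd, high Vth

def access(load_is_critical):
    """Return (latency, energy) for one load under the split design."""
    cache = FAST if load_is_critical else SLOW
    return cache["latency"], cache["energy_per_access"]

# A stream with 60% critical loads: non-critical loads pay extra latency
# off the critical path but save dynamic energy on every access.
loads = [True] * 6 + [False] * 4
stream_energy = sum(access(c)[1] for c in loads)
print(stream_energy)  # 6*1.0 + 4*0.5 = 8.0
```

With a monolithic fast cache the same stream would cost 10.0 units in this toy model, so the split saves dynamic energy in proportion to the non-critical fraction.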

8
Which Cache?
  • Which cache do we target?
  • L1 high number of accesses per cycle. Thus, high
    dynamic power may be saved.
  • Why not L2?
  • L2 has small number of accesses compared to L1.
  • Making the whole L2 cache more power efficient
    and slower has small impact on performance.
  • It also contains instructions. It may be
    necessary to analyze also the criticality for
    instructions.

9
Data Cache Organization
[Diagram: each LOAD is asked "Is the load critical?"; YES routes it to the
Fast cache, NO to the Slow cache; both caches are backed by the L2 cache.]
11
Voltages and Delays
  • Our design point
  • Slow cache latency is twice the Fast cache
    latency.
  • This is achieved for different combinations of
    Vdd and Vth

12
Vdd and Vth
  • We do not know if dynamic power is more or less
    significant than static power
  • Static energy depends on execution time
  • Dynamic energy depends on the accesses
  • Tradeoff
  • Choose Vdd and Vth to reduce dynamic and static
    power by the same percentage.
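The tradeoff on this slide can be illustrated with a toy calculation (all numbers are assumed, not taken from the paper): if dynamic and static power are both cut by the same fraction r, total energy falls by r no matter how a program splits between accesses and runtime:

```python
# Illustrative arithmetic (assumed numbers): scaling both the dynamic and
# the static component by (1 - r) scales total energy by (1 - r) as well.

def total_energy(dyn_energy_per_access, accesses, static_power, runtime, r):
    """Energy with dynamic and static components both reduced by fraction r."""
    dynamic = dyn_energy_per_access * accesses * (1 - r)
    static = static_power * runtime * (1 - r)
    return dynamic + static

base = total_energy(0.5, 1000, 2.0, 400, r=0.0)   # 500 + 800 = 1300.0
cut  = total_energy(0.5, 1000, 2.0, 400, r=0.3)   # 30% lower: 910.0
print(base, cut)
```

This is why the design point picks Vdd and Vth that cut both components by the same percentage: the savings then hold for access-heavy and runtime-heavy programs alike.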

14
Instruction Classification (I)
  • Memory accesses can be classified...
  • Depending on their position in the issue queue,
    reorder buffer,...
  • Propagating criticality using tokens from time to
    time and updating a predictor.
  • Depending on how often the accesses miss in
    cache.
  • ...

15
Instruction Classification (II)
  • How do we classify memory accesses
  • Classifying all instructions.
  • An instruction is critical if
  • Its data is used at least by another critical
    instruction issued immediately.
  • The number of cycles elapsed since the
    instruction finishes till it commits is smaller
    than a given threshold
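The two-rule test above can be sketched as follows (a toy model; the function name, the threshold value, and combining the two conditions with "or" are assumptions for illustration):

```python
# Toy criticality classifier: an instruction is marked critical if a
# critical consumer issues immediately after it, or if its
# finish-to-commit slack is below a threshold (assumed value).

SLACK_THRESHOLD = 4  # cycles; illustrative, not a figure from the paper

def is_critical(consumed_by_critical_immediately, finish_cycle, commit_cycle):
    slack = commit_cycle - finish_cycle
    return consumed_by_critical_immediately or slack < SLACK_THRESHOLD

print(is_critical(False, finish_cycle=10, commit_cycle=12))  # True: slack 2
print(is_critical(False, finish_cycle=10, commit_cycle=20))  # False: slack 10
print(is_critical(True,  finish_cycle=10, commit_cycle=20))  # True: rule 1
```

A small slack means the instruction committed almost as soon as it finished, i.e. it was holding up retirement, which is why low slack is treated as a sign of criticality.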

16
Instruction Classification (III)
  • Effectiveness

18
Framework
  • Processor: 4-way
  • L1 Dcache: 2-way, 32 bytes/line
  • L2 cache: 512 KB, 4-way, 64 bytes/line
  • Benchmarks:
  • SpecINT2000 and SpecFP2000
  • Simulator:
  • Wattch and CACTI

19
Baseline
  • Monolithic cache
  • Size 16Kb, 32Kb and 64Kb
  • Proposed organizations
  • Fast is ¼ size and Slow is ½ size
  • Fast is ¼ size and Slow is ¾ size (same total
    size than baseline, but slow cache is 3-way)
  • Comparison
  • Using Fast cache and Slow Cache hierarchically
    (Fast cache is L0, Slow cache is L1)

20
Criticality-Based Scheme
  • Alternatives
  • Critical access Fast, non-critical Slow
  • Critical access both, non-critical Slow
  • Always access both
  • Different levels of power and performance. Best
    for performance Always access both
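The three alternatives can be sketched as a mapping from policy to the set of cache banks a load probes (names and structure are assumptions for illustration):

```python
# Toy sketch of the three access policies: which banks a load probes,
# trading lookup energy (more banks probed) for hit latency.

def caches_probed(policy, load_is_critical):
    if policy == "critical-fast":   # critical -> Fast, non-critical -> Slow
        return {"fast"} if load_is_critical else {"slow"}
    if policy == "critical-both":   # critical -> both, non-critical -> Slow
        return {"fast", "slow"} if load_is_critical else {"slow"}
    if policy == "always-both":     # every load probes both banks
        return {"fast", "slow"}
    raise ValueError(policy)

print(caches_probed("critical-fast", True))   # {'fast'}
print(caches_probed("critical-both", True))   # probes both banks
print(caches_probed("always-both", False))    # best performance, most energy
```

Probing both banks avoids ever missing in the "wrong" cache, which is why the always-both policy performs best, at the cost of extra dynamic power per access.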

21
Results: Performance
22
Results: Misses
Same trends for the 32 KB and 64 KB caches
23
Results: Dynamic Power
24
Results: Static Power
26
Conclusions (I)
  • Results
  • Slightly better than monolithic cache for SPECFP,
    a bit worse for SPECINT
  • If we use a hierarchical organization fast cache
    as L0 cache and slow cache as L1
  • Smaller power dissipation than flat organization
  • Very little performance loss vs flat
    organization
  • Much simpler

27
Conclusions (II)
  • It does not perform better because...
  • Criticality is an instruction property, but at
    the end we classify data. Thus, there are
    interferences
  • Non-critical load fetches data A to slow cache
  • Critical load needs A, but it is in the slow
    cache
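The interference scenario above can be sketched as a toy simulation (latencies and structure are illustrative assumptions, not the authors' model):

```python
# Toy sketch of instruction-vs-data interference: a non-critical load
# installs block A in the Slow cache, so a later critical load to A hits
# the Slow cache and pays the slow latency anyway.

FAST_LAT, SLOW_LAT, L2_LAT = 1, 2, 10   # illustrative latencies (cycles)

fast_cache, slow_cache = set(), set()

def load(block, critical):
    """Return the latency seen by this load; fill the chosen bank on a miss."""
    if block in fast_cache:
        return FAST_LAT
    if block in slow_cache:
        return SLOW_LAT          # even a critical load pays the slow latency
    (fast_cache if critical else slow_cache).add(block)
    return L2_LAT

print(load("A", critical=False))  # 10: miss, A installed in the Slow cache
print(load("A", critical=True))   # 2: critical load stuck with Slow latency
```

Classifying instructions decides where a block is installed, but once installed the block serves every later load, critical or not, which is exactly the interference the slide describes.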

29
Future Work
  • Some ideas for further research
  • Most loads are most of the times critical or
    non-critical. Why not classifying them
    statically?
  • Compiler has information to know if a
    non-critical load fetches data that will be used
    by a critical load
  • Compiler can help hardware

30
Q & A