Low Static-Power Frequent-Value Data Caches - PowerPoint PPT Presentation

Provided by: frank126
Learn more at: http://www.cs.ucr.edu
1
Low Static-Power Frequent-Value Data Caches
  • Chuanjun Zhang, Jun Yang, and Frank Vahid
  • Dept. of Electrical Engineering
  • Dept. of Computer Science and Engineering
  • University of California, Riverside
  • Also with the Center for Embedded Computer
    Systems at UC Irvine
  • This work was in part supported by the National
    Science Foundation and the Semiconductor Research
    Corporation

2
Leakage Power Dominates
  • Growing impact of leakage power
  • Leakage power increases as transistor lengths and
    threshold voltages scale down.
  • Power budget limits the use of fast, leaky
    transistors.
  • Caches consume much static power
  • Caches account for most of the transistors on
    a die.
  • Related work
  • DRG: dynamically resizes the cache by monitoring the
    miss rate.
  • Cache line decay: dynamically turns off cache
    lines.
  • Drowsy cache: low-leakage mode.

3
Frequent Values in Data Cache
(J. Yang and R. Gupta, MICRO 2002)
[Figure: microprocessor exchanging addresses and data with the L1 data cache]
  • Frequently accessed values behavior

4
Frequent Values in Data Cache
(J. Yang and R. Gupta, MICRO 2002)
  • 32 FVs account for around 36% of the total data
    cache accesses for 11 SPEC95 benchmarks.
  • FVs can be dynamically captured.
  • FVs are also widespread within the data cache
  • Not just in accesses, but also in the values stored
    throughout the cache.
  • FVs are stored in encoded form.
  • 4 or 5 bits represent 16 or 32 FVs.
  • Non-FVs are stored in unencoded form.
  • The set of frequent values remains fixed for a
    given program run.
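
The encoding idea in the bullets above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' hardware: the helper names (build_fv_table, encode_word, decode_word, fv_coverage) and the sample trace are hypothetical, and the table is built here by simple counting.

```python
from collections import Counter

INDEX_BITS = 5  # 5 bits can index 2**5 = 32 frequent values


def build_fv_table(values, index_bits=INDEX_BITS):
    """Pick the most common words as the frequent-value (FV) table."""
    return [v for v, _ in Counter(values).most_common(2 ** index_bits)]


def encode_word(word, fv_table):
    """Return (is_fv, payload): a short index for an FV, the raw word otherwise."""
    if word in fv_table:
        return True, fv_table.index(word)
    return False, word


def decode_word(is_fv, payload, fv_table):
    """Invert encode_word: look up an FV index, pass a non-FV through."""
    return fv_table[payload] if is_fv else payload


def fv_coverage(values, fv_table):
    """Fraction of the given words that are frequent values."""
    fv_set = set(fv_table)
    return sum(v in fv_set for v in values) / len(values)


# Hypothetical 10-word trace; a tiny 2-entry table keeps the example readable.
trace = [0x00000000] * 5 + [0xFFFFFFFF] * 3 + [0x00100000, 0x12345678]
table = build_fv_table(trace, index_bits=1)
print(encode_word(0xFFFFFFFF, table))  # (True, 1)
print(fv_coverage(trace, table))       # 0.8
```

With the full 5-bit index the same functions would hold up to 32 values; the coverage function corresponds to the kind of percentage quoted on this slide.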

[Figure: example 32-bit words (mostly 00000000 and FFFFFFFF, with occasional values such as 00100000 and FF000000); panel labels: FVs, FVs accessed, FVs in D]
5
Original Frequent Value Data Cache Architecture
  • Data cache memory is separated into a low-bit array
    and a high-bit array.
  • 5 bits encode 32 FVs.
  • The 27 bits are not accessed for FVs.
  • A register file holds the decoded values.
  • Dynamic power is reduced.
  • Accessing non-FVs takes two cycles.
  • Flag bit: 1 = FV, 0 = NFV
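
A behavioral sketch of this word layout may help. It is a software model under assumptions, not the circuit: the StoredWord type, the store/load helpers, and the choice to keep the low 5 bits of a non-FV in the low-bit array are all hypothetical; only the 5/27-bit split, the flag convention, and the one-cycle versus two-cycle behavior come from the slide.

```python
from dataclasses import dataclass

LOW_BITS = 5                      # low-bit array width (FV index)
HIGH_BITS = 27                    # high-bit array width
LOW_MASK = (1 << LOW_BITS) - 1


@dataclass
class StoredWord:
    flag: int  # 1 = FV, 0 = non-FV (convention of the original design)
    low: int   # FV index, or (assumed) the low 5 bits of a non-FV
    high: int  # remaining 27 bits of a non-FV; unused for an FV


def store(word, fv_table):
    """Split a 32-bit word across the two arrays."""
    if word in fv_table:
        return StoredWord(flag=1, low=fv_table.index(word), high=0)
    return StoredWord(flag=0, low=word & LOW_MASK, high=word >> LOW_BITS)


def load(stored, fv_table):
    """Return (word, cycles): FVs need only the low-bit array."""
    if stored.flag == 1:
        return fv_table[stored.low], 1
    # Non-FV: the high-bit array is read in a second cycle.
    return (stored.high << LOW_BITS) | stored.low, 2


fv_table = [0x00000000, 0xFFFFFFFF]  # hypothetical 2-entry table
print(load(store(0xFFFFFFFF, fv_table), fv_table))  # (4294967295, 1)
print(load(store(0x12345678, fv_table), fv_table))  # (305419896, 2)
```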

6
New FV Cache Design: One-Cycle Access to Non-FVs
  • No extra delay in determining whether to access the
    27-bit portion
  • Leakage energy is proportional to program execution
    time
  • The new driver is as fast as the original, achieved by
    tuning the NAND gates' transistor parameters
  • Flag bit: 0 = FV, 1 = NFV

[Figure: original cache line architecture (32 bits, original word line driver) compared with the new design (5-bit and 27-bit portions, decoder output, new word line driver)]
7
Low leakage SRAM Cell and Flag Bit
 
[Figure: SRAM cell with a pMOS gated-Vdd control, and the flag bit SRAM cell; labels: Vdd, Gnd, Bitline, Gated-Vdd Control, Flag bit output]
8
Experiments
  • SimpleScalar.
  • Eleven SPEC 2000 benchmarks
  • Fast-forward the first 1 billion instructions, then
    execute 500M

Configuration of the simulated processor.
9
Performance Improvement of One-Cycle Access to Non-FVs
Hit rate of FVs in data cache.
  • Two-cycle access to non-FVs hurts performance and
    hence increases leakage energy
  • One-cycle access to non-FVs achieves a 5.5%
    performance improvement (and hence lowers
    leakage energy correspondingly)

Performance (IPC) improvement of one-cycle FV
cache vs. two-cycle FV cache.
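
The link between an IPC gain and leakage energy can be made concrete with a little arithmetic. The sketch below assumes constant leakage power, the same instruction count, and the same clock frequency before and after, so that leakage energy scales with execution time; the function name is hypothetical and this is not the authors' energy model.

```python
def relative_leakage_energy(ipc_improvement):
    """Leakage energy after the speedup, relative to before.

    Assumes constant leakage power, fixed instruction count, and fixed
    clock, so energy scales with execution time = 1 / (1 + improvement).
    """
    return 1.0 / (1.0 + ipc_improvement)


ratio = relative_leakage_energy(0.055)  # 5.5% IPC improvement
print(f"{(1 - ratio) * 100:.1f}% less leakage energy")  # 5.2% less leakage energy
```

Under these assumptions a 5.5% IPC improvement shortens execution time, and therefore leakage energy, by roughly 5.2%.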
10
Distribution of FVs in Data Cache
  • FVs are widely found in data cache memory: on
    average, 49.2% of data cache words are FVs.
  • Leakage power reduction is proportional to the
    percentage of words that are FVs

Percentage of data cache words (on average) that
are FVs.
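
As a rough, first-order illustration of that proportionality, the fraction of data-array bits that could be gated is the FV fraction times the share of each word that an FV leaves unused. The function below is a hypothetical sketch: it ignores flag bits, tag arrays, peripheral circuits, and any residual leakage in gated cells, so it is an upper bound on gated data bits and not a prediction of measured energy savings.

```python
def gated_bit_fraction(fv_fraction, gated_bits=27, word_bits=32):
    """Fraction of data-array bits that can be shut off.

    Assumes every FV word gates its 27-bit portion and nothing else
    in the cache changes.
    """
    return fv_fraction * gated_bits / word_bits


frac = gated_bit_fraction(0.492)  # 49.2% of words are FVs on average
print(f"{frac * 100:.1f}% of data bits gated")  # 41.5% of data bits gated
```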
11
Static Energy Reduction
  • 33% total static energy savings for data caches.

12
How to Determine the FVs
  • Application-specific processors
  • The FVs can first be identified offline through
    profiling and then synthesized into the cache, so
    that power consumption is optimized for the
    hard-coded FVs.
  • Processors that run multiple applications
  • The FVs can be held in a register file to
    which different applications can write
    different sets of FVs.
  • Dynamically determined FVs
  • Embed the process of identifying and updating FVs
    in registers into the hardware, so that the design
    dynamically and transparently adapts to different
    workloads with different inputs.
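
The second and third options can be pictured with a small software model. Everything here is a hypothetical sketch: the class names, the periodic-refresh policy, and the interval parameter are assumptions chosen for illustration, and the slides do not describe the actual identification hardware.

```python
from collections import Counter


class FVRegisterFile:
    """Writable FV table, as for processors running multiple applications."""

    def __init__(self, size=32):
        self.size = size
        self.values = []

    def load(self, values):
        """Write a new FV set, keeping at most `size` entries."""
        self.values = list(values)[: self.size]


class DynamicFVTracker:
    """Counts observed words and periodically refreshes the FV registers."""

    def __init__(self, regs, interval=1000):
        self.regs = regs
        self.interval = interval  # assumed refresh period, in accesses
        self.counts = Counter()
        self.seen = 0

    def observe(self, word):
        self.counts[word] += 1
        self.seen += 1
        if self.seen % self.interval == 0:
            top = self.counts.most_common(self.regs.size)
            self.regs.load(v for v, _ in top)


regs = FVRegisterFile(size=2)
tracker = DynamicFVTracker(regs, interval=4)
for w in [7, 7, 9, 7]:
    tracker.observe(w)
print(regs.values)  # [7, 9]
```

The sketch ignores one practical issue: words already stored in encoded form refer to the old table, so a real design that changes FVs at run time would need some way to keep those encodings valid.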

13
Conclusion
  • Two improvements to the original FV data cache
  • One-cycle access to non-FVs
  • Improves performance (5.5%) and hence reduces static
    leakage energy
  • Shut off the unused 27-bit portion of an FV word
  • The scheme does not increase the data cache miss rate
  • The scheme further reduces data cache static
    energy by over 33% on average