Energy Modeling of the Cache Subsystem

Provided by: cseIit


1
Energy Modeling of the Cache Subsystem
  • Ankit Raizada (2006MCS2265)
  • Under Guidance of
  • Prof. Preeti Ranjan Panda

2
Introduction
  • Caches are a significant part of processor-based
    systems
  • In terms of performance
  • In terms of area
  • In terms of energy dissipation
  • Tools are available for Design space exploration
    (CACTI, ZOOM)
  • A number of inaccuracies have been reported in
    energy estimation in this process 12.

3
Problem Statement
  • To build a model that takes as input the cache
    parameters, a program execution trace covering
    the different types of access, and the
    implementation technology, and produces an
    energy estimate for the cache for that sequence
    of transactions.
  • Inputs:
  • Phase I: Size, line size, associativity,
    replacement policy.
  • Phase II: Sub-line size, write mode, bus width,
    prefetch size.
  • Phase III: Transaction modeling
  • Read, write, miss, prefetch, flush, invalidate.
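
The model described above can be pictured as a sum of per-transaction energies over a trace. The following is a minimal sketch under stated assumptions, not the project's implementation: the names `ENERGY_NJ` and `trace_energy_nj` are hypothetical, and the energy values are placeholders chosen only for illustration. In the actual model they would depend on the Phase I and Phase II parameters and on the technology.

```python
from collections import Counter

# Hypothetical per-transaction energies in nJ (placeholders, not measured).
ENERGY_NJ = {
    "read": 0.5,
    "write": 0.05,
    "miss": 1.0,
    "prefetch": 0.8,
    "flush": 0.6,
    "invalidate": 0.1,
}


def trace_energy_nj(trace, energy_nj=ENERGY_NJ):
    """Sum per-transaction energies over a trace of transaction names."""
    counts = Counter(trace)
    unknown = sorted(set(counts) - set(energy_nj))
    if unknown:
        raise ValueError(f"unknown transaction types: {unknown}")
    return sum(n * energy_nj[kind] for kind, n in counts.items())


# Example: two reads, one write, and one miss.
total = trace_energy_nj(["read", "read", "write", "miss"])
```

Counting transactions first keeps the cost of the estimate independent of the order of the trace; a model in which energy depends on state (for example, hit versus miss after a flush) would need to walk the trace instead.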

4
CACTI Brief Description
  • Cache design exploration tool
  • Finds the optimal SRAM organization
  • Estimates Area, Timing and Power of the cache

5
Problems
  • No consideration of the policy part of the
    controller
  • Replacement policy
  • Write-back
  • No consideration of the functional aspects of the
    cache
  • Energy dissipation in a hit access and in a miss
    access is not the same
  • No option to explore low-energy architectural
    design techniques
  • Way prediction [4]
  • Way halting [4]
  • The objective function for optimizing the cache
    design is inflexible
  • Score = W1 × Area + W2 × Latency + W3 × Energy
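
Reading the score on this slide as a weighted sum (an assumption, since the operators did not survive in the transcript), a fixed objective of that form can be sketched as below. The candidate values and weights are hypothetical, the inputs are assumed to be normalized to comparable scales, lower scores are assumed to be better, and none of this is CACTI's code.

```python
def score(area, latency, energy, w1=1.0, w2=1.0, w3=1.0):
    """Weighted-sum objective (assumed form); lower is treated as better."""
    return w1 * area + w2 * latency + w3 * energy


# Hypothetical, normalized (area, latency, energy) for two organizations.
candidates = {
    "org_a": (1.0, 0.8, 1.2),
    "org_b": (1.1, 1.0, 0.7),
}


def best(candidates, **weights):
    """Return the name of the candidate with the lowest score."""
    return min(candidates, key=lambda name: score(*candidates[name], **weights))


print(best(candidates))           # equal weights
print(best(candidates, w2=10.0))  # latency weighted heavily
```

Changing the weights changes which organization wins, but the form of the function stays fixed, which is the inflexibility the slide points at.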

6
Throat Clearing
  • These problems do not mean that the CACTI tool is
    useless
  • CACTI is a design exploration tool, so fidelity of
    results matters more than accuracy of estimates
  • CACTI finds the optimal SRAM organization
  • This work is (currently) focused on improving the
    accuracy of the estimates

7
Replacement Policy
  • Replacement policy logic is a significant portion
    of the cache controller logic
  • LRU is a widely used policy
  • LRU updates its state history each time an access
    is made
  • Currently completely ignored by CACTI (as of
    version 4.2)
  • An experimental model of LRU is needed to estimate
    its power

8
LRU Model
  • LRU using a simple systolic array, an
    implementation-level model
  • Proposed by J.P. Grossman in "A Systolic Array
    for Implementing LRU Replacement"
  • Essentially, it implements a move-to-back list.
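
The move-to-back behaviour can be illustrated with a short software sketch. This is only a behavioural model with hypothetical function names, not the systolic array hardware: the list is ordered from LRU (first) to MRU (last), and an access moves the touched index to the MRU end.

```python
def lru_access(order, index):
    """Return the new order after `index` is accessed.

    `order` lists indices from LRU (first) to MRU (last).
    """
    if index not in order:
        raise ValueError(f"index {index} is not tracked")
    updated = [i for i in order if i != index]
    updated.append(index)  # the accessed index becomes MRU
    return updated


def lru_victim(order):
    """The replacement victim is the entry at the LRU end."""
    return order[0]


state = [4, 1, 3, 0, 2, 5]
state = lru_access(state, 3)
print(state)              # [4, 1, 0, 2, 5, 3]
print(lru_victim(state))  # 4
```

Every access rewrites the state, hit or miss, which is why the update energy is paid on each cache access.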

9
LRU Systolic Array
  [Figure: move-to-back example for index 3. Order
    from LRU to MRU: 4 1 3 0 2 5 before the access,
    4 1 0 2 5 3 after it.]

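10
Systolic Array
  [Figure: animation frame with Compare and Latch
    stages; values shown: 1 2 3 0 1]

11
Systolic Array
  [Figure: animation frame with Compare and Latch
    stages; values shown: 1 2 3 0 1]

12
Systolic Array
  [Figure: animation frame with Compare and Latch
    stages; values shown: 1 2 3 0 3]

13
Systolic Array
  [Figure: animation frame with Compare and Latch
    stages; values shown: 1 2 0 0 3]

14
Systolic Array
  [Figure: animation frame with Compare and Latch
    stages; values shown: 2 0 1 3]
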
15
Array Node
  [Figure: array node schematic; labels shown: L, ?,
    Current Index, MUX, M, OR]

16
Implementation and Power Evaluation
  • This LRU scheme was implemented in VHDL RTL
  • Advantages
  • For power estimation
  • Synthesized using Synopsys Design Compiler
  • Europractice cell library for the 180 nm process
    was used
  • Static Time simulation was done to obtain traces

17
Contd
  • Finally, PrimePower was used to evaluate the
    power (based on the simulation)
  • Per-operation energy can then be found easily
  • E = (Avg Power × Simulation Time) / (Number of
    state updates of LRU)
  • This is what CACTI is discounting

18
Result
  • For a 32K, 4-way cache with 64-byte lines on
    180 nm technology (estimates by CACTI):
  • Least possible cycle time: 1 ns
  • Energy per read: 0.51 nJ
  • Energy per write: 0.05 nJ

19
Result
  • For a 2-bit LRU (4-way associative cache)
  • Clock cycle: 1.0 ns (constraint)
  • Avg power: 3.89 mW
  • Energy for 28 updates of LRU: 0.56016 nJ
  • Energy per update: 0.02 nJ
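
The per-update figure follows from dividing the total simulated energy (average power times simulation time) by the number of LRU state updates. A small sketch of that arithmetic is given below. The function name is hypothetical, and the 144 ns simulation time is not stated on the slides; it is inferred here from 0.56016 nJ / 3.89 mW.

```python
def energy_per_update_nj(avg_power_mw, sim_time_ns, num_updates):
    """Average energy per LRU state update, in nJ."""
    total_pj = avg_power_mw * sim_time_ns  # 1 mW * 1 ns = 1 pJ
    return total_pj / 1000.0 / num_updates


# 3.89 mW and 28 updates are from the slide; 144 ns is an inferred assumption.
per_update = energy_per_update_nj(3.89, 144.0, 28)
print(f"{per_update:.4f} nJ")  # 0.0200 nJ
```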

20
Analysis
  • Assuming that reads and writes are equally
    probable
  • Energy per cache access = (0.51 + 0.05)/2 = 0.28
    nJ
  • The LRU update is approximately 6.6 percent of the
    energy of a cache access
  • With way halting or way prediction, the energy
    spent in the SRAM part of the cache will go down
    (roughly by a factor on the order of the number
    of ways).
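
The arithmetic behind this analysis can be checked with a short sketch. The input figures are the ones quoted on the slides; the variable names are hypothetical. Two readings of the LRU share are computed, since the slides do not say which denominator was used: the 6.6 percent figure is consistent with taking the share over the total (access plus LRU update). The last step is an idealized assumption that way halting or way prediction divides the SRAM energy by exactly the number of ways.

```python
READ_NJ, WRITE_NJ = 0.51, 0.05   # CACTI estimates quoted on the slides
LRU_NJ = 0.02                    # per-update LRU energy quoted on the slides
WAYS = 4

access_nj = (READ_NJ + WRITE_NJ) / 2            # equal read/write probability
share_of_total = LRU_NJ / (access_nj + LRU_NJ)  # LRU share of access + update
share_of_access = LRU_NJ / access_nj            # LRU relative to access alone

# Idealized assumption: SRAM energy drops by a factor of WAYS.
reduced_nj = access_nj / WAYS
share_low_power = LRU_NJ / (reduced_nj + LRU_NJ)

print(f"{access_nj:.2f} nJ")      # 0.28 nJ
print(f"{share_of_total:.1%}")    # 6.7%
print(f"{share_of_access:.1%}")   # 7.1%
print(f"{share_low_power:.1%}")   # 22.2%
```

Under that idealized assumption the LRU logic grows from a few percent to roughly a fifth of the per-access energy, which is the point the analysis is making about low-power caches.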

21
Conclusion
  • Discounting the LRU energy when estimating power
    is NOT advisable, especially in a low-power
    cache

22
Timeframe: Minor Project
  • Literature survey (Mid Feb 07)
  • Problem Analysis and Methodology (1st March 07)
  • Enhancement of CACTI (15th Apr 07)
  • Validation against actual design (May 07)

23
Current Status
  • Initial study and state-of-the-art determination
    complete
  • Some of the problems have been identified
  • First-cut methodology is ready
  • Following the methodology, the LRU replacement
    policy has been studied and its power estimated.

24
Deliverable
  • Model Implementation
  • HDL for all the models of new architecture
    aspects
  • Report on Methodology
  • Final Report