Title: Energy Modeling of the Cache Subsystem
1Energy Modeling of the Cache Subsystem
- Ankit Raizada (2006MCS2265)
- Under Guidance of
- Prof. Preeti Ranjan Panda
2Introduction
- Caches are a significant part of processor-based systems
- In terms of performance
- In terms of area
- In terms of energy dissipation
- Tools are available for design space exploration (CACTI, ZOOM)
- A number of inaccuracies have been reported in the energy estimation in this process [1, 2].
3Problem Statement
- To build a model that takes as input the cache parameters, a trace of the different types of access made during program execution, and the implementation technology, and produces an energy estimate for the cache for that sequence of transactions.
- Inputs
- Phase I: Size, Line size, Associativity, Replacement Policy
- Phase II: Sub-line Size, Write Mode, Bus width, Prefetch size
- Phase III: Transaction modeling
- Read, Write, Miss, Prefetch, Flush, Invalidate
4CACTI Brief Description
- Cache design exploration tool
- Finds the optimal SRAM organization
- Estimates Area, Timing and Power of the cache
5Problems
- No consideration for the policy part of the controller
- Replacement policy
- Write-back
- No consideration of the functional aspect of the cache
- Energy dissipation in a hit access and in a miss access isn't the same
- No option for low-energy architectural design techniques in the exploration
- Way prediction [4]
- Way halting [4]
- Objective function for the optimization of the cache design is inflexible
- Score = W1 * Area + W2 * Latency + W3 * Energy
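The weighted-sum objective on this slide can be sketched as below. This is an illustration only, not CACTI's code: the names `score` and `best`, the two candidate organizations, their normalized metric values, and the assumption that a lower score is better are all hypothetical.

```python
def score(area, latency, energy, w1=1.0, w2=1.0, w3=1.0):
    """Weighted-sum objective; assumed here that a lower score is better."""
    return w1 * area + w2 * latency + w3 * energy


# Hypothetical, normalized metrics for two candidate SRAM organizations.
candidates = {
    "A": {"area": 1.0, "latency": 0.8, "energy": 1.2},
    "B": {"area": 1.1, "latency": 1.0, "energy": 0.7},
}


def best(cands, **weights):
    """Return the name of the candidate with the lowest weighted score."""
    return min(cands, key=lambda name: score(**cands[name], **weights))


print(best(candidates))          # equal weights
print(best(candidates, w2=5.0))  # latency weighted more heavily
```

Only the weights can be tuned in this form; the shape of the function itself is fixed, which is the inflexibility the slide points at.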
6Throat Clearing
- These problems do not mean that the CACTI tool is useless
- CACTI is a design exploration tool, so fidelity of results is more important than accuracy of estimates
- CACTI finds the optimal SRAM organization
- This work (currently) is focused on improving the accuracy of the estimates
7Replacement Policy
- Replacement policy logic is a significant portion of the cache controller logic
- LRU is a widely used policy
- LRU updates its state history each time an access is made
- Currently completely ignored by CACTI (as of version 4.2)
- An experimental model of LRU is needed to estimate its power
8LRU Model
- LRU using a simple systolic array, an implementation-level model
- Proposed by J.P. Grossman in "A Systolic Array for Implementing LRU Replacement"
- Essentially, it implements a move-to-back list
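The move-to-back behavior can be sketched in software as follows. This is a behavioral illustration only, not the systolic hardware from Grossman's paper; the class name `MoveToBackLRU` and its methods are hypothetical, and the six-entry order used in the example is chosen for illustration.

```python
class MoveToBackLRU:
    """Behavioral model of a move-to-back list: front = LRU, back = MRU."""

    def __init__(self, ways):
        # Initial order is arbitrary; way 0 starts as the LRU entry.
        self.order = list(range(ways))

    def access(self, way):
        """Move the accessed way to the back (MRU position)."""
        self.order.remove(way)
        self.order.append(way)

    def victim(self):
        """The way at the front is the least recently used one."""
        return self.order[0]


lru = MoveToBackLRU(6)
lru.order = [4, 1, 3, 0, 2, 5]  # LRU ... MRU
lru.access(3)
print(lru.order)     # [4, 1, 0, 2, 5, 3]
print(lru.victim())  # 4
```

Every access reorders the list, which is why the LRU state update costs energy on each cache access.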
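9LRU Systolic Array
[Figure: move-to-back update on an access to entry 3. Order from LRU to MRU before the access: 4, 1, 3, 0, 2, 5. After the access: 4, 1, 0, 2, 5, 3.]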
10Systolic Array
[Figure, slides 10 to 14: step-by-step animation of the systolic array. Labels: Compare, Latch.]
15Array Node
[Figure: block diagram of one array node. Labels: L, ?, Current Index, MUX, M, OR.]
16Implementation and Power Evaluation
- This LRU scheme was implemented in VHDL RTL
- Advantages
- For power estimation
- Synthesized using Synopsys Design Compiler
- Europractice cell library for the 180 nm process was used
- Static Time simulation was done to obtain traces
17Contd
- Finally, PrimePower was used to evaluate the power (as a result of simulation)
- Per-operation energy can then be easily found
- E = (Avg Power) * (Simulation Time) / (Number of state updates of LRU)
- This is what CACTI is discounting
18Result
- For a 32 KB, 4-way cache with 64-byte lines in 180 nm technology (estimates by CACTI)
- Least possible cycle time: 1 ns
- Energy per read: 0.51 nJ
- Energy per write: 0.05 nJ
19Result
- For a 2-bit LRU (4-way associative cache)
- Clock cycle: 1.0 ns (constraint)
- Avg power: 3.89 mW
- Energy for 28 updates of LRU: 0.56016 nJ
- Energy per update: 0.02 nJ
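The per-update figure follows from the relation E = (Avg Power) * (Simulation Time) / (Number of state updates). A minimal sketch of that arithmetic is given below; the function name `energy_per_update` is hypothetical, and the simulation time is not stated on the slides, so it is back-computed here from the reported total energy and average power.

```python
def energy_per_update(avg_power_w, sim_time_s, num_updates):
    """Energy per LRU state update in joules: E = P_avg * T_sim / N."""
    return avg_power_w * sim_time_s / num_updates


AVG_POWER_W = 3.89e-3        # reported average power, 3.89 mW
TOTAL_ENERGY_J = 0.56016e-9  # reported energy for 28 updates, 0.56016 nJ
NUM_UPDATES = 28

# Assumption: simulation time implied by the two reported numbers.
sim_time_s = TOTAL_ENERGY_J / AVG_POWER_W

e_update_j = energy_per_update(AVG_POWER_W, sim_time_s, NUM_UPDATES)
print(f"implied simulation time: {sim_time_s * 1e9:.1f} ns")
print(f"energy per update: {e_update_j * 1e9:.3f} nJ")
```

The implied simulation time works out to about 144 ns, and the per-update energy rounds to the 0.02 nJ quoted above.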
20Analysis
- Assuming that reads and writes are equally probable
- Energy per cache access = (0.51 + 0.05)/2 = 0.28 nJ
- The LRU update accounts for approximately 6.6 percent of the energy of a cache access
- With way halting or way prediction, the energy spent in the SRAM part of the cache will go down (roughly by a factor on the order of the number of ways)
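The arithmetic behind this analysis can be checked with the short sketch below. Two things are assumptions rather than statements from the slides: the percentage is taken relative to the total of SRAM energy plus LRU energy (which reproduces the figure of roughly 6.6 percent), and the low-power case divides the SRAM energy by exactly the number of ways as an idealized reading of "roughly". The function name `lru_share` is hypothetical.

```python
E_READ_NJ = 0.51   # CACTI estimate, energy per read
E_WRITE_NJ = 0.05  # CACTI estimate, energy per write
E_LRU_NJ = 0.02    # measured energy per LRU update
WAYS = 4


def lru_share(e_sram_nj, e_lru_nj=E_LRU_NJ):
    """Fraction of the total access energy spent on the LRU update."""
    return e_lru_nj / (e_sram_nj + e_lru_nj)


# Reads and writes assumed equally probable.
e_access_nj = (E_READ_NJ + E_WRITE_NJ) / 2

share = lru_share(e_access_nj)
# Idealized way halting / way prediction: SRAM energy divided by WAYS.
share_low_power = lru_share(e_access_nj / WAYS)

print(f"energy per access: {e_access_nj:.2f} nJ")
print(f"LRU share: {share:.1%}")
print(f"LRU share with idealized low-power SRAM: {share_low_power:.1%}")
```

Under these assumptions the LRU share rises from under 7 percent to over 20 percent once the SRAM energy is reduced, which is why it matters more in a low-power cache.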
21Conclusion
- Discounting the LRU energy when estimating power is NOT advisable, especially for a low-power cache
22Timeframe Minor Project
- Literature survey (Mid Feb 07)
- Problem Analysis and Methodology (1st March 07)
- Enhancement of CACTI (15th Apr 07)
- Validation against actual design (May 07)
23Current Status
- Initial study and state-of-the-art determination complete!
- Some of the problems have been identified
- First-cut methodology is ready
- Following the methodology, the LRU replacement policy has been studied and its power estimated
24Deliverable
- Model Implementation
- HDL for all the models of new architecture aspects
- Report on Methodology
- Final Report