Title: A Highly Configurable Cache Architecture for Embedded Systems
1A Highly Configurable Cache Architecture for
Embedded Systems
- Presented by
- Rania Kilany
2Introduction
- Energy consumption is a major concern in many embedded computing systems.
- Cache memories consume 50% of the total energy.
- Desktop systems run a very wide range of applications, and the cache architecture is set to work well with the given applications, technology and cost.
- Embedded systems are designed to run a small range of well-defined applications, so the cache architecture can be tuned for both increased performance and lower energy consumption.
3Energy
- Power dissipation in CMOS circuits:
- Static power dissipation due to leakage current.
- Dynamic power dissipation due to logic switching current and the charging and discharging of the load capacitance.
- Energy consumption:
- Fetching instructions and data from off-chip memory, because of the high off-chip capacitance and large off-chip memory storage.
- The microprocessor stalls while waiting for the instruction and/or data.
4Energy
- The total energy due to memory accesses is as follows:
- Energy_mem = Energy_dynamic + Energy_static (1)
- Energy_dynamic = cache_hits * energy_hit + cache_misses * energy_miss
- energy_miss = energy_offchip_access + energy_uP_stall + energy_cache_block_fill
- Energy_static = cycles * energy_static_per_cycle
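The energy model on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the presenter's tooling: it assumes the terms combine as sums and products (hits times energy per hit, and so on), the `EnergyParams` container is hypothetical, and the numbers in the usage note are made up purely to exercise the formulas.

```python
from dataclasses import dataclass


@dataclass
class EnergyParams:
    """Per-event energy costs (hypothetical container; units are arbitrary)."""
    energy_hit: float
    energy_offchip_access: float
    energy_uP_stall: float
    energy_cache_block_fill: float
    energy_static_per_cycle: float


def energy_miss(p: EnergyParams) -> float:
    # A miss pays for the off-chip access, the processor stall, and the block fill.
    return p.energy_offchip_access + p.energy_uP_stall + p.energy_cache_block_fill


def energy_dynamic(p: EnergyParams, cache_hits: int, cache_misses: int) -> float:
    # Dynamic energy scales with the number of hits and misses.
    return cache_hits * p.energy_hit + cache_misses * energy_miss(p)


def energy_static(p: EnergyParams, cycles: int) -> float:
    # Static (leakage) energy scales with total execution cycles.
    return cycles * p.energy_static_per_cycle


def energy_mem(p: EnergyParams, cache_hits: int, cache_misses: int, cycles: int) -> float:
    # Equation (1): total memory energy is dynamic plus static.
    return energy_dynamic(p, cache_hits, cache_misses) + energy_static(p, cycles)
```

For example, with made-up costs `EnergyParams(1.0, 10.0, 5.0, 2.0, 0.5)`, 100 hits, 10 misses and 200 cycles, each miss costs 17 units, so a handful of misses can outweigh many hits.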
5Base Cache
Cache size: 8 Kbytes; block size: 32 bytes; 32-bit address; four-way set-associative.
Address fields (from the figure): 21-bit tag, 6-bit index, 5-bit block offset.
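The field widths of the base cache follow from its parameters, and a short Python sketch can check the arithmetic. The function names `field_widths` and `split_address` are hypothetical helpers introduced for illustration, and the sketch assumes power-of-two sizes.

```python
def field_widths(cache_size, block_size, ways, address_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for a set-associative cache.

    Assumes cache_size, block_size and ways are powers of two.
    """
    offset_bits = block_size.bit_length() - 1        # log2(block size)
    num_sets = cache_size // (block_size * ways)     # blocks per way
    index_bits = num_sets.bit_length() - 1           # log2(number of sets)
    tag_bits = address_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits


def split_address(addr, index_bits, offset_bits):
    """Split an address into (tag, index, offset) fields."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# Base cache: 8 Kbytes, 32-byte blocks, four-way, 32-bit addresses.
tag_bits, index_bits, offset_bits = field_widths(8 * 1024, 32, 4)
```

With these parameters there are 64 sets, which gives the 6 index bits, and the 32-byte block gives the 5 offset bits, leaving 21 bits for the tag.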
6The Impact of Cache Associativity
- Direct-mapped cache: one tag and one data array are read.
- Four-way set-associative cache: four tags and four data arrays are read.
- Low miss rate: the direct-mapped cache uses less power per access; the four-way cache uses more power per access.
7The Impact of Cache Associativity
- Direct-mapped cache: one tag and one data array are read.
- Four-way set-associative cache: four tags and four data arrays are read.
- High miss rate: the direct-mapped cache uses higher energy because of the longer time and high power for accessing the next-level memory.
- The four-way cache lowers the cache miss rate, reducing the time and power that would have been caused by misses.
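The miss-rate side of this trade-off can be illustrated with a toy simulator. The sketch below is a simplification under stated assumptions: it uses LRU replacement (the slides do not name a replacement policy), the function `count_misses` is hypothetical, and the address trace is synthetic, chosen so that two blocks collide in the direct-mapped cache.

```python
from collections import OrderedDict


def count_misses(addresses, cache_size=8 * 1024, block_size=32, ways=4):
    """Count misses for a set-associative cache with LRU replacement (assumed)."""
    num_sets = cache_size // (block_size * ways)
    # One ordered dict per set; key order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        block = addr // block_size
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)          # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= ways:
                lines.popitem(last=False)   # evict the least recently used line
            lines[tag] = True
    return misses


# Synthetic trace: two addresses exactly one cache size apart, accessed alternately.
trace = [0, 8 * 1024] * 10
direct_mapped_misses = count_misses(trace, ways=1)
four_way_misses = count_misses(trace, ways=4)
```

On this trace the direct-mapped cache misses on every access because the two blocks keep evicting each other, while the four-way cache misses only on the first touch of each block. A trace without such conflicts would show little difference, which is the case where the direct-mapped cache's lower per-access energy wins.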
8Example
9Result
- Tuning the associativity to a particular application is extremely important to minimize energy.
- This motivates the need for a cache with configurable associativity.
10Suggested Solution
- The cache can be reconfigured by software to be direct-mapped, two-way, or four-way set associative, using a technique called way-concatenation.
- The reconfigurable cache has very little size and performance overhead.
- Way-concatenation reduces energy caused by dynamic power.
- Way-shutdown reduces energy caused by static power if combined with way-concatenation.
11Way-Concatenation for Dynamic Power Reduction
12Way-Concatenation for Dynamic Power Reduction
- Develop a cache architecture whose associativity can be configured as one, two or four ways, while still utilizing the full capacity of the cache.
- 6 index bits for a four-way cache.
- 7 index bits for a two-way cache.
- 8 index bits for a one-way cache.
- Could be extended to 8 or more ways.
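One way to picture way-concatenation is that the extra index bits select which of the four physical banks are activated on an access. The Python sketch below is a behavioural illustration only: the bit positions follow from the base cache's 5 offset bits and 6 index bits, but the specific grouping of banks (which banks pair up in two-way mode) and the names `index_bits` and `active_banks` are assumptions, not the circuit described in the presentation.

```python
OFFSET_BITS = 5        # 32-byte blocks
BASE_INDEX_BITS = 6    # index width in four-way mode
NUM_BANKS = 4          # physical ways in the cache


def index_bits(assoc):
    """Index width when the cache is configured as `assoc`-way (1, 2 or 4)."""
    groups = NUM_BANKS // assoc
    return BASE_INDEX_BITS + (groups.bit_length() - 1)


def active_banks(addr, assoc):
    """Return the banks read for `addr` under the given associativity.

    Assumed grouping: consecutive banks form a group, and the address bits
    just above the base index select the group.
    """
    if assoc not in (1, 2, 4):
        raise ValueError("associativity must be 1, 2 or 4")
    groups = NUM_BANKS // assoc
    group = (addr >> (OFFSET_BITS + BASE_INDEX_BITS)) & (groups - 1)
    return list(range(group * assoc, (group + 1) * assoc))
```

In four-way mode all four banks are read; in two-way mode one extra address bit picks a pair; in one-way mode two extra bits pick a single bank. Every bank remains reachable in every mode, so the full capacity stays in use while fewer arrays are read per access.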
13Results
14Time and Area Overhead
- How much does the configurability of such a cache increase access time compared to a conventional four-way cache?
- The configuration circuit is not on the cache access critical path; the circuit executes concurrently with the index decoding.
- The transistors in the configure circuit are sized so that the configure circuit is faster than the decoder.
- The configure circuit area is negligible.
15Time and Area Overhead
- Two inverters on the critical path are changed into NAND gates: one inverter after the decoder, and one after the comparator.
- The NAND gates are increased to three times their original size to reduce the critical path delay to the original time.
- Replacing the inverters with NAND gates with larger transistors resulted in a less than 1% area increase of the entire cache.
16Main Observations
- The first observation: a way-concatenation cache results in an average energy savings of 37% compared to a conventional four-way cache, with savings over 60% for several examples.
- Compared to a conventional direct-mapped cache, the average savings are more modest, but the direct-mapped cache suffers large penalties for some examples, up to 284%.
- The second observation: way-concatenation is better than way-shutdown for reducing dynamic power. It saves more energy, sometimes saving 30-50%.
17Way-Shutdown for Static Energy Reduction
- Although way-shutdown increases the miss rate for some benchmarks, for other benchmarks way-shutdown has negligible impact.
- To save static energy, way-shutdown involves a circuit-level technique called gated-Vdd.
- When the gated-Vdd transistor is turned off, the stacking effect of the extra transistor reduces the leakage energy dissipation.
18Way-Shutdown for Static Energy Reduction
19Conclusions
- To save dynamic power: a configurable cache design method called way-concatenation was developed. It saves 37% on average compared to a conventional four-way set-associative cache.
- To save static power: the configurable cache was extended to include a way-shutdown method, with average savings of 40%.
20Thank You