A Highly Configurable Cache Architecture for Embedded Systems

Transcript and Presenter's Notes



1
A Highly Configurable Cache Architecture for
Embedded Systems
  • Presented by
  • Rania Kilany

2
Introduction
  • Energy consumption is a major concern in many
    embedded computing systems.
  • Cache memories consume about 50% of the total energy.
  • Desktop systems run a very wide range of
    applications, and the cache architecture is set to
    work well with the given applications, technology,
    and cost.
  • Embedded systems are designed to run a small
    range of well-defined applications, so the cache
    architecture can achieve both increased performance
    and lower energy consumption.

3
Energy
  • Power dissipation in CMOS circuits:
  • Static power dissipation, due to leakage current.
  • Dynamic power dissipation, due to logic-switching
    current and the charging and discharging of the
    load capacitance.
  • Energy consumption of memory accesses:
  • Fetching instructions and data from off-chip
    memory is costly because of the high off-chip
    capacitance and the large off-chip memory storage.
  • The microprocessor stalls while waiting for the
    instructions and/or data.

4
Energy
  • The total energy due to memory accesses is as
    follows:
  • energy_mem = energy_dynamic + energy_static    (1)
  • energy_dynamic = cache_hits × energy_hit +
    cache_misses × energy_miss
  • energy_miss = energy_offchip_access +
    energy_uP_stall + energy_cache_block_fill
  • energy_static = cycles × energy_static_per_cycle
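The model above can be sketched directly in code. This is a minimal illustration of equation (1); all per-event energy constants below are hypothetical placeholder values chosen only to show the arithmetic, not measurements from the presentation.

```python
# Sketch of the slide's memory-energy model, equation (1).
# Every default energy value below is an assumed placeholder,
# not a number taken from the presentation.

def energy_mem(cache_hits, cache_misses, cycles,
               energy_hit=0.5e-9,              # J per cache hit (assumed)
               energy_offchip_access=10e-9,    # J per off-chip access (assumed)
               energy_uP_stall=5e-9,           # J stalled per miss (assumed)
               energy_cache_block_fill=2e-9,   # J to refill one block (assumed)
               energy_static_per_cycle=0.1e-9  # J leaked per cycle (assumed)
               ):
    # energy_miss = energy_offchip_access + energy_uP_stall + energy_cache_block_fill
    energy_miss = (energy_offchip_access + energy_uP_stall
                   + energy_cache_block_fill)
    # energy_dynamic = cache_hits * energy_hit + cache_misses * energy_miss
    energy_dynamic = cache_hits * energy_hit + cache_misses * energy_miss
    # energy_static = cycles * energy_static_per_cycle
    energy_static = cycles * energy_static_per_cycle
    return energy_dynamic + energy_static

total = energy_mem(cache_hits=950_000, cache_misses=50_000, cycles=2_000_000)
```

Because misses bundle the off-chip access, the stall, and the block fill, even a small miss-rate change moves the total noticeably, which is the lever the rest of the talk exploits.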

5
Base Cache
Cache size: 8 Kbytes; block size: 32 bytes; 32-bit
address; four-way set-associative.
[Figure: address breakdown into a 21-bit tag, 6-bit
index, and 5-bit offset]
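The base cache's address-field widths follow directly from its parameters. A small sketch, assuming power-of-two sizes throughout:

```python
# Address-field widths for the base cache on this slide:
# 8 KB total, 32-byte blocks, four-way set-associative,
# 32-bit addresses. Assumes all sizes are powers of two.

def address_fields(cache_bytes, block_bytes, ways, addr_bits=32):
    sets = cache_bytes // (block_bytes * ways)
    offset_bits = block_bytes.bit_length() - 1  # log2(block size)
    index_bits = sets.bit_length() - 1          # log2(number of sets)
    tag_bits = addr_bits - index_bits - offset_bits
    return tag_bits, index_bits, offset_bits

print(address_fields(8 * 1024, 32, 4))  # (21, 6, 5)
```

With 64 sets of four 32-byte blocks, the address splits into a 5-bit offset, a 6-bit index, and a 21-bit tag.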
6
The Impact of Cache Associativity
A direct-mapped cache: one tag and one data array are
read per access, so less power per access.
A four-way set-associative cache: four tags and four
data arrays are read per access, so more power per
access, but a low miss rate.
7
The Impact of Cache Associativity
A direct-mapped cache: a high miss rate means higher
energy, due to the longer time and the high power of
accessing the next level of memory.
A four-way set-associative cache: lowers the miss
rate, reducing the time and power that misses would
have caused.
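The trade-off on these two slides can be made concrete with a back-of-the-envelope comparison. All numbers below are hypothetical, chosen only to illustrate how the cheaper-per-access direct-mapped cache can still lose overall once its extra misses are charged:

```python
# Illustrative (assumed) numbers for the associativity trade-off:
# a four-way cache spends more energy per access (four tag/data
# arrays read) but misses less; a direct-mapped cache is cheap
# per access but pays for every extra miss.

def total_energy(accesses, miss_rate, energy_per_access, energy_per_miss):
    misses = accesses * miss_rate
    return accesses * energy_per_access + misses * energy_per_miss

ACCESSES = 1_000_000
E_MISS = 20e-9  # off-chip access + stall + block fill (assumed)

dm = total_energy(ACCESSES, miss_rate=0.08,
                  energy_per_access=0.3e-9, energy_per_miss=E_MISS)
fw = total_energy(ACCESSES, miss_rate=0.02,
                  energy_per_access=1.0e-9, energy_per_miss=E_MISS)
# With these assumed rates, the direct-mapped cache's misses
# dominate and the four-way cache is cheaper overall (fw < dm).
# Shrink the miss-rate gap and the winner flips.
```

Which configuration wins depends entirely on the application's miss behavior, which is exactly the point the next slide draws.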
8
Example
9
Result
  • Tuning the associativity to a particular
    application is extremely important for minimizing
    energy.
  • This motivates the need for a cache with
    configurable associativity.

10
Suggested Solution
  • The cache can be reconfigured by software to be
    direct-mapped, two-way, or four-way set
    associative, using a technique called
    way-concatenation.
  • The reconfigurable cache has very little size and
    performance overhead.
  • Way-concatenation reduces energy caused by
    dynamic power.
  • Way-shutdown reduces energy caused by static
    power if combined with way-concatenation.

11
Way-Concatenation for Dynamic Power Reduction
12
Way-Concatenation for Dynamic Power Reduction
  • Develop a cache architecture whose associativity
    can be configured as one, two, or four ways, while
    still utilizing the full capacity of the cache:
  • 6 index bits for a four-way cache.
  • 7 index bits for a two-way cache.
  • 8 index bits for a one-way (direct-mapped) cache.
  • The scheme could be extended to 8 or more ways.
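The index-bit counts above follow from keeping the total capacity fixed: halving the associativity doubles the number of sets and therefore adds one index bit. A minimal sketch, assuming the 8 KB / 32-byte-block cache from the base-cache slide:

```python
# Index bits for a way-concatenating cache whose total capacity
# stays fixed: fewer ways -> more sets -> more index bits.
# Assumes power-of-two cache parameters.

def index_bits(cache_bytes, block_bytes, ways):
    sets = cache_bytes // (block_bytes * ways)
    return sets.bit_length() - 1  # log2(number of sets)

for ways in (4, 2, 1):
    print(ways, index_bits(8 * 1024, 32, ways))
# four-way -> 6 index bits, two-way -> 7, one-way -> 8,
# matching the bullet points above.
```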

13
Results
14
Time and Area Overhead
  • How much does the configurability of such a cache
    increase access time compared to a conventional
    four-way cache?
  • The configuration circuit is not on the
    cache-access critical path; it executes
    concurrently with the index decoding.
  • The transistors in the configure circuit are sized
    so that the configure circuit is faster than the
    decoder.
  • The configure circuit's area is negligible.

15
Time and Area Overhead
  • Change two inverters on the critical path into
    NAND gates: one inverter after the decoder, and
    one after the comparator.
  • Increasing the sizes of the NAND gates to three
    times their original size reduces the critical-path
    delay back to the original time.
  • Replacing the inverters by NAND gates with larger
    transistors resulted in a less than 1% area
    increase for the entire cache.

16
Main Observations
  • First observation: a way-concatenation cache
    results in an average energy savings of 37%
    compared to a conventional four-way cache, with
    savings of over 60% for several examples.
  • Compared to a conventional direct-mapped cache,
    the average savings are more modest, but the
    direct-mapped cache suffers large penalties for
    some examples (up to 284%).
  • Second observation: way-concatenation is better
    than way-shutdown for reducing dynamic power,
    sometimes saving 30-50% more energy.

17
Way-Shutdown for Static Energy Reduction
  • Although way-shutdown increases the miss rate for
    some benchmarks, for other benchmarks it has
    negligible impact.
  • To save static energy, a circuit-level technique
    called gated-Vdd is used.
  • When the gated-Vdd transistor is turned off, the
    stacking effect of the extra transistor reduces
    the leakage energy dissipation.

18
Way-Shutdown for Static Energy Reduction
19
Conclusions
  • To save dynamic power: a configurable cache
    design method called way-concatenation was
    developed. It saves 37% on average compared to a
    conventional four-way set-associative cache.
  • To save static power: the configurable cache was
    extended to include a way-shutdown method, with
    average savings of 40%.

20
Thank You