Fast Configurable-Cache Tuning with a Unified Second-Level Cache

1
Fast Configurable-Cache Tuning with a Unified
Second-Level Cache
  • Ann Gordon-Ross and Frank Vahid
  • Department of Computer Science and Engineering
  • University of California, Riverside
  • Also with the Center for Embedded Computer
    Systems, UC Irvine

  • Nikil Dutt
  • Center for Embedded Computer Systems
  • School for Information and Computer Science
  • University of California, Irvine

This work was supported by the U.S. National Science Foundation and by the Semiconductor Research Corporation.
2
Cache Hierarchy Optimizations
[Figure: ARM920T (Segars 01)]
  • The cache hierarchy is a
    good candidate for
    optimizations
  • Applications require
    highly diverse cache
    configurations for optimal
    energy consumption of the
    cache subsystem
  • Over 50% energy savings possible in the cache
    subsystem due to configuration (Gordon-Ross 04)

3
Previous Cache Tuning Methodologies
  • Previous methods limit configurability to
    facilitate easier heuristic development

[Figure: two tuning setups, each with a microprocessor, a tuner, instruction (I) and data (D) caches, and main memory]
  • Single-level cache subsystem with separate caches:
    fewer than 50 configurations
  • Multi-level cache subsystem with separate caches:
    a few hundred configurations
4
Motivation
  • Unified second level caches are commonplace in
    desktop computers and are becoming increasingly
    popular in embedded microprocessors
  • Current cache tuning heuristics do not directly
    apply, because a unified second-level cache
    creates a circular dependency that complicates
    tuning
  • Search space explodes to 18,000 configurations

A change in any cache affects the performance of
all other caches in the hierarchy.
[Figure: circular dependency among the L1 instruction (I), L1 data (D), and L2 unified (U) caches]
5
Motivation
  • We present an effective and efficient cache
    tuning heuristic for a highly configurable cache
    hierarchy including a unified second level of
    cache.

[Figure: microprocessor with a tuner, level one instruction (I) and data (D) caches, a unified (U) second-level cache, and main memory]
6
Level One Configurable Cache
  • The base cache consists of four 2 KB banks that
    may be individually shut down to configure size
  • Line size is configurable
  • Way concatenation allows for configurable
    associativity
  • For evaluation of energy savings, we used a base
    cache of 8 KB with a 32-byte line size and 4-way
    associativity

[Figure: three panels built from 2 KB banks, labeled 8 KBytes, 4 KBytes (way shutdown), and 8 KBytes 2-way (way concatenation)]
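
As a rough illustration of how way shutdown and way concatenation combine, the sketch below enumerates the configurations of a single level one cache. The bank size and bank count come from the slide. The line-size choices (16, 32, 64 bytes) and the restriction to power-of-two sizes are assumptions made for this example, since the slide only states that line size is configurable.

```python
BANK_KB = 2        # from the slide: each bank holds 2 KB
NUM_BANKS = 4      # from the slide: four banks in the base cache
LINE_SIZES = (16, 32, 64)  # ASSUMPTION: the slide does not list the line sizes


def level_one_configs():
    """Return (size_kb, ways, line_bytes) tuples for one level one cache."""
    configs = []
    active_banks = 1
    while active_banks <= NUM_BANKS:       # ASSUMPTION: 1, 2, or 4 banks stay on
        size_kb = active_banks * BANK_KB   # way shutdown sets the size
        ways = 1
        while ways <= active_banks:        # way concatenation sets associativity
            for line_bytes in LINE_SIZES:
                configs.append((size_kb, ways, line_bytes))
            ways *= 2
        active_banks *= 2
    return configs


if __name__ == "__main__":
    configs = level_one_configs()
    print(len(configs), "configurations per level one cache")
    print("base cache present:", (8, 4, 32) in configs)
```

Under these assumptions the base cache from the slide (8 KB, 4-way, 32-byte lines) is one point in the enumerated space, and a smaller cache can never have more ways than it has active banks.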
7
Level Two Configurable Cache
  • For maximum configurability, the level two cache
    uses Motorola MCORE-style way management
  • Ways can be designated as instruction, data,
    unified, or off
  • Line size is configurable
  • For evaluation of energy savings, we used a base
    cache of 64 KB with a 64-byte line size and
    4 fully unified ways

[Figure: example way designations: U-way, D-way, U-way, U-way, I-way]
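
To make way management concrete, the following sketch computes the capacity and associativity that instruction and data accesses would each see for a given assignment of the four ways. It assumes that instruction accesses may use instruction and unified ways, that data accesses may use data and unified ways, and that each way holds 16 KB (64 KB divided by 4 ways). The function name and the one-letter encoding are hypothetical.

```python
WAY_KB = 16  # 64 KB base cache / 4 ways, from the slide


def visible_cache(designations):
    """Map a 4-character string of I, D, U, O (off) to per-stream views.

    Returns {"instruction": (size_kb, ways), "data": (size_kb, ways)}.
    ASSUMPTION: instruction accesses use I and U ways, data accesses use
    D and U ways.
    """
    if len(designations) != 4 or any(c not in "IDUO" for c in designations):
        raise ValueError("expected 4 characters drawn from I, D, U, O")
    instruction_ways = sum(c in "IU" for c in designations)
    data_ways = sum(c in "DU" for c in designations)
    return {
        "instruction": (instruction_ways * WAY_KB, instruction_ways),
        "data": (data_ways * WAY_KB, data_ways),
    }


if __name__ == "__main__":
    print(visible_cache("UUUU"))  # base cache: 4 fully unified ways
    print(visible_cache("IDUO"))  # one way each: instruction, data, unified, off
```

In this model, every change of designation alters the size and the associativity seen by a stream at the same time, because both are derived from the same count of usable ways.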
8
Alternating Cache Exploration with Additive Way
Tuning (ACE-AWT)
[Figure: tuning steps, applied to the instruction (I) and data (D) caches at level one]
  • Tune level one sizes
  • Tune level one line sizes
  • Tune level one associativities
  • Tune level two associativity
  • Tune level two line size
  • Tune level two size

The level two size and associativity steps are difficult because changing size and changing associativity are synonymous in a way-management-style cache.
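
The general idea of tuning one parameter at a time, instead of searching every combination, can be sketched as follows. This is a generic greedy search written for illustration, not the ACE-AWT heuristic itself: the step order, the stopping rule, the parameter names, and the toy energy function are all assumptions.

```python
def tune_parameter(config, key, candidates, energy):
    """Grow one parameter through its candidates while energy keeps falling."""
    best = dict(config)
    best[key] = candidates[0]          # start from the smallest candidate
    best_energy = energy(best)
    for value in candidates[1:]:
        trial = dict(best)
        trial[key] = value
        trial_energy = energy(trial)
        if trial_energy >= best_energy:
            break                      # no improvement: stop exploring this key
        best, best_energy = trial, trial_energy
    return best


def tune_in_steps(start, schedule, energy):
    """Apply tune_parameter to each (key, candidates) pair in schedule order."""
    config = dict(start)
    for key, candidates in schedule:
        config = tune_parameter(config, key, candidates, energy)
    return config


def toy_energy(config):
    """HYPOTHETICAL energy model with a minimum at 4 KB (L1) and 32 KB (L2)."""
    return (config["l1_size"] - 4) ** 2 + (config["l2_size"] - 32) ** 2


if __name__ == "__main__":
    schedule = [("l1_size", [2, 4, 8]), ("l2_size", [16, 32, 64])]
    print(tune_in_steps({"l1_size": 8, "l2_size": 64}, schedule, toy_energy))
```

A search of this shape evaluates roughly the sum of the candidate counts instead of their product, which is why a step-by-step heuristic can cover only a small part of a large configuration space. With a real energy model the parameters interact, so a single pass like this one is not guaranteed to reach the optimum.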
9
ACE-AWT - First Phase
  • The first phase is applied during size exploration

[Figure: flowchart of the first phase, ending in DONE]
10
ACE-AWT - Fine Tuning Phase
  • The fine tuning phase is applied during
    associativity exploration

[Figure: flowchart of the fine tuning phase, which starts with the resulting cache from the first phase and ends in DONE]
11
Results - Energy Savings
  • Heuristic achieved near-optimal results (when
    optimal could be computed)
  • 62% energy savings compared to base cache
  • Yet only searched 0.2% of the search space
  • Also improved performance by 35% compared to base
    cache due to tuned line sizes
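
For a sense of scale, the short calculation below combines the search-space size stated earlier in the deck (18,000 configurations) with the explored share, reading the 0.2 figure as a percentage. The helper name is hypothetical, and the result is an approximation derived from these two rounded numbers, not a count reported in the slides.

```python
def configs_searched(total_configs, percent_explored):
    """Approximate number of configurations evaluated by the heuristic."""
    return round(total_configs * percent_explored / 100)


if __name__ == "__main__":
    total = 18_000   # from the slides
    percent = 0.2    # from the slides, read as a percentage
    print(configs_searched(total, percent), "of", total, "configurations")
```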

12
Conclusions and Future Work
  • We developed an efficient and effective cache
    tuning heuristic to tune a two level cache with a
    unified second level of cache
  • 18,000 possible configurations
  • Compared to a reasonable base cache
    configuration
  • 62% energy savings
  • Explores only 0.2% of the search space
  • 35% improvement in performance
  • Future work includes application of the tuning
    heuristic to different execution phases in the
    application