Title: Dynamic Cache Reconfiguration for Soft Real-Time Embedded Systems
1. Dynamic Cache Reconfiguration for Soft Real-Time Embedded Systems
- Weixun Wang and Prabhat Mishra
- Embedded Systems Lab
- Computer and Information Science and Engineering
- University of Florida
- Ann Gordon-Ross
- Electrical and Computer Engineering
- University of Florida
Presented by Kanika Chawla, Sowmith Boyanpalli, Parth Shah
2. Motivation
- Decrease energy consumption and improve performance using dynamic cache reconfiguration
3. Introduction
- What are real-time systems?
- Types:
- Hard real-time systems
- Firm real-time systems
- Soft real-time systems
4. Previous Work
- Embedded systems run on batteries, so it is important to make these applications energy efficient.
- Techniques used:
- Dynamic power management: put the processor into sleep mode when it is idle.
- Dynamic voltage scaling: reduce the clock frequency (and voltage) so that tasks run slower and use less power; a trade-off between performance and energy consumed.
5. Reconfigurable Cache
- Modern techniques use reconfigurable parameters that are tuned dynamically to improve system performance.
- The cache is one such parameter: modern computing requirements demand large caches, which can consume more than half of the processor's power.
- Hence, using different cache configurations for different parts of the program can decrease energy consumption.
6. Cache in Real-Time Systems
- Cache misses due to preemption
- Prevention:
- Cache locking
- Cache partitioning
7. Reconfigurable Cache for RTS
- Cache reconfiguration is not directly appropriate for real-time systems: conditions change at run time, so it is not known in advance which configuration is good at which point.
- In hard real-time systems, if cache reconfiguration causes a deadline to be missed, the system fails.
- Hence it is used only for soft real-time systems, where the timing constraints are not as strict.
- This paper discusses reconfiguration only for L1 caches.
8. Reconfigurable Cache
- Using cache configurations suited to the application can save 62% energy.
- Cache reconfiguration does not incur a performance overhead, as it is handled by a lightweight co-processor with its own register set.
9. Profiling
- Choosing the best cache configuration at run time:
- Intrusive method: run every cache configuration one at a time and choose the best. This incurs a performance overhead.
- Auxiliary method: run all configurations at the same time on an auxiliary system. No performance overhead, but the auxiliary system is very power hungry.
10. Overview
- Base cache: a cache configuration best suited for the entire task with respect to energy and performance.
- Phase: the period of time between a predefined potential preemption point and task completion.
- What happens on an interrupt?
11. Overview
(Figure: execution timelines of Task 1 and Task 2 in traditional real-time systems versus in our approach.)
12. Phase-based Optimal Cache Selection
- A task is divided by n potential preemption points (partition points).
- Each phase has its own optimal cache configuration: performance-optimal and energy-optimal.
- A static profile table is generated for each task.
(Figure: the task execution time is divided at partition points P1, P2, ..., Pn-1; phase 1, starting at point 0/n, uses configuration C1, phase 2 (1/n) uses C2, phase 3 (2/n) uses C3, ..., phase n ((n-1)/n) uses Cn.)
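A minimal C sketch of such a per-task static profile table follows. The type and field names (cache_config_t, task_profile_t, PARTITION_FACTOR, and so on) are hypothetical; the slides only state that each phase stores an energy-optimal and a performance-optimal configuration.

    /* Hypothetical layout of the per-task static profile table. */
    #include <stdint.h>

    #define PARTITION_FACTOR 4        /* p: number of phases (4-7 suggested later) */

    typedef struct {
        uint16_t size_kb;             /* effective cache size: 1, 2 or 4 KB  */
        uint8_t  assoc;               /* associativity: 1-, 2- or 4-way      */
        uint8_t  line_bytes;          /* line size: 16, 32 or 64 bytes       */
    } cache_config_t;

    typedef struct {
        cache_config_t energy_optimal;   /* lowest-energy config for this phase */
        cache_config_t perf_optimal;     /* fastest config for this phase       */
    } profile_entry_t;

    typedef struct {
        uint32_t        total_insts;                 /* TIN: total instruction count  */
        profile_entry_t phase[PARTITION_FACTOR];     /* entry for phase i, i = 0..p-1 */
    } task_profile_t;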
15. Choosing Partition Points
- Ideally there would be a partition point after every instruction, which would give the maximum energy savings, but this would require a huge lookup table, and searching it for the right entry adds overhead.
- Thus there is a trade-off between energy savings and the number of partition points used.
- We observe that with a large number of partition points, adjacent points tend to have the same cache configuration.
- A high number of partition points is also wasteful because of the 90/10 rule: most of the execution time is spent in a small fraction of the code.
17. Phase-based Optimal Cache Selection
- Potential preemption points may not be the same as actual preemption points; they are only used for cache configuration selection.
- The partition factor determines the potential preemption points and the resulting phases.
- A large partition factor leads to a large lookup table, which is not feasible due to area constraints.
- A large partition factor may not save more energy.
- A partition factor of around 4 to 7 is profitable.
18. Scheduling-Aware Cache Reconfiguration
- Statically scheduled systems:
- Arrival times, execution times, and deadlines are known a priori for each task.
- Statically profile energy-optimal configurations for every execution period of each task without violating any task deadlines.
- Dynamically scheduled systems:
- Task preemption points are unknown.
- New tasks can enter the system at any time.
- Two techniques: a conservative approach and an aggressive approach.
19. Conservative Approach
- Select the energy-optimal cache configuration with equal or higher performance than the base cache.
- Nearest-neighbor: use the nearest partition point to decide which cache configuration to tune to.
- Static profile table: deadline-aware energy-optimal configurations.
- Task list entry: runtime information.
20. Conservative Approach
- Algorithm
- Input: task list entry
- Output: a deadline-aware cache configuration for the resumed task Tc

    for i = 0 to p - 2 do
      if TIN_Tc * i/p <= EIN_Tc < TIN_Tc * (i+1)/p then
        if (EIN_Tc - TIN_Tc * i/p) < (TIN_Tc * (i+1)/p - EIN_Tc) then
          PHASE_Tc = i/p
        else
          PHASE_Tc = (i+1)/p
        end if
      end if
    end for
    if EIN_Tc >= TIN_Tc * (p-1)/p then
      PHASE_Tc = (p-1)/p
    end if
    Cache_Tc = EO(PHASE_Tc)    // energy-optimal configuration of the selected phase
    return Cache_Tc
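A C sketch of this nearest-neighbor selection, reusing the hypothetical task_profile_t / cache_config_t types from the earlier profile-table sketch; EIN_Tc is the number of instructions the resumed task has already executed:

    /* Conservative approach: snap EIN to the nearest partition point and
     * return that phase's deadline-aware energy-optimal configuration.
     * `p` must match the partition factor the profile was built with
     * (p <= PARTITION_FACTOR in the earlier sketch).                    */
    cache_config_t select_conservative_config(const task_profile_t *prof,
                                              uint64_t ein, uint32_t p)
    {
        uint64_t tin   = prof->total_insts;
        uint32_t phase = 0;                            /* PHASE_Tc as index i */

        for (uint32_t i = 0; i + 1 < p; i++) {         /* i = 0 .. p-2        */
            uint64_t lo = tin * i / p;
            uint64_t hi = tin * (i + 1) / p;
            if (lo <= ein && ein < hi) {
                /* choose whichever partition point is closer to EIN */
                phase = (ein - lo < hi - ein) ? i : i + 1;
                break;
            }
        }
        if (ein >= tin * (p - 1) / p)                  /* beyond the last point */
            phase = p - 1;

        return prof->phase[phase].energy_optimal;      /* EO(PHASE_Tc) */
    }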
21. Aggressive Approach
- Uses both the energy-optimal and the performance-optimal cache configuration, and includes their execution times as well.
- Ready task list (RTL): contains all the tasks currently in the system.
- Static profile table: energy-optimal configuration and performance-optimal configuration.
- Task list entry: runtime information.
22. Aggressive Approach
- When task T is the only task in the system: always tune to the energy-optimal cache if possible.
- When task T preempts another task: run a schedulability check, discarding the lowest-priority task only if absolutely necessary; tune to the energy-optimal cache if all other tasks in the RTL can meet their deadlines using their performance-optimal caches (see the sketch after this list).
- When task T is preempted by another task: calculate and store runtime information (RIN, CP).
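A rough C sketch of the preemption-time schedulability check: it verifies that every task already in the RTL can still meet its deadline with its performance-optimal cache once the preempting task (running with its energy-optimal cache) completes. The rtl_entry_t fields and the assumption that the RTL is ordered by execution order are illustrative; the slides only state the high-level policy.

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical ready-task-list entry; times are in CPU cycles. */
    typedef struct {
        uint64_t deadline;             /* absolute deadline                        */
        uint64_t perf_opt_remaining;   /* remaining execution time with the task's
                                          performance-optimal cache                */
    } rtl_entry_t;

    /* Returns true if, after the preempting task runs for `new_task_time`
     * (its execution time with its energy-optimal cache) starting at `now`,
     * every task in rtl[0..n-1] still meets its deadline using its
     * performance-optimal cache.  If true, the preempting task may use its
     * energy-optimal cache; otherwise fall back to a faster configuration
     * or, as a last resort, drop the lowest-priority task.                 */
    static bool schedulable_with_perf_opt(const rtl_entry_t *rtl, int n,
                                          uint64_t now, uint64_t new_task_time)
    {
        uint64_t finish = now + new_task_time;    /* preempting task completes here */
        for (int i = 0; i < n; i++) {
            finish += rtl[i].perf_opt_remaining;  /* next task resumes afterwards   */
            if (finish > rtl[i].deadline)
                return false;                     /* this task would miss its deadline */
        }
        return true;
    }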
23. Experimental Setup
- A configurable cache architecture is used for the L1 caches:
- Four-bank cache with a base size of 4KB
- Line sizes of 16, 32, and 64 bytes
- Associativity of 1-way, 2-way, and 4-way
- The L2 cache is fixed: a 64KB unified cache with 4-way associativity and a 32-byte line size.
- The resulting L1 configuration space is enumerated in the sketch below.
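A small C sketch that enumerates the L1 configuration space implied by these parameters, assuming the way-shutdown / way-concatenation rules of Zhang et al.'s configurable cache (4KB as 1/2/4-way, 2KB as 1/2-way, 1KB direct-mapped); with three line sizes this yields the 18 configurations that the profile table is later said to store:

    /* Enumerate the assumed L1 configuration space:
     * 6 (size, associativity) pairs x 3 line sizes = 18 configurations. */
    #include <stdio.h>

    int main(void)
    {
        /* (size in KB, associativity) pairs reachable by bank shutdown/merging */
        const int size_assoc[][2] = { {4,1}, {4,2}, {4,4}, {2,1}, {2,2}, {1,1} };
        const int line_sizes[]    = { 16, 32, 64 };
        int count = 0;

        for (size_t i = 0; i < sizeof size_assoc / sizeof size_assoc[0]; i++)
            for (size_t j = 0; j < sizeof line_sizes / sizeof line_sizes[0]; j++)
                printf("config %2d: %dKB, %d-way, %dB lines\n", ++count,
                       size_assoc[i][0], size_assoc[i][1], line_sizes[j]);

        return 0;   /* prints 18 configurations */
    }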
24. Experimental Setup
- SimpleScalar was used to obtain simulation statistics.
- External I/O (EIO) trace files, checkpointing, and fast-forwarding were used to generate the static profile tables.
- Energy model: Zhang et al. and CACTI 4.2
- Benchmarks: EEMBC and MediaBench
25. Energy and Performance Rank (I-Cache)
26. Energy and Performance Rank (D-Cache)
27. Stability of Statically Determined Configurations
29. Effect of the Extended Profile Approach (I-Cache)
- 28% average energy savings using the conservative approach
- 51% average energy savings using the aggressive approach
30. Energy Savings (Data Cache)
- 17% average energy savings using the conservative approach
- 22% average energy savings using the aggressive approach
31. Hardware Overhead
- The profile table stores 18 cache configurations.
- Synthesized using Synopsys Design Compiler.
- Assumed a lookup interval of one million nanoseconds, i.e., a table lookup every 500K cycles on a 500 MHz CPU.
- Average energy penalty is 450 nJ.
- Less than 0.02% of the overall savings (2,825,563 nJ).
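A quick sanity check of these figures (treating the 450 nJ penalty and the 2,825,563 nJ savings as the average values reported on this slide):

    \[
      \frac{5 \times 10^{5}\ \text{cycles}}{500\ \text{MHz}} = 10^{6}\ \text{ns} = 1\ \text{ms},
      \qquad
      \frac{450\ \text{nJ}}{2{,}825{,}563\ \text{nJ}} \approx 0.016\% < 0.02\%.
    \]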
32. Pitfalls and Future Scope
- Pitfalls:
- Real-time systems are highly time constrained, so frequent deadline misses would have a major effect on the system.
- The technique requires task profiling ahead of time, which may not be possible in all cases.
- The technique does not work for hard real-time systems.
- Future scope:
- Extending the scheme to L2 caches.
33. Conclusion
- Dynamic cache reconfiguration is a promising approach to improving both energy consumption and overall performance.
- Developed a scheduling-aware dynamic cache reconfiguration technique.
- On average, a 50% reduction in overall cache energy consumption in soft real-time systems.
- Future work:
- Hard real-time systems
- Multi-core and multi-processor systems
35. Thank You