Title: Power Aware Realtime Systems
1 Power Aware Real-time Systems
Rami Melhem
A joint project with Daniel Mosse, Bruce Childers, and Mootaz Elnozahy
2 Outline
- Introduction to real-time systems
- Voltage and frequency scaling (no slide)
- Speed adjustment in frame-based systems
- Dynamic speed adjustment in a multiprocessor environment
- Speed adjustment in general periodic systems
- Static speed adjustment for tasks with different power consumptions
- Maximizing computational reward for given energy and deadline
- Tradeoff between energy consumption and reliability
- The Pecan board
3 Real-time systems
[Figure: classification of real-time systems: hard vs. soft RT systems; periodic vs. aperiodic (frame-based); preemptive vs. non-preemptive; uni-processor vs. parallel processors.]
4 Periodic, Rate Monotonic scheduling
- n tasks with maximum computation times $C_i$ and periods $T_i$, for $i = 1, \ldots, n$.
Example: task 1 with C = 2, T = 4; task 2 with C = 3, T = 7.
- Static priority scheduling (high priority to the task with the shorter period).
- If $C_2 = 3.1$, then task 2 will miss its deadline although the utilization is $2/4 + 3.1/7 \approx 0.94$, which is less than 1.
The Liu and Layland RMS utilization bound is $n(2^{1/n} - 1)$.
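The example on this slide can be checked numerically. The sketch below computes the Liu and Layland bound and, as a swapped-in technique that does not appear on the slides, applies standard response-time analysis to decide whether a task meets its deadline under rate-monotonic priorities. The function names are hypothetical, and deadlines are assumed equal to periods.

```python
import math

def rm_utilization_bound(n):
    """Liu and Layland bound n * (2**(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1.0 / n) - 1)

def rm_response_time(tasks, index):
    """Worst-case response time of tasks[index] under rate-monotonic priorities.

    tasks is a list of (C, T) pairs sorted by increasing period, so every
    task before `index` has higher priority. Returns None when the response
    time exceeds the period (deadline assumed equal to the period).
    """
    c, t = tasks[index]
    response = c
    while True:
        # Time demanded by higher-priority tasks released within `response`.
        interference = sum(math.ceil(response / tj) * cj
                           for cj, tj in tasks[:index])
        new_response = c + interference
        if new_response > t:
            return None          # deadline miss
        if new_response == response:
            return response      # fixed point reached
        response = new_response

tasks = [(2, 4), (3.1, 7)]
utilization = sum(c / t for c, t in tasks)
print(round(utilization, 3), round(rm_utilization_bound(2), 3))
print(rm_response_time(tasks, 1))             # task 2 with C = 3.1
print(rm_response_time([(2, 4), (3, 7)], 1))  # task 2 with C = 3
```

With C = 3 the second task finishes exactly at its deadline of 7; with C = 3.1 it misses, even though the utilization stays below 1.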
5 Periodic, EDF scheduling
- n tasks with maximum computation times $C_i$ and periods $T_i$, for $i = 1, \ldots, n$.
Example: task 1 with C = 2, T = 4; task 2 with C = 3, T = 7.
- Dynamic priority scheduling (high priority to the task with the earlier deadline).
- All tasks will meet their deadlines if the utilization is not more than 1.
6 Speed adjustment in frame-based systems
Static speed adjustment
Assumption: all tasks have the same deadline.
[Figure: speed (between Smin and Smax) versus time.]
Select the speed based on the worst-case execution time (WCET) and the deadline.
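A minimal sketch of static speed selection for a frame, assuming WCETs are measured at Smax, execution time scales inversely with speed, and speed can be set to any value between Smin and Smax. The function name and the normalization Smax = 1 are assumptions made for illustration.

```python
def static_speed(wcets, deadline, s_max=1.0, s_min=0.0):
    """Slowest constant speed that finishes all tasks by the common deadline.

    wcets are worst-case execution times measured at speed s_max.
    """
    total = sum(wcets)
    if total > deadline:
        raise ValueError("task set is infeasible even at s_max")
    # Stretch the total worst-case work to fill the frame, but never go
    # below the lowest available speed.
    return max(s_min, s_max * total / deadline)

print(static_speed([2, 3, 1], deadline=12))             # half speed
print(static_speed([2, 3, 1], deadline=12, s_min=0.6))  # clamped to s_min
```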
7 Dynamic speed adjustment techniques for linear code
[Figure: execution timelines under WCET and ACET, with power management points (PMPs) marked.]
Speed adjustment based on remaining WCET
Note: a task very rarely consumes its estimated worst-case execution time.
8-11 Dynamic speed adjustment techniques for linear code (continued)
[Figure, built up over four slides: at each PMP, the remaining WCET is compared with the remaining time.]
Speed adjustment based on remaining WCET
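The idea of recomputing the speed at each PMP from the remaining WCET can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: there is one PMP before every segment, times are measured at Smax, execution time scales inversely with speed, and speed changes cost nothing. The function names are hypothetical.

```python
def speed_at_pmp(remaining_wcet, now, deadline, s_max=1.0):
    """Speed that stretches the remaining worst-case work to the deadline."""
    remaining_time = deadline - now
    if remaining_time <= 0:
        raise ValueError("no time left before the deadline")
    return min(s_max, s_max * remaining_wcet / remaining_time)

def run_linear_code(segments, deadline, s_max=1.0):
    """Simulate linear code with a PMP before every segment.

    segments is a list of (wcet, actual) pairs, both measured at s_max.
    Returns the speed chosen for each segment and the finish time.
    """
    now = 0.0
    speeds = []
    remaining_wcet = sum(wcet for wcet, _ in segments)
    for wcet, actual in segments:
        speed = speed_at_pmp(remaining_wcet, now, deadline, s_max)
        speeds.append(speed)
        now += actual * s_max / speed   # slower speed, longer segment
        remaining_wcet -= wcet
    return speeds, now

speeds, finish = run_linear_code([(2, 1), (2, 2), (2, 1)], deadline=12)
print(speeds, finish)
```

Because each segment takes at most its WCET, the slack left by an early segment lowers the speed of later ones while the finish time stays at or before the deadline.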
12 Dynamic speed adjustment techniques for linear code
[Figure: segments with worst-case (WCE) and average (AV) execution times; at a PMP the speed is chosen from the remaining average execution time and the remaining time, with Smax shown for later segments.]
Speed adjustment based on remaining average execution time
13 An alternate point of view
[Figure: WCET and ACET timelines showing stolen slack and reclaimed slack at a PMP.]
14 Dynamic speed adjustment techniques for non-linear code
[Figure: branching code with branch probabilities p1, p2, p3 and PMPs; the remaining execution time has a minimum, an average, and a maximum value.]
At a PMP:
- Remaining WCET is based on the longest path
- Remaining average-case execution time is based on the branching probabilities (from trace information).
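The two quantities named above can be computed on a small control-flow graph. This is a sketch under the assumption that the graph is acyclic (loops already bounded or unrolled) and that each node carries its own worst-case and average time; the graph encoding and the function name are hypothetical.

```python
def remaining_times(graph, wcet, avg, node):
    """Return (remaining WCET, remaining average time) from `node` to the end.

    graph maps a node to a list of (successor, probability) pairs.
    wcet and avg map a node to its own worst-case and average time.
    The graph is assumed to be acyclic.
    """
    successors = graph.get(node, [])
    if not successors:
        return wcet[node], avg[node]
    results = [(p, remaining_times(graph, wcet, avg, succ))
               for succ, p in successors]
    worst = max(r[0] for _, r in results)          # longest path
    expected = sum(p * r[1] for p, r in results)   # probability-weighted
    return wcet[node] + worst, avg[node] + expected

# A branches to B (probability 0.75) or C (probability 0.25); both reach D.
graph = {"A": [("B", 0.75), ("C", 0.25)], "B": [("D", 1.0)], "C": [("D", 1.0)]}
wcet = {"A": 2, "B": 4, "C": 8, "D": 2}
avg = {"A": 1, "B": 2, "C": 4, "D": 2}
print(remaining_times(graph, wcet, avg, "A"))
```

In this example the remaining WCET follows the A, C, D path, while the average is pulled toward the more likely A, B, D path.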
15 2. Periodic, non-frame-based systems
- Each task has a WCET, $C_i$, and a period $T_i$
- Earliest Deadline First (EDF) scheduling
- Static speed adjustment: if the utilization $U < 1$, then we can reduce the speed by a factor of U (that is, run at $U \cdot S_{max}$), and still guarantee that deadlines are met.
Note: the average utilization, $U_{av}$, can be much less than U.
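A short sketch of the static rule for EDF, assuming WCETs are measured at Smax and speed is continuously adjustable. The function name is hypothetical.

```python
def edf_static_speed(tasks, s_max=1.0):
    """Constant speed U * s_max for periodic tasks given as (C, T) pairs."""
    utilization = sum(c / t for c, t in tasks)
    if utilization > 1:
        raise ValueError("utilization exceeds 1; not schedulable under EDF")
    return utilization * s_max

tasks = [(1, 4), (2, 8)]
speed = edf_static_speed(tasks)
# At the reduced speed every C grows by 1/speed, so utilization becomes 1.
scaled_utilization = sum((c / speed) / t for c, t in tasks)
print(speed, scaled_utilization)
```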
16 Greedy dynamic speed adjustment
- Giving reclaimed slack to the next ready task is not always a correct scheme
[Figure: a correct and an incorrect allocation of reclaimed slack at a PMP.]
- Theorem: a reclaimed slack has to be associated with a deadline and can be given safely to a task with an earlier or equal deadline.
17 Aggressive dynamic speed adjustment
- Theorem: if tasks 1, ..., k are ready and will complete before the next task arrival, then we can swap the time allocation of the k tasks. That is, we can add stolen slack to the reclaimed slack.
- Experimental rule: do not be too aggressive; do not reduce the speed of a task below a certain speed (the optimal speed determined by $U_{av}$).
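18 Speed adjustment in multi-processors
1. The case of independent tasks on two processors
Canonical execution => all tasks consume WCET
[Figure: a global queue holding four tasks with WCETs 6, 4, 2, and 4, and deadline 12. With no speed management, P1 runs the tasks of length 6 and 2, and P2 runs the two tasks of length 4.]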
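Canonical execution from a global queue can be sketched as a small simulation. The dispatch order 6, 4, 4, 2 is an assumption, chosen because it reproduces the P1 and P2 assignment shown on the slide; the function name is hypothetical.

```python
import heapq

def list_schedule(wcets, processors=2):
    """Dispatch tasks in queue order to whichever processor frees up first.

    Returns the finish time and the list of task lengths run per processor.
    """
    free_at = [(0.0, p) for p in range(processors)]
    heapq.heapify(free_at)
    assignment = [[] for _ in range(processors)]
    finish = 0.0
    for wcet in wcets:
        start, p = heapq.heappop(free_at)   # earliest available processor
        end = start + wcet
        assignment[p].append(wcet)
        heapq.heappush(free_at, (end, p))
        finish = max(finish, end)
    return finish, assignment

finish, assignment = list_schedule([6, 4, 4, 2])
deadline = 12
print(finish, assignment, finish / deadline)
```

Under these assumptions the canonical schedule finishes at time 8, so a static speed of 8/12 of Smax stretches it to the deadline of 12.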
19-21 Dynamic speed adjustment
Non-canonical execution => tasks consume ACET
If we select the initial speed based on WCET, can we do dynamic speed adjustment and still meet the deadline?
[Figure, built up over three slides: four tasks labeled 6,5; 6,6; 9,3; and 3,3 scheduled on P1 and P2 with deadline 12. Greedy slack reclamation (GSR) leads to a deadline miss; the alternative shown is slack sharing (SS).]
22-30 Dynamic speed adjustment (2 processors)
2. Dependent tasks
Use list scheduling
[Figure, built up over nine slides: a task graph with task lengths 2, 3, 3, 4, 6, and 6 (one task is labeled 3,1), a Ready_Q, and schedules on P1 and P2 with deadline 12. In canonical execution P1 runs the tasks of length 2, 4, and 6, and P2 runs the tasks of length 3, 3, and 6. In non-canonical execution with slack sharing, the task labeled 3,1 takes 1 time unit and the order in which tasks are dispatched differs from the canonical one.]
- Assume that we adjust the speed statically such that canonical execution meets the deadline.
- Can we reclaim unused slack dynamically and still meet the deadline?
31-35 Dynamic speed adjustment (2 processors)
Solution: use a Wait_Q to enforce canonical order in Ready_Q
[Figure, built up over five slides: the same task graph, the contents of Wait_Q and Ready_Q, and the schedules on P1 and P2 with deadline 12.]
- A task is put in Wait_Q when its last predecessor starts execution
- Tasks are ordered in Wait_Q by their expected start time under WCET
- Only the head of the Wait_Q can move to the Ready_Q
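A minimal sketch of the three Wait_Q rules listed above. It assumes each task's canonical start time (its start time when every task takes its WCET) is known in advance, and it assumes a caller-supplied predicate `is_ready` that reports whether all predecessors of a task have finished. The class and method names are hypothetical and are not taken from the slides.

```python
import heapq

class WaitQueue:
    """Holds tasks in canonical start-time order; only the head may leave."""

    def __init__(self):
        self._heap = []

    def add(self, task, canonical_start):
        # Called when the task's last predecessor starts execution.
        heapq.heappush(self._heap, (canonical_start, task))

    def release(self, is_ready):
        """Move tasks to the ready queue, strictly from the head.

        A ready task behind a head that is not ready stays in the queue,
        which preserves the canonical dispatch order.
        """
        released = []
        while self._heap and is_ready(self._heap[0][1]):
            released.append(heapq.heappop(self._heap)[1])
        return released

wait_q = WaitQueue()
wait_q.add("B", canonical_start=3)
wait_q.add("A", canonical_start=2)
wait_q.add("C", canonical_start=5)

finished_predecessors = {"B", "C"}   # A's predecessor is still running
print(wait_q.release(lambda t: t in finished_predecessors))

finished_predecessors = {"A", "B"}   # now A and B can go, in that order
print(wait_q.release(lambda t: t in finished_predecessors))
```

In the first call nothing is released even though B and C are ready, because A is at the head and is not ready yet.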
36 Theoretical results
For independent tasks, if canonical execution finishes at time T, then non-canonical execution with slack sharing finishes at or before time T.
For dependent tasks, if canonical execution finishes at time T, then non-canonical execution with slack sharing and a wait queue finishes at or before time T.
Implication
- Can optimize energy based on WCET (static speed adjustment)
- At run time, can use reclaimed slack to further reduce energy (dynamic speed adjustment), while still guaranteeing deadlines.
37 Simulation results
- We simulated task graphs from real applications (matrix operations and solution of linear equations)
- We assumed that power consumption is proportional to $S^3$
- Typical results are as follows
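Under the cubic power assumption, the energy for a fixed amount of work follows directly: running C cycles at speed S takes time C/S, so the energy is proportional to $S^3 \cdot C/S = C S^2$. The sketch below illustrates this; the proportionality constant `k` and the function name are assumptions.

```python
def energy(cycles, speed, k=1.0, exponent=3):
    """Energy to run `cycles` at constant `speed` with power k * speed**exponent."""
    time = cycles / speed
    return k * speed ** exponent * time

full = energy(100, 1.0)
half = energy(100, 0.5)
print(full, half, half / full)
```

Halving the speed doubles the execution time but cuts the energy for the same work to one quarter.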
38 4. Static optimization when different tasks consume different power
If the power consumption functions are identical for all tasks, then to minimize the total energy, all tasks have to execute at the same speed.
If, however, the power functions, $P_i(S)$, are different for tasks $i = 1, \ldots, n$, then using the same speed for all tasks does not minimize energy consumption.
Let $C_i$ = number of cycles needed to complete task i.
[Figure: tasks 1, 2, and 3 executed in sequence between the start time and the deadline.]
39-40 Minimizing energy consumption
Example
Three tasks with $C_1 = C_2 = C_3$. If $P_i(S) = a_i S^2$ for task i, then the energy consumed by task i is $E_i = a_i / t_i$ (with the cycle counts normalized to 1), where $t_i$ is the time allotted to task i.
If $a_1 = a_2 = a_3$, then $t_1 = t_2 = t_3$ minimizes total energy.
[Figure, two slides: tasks 1, 2, and 3 occupy intervals $t_1$, $t_2$, $t_3$ between the start time and the deadline, spanning a total length D.]
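For this quadratic example the optimum can be written in closed form. Minimizing $\sum_i a_i / t_i$ subject to $\sum_i t_i = D$ with a Lagrange multiplier gives $a_i / t_i^2 = \lambda$ for every task, so $t_i$ is proportional to $\sqrt{a_i}$. The sketch below applies that result; it assumes cycle counts normalized to 1 and ignores the Smin and Smax limits, and the function names are hypothetical.

```python
import math

def optimal_times(a, deadline):
    """Time allotments t_i proportional to sqrt(a_i), summing to the deadline."""
    roots = [math.sqrt(ai) for ai in a]
    total = sum(roots)
    return [deadline * r / total for r in roots]

def total_energy(a, times):
    """Total energy sum(a_i / t_i) for the quadratic power example."""
    return sum(ai / ti for ai, ti in zip(a, times))

a = [1, 4, 9]
best = optimal_times(a, deadline=12)
equal = [4, 4, 4]
print(best, total_energy(a, best), total_energy(a, equal))
```

With equal coefficients the allotments come out equal, matching the statement above; with unequal coefficients, the task with the larger coefficient gets more time and runs slower.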
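41 Minimizing energy consumption
The problem is to find the speeds $S_i$, $i = 1, \ldots, n$, that minimize the total energy while all tasks complete by the deadline D.
[Figure and equations: the optimization problem over the interval from the start time to the deadline D; the formulas are not recoverable from the source.]
- We solved this optimization problem, consequently developing a solution for arbitrary convex power functions.
- Algorithm complexity: $O(n^2 \log n)$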
42 Maximizing the system's reward
General problem assumptions
- tasks have different power/speed functions
- tasks have different rewards as functions of the number of executed cycles
[Figure: power versus speed (S) curves; tasks with cycles C1, C2, C3 run at speeds S1, S2, S3 over time; reward as a function of executed cycles for C1, C2, C3.]
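43 Maximizing the system's reward
Theorem: If power functions are all quadratic or cubic in S, ...
[Figure: power versus speed, and tasks with cycles C1, C2, C3 run at speeds S1, S2, S3 over time; the remainder of the theorem statement is not recoverable from the source.]
Given the speeds, we know how to maximize the total reward while meeting the deadline.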
44-45 Maximizing the system's reward
If power functions of tasks are not all of the form $a_i S^p$:
1) Ignore energy and maximize reward, R, within the deadline
2) If the available energy is exceeded:
- remove $\Delta t$ from a task such that the decrease in R is minimal
- use $\Delta t$ to decrease the speed of a task, such as to maximize the decrease in energy consumption
3) Repeat step 2 until the energy constraints are satisfied
[Figure: speed versus time, with a time slice $\Delta t$ moved from one task to another.]
46 Tradeoff between energy & dependability
- Basic hypothesis
- Dependable systems must include redundant capacity in either time or space (or both)
- Redundancy can also be exploited to reduce power consumption
Time redundancy (checkpointing and rollbacks)
Space redundancy
47 Exploring time redundancy
The slack can be used to:
1) add checkpoints
2) reserve recovery time
3) reduce processing speed
[Figure: speed (Smax) versus time, up to the deadline.]
For a given number of checkpoints, we can find the speed that minimizes energy consumption, while guaranteeing recovery and timeliness.
48 Optimal number of checkpoints
More checkpoints = more overhead + less recovery slack
For a given slack (C/D) and checkpoint overhead (r/C), we can find the number of checkpoints that minimizes energy consumption, and guarantee recovery and timeliness.
[Figure: a computation of length C with checkpoint overhead r and deadline D; plot of energy versus # of checkpoints.]
49 Non-uniform check-pointing
Observation: may continue executing at Smax after recovery.
Disadvantage: increases energy consumption when a fault occurs (a rare event)
Advantage: recovery in an early section can use slack created by execution of later sections at Smax
Requires non-uniform checkpoints.
50 Non-uniform check-pointing
Can find the # of checkpoints, their distribution, and the CPU speed such that energy is minimized, recovery is guaranteed, and deadlines are met.
51 The Pecan board
- An experimental platform for power management research
- Profiling power usage of applications.
- Development of active software power control.
- An initial prototype for thin/dense servers based on Embedded PowerPC processors.
- A software development platform for PPC405 processors.
52 Block Diagram
- Single-board computer as a PCI adapter; non-transparent bridge.
- Up to 128 MB SDRAM.
- Serial, 10/100 Ethernet, Boot Flash, RTC, FPGA, PCMCIA.
- Components grouped on voltage islands for power monitoring & control.
Software: Linux
- Linux for PPC 405GP available in-house & commercially (MontaVista).
- PCI provides easy expandability.
- Leverage open-source drivers for networking over PCI -- fast access to network-based file systems.