Title: Power Aware Software Architecture
1Power Aware Software Architecture
- Rajesh K. Gupta
- University of California, Irvine
2The Nouveau Riche in Computing
- Generational shift in computing devices
- a lot more of everything, including networking and communications
- a lot less of power, energy, volume, weight, patience
- Application is everything; the possibilities are limitless
- System architectures are due for an overhaul
- the architectures are (radically) changed/challenged
- the programming context is changed
- the system software contract is changed
- new awareness: location, power, timing, reactivity, stability
3Outline
- The case for power awareness in
- application development
- system software
- Managing power in the OS
- knobs and strategies
- Making software power aware
- the hardware knobs (DVS, DPM)
- the application knobs (duty cycling, criticality, aesthetics)
- An ongoing experiment
4The Case for Power Awareness
- Limited availability
- Energy and power use of new devices is markedly different from laptops and notebook computers
- much wider dynamic range of power demand
- increasing share of memory, communication and signal processing
- multiple power use modalities depending upon application
- immortal, paging-mode RX, lifeline TX, mission-mode
5Power Management Places
- Hardware and firmware
- don't know the global state or application-specific knowledge
- Users
- don't know component characteristics, and can't make frequent decisions
- Applications
- operate independently
- and the OS hides machine information from them
- OS plays an important role in allocation and sharing of critical resources
- it is a logical place for dynamic power management
- application-specific constraints and opportunities for saving energy can be known only at the application level
6Operating System Directed Power Management
- Significant opportunities in power management lie with application-specific knobs
- quality of service, timing criticality of various functions
- Needs of applications are the driving force for OS power management functions and a power-based API
- collaboration between applications and the OS in setting energy use policy
- OS helps resolve conflicts and promote cooperation
- OS is the most reasonable place, but
- OS should incorporate application information in power management
- OS should expose power state and events to applications for them to adapt.
7Early results in OS f/v scaling
- Approach 1 [Weiser94]
- time divided into 10-50 ms intervals
- f and V raised or lowered at the beginning of the interval based on CPU utilization during the previous interval
- 50% savings for a processor in the range 3.3V-5V
- 70% savings for a processor in the range 2.2V-5V
- Approach 2 [Govil95]
- predicts CPU cycles needed in the next interval
- sets f and V accordingly
- many prediction strategies: some did well, others not
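The interval-based idea can be sketched in a few lines of C. This is a simplified illustration, not the algorithm from Weiser94 or Govil95: the utilization thresholds, the step size and the function name next_interval_speed are all assumptions chosen for the example.

```c
#include <assert.h>

/* Hypothetical thresholds and step; the slides give only the interval idea. */
#define UTIL_HIGH  0.70  /* raise speed above this utilization */
#define UTIL_LOW   0.50  /* lower speed below this utilization */
#define SPEED_STEP 0.20  /* change in normalized speed per interval */

/*
 * Returns the normalized speed (fraction of f_max) for the next interval,
 * given the current speed and the fraction of the previous interval during
 * which the CPU was busy. The result is clamped to [min_speed, 1.0].
 */
double next_interval_speed(double speed, double utilization, double min_speed)
{
    assert(min_speed > 0.0 && min_speed <= 1.0);
    assert(utilization >= 0.0 && utilization <= 1.0);

    if (utilization > UTIL_HIGH)
        speed += SPEED_STEP;
    else if (utilization < UTIL_LOW)
        speed -= SPEED_STEP;

    if (speed > 1.0)
        speed = 1.0;
    if (speed < min_speed)
        speed = min_speed;
    return speed;
}
```

An OS timer would call this once per interval and then program the frequency and voltage that correspond to the returned speed.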
8Power Savings Mechanisms
- Dynamic Power Management (DPM)
- When a device is idle, it can transition to low-power sleep states.
- Dynamic Voltage Scaling (DVS)
- A device can be run at different speeds at different power levels
- Execution of jobs can be slowed down to save power as long as all jobs are completed by their deadline.
- Plus any application level knobs
- quality and performance measures, application tolerances
9Implementing DVS
- Often done using slowdown factors
- can be static or dynamic
- For example
- Given a frequency range of $[f_{min}, f_{max}]$
- Slowdown factor is the frequency scaled to $[\eta_{min}, 1]$, where $\eta_{min} = f_{min}/f_{max}$.
- When we use a slowdown factor of $\eta$, we set the frequency to $f = \eta \cdot f_{max}$.
- The voltage is changed to the minimum voltage supported at $f$.
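Real processors offer only a few discrete frequency levels, so a slowdown factor has to be mapped to one of them. The sketch below assumes a sorted table of levels and rounds up to the next available frequency so that no job runs slower than the factor allows. The function name and the table layout are illustrative, not part of the PA-API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Picks the lowest available frequency that is at least eta * f_max.
 * levels_mhz must be sorted in ascending order; the last entry is f_max.
 * Returns the index of the chosen level.
 */
size_t frequency_for_slowdown(const unsigned *levels_mhz, size_t n, double eta)
{
    assert(levels_mhz != NULL && n > 0);
    assert(eta > 0.0 && eta <= 1.0);

    double target = eta * (double)levels_mhz[n - 1];
    for (size_t i = 0; i < n; i++) {
        if ((double)levels_mhz[i] >= target)
            return i;
    }
    return n - 1; /* not reached for eta <= 1, kept as a safe default */
}
```

The voltage would then be looked up from a second table indexed by the same level, holding the minimum voltage supported at each frequency.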
10Slowdown Factors
- Much of the work is in the context of real-time systems
- makes sense since we need something to trade off against power saved
- Known results
- Essentially use schedulability tests to determine the amount of slowdown possible
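As one example of a schedulability test used this way, consider periodic tasks under EDF with deadline equal to period. The set is schedulable when total utilization is at most 1, so running at a slowdown factor equal to the utilization keeps it schedulable. The sketch assumes execution time scales linearly with frequency and ignores transition overhead; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct task {
    double wcet;   /* worst-case execution time at f_max */
    double period; /* period, also used as the deadline */
};

/*
 * Static slowdown under EDF with deadline = period: the task set stays
 * schedulable as long as U / eta <= 1, so the smallest usable factor is
 * eta = U. Returns a value in [eta_min, 1], or -1.0 if U > 1 (not
 * schedulable even at full speed).
 */
double static_slowdown_edf(const struct task *tasks, size_t n, double eta_min)
{
    assert(tasks != NULL && n > 0);
    assert(eta_min > 0.0 && eta_min <= 1.0);

    double utilization = 0.0;
    for (size_t i = 0; i < n; i++) {
        assert(tasks[i].period > 0.0 && tasks[i].wcet >= 0.0);
        utilization += tasks[i].wcet / tasks[i].period;
    }
    if (utilization > 1.0)
        return -1.0;
    return utilization < eta_min ? eta_min : utilization;
}
```

A fixed-priority scheduler such as RMS would need a different test (for instance a utilization bound or response-time analysis), but the pattern of deriving the factor from the test is the same.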
11Our Work In Context
- DPM for devices with multiple active and multiple sleep states.
- Design and analyze algorithms for systems that allow both DPM and DVS.
12Dynamic Power Management
- When a device becomes idle, it can transition to a lower power usage state.
- A fixed amount of additional time and energy is required to transition back to the active state when a new request for service arrives.
- What is the best time threshold to transition to the sleep state?
- Too soon: pay the start-up cost too frequently.
- Too late: spend too much time in the high-power state
- Generally, transition to the sleep state when the energy already spent idling in the active state is at least the cost of waking up.
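For a device with one active and one sleep state, the rule above reduces to a break-even time. The sketch below is a minimal illustration; the parameter names and units are assumptions, and the power-down cost is taken to be folded into the wake-up energy.

```c
#include <assert.h>

/*
 * Break-even idle time for a two-state device: the time after which the extra
 * energy spent idling in the active state equals the wake-up energy.
 * p_active_w and p_sleep_w are powers in watts, wakeup_j is energy in joules.
 */
double break_even_time_s(double p_active_w, double p_sleep_w, double wakeup_j)
{
    assert(p_active_w > p_sleep_w && p_sleep_w >= 0.0);
    assert(wakeup_j >= 0.0);
    return wakeup_j / (p_active_w - p_sleep_w);
}

/* Returns 1 if the device should sleep now, given how long it has been idle. */
int should_sleep(double idle_s, double p_active_w, double p_sleep_w,
                 double wakeup_j)
{
    return idle_s >= break_even_time_s(p_active_w, p_sleep_w, wakeup_j);
}
```

For example, a device drawing 2 W when idle-active and 0 W asleep, with a 4 J wake-up cost, would be put to sleep after 2 seconds of idleness.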
13Multi-state Case
- Let there be $k+1$ states
- Let State $k$ be the shut-down state and State 0 be the active state
- Let $\alpha_i$ be the power dissipation rate at state $i$
- Let $\beta_i$ be the total energy dissipated to move from state $i$ back to the active state (State 0)
- States are ordered such that $\alpha_{i+1} \le \alpha_i$
- $\alpha_k = 0$ and $\beta_0 = 0$ (without loss of generality).
- Power down energy cost can be incorporated in the power up cost for analysis (if additive).
14Lower Envelope Idea
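[Figure: energy versus time, one line per state (State 1 to State 4); the lower envelope of the lines switches state at times t1, t2, t3]
For each state i, plot the energy $\alpha_i t + \beta_i$ (power dissipation rate times idle time, plus the energy to return to the active state)
- LEA (the lower envelope algorithm) can be deterministic or probabilistic
- PLEA (the probabilistic version) is e/(e-1) competitive.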
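A deterministic lower-envelope choice can be sketched as follows: after t units of idle time, the device sits in the state whose energy line is lowest at t. The struct layout and function name are assumptions for illustration, and the probabilistic variant is not shown.

```c
#include <assert.h>
#include <stddef.h>

struct power_state {
    double alpha; /* power dissipation rate in this state */
    double beta;  /* energy to return to the active state */
};

/*
 * Deterministic lower-envelope choice: after t time units of idleness, be in
 * the state whose line alpha_i * t + beta_i is lowest. Ties go to the
 * lower-numbered (shallower) state.
 */
size_t lea_state_at(const struct power_state *s, size_t n, double t)
{
    assert(s != NULL && n > 0 && t >= 0.0);

    size_t best = 0;
    double best_energy = s[0].alpha * t + s[0].beta;
    for (size_t i = 1; i < n; i++) {
        double energy = s[i].alpha * t + s[i].beta;
        if (energy < best_energy) {
            best = i;
            best_energy = energy;
        }
    }
    return best;
}
```

The transition times t1, t2, ... in the figure are the idle times at which the value returned by this function changes.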
15Experimental Study IBM Mobile Hard Drive
Trace data with arrival times of disk accesses
from Auspex file server archive.
16IBM Mobile Hard Drive
17Goal
- Provide ways by which the application, operating system and hardware can exchange energy/power and performance related information efficiently.
- Facilitate the continuous dialogue / adaptation between OS and applications.
- Facilitate the implementation of power aware OS services by providing a software interface to low power devices
- A power-aware API to the end user that enables one to implement energy-efficient RTOS services and applications
18Power-aware API Requirements
- Independent of hardware and RTOS implementations
- enables its use on different hardware platforms
- for this, all routines should access the HAL (Hardware Abstraction Layer) rather than the hardware directly
- enables its use in different RTOSs as well as its use with different scheduling strategies
- does not count on specific RTOS info and/or specific schedulers
- Services provided
- processor frequency scaling and low-power state transitions
- with costs of making such transitions
- battery status (if the system is battery based)
- appropriate routines to control energy-speed and energy-accuracy knobs available on I/O devices
- network interface, serial interface, LCD, etc.
19Power-aware API
- The application interface provides the following services
- The application is able to
- tell RT information to the OS (period, deadlines, WCET, hardness)
- create new threads
- tell the OS the time predicted to finish a given task instance
- depending on the conditions of the environment (application dependent and not yet implemented)
- OS must be able to predict and tell applications the time estimated to finish the task
- depends on the scheduling scheme used
- A hard task must be killed if its deadline is missed.
20A Power-Aware Software Architecture
21Power Aware Software Architecture
- PA-API (Power Aware API)
- interfaces applications and OS, making the power aware OS services available to the application writer.
- PA-OSL (Power Aware Operating System Layer)
- implements modified OS services and active components such as a DPM manager.
- PA-HAL (Power Aware Hardware Abstraction Layer)
- interfaces OS and hardware, making the power control knobs available to the OS programmer.
22Software Architecture
- PA-API - Power aware function calls available to the application writer.
- Some functions of this layer are specific to certain scheduling techniques.
- PA-Middleware - Power aware services
- implemented on top of the OS (power management threads, data handling, etc.).
- POSIX - Standard interface for OS system calls.
- This isolates PA-API and PA-Middleware from the OS.
- PA-OSL - Power aware OS layer.
- Calls related to modified OS services should go through this level. Also isolates the OS from PA-API and PA-Middleware.
- PA-HAL - Power Aware Hardware Abstraction Layer.
- Isolates the OS from the underlying power aware hardware.
- Modified OS services
- Implementation / modification of OS services in a power related fashion, e.g. scheduler, memory manager, I/O, etc.
23Layer Functionality
24DVS Related Functions
- paapi_dvs_create_thread_type(), paapi_dvs_create_thread_instance()
- create the type and an instance of a task, respectively
- paapi_dvs_app_started(), paapi_dvs_app_done()
- delimit execution of useful work in a thread. Tell the OS whether the task has finished execution or not.
- paapi_dvs_get_time_prediction(), paapi_dvs_set_time_prediction()
- get or set the current execution time prediction for a given thread
- paapi_dvs_set_adaptive_param()
- sets the parameters of the adaptive policy (described later) for a given task.
- paapi_dvs_set_policy()
- chooses the policy to be used for DVS
25DVS Related Functions (contd.)
- paosl_dvs_create_task_type_entry(), ...
- create a type and an instance of a thread in the kernel internal tables of types and instances, respectively
- paosl_dvs_killer_thread()
- kills a thread that missed a deadline
- pahal_dvs_initialize_processor_pm()
- initializes structures for processor power management
- pahal_dvs_get_current_frequency(), pahal_dvs_set_frequency_and_voltage(), pahal_dvs_pre_set_frequency_and_voltage(), pahal_dvs_get_frequency_levels_info(), pahal_dvs_post_set_frequency_and_voltage()
- functions to switch the processor among the possible frequency levels
- pahal_dvs_get_lowpower_states_info(), pahal_dvs_set_lowpower_state()
- functions to switch the processor among low power states
26DPM Functions
- paapi_dpm_register_device()
- just registers the device to be power managed
- paosl_dpm_deamon()
- implements the actual policy for a specific device. This daemon uses PA-HAL functions to decide how to switch devices among all possible states.
- pahal_dpm_device_switch_state()
- switches the device's state
- pahal_dpm_device_check_activity()
- checks whether the device has been idle and for how long. This function needs support from the device driver.
- pahal_dpm_device_get_info(), pahal_dpm_device_get_curr_state()
- get information about the device and about its current state, respectively
- Others
- functions that help implement power policies, for example
- pahal_battery_get_info() gets battery status
27 Current Status
- API specification available from
- http//www.ics.uci.edu/cpereira/pads/
- Implementation
- eCos RTOS
- open source, object oriented and highly configurable RTOS (by means of a scripting language)
- Hardware platforms we are currently working with
- Linux-synthetic (emulation of eCos over Linux, for debugging purposes only)
- Compaq iPAQ Pocket PC - StrongARM SA1110 based platform
- Accelent IDP (Integrated Development Platform) - also StrongARM SA1110 based.
- LRH Intel evaluation board 80200EVB - Intel XScale based
28DPM Algorithms Implemented
- A predictive RMS low-power scheduler
- It validates the power-aware API implementation
- assumes periodic tasks and deadline = period
- The predictive scheduler implementation is divided as follows
- tables and variables manipulation
- admission control and static slow down factor
- dynamic slow down factor computation (time prediction)
- deadline management (hard deadline tasks)
- The processor frequency and voltage are scaled according to the time predicted by the OS
- The application can also predict the execution time in order to enhance accuracy.
29Implementation
- eCos Implementation
- All the timing related information is kept internal to the eCos kernel by means of tables
- Some eCos classes were extended with new members in order to efficiently access the tables
- The code is inserted in the eCos kernel source code by means of symbol definitions
- enables automatic kernel synthesis of code
- Plans
- Extend eCos classes (by means of inheritance) instead of just adding code to them
- Have a tool to generate the implementation of different scheduling schemes automatically (using the API)
- API implementation on Embedded Linux
30Implementation
80200EVB w/ voltage scaling board and the host
system
Compaq IPAQ running eCos
Maxim board for voltage scaling
31Experiments - XScale Processor
For varying voltage
All measurements executing a busy loop
32Using Power Aware OS Example
- The scheduler adapts frequency according to the real-time parameters passed in as parameters on the thread type.
- The frequency is adjusted by means of factors by which it is multiplied, resulting in lower speed (a factor can also speed up the processor if it is > 1).

    void main()
    {
        /* The arguments give the period, WCET and deadline of the task
           (WCET 30, period and deadline 100) and its hardness. For a hard
           task, the thread instance is killed when a deadline is missed. */
        mpeg_decoding_t = paapi_dvs_create_thread_type(100, 30, 100, hard);

        /* Selects the DVS policy for all threads. */
        paapi_dvs_set_policy(SHUTDOWN | STATIC | DYNAMIC | ADAPTIVE);

        paapi_dvs_create_thread_instance(mpeg_decoding_t, mpeg_decode_thread);
        ...
    }

    void mpeg_decode_thread()
    {
        for (;;) {
            paapi_dvs_app_started();
            /* original code */
            mpeg_frame_decode();
            paapi_dvs_app_done();
        }
    }
33An Experiment
- Application and OS running on 80200 XScale board
- Altera FPGA board generating interrupts to wake up the processor
- Maxim board providing voltage scaling
- Host PC for debugging and for loading the application and OS onto the board
34The Experiment with DVS
- Shutdown when idle
- as soon as the CPU becomes idle, shut down the processor
- Shutdown + static slow down factors
- offline slow down factors are applied. The CPU is shut down when idle.
- Shutdown + static slow down + dynamic slow down
- run-time slow down factors are computed based on a history of execution times, in addition to the static factors and shutdown
- Shutdown + static slow down + dynamic slow down + adaptive slow down
- a deadline driven factor is also applied in addition to the other factors and shutdown. This factor adapts itself according to the number of deadlines missed in a previous window of executions.
35DVS Experiment
- Four parameters are defined for the adaptive factor
- number of deadlines missed that is tolerable (D) every W executions
- Window size (W)
- Lower bound for the factor (L)
- Increment and decrement steps (Inc and Dec)
- For every W executions
- if the number of deadlines missed is less than D
- lower the adaptive factor by Dec if it is greater than L, otherwise keep it as it was.
- if the number of deadlines missed is greater than D
- increment the adaptive factor by Inc.
36Application Set
- Three different real applications running concurrently
- An MPEG2 decoder
- An ADPCM (Adaptive Differential Pulse Code Modulation) speech encoder
- A floating point FFT application
37Task Set
- We used three tasksets based on the applications
described earlier as shown in the table below
38Frequency Voltage Scaling
- For the 4 schemes and the 3 tasksets in the experiment, we measured processor power consumption using a shunt resistor and a DAQ board.
- The voltage of the XScale processor is dynamically varied according to the frequency as in the table below
39Results Taskset A
- Column "deadlines missed" shows the number of deadlines missed per task (T1, T3, T4) for a total of 415/207/138 executions respectively. For the adaptive algorithm, M varies as the number between parentheses, Inc = 0.1, Dec = 0.5, W = 10 and D = 20
40Results Taskset B
- Column "deadlines missed" shows the number of deadlines missed per task (T2, T3, T4) for a total of 130/65/43 executions respectively
41Results Taskset C
- Column "deadlines missed" shows the number of deadlines missed per task (T1, T3, T5) for a total of 130/65/43 executions respectively
42OS-directed DVS Results
43Using Application-level knob
- Example: image compression algorithm
- trade off image quality against the energy available by varying compression parameters such as BPP (bits per pixel)
- The image compression algorithm is run in a continuous loop with battery polling every 10 secs.
- A simple power tradeoff policy is added to adapt the quality of the image to the battery voltage left.
- Whenever the battery drops 30mV, the application adjusts the image BPP by -0.5, starting at 1.5.
- For a cut-off of 4020mV, the battery life is extended from 290 seconds to 340 seconds.
44- The battery life is extended by 18% with a slight (not noticeable by the human eye) degradation of image quality
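The battery-driven policy can be sketched as a single function that the application calls after each battery poll. The 30 mV step, the 0.5 BPP reduction and the 1.5 starting value come from the slide. The lower limit BPP_MIN and the function name are assumptions, since the slides do not say where the reduction stops.

```c
#include <assert.h>

#define BPP_START   1.5 /* starting bits per pixel, from the slide */
#define BPP_STEP    0.5 /* reduction per voltage step, from the slide */
#define MV_PER_STEP 30  /* battery drop that triggers one step */
#define BPP_MIN     0.5 /* hypothetical floor, not given in the slides */

/*
 * Returns the BPP to use given the battery voltage at start and the voltage
 * read at the latest poll (both in millivolts).
 */
double bpp_for_battery(int start_mv, int now_mv)
{
    assert(start_mv > 0 && now_mv > 0);

    int drop_mv = start_mv - now_mv;
    if (drop_mv <= 0)
        return BPP_START;

    double bpp = BPP_START - BPP_STEP * (double)(drop_mv / MV_PER_STEP);
    return bpp < BPP_MIN ? BPP_MIN : bpp;
}
```

The compression loop would poll the battery (for instance through pahal_battery_get_info()) every 10 seconds and pass the reading to this function before encoding the next image.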
45Concluding Remarks
- Computers with radios present a very wide range of system optimization opportunities for power, size and performance
- Efficient power and energy management is key to enabling a new range of applications
- Energy efficiency is a system-level concern that cuts across subsystem components, functionality layers and their implementations
- Application programming needs to be energy aware and provide knobs for the system designer to incorporate in DPM.
46Yes, but Microsoft...
- and others have already solved the problem?
47Operating System Power Management (OSPM)
- Supported by Microsoft's desktop operating systems via APM - Advanced Power Management
- OS/BIOS co-operation
- When the OS goes idle, it accesses a register, which causes an SMI
- SMI handler puts the system into a low power state
- APM required the OS to trust the system BIOS
48Current OSPM - ACPI
- Advanced Configuration and Power Interface (ACPI)
- OS visible (SCI-based) as opposed to OS invisible (SMI-based)
- OS/drivers/BIOS are in sync regarding power states
- Standard way for the system to describe its device configuration and power control hardware interface to the OS
- register interface for common functions
- system control events, processor power and clock control, thermal management, and resume handling
- Info on devices, resources, control mechanisms
- Thermal Management
51ACPI Processor Power States
Power Throttling
Latency: C1 < C2 < C3
Power: C1 > C2 > C3
52Overview of ACPI System States
Table columns: State, CPU, Memory, Context Tracking, Devices, Wake Up

G0 (Working)
- CPU: C0 executing @ full speed; C1-C3 executing in PM state (i.e. thermal throttle/HLT)
- Memory: retained, power ON, refresh normal
- Devices: powered up / down based on demand (D0-D3)

S1 (Sleeping)
- CPU: not executing, context retained, CPU CLK OFF, system CLK ON, power ON
- Memory: retained, power ON, refresh normal
- Context Tracking: H/W responsible for saving context of CPU, system I/O, memory
- Devices: power down depending on wakeup power requirements
- Wake Up: lowest latency, restart @ CS:IP + 1

S2 (Sleeping)
- CPU: not executing, CPU/system cache context lost, CPU CLK OFF, system CLK OFF, power ON
- Memory: retained, power ON, refresh standby / auto
- Context Tracking: H/W responsible for saving context of system I/O and memory; OS responsible for saving CPU context
- Devices: power down depending on wakeup power requirements
- Wake Up: latency > S1, restart @ boot vector

S3 (Sleeping)
- CPU: not executing, CPU/cache context lost, CPU CLK OFF, system CLK OFF, power OFF
- Memory: retained, power ON, refresh standby / auto
- Context Tracking: H/W responsible for saving memory context; BIOS restores memory controller context; OS responsible for saving CPU and system I/O context
- Devices: power down depending on wakeup power requirements
- Wake Up: latency > S2, restart @ boot vector

S4 / S4BIOS (Sleeping)
- CPU: not executing, CPU/cache context lost, everything OFF
- Memory: context lost, power OFF, refresh N/A
- Context Tracking: OS (S4) / BIOS (S4bios) is responsible for saving and restoring all system context, including memory
- Devices: power down depending on wakeup power requirements
- Wake Up: latency > S3, restart @ boot vector

G2/S5 (Soft OFF)
- CPU: OFF
- Memory: OFF
- Context Tracking: OS uses S5 to turn the machine off
- Devices: devices are OFF, power button press will wake up the system
- Wake Up: latency > S4, restart @ boot vector

NOTES
- OS chooses the lowest supported sleep state in which all enabled wakeup devices still function under the latency requirements from apps.
- ASL binds each Sx state to a SLP_TYP value, which, based on the platform design of power planes and clocking logic, determines what portions of the h/w power down.
- For each device, ASL lists which power resources are needed to maintain a wakeup capable state
- System I/O refers to motherboard devices: PIT, PIC, DMAC, NMI state... OS saves and restores this stuff for S3
53Summary of functional areas covered by ACPI
- System Power Management
- ACPI defines mechanisms for putting the computer as a whole in and out of system sleeping states.
- Device Power Management
- ACPI tables describe devices, their power states, the power planes the devices are connected to, and controls for putting devices into different power states.
- Processor power management
- While the OS is idle but not sleeping, it will use commands described by ACPI to put processors in low-power states.
- Device and processor performance management
- DPM to achieve a desirable balance between performance and energy by transitioning devices and processors into different states when the system is active.
54ACPI functionalities (cont.)
- Plug and Play
- hierarchically arranged device and configuration information
- System Events
- a general event mechanism for system events such as thermal events, power management events, docking, device insertion and removal, and so on
- Battery management
- either through a Smart Battery subsystem interface controlled by the OS directly through the embedded controller interface, or a Control Method Battery interface.
- Thermal management
- provides a model to allow OEMs to define thermal zones, thermal indicators, and methods for cooling thermal zones.
- A standard hardware and software interface between OS and Embedded Controller
- allows any OS to provide a standard bus enumerator that can directly communicate with an embedded controller in the system, thus allowing other drivers within the system to communicate with and use the resources of system embedded controllers.
55Microsoft's OnNow
- Win32 API extension allows applications to
- affect the power management decision making
- adapt to power state
- find out if running on batteries so as to reduce processing
- discover disk state and postpone low priority I/O, e.g. paging
- Requires changes in hardware, firmware (BIOS), OS, and application software
- bus and device power management standards for h/w
- ACPI interface standard between OS and hardware
- integration of power management into app control
56OnNow Components
Ref. Microsoft's OnNow Power Management Architecture for Applications
57OnNow Architecture
- User's view: system is either on or off
- Reality: system transitions among a number of power states according to the OS's power policy
- Global power states
- working: apps are executing
- sleep: software is not executing, CPU is stopped
- OS tracks the user's activities and application execution states to decide when to enter sleep; it monitors user input and hints from applications
- wake-up is time-based or device-based
- off: system has shut down and must reboot
58OnNow Architecture (contd.)
[Figure: OnNow global power states]
- Working (full on): device and processor power conservation occurring according to system usage
- Sleeping (appears off): processor stopped, devices off, wake-up events enabled, timed wake-up enabled
- SoftOff (off): able to turn on electronically
- Transitions labeled in the figure: power switch, idle, wake-up
Ref. Microsoft's OnNow Power Management Architecture for Applications
59OnNow Architecture (contd.)
- Power states for individual devices
- managed by device drivers while the system is working
- function of application needs, device capabilities, and OS information
- e.g. shut down the serial port if not in use
- Power states for CPU
- OS transitions the CPU between its various low-power states based on CPU usage
- function of power source, processing time, user preferences etc.
- API mechanisms
- for apps to learn about power events and status from the OS
- for apps to tell their requirements to the OS