Power Aware Software Architecture

1
Power Aware Software Architecture
  • Rajesh K. Gupta
  • University of California, San Diego

Cristiano Pereira, Ravindra Jejurikar, Yuvraj
Agrawal, Manjari Chachharia, Sandeep Shukla,
Mukesh Rajan
In collaboration with Sandy Irani, Mani Srivastava
2
Computing In New Spaces
  • Generational shift in computing devices
  • lot more of everything including networking and
    communications
  • lot less of power, energy, volume, weight,
    patience
  • Application is everything, the possibilities are
    limitless
  • System architectures are due for an overhaul
  • the architectures are (radically)
    changed/challenged
  • the programming context is changed
  • the system software contract is changed
  • new awareness: location, power, timing,
    reactivity, stability

3
Outline
  • The case for power awareness in
  • application development
  • system software
  • Managing power in the OS
  • knobs and strategies
  • Making software power aware
  • the hardware knobs (DVS, DPM)
  • the application knobs (duty cycling, criticality,
    aesthetics)
  • An ongoing experiment

4
The Case for Power Awareness
  • Limited availability
  • Energy and power use of new devices is markedly
    different from that of laptops and notebook computers
  • much wider dynamic range of power demand
  • increasing share of memory, communication and
    signal processing
  • multiple power use modalities depending upon
    application
  • immortal, paging-mode RX, lifeline TX,
    mission-mode

5
Power Management Places
  • Hardware and firmware
  • many techniques for low power design in circuits,
    architectures, etc.
  • but they don't know the global state or have
    application-specific knowledge
  • Users
  • don't know component characteristics, and can't
    make frequent decisions
  • Applications
  • operate independently
  • and the OS hides machine information from them
  • OS (role in resource allocation and sharing)
  • it is a logical place for dynamic power
    management
  • early results show 50-70% savings due to
    frequency/voltage scaling [Weiser94]
  • application-specific constraints and
    opportunities for saving energy that can be known
    only at that level

6
Operating System Directed Power Management
  • Significant opportunities in power management lie
    with application-specific knobs
  • quality of service, timing criticality of various
    functions
  • Needs of applications are the driving force for OS
    power management functions: a power-based API
  • collaboration between applications and the OS in
    setting energy use policy
  • OS helps resolve conflicts and promote
    cooperation
  • OS is the most reasonable place, but
  • OS should incorporate application information in
    power management
  • OS should expose power state and events to
    applications for them to adapt.

7
Power Savings Mechanisms
  • A. Dynamic Power Management (DPM)
  • When a device is idle, it can transition to
    low-power sleep states.
  • Current trend is to design devices with multiple
    sleep states and provide device driver hooks to
    change these states under OS control.
  • B. Dynamic Voltage Scaling (DVS)
  • A device can be run at different speeds at
    different power levels.
  • Execution of jobs can be slowed down to save
    power as long as all jobs are completed by their
    deadlines.
  • C. Application level knobs
  • quality and performance measures, application
    tolerances

8
A. Dynamic Power Management
  • When a device becomes idle, it can transition to
    lower power usage state.
  • A fixed amount of additional time and energy are
    required to transition back to active state when
    a new request for service arrives.
  • What is the best time threshold to transition to
    the sleep state?
  • Too soon: pay the start-up cost too frequently.
  • Too late: spend too much time in the high-power
    state.
  • Generally, transition to sleep state when the
    cost of being in active state is at least the
    cost of waking up.

A
9
Our Work In This Context
  • We have developed quantitative bounds on the
    quality of DPM algorithms based on Competitive
    Analysis [TCAD 01]
  • provides a basis for DPM strategy comparison
  • Developed DPM strategies for devices with both
    multiple active and multiple sleep states [TCAD
    02]
  • Design and analyze algorithms for systems that
    allow both DPM and DVS [SODA 03, TECS 02]
  • Important conclusions
  • Not all power states are useful in a given DPM
    strategy
  • DPM generally useful for improving quality
    measures.

A
10
Competitive Analysis
  • Deterministic algorithm (ski rental)
  • Transition to sleep state when the cost of being
    in active state is at least the cost of waking
    up.
  • Normalize cost of transitioning from sleep to
    active state to 1.
  • Power consumption rate of active state is α.
  • This algorithm is 2-competitive.
  • 2 is the best possible competitive ratio for any
    deterministic algorithm.
  • Probabilistic algorithm
  • Idle period length generated by known
    distribution with density function p(t).
  • Choose threshold T to minimize cost
  • For any distribution p(t), the expected cost of
    the above algorithm is within e/(e-1) of the
    optimal cost. Furthermore, there is a
    distribution for which no algorithm can be better
    than e/(e-1) times optimal.
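
A minimal C sketch of the deterministic threshold rule above. It assumes the sleep state draws no power, the wake-up cost is a single lump of energy, and all quantities are in arbitrary but consistent units. The function names are illustrative and are not part of any API in these slides.

```c
#include <assert.h>

/* Energy spent over one idle period of length `idle` by a threshold policy:
 * stay active until `threshold`, then sleep and pay `wake_cost` on wake-up.
 * Assumption: the sleep state itself consumes no power. */
double threshold_policy_energy(double idle, double active_power,
                               double wake_cost, double threshold)
{
    assert(idle >= 0.0 && active_power > 0.0 && wake_cost > 0.0);
    if (idle <= threshold)
        return active_power * idle;              /* next request came first */
    return active_power * threshold + wake_cost; /* idled, slept, woke up */
}

/* Offline optimum: it knows `idle` in advance, so it either stays active
 * for the whole period or sleeps immediately, whichever is cheaper. */
double optimal_energy(double idle, double active_power, double wake_cost)
{
    double stay = active_power * idle;
    return stay < wake_cost ? stay : wake_cost;
}

/* Ski-rental threshold: the idle time at which the energy already spent
 * in the active state equals the cost of waking up. */
double ski_rental_threshold(double active_power, double wake_cost)
{
    assert(active_power > 0.0);
    return wake_cost / active_power;
}
```

With active power 2 and wake-up cost 1 the threshold is 0.5. For a long idle period the policy spends 2 while the offline optimum spends 1, which is the worst-case factor of 2 quoted above.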

A
11
Multi-state DPM Case
  • Let there be k+1 states
  • Let State k be the shut-down state and 0 be the
    active state
  • Let α_i be the energy dissipation rate at state i
  • Let β_i be the total energy dissipated to move
    back to State 0
  • States are ordered such that α_{i+1} ≤ α_i
  • α_k = 0 and β_0 = 0 (without loss of generality).
  • Power down energy cost can be incorporated in the
    power up cost for analysis (if additive).
  • Now formulate an optimization problem to
    determine the state transition thresholds.

A
12
Lower Envelope Idea
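[Figure: energy versus time. For each state i (State 1 to
State 4), plot its energy as a function of idle time; the
lower envelope of the lines changes state at t1, t2, t3.]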
  • LEA can be deterministic or probabilistic
  • PLEA is e/(e-1) competitive.
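
A sketch in C of how the deterministic lower-envelope thresholds could be computed. It assumes that the energy of sitting in state i for an idle time t and then returning to the active state is alpha[i] * t + beta[i], which follows the definitions on the multi-state slide (dissipation rate and return energy, called alpha and beta here). The function names and array layout are hypothetical.

```c
#include <assert.h>

/* Energy of spending idle time t in state i and then returning to
 * state 0 (assumed model: a straight line per state). */
double state_energy(const double *alpha, const double *beta, int i, double t)
{
    return alpha[i] * t + beta[i];
}

/* Walk the lower envelope of the n lines, starting in state 0 at t = 0.
 * next_state[m] and enter_time[m] describe the m-th transition.
 * Returns the number of transitions; states that never touch the
 * envelope are skipped. Arrays must have room for n - 1 entries. */
int lea_thresholds(const double *alpha, const double *beta, int n,
                   int *next_state, double *enter_time)
{
    int cur = 0, count = 0;
    double now = 0.0;
    assert(n >= 1);
    while (cur < n - 1) {
        int best = -1;
        double best_t = 0.0;
        for (int j = cur + 1; j < n; j++) {
            if (alpha[j] >= alpha[cur])
                continue;                 /* never becomes cheaper */
            /* time at which line j crosses below the current line */
            double t = (beta[j] - beta[cur]) / (alpha[cur] - alpha[j]);
            if (t < now)
                t = now;
            if (best < 0 || t < best_t ||
                (t == best_t && alpha[j] < alpha[best])) {
                best = j;
                best_t = t;
            }
        }
        if (best < 0)
            break;
        next_state[count] = best;
        enter_time[count] = best_t;
        count++;
        cur = best;
        now = best_t;
    }
    return count;
}
```

For rates {1.0, 0.5, 0.4, 0.0} and return energies {0.0, 1.0, 3.0, 4.0}, the walk goes from state 0 to state 1 at t = 2 and then straight to state 3 at t = 6. State 2 never reaches the envelope, which illustrates the earlier remark that not all power states are useful.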

A
13
Power-Latency Tradeoff
  • Tasks arrive through time and take time to run
  • If the device is busy when a task arrives, it
    waits in a queue
  • Idle period begins when device finishes current
    job and the queue is empty
  • If device transitions to sleep state in an idle
    period, some latency is incurred as device
    transitions to active state.
  • This in turn affects (shortens) the length of
    future idle periods.
  • Power-Latency tradeoff extremes
  • Minimize latency: always stay in the active
    state.
  • Minimize energy usage: delay completing any tasks
    until they have all arrived.

A
14
Experimental Study: IBM Mobile Hard Drive
Trace data with arrival times of disk accesses
from Auspex file server archive.
A
15
IBM Mobile Hard Drive
A
16
B. Dynamic Voltage Scaling
  • Device which can run at any speed s.
  • Power consumed when running at speed s is given by
    a convex function P(s).
  • Jobs arrive through time. Job j has
  • Arrival time a_j
  • Deadline b_j
  • Work required R_j
  • Schedule S = (s, job)
  • s(t) is the speed of the device at time t.
  • job(t) is which job is executed at time t.

B
17
Dynamic Voltage Scaling, No Sleep (DVS-NS)
  • Schedule S is feasible for a set of jobs J if, for
    every j in J, the work R_j is completed between
    a_j and b_j
  • Cost of schedule S is the total energy consumed,
    i.e. the integral of P(s(t)) over time
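
The two formulas for this slide did not survive in the transcript. A plausible reconstruction from the definitions on the previous slide (arrival time, deadline, work, s(t), job(t) and P(s)) is given below. The indicator notation and the schedule horizon [t_0, t_1] are assumptions made here, not the slide's original notation.

```latex
% Feasibility: all work of job j is done inside its window [a_j, b_j]
\[
  \int_{a_j}^{b_j} s(t)\,\mathbf{1}\!\left[\mathrm{job}(t) = j\right] dt \;=\; R_j
  \qquad \text{for every } j \in J
\]
% Cost: total energy over the schedule horizon
\[
  \mathrm{cost}(S) \;=\; \int_{t_0}^{t_1} P\bigl(s(t)\bigr)\, dt
\]
```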

B
18
DVS with Sleep State (DVS-S)
  • Schedule S = (s, job, h)
  • h(t) = sleep or on
  • If h(t) = sleep, then s(t) = 0.
  • Power is a function of speed and state
  • P(s, state) = P(s) if state = on.
  • P(s, state) = 0 if state = sleep.
  • P(0) is the power required to keep the device active
    with no tasks running.
  • Let k be the number of times the device
    transitions from sleep state to the on state
  • Cost of a schedule S is the total energy consumed
    while on plus the energy of the k sleep-to-on
    transitions
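
The cost formula for DVS-S is also missing from the transcript. One way to write it, using the state-dependent power function and the transition count k defined on this slide, is sketched below. The symbol C for the energy of a single sleep-to-on transition and the horizon [t_0, t_1] are assumptions introduced here.

```latex
\[
  \mathrm{cost}(S) \;=\; \int_{t_0}^{t_1} P\bigl(s(t),\, h(t)\bigr)\, dt \;+\; k \cdot C
\]
```

If the wake-up cost is normalized to 1, as on the Competitive Analysis slide, the second term is simply k.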

B
19
Critical Speed
  • If the cost to transition from sleep state to the
    on state were 0, the optimal speed for all jobs
    would be the s that minimizes (R_j/s) P(s)
  • This is the s that satisfies P(s) = s P'(s).
  • Call this s_crit, the critical speed for P.
  • If we compress the execution of a task by x,
  • we expend additional energy because we execute
    the job faster
  • we save P(0) x.
  • s_crit is the point at which it is no longer
    beneficial to compress the execution of a task.
  • Our approach
  • Decide on Active/Idle intervals (determined by
    critical speed)
  • Decide on Sleep/On intervals (determined by the
    cost of staying on)
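
A small C sketch of finding the critical speed numerically. The power model P(s) = p_idle + s^3 is a hypothetical example chosen only because it is convex with P(0) > 0; the slides do not give a concrete P. The search uses the fact that s * P'(s) - P(s) is non-decreasing in s for convex P and negative at s = 0.

```c
#include <assert.h>

/* Hypothetical convex power model: P(s) = p_idle + s^3. */
double power(double p_idle, double s)
{
    return p_idle + s * s * s;
}

/* Derivative of the model above: P'(s) = 3 s^2. */
double power_derivative(double s)
{
    return 3.0 * s * s;
}

/* Critical speed: the s in (0, s_max] that minimizes energy per unit
 * of work, P(s) / s. It satisfies P(s) = s * P'(s); bisection is run
 * on g(s) = s * P'(s) - P(s). */
double critical_speed(double p_idle, double s_max)
{
    double lo = 0.0, hi = s_max;
    assert(p_idle > 0.0 && s_max > 0.0);
    if (hi * power_derivative(hi) - power(p_idle, hi) < 0.0)
        return s_max;   /* energy per unit work is still falling at s_max */
    for (int i = 0; i < 60; i++) {
        double mid = 0.5 * (lo + hi);
        if (mid * power_derivative(mid) - power(p_idle, mid) < 0.0)
            lo = mid;
        else
            hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

For this model the condition reduces to 2 s^3 = p_idle, so p_idle = 2 gives a critical speed of 1. Running a job slower than that would cost more energy overall, because the device stays on longer.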

B
20
Implementing DVS
  • Often done using slowdown factors
  • can be static or dynamic
  • For example
  • Given a frequency range of [f_min, f_max]
  • Slowdown factor is the frequency normalized to
    [η_min, 1], where η_min = f_min / f_max.
  • When we use a slowdown factor of η, we set the
    frequency to f = η f_max.
  • The voltage is changed to the minimum voltage
    supported at f.
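
The mapping from a slowdown factor to a frequency can be sketched in C as follows. The slowdown factor is called eta here. The clamping and the rounding up to the next available discrete level are assumptions about how a real platform would apply the rule, and the function names are illustrative.

```c
#include <assert.h>

/* Clamp the slowdown factor eta to [f_min / f_max, 1] and return the
 * target frequency f = eta * f_max. */
double frequency_for_slowdown(double eta, double f_min, double f_max)
{
    double eta_min;
    assert(f_min > 0.0 && f_min <= f_max);
    eta_min = f_min / f_max;
    if (eta < eta_min)
        eta = eta_min;
    if (eta > 1.0)
        eta = 1.0;
    return eta * f_max;
}

/* Index of the lowest available level that is at least f.
 * `levels` must be sorted in ascending order. Rounding up rather than
 * down keeps the processor at least as fast as the factor requires. */
int lowest_level_at_least(const double *levels, int n, double f)
{
    assert(n > 0);
    for (int i = 0; i < n; i++)
        if (levels[i] >= f)
            return i;
    return n - 1;   /* request above the top level: use the fastest */
}
```

The voltage would then be looked up for the chosen level, as the last bullet describes.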

B
21
Slowdown Factors
  • Much of the work on slowdown factors has been in
    the context of real-time systems
  • makes sense since we need something to trade off
    against the power saved
  • Known results
  • Essentially use schedulability tests to determine
    the amount of slowdown possible
  • Along with the attendant assumptions and almost a
    repeat of Real Time research history...
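
As one concrete instance of using a schedulability test to bound the slowdown, the C sketch below uses the EDF utilization test for periodic tasks whose deadline equals their period. This is a deliberately simple stand-in: the scheduler described later in these slides is RMS-based, which needs a different test. The struct and function names are hypothetical.

```c
#include <assert.h>

/* Periodic task; wcet is measured at full speed, deadline = period. */
struct task {
    double wcet;
    double period;
};

/* Total utilization at full speed. */
double utilization(const struct task *tasks, int n)
{
    double u = 0.0;
    for (int i = 0; i < n; i++) {
        assert(tasks[i].period > 0.0);
        u += tasks[i].wcet / tasks[i].period;
    }
    return u;
}

/* Smallest static slowdown factor that keeps the set schedulable under
 * EDF: at factor eta every WCET grows to wcet / eta, so the test
 * U / eta <= 1 gives eta >= U. The result is raised to eta_min if
 * needed. Returns -1.0 if the set is infeasible even at full speed. */
double static_slowdown_edf(const struct task *tasks, int n, double eta_min)
{
    double u = utilization(tasks, n);
    if (u > 1.0)
        return -1.0;
    return u < eta_min ? eta_min : u;
}
```

A task set with utilization 0.5 could therefore run at half speed for its whole lifetime, before any run-time slack is reclaimed.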

B
22
C. Enable Application Knobs
  • Need an API: provide ways by which Application, OS
    and Hardware can exchange energy/power and
    performance related information efficiently.
  • Need Middleware: facilitate a continuous dialogue
    / adaptation between OS / Applications.
  • Need a HAL: facilitate the implementation of power
    aware OS services by providing a software
    interface to low power devices

C
23
Power-aware API Requirements
  • Independent of Hardware and RTOS implementations
  • enables its use in different hardware platforms
  • for this all routines should access the HAL
    (Hardware Abstraction Layer) rather than the
    Hardware directly
  • enables its use in different RTOS as well as its
    use with different scheduling strategies
  • do not count on specific RTOS info and/or
    specific schedulers
  • Services provided
  • processor frequency scaling and low-power state
    transitions
  • with costs of making such transitions
  • battery status (if the system is battery based)
  • appropriate routines to control energy-speed and
    energy-accuracy knobs available on I/O devices
  • network interface, serial interface, LCD, etc.

C
24
Power-aware API
  • The applications interface provides the following
    services
  • The application is able to
  • tell RT information to OS (period, deadlines,
    WCET, hardness)
  • create new threads
  • tell OS time predicted to finish a given task
    instance
  • depending on the conditions of the environment
    (application dependent and not yet implemented)
  • OS must be able to predict and tell applications
    the time estimated to finish the task
  • depends on the scheduling scheme used
  • A hard task must be killed if its deadline is
    missed
  • matter of policy in the context of application
    use.

C
25
A Power-Aware Software Architecture
C
26
Power Aware Software Architecture
  • PA-API (Power Aware API)
  • interfaces applications and OS making the power
    aware OS services available to the application
    writer.
  • PA-OSL (Power Aware Operating System Layer)
  • implements modified OS services and active
    components such as a DPM manager.
  • PA-HAL (Power Aware Hardware Abstraction Layer)
  • interfaces OS and Hardware making the power
    control knobs available to the OS programmer.

C
27
Software Architecture
  • PA-API - Power aware function calls available to
    the application writer.
  • Some functions of this layer are specific to
    certain scheduling techniques.
  • PA-Middleware - Power aware services
  • implemented on the top of the OS (power
    management threads, data handling, etc...).
  • POSIX - Standard interface for OS system calls.
  • This isolates PA-API and PA-Middleware from OS.
  • PA-OSL - Power aware OS layer.
  • Calls related to modified OS services should go
    through this level. Also isolates OS from PA-API
    and PA-Middleware.
  • PA-HAL - Power Aware Hardware Abstraction Layer.
  • Isolates OS from underlying power aware hardware.
  • Modified OS services
  • Implementation / modification of OS services in a
    power related fashion, e.g. scheduler, memory
    manager, I/O, etc.

C
28
Layer Functionality
C
29
DVS Related Functions
  • paapi_dvs_create_thread_type(),
    paapi_dvs_create_thread_instance()
  • creates type and instance of a task respectively
  • paapi_dvs_app_started(), paapi_dvs_app_done()
  • delimits execution of useful work in a thread.
    Tell the OS whether the task has finished
    execution or not.
  • paapi_dvs_get_time_prediction(),
    paapi_dvs_set_time_prediction()
  • get and set the current execution time prediction
    for a given thread
  • paapi_dvs_set_adaptive_param()
  • sets the parameters of the adaptive policy
    (described later) for a given task.
  • paapi_dvs_set_policy()
  • chooses the policy to be used for DVS

C
30
DVS Related Functions (contd.)
  • paosl_dvs_create_task_type_entry(), ...
  • create a type and an instance of a thread in the
    kernel internal tables of type and instance
    respectively
  • paosl_dvs_killer_thread()
  • kills a thread that missed a deadline
  • pahal_dvs_initialize_processor_pm()
  • initialize structures for processor power
    management
  • pahal_dvs_get_current_frequency(),
    pahal_dvs_set_frequency_and_voltage(),
    pahal_dvs_pre_set_frequency_and_voltage(),
    pahal_dvs_post_set_frequency_and_voltage(),
    pahal_dvs_get_frequency_levels_info()
  • functions to switch the processor among possible
    frequency levels
  • pahal_dvs_get_lowpower_states_info(),
    pahal_dvs_set_lowpower_state()
  • functions to switch processor among low power
    states

C
31
DPM Functions
  • paapi_dpm_register_device()
  • just register the device to be power managed
  • paosl_dpm_deamon()
  • implements the actual policy for a specific
    device. This daemon uses PA-HAL functions to
    decide how to switch devices among all
    possible states.
  • pahal_dpm_device_switch_state()
  • switches a device's state
  • pahal_dpm_device_check_activity()
  • checks whether the device has been idle and for
    how long. This function needs support from the
    device driver.
  • pahal_dpm_device_get_info(),
    pahal_dpm_device_get_curr_state()
  • get information about the device and about its
    current state respectively
  • Others
  • functions that help implement power
    policies. For example
  • pahal_battery_get_info() gets battery status

C
32
Current Status
  • API specification available from
  • http://www.ics.uci.edu/cpereira/pads/
  • Implementation
  • eCos RTOS
  • open source, object-oriented and highly
    configurable RTOS (by means of a scripting
    language)
  • Hardware platforms we are currently working with
  • Linux-synthetic (emulation of eCos over Linux -
    debugging purposes only)
  • Compaq iPaq Pocket PC - StrongARM SA1110 based
    platform
  • Accelent IDP (Integrated Development Platform)
    - also StrongARM SA1110 based.
  • LRH Intel evaluation board 80200EVB - Intel
    XScale based

33
DPM Algorithms Implemented
  • A predictive RMS low-power scheduler
  • It validates the power-aware API implementation
  • assumes periodic tasks and deadline = period
  • The predictive scheduler implementation is
    divided as follows
  • tables and variables manipulation
  • admission control and static slow down factor
  • dynamic slow down factor computation (time
    prediction)
  • deadline management (hard deadline tasks)
  • The processor frequency and voltage are scaled
    according to the time predicted by the OS
  • The application can also predict the execution
    time in order to enhance accuracy.

C
34
Experiments - XScale Processor
For varying voltage
All measurements executing a busy loop
35
Using Power Aware OS: Example
  • The scheduler adapts frequency according to the
    real-time parameters passed in as parameters of
    the thread type.
  • The frequency is adjusted by means of slowdown
    factors (a factor can also speed up the processor
    if it is > 1).

    void main() {
        mpeg_decoding_t = paapi_dvs_create_thread_type(100, 30, 100, hard);
        paapi_dvs_set_policy(SHUTDOWN | STATIC | DYNAMIC | ADAPTIVE);
        paapi_dvs_create_thread_instance(mpeg_decoding_t, mpeg_decode_thread);
        ...
    }

    void mpeg_decode_thread() {
        for (;;) {
            paapi_dvs_app_started();
            /* original code */
            mpeg_frame_decode();
            paapi_dvs_app_done();
        }
    }

Callouts:
  • paapi_dvs_create_thread_type(): the numeric arguments
    carry the period, WCET and deadline
  • hard: kills the thread instance when deadline is missed
  • paapi_dvs_set_policy(): selects the DVS policy for all
    threads
C
36
An Experiment
  • Application + OS running on 80200 XScale board
  • Altera FPGA board generating interrupts to wake
    up the processor
  • Maxim board providing voltage scaling
  • Host PC for debugging and for loading the App.
    + OS into the board

37
The Experiment with DVS
  • Shutdown when idle
  • as soon as the CPU becomes idle, shut down the
    processor
  • Shutdown + static slowdown factors
  • offline slowdown factors are applied. The CPU is
    shut down when idle.
  • Shutdown + static slowdown + dynamic slowdown
  • run-time slowdown factors are computed based on
    a history of execution times, in addition to the
    static factors and shutdown
  • Shutdown + static slowdown + dynamic slowdown +
    adaptive slowdown
  • a deadline-driven factor is also applied in
    addition to the other factors and shutdown. This
    factor adapts itself according to the number of
    deadlines missed in a previous window of
    executions.

38
DVS Experiment
  • Four parameters are defined for the adaptive
    factor
  • Number of missed deadlines tolerable (D) every W
    executions
  • Window size (W)
  • Lower bound for the factor (L)
  • Increment and decrement steps (Inc and Dec)
  • For every W executions
  • if the number of deadlines missed is less than D,
    lower the adaptive factor by Dec if it is greater
    than L, otherwise keep it as it was.
  • if the number of deadlines missed is greater than D,
    increment the adaptive factor by Inc.
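
The window rule above can be sketched in C as follows. The struct layout and function names are hypothetical. Clamping the factor at L after a decrement is an assumption, since the slide only says the factor is lowered while it is greater than L.

```c
#include <assert.h>

struct adaptive_params {
    int tolerable_misses;   /* D */
    int window;             /* W */
    double lower_bound;     /* L */
    double inc;             /* Inc */
    double dec;             /* Dec */
};

struct adaptive_state {
    double factor;          /* current adaptive slowdown factor */
    int executions;         /* executions seen in the current window */
    int misses;             /* deadlines missed in the current window */
};

/* Apply the rule once, at the end of a window of W executions. */
double update_adaptive_factor(double factor, int misses,
                              const struct adaptive_params *p)
{
    assert(p->inc > 0.0 && p->dec > 0.0);
    if (misses < p->tolerable_misses) {
        if (factor > p->lower_bound) {
            factor -= p->dec;
            if (factor < p->lower_bound)
                factor = p->lower_bound;   /* assumption: clamp at L */
        }
    } else if (misses > p->tolerable_misses) {
        factor += p->inc;
    }
    return factor;   /* unchanged when misses == D */
}

/* Record one task execution; updates the factor every W executions. */
void record_execution(struct adaptive_state *st, int missed_deadline,
                      const struct adaptive_params *p)
{
    assert(p->window > 0);
    st->executions++;
    if (missed_deadline)
        st->misses++;
    if (st->executions == p->window) {
        st->factor = update_adaptive_factor(st->factor, st->misses, p);
        st->executions = 0;
        st->misses = 0;
    }
}
```

A larger factor means a faster processor, so missed deadlines push the speed up and quiet windows let it drift back down toward L.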

39
Application Set
  • Three different real applications running
    concurrently
  • An MPEG2 decoder
  • An ADPCM (Adaptive Differential Pulse Code
    Modulation) speech encoder
  • Floating point FFT application

40
Task Set
  • We used three tasksets based on the applications
    described earlier as shown in the table below

41
Frequency Voltage Scaling
  • For the 4 schemes and the 3 tasksets experimented
    we measured processor power consumption using a
    shunt resistor and a DAQ board.
  • The voltage of the XScale processor is
    dynamically varied according to the frequency as
    in the table below

42
Results Taskset A
  • The "deadlines missed" column shows the number of
    deadlines missed per task (T1, T3, T4) for a
    total of 415/207/138 executions respectively. For
    the adaptive algorithm, M varies as the number
    between parentheses, Inc = 0.1, Dec = 0.5, W = 10 and
    D = 20

43
Results Taskset B
  • The "deadlines missed" column shows the number of
    deadlines missed per task (T2, T3, T4) for a
    total of 130/65/43 executions respectively

44
Results Taskset C
  • The "deadlines missed" column shows the number of
    deadlines missed per task (T1, T3, T5) for a
    total of 130/65/43 executions respectively

45
OS-directed DVS Results
46
Using an Application-level Knob
  • Example: image compression algorithm
  • trade off image quality against energy available
    by varying compression parameters such as BPP
    (bits per pixel)
  • The image compression algorithm is run in a
    continuous loop with battery polling every 10
    seconds.
  • A simple power tradeoff policy is added to adapt
    the quality of the image to the battery
    voltage left.
  • Whenever the battery drops by 30 mV, the application
    adjusts the image BPP by -0.5, starting at 1.5.
  • For a cut-off of 4020 mV, the battery life is
    extended from 290 seconds to 340 seconds.
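
The battery-driven policy above could look roughly like the C sketch below. The 30 mV step, the -0.5 BPP adjustment and the 1.5 starting value come from the slide. The starting voltage, the minimum BPP floor and all names are assumptions made for illustration.

```c
#include <assert.h>

struct bpp_policy {
    double bpp;       /* current bits per pixel, starts at 1.5 */
    int last_mv;      /* battery voltage at the last adjustment, in mV */
    int drop_mv;      /* voltage drop that triggers an adjustment (30) */
    double step;      /* BPP reduction per trigger (0.5) */
    double min_bpp;   /* assumed floor so that BPP stays positive */
};

/* Call on every battery poll (every 10 seconds on the slide).
 * Lowers BPP by one step for each full drop_mv lost since the last
 * adjustment and returns the BPP to use for the next image. */
double on_battery_poll(struct bpp_policy *p, int battery_mv)
{
    assert(p->drop_mv > 0 && p->step > 0.0);
    while (p->last_mv - battery_mv >= p->drop_mv) {
        p->last_mv -= p->drop_mv;
        p->bpp -= p->step;
        if (p->bpp < p->min_bpp)
            p->bpp = p->min_bpp;
    }
    return p->bpp;
}
```

The compression loop would call on_battery_poll() with the latest reading and pass the returned BPP to the encoder. It would stop once the voltage reaches the cut-off.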

47
  • The battery life is extended by 18% with a slight
    (not noticeable to the human eye) degradation of
    image quality

48
Concluding Remarks
  • Computers with radios present a very wide range
    of system optimization opportunities for power
    performance
  • Efficient power and energy management is key to
    enabling new range of applications
  • Energy efficiency is a system-level concern that
    cuts across components, functionality layers and
    implementations
  • Application programming needs to be energy aware
    and provide knobs for the system designer to
    incorporate in DPM.

49
Yes, but Microsoft...
  • what are they doing?