Duke - PowerPoint PPT Presentation

1
Duke's Milly Watt Project
Carla Ellis
  • Students
  • Sita Badrish
  • Rebecca Braynard
  • Angela Dalton
  • Albert Meixner
  • Shobana Ravi
  • Faculty
  • Alvin Lebeck
  • Amin Vahdat (UCSD)
  • Alumni
  • Xiaobo Fan, Ph.D.
  • Heng Zeng, Ph.D.
  • Surendar Chandra, Ph.D.

Systems Architecture
2
Milly Watt Motivation
  • Energy for computing is an important problem
    (not just for mobile computing)
  • Reducing heat production and fan noise
  • Extending battery life for mobile/wireless
    devices
  • Conserving energy resources (lessen environmental
    impact, save on electricity costs)
  • How does software interact with or exploit
    low-power hardware?

3
Milly Watt Vision
  • Energy should be a first class resource at
    upper levels of system design
  • Focus on Architecture, OS, Networking,
    Applications
  • Energy has an impact on every other resource of a
    computing system; it is central.
  • HW / SW cooperation to achieve energy goals

4
Energy Management Spectrum
HW / SW Cooperation
  • Software
  • High level
  • Coarse grain
  • OS, compiler or application

Hardware
  • Low level
  • Fine grain
  • Low-power Circuits
  • Voltage Scaling
  • Clock gating
  • Power modes: turning off HW blocks
  • Re-examine interactions between HW and SW,
    particularly within the resource management
    functions of the Operating System

5
Power Budget
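[Figure: system block diagram of the components that share the power budget: CPU, cache, memory bus, I/O bridge, I/O bus, main memory, disk controller, graphics controller, network interface, graphics, disk, and network. Annotation: Intel targets.]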
6
Outline
  • Introduction and motivation
  • Milly Watt activities
  • ECOSystem: Explicitly managing energy via the OS
    (ASPLOS02, USENIX03)
  • Power-aware memory (ASPLOS00, ISLPED01, PACS02,
    PACS03)
  • FaceOff: Sensor-based display power management
    (HOTOS03, Mobisys Context Aware 04)
  • Current and future directions

9
Energy Centric Operating System (ECOSystem)
  • Energy can serve as a unifying concept for
    managing a diverse set of resources.
  • We introduce the currentcy abstraction to
    represent the energy resource
  • A framework is needed for explicit monitoring
    and management of energy.
  • We develop mechanisms for currentcy accounting,
    currentcy allocation, and scheduling of currentcy
    use
  • We need policies to achieve energy goals.
  • Need to arbitrate among competing demands and
    reduce demand when energy is limited.

10
Unified Currentcy Model
  • Energy accounting and allocation are expressed in
    a common currentcy.
  • Abstraction for
  • Characterizing power costs of accessing different
    resources
  • Quantifying overall energy consumption
  • Sharing among competing tasks

11
Energy Goals
  • Explicitly manage energy use to reach a target
    battery lifetime.
  • Coast-to-coast flight with your laptop
  • Sensors that need to operate through the night
    and recharge when the sun comes up
  • If that requires reducing workload demand, use
    energy in proportion to each task's importance.
  • Scenario
  • Revising and rehearsing a PowerPoint presentation
  • Spelling and grammar checking threads
  • Listening to MP3s in background

13
Energy Goals
  • Deliver good performance given constraints on
    energy availability
  • Fully utilize the battery capacity within the
    target battery lifetime with little leftover
    capacity: no lost opportunities.
  • Encourage efficiency in performing desired work.
  • Address observed performance problems (e.g.
    energy-based priority inversions).

14
Challenges
  1. To fully utilize available battery capacity
    within the desired battery lifetime with little
    or no leftover (residual) capacity.
  • Devise an allocation policy that balances supply
    and demand among tasks.
  • Currentcy conserving allocation.

15
Challenges
  2. To produce more robust proportional sharing by
    ensuring adequate spending opportunities.
  • Develop CPU scheduling that considers energy
    expenditures on non-CPU resources.
  • Currentcy-aware scheduling.

16
Challenges
  3. To reduce response time variability when energy
    is limited.
  • Design a scheduling policy that controls the pace
    of currentcy consumption.

17
Challenges
  4. To encourage greater energy efficiency (lower
    average cost) for I/O accesses on power-managed
    disks.
  • Amortize spinup and spindown costs over multiple
    disk requests by shaping request patterns.
  • Buffer management and prefetching strategies.

18
Outline
  • Motivation / Context
  • Background
  • ECOSystem Framework
  • Prototype Implementation Experience
  • Exploring Energy Goals and Policies
  • Conclusions

19
Mechanisms in the ECOSystem Framework
  • Currentcy Allocation
  • Epoch-based allocation: periodically distribute
    currentcy allowance
  • Currentcy Accounting
  • Basic idea: pay as you go for resource use; no
    more currentcy means no more service.
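
The two mechanisms above can be sketched in a few lines. This is a minimal illustration, not the ECOSystem prototype: the `Task` record, the function names `allocate_epoch` and `charge`, and the abstract currentcy units are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    share: float          # relative share of each epoch's currentcy (assumed field)
    balance: float = 0.0  # currentcy the task may still spend


def allocate_epoch(tasks, epoch_currentcy):
    """Epoch-based allocation: split one epoch's currentcy by task share."""
    total_share = sum(t.share for t in tasks)
    for t in tasks:
        t.balance += epoch_currentcy * t.share / total_share


def charge(task, cost):
    """Pay as you go: deduct the cost, or refuse service if the task cannot pay."""
    if task.balance < cost:
        return False  # no more currentcy, no more service
    task.balance -= cost
    return True


tasks = [Task("editor", share=3), Task("mp3", share=1)]
allocate_epoch(tasks, epoch_currentcy=100.0)
print([(t.name, t.balance) for t in tasks])
print(charge(tasks[1], 30.0))  # mp3 holds 25 units, so this request is refused
```

Refusing service outright is the simplest reading of "pay as you go"; a real system could instead delay the task until its next allowance arrives.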

20
Currentcy Flow
[Figure: currentcy flow between the OS and three applications.]
  1. Determine overall amount of currentcy available
    per energy epoch.
  2. Distribute available currentcy proportionally
    among tasks.

21
Currentcy Flow
[Figure: currentcy flow between the OS and three applications.]
  3. Deduct currentcy from each task's account for
    resource use.

22
Device Specific Accounting
  • CPU: hybrid of sampling and task-switch
    accounting
  • Disk: tasks directly pay for file accesses,
    sharing of spinup and spindown costs.
  • Network: local source or destination task pays
    based on length of data transferred

23
ECOSystem Prototype
  • Modifications to Linux on Thinkpad T20
  • Initially managing 3 devices: CPU, disk, WNIC
  • Embedded power model
  • Calibrated by measurement
  • Power states of managed devices tracked
  • Orinoco card: doze 0.045 W, receive 0.925 W, send
    1.425 W.
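
As a rough illustration of how a calibrated power model turns into an energy charge, the sketch below multiplies the Orinoco power figures by transfer time. The function name `transfer_energy_joules` and the constant link throughput are assumptions for the example; the prototype's embedded model is not described at this level of detail on the slide.

```python
# Power draw of the Orinoco card in watts, as listed on the slide.
ORINOCO_POWER_W = {"doze": 0.045, "receive": 0.925, "send": 1.425}


def transfer_energy_joules(num_bytes, mode, throughput_bytes_per_s):
    """Energy = power x time, with time = bytes / throughput (assumed constant)."""
    seconds = num_bytes / throughput_bytes_per_s
    return ORINOCO_POWER_W[mode] * seconds


# Hypothetical example: sending 1 MB over a link that sustains 500 kB/s.
energy = transfer_energy_joules(1_000_000, "send", 500_000)
print(f"{energy:.2f} J")
```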

24
Experimental Evaluation V1.0
  • Validate the embedded energy model.
  • Can we achieve a target battery lifetime?
  • Can we achieve proportional energy usage among
    multiple tasks?
  • Assess the performance impact of limiting energy
    availability.

25
Achieving Target Battery Lifetime
  • Using CPU intensive benchmark and varying overall
    allocation of currentcy, we can achieve target
    battery lifetime.

26
Proportional Energy Allocation
Battery lifetime is set to 2.16 hours (unconstrained would be 1.3 hr).
Overall allocation equivalent to an average power consumption of 5 W.
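
The figures on this slide can be cross-checked with simple arithmetic. The sketch assumes the usable battery energy is the same in both cases, which ignores any dependence of battery capacity on discharge rate.

```python
TARGET_HOURS = 2.16         # target battery lifetime from the slide
AVG_POWER_W = 5.0           # average power implied by the overall allocation
UNCONSTRAINED_HOURS = 1.3   # lifetime without energy management

# Energy budget implied by the target lifetime.
budget_wh = AVG_POWER_W * TARGET_HOURS
budget_j = budget_wh * 3600

# Average power the unconstrained run would draw from the same budget
# (assumes identical usable capacity in both runs).
unconstrained_power_w = budget_wh / UNCONSTRAINED_HOURS

print(f"budget: {budget_wh:.1f} Wh ({budget_j:.0f} J)")
print(f"unconstrained average power: {unconstrained_power_w:.1f} W")
```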
27
Proportional CPU Utilization
Performance of compute-bound task (ijpeg) scales
proportionally with currentcy allocation
28
But - Netscape Performance Impact
Some applications don't gracefully degrade with
drastically reduced currentcy allocations
29
Previous Experiments
  • Validated the embedded energy model.
  • Demonstrated that we can achieve a target battery
    lifetime.
  • Demonstrated we can achieve proportional energy
    usage among multiple tasks.

30
Experiences
  • Identified performance implications of limiting
    energy availability that motivate further policy
    development
  • Mismatches between user-supplied specifications
    and actual needs of the task
  • Scheduling not offering opportunities to spend
    allocation
  • I/O devices and other activity causing a form of
    inversion

31
Challenge
  • To fully utilize available battery capacity
    within the desired battery lifetime with little
    or no leftover (residual) capacity.
  • Devise an allocation policy that balances supply
    and demand among tasks.
  • Currentcy conserving allocation.

32
Problem: Residual Energy
[Figure: allocation shares, caps, and demand per task, managed by the OS.]
  • Allocations do not reflect actual consumption
    needs

33
Problem: Residual Energy
[Figure: allocation shares, caps, and demand per task, managed by the OS.]
  • A task's unspent currentcy (above a cap) is
    being thrown away to maintain steady battery
    discharge.
  • Leftover energy capacity at end of lifetime.

34
Currentcy Conserving Allocation
[Figure: allocation shares, caps, and demand per task, managed by the OS.]
  • Two-step policy. Each epoch:
  1. Adjust per-task caps to reflect observed need
  • Weighted average of currentcy used in previous
    epochs.

35
Currentcy Conserving Allocation
[Figure: allocation shares and demand per task, managed by the OS.]
  2. Redistribute overflow currentcy
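
A minimal sketch of the two-step policy, under stated assumptions: the cap update is an exponentially weighted average with an assumed weight `alpha`, and overflow is handed out in a single pass in proportion to static shares. Both choices, and the function names, are illustrative; the slides do not give the exact formulas. The example values reuse the gqview and ijpeg figures from the experiment slide (8 W and 4 W shares, 6.5 W and 15.5 W of usable demand), treated as currentcy per one-second epoch.

```python
def update_cap(old_cap, used_last_epoch, alpha=0.5):
    """Step 1: weighted average of observed use (alpha is an assumed weight)."""
    return alpha * used_last_epoch + (1 - alpha) * old_cap


def redistribute_overflow(balances, caps, shares):
    """Step 2: move currentcy above a task's cap to tasks with room under their caps."""
    balances = dict(balances)
    overflow = 0.0
    for name, cap in caps.items():
        excess = balances[name] - cap
        if excess > 0:
            overflow += excess
            balances[name] = cap
    # Single pass: split the overflow among tasks still below their cap.
    needy = [n for n in balances if balances[n] < caps[n]]
    total_share = sum(shares[n] for n in needy)
    granted = 0.0
    for n in needy:
        grant = min(overflow * shares[n] / total_share, caps[n] - balances[n])
        balances[n] += grant
        granted += grant
    return balances, overflow - granted  # second value: currentcy nobody could use


balances = {"gqview": 8.0, "ijpeg": 4.0}
caps = {"gqview": 6.5, "ijpeg": 15.5}
shares = {"gqview": 8, "ijpeg": 4}
new_balances, leftover = redistribute_overflow(balances, caps, shares)
print(new_balances, leftover)
```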

36
Currentcy Conserving Allocation: Experiment
  • Workload
  • Computationally intensive ijpeg image encoder
  • Image viewer, gqview, with think time of 10
    seconds and images from disk
  • Performance levels out at 6500mW allocation.
  • Total allocation of 12W, shares of 8W for gqview
    (too much) and 4W for ijpeg (capable of 15.5W).
  • Comparing against total allocation correction
    method in original prototype.

37
Currentcy Conserving Allocation: Results
[Figure: total, gqview, and ijpeg allocations, with points A and B marked; <1% remaining capacity.]
38
Challenge
  • To produce more robust proportional sharing by
    ensuring adequate spending opportunities.
  • Develop CPU scheduling that considers energy
    expenditures on non-CPU resources.
  • Currentcy-aware scheduling or energy-centric
    scheduling.

39
Problem: Scheduling/Allocation Interactions
  • Allocation shares may be appropriately specified
    and consistent with demand, but the ability to
    spend depends on scheduling policies that control
    the opportunities to access resource.
  • Priority inversion: a task with a small allocation
    but a large CPU component can dominate a task with
    a larger allocation but demands on other devices.
  • Scheduling should be aware of currentcy
    expenditures throughout the system.

40
Problem: Scheduling/Allocation Interactions
  • Traditional schedulers
  • Explicitly deal with CPU time and processes on
    ready queue
  • May implicitly compensate for time spent off
    ready queue
  • Energy-aware
  • Deals with energy use outside of CPU
  • Currentcy explicitly captures progress using
    multiple devices

[Figure: gqview activity alternating between CPU energy, disk energy, and think time.]
41
Energy-Centric Scheduling
  • The next task to be scheduled for CPU is the one
    with the lowest amount of currentcy spent in this
    epoch relative to its share
  • Captures currentcy spent on any device.
  • Dynamic share weighted by the task's static
    share divided by currentcy spent in last epoch.
  • Compensation for previous lack of spending
    opportunities
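
The selection rule in the first bullet can be sketched as below. The dictionary layout and the name `pick_next` are assumptions for illustration, and the dynamic-share compensation from the third bullet is left out. The key point is that `spent` counts currentcy charged on any device, not just the CPU.

```python
def pick_next(tasks):
    """Return the runnable task with the lowest spent/share ratio this epoch."""
    return min(tasks, key=lambda t: t["spent"] / t["share"])


# Hypothetical mid-epoch snapshot with equal shares: ijpeg has burned CPU
# currentcy, while gqview has spent little because it was thinking or
# waiting on the disk.
tasks = [
    {"name": "ijpeg", "share": 1.0, "spent": 3.0},
    {"name": "gqview", "share": 1.0, "spent": 0.7},
]
print(pick_next(tasks)["name"])
```

Under this rule gqview is chosen next, which gives it the spending opportunity that a purely CPU-time-based scheduler might not.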

42
Energy-Centric Scheduling: Experiment
  • Workload
  • Computationally intensive ijpeg
  • Image viewer, gqview, with think time of 10
    seconds and disk access (700mW)
  • Performance levels out at 6500mW allocation.
  • Given equal allocation shares, total allocation
    varied
  • Comparing against round-robin and stride based on
    static share value.

43
Energy-Centric Scheduling: Results
[Figure: gqview power consumption.]
44
Energy-Centric Scheduling: Results
[Figure: ijpeg power consumption.]
45
Benefits of Currentcy
  • Currentcy abstraction
  • Provides a concrete representation of energy
    supply and demand allowing explicit
    energy/power management.
  • Provides unified view of energy impact of
    different devices enabling multi-device,
    system-wide resource management
  • Comparable, quantifiable, tradeoffs can be
    expressed
  • Encourages analogies to economic models
    motivating a rich set of policies.

46
Contributions
  • ECOSystem is a powerful framework for managing
    energy explicitly as a first-class OS resource.
  • Currentcy model is capable of formulating
    non-trivial energy goals and serving as the basis
    for solutions
  • Reducing residual battery capacity when lifetime
    reached
  • Ensuring that scheduling works with currentcy
    allocation towards proportional energy sharing
  • Smoothing out response time variation
  • Encouraging greater disk energy efficiency

47
Power Aware DRAM
  • Memory with multiple power states has become
    available
  • Fast access, high power
  • Low power, slow access
  • New take on memory hierarchy
  • How to exploit this opportunity?

48
Exploiting the Opportunity
  • Interaction between power state model and access
    locality
  • How to manage the power state transitions?
  • Memory controller policies
  • Quantify benefits of power states
  • What role does software have?
  • Energy impact of allocation of data/text to
    memory.

49
Power State Transitioning
[Figure: request timeline showing the gap after completion of the last request in a run.]
Ideal case: assume we want no added latency.
$\text{gap} \ge t_{h \to l} + t_{l \to h} + t_{\text{benefit}}$
50
Benefit Boundary
$\text{gap} \ge t_{h \to l} + t_{l \to h} + t_{\text{benefit}}$
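
One way to make the boundary concrete is an energy balance: transitioning pays off once the energy saved in the low state repays any extra energy spent in the two transitions. The formula for `t_benefit` below is derived from that assumption and is not stated on the slide; the transition powers and times in the example are hypothetical, and only the 300 mW and 30 mW levels come from the RDRAM figures in this talk.

```python
def t_benefit(p_high, p_low, p_hl, t_hl, p_lh, t_lh):
    """Time in the low state needed to repay the transition energy.

    Assumed balance: staying high costs p_high * (t_hl + t_lh + t_b), while
    transitioning costs p_hl * t_hl + p_low * t_b + p_lh * t_lh.
    """
    extra = (p_hl - p_high) * t_hl + (p_lh - p_high) * t_lh
    return max(0.0, extra / (p_high - p_low))


def worth_transitioning(gap, p_high, p_low, p_hl, t_hl, p_lh, t_lh):
    """Benefit boundary: gap >= t_hl + t_lh + t_benefit."""
    return gap >= t_hl + t_lh + t_benefit(p_high, p_low, p_hl, t_hl, p_lh, t_lh)


# Powers in mW, times in ns; transition values are hypothetical.
params = dict(p_high=300, p_low=30, p_hl=300, t_hl=8, p_lh=570, t_lh=60)
print(t_benefit(**params))
print(worth_transitioning(128, **params), worth_transitioning(127, **params))
```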
51
Power State Transitioning
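[Figure: request timeline with a gap after completion of the last request in a run. Power falls from $p_{high}$ through the high-to-low transition (power $p_{h \to l}$, time $t_{h \to l}$) to $p_{low}$, then rises through the low-to-high transition (power $p_{l \to h}$, time $t_{l \to h}$) back to $p_{high}$.]
On-demand case: adds latency of transition back up.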
52
Power State Transitioning
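[Figure: request timeline with a gap after completion of the last request in a run. Power stays at $p_{high}$ for the threshold time, falls through the high-to-low transition (power $p_{h \to l}$, time $t_{h \to l}$) to $p_{low}$, then rises through the low-to-high transition (power $p_{l \to h}$, time $t_{l \to h}$) back to $p_{high}$.]
On-demand case: adds latency of transition back up.
Threshold-based: delays transition down.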
53
Power-Aware DRAM Main Memory Design
  • Assume we can control each chip individually
  • 2 dimensions to affect energy policy: HW
    controller / OS
  • Energy strategy
  • Cluster accesses to already powered up chips
  • Interaction between power state transitions and
    data locality

[Figure: software control (OS page mapping and allocation) sits above hardware control (one controller per chip, chips 0 to n-1); each chip is in the Active, Standby, or Power Down state.]
54
Power Aware DRAM
Rambus RDRAM Power States
  • Active: 300 mW (read/write transactions)
  • Standby: 180 mW, 6 ns to return to Active
  • Nap: 30 mW, 60 ns to return to Active
  • Power Down: 3 mW, 6000 ns to return to Active
55
Dual-state HW Power State Policies
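  • All chips in one base state
  • Individual chip Active while pending requests
  • Return to base power state if no pending access

[Figure: two-state diagram. An access moves a chip from the base state (Standby, Nap, or Powerdown) to Active; no pending access returns it to the base state. A timeline plots accesses against the Active and Base states.]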
56
Quad-state HW Policies
  • Downgrade state if no access for threshold time
  • Independent transitions based on access pattern
    to each chip
  • Competitive Analysis
  • rent-to-buy
  • Active to nap: 100s of ns
  • Nap to PDN: 10,000 ns

[Figure: four-state diagram. Active moves to STBY after no access for Ta-s, STBY to Nap after no access for Ts-n, and Nap to PDN after no access for Tn-p; an access returns a chip to Active. A timeline plots accesses against the Active, STBY, Nap, and PDN states.]
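
The threshold policy can be sketched as a small state walk over an idle gap. The state powers are the RDRAM figures from this talk; the threshold values in the example and both function names are assumptions, and transition energy and resynchronization delay are ignored to keep the sketch short.

```python
# RDRAM power per state in mW, from the power-state slide.
POWER_MW = {"active": 300, "standby": 180, "nap": 30, "powerdown": 3}
ORDER = ["active", "standby", "nap", "powerdown"]


def state_after_idle(idle_ns, thresholds_ns):
    """State of a chip that has seen no access for idle_ns.

    thresholds_ns[i] is the idle time spent in ORDER[i] before downgrading
    to ORDER[i + 1] (the Ta-s, Ts-n, Tn-p of the slide).
    """
    state = 0
    remaining = idle_ns
    for threshold in thresholds_ns:
        if remaining < threshold:
            break
        remaining -= threshold
        state += 1
    return ORDER[state]


def idle_energy_pj(idle_ns, thresholds_ns):
    """Energy over an idle gap in pJ (mW x ns), ignoring transition energy."""
    energy = 0
    remaining = idle_ns
    for i, threshold in enumerate(thresholds_ns):
        spent = min(remaining, threshold)
        energy += POWER_MW[ORDER[i]] * spent
        remaining -= spent
    energy += POWER_MW[ORDER[-1]] * remaining
    return energy


THRESHOLDS_NS = (100, 100, 10_000)  # assumed values for illustration
print(state_after_idle(500, THRESHOLDS_NS))
print(idle_energy_pj(500, THRESHOLDS_NS), "pJ vs", POWER_MW["active"] * 500, "pJ always active")
```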
57
Page Allocation and Power-Aware DRAM
  • Physical address determines which chip is
    accessed
  • Assume non-interleaved memory
  • Addresses 0 to N-1 to chip 0, N to 2N-1 to chip
    1, etc.
  • Entire virtual memory page in one chip
  • Virtual memory page allocation influences
    chip-level locality

[Figure: the CPU/OS page mapping and allocation layer places each virtual memory page in one of chips 0 to n-1, each behind its own controller.]
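
The non-interleaved mapping described above reduces to integer division. In this sketch the chip size, the page size, and the function names are assumptions chosen for illustration.

```python
def chip_for_address(phys_addr, chip_bytes):
    """Non-interleaved memory: addresses 0..N-1 map to chip 0, N..2N-1 to chip 1, etc."""
    return phys_addr // chip_bytes


def chip_for_page(page_number, page_bytes, chip_bytes):
    """A page-aligned page lies entirely in one chip when chip_bytes is a multiple of page_bytes."""
    return chip_for_address(page_number * page_bytes, chip_bytes)


CHIP_BYTES = 4 * 1024 * 1024  # assumed chip capacity: 4 MB
PAGE_BYTES = 4096             # assumed page size

print(chip_for_address(CHIP_BYTES - 1, CHIP_BYTES), chip_for_address(CHIP_BYTES, CHIP_BYTES))
print(chip_for_page(1023, PAGE_BYTES, CHIP_BYTES), chip_for_page(1024, PAGE_BYTES, CHIP_BYTES))
```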
58
Page Allocation Policies
  • Virtual to Physical Page Mapping
  • Random Allocation: baseline policy
  • Pages spread across chips
  • Sequential First-Touch Allocation
  • Consolidate pages into minimal number of chips
  • One shot
  • Frequency-based Allocation
  • First-touch not always best
  • Allow (limited) movement after first-touch
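
The difference between the first two policies shows up in how many chips end up holding pages. The sketch below is a simplification: the random baseline ignores per-chip capacity, and the function names and sizes are assumptions.

```python
import random


def sequential_first_touch(num_pages, pages_per_chip):
    """Fill chip 0 first, then chip 1, and so on: pages land in the fewest chips."""
    return [page // pages_per_chip for page in range(num_pages)]


def random_allocation(num_pages, num_chips, seed=0):
    """Baseline: each page goes to a chip chosen uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(num_chips) for _ in range(num_pages)]


def chips_in_use(placement):
    """Number of distinct chips that hold at least one page."""
    return len(set(placement))


# Hypothetical working set: 100 pages, 8 chips of 1024 pages each.
print("sequential:", chips_in_use(sequential_first_touch(100, 1024)))
print("random:    ", chips_in_use(random_allocation(100, 8)))
```

With the sequential policy the whole working set fits in one chip, so the remaining chips can stay in a low-power state.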

59
The Design Space
  • 2-state model
  • 1: Simple HW
  • 2: Can the OS help?
  • 4-state model
  • 3: Sophisticated HW
  • 4: Cooperative HW and SW
60
Evaluation Methodology
  • Metric: Energy*Delay Product
  • Avoid very slow solutions
  • Energy Consumption (DRAM only)
  • Processor & Cache do affect runtime
  • Trace-Driven Simulation
  • Windows NT personal productivity applications
    (Etch traces from U. Washington)
  • Simplified processor and memory model
  • Execution-Driven Simulation
  • SPEC benchmarks (subset of integer)
  • SimpleScalar w/ detailed RDRAM timing and power
    models
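
A small sketch of why the Energy*Delay product penalizes very slow solutions. All numbers are hypothetical, and the function names are illustrative.

```python
def energy_delay_product(energy_j, delay_s):
    """Energy*Delay: lower is better, and slow runs are penalized."""
    return energy_j * delay_s


def improvement(baseline_edp, candidate_edp):
    """Fractional reduction in Energy*Delay relative to the baseline."""
    return 1 - candidate_edp / baseline_edp


baseline = energy_delay_product(10.0, 2.0)   # all chips always active
balanced = energy_delay_product(4.0, 2.5)    # saves energy, slightly slower
too_slow = energy_delay_product(2.0, 12.0)   # saves more energy, far slower

print(improvement(baseline, balanced))   # positive: a net win
print(improvement(baseline, too_slow))   # negative: worse than the baseline
```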

61
Methodology Continued
  • Trace-Driven Simulation
  • Windows NT personal productivity applications
    (Etch at Washington)
  • Simplified processor and memory model
  • Eight outstanding cache misses
  • Eight 32Mb chips, total 32MB, non-interleaved
  • Execution-Driven Simulation
  • SPEC benchmarks (subset of integer)
  • SimpleScalar w/ detailed RDRAM timing and power
    models
  • Sixteen outstanding cache misses
  • Eight 256Mb chips, total 256MB, non-interleaved

62
Summary of Simulation Results (Energy*Delay
product, RDRAM, ASPLOS00)
  • 2-state model
  • Nap is best dual-state policy: 60-85%
  • Additional 10 to 30% over Nap
  • 4-state model
  • Best approach: 6 to 55% over dual-nap-seq, 80
    to 99% over all active.
  • Improvement not obvious, could be equal to
    dual-state
63
Other Questions
  • How to determine the best thresholds in memory
    controller design?
  • Are more sophisticated OS page allocation (or
    migration) policies useful?
  • How do power-state components (power-aware DRAM)
    and dynamic voltage scaling (processors)
    interact?
  • Is there a policy based on adaptive thresholds
    for transitioning power-state devices (in general
    -- memory, disks, wireless)?

64
Naïve Power-awareness
[Figure: timelines of CPU execution and slack at 50 MHz, 100 MHz, 200 MHz, and 1000 MHz, with memory power state transitions (Active, Standby, Powerdown) driven by cache misses and idle periods.]
65
Naïve Power-awareness
  • Lowest energy achieved at 400MHz
  • Memory remains powered on too long in low
    frequencies
  • CPU energy too high in high frequencies
  • Result conflicts with conventional DVS
  • Memory has to be taken into account

66
Aggressive Power-awareness
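[Figure: timelines of CPU execution and slack at 50 MHz, 100 MHz, 200 MHz, and 1000 MHz, with memory power state transitions (Active, Standby, Powerdown) driven by cache misses and idle periods; memory reaches Powerdown more often than in the naïve case.]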
67
Aggressive Power-awareness
  • Lowest frequency wins again
  • CPU energy becomes dominant
  • Memory energy greatly reduced and stabilizes
  • Effective power-aware memory contributes to
    realizing the potential of DVS

68
Contributions
  • Demonstrated dramatic improvements in
    Energy*Delay for power-aware page allocation
  • Frequency-based allocation: little impact
  • Device-level general power management
  • Based on histogram of gaps in moving window to
    capture non-stationarity in access pattern
  • Efficient tree algorithm updates energy and
    searches threshold space
  • DVS and power-aware memory interactions explored
  • Technique for DVS to choose optimal frequency
    taking the memory effect into consideration
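
The adaptive-threshold idea can be illustrated with a brute-force stand-in: keep a moving window of recent idle gaps and pick the candidate threshold that would have minimized energy over that window. This is not the efficient tree algorithm mentioned above, and the class name, cost model, and numbers are all assumptions for the sketch.

```python
from collections import deque


class ThresholdChooser:
    """Brute-force threshold search over a moving window of idle gaps."""

    def __init__(self, window, candidates, p_high, p_low, e_transition):
        self.gaps = deque(maxlen=window)  # old gaps fall out automatically
        self.candidates = candidates
        self.p_high = p_high
        self.p_low = p_low
        self.e_transition = e_transition  # assumed fixed cost per power-down cycle

    def record_gap(self, gap):
        self.gaps.append(gap)

    def energy(self, threshold):
        """Energy the windowed gaps would have cost under this threshold."""
        total = 0.0
        for gap in self.gaps:
            if gap <= threshold:
                total += self.p_high * gap
            else:
                total += (self.p_high * threshold
                          + self.e_transition
                          + self.p_low * (gap - threshold))
        return total

    def best_threshold(self):
        return min(self.candidates, key=self.energy)


chooser = ThresholdChooser(window=4, candidates=[0, 10, 100],
                           p_high=10, p_low=1, e_transition=50)
for gap in [2, 3, 2, 4]:          # short gaps: powering down does not pay
    chooser.record_gap(gap)
short_choice = chooser.best_threshold()
for gap in [1000, 1000, 1000, 1000]:  # long gaps replace the old window
    chooser.record_gap(gap)
long_choice = chooser.best_threshold()
print(short_choice, long_choice)
```

Because the window forgets old gaps, the chosen threshold follows changes in the access pattern, which is the non-stationarity the bullet refers to.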

69
FaceOff
  • Goal: to reduce system energy consumption by
    using low-power sensors to match I/O behavior
    more directly to user behavior and context.
  • A display is only necessary if someone is looking
    at it.

70
[Figure: main control loop. Image capture feeds the face detector; no face turns the display off, a face turns it on.]
71
Prototype
  • IBM ThinkPad T21 running RedHat Linux
  • Base max CPU power consumption: 18 Watts
  • Display: 7.6 Watts
  • Logitech QuickCam Web Cam
  • Power consumption: 1.5 Watts
  • X10 ActiveHome Wireless Motion Sensor and
    Receiver
  • Software components
  • Image capture, face detection, display power
    state control (ACPI)

72
Face Detection
  • Simple skin detection used for prototype

73
Feasibility Study
  • What is the potential for energy savings?
  • Best case scenarios to measure opportunity
  • Assume perfect accuracy
  • User behavior: start it and leave, return on
    completion.
  • What is the effect on system performance?
  • Network file transfer (113 MB)
  • CPU intensive process (Linux kernel compile)
  • MP3 Song (no display necessary)
  • How responsive is the system?

74
File Transfer
Tradeoff of energy costs: CPU image processing
plus camera power vs. display energy during idle
timeout.
75
Kernel Compile Traces
76
Energy and Time Comparisons
Energy (J)       Default   With FaceOff   Savings (%)
File transfer    6795      4791           29.5
Kernel compile   12507     11023          11.9
MP3              4714      3403           28

Time (s)         Default   With FaceOff   Overhead (%)
File transfer    348.6     351.3          0.8
Kernel compile   575       603.5          4.9
MP3              No effect on playback
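
The savings and overhead columns can be recomputed from the default and FaceOff measurements. The sketch below does that; the recomputed values agree with the slide to within rounding. The helper name `percent_change` is an assumption.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return 100.0 * (after - before) / before


# (default, with FaceOff) pairs taken from the table.
energy_j = {
    "File transfer": (6795, 4791),
    "Kernel compile": (12507, 11023),
    "MP3": (4714, 3403),
}
time_s = {
    "File transfer": (348.6, 351.3),
    "Kernel compile": (575, 603.5),
}

savings = {name: -percent_change(d, f) for name, (d, f) in energy_j.items()}
overhead = {name: percent_change(d, f) for name, (d, f) in time_s.items()}

for name, value in savings.items():
    print(f"{name}: {value:.1f}% energy saved")
for name, value in overhead.items():
    print(f"{name}: {value:.1f}% time overhead")
```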
77
Responsiveness Timing
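[Figure: responsiveness timeline. A face arrives (or departs); the polling latency elapses before an image is acquired; the detection latency elapses before detection completes and the display is signaled. Total responsiveness latency = polling latency + detection latency.]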
78
Detection Latency Under Load
Workload Average (99% Confidence) Maximum Minimum
Network Transfer 1757ms 305ms 116ms
Kernel Compile 2305ms 669ms 51ms
MP3 1543ms 229ms 84ms
79
On-going Work on FaceOff
  • Continue work on optimizing responsiveness
    overhead
  • Comprehensive user study
  • Survey of usability
  • Characterization of real deployment usage
    patterns
  • End-to-end experiment
  • Energy measurement under realistic usage

80
Milly Watt Project Future Directions
  • ECOSystem
  • Distributed systems, sensor networks
  • New platforms: Motes with TinyOS, currentcy
  • New energy goals: efficiency, application cooperation
  • New devices and policies: integrating the display,
    economics-based file system
81
For More Information
  • www.cs.duke.edu/ari/millywatt/
  • email carla_at_cs.duke.edu
