Challenges in Distributed Energy Adaptive Computing - PowerPoint PPT Presentation

Challenges in Distributed Energy Adaptive Computing. K. Kant, NSF and GMU.

Slides: 36
Provided by: kkant

Transcript and Presenter's Notes


1
Challenges in Distributed Energy Adaptive
Computing
  • K. Kant
  • NSF and GMU

2
  • Information and Communication Technology (ICT) has
    a problem
  • Performance-centric → energy- and sustainability-centric
  • How do we get there?

3
ICT Power Growth until 2020
  • Increase in spite of power-efficient designs
  • Clients: 8x in number, 3x in power
  • Data centers: >2x increase
  • Network: 3x increase

[Figure: ICT power by segment: network; clients; data centers; transmission, conversion & distribution]
4
Current State: Unsustainable Computing
5
Data Center Infrastructure
  • Resource intensive: water, cabling, metal, ...
  • 50% of power wasted before getting to racks

6
Distribution Infrastructure
10% distribution loss; high carbon impact
[Diagram: utility-to-rack power distribution. Labels: 115 kV; 13.2 kV; 480 V; 208 V; UPS; IT load; 2.5 MW generator, 180 gallons/hour. Stage losses: 1% loss in switchgear and conductors; 1.0% loss (99.0% efficient); 6% loss (94% efficient); 0.3% loss (99.7% efficient); 0.5% loss (99.5% efficient).]
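The stage losses labeled on this slide compound multiplicatively rather than simply adding up. A minimal sketch in Python, assuming the five labeled stages sit in series (the slide text does not spell out the topology, and the function and variable names are illustrative):

```python
# Stage losses in percent, as labeled on the slide. Treating them as
# five stages in series is an assumption made for this sketch.
stage_loss_pct = [1.0, 1.0, 6.0, 0.3, 0.5]


def chain_efficiency(losses_pct):
    """Multiply per-stage efficiencies to get the end-to-end efficiency."""
    eff = 1.0
    for loss in losses_pct:
        eff *= 1.0 - loss / 100.0
    return eff


total_eff = chain_efficiency(stage_loss_pct)
total_loss_pct = (1.0 - total_eff) * 100.0
print(f"end-to-end efficiency: {total_eff:.3f}, loss: {total_loss_pct:.1f}%")
```

Under that series assumption the compounded loss comes to roughly 8.6%, the same order as the 10% distribution loss quoted on the slide.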
7
50% Rack Power Wasted
Component          Total (W)  Used (W)  Comments
CPU                       80        60  Operating at 100% utilization
Fans                      50        25  Temp.-directed fan at 100% util.
Memory (32 GB)            88        24  2 GB DIMMs, 4 W idle, 19 W active
Hard drives               40        10  6 SATA drives, 25% busy
I/O adapters              20         4  25% disk, 15% network
Motherboard               22        12  N/S bridges, devices, VRs, ...
Total DC power           300       135
Power supply loss         50         7  14% → 5% loss of AC input power
AC input power           350       142  > 50% of power is wasted
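The totals in the table above can be re-derived from the component rows. A small Python sketch of that arithmetic, assuming the figures are in watts (the slide gives no unit for the columns) and using illustrative variable names:

```python
# (total, used) per component, copied from the table above.
# Watts are assumed; the slide does not label the unit.
components = {
    "CPU": (80, 60),
    "Fans": (50, 25),
    "Memory (32 GB)": (88, 24),
    "Hard drives": (40, 10),
    "I/O adapters": (20, 4),
    "Motherboard": (22, 12),
}
psu_loss = (50, 7)  # power supply loss: (total, used) columns

dc_total = sum(total for total, _ in components.values())
dc_used = sum(used for _, used in components.values())
ac_total = dc_total + psu_loss[0]
ac_used = dc_used + psu_loss[1]

# Fraction of AC input power that does no useful work.
wasted_fraction = 1 - ac_used / ac_total
# PSU loss as a share of AC input, for each column (the "14% -> 5%" comment).
psu_share_total = psu_loss[0] / ac_total
psu_share_used = psu_loss[1] / ac_used

print(dc_total, dc_used, ac_total, ac_used, round(wasted_fraction, 3))
```

The sums reproduce the 300/135 DC and 350/142 AC rows, and the wasted share works out to about 59% of AC input, consistent with the "> 50%" comment.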
8
Sustainable Computing
9
Renewable Energy Push
  • Limit energy draw from grid
  • Less infrastructure
  • Lower losses
  • But variable supply

Need better power adaptability
10
High Temperature DCs
  • Chiller-less operation
  • Less energy/materials, but space inefficient
  • High temperature operation
  • Smaller T_outlet − T_inlet
  • More throttling
  • More failure prone (?)

Need smarter thermal adaptability
11
Overdesign
  • Overdesign is the norm today
  • Huge power supplies, fans, heat sinks, server
    cases, high rack capacity, UPS capacity, ...
  • Engineered for worst case → rarely encountered
  • Huge power wastage, waste of materials, energy, ...
  • What if we right-size everything?
  • Highly energy efficient but need smarter control

Better energy adaptability to deal w/ frugal
design
12
Energy Adaptive Computing
  • EAC strives to do dynamic end-to-end adjustment
    to
  • Workload adaptation for graceful QoS degradation
    under energy limitations
  • Infrastructure adaptation to cope with temporary
    energy deficiencies.
  • Requires coordinated power/thermal management of
    computation, network & storage.
  • Enhances sustainability of IT infrastructure

13
EAC Instances
14
Client-server EAC
  • Transparently adapt to client energy states
  • State: on-AC, normal, low-battery, ...
  • Service contract Ci: setup QoS, operational
    QoS
  • Adaptation Challenges
  • Communicating & enforcing contracts.
  • Group adaptation of clients forced by
    network/servers ?
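One way to picture the client-server contract described on this slide is as a record holding the QoS agreed at setup and the QoS currently in force, with the latter lowered as the client's energy state worsens. The Python sketch below is purely illustrative: the integer QoS scale, the per-state ceilings, and all names are assumptions, not part of the presentation.

```python
from dataclasses import dataclass
from enum import Enum


class EnergyState(Enum):
    ON_AC = "on-AC"
    NORMAL = "normal"
    LOW_BATTERY = "low-battery"


@dataclass
class ServiceContract:
    setup_qos: int        # QoS level agreed at setup (hypothetical 0-3 scale)
    operational_qos: int  # QoS level currently in force


# Hypothetical ceiling on operational QoS for each client energy state.
QOS_CEILING = {
    EnergyState.ON_AC: 3,
    EnergyState.NORMAL: 2,
    EnergyState.LOW_BATTERY: 1,
}


def adapt(contract: ServiceContract, state: EnergyState) -> ServiceContract:
    """Set operational QoS to the state's ceiling, never above the setup QoS."""
    contract.operational_qos = min(contract.setup_qos, QOS_CEILING[state])
    return contract


contract = ServiceContract(setup_qos=3, operational_qos=3)
adapt(contract, EnergyState.LOW_BATTERY)
print(contract.operational_qos)
```

Keeping the setup QoS separate from the operational QoS lets the server restore the original level when the client returns to AC power.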

15
Cluster EAC
  • Adaptation to intra- & inter-DC limits
  • Multi-level: server, rack & DC levels
  • Adaptation Challenges
  • Estimate & collect power deficits/surplus at
    multiple levels
  • Coordination across large range of devices
  • Location based services
  • Coordination across levels
  • Simultaneously handle client-server loop

16
P2P EAC
  • Adaptation based on available energy
  • Content: video resolution, audio coding, ...
  • Network: modulate wireless radio usage (?)
  • Energy-proportional use of peer resources
  • Energy-driven content replication &
    reorganization
  • Adaptation Challenges
  • Satisfying QoS ?
  • Balancing src/dest usage vs. relay node energy
    usage ?

17
Challenges: Some Specific Issues
18
Power Estimation Challenges
  • Notion of effective power?
  • Additive relationship: workload → power
  • Why is this hard? Interference
  • Available power
  • Determined by power, thermal & perhaps other
    issues (noise).
  • Required at multiple levels: facility, enclosure,
    machine, ...

19
Network Role in EAC
  • Energy Adaptation
  • Aggressive control of switch/router ports
  • Speed, state & width controls
  • Traffic consolidation across paths
  • Adaptation-induced congestion
  • Propagation (e.g., ECN, EBCN) & response
  • Computation/communication tradeoff?
  • Redirection ?
  • Network protocol support for adaptation?

20
Other Issues
  • EAC Security
  • Attacks on power sources
  • Energy Attacks on IT, e.g.,
  • Demanding too much, cyclic demands, ...
  • Storage adaptation
  • Storage devices, controllers & network.
  • Coordinated end to end control is hard!
  • Formal models to understand impact of energy
    adaptation.

21
Energy Adaptation in Data Centers
22
Adaptation Methods
  • Workload Adaptation
  • Coarse grain: shut down low-priority tasks
  • Fine grain: graceful QoS degradation, e.g.,
  • Batched service, poorer resolution, ...
  • Infrastructure Adaptation
  • Operation at lower speeds (DVFS)
  • Effective use of low-power modes & width
    control.
  • Workload adaptation always done first
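The ordering on this slide (workload adaptation first, infrastructure adaptation for whatever remains) can be sketched as follows. This is a minimal illustration covering only the coarse-grain step; the task tuples, the priority scale, and the `sheddable_below` threshold are hypothetical and not taken from the presentation.

```python
def plan_adaptation(deficit_w, tasks, sheddable_below=2):
    """Return (tasks to shut down, deficit left for infrastructure adaptation).

    tasks: list of (name, priority, power_w); a smaller priority number
    means a less important task. Only tasks with priority below
    `sheddable_below` may be shut down (coarse-grain workload adaptation).
    """
    shed = []
    remaining = deficit_w
    for name, priority, power_w in sorted(tasks, key=lambda t: t[1]):
        if remaining <= 0 or priority >= sheddable_below:
            break
        shed.append(name)
        remaining -= power_w
    # Whatever is left must be absorbed by infrastructure adaptation,
    # e.g. running at lower speed (DVFS) or using low-power modes.
    return shed, max(remaining, 0)


tasks = [("checkout", 5, 60), ("batch-report", 0, 30), ("indexing", 1, 20)]
shed, leftover_w = plan_adaptation(70, tasks)
print(shed, leftover_w)
```

With a 70 W deficit, the two low-priority tasks are shut down and the remaining 20 W is left for infrastructure adaptation. Fine-grain QoS degradation (batching, lower resolution) is not modeled here.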

23
Infrastructure Adaptation
  • Need a multilevel scheme
  • Individual assets up to entire data center
  • Need both supply- & demand-side adaptations

24
Supply Side Adaptation
  • Supply side Limits
  • Hard caps at higher levels (true limit) vs.
    soft (artificial) caps at lower levels.
  • Limits may be a result of thermal/cooling issues.
  • Load consolidation
  • An essential part of energy efficient operation
  • Load consolidation vs. soft capping
  • Need to address workload adaptation changes as a
    result of supply increase/decrease.
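The distinction between a hard cap at a higher level and soft caps at lower levels can be illustrated with a simple budget split. The sketch below assumes a proportional-to-demand policy, which is only one possible choice and is not stated in the slides; the function and rack names are hypothetical.

```python
def soft_caps(hard_cap_w, demands_w):
    """Split a parent's hard cap into per-child soft caps.

    If total demand fits under the hard cap, every child gets what it asks
    for. Otherwise each child's soft cap is its demand scaled down
    proportionally (an assumed policy, chosen here for simplicity).
    """
    total = sum(demands_w.values())
    if total <= hard_cap_w:
        return dict(demands_w)
    scale = hard_cap_w / total
    return {child: demand * scale for child, demand in demands_w.items()}


caps = soft_caps(900, {"rack1": 600, "rack2": 400, "rack3": 200})
print(caps)
```

Because soft caps are artificial, they can be recomputed whenever supply rises or falls, whereas the hard cap reflects a true limit.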

25
Demand Side Adaptation
  • Adaptation to fluctuating demand
  • Transactional workload: migrate queries or app
    VMs?
  • Issues w/ combined supply- & demand-side
    adaptations
  • Imbalance: one node squeezed while another has
    surplus power
  • Ping-pong control: oscillatory migration of
    workload
  • Error accumulation down the hierarchy.

26
A Proposed Algorithm
  • Unidirectional control
  • Load migration moves up the hierarchy, from local
    to global.
  • Local migrations are temporary & do not trigger
    changes to soft caps on supply.
  • Target Node selection
  • Based on bin packing (best-fit decreasing)
  • Allows for more imbalance, which can be exploited
    for workload consolidation
  • Properties
  • Avoids ping-pong, attempts to minimize imbalance
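The target-node selection step names best-fit decreasing bin packing. The Python sketch below shows the generic heuristic applied to power figures; it is not the authors' implementation, and the data layout and names are assumptions made for illustration.

```python
def best_fit_decreasing(demands_w, surplus_w):
    """Place workloads on nodes using best-fit decreasing.

    demands_w: {workload: power needed}; surplus_w: {node: spare power}.
    Workloads are taken largest first, and each goes to the node whose
    spare power is smallest while still being enough to host it.
    Returns (placement, unplaced workloads).
    """
    remaining = dict(surplus_w)
    placement, unplaced = {}, []
    ordered = sorted(demands_w.items(), key=lambda kv: kv[1], reverse=True)
    for workload, need in ordered:
        fits = [(cap - need, node) for node, cap in remaining.items() if cap >= need]
        if not fits:
            # No local target: in the slides' scheme such load would be
            # handled further up the hierarchy.
            unplaced.append(workload)
            continue
        _, node = min(fits)  # tightest fit; ties broken by node name
        placement[workload] = node
        remaining[node] -= need
    return placement, unplaced


placement, unplaced = best_fit_decreasing(
    {"a": 50, "b": 30, "c": 20},
    {"s1": 60, "s2": 35, "s3": 100},
)
print(placement, unplaced)
```

Best fit tends to fill some nodes tightly and leave others lightly loaded, which matches the slide's remark that the resulting imbalance can be exploited for workload consolidation.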

27
Experimental Results
  • Scenario
  • 3 levels, 18 identical servers (44 55)
  • 3 applications, total of 25 app instances
  • Any app can run on any server
  • Demand: Poisson (active power 8 utilization)

28
Migration Frequency
  • Migration drivers: consolidation vs. energy
    deficiency
  • Low util → consolidation, high util → energy
    deficiency
  • Other characteristics
  • Migration frequency low in all cases
  • No ping-pong observed

29
Thermal Impacts
  • Additional Issues
  • Energy consumption limited by thermal/cooling
    issues, not energy availability
  • Migrations required to limit temperature
  • Temperature & power have a nonlinear relationship
  • Need to account for both power & thermal effects

30
Results w/ Thermal Effects
  • Imbalanced cooling
  • Servers 1-14: Ta = 25 °C; servers 15-18: Ta = 40 °C
  • Temperature limit: 65 °C
  • Power demand is adjusted by the algorithm to account
    for higher temperature
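To see why the hotter servers get less usable power, a rough estimate is enough. The sketch below uses a linear thermal-resistance model, T = Ta + R·P, as a deliberate simplification (an earlier slide notes that the real temperature-power relationship is nonlinear), and the resistance value is a made-up placeholder rather than a figure from the presentation.

```python
T_LIMIT_C = 65.0       # temperature limit from the slide
R_TH_C_PER_W = 0.125   # hypothetical thermal resistance, degrees C per watt


def max_power_w(ambient_c, t_limit_c=T_LIMIT_C, r_th=R_TH_C_PER_W):
    """Largest power that keeps T = Ta + R*P at or below the limit.

    Linear model assumed for illustration only.
    """
    return max(t_limit_c - ambient_c, 0.0) / r_th


# Ambient temperatures from the slide: servers 1-14 at 25 C, 15-18 at 40 C.
ambient_c = {n: (25.0 if n <= 14 else 40.0) for n in range(1, 19)}
power_cap_w = {n: max_power_w(ta) for n, ta in ambient_c.items()}

print(power_cap_w[1], power_cap_w[18])
```

Under these assumptions the four hot servers have 25 °C of headroom instead of 40 °C, so their thermally allowed power is 62.5% of that of the cooler servers, whatever resistance value is chosen.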

31
Conclusions
  • Need to go beyond energy efficiency
  • Design devices/systems to minimize life-cycle
    energy footprint
  • Creatively adapt to available energy to operate
    at the edge
  • Ongoing/future work
  • Coordinated server, network & storage management
  • Explore tradeoffs between QoS, power savings and
    admission control performance

32
Thank you!
33
Power Inefficiencies
[Diagram: server power delivery chain with rack supply, server PSU and voltage regulators feeding the CPU, DRAM & memory controller, fans, adapters and storage. Labels: 280 V; 12, 5 V; 90-95% efficient; 70-90% efficient; 95% efficient; wasted leakage & clock power; idle wasted power.]
34
Operating Regimes
35
So, What's the Problem?
  • Local constraints & controls → end-to-end impacts
  • DC to DC load shift
  • Service disruption & post-shift impact
  • Client request to alter content
  • Less or more work for server
  • Potentially conflicting controls

[Diagram labels: clients, network, core network]