Sensor-Based Fast Thermal Evaluation Model For Energy Efficient High-Performance Datacenters

1
Sensor-Based Fast Thermal Evaluation Model For
Energy Efficient High-Performance Datacenters
  • Q. Tang, T. Mukherjee, Sandeep K. S. Gupta
  • Department of Computer Science and Engineering
  • Arizona State University
  • Phil Cayton, Intel Corp.

2
Heating Problem in Data Centers
  • Power densities are increasing exponentially,
    in step with Moore's Law
  • Current cooling solutions operate at various levels:
  • Chip / component level
  • Server/board level
  • Rack level
  • Data center level

3
Two stages for reducing heating effects
  • Design and deployment stage (Civil / Mechanical
    Engineering Approach)
  • Increasing air conditioner capacity
  • Designing an optimized layout to facilitate air
    circulation
  • Operation stage (Computer Science Approach)
  • Example: dynamically assigning tasks to avoid
    overheated servers and to achieve thermal
    balancing
  • Assigning tasks to servers that consume less energy

4
Thermal Management of Datacenter
  • Motivation and significance
  • Compute-intensive applications (online gaming,
    computer movie animation, data mining) require
    increased utilization of data centers
  • Maximizing computing capacity is a demanding
    requirement
  • New blade servers can be packed more densely
  • Energy cost is rising dramatically
  • Goal
  • Improving thermal performance
  • Lowering hardware failure rate
  • Reducing energy cost

5
Typical layout of a datacenter
  • Rack outlet temperature: T_out
  • Rack inlet temperature: T_in
  • Air conditioner supply temperature: T_s

6
Schematic View of Thermal Management
7
Thermal-Aware Scheduling versus Datacenter Energy
Cost
8
Thermal Scheduling Problem Statement
  • We present results of thermal-aware scheduling to
    improve the energy efficiency of (blade-server-based)
    datacenters
  • Given a total task C, how should it be divided among N
    server nodes so that the computing task finishes with
    minimal total energy cost?

9
Energy Conservation
  • Outlet airflow
  • Server power consumption P_i, depending on the amount
    of computing task
  • Inlet airflow: a mixture of supplied cold air and
    recirculated hot air
10
Thermal Management
  • Different task assignments result in different
    power consumption distributions
  • Different power consumption distribution results
    in different temperature distribution
  • Different temperature distribution results in
    different total energy cost

11
Example
  • Inlet temperature distribution without cooling
  • With cooling, the inlet temperature is lowered
    below the redline threshold (25°C)
  • Different scheduling results in a different
    inlet temperature distribution
  • Scheduling 1 and Scheduling 2 therefore demand
    different cooling load / energy
12
Total Energy Cost of Datacenter
  • Computing energy cost
  • Cooling energy cost
  • Keep the maximal inlet temperature below the
    redline temperature of devices (25°C)
  • COP: Coefficient Of Performance
  • Total Energy Cost

$\mathrm{COP} = \dfrac{\text{the amount of heat removed}}{\text{the energy consumed by the cooling device}}$
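The COP definition above can be turned into a small cost calculation. This is a minimal sketch, not the authors' cost model: it assumes that all computing power ends up as heat the cooling device must remove, so the cooling energy is the computing power divided by COP. The function names and the numbers in the example are hypothetical.

```python
def cooling_energy(heat_removed_w: float, cop: float) -> float:
    """Energy the cooling device consumes to remove the given heat.

    Rearranged from COP = heat removed / energy consumed by the cooling device.
    """
    if cop <= 0:
        raise ValueError("COP must be positive")
    return heat_removed_w / cop


def total_energy_cost(computing_power_w: float, cop: float) -> float:
    """Computing cost plus cooling cost.

    Assumption (not stated on the slide): every watt of computing power
    becomes heat that the cooling device has to remove.
    """
    return computing_power_w + cooling_energy(computing_power_w, cop)


if __name__ == "__main__":
    # Illustrative numbers only: the same 10 kW computing load under two COP values.
    for cop in (2.0, 4.0):
        print(cop, total_energy_cost(10_000.0, cop))
```

Under this assumption the computing term is fixed by the workload, so any saving has to come from operating the cooling device at a higher COP.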
13
Observation
  • Even with the same computing power dissipation,
    different temperature distributions may demand
    different cooling loads, resulting in different
    total energy costs
  • We can manipulate task scheduling to achieve the
    best temperature distribution and consequently
    minimize total energy cost

14
Naive Scheduling Algorithm
15
Uniform Outlet Profile
  • Why naive?
  • Based on observation and intuition
  • No mathematical formalization
  • Uniform Outlet Profile (UOP)
  • Assigning tasks so as to achieve a uniform
    outlet temperature distribution Tc
  • Assigning more tasks to nodes with low inlet
    temperature (a water-filling process)

[Figure labels: temperature rise due to power consumption; Tc; inlet temperature]
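The water-filling idea behind UOP can be sketched as follows. This is a hedged illustration rather than the presenters' algorithm: it assumes each node's outlet temperature is its inlet temperature plus a rise proportional to its power, with one hypothetical constant `rise_per_watt` shared by all nodes, and it ignores per-node power limits.

```python
def uniform_outlet_profile(inlet_temps, total_power, rise_per_watt):
    """Split total_power so every loaded node reaches the same outlet level Tc.

    Assumed model: T_out[i] = T_in[i] + rise_per_watt * P[i].
    Nodes whose inlet temperature is already at or above Tc get no power.
    Returns (Tc, list of per-node power in the original node order).
    """
    if not inlet_temps:
        raise ValueError("need at least one node")
    if total_power < 0 or rise_per_watt <= 0:
        raise ValueError("total_power must be >= 0 and rise_per_watt > 0")

    order = sorted(inlet_temps)
    # Total temperature rise that has to be distributed over the nodes.
    rise_budget = total_power * rise_per_watt
    tc = order[0]
    for k in range(1, len(order) + 1):
        # Common level if exactly the k coolest nodes share the load.
        tc = (rise_budget + sum(order[:k])) / k
        # Stop once the next-warmest node would stay above that level.
        if k == len(order) or tc <= order[k]:
            break

    powers = [max(tc - t, 0.0) / rise_per_watt for t in inlet_temps]
    return tc, powers


if __name__ == "__main__":
    tc, powers = uniform_outlet_profile([15.0, 17.0, 20.0], 1000.0, 0.01)
    print(tc, powers)
```

Cooler inlets receive more power, and the per-node powers always add up to the requested total.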
16
Uniform Task
  • Uniform Task (UT)
  • Assigning all chassis the same amount of tasks
    (power consumptions)
  • All nodes experience the same power consumption
    and temperature rise

17
Minimum Computing Energy
  • Minimum computing energy (cooling inlet)
  • Assigning tasks in a way to keep the number of
    active (power on) chassis as small as possible

18
Abstract Heat Flow Model: Cross Interference
Coefficients
19
Abstract Heat Flow Model
  • Observation
  • Airflow patterns are stable (confirmed through CFD
    simulation)
  • Hypothesis
  • The amount of recirculated heat is stable and can be
    characterized
  • Define a_ij: the percentage of recirculated heat
    from node i to node j

20
Cross Interference among Server Nodes
  • Cross Interference Coefficients (CIC)
  • Define a_ij: the percentage of recirculated heat
    from node i to node j
  • Cross interference coefficients
  • Cross Interference Matrix
  • Correlations among power consumption (utilization
    rate), temperature, and cross interference

21
Fast Thermal Evaluation
  • Use a profiling process to calculate the cross
    interference coefficients
  • Temperature prediction

[Figure labels: a configuration of the distributed system; numerical simulation (hours); fast thermal evaluation (real time); thermal performance evaluation]
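Temperature prediction from cross interference coefficients can be sketched with a steady-state heat balance. The sketch below is an assumption-laden illustration, not the presenters' exact formulation: `a[i][j]` is taken as the fraction of node i's outlet heat that recirculates into node j, `k[i]` is a hypothetical per-node airflow heat capacity in W/°C, each node's outlet heat equals its inlet heat plus its power, and whatever inlet air does not come from recirculation comes from the supply at `t_sup`. Under those assumptions the outlet temperature rises solve the linear system (K - A^T K) x = P.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def predict_temperatures(a, k, power, t_sup):
    """Return (inlet temperatures, outlet temperatures) for all nodes.

    a[i][j]: assumed fraction of node i's outlet heat reaching node j's inlet.
    k[i]:    assumed airflow heat capacity of node i (W per degree C).
    power:   per-node power consumption in W.
    t_sup:   supply air temperature.
    """
    n = len(power)
    # Row j of (K - A^T K): K_j on the diagonal minus a[i][j] * K_i.
    m = [[(k[j] if i == j else 0.0) - a[i][j] * k[i] for i in range(n)]
         for j in range(n)]
    rise_out = solve_linear(m, power)
    t_out = [t_sup + r for r in rise_out]
    t_in = [t_out[i] - power[i] / k[i] for i in range(n)]
    return t_in, t_out


if __name__ == "__main__":
    # Two hypothetical nodes: 20% of node 0's outlet heat reaches node 1.
    print(predict_temperatures([[0.0, 0.2], [0.0, 0.0]],
                               [100.0, 100.0], [1000.0, 1000.0], 15.0))
```

Once the coefficients have been profiled, each prediction is a single small linear solve, which is what makes this kind of evaluation usable in real time compared with a CFD run that takes hours.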
22
Recirculation Minimized Scheduling: XInt
23
Formalizing optimization problem
  • To minimize cooling energy cost, we only need to
    minimize the maximal inlet temperature
  • The optimization problem, formalized on the abstract
    heat flow model, can be converted into LP, ILP,
    linear, or nonlinear problems depending on the
    models and policies used
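A toy version of the "minimize the maximal inlet temperature" objective can be sketched as below. It is not the XInt algorithm: it assumes a linear model in which the inlet temperature rise over the supply temperature is D times the power vector (the matrix `d` is a hypothetical input), restricts each node to being fully on or off, and uses exhaustive search where a real deployment would hand the problem to an LP or ILP solver.

```python
from itertools import combinations


def peak_inlet_rise(d, power):
    """Largest entry of D @ power: the worst inlet temperature rise over supply."""
    return max(sum(d_ij * p_j for d_ij, p_j in zip(row, power)) for row in d)


def min_peak_assignment(d, active_count, power_on, power_off=0.0):
    """Pick which nodes to power on so the peak inlet rise is smallest.

    d[i][j]:      assumed inlet temperature rise at node i per watt at node j.
    active_count: number of nodes that must be powered on.
    Returns (peak rise, tuple of chosen node indices); ties keep the first found.
    Exhaustive search, so only suitable for a handful of nodes.
    """
    n = len(d)
    best = None
    for chosen in combinations(range(n), active_count):
        power = [power_on if i in chosen else power_off for i in range(n)]
        peak = peak_inlet_rise(d, power)
        if best is None or peak < best[0]:
            best = (peak, chosen)
    return best


if __name__ == "__main__":
    # Hypothetical 3-node matrix: node 1's heat strongly affects nodes 0 and 2.
    d = [[0.001, 0.004, 0.0],
         [0.0, 0.001, 0.0],
         [0.0, 0.002, 0.001]]
    print(min_peak_assignment(d, active_count=2, power_on=1000.0))
```

In this made-up example the search avoids the node whose heat recirculates most strongly into its neighbours, which is the intuition behind recirculation-minimized scheduling.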

24
Simulation Results
25
Simulation Environment
  • Two-row datacenter
  • Ten standard 42U racks
  • Each rack has five Dell 1855 blade servers
  • CFD simulation is used to evaluate the temperature
    distribution
  • (Flovent from Flomerics)

26
Datacenter model
[Figure labels: Node 1, Node 2, Node 5, Node 25, Node 30, Node 50]
27
Cross Interference Coefficients
  • Consistent with real datacenter behavior
  • Strong interference to neighboring nodes

28
Fast Thermal Evaluation Results
  • Provides fast and accurate temperature prediction
  • Practical for online real-time thermal management

29
Simulation Results: Cooling Cost
30
Simulation Results: Analysis Summary
  • XInt consistently outperforms all other
    scheduling algorithms
  • Compared with MinHR, XInt is more practical
  • Task-oriented scheduling vs. power-oriented
    scheduling
  • Online, real-time
  • XInt is mathematically formalized

31
Future Work
  • Integrating with cluster management software
    platforms
  • Moab, Torque, etc.
  • Considering task priorities and time constraints

32
Questions?
33
Related Work
  • ConSil vs. Fast Thermal Evaluation
  • Deduction vs. prediction
  • Current vs. future: which is more important for
    proactive and preventive thermal management?
  • MinHR vs. XInt
  • Both characterize recirculation at similar
    granularities
  • Aggregated effects vs. point-to-point
  • Offline vs. online
  • Power-oriented vs. task-oriented

34
Supply Heat Index (SHI)
  • Roughly characterizes recirculation
  • Cannot differentiate between different temperature
    distributions that have the same SHI