Power Aware Domain Migration in a Virtualized Cluster
1
Power Aware Domain Migration in a Virtualized
Cluster
  • Tyler Bletsch, Min Yeol Lim, and Vince Freeh

2
Motivation
  • Data centers provision for peak demand
  • Expect to see average demand far less than peak
  • Underutilized resources!
  • Can we exploit this to save energy?
  • Of course.

[Figure: CPU demand vs. time]
3
Existing techniques
  • Prior work
  • Assume a single input stream
  • Requests are short-lived, independent
  • Can be served by homogeneous stateless software
  • Power-Aware Request Distribution (PARD) [1]
  • Dynamically power on/off nodes in response to
    anticipated load
  • No state to save/migrate
  • Direct requests to activated nodes
  • This is a very specific workload

4
A more general solution
  • What about
  • systems that maintain state?
  • environments with separate, indivisible
    processes?
  • Goal: a more general energy-savings technique
  • How?
  • Mechanism: virtualization with live migration
  • Policy: Power-Aware Domain Distribution (PADD)
  • Terminology
  • Domain: a virtual machine capable of migration
  • Node: a physical machine that runs one or more
    domains

5
PARD vs. PADD
6
Conceptual diagram
[Diagram: CPU demand vs. time, with snapshots at times Ta and Tb.
Initially Dom 1 and Dom 2 run on Node 1 and Node 2; after demand
drops, both domains run on Node 1 and Node 2 is off.]
7
Problem
  • Problem is easy to state
  • What is the best way to distribute domains within
    a cluster of physical nodes such that resource
    demands are satisfied while minimizing energy
    (and therefore the number of required nodes)?
  • Solution is a bit harder

8
Questions
  • How is resource utilization measured?
  • What window size?
  • When do we shut down a node?
  • Which node gets shut down?
  • Where do we send the victim domains?
  • When do we bring up a node?
  • Which domains get migrated to the new node?
  • What safety margins are needed to handle
    transient spikes? How are they enforced?

9
How to proceed?
  • PADD Simulator
  • All questions become parameters
  • Quickly explore the problem
  • Test many combinations of parameters
  • Ask what-if questions
  • What if migration time could be reduced?
  • What if my incoming traffic doubles?
  • Avoid dealing with transient technical issues

10
Simulator design (1)
  • System resources model
  • Only one resource for now: CPU
  • Other resources will be added
  • Node model
  • 3 power states
  • What is standby?
  • Fixed performance measure
  • Maximum supported CPU utilization
  • Allows for SMP (values > 1)
  • Allows for different CPU models (non-integer
    values)

[Diagram: the three node power states (Active, Standby, Off)]
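The node model above can be put into code. The following Python sketch is illustrative only (the class and field names are my assumptions, not the authors' simulator API); it captures the three power states and the fixed performance capacity, which can exceed 1 for SMP or be non-integer for differing CPU models.

```python
from dataclasses import dataclass
from enum import Enum

class PowerState(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    OFF = "off"

@dataclass
class Node:
    """A physical machine in the simulated cluster."""
    # Maximum supported CPU utilization: 1.0 for one core,
    # > 1 for SMP, non-integer for differing CPU models.
    capacity: float = 1.0
    state: PowerState = PowerState.OFF

    def slack(self, demand: float) -> float:
        """CPU headroom left after serving `demand`."""
        return self.capacity - demand
```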
11
Simulator design (2)
  • CPU utilization metric
  • At a given moment, CPU is busy or halted
  • Utilization = busy_time / total_time
  • What is total_time, the sampling window?
  • Tradeoff
  • Too small: chaotic mix of peaks/valleys, no trend,
    overreacts to small events
  • Too large: slow to react
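The metric itself is a one-liner; a minimal sketch (function name mine, not from the slides):

```python
def utilization(busy_time: float, total_time: float) -> float:
    """CPU utilization over one sampling window: the fraction of
    the window during which the CPU was busy rather than halted."""
    if total_time <= 0:
        raise ValueError("sampling window must be positive")
    return busy_time / total_time
```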

12
Simulator design (3)
  • Reduction: run a reduce function on raw
    utilization samples within a given reduction
    window
  • Functions: average, max, nth percentile, always
    return X

[Figure: raw utilization data with its max and mean reductions]
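The four reduce functions named on the slide could look as follows. This is a sketch under my own naming assumptions; the percentile uses the nearest-rank method, one of several reasonable choices.

```python
import math

def reduce_samples(samples, fn="mean", pct=95, const=1.0):
    """Reduce raw utilization samples within a reduction window
    to a single value: average, max, nth percentile, or a
    constant ("always return X")."""
    if fn == "mean":
        return sum(samples) / len(samples)
    if fn == "max":
        return max(samples)
    if fn == "percentile":
        # nth percentile, nearest-rank method
        ordered = sorted(samples)
        rank = max(1, math.ceil(pct / 100 * len(ordered)))
        return ordered[rank - 1]
    if fn == "const":
        # e.g. const=1.0 models the naive assumption of 100% demand
        return const
    raise ValueError(f"unknown reduction function: {fn}")
```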
13
Simulator algorithm
  • For each time step
  • Find utilization of all domains, compare to node
    performance capacity
  • For each overcommitted node
  • Pick a victim domain
  • Is there a node with enough slack to run the
    victim?
  • Yes: migrate the domain there
  • No: boot a new node, then migrate to it
  • If a node can have all of its domains migrated
    off safely
  • Migrate such domains
  • Set node to reduced power state
  • Record statistics (energy, migration count, etc.)
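The per-time-step loop above can be sketched as plain Python. This is a simplified illustration, not the authors' simulator: nodes are dicts, the victim policy (smallest domain first) is one arbitrary choice, and the power-down pass only switches off nodes that are already empty rather than actively draining lightly loaded ones.

```python
def step(nodes):
    """One PADD time step: resolve overcommitted nodes by migrating
    victim domains, then power off fully drained nodes.
    `nodes`: list of {"capacity": float, "on": bool,
    "domains": {name: utilization}}. Returns the migration count."""
    migrations = 0

    def load(n):
        return sum(n["domains"].values())

    # Resolve each overcommitted node.
    for n in nodes:
        while n["on"] and load(n) > n["capacity"]:
            # Victim policy (illustrative): smallest domain first.
            victim = min(n["domains"], key=n["domains"].get)
            util = n["domains"].pop(victim)
            # Prefer a running node with enough slack; else boot one.
            target = next((m for m in nodes if m["on"] and m is not n
                           and m["capacity"] - load(m) >= util), None)
            if target is None:
                target = next(m for m in nodes if not m["on"])
                target["on"] = True  # boot a new node
            target["domains"][victim] = util
            migrations += 1

    # Simplified drain pass: power off nodes with no domains left.
    for n in nodes:
        if n["on"] and not n["domains"]:
            n["on"] = False
    return migrations
```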

14
Simulator details
  • Safety margins
  • Local delta: extra CPU slack for each node
  • To handle transient demand spikes
  • Calculated dynamically according to a policy
    parameter
  • Static user-specified value
  • Sum of (standard deviation of utilization over
    reduction window) for each domain
  • Sum of (100% - average domain utilization) for
    each domain
  • Global delta: extra CPU slack for the whole
    cluster
  • Keeps some extra nodes running so we don't have
    to wait on boot-up while demand increases
  • Calculated dynamically, can use same policies as
    local delta
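The three local-delta policies can be sketched directly from the slide. Policy names and the function signature are my assumptions; utilizations are expressed in [0, 1], so "100%" becomes 1.0.

```python
from statistics import mean, pstdev

def local_delta(domain_windows, policy="stddev", static=0.1):
    """Extra per-node CPU slack to absorb transient demand spikes.
    `domain_windows`: one list of utilization samples (in [0, 1])
    per domain, taken over the reduction window."""
    if policy == "static":
        # A fixed, user-specified slack value.
        return static
    if policy == "stddev":
        # Sum of each domain's utilization standard deviation.
        return sum(pstdev(w) for w in domain_windows)
    if policy == "headroom":
        # Sum of (100% - average domain utilization).
        return sum(1.0 - mean(w) for w in domain_windows)
    raise ValueError(f"unknown delta policy: {policy}")
```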

15
When is demand satisfied?
  • Depends on workload
  • Throughput-oriented workload
  • First pass: unmet CPU demand
  • Each sample, compare total domain demand D
    against node CPU performance capacity P
  • If P ≥ D, then demand is satisfied
  • Else, D - P is the unmet demand for that sample
  • Add to a running total
  • A simulation is successful if unmet_demand <
    MAX_UNMET_DEMAND
  • MAX_UNMET_DEMAND is empirically derived and small
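The success criterion above amounts to a running sum of per-sample shortfalls. A minimal sketch (function names are mine):

```python
def unmet_demand(demand_trace, capacity_trace):
    """Total unmet CPU demand over a run: per sample, add
    max(0, D - P), where D is total domain demand and P is node
    capacity. Demand is satisfied whenever P >= D."""
    total = 0.0
    for d, p in zip(demand_trace, capacity_trace):
        total += max(0.0, d - p)
    return total

def successful(total_unmet, max_unmet):
    """A simulation succeeds if unmet_demand < MAX_UNMET_DEMAND."""
    return total_unmet < max_unmet
```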

16
Summary
  • Simulator takes inputs
  • CPU utilization traces from real servers
  • Cluster configuration
  • Domain migration policy parameters
  • Produces outputs
  • Trace of migration history with running
    statistics
  • Summary statistics: total energy use, total unmet
    demand, etc.

17
Preliminary test
  • Configuration
  • 4 web-server domains
  • Test duration: 12 hours
  • Sampling window: 1 s
  • Reduction window: 300 s
  • Comparing policy decisions

18
Preliminary results
  • Bad news
  • Most non-naïve policies led to large unmet demand
    (> 5800 CPU seconds)
  • Naïve: always assume 100% demand
  • No successful simulation saved any power;
    naïve techniques simply ran on 4 nodes all the
    time
  • Good news
  • The maximum reduction function worked as well
    as naïve
  • Should perform even better on workloads with
    stable, low demand
  • E.g. run-to-completion where CPU is not the
    bottleneck
  • Several unsuccessful runs did save power
  • If we can curb the unmet demand, we'll be in
    business!

19
Conclusion
  • This work is in its infancy, with many problems to
    overcome
  • Simulator allows rapid exploration of the problem
  • Hope to find successful power-saving
    configurations in the coming months

20
References
  • [1] K. Rajamani and C. Lefurgy. On evaluating
    request-distribution schemes for saving energy in
    server clusters. In Proceedings of the IEEE
    International Symposium on Performance Analysis
    of Systems and Software, 2003.

21
Unmet demand
22
Energy savings (or the lack thereof)
23
Boot timing (1)
Cold boot
Restoring from hibernate mode
24
Boot timing (2)