Distributed Resource Management for Virtualized System Clusters: VMware R&D presentation

1
Distributed Resource Management for Virtualized
System Clusters
VMware R&D
2
Outline

Distributed Resource Management for Virtualized
System Clusters
Infrastructure
Live Migration
Architecture for Distributed Communication and
Control
Resource Management Basics
Distributed Resource Scheduling
Distributed Power Management
Distributed High Availability
Summary

3
Virtualized Systems Cluster Infrastructure: Live
Migration

Example: VMware VMotion
Hot migrate VM across hosts
Transparent to guest OS, apps
Minimal downtime (sub-second)
Requirements (current)
Globally accessible storage (SAN/NAS)
Same subnet (no forwarding proxy)
Compatible processors
Details
Bitmap tracks modified pages
Pre-copy iteration sends modified pages
Repeatedly pre-copy the diff until it converges
Exploit meta-data (shared, swapped)
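
The pre-copy loop in the Details list can be illustrated with a small simulation. This is only a sketch under stated assumptions, not VMotion's implementation: the function name precopy_migrate, the dirty_rate model (pages dirtied per page sent), the stop_threshold, and the max_rounds cap are all hypothetical.

```python
import random

def precopy_migrate(num_pages, dirty_rate, stop_threshold=8, max_rounds=30, seed=0):
    """Simulate iterative pre-copy; return (rounds, pages_sent, pages_in_final_copy).

    dirty_rate is an assumed model parameter: the number of pages the running
    guest dirties for every page sent during a pre-copy round.
    """
    rng = random.Random(seed)
    dirty = bytearray([1]) * num_pages      # bitmap: initially every page must be sent
    pages_sent = 0
    rounds = 0
    while sum(dirty) > stop_threshold and rounds < max_rounds:
        batch = [p for p in range(num_pages) if dirty[p]]
        for p in batch:
            dirty[p] = 0                    # clear the bit as the page is sent
        pages_sent += len(batch)
        rounds += 1
        # The guest keeps running during the round and dirties more pages.
        newly_dirty = min(num_pages, int(len(batch) * dirty_rate))
        for p in rng.sample(range(num_pages), newly_dirty):
            dirty[p] = 1
    # Final step: the VM is paused briefly and the remaining dirty pages are sent.
    remaining = sum(dirty)
    pages_sent += remaining
    return rounds, pages_sent, remaining

if __name__ == "__main__":
    print(precopy_migrate(4096, 0.25))      # converges: each round is 4x smaller
    print(precopy_migrate(1024, 1.0, max_rounds=10))  # never converges, hits the cap
```

When the guest dirties pages more slowly than they can be sent, each round shrinks and the final paused copy is small, which is what keeps downtime short. When it does not, the loop has to be cut off, which is why the sketch includes a round cap.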

4
Virtualized Systems Cluster Infrastructure:
Example Architecture
5
Outline

Distributed Resource Management for Virtualized
System Clusters
Infrastructure
Resource Management Basics
Goals
Controls
Distributed Resource Scheduling
Distributed Power Management
Distributed High Availability
Summary

6
Resource Management Goals

Performance isolation
Prevent virtual machines (VMs) from monopolizing
resources
Guarantee predictable service rates
Efficient utilization
Exploit undercommitted resources
Overcommit with graceful degradation
Exploit opportunities to reduce power consumption
Easy administration
Flexible dynamic partitioning
Meet absolute service-level agreements
Control relative importance of VMs
Respect availability constraints

7
Resource Controls Overview

Useful Features
Express absolute service rates, e.g., 512 MHz,
1 GB
Express relative importance, e.g., VM A gets 2x
the CPU of VM B
Group VMs for isolation or sharing, e.g., VMs
A and B share 1 GHz
Challenges
Simple enough for novices
Powerful enough for experts
Mapping application-level metrics to physical
resource consumption
E.g., What MHz is needed to guarantee 100
transactions/second?
Scaling from single host to cluster of (say)
32/64/128 servers

8
Basic Resource Controls

Shares
Specify relative importance
Entitlement directly proportional to shares
Abstract relative units; only ratios matter
Reservation
Minimum guarantee, even when system overcommitted
Concrete absolute units (MHz, MB)
Admission control: sum of reservations ≤ capacity
Limit
Upper bound on consumption, even when
undercommitted
Concrete absolute units (MHz, MB)
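
One way to see how shares, reservation, and limit interact is to compute entitlements for a few VMs on a single host. The sketch below is an illustration, not the ESX scheduler: it assumes a "water-filling" rule in which every VM gets rate * shares, clamped between its reservation and its limit, with the rate found by bisection so that the total equals capacity. The VM class and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class VM:
    name: str
    shares: int                      # relative importance, must be > 0
    reservation: float = 0.0         # guaranteed minimum, MHz
    limit: float = float("inf")      # upper bound, MHz

def admit(vms, capacity):
    """Admission control: the sum of reservations must not exceed capacity."""
    return sum(v.reservation for v in vms) <= capacity

def entitlements(vms, capacity, iters=60):
    """Split capacity in proportion to shares, honoring reservations and limits."""
    if not admit(vms, capacity):
        raise ValueError("reservations exceed capacity")
    if sum(v.limit for v in vms) <= capacity:
        # Undercommitted even at the limits: everyone is capped at their limit.
        return {v.name: v.limit for v in vms}

    def alloc(rate):
        return [min(max(rate * v.shares, v.reservation), v.limit) for v in vms]

    lo, hi = 0.0, capacity / min(v.shares for v in vms)
    for _ in range(iters):           # bisection on the per-share rate
        mid = (lo + hi) / 2
        if sum(alloc(mid)) > capacity:
            hi = mid
        else:
            lo = mid
    return dict(zip((v.name for v in vms), alloc(lo)))

if __name__ == "__main__":
    vms = [VM("A", 2000), VM("B", 1000), VM("C", 1000, reservation=1200)]
    print(entitlements(vms, 3000))
```

In this example C's reservation lifts it above its proportional share, and A still receives twice what B does from the remainder, so only the ratio of shares matters.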

9
Resource Pools

Motivation
Allocate aggregate resources for sets of VMs
Isolation between pools, sharing within pools
Flexible hierarchical organization
Access control and delegation
What is a resource pool?
Named object with permissions
Reservation, limit, and shares for each resource
Parent pool, child pools, VMs
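
The parent/child structure can be sketched as a tree in which capacity flows from a pool to its children. This is a simplified illustration: it divides capacity by shares only and leaves out reservations, limits, and permissions, and the Node class, the pool names, and the MHz figures are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A resource pool (has children) or a VM (no children)."""
    name: str
    shares: int
    children: list = field(default_factory=list)

def divide(node, capacity, out=None):
    """Split capacity among siblings in proportion to shares, recursively.

    Returns a dict mapping each VM (leaf) name to its portion of capacity.
    """
    if out is None:
        out = {}
    if not node.children:
        out[node.name] = capacity
        return out
    total = sum(child.shares for child in node.children)
    for child in node.children:
        divide(child, capacity * child.shares / total, out)
    return out

if __name__ == "__main__":
    prod = Node("prod", 2, [Node("vm1", 1), Node("vm2", 1)])
    test = Node("test", 1, [Node("vm3", 1), Node("vm4", 1), Node("vm5", 2)])
    root = Node("cluster", 1, [prod, test])
    print(divide(root, 6000.0))
```

Adding a VM to the "test" pool changes only how that pool's portion is divided; the "prod" VMs keep theirs, which is the isolation-between-pools, sharing-within-pools property.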

10
Resource Controls Exploration Areas

Additional controls
Real-time latency guarantees
Application-level metrics
Users think in terms of transaction rates,
response times
Labor-intensive, requires detailed
domain/app-specific knowledge
Automate mapping to physical resource controls

11
Outline

Distributed Resource Management for Virtualized
System Clusters
Infrastructure
Resource Management Basics
Distributed Resource Scheduling
Overview
Example
Exploration Areas
Distributed Power Management
Distributed High Availability
Summary

12
Distributed Resource Scheduling Overview

Useful features
Choose initial host for VM power on
Dynamic rebalancing by migrating running VMs
between hosts
Configurable automation and migration threshold
levels
Provide host evacuation for flexible host
downtime
Support optional constraints on VM colocation on
hosts
Preserve resources for failover
Challenges
Placement and migration decisions involve
multiple resources
Resource pools can span multiple hosts
Determining appropriate migration threshold
controls
Assorted failure modes (hosts, connectivity,
etc.)

13
Distributed Resource Scheduling Example: VMware
DRS

Cluster-wide resource management
Hierarchical organization and delegation
Flexible grouping, sharing, and isolation
Configurable automation levels, migration
aggressiveness
Configurable VM affinity/anti-affinity rules
Preserves unfragmented spare resources for
failover
Automatic virtual machine placement and migration
Choose initial host when VM powers on
Optimize load balance across hosts
Dynamic rebalancing using VMotion
React to dynamic load changes
Evacuate hosts for maintenance and/or power-off

14
Example: VMware DRS Balancing Details

Compute VM entitlements
Based on resource pool and VM resource settings
VM demand includes usage and unsatisfied demand
Don't give a VM more than it demands
Reallocate extra resources fairly
Compute host loads
Load ≠ utilization unless all VMs are equally
important
Sum entitlements for VMs on host
Normalize by host capacity
Consider possible migrations
Evaluate cluster balance impact and risk-adjusted
cost/benefit
Incorporate migration cost for involved hosts
Recommend best moves (meeting specified threshold)
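
The balancing steps above can be sketched in a few lines. This is a minimal illustration, not the DRS algorithm: it assumes that imbalance is measured as the population standard deviation of normalized host loads, that a single flat migration_cost stands in for the risk-adjusted cost/benefit analysis, and that a move is rejected if it would push the target host past its capacity. All names and numbers are hypothetical.

```python
from statistics import pstdev

def host_loads(placement, entitlement, capacity):
    """Sum VM entitlements per host and normalize by host capacity."""
    totals = {host: 0.0 for host in capacity}
    for vm, host in placement.items():
        totals[host] += entitlement[vm]
    return {host: totals[host] / capacity[host] for host in capacity}

def imbalance(placement, entitlement, capacity):
    """Assumed metric: standard deviation of the normalized host loads."""
    return pstdev(list(host_loads(placement, entitlement, capacity).values()))

def best_move(placement, entitlement, capacity, threshold=0.0, migration_cost=0.0):
    """Return the (vm, src, dst) move with the largest net gain above threshold."""
    base = imbalance(placement, entitlement, capacity)
    best, best_gain = None, threshold
    for vm, src in placement.items():
        for dst in capacity:
            if dst == src:
                continue
            trial = dict(placement)
            trial[vm] = dst
            if host_loads(trial, entitlement, capacity)[dst] > 1.0:
                continue                      # would overcommit the target host
            gain = base - imbalance(trial, entitlement, capacity) - migration_cost
            if gain > best_gain:
                best, best_gain = (vm, src, dst), gain
    return best

if __name__ == "__main__":
    capacity = {"H1": 4000, "H2": 4000}                        # MHz
    entitlement = {"VM1": 2000, "VM2": 1200, "VM3": 400, "VM4": 1000}
    placement = {"VM1": "H1", "VM2": "H1", "VM3": "H1", "VM4": "H2"}
    print(best_move(placement, entitlement, capacity))
```

Raising threshold or migration_cost makes the sketch more conservative, which loosely mirrors the configurable migration threshold mentioned earlier in the presentation.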

15
VMware DRS Simple Balancing Example: all VMs in
same resource pool with same shares
Recommendation to improve imbalance: migrate VM2
16
Distributed Resource Scheduling Exploration Areas

I/O resource management
Quality of service for networking, storage
End-to-end control difficult, complex
switching/routing fabric
Lack of standards, even in non-virtualized
environments
Proactive migrations
Detect longer-term trends
Move VMs based on predicted load while minimizing
impact on current load
Large-scale WAN/Grid management

17
Outline

Distributed Resource Management for Virtualized
System Clusters
Infrastructure
Resource Management Basics
Distributed Resource Scheduling
Distributed Power Management
Overview
Example
Exploration Areas
Distributed High Availability
Summary

18
Distributed Power Management Overview

Useful features
Migrate VMs to allow hosts to be powered-off when
demand low
Power on hosts when demand rises or needed to
satisfy constraints
Work in concert with Distributed Resource
Scheduling and Distributed High Availability
goals and constraints
Configurable automation and utilization threshold
levels
Challenges
Non-homogeneous hosts can result in utilization
hot or cold spots
Sudden unpredicted rise in demand can cause
performance impact
Benefits depend on demand valleys of non-trivial
duration
Variety of host wake methods with usability
pros/cons

19
Distributed Power Management Example: VMware DPM
Response to Reduced Demand

Implements useful features
Consolidates virtual machines (VMs) onto fewer
hosts and powers hosts off when demand is low
Powers hosts back on when needed to meet workload
demand or to satisfy constraints
Optional add-on to VMware Distributed Resource
Scheduler (DRS)

Figure: DRS cluster with DPM enabled (host labeled "Power Off")
20
Example: VMware DPM Operation in VC 2.5/ESX 3.5

Lightly-used Hosts → Consider Host Power-Off
Conservative: considers 40-minute load history
All VMs on selected host are migrated to other
hosts
Weighs trade-offs between costs and benefits of
power-off
Host is powered off
Heavily-used Hosts → Consider Host Power-On
Responsive: considers 5-minute load history
Send wake-on-LAN packet or BMC command (2009) to
host
Host boots up
DRS load-balancing kicks in and some VMs are migrated
to host
Target Host Utilization
Range centered around 63% by default
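
The asymmetric history windows can be sketched as a small decision function. This is an illustration under assumptions, not DPM's logic: the presentation gives the 40-minute and 5-minute windows and the default range center, while the half-width of the range (0.18 here), the use of the maximum over the long window for power-off, and the use of the mean over the short window for power-on are all hypothetical choices.

```python
def dpm_recommendation(util_history, target=0.63, half_width=0.18,
                       off_window=40, on_window=5):
    """Suggest a power action from per-minute utilization samples (0.0 to 1.0).

    util_history lists the oldest sample first. Returns "power-on",
    "power-off", or "none".
    """
    low, high = target - half_width, target + half_width
    recent = util_history[-on_window:]
    longer = util_history[-off_window:]
    # Responsive: a short burst above the range is enough to ask for capacity.
    if len(recent) == on_window and sum(recent) / on_window > high:
        return "power-on"
    # Conservative: every sample in the long window must sit below the range.
    if len(longer) == off_window and max(longer) < low:
        return "power-off"
    return "none"

if __name__ == "__main__":
    print(dpm_recommendation([0.2] * 40))               # power-off
    print(dpm_recommendation([0.2] * 35 + [0.9] * 5))   # power-on
    print(dpm_recommendation([0.6] * 40))               # none
```

Reacting quickly to high load and slowly to low load trades some power savings for a lower risk of performance impact when demand rises suddenly, one of the challenges listed in the overview.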

21
Example: VMware DRS/DPM Interactions

Responsibilities
DRS balances load to satisfy service-level
agreements
DPM reduces running cluster capacity to save
power
Both DRS and DPM respect any resources needed for
failover
Interactions
DRS rebalances with DPM-recommended power
actions in what-if simulations
DPM evaluates the impact of potential power
actions based on DRS rebalancing results
Recommendations
Final host power actions and VMotions
Recommended to user, or applied automatically

22
Distributed Power Management Exploration Areas

Use VM demand prediction to drive proactive host
power-on
Incorporate additional metrics in host off/on
selection
Examples: host power efficiency, temperature
Operate in cooperation with host-level power
management
Example: choose hosts for off/on based on
power-management features

23
Outline

Distributed Resource Management for Virtualized
System Clusters
Infrastructure
Resource Management Basics
Distributed Resource Scheduling
Distributed Power Management
Distributed High Availability
Overview
Example
Exploration Areas
Summary

24
Distributed High Availability Overview

Useful features
Provide various methods to specify resources to
reserve to restart VMs upon failures of their
hosts in a virtualized system cluster
Express whether failover resource reservation is
strict or best-effort
Decentralized host failure detection and quick VM
restart
Work in concert with DRS and DPM goals and
constraints
Challenges
Preserving unfragmented failover resources across
hosts
Avoiding conservative allocation of spare
resources when running VMs have widely different
resource reservations and needs
Support robust failover operation in many
possible failure situations
Provide cluster failover status information in a
user-friendly way

25
Distributed High Availability Example: VMware HA

Specify resources to be set aside for failover
Number of host failures to tolerate
Percentage of cluster capacity (2009)
Specific hosts to set aside for failover (2009)
Detect failover and respond
Cluster hosts send each other heartbeats; when a
host fails to do so for some period, a failover
response action is launched
For failed hosts, their running VMs are restarted
on other hosts
Safe restart supported by locking in ESX server
storage system
At failover, want unfragmented powered-on
resources
CPU and memory resources for each VM to fail over
must be available on one host
Resources must be on powered-on host
DRS/DPM/HA proactively maintain appropriate spare
resources
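
The heartbeat timeout and the "unfragmented" requirement can be sketched together. This is a simplified illustration, not VMware HA's implementation: the timeout value, the data layout, and the first-fit-decreasing placement rule are assumptions, and all host and VM names are hypothetical.

```python
def failed_hosts(last_heartbeat, now, timeout):
    """Hosts whose most recent heartbeat is older than the timeout."""
    return sorted(h for h, t in last_heartbeat.items() if now - t > timeout)

def plan_restarts(vms, spare, failed):
    """Place each VM from a failed host whole on one surviving powered-on host.

    vms:   {vm_name: (host, cpu_mhz, mem_mb)}
    spare: {host: (free_cpu_mhz, free_mem_mb)} for powered-on hosts
    Returns {vm_name: target_host}, or None if some VM fits on no single host.
    """
    free = {h: list(r) for h, r in spare.items() if h not in failed}
    victims = [(name, cpu, mem) for name, (host, cpu, mem) in vms.items()
               if host in failed]
    plan = {}
    # Assumed heuristic: place the largest VMs first (first-fit decreasing).
    for name, cpu, mem in sorted(victims, key=lambda v: (v[1], v[2]), reverse=True):
        for host in sorted(free):
            if free[host][0] >= cpu and free[host][1] >= mem:
                free[host][0] -= cpu
                free[host][1] -= mem
                plan[name] = host
                break
        else:
            return None                      # spare capacity is too fragmented
    return plan

if __name__ == "__main__":
    beats = {"h1": 100, "h2": 88, "h3": 99}
    down = failed_hosts(beats, now=101, timeout=10)
    vms = {"a": ("h2", 2000, 4096), "b": ("h2", 1000, 2048), "c": ("h1", 500, 1024)}
    print(down, plan_restarts(vms, {"h1": (1500, 8192), "h3": (2500, 4096)}, down))
    # Same total spare CPU or more, but no single host can hold VM "a":
    print(plan_restarts(vms, {"h1": (1500, 8192), "h3": (1500, 8192)}, down))
```

The second call shows why total spare capacity is not enough: 3000 MHz is free across the cluster, yet the 2000 MHz VM cannot be restarted because no single host has that much available.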

26
Distributed High Availability Exploration Areas

Provide alternative to VM restart on host failure
by continuing the VM from a secondary copy
already executing on another cluster host (2009),
enabled by VM snapshot and record/replay
Support DRS/DPM-aided failover if decentralized
restart fails: use migrations and host power-on
if needed to get resources
Explore detecting/responding to partial host
failure modes: run on host with diminished
working capacity

27
Outline

Distributed Resource Management for Virtualized
System Clusters
Infrastructure
Resource Management Basics
Distributed Resource Scheduling
Distributed Power Management
Distributed High Availability
Summary

28
Summary

Virtualized system clusters with appropriate
supporting infrastructure can benefit from
distributed resource scheduling, distributed
power management, and distributed high
availability.
Each of these technologies has a number of areas
open for exploration and innovation.
