Dynamic Resource Management in Internet Data Centers


1
Dynamic Resource Management in Internet Data
Centers

Bhuvan Urgaonkar
Laboratory for Advanced Systems Software
University of Massachusetts Amherst
http://www.cs.umass.edu/~bhuvan

2
Internet Applications

Proliferation of Internet applications

auction site
online game
online store

Growing significance in personal, business
affairs
Focus: Internet server applications

3
Internet Workloads Are Dynamic

Multi-time-scale variations
Time-of-day, hour-of-day
Flash crowds
User threshold for response time: 8-10 s
Key issue: Provide good response time under varying workloads

[Figures: request arrivals per minute plotted against time in days (0 to 5, y-axis up to 1200) and against time in hours (0 to 24, y-axis up to 140K)]
4
Data Centers

Clusters of servers
Hosting platforms
Rent resources to third-party applications
Performance guarantees in return for revenue
Benefits
Applications don't need to maintain their own infrastructure
Rent server resources, possibly on demand
Platform provider generates revenue by renting
resources

5
Goals of a Data Center

Satisfy application performance guarantees under
dynamic workloads
E.g., average response time, throughput
Maximize resource utilization
E.g., maximize the number of hosted applications
Question: How should a data center manage its resources to meet these goals?

6
Manual Resource Allocation
[Figure: WC Soccer 1998 workload]

Resource over-provisioning
Resource wastage
A bad estimate could result in under-allocation
Manual reallocation
Slow allocation time

Challenge: How to handle dynamic workloads while efficiently utilizing resources?
7
Dynamic Resource Management

How to map an application to servers in the data center?
How to provide good performance under dynamic workloads?
How to remain operational under extreme overloads?

Application Placement [OSDI'02, PDCS'04]
Dynamic Capacity Provisioning [Auto Computing'05]
Scalable Policing [World Wide Web'05]
9
Talk Outline

Motivation
Data Center Models
Application Placement
Dynamic Capacity Provisioning
Summary and Future Research

10
Data Center Models

Small applications
Require only a fraction of a server
Shared Web hosting, $20/month to run your own Web site
Shared hosting: multiple applications on a server
Co-located applications compete for server resources

11
Data Center Models

Large applications
May span multiple servers
eBay site uses thousands of servers!
Dedicated hosting: at most one application per server
Allocation at the granularity of a single server

12
Application Placement
[OSDI'02]

How to map an application to servers in the data center?
Step 1: Finding the application's resource requirement
Automatic requirement inference technique
Step 2: Identifying servers to host the application
Easy in dedicated hosting
Just assign the desired number of available servers!
Non-trivial in shared hosting
Opportunity for statistical multiplexing of resources on a server
Multi-dimensional Knapsack

13
Resource Requirement Inference
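[Figure: resource usage modeled as an ON-OFF process over time, divided into measurement intervals, and the resulting CDF of fractional usage (x-axis 0 to 1, y-axis cumulative probability with 0.99 marked, points A and B marked)]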
14
Requirement Inference Technique

Profiling: the process of determining resource usage
Run the application on an isolated server
Subject the application to a real workload
Determine CPU and network usage
Use the Linux trace toolkit [Yaghmour'00]
Track scheduling events, packet transmission times
Implementation on a Linux cluster
Apache Web server using SPECWeb99
Streaming media server with VBR MPEG-1 clients
Postgres database server
Quake game server
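
As a rough illustration of what profiling produces, the sketch below turns a list of busy (ON) periods into per-interval fractional usage and an empirical CDF. The trace format, the function names, and the interval length are assumptions made for this example; this is not the Linux trace toolkit's interface.

```python
from bisect import bisect_right

def fractional_usage(on_periods, interval, duration):
    """Fraction of each measurement interval during which the resource was busy.

    on_periods: list of (start, end) busy periods (hypothetical trace format).
    interval:   measurement interval length, in the same time unit as the trace.
    duration:   total length of the profiling run.
    """
    n = int(duration // interval)
    busy = [0.0] * n
    for start, end in on_periods:
        for i in range(n):
            lo, hi = i * interval, (i + 1) * interval
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                busy[i] += overlap
    return [b / interval for b in busy]

def empirical_cdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    ordered = sorted(samples)
    def cdf(x):
        return bisect_right(ordered, x) / len(ordered)
    return cdf

# Three busy periods observed over a 20-unit run, measured in 5-unit intervals.
usage = fractional_usage([(0, 2), (5, 6), (10, 19)], interval=5, duration=20)
cdf = empirical_cdf(usage)
```

A high percentile of `usage` read off this CDF is the kind of quantity the following slides provision for.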

15
Application Profiles

Observation: Resource usage can be bursty
Peak requirement much higher than a high percentile
Insight: Provisioning for the tail can save resources!
Under-provisioning of resources
Occasional violations of resource guarantees

16
Controlled Resource Under-provisioning

Allow applications to specify a violation tolerance V
Provision for the (100-V)th percentile of resource usage
Requirements do not necessarily peak simultaneously
Probability of violations even less than V%
Similar to resource overbooking in the airline industry
Determine which servers have enough capacity
$s_k$: (100-V)th percentile of application k's usage, $C$: server capacity
$\sum_k s_k^{cpu} \le C^{cpu}$ and $\sum_k s_k^{net} \le C^{net}$
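
The capacity check above can be sketched as follows. This is a minimal illustration, assuming usage samples are fractions of one server and that a nearest-rank percentile is acceptable. The `first_fit` helper is a simple heuristic swapped in for illustration; it is not the placement algorithm from the talk, and all function names are hypothetical.

```python
import math

def percentile(samples, q):
    """Nearest-rank q-th percentile (0 < q <= 100) of a usage sample."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]

def requirement(cpu_samples, net_samples, violation_tolerance):
    """(cpu, net) requirement at the (100 - V)th percentile of usage."""
    q = 100 - violation_tolerance
    return percentile(cpu_samples, q), percentile(net_samples, q)

def fits(server_load, req, capacity):
    """Check that the summed requirements stay within capacity on every resource."""
    return all(l + r <= c for l, r, c in zip(server_load, req, capacity))

def first_fit(requirements, num_servers, capacity=(1.0, 1.0)):
    """Place each application on the first server with room (None if none has room)."""
    loads = [[0.0] * len(capacity) for _ in range(num_servers)]
    placement = []
    for req in requirements:
        for idx, load in enumerate(loads):
            if fits(load, req, capacity):
                for d, r in enumerate(req):
                    load[d] += r
                placement.append(idx)
                break
        else:
            placement.append(None)
    return placement

# A bursty application: CPU usage is 0.1 in 99 intervals and 0.9 in one.
cpu = [0.1] * 99 + [0.9]
net = [0.05] * 100
peak_req = requirement(cpu, net, 0)   # provision for the peak
tail_req = requirement(cpu, net, 1)   # tolerate violations 1% of the time
```

With these made-up samples, provisioning for the peak fits one such application per server, while a 1% tolerance lets several share a server.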
17
Resource Utilization Gains
[Figures: "Placement of Apache Web Servers" and "Placement of Streaming Media Servers"; each plots the number of servers placed (Web servers: 0 to 1400, media servers: 0 to 350) against data center size (0 to 140), for No Viol and Viol=1%]

1% violations can more than double the number of applications!
Small under-provisioning can yield large gains
Bursty applications yield larger benefits

18
Impact of Under-provisioning on Application
Performance

Provisioning for the tail results in tolerable
degradation
Large resource savings possible with small
degradation

19
Application Placement Summary

Server applications tend to have bursty usage
Save resources in shared data centers running
small applications
Determine resource usage behavior
Under-provision resources
Controlled performance degradation
Theoretical properties of application placement
NP-hard, approximation algorithms

[OSDI'02, PDCS'04]
20
Talk Outline

Motivation
Data Center Models
Application Placement
Dynamic Capacity Provisioning
Summary and Future Research

21
Dynamic Capacity Provisioning
Monitor workload
Compute future demand
Adjust allocation

Key idea: increase or decrease allocated resources to handle workload fluctuations
To handle increased workload
Shared hosting: increase resource share
Dedicated hosting: start replicas on additional servers
Focus: Dedicated hosting, large applications

[Chandra'03, Chase'01]
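
One iteration of the monitor, predict, allocate cycle for dedicated hosting might look like the sketch below. The predictor (recent maximum plus headroom), the `headroom` parameter, and the per-server capacity figure are all assumptions for illustration; the talk does not specify them here.

```python
import math

def predict_peak_rate(recent_rates, headroom=1.2):
    """Hypothetical predictor: recent maximum arrival rate plus some headroom."""
    return max(recent_rates) * headroom

def servers_needed(predicted_rate, per_server_capacity):
    """Replicas required so that total capacity covers the predicted rate."""
    return max(1, math.ceil(predicted_rate / per_server_capacity))

def adjust_allocation(current_servers, recent_rates, per_server_capacity):
    """Return (target server count, change relative to the current allocation)."""
    predicted = predict_peak_rate(recent_rates)
    target = servers_needed(predicted, per_server_capacity)
    return target, target - current_servers

# Monitored rates (requests/s) rising to 200, with each server handling about 50.
target, delta = adjust_allocation(2, [100, 150, 200], per_server_capacity=50)
```

A positive `delta` means starting replicas on additional servers; a negative one means servers can be released.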
22
Dynamic Capacity Provisioning

Servers
23
Dynamic Capacity Provisioning

[Figure: components labeled Predictors, Allocator, Monitor, and Servers]
24
Internet Application Architecture
[Figure: request processing in an online bookstore across HTTP, J2EE, and Database tiers; the query "search moby" produces the response "Melville's Moby Dick, Music CDs by Moby"]

Multi-tier architecture
Each tier uses services provided by its successor
Session-based workloads
Caching, replication

25
Existing Application Models

Models for Web servers [Chandra'03, Doyle'03]
Do not model Java server, database, etc.
Black-box models [Kamra'04, Ranjan'02]
Unaware of bottleneck tier
Extensions of single-tier models [Welsh'03]
Fail to capture interactions between tiers
Existing models inadequate for multi-tier Internet applications

26
Baseline Application Model
[SIGMETRICS'05]
[Figure: clients and application]

Model consists of two components
Sub-system to capture behavior of clients
Sub-system to capture request processing inside
the application

27
Modeling Clients
[Figure: clients 1 to N, each with think time Z, forming queue Q0 in front of the application]

Clients think between successive requests
Infinite server system to capture think time Z
Captures independence of Z from processing in
application

28
Modeling Request Processing
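[Figure: tiers 1 to M modeled as queues Q1, Q2, ..., QM with service times S1, S2, ..., SM and transition probabilities p1, p2, p3, ..., pM = 1; N sessions]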

Transitions defined to capture circulation of
requests
Request may move to next queue or previous queue
Multiple requests are processed concurrently at
tiers
Processor sharing scheduling discipline
Caching effects get captured implicitly!
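
To make the circulation of requests concrete, the sketch below derives the expected number of visits per request to each tier from the transition probabilities. It assumes that a request at tier i returns to tier i-1 with probability p_i and moves on to tier i+1 otherwise, that the last tier always returns (p_M = 1), and that returning from tier 1 completes the request. Under those assumptions, flow balance between neighboring tiers gives V_1 = 1/p_1 and V_{i+1} = V_i (1 - p_i) / p_{i+1}. The function name is hypothetical.

```python
def visit_ratios(p):
    """Expected visits per request to each tier.

    p[i] is the assumed probability that a request at tier i+1 (1-based)
    returns to the previous tier; the last entry must be 1.0.
    """
    if not p or p[-1] != 1.0:
        raise ValueError("last tier must return with probability 1")
    if any(not 0.0 < x <= 1.0 for x in p):
        raise ValueError("return probabilities must be in (0, 1]")
    visits = [1.0 / p[0]]
    for i in range(1, len(p)):
        # Forward crossings from tier i equal backward crossings from tier i+1.
        visits.append(visits[-1] * (1.0 - p[i - 1]) / p[i])
    return visits
```

A larger p_1 means more requests are answered at the first tier without going deeper, which is one way a cache hit shows up in the model without being modeled explicitly.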

29
Putting It All Together
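[Figure: the client queue Q0 (think time Z) combined with tier queues Q1, Q2, ..., QM (service times S1, S2, ..., SM, transition probabilities p1, p2, p3, ..., pM = 1) for tiers 1 to M; N sessions]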

A closed-queuing model that captures a given
number of simultaneous sessions being served

30
Model Solution and Parameter Estimation
[SIGMETRICS'05]

Mean Value Analysis (MVA) Algorithm
Computes mean response time
Visit ratios
Equivalent to transition probabilities for MVA
$V_i = \lambda_i / \lambda_{req}$, with $\lambda_{req}$ measured at the policer and $\lambda_i$ taken from logs
Service times
Use residence time $X_i$ logged at tier i
For the last tier, $S_M = X_M$
For the other tiers, $S_i = X_i - (V_{i+1} / V_i) \cdot X_{i+1}$
Think time
Measured at the entry point of the application
31
Evaluation of Baseline Model

Auction site: RUBiS
One server per tier

[Figure: Apache, JBOSS, and Mysql tiers, annotated with the values 75 and 150]

Concurrency limits not captured

32
Handling Concurrency Limits
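[Figure: the queueing model (Q0 with think time Z, Q1, Q2, ..., QM with service times S1, S2, ..., SM, N sessions) with dropped requests leaving the tiers]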

Requests may be dropped due to concurrency limits
Need to model the finiteness of queues!

33
Handling Concurrency Limits
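[Figure: the queueing model (Q0 with think time Z, Q1, Q2, ..., QM with service times S1, S2, ..., SM, N sessions) extended with drop subsystems: drop queues, drop probabilities, and drop service times for tiers 1 to M]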

Approach: subsystems to capture dropped requests
Distinguish the processing of dropped requests

34
Response Time Prediction

Enhanced model can capture concurrency limits

35
Query Caching at the Database

Caching effects
Captured by tuning $V_i$ and/or $S_i$

Bulletin-board site: RUBBoS
50 sessions
SELECT SQL_NO_CACHE causes Mysql to not cache the response to a query

More model enhancements
Replication at tiers
Multiple session classes

36
Dynamic Capacity Provisioning

Servers
37
Handling Unanticipated Workloads

Allocation for an application may be insufficient
Short-term fluctuations are difficult to predict
Errors in parameter estimation may cause
under-allocation
Reactor: allocate additional servers over a time scale of a few minutes if
Observed workload exceeds predicted workload
Request drop rate exceeds a threshold
Repeated invocations may be needed
Policer: if incoming session rate > current capacity
Turn away excess sessions
Highly scalable policing

[World Wide Web'05]
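
The two triggers and the policing rule above can be written down as a minimal sketch. The function names and the default drop threshold are assumptions for illustration, and the admission rule shown is the plain "turn away the excess" logic, not the scalable policing design cited above.

```python
def reactor_should_add_servers(observed_rate, predicted_rate,
                               drop_rate, drop_threshold=0.01):
    """True if either reactive-provisioning condition holds."""
    return observed_rate > predicted_rate or drop_rate > drop_threshold

def police(incoming_sessions, capacity):
    """Admit sessions up to current capacity; return (admitted, turned_away)."""
    admitted = min(incoming_sessions, capacity)
    return admitted, incoming_sessions - admitted
```

Because one invocation may not add enough servers, a caller would re-evaluate `reactor_should_add_servers` every few minutes until both conditions clear.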
38
Prototype Data Center
Server Node: applications, request policer, resource monitoring, parameter estimation
Control Plane: application placement, dynamic provisioning

40 Linux servers
Gigabit switches
Multi-tier applications
Auction (RUBiS)
Bulletin-board (RUBBoS)
Apache, JBOSS (replicable)
Mysql database

39
Dynamic Capacity Provisioning

Auction application: RUBiS
Factor of 4 increase in 30 min

[Figures: workload, server allocations, and response time]

Server allocations increased to match increased
workload
Response time kept below 2 seconds

40
Talk Outline

Motivation
Data Center Models
Application Placement
Dynamic Capacity Provisioning
Summary and Future Research

41
Summary

Dynamic resource management in data centers
Application Placement
Improve utilization by under-provisioning
Dynamic Capacity Provisioning
Analytical model for Internet applications
Predictive provisioning
Reactive provisioning
Handling Extreme Overloads
Scalable policing

42
Future Research Directions
Focus: Large-scale emerging distributed systems

Virtual machine based hosting
Trade-off between fast switching and VM overheads
Malicious flash crowds, DoS attacks
Security mechanisms
Sensor networks
Constrained environment
How to provide desired performance to overlying
applications?
Mobile computing
Resource-deficient clients
How to design Internet servers for such clients?

43
Thank you!
More information at http://www.cs.umass.edu/~bhuvan
44
Agile Switching Using Virtual Machine Monitors

VMMs allow multiple virtual machines on a server
E.g., Xen, VMware

[Figure: two views of a server whose VMM hosts VM1, VM2, and VM3, with VMs marked active or dormant]

Use VMMs to enable fast switching of servers
Switching time only limited by residual sessions
