Title: Dynamic Resource Management in Internet Data Centers
1Dynamic Resource Management in Internet Data Centers
- Bhuvan Urgaonkar
- Laboratory for Advanced Systems Software
- University of Massachusetts Amherst
- http://www.cs.umass.edu/bhuvan
2Internet Applications
- Proliferation of Internet applications
auction site
online game
online store
- Growing significance in personal and business affairs
- Focus: Internet server applications
3Internet Workloads Are Dynamic
- Multi-time-scale variations
- Time-of-day, hour-of-day
- Flash crowds
- User threshold for response time: 8-10 s
- Key issue: Provide good response time under varying workloads
[Figure: two workload traces of arrivals per min, one over time in days (0 to 5, y-axis up to 1200) and one over time in hours (0 to 24, y-axis up to 140K)]
4Data Centers
- Clusters of servers
- Hosting platforms
- Rent resources to third-party applications
- Performance guarantees in return for revenue
- Benefits
- Applications don't need to maintain their own infrastructure
- Rent server resources, possibly on demand
- Platform provider generates revenue by renting resources
5Goals of a Data Center
- Satisfy application performance guarantees under dynamic workloads
- E.g., average response time, throughput
- Maximize resource utilization
- E.g., maximize the number of hosted applications
- Question: How should a data center manage its resources to meet these goals?
6Manual Resource Allocation
WC Soccer 1998
- Resource over-provisioning
- Resource wastage
- A bad estimate could result in under-allocation
- Manual reallocation
- Slow allocation time
Challenge: How to handle dynamic workloads while efficiently utilizing resources?
7Dynamic Resource Management
- How to map an application to servers in the data center?
- How to provide good performance under dynamic workloads?
- How to remain operational under extreme overloads?
Application Placement OSDI02, PDCS04
Dynamic Capacity Provisioning Auto Computing05
Scalable Policing World Wide Web05
9Talk Outline
- Motivation
- Data Center Models
- Application Placement
- Dynamic Capacity Provisioning
- Summary and Future Research
10Data Center Models
- Small applications
- Require only a fraction of a server
- Shared Web hosting, $20/month to run own Web site
- Shared hosting: multiple applications on a server
- Co-located applications compete for server resources
11Data Center Models
- Large applications
- May span multiple servers
- eBay site uses thousands of servers!
- Dedicated hosting: at most one application per server
- Allocation at the granularity of a single server
12Application Placement
OSDI02
- How to map an application to servers in the data center?
- Step 1: Finding the application's resource requirement
- Automatic requirement inference technique
- Step 2: Identifying servers to host the application
- Easy in dedicated hosting
- Just assign the desired number of available servers!
- Non-trivial in shared hosting
- Opportunity for statistical multiplexing of resources on a server
- Multi-dimensional Knapsack
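Step 2 in shared hosting can be pictured as packing applications onto servers along more than one resource dimension. The following is a minimal first-fit sketch under stated assumptions, not the placement algorithm from the cited papers: the names `App`, `Server`, and `first_fit_place` are hypothetical, and CPU and network requirements are assumed to be normalized to fractions of one server.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class App:
    name: str
    cpu: float  # required CPU share, as a fraction of one server (assumed unit)
    net: float  # required network share, as a fraction of one server (assumed unit)


@dataclass
class Server:
    name: str
    cpu_free: float = 1.0
    net_free: float = 1.0
    apps: List[str] = field(default_factory=list)

    def fits(self, app: App) -> bool:
        # An application fits only if every resource dimension has room.
        return app.cpu <= self.cpu_free + 1e-9 and app.net <= self.net_free + 1e-9

    def host(self, app: App) -> None:
        self.cpu_free -= app.cpu
        self.net_free -= app.net
        self.apps.append(app.name)


def first_fit_place(apps: List[App], servers: List[Server]) -> Dict[str, Optional[str]]:
    """Greedy first-fit: put each application on the first server with room."""
    placement: Dict[str, Optional[str]] = {}
    for app in apps:
        placement[app.name] = None  # None means no server could host it
        for server in servers:
            if server.fits(app):
                server.host(app)
                placement[app.name] = server.name
                break
    return placement
```

First-fit is only one simple heuristic for this kind of packing; it is used here because it makes the two-dimensional fit test easy to see.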
13Resource Requirement Inference
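[Figure: resource usage modeled as an ON-OFF process over a measurement interval (time axis), and the resulting CDF of cumulative probability versus fractional usage (0 to 1), with the 0.99 level and points A and B marked]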
14Requirement Inference Technique
- Profiling: the process of determining resource usage
- Run the application on an isolated server
- Subject the application to a real workload
- Determine CPU and network usage
- Use the Linux trace toolkit Yaghmour00
- Track scheduling events, packet transmission times
- Implementation on a Linux cluster
- Apache Web server using SPECWeb99
- Streaming media server with VBR MPEG-1 clients
- Postgres database server
- Quake game server
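The profiling step can be illustrated with a minimal sketch that turns a trace of busy (ON) periods into the fractional usage seen in each measurement interval. It assumes the trace has already been reduced to non-overlapping `(start, end)` pairs; the function name, the tuple format, and the interval length are hypothetical and do not reflect the Linux trace toolkit's own output format.

```python
from typing import List, Tuple


def fractional_usage(busy: List[Tuple[float, float]],
                     duration: float,
                     interval: float) -> List[float]:
    """Fraction of each measurement interval during which the resource was ON.

    busy     -- non-overlapping (start, end) ON periods, in seconds
    duration -- total length of the profiling run, in seconds
    interval -- length of one measurement interval, in seconds
    """
    n = int(duration // interval)
    usage = [0.0] * n
    for start, end in busy:
        for i in range(n):
            lo, hi = i * interval, (i + 1) * interval
            # Length of the ON period that falls inside interval i.
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                usage[i] += overlap / interval
    return usage
```

Sorting the resulting samples gives the empirical distribution of fractional usage from which a high percentile can be read.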
15Application Profiles
- Observation: Resource usage can be bursty
- Peak requirement much higher than a high percentile
- Insight: Provisioning for the tail can save resources!
- Under-provisioning of resources
- Occasional violations of resource guarantees
16Controlled Resource Under-provisioning
- Allow applications to specify a violation tolerance V
- Provision for the (100-V)th percentile of resource usage
- Requirements do not necessarily peak simultaneously
- Probability of violations even less than V
- Similar to resource overbooking in the airline industry
- Determine which servers have enough capacity
- $s_k$: (100-V)th percentile for application $k$; $C$: server capacity
- $\sum_k s_k^{cpu} \le C^{cpu}$ and $\sum_k s_k^{net} \le C^{net}$
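The capacity test on this slide can be sketched in a few lines. This is a minimal illustration under stated assumptions: usage samples are fractions of one server, the percentile is computed with the nearest-rank method (a choice made here, not one stated in the talk), and the function names are hypothetical.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def requirement(samples: Sequence[float], violation_tolerance: float) -> float:
    """Provision for the (100 - V)th percentile of observed usage."""
    return percentile(samples, 100 - violation_tolerance)


def server_can_host(cpu_reqs: Sequence[float],
                    net_reqs: Sequence[float],
                    cpu_capacity: float = 1.0,
                    net_capacity: float = 1.0) -> bool:
    """True if the summed requirements fit within capacity in both dimensions."""
    return sum(cpu_reqs) <= cpu_capacity and sum(net_reqs) <= net_capacity
```

For a bursty profile such as 98 samples at 0.1 and two samples near 1.0, a tolerance of V = 2 yields a requirement of 0.1 instead of the peak of 1.0, which is where the savings come from.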
17Resource Utilization Gains
[Figure: two panels, "Placement of Apache Web Servers" (web servers placed, 0 to 1400, versus data center size, 0 to 140) and "Placement of Streaming Media Servers" (media servers placed, 0 to 350, versus data center size, 0 to 140), each comparing the "No Viol" and "Viol1" settings]
- 1% violations can more than double the number of applications!
- Small under-provisioning can yield large gains
- Bursty applications yield larger benefits
18Impact of Under-provisioning on Application Performance
- Provisioning for the tail results in tolerable degradation
- Large resource savings possible with small degradation
19Application Placement Summary
- Server applications tend to have bursty usage
- Save resources in shared data centers running small applications
- Determine resource usage behavior
- Under-provision resources
- Controlled performance degradation
- Theoretical properties of application placement
- NP-hard, approximation algorithms
OSDI02
PDCS04
20Talk Outline
- Motivation
- Data Center Models
- Application Placement
- Dynamic Capacity Provisioning
- Summary and Future Research
21Dynamic Capacity Provisioning
Monitor workload
Compute future demand
Adjust allocation
- Key idea: increase or decrease allocated resources to handle workload fluctuations
- To handle increased workload
- Shared hosting: increase resource share
- Dedicated hosting: start replicas on additional servers
- Focus: Dedicated hosting, large applications
Chandra03, Chase01
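One pass of the monitor, compute, adjust cycle for dedicated hosting might look like the sketch below. It is illustrative only: the predictor (the highest recently observed rate) and the fixed per-server capacity are placeholder assumptions, not the prediction or modeling techniques used in the talk, and all function names are hypothetical.

```python
import math
from typing import Sequence


def predict_demand(recent_rates: Sequence[float]) -> float:
    """Placeholder predictor (an assumption): the highest recently observed rate."""
    return max(recent_rates)


def servers_needed(predicted_rate: float, per_server_capacity: float) -> int:
    """Replicas required so that total capacity covers the predicted rate."""
    return max(1, math.ceil(predicted_rate / per_server_capacity))


def adjust_allocation(current_servers: int,
                      recent_rates: Sequence[float],
                      per_server_capacity: float) -> int:
    """Change in server count: positive to add replicas, negative to release servers."""
    target = servers_needed(predict_demand(recent_rates), per_server_capacity)
    return target - current_servers
```

For example, with two servers allocated, recent rates of 100, 250, and 400 requests/s, and an assumed capacity of 100 requests/s per server, the sketch asks for two additional replicas.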
22Dynamic Capacity Provisioning
Servers
23Dynamic Capacity Provisioning
Predictors
Allocator
Monitor
Servers
24Internet Application Architecture
[Figure: request processing in an online bookstore. A "search moby" request passes through the HTTP, J2EE, and Database tiers as queries; the response lists Melville's Moby Dick and music CDs by Moby]
- Multi-tier architecture
- Each tier uses services provided by its successor
- Session-based workloads
- Caching, replication
25 Existing Application Models
- Models for Web servers Chandra03, Doyle03
- Do not model Java server, database etc.
- Black-box models Kamra04, Ranjan02
- Unaware of bottleneck tier
- Extensions of single-tier models Welsh03
- Fail to capture interactions between tiers
- Existing models are inadequate for multi-tier Internet applications
26Baseline Application Model
SIGMETRICS05
clients
application
- Model consists of two components
- Sub-system to capture behavior of clients
- Sub-system to capture request processing inside
the application
27Modeling Clients
[Figure: clients 1 through N, each with think time Z, represented by station Q0 in front of the application]
- Clients think between successive requests
- Infinite server system to capture think time Z
- Captures independence of Z from processing in
application
28Modeling Request Processing
[Figure: tiers 1 through M as queues Q1, Q2, ..., QM with service times S1, S2, ..., SM, transitions labeled p1, p2, p3, pM1, and N sessions]
- Transitions defined to capture circulation of requests
- Request may move to next queue or previous queue
- Multiple requests are processed concurrently at tiers
- Processor sharing scheduling discipline
- Caching effects get captured implicitly!
29Putting It All Together
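[Figure: the combined model. Clients with think time Z at Q0 feed tiers 1 through M (queues Q1, Q2, ..., QM; service times S1, S2, ..., SM; transitions p1, p2, p3, pM1), with N sessions circulating]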
- A closed-queuing model that captures a given number of simultaneous sessions being served
30Model Solution and Parameter Estimation
SIGMETRICS05
- Mean Value Analysis (MVA) Algorithm
- Computes mean response time
- Visit ratios
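- Equivalent to transition probabilities for MVA
- $V_i \approx \lambda_i / \lambda_{req}$; $\lambda_{req}$ measured at the policer, $\lambda_i$ from logs
- Service times
- Use residence time $X_i$ logged at tier $i$
- For the last tier, $S_M \approx X_M$
- $S_i = X_i - (V_{i+1} / V_i) \, X_{i+1}$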
- Think time
- Measured at the entry point of application
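The estimation and solution steps on this slide can be sketched together. The code below is a minimal illustration, assuming the textbook exact MVA recursion for a closed network with one delay station (think time) and one queue per tier, and reading the service-time bullet as S_i = X_i - (V_{i+1}/V_i) X_{i+1} with S_M = X_M for the last tier. The function names are hypothetical, and the model enhancements mentioned later in the talk (concurrency limits, replication, session classes) are not covered.

```python
from typing import List, Sequence, Tuple


def estimate_service_times(residence: Sequence[float],
                           visits: Sequence[float]) -> List[float]:
    """Per-tier service times from logged residence times X_i and visit ratios V_i."""
    m = len(residence)
    service = []
    for i in range(m):
        if i == m - 1:
            # Last tier: residence time is taken as the service time.
            service.append(residence[i])
        else:
            # Remove the time spent waiting on the next tier.
            service.append(residence[i] - (visits[i + 1] / visits[i]) * residence[i + 1])
    return service


def mva(service: Sequence[float],
        visits: Sequence[float],
        think_time: float,
        sessions: int) -> Tuple[float, float]:
    """Exact MVA for a closed network: returns (mean response time, throughput)."""
    demand = [v * s for v, s in zip(visits, service)]  # service demand per request
    queue = [0.0] * len(demand)                        # mean queue length per tier
    response, throughput = 0.0, 0.0
    for n in range(1, sessions + 1):
        # An arriving request sees the queue lengths of a network with n - 1 sessions.
        residence = [d * (1.0 + q) for d, q in zip(demand, queue)]
        response = sum(residence)
        throughput = n / (think_time + response)
        queue = [throughput * r for r in residence]
    return response, throughput
```

A quick sanity check: with a single tier, unit service demand, zero think time, and three sessions, all three sessions queue at that tier, so the mean response time is 3 and the throughput is 1.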
31Evaluation of Baseline Model
- Auction site RUBiS
- One server per tier: Apache, JBOSS, Mysql
75
150
- Concurrency limits not captured
32Handling Concurrency Limits
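[Figure: the closed queueing model (Q0 with think time Z; Q1, Q2, ..., QM with service times S1, S2, ..., SM; N sessions), with dropped requests leaving the tier queues]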
- Requests may be dropped due to concurrency limits
- Need to model the finiteness of queues!
33Handling Concurrency Limits
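[Figure: the closed queueing model (Q0 with think time Z; Q1, Q2, ..., QM; S1, S2, ..., SM; N sessions) extended with a drop subsystem for tiers 1 through M, labeled with drop queues, drop probabilities (p1 through pM), and drop service times (S1 through SM)]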
- Approach Subsystems to capture dropped requests
- Distinguish the processing of dropped requests
34Response Time Prediction
- Enhanced model can capture concurrency limits
35Query Caching at the Database
- Caching effects
- Captured by tuning Vi and/or Si
- Bulletin-board site RUBBoS
- 50 sessions
- SELECT SQL_NO_CACHE causes Mysql to not cache the response to a query
- More model enhancements
- Replication at tiers
- Multiple session classes
36Dynamic Capacity Provisioning
Servers
37Handling Unanticipated Workloads
- Allocation for an application may be insufficient
- Short-term fluctuations are difficult to predict
- Errors in parameter estimation may cause under-allocation
- Reactor: Allocate additional servers over a time scale of a few minutes if
- Observed workload exceeds predicted workload
- Request drop rate exceeds a threshold
- Repeated invocations may be needed
- Policer: If incoming session rate > current capacity
- Turn away excess sessions
- Highly scalable policing
World Wide Web05
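The two triggers for the reactor and the admission rule for the policer can be written down directly. This is a minimal sketch of the decision logic only, with hypothetical function and parameter names; it does not attempt to reproduce the scalable policing design from the cited paper.

```python
from typing import Tuple


def reactor_should_add_servers(observed_rate: float,
                               predicted_rate: float,
                               drop_rate: float,
                               drop_threshold: float) -> bool:
    """Reactor: trigger if the workload exceeds the prediction or too many requests are dropped."""
    return observed_rate > predicted_rate or drop_rate > drop_threshold


def police_sessions(incoming_rate: float, capacity: float) -> Tuple[float, float]:
    """Policer: admit sessions up to current capacity and turn away the excess.

    Returns (admitted_rate, turned_away_rate).
    """
    admitted = min(incoming_rate, capacity)
    return admitted, incoming_rate - admitted
```

Because a single invocation may not add enough capacity, a caller would re-evaluate `reactor_should_add_servers` every few minutes while the policer shields the servers that are already allocated.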
38Prototype Data Center
[Figure: prototype architecture. Server node: applications, request policer, resource monitoring, parameter estimation. Control plane: application placement, dynamic provisioning]
- 40 Linux servers
- Gigabit switches
- Multi-tier applications
- Auction (RUBiS)
- Bulletin-board (RUBBoS)
- Apache, JBOSS (replicable)
- Mysql database
39Dynamic Capacity Provisioning
- Auction application RUBiS
- Factor of 4 increase in 30 min
[Figure: panels showing server allocations, workload, and response time]
- Server allocations increased to match increased workload
- Response time kept below 2 seconds
40Talk Outline
- Motivation
- Data Center Models
- Application Placement
- Dynamic Capacity Provisioning
- Summary and Future Research
41Summary
- Dynamic resource management in data centers
- Application Placement
- Improve utilization by under-provisioning
- Dynamic Capacity Provisioning
- Analytical model for Internet applications
- Predictive provisioning
- Reactive provisioning
- Handling Extreme Overloads
- Scalable policing
42Future Research Directions
Focus Large-scale emerging distributed systems
- Virtual machine based hosting
- Trade-off between fast switching and VM overheads
- Malicious flash crowds, DoS attacks
- Security mechanisms
- Sensor networks
- Constrained environment
- How to provide desired performance to overlying applications?
- Mobile computing
- Resource-deficient clients
- How to design Internet servers for such clients?
43Thank you!
More information at http://www.cs.umass.edu/bhuvan
44Agile Switching Using Virtual Machine Monitors
- VMMs allow multiple virtual machines on a server
- E.g., Xen, VMWare,
[Figure: two servers, each running a VMM that hosts VM1, VM2, and VM3, with VMs marked active or dormant]
- Use VMMs to enable fast switching of servers
- Switching time only limited by residual sessions