Dynamic Resource Management in Internet Hosting Platforms

Ph.D. Thesis Defense by Bhuvan Urgaonkar. Advisor: Prashant Shenoy.

Learn more at: https://www.cse.psu.edu

1
Dynamic Resource Management in Internet Hosting
Platforms

Ph.D. Thesis Defense
Bhuvan Urgaonkar
Advisor: Prashant Shenoy

2
Internet Applications

Proliferation of Internet applications

auction site
online game
online retail store

Growing significance in personal and business affairs
Focus: Internet server applications

3
Hosting Platforms

Data Centers
Clusters of servers
Storage devices
High-speed interconnect
Hosting platforms
Rent resources to third-party applications
Performance guarantees in return for revenue
Benefits
Applications don't need to maintain their own infrastructure
Rent server resources, possibly on demand
Platform provider generates revenue by renting
resources

4
Goals of a Hosting Platform

Meet service-level agreements
Satisfy application performance guarantees
E.g., average response time, throughput
Maximize revenue
E.g., maximize the number of hosted applications
Question: How should a hosting platform manage its resources to meet these goals?

5
Challenge 1: Dynamic Workloads

Multi-time-scale variations
Time-of-day, hour-of-day
Overloads
E.g., Flash crowds
User threshold for response time: 8-10 s
Key issue: How to provide good response time under varying workloads?

[Figure: two plots of request arrivals (Arrivals per min), one over Time (days) and one over Time (hours)]

6
Challenge 2: Complexity of Applications

Complex software architecture
Diverse software components
Web servers, Java application servers, databases
Multiple classes of clients
How to provide differentiated service?
Replicable components
How many replicas to have?
Tunable configuration parameters
E.g., MaxClients in Apache
How to set these parameters?
Key issue: How to capture all this complexity?

7
Talk Outline

Motivation
Thesis contributions
Application modeling
Dynamic provisioning
Scalable request policing
Conclusions

8
Hosting Platform Models

Small applications
Require only a fraction of a server
Shared Web hosting: $20/month to run your own Web site
Shared hosting: multiple applications on a server
Co-located applications compete for server resources

9
Hosting Platform Models

Large applications
May span multiple servers
eBay site uses thousands of servers!
Dedicated hosting: at most one application per server
Allocation at the granularity of a single server

10
Thesis Contributions

Dynamic resource management in hosting platforms
Shared Hosting
Statistical multiplexing and under-provisioning [OSDI 2002]
Application placement [PDCS 2004]
Dedicated Hosting
Analytical model for an Internet application [SIGMETRICS 2005]
Dynamic provisioning [Autonomic Computing 2005]
Scalable request policing [PODC 2004, WWW 2005]

11
Talk Outline

Motivation
Thesis contributions
Application modeling
Dynamic provisioning
Scalable request policing
Conclusions

12
Internet Application Architecture
[Figure: request processing in an online bookstore. A "search moby" request flows through the HTTP, J2EE, and Database tiers as queries; the response lists Melville's Moby Dick and music CDs by Moby]

Multi-tier architecture
Each tier uses services provided by its successor
Session-based workloads

13
Baseline Application Model [SIGMETRICS 2005]
[Figure: clients issuing requests to the application]

Model consists of two components
Sub-system to capture behavior of clients
Sub-system to capture request processing inside
the application

14
Modeling Clients
[Figure: clients 1..N, each with think time Z, modeled as queue Q0 in front of the application]

Clients think between successive requests
Infinite server system to capture think time Z
Captures independence of Z from processing in
application

15
Modeling Request Processing
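[Figure: tiers 1..M modeled as queues Q1..QM with service times S1..SM; N requests circulate among the queues according to transition probabilities p1, p2, p3, ...]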

Transitions defined to capture circulation of
requests
Request may move to next queue or previous queue
Multiple requests are processed concurrently at
tiers
Processor sharing scheduling discipline
Caching effects get captured implicitly!

16
Putting It All Together
[Figure: the complete model. Client queue Q0 with think time Z feeds tier queues Q1..QM (service times S1..SM); N sessions circulate through the network]

A closed-queuing model that captures a given
number of simultaneous sessions being served

17
Mean-value Analysis
[Figure: a client arriving in a network of n+1 clients sees A1(n+1)..AM(n+1) clients in queues Q1..QM, whose average lengths with n clients are L1(n)..LM(n)]

Product-form closed queuing network
$L_m$: average length of $Q_m$
$A_m$: average number of clients in $Q_m$ seen by an arriving client
Arrival theorem: $A_m(n+1) = L_m(n)$
Iterative algorithm to compute mean queue lengths and sojourn times
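
The iteration can be sketched in Python as follows. This is a minimal sketch of textbook exact MVA for a closed network with one delay station (think time Z) and M queueing stations, not the thesis code; the function name `mva` and its parameter names are illustrative assumptions.

```python
def mva(service_times, visit_ratios, think_time, n_sessions):
    """Exact mean-value analysis for a closed product-form network.

    service_times[i] and visit_ratios[i] describe tier i (illustrative names);
    think_time is Z; n_sessions is the number of concurrent sessions N.
    Returns (throughput, response_time, queue_lengths) at population N.
    """
    m = len(service_times)
    queue_len = [0.0] * m  # L_m(0) = 0: an empty network has empty queues
    throughput = 0.0
    response = 0.0
    for n in range(1, n_sessions + 1):
        # Arrival theorem: a client arriving with n clients in the network
        # sees the mean queue lengths of the network with n - 1 clients.
        residence = [
            visit_ratios[i] * service_times[i] * (1.0 + queue_len[i])
            for i in range(m)
        ]
        response = sum(residence)
        throughput = n / (think_time + response)
        # Little's law applied to each tier.
        queue_len = [throughput * r for r in residence]
    return throughput, response, queue_len
```

With a single tier, unit service time, no think time, and two sessions, the sketch returns a throughput of 1 and a response time of 2, as expected when two clients always share one saturated server.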

18
Parameter Estimation

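Visit ratios
Equivalent to transition probabilities for MVA
$V_i = \lambda_i / \lambda_{req}$, with $\lambda_{req}$ measured at the sentry and $\lambda_i$ taken from logs
Service times
Use residence time $X_i$ logged at tier $i$
For the last tier, $S_M = X_M$
For the other tiers, $S_i = X_i - (V_{i+1} / V_i) \, X_{i+1}$
Think time
Measured at the application sentry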
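
A small Python sketch of these estimates, assuming the formulas read $V_i = \lambda_i / \lambda_{req}$, $S_M = X_M$, and $S_i = X_i - (V_{i+1}/V_i) X_{i+1}$ (the operators are garbled in the transcript, so this reading is an assumption). The function names and argument layout are hypothetical.

```python
def estimate_visit_ratios(tier_rates, request_rate):
    """V_i = lambda_i / lambda_req (tier_rates from logs, request_rate at the sentry)."""
    return [rate / request_rate for rate in tier_rates]


def estimate_service_times(residence_times, visit_ratios):
    """Derive per-tier service times S_i from logged residence times X_i.

    The last tier calls no further tier, so S_M = X_M. For earlier tiers,
    the time spent waiting on the next tier is subtracted, weighted by how
    often the next tier is visited per visit to this one.
    """
    m = len(residence_times)
    service = [0.0] * m
    service[m - 1] = residence_times[m - 1]
    for i in range(m - 1):
        ratio = visit_ratios[i + 1] / visit_ratios[i]
        service[i] = residence_times[i] - ratio * residence_times[i + 1]
    return service
```

For example, residence times of 100 ms, 60 ms, and 20 ms with visit ratios 1.0, 0.5, and 0.5 give service times of 70 ms, 40 ms, and 20 ms.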

19
Evaluation of Baseline Model

Auction site: RUBiS
One server per tier

[Figure: evaluation of the baseline model for RUBiS running on Apache, JBOSS, and Mysql]

Concurrency limits not captured

20
Handling Concurrency Limits
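[Figure: the closed queuing model (Q0 with think time Z; Q1..QM with service times S1..SM; N sessions), with dropped requests leaving the tiers]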

Requests may be dropped due to concurrency limits
Need to model the finiteness of queues!

21
Handling Concurrency Limits
[Figure: the closed queuing model in which each tier queue Q1..QM gains a drop path, taken with probability p1drop..pMdrop, leading to a separate drop subsystem]

Approach: subsystems to capture dropped requests
Distinguish the processing of dropped requests

22
Estimating Drop Probabilities and Delay Values

Drop probability
Step 1: Estimate throughput using MVA, assuming no concurrency limits
Step 2: Estimate $p_i^{drop}$ as the drop probability of an M/M/1/$K_i$ queue
Delay value for tier $i$
Subject the application to an offline workload that causes the limit to be exceeded only at tier $i$; record the response time of failed requests
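
Step 2 relies on the standard blocking probability of an M/M/1/K queue: with $\rho = \lambda / \mu$, it is $(1 - \rho)\rho^K / (1 - \rho^{K+1})$ for $\rho \neq 1$ and $1/(K+1)$ for $\rho = 1$. The sketch below computes it in Python. The function and parameter names are illustrative, and how the MVA throughput from Step 1 is mapped to a per-tier arrival rate is not stated on the slide, so the caller supplies the rates.

```python
def mm1k_drop_probability(arrival_rate, service_rate, capacity):
    """Probability that an arrival finds an M/M/1/K queue full.

    capacity is K, the maximum number of requests in the system
    (here, the tier's concurrency limit K_i).
    """
    rho = arrival_rate / service_rate
    if abs(rho - 1.0) < 1e-12:
        # Limit of the general formula as rho -> 1.
        return 1.0 / (capacity + 1)
    return (1.0 - rho) * rho ** capacity / (1.0 - rho ** (capacity + 1))
```

As a sanity check, K = 1 reduces to $\rho / (1 + \rho)$, so an offered load of 0.5 yields a drop probability of 1/3.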

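[Figure: a tier with concurrency limit $K_i$; arriving requests split into an admitted fraction $1 - p_i^{drop}$ and a dropped fraction $p_i^{drop}$. Labels: High limit, Low limit]

23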
Response Time Prediction

Enhanced model can capture concurrency limits

24
Replication and Load Imbalances
[Figure: replicated Apache, JBOSS, and Mysql tiers]

Causes of imbalance
Sticky sessions
Variation in session durations and resource
requirements
Imbalance factor for the $j$th most-loaded replica of tier $i$
$\mathrm{imbalance}(i, j) = \mathrm{num\_arrivals}(i, j) / \mathrm{num\_arrivals}(i)$
Scale visit ratio
$V_{i,j} = V_i \cdot \mathrm{imbalance}(i, j)$
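
A short Python sketch of this scaling, assuming num_arrivals(i) denotes the total arrivals at tier i summed over its replicas (the slide does not define it further). The function names are illustrative.

```python
def imbalance_factors(arrivals_per_replica):
    """Fraction of a tier's arrivals handled by each replica.

    Replicas are ranked from most loaded to least loaded, so the first
    entry is the factor for the most-loaded replica (j = 1).
    """
    total = sum(arrivals_per_replica)
    ranked = sorted(arrivals_per_replica, reverse=True)
    return [count / total for count in ranked]


def scaled_visit_ratios(visit_ratio, arrivals_per_replica):
    """V_{i,j} = V_i * imbalance(i, j) for every replica j of tier i."""
    return [visit_ratio * f for f in imbalance_factors(arrivals_per_replica)]
```

For three replicas that saw 200, 500, and 300 arrivals, the factors are 0.5, 0.3, and 0.2, and a tier visit ratio of 2.0 is split into 1.0, 0.6, and 0.4.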

25
Capturing Load Imbalance
[Figure: two panels over Time (sec): "Number of requests (per-replica)" and "Response times (based on load)", the latter with axis Avg. resp. time (msec). Legend entries: Replica 1, Replica 2, Replica 3, Average; Least loaded, Medium loaded, Most loaded; Observed, Perfect Load balancing, Enhanced Model]

Session affinity causes load imbalance
Imbalance shifts among replicas
Our enhancement helps improve response time prediction

26
Talk Outline

Motivation
Thesis contributions
Application modeling
Dynamic provisioning
Scalable request policing
Conclusions

27
Dynamic Provisioning [Autonomic Computing 2005]
[Figure: control loop: monitor workload, compute current/future demand, adjust allocation]

Key idea: increase or decrease allocated servers to handle workload fluctuations
Monitor incoming workload
Compute current or future demand
Match number of allocated servers to demand

28
Dynamic Provisioning at Multiple Time-scales

Predictive provisioning
Certain Internet workload patterns can be predicted
E.g., time-of-day effects, increased workload
during Thanksgiving
Provision using model at time-scale of hours or
days
Reactive provisioning
Applications may see unpredictable fluctuations
E.g., Increased workload to news-sites after an
earthquake
Detect such anomalies and react fast (minutes)
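
How the two time-scales might interact can be sketched in Python. This is an illustration under stated assumptions, not the thesis algorithm: `per_server_capacity` (requests per second one server can sustain) and the 20% `tolerance` that triggers the reactive correction are hypothetical parameters, and the thesis derives capacity from its application model rather than from a fixed constant.

```python
import math


def servers_needed(arrival_rate, per_server_capacity):
    """Smallest number of servers whose combined capacity covers the rate."""
    return max(1, math.ceil(arrival_rate / per_server_capacity))


def provision(predicted_rate, observed_rate, per_server_capacity, tolerance=0.2):
    """Predictive allocation with a reactive correction.

    The predicted rate (long time-scale) sets the baseline. If the observed
    rate (short time-scale) exceeds the prediction by more than the
    tolerance, the observed rate is used instead.
    """
    target = predicted_rate
    if observed_rate > predicted_rate * (1.0 + tolerance):
        target = observed_rate  # anomaly: react to the measured workload
    return servers_needed(target, per_server_capacity)
```

With a prediction of 1000 req/s and 250 req/s per server, the sketch allocates 4 servers, and raises that to 6 if 1500 req/s is actually observed.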

29
Request Policing
[Figure: the sentry polices incoming requests and drops the excess]

Key idea: if the incoming request rate > current capacity, turn away excess requests
Why police when you can provision?
Provisioning is not instantaneous
Residual sessions on reallocated server
Application and OS installation and configuration
overheads
Overhead of several (5-30) minutes

30
Existing Work

Lots of existing work on request policing
Kanodia00, Li00, Verma03, Welsh03, Abdelzaher99,
Shortcomings of existing work
Does not attempt to integrate policing and
provisioning
Does not address scalability of the policer!
The policer itself may become the bottleneck
during overloads

31
Policer Design Goals

Each class should sustain its guaranteed
admission rate
Class-based differentiation and revenue
maximization
Challenging due to online nature of the problem
An admitted request may cause a more important
request arriving later to be dropped
Approach: preferential admission to higher-class requests
Scalability
The policer should remain operational even under
extremely high arrival rates

32
Overview of Policer Design [PODC 2004, WWW 2005]
[Figure: policer pipeline. A classifier feeds per-class leaky buckets and class-specific queues for the gold, silver, and bronze classes, with delays dgold, dsilver, and dbronze; admission control then admits or drops requests]

Our policer has three components
Request classifier and per-class leaky buckets
Class-specific queues
Admission control

33
Class-based Differentiation
[Figure: policer pipeline (classifier, leaky buckets, class-specific queues, admission control)]

Each incoming request undergoes classification
Per-class leaky buckets used to ensure that rates
guaranteed in SLA are admitted
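
A per-class bucket can be sketched in Python as a leaky bucket used as a meter. This is a generic textbook formulation, assuming one bucket per class with the SLA rate as the drain rate; the class name, the `burst` depth, and the explicit `now` timestamps are illustrative choices, and what happens to non-conforming requests is left to the caller because the slide does not say.

```python
class LeakyBucket:
    """Leaky bucket meter: drains at `rate` requests/s, holds up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate    # guaranteed admission rate for the class
        self.burst = burst  # bucket depth, in requests
        self.level = 0.0    # current fill level
        self.last = 0.0     # time of the previous arrival, in seconds

    def conforms(self, now):
        """Return True if a request arriving at time `now` is within the rate."""
        # Drain the bucket for the time elapsed since the last arrival.
        self.level = max(0.0, self.level - (now - self.last) * self.rate)
        self.last = now
        if self.level + 1.0 <= self.burst:
            self.level += 1.0
            return True
        return False
```

A bucket with rate 2 and depth 2 accepts two back-to-back requests, rejects a third at the same instant, and accepts one more half a second later.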

34
Revenue Maximization
[Figure: policer pipeline (classifier, leaky buckets, class-specific queues, admission control)]

Idea: different delays in processing requests of different classes
More important requests are processed more frequently
Methodology to compute delay values in an online manner
Bounds the probability of a request denying admission to a more important request [Appendix B of thesis]
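
The effect of class-specific delays can be illustrated with a small Python sketch that lists when each class queue would be processed if every class were revisited once per delay period. The delay values are taken as given here; the online method for computing them is in the thesis and is not reproduced. The function name and the example delays are hypothetical.

```python
import heapq


def processing_schedule(delays, horizon):
    """List (time, class) processing rounds up to `horizon`.

    delays maps a class name to the delay between successive rounds of
    processing for that class. A smaller delay means more frequent service.
    Rounds that fall at the same time are ordered by class name.
    """
    heap = [(delay, name) for name, delay in delays.items()]
    heapq.heapify(heap)
    schedule = []
    while heap and heap[0][0] <= horizon:
        when, name = heapq.heappop(heap)
        schedule.append((when, name))
        heapq.heappush(heap, (when + delays[name], name))
    return schedule
```

With delays of 1, 2, and 4 time units for gold, silver, and bronze, a horizon of 4 units gives gold four rounds, silver two, and bronze one.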

35
Admission Control
[Figure: policer pipeline (classifier, leaky buckets, class-specific queues, admission control)]

Goal: ensure that an admitted request meets its response time target
Measurement-based admission control algorithm
Use information about current load on servers and
estimated size of new request to make decision
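
One very simple form such a test could take is sketched below in Python. It assumes the measured load is expressed as seconds of work already queued at a server and that requests are served in arrival order; both are simplifying assumptions for illustration, and the function and parameter names are hypothetical rather than the thesis algorithm.

```python
def admit_request(backlog_seconds, request_size_seconds, response_time_target):
    """Admit only if the request is predicted to finish within its target.

    backlog_seconds: measured work already queued at the server
    request_size_seconds: estimated service demand of the new request
    response_time_target: the class's response time target, in seconds
    """
    predicted_response = backlog_seconds + request_size_seconds
    return predicted_response <= response_time_target
```

With a 0.5 s target, a 0.1 s request is admitted behind 0.3 s of backlog but rejected behind 0.45 s.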

36
Scalability of Admission Control

Idea 1: Reduce the per-request admission control cost
Admission control on every request may be expensive
Bursty arrivals during overloads => batches get formed
Delays for class-based differentiation => batches get formed
Admission control that operates on batches instead of requests
Idea 2: Trade accuracy for lower computational overhead
When batch-based processing becomes prohibitive
Threshold-based scheme
E.g., admit all Gold requests, drop all Silver and Bronze requests
Thresholds chosen based on observed arrival rates and service times
Extremely efficient
Wrong threshold => bad response times or fewer requests admitted
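
A threshold of this kind could be chosen as in the Python sketch below: walk the classes in decreasing importance and admit whole classes while the cumulative offered load (arrival rate times mean service time) fits the capacity. This is an assumed selection rule for illustration; the slide only states that thresholds are based on observed arrival rates and service times. The function name and the capacity units (number of busy servers) are hypothetical.

```python
def choose_threshold(classes, capacity):
    """Return the names of the classes to admit wholesale.

    classes: list of (name, arrival_rate, mean_service_time) tuples,
             ordered from most to least important
    capacity: offered load the servers can sustain, in busy-server units
    """
    admitted = []
    load = 0.0
    for name, rate, service_time in classes:
        load += rate * service_time  # offered load added by this class
        if load > capacity:
            break  # this class and all less important ones are dropped
        admitted.append(name)
    return admitted
```

With gold, silver, and bronze offering loads of 0.5, 0.8, and 1.6 respectively, a capacity of 1.0 admits only gold, matching the slide's example, while a capacity of 1.5 admits gold and silver.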

37
Scaling Even Further

Protocol processing overheads will saturate
sentry resources at extremely high arrival rates
Indiscriminate dropping of requests will occur
Important requests may be turned away without
even undergoing the admission control test
Loss in revenue!
Sentry should still be able to process each
arriving request!
Idea: dynamic capacity provisioning for the sentry
Pull in an additional sentry if CPU utilization of existing sentries exceeds a threshold (e.g., 90%)
Round-robin DNS to load balance among sentries
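
The scaling rule and the round-robin rotation can be sketched in Python. The sketch assumes the threshold applies to the mean CPU utilization across sentries (the slide does not say whether it is the mean or any single sentry) and models round-robin DNS as a simple cyclic iterator; all names are hypothetical.

```python
from itertools import cycle


def scale_sentries(sentries, utilization, spare, threshold=0.90):
    """Return the sentry list, extended by one spare if the pool is overloaded.

    sentries: names of active sentries
    utilization: dict mapping sentry name to CPU utilization in [0, 1]
    spare: names of servers that can be pulled in as sentries
    """
    mean_util = sum(utilization[s] for s in sentries) / len(sentries)
    if mean_util > threshold and spare:
        return list(sentries) + [spare[0]]
    return list(sentries)


def round_robin(sentries):
    """Yield sentries in rotation, as round-robin DNS would hand them out."""
    return cycle(sentries)
```

Two sentries at 95% and 93% utilization trigger the addition of a third, after which new clients are directed to the three sentries in turn.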

38
Class-based Differentiation

Three classes of requests: Gold, Silver, Bronze
Policer successful in providing preferential
admission to important requests

39
Threshold-based: Higher Scalability

Threshold-based processing allows the policer to handle up to 4 times higher arrival rate
A single sentry can handle about 19,000 req/s

40
Threshold-based: Loss of Accuracy

Higher scalability comes at the cost of accuracy in admission control
More violations of response time targets

41
Talk Outline

Motivation
Thesis contributions
Application modeling
Dynamic provisioning
Scalable request policing
Summary and Future Research

42
Thesis Contributions

Dynamic resource management in hosting platforms
Shared Hosting
Statistical multiplexing and under-provisioning [OSDI 2002]
Application placement [PDCS 2004]
Dedicated Hosting
Analytical model for Internet applications [SIGMETRICS 2005]
Dynamic provisioning [Autonomic Computing 2005]
Scalable request policing [PODC 2004, WWW 2005]

43
Future Research Directions

Virtual machine based hosting
Recent research has shown feasibility of
migrating VMs across nodes
Adds a new dimension to the capacity provisioning
problem
Characterizing multi-tier workloads
Workloads for standalone Web servers are
well-characterized
E.g., what are typical service times at the Java tier, or typical query processing times?
Offshoot of this study: workload generators for multi-tier applications
Automated determination of provisioning parameters
Predictor and reactor invoked based on manually chosen frequencies
System administrators use rules of thumb => error-prone

44
Thanks to

Advisor
Prashant Shenoy
Thesis committee
Emery Berger, Jim Kurose, Don Towsley, Tilman
Wolf
Collaborators
Abhishek Chandra, Pawan Goyal, Giovanni
Pacifici, Timothy Roscoe, Arnold
Rosenberg, Mike Spreitzer, Asser Tantawi
All my teachers
Paul Cohen, Mani Krishna, Don Towsley
Friends and family

Questions or comments?

46
Query Caching at the Database

Caching effects
Captured by tuning $V_i$ and/or $S_i$

Bulletin-board site: RUBBoS
50 sessions
SELECT SQL_NO_CACHE causes Mysql to not cache the response to a query

47
Agile Switching Using Virtual Machine Monitors

VMMs allow multiple virtual machines on a server
E.g., Xen, VMware

[Figure: two servers, each running VM1, VM2, and VM3 on a VMM, with VMs marked active or dormant]
Use VMMs to enable fast switching of servers
Switching time only limited by residual sessions

48
Prototype Data Center
[Figure: server nodes host application capsules and sentries and perform resource monitoring and parameter estimation; the control plane performs application placement and dynamic provisioning]

40 Linux servers
Gigabit switches
Multi-tier applications
Auction (RUBiS)
Bulletin-board (RUBBoS)
Apache, JBOSS (replicable)
Mysql database

49
Sentry Provisioning (XXX)

50
System Overview
[Figure: server nodes host application capsules and sentries and perform resource monitoring and parameter estimation; the control plane performs application placement and dynamic provisioning]