Dynamic Resource Management in Internet Data Centers - PowerPoint PPT Presentation

1
Dynamic Resource Management in Internet Data Centers
  • Bhuvan Urgaonkar
  • Laboratory for Advanced Systems Software
  • University of Massachusetts Amherst
  • http://www.cs.umass.edu/~bhuvan

2
Internet Applications
  • Proliferation of Internet applications
  • E.g., auction site, online game, online store
  • Growing significance in personal and business affairs
  • Focus: Internet server applications

3
Internet Workloads Are Dynamic
  • Multi-time-scale variations
  • Time-of-day, hour-of-day
  • Flash crowds
  • User threshold for response time: 8-10 s
  • Key issue: Provide good response time under varying workloads

[Figure: two workload traces. First plot: y-axis 0 to 1200, x-axis time (days), 0 to 5. Second plot: y-axis arrivals per min, 0 to 140K, x-axis time (hours), 0 to 24.]
4
Data Centers
  • Clusters of servers
  • Hosting platforms
  • Rent resources to third-party applications
  • Performance guarantees in return for revenue
  • Benefits
  • Applications don't need to maintain their own infrastructure
  • Rent server resources, possibly on demand
  • Platform provider generates revenue by renting
    resources

5
Goals of a Data Center
  • Satisfy application performance guarantees under
    dynamic workloads
  • E.g., average response time, throughput
  • Maximize resource utilization
  • E.g., maximize the number of hosted applications
  • Question: How should a data center manage its resources to meet these goals?

6
Manual Resource Allocation
[Figure: workload of the World Cup (WC) Soccer 1998 site]
  • Resource over-provisioning
  • Resource wastage
  • A bad estimate could result in under-allocation
  • Manual reallocation
  • Slow allocation time

Challenge: How to handle dynamic workloads while efficiently utilizing resources?
7
Dynamic Resource Management
  • How to map an application to servers in the data
    center?
  • How to provide good performance under dynamic
    workloads?
  • How to remain operational under extreme overloads?

Application Placement [OSDI '02, PDCS '04]
Dynamic Capacity Provisioning [Auto Computing '05]
Scalable Policing [World Wide Web '05]
9
Talk Outline
  • Motivation
  • Data Center Models
  • Application Placement
  • Dynamic Capacity Provisioning
  • Summary and Future Research

10
Data Center Models
  • Small applications
  • Require only a fraction of a server
  • Shared Web hosting: $20/month to run your own Web site
  • Shared hosting: multiple applications on a server
  • Co-located applications compete for server
    resources

11
Data Center Models
  • Large applications
  • May span multiple servers
  • eBay site uses thousands of servers!
  • Dedicated hosting: at most one application per server
  • Allocation at the granularity of a single server

12
Application Placement
[OSDI '02]
  • How to map an application to servers in the data center?
  • Step 1: Finding the application's resource requirements
  • Automatic requirement inference technique
  • Step 2: Identifying servers to host the application
  • Easy in dedicated hosting
  • Just assign the desired number of available
    servers!
  • Non-trivial in shared hosting
  • Opportunity for statistical multiplexing of
    resources on a server
  • Multi-dimensional Knapsack

13
Resource Requirement Inference
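[Figure: resource usage viewed as an ON-OFF process over time, divided into measurement intervals, and the resulting CDF of fractional usage (x-axis: fractional usage, 0 to 1; y-axis: cumulative probability, with the 0.99 level and points A and B marked)]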
14
Requirement Inference Technique
  • Profiling: the process of determining resource usage
  • Run the application on an isolated server
  • Subject the application to a real workload
  • Determine CPU and network usage
  • Use the Linux Trace Toolkit [Yaghmour00]
  • Track scheduling events and packet transmission times
  • Implementation on a Linux cluster
  • Apache Web server using SPECWeb99
  • Streaming media server with VBR MPEG-1 clients
  • Postgres database server
  • Quake game server
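
As a rough illustration of the profiling step, the sketch below turns a list of busy (ON) periods into per-interval fractional usage and reads a percentile off the resulting samples. The trace format, the function names, and the nearest-rank percentile rule are assumptions made for this example; they are not taken from the talk or from the Linux Trace Toolkit.

```python
import math
from typing import List, Tuple


def fractional_usage(on_periods: List[Tuple[float, float]],
                     interval: float, total_time: float) -> List[float]:
    """Fraction of each measurement interval during which the resource was busy.

    on_periods holds hypothetical (start, end) times of ON periods, e.g. the
    spans during which the profiled application held the CPU.
    """
    n = int(total_time // interval)
    busy = [0.0] * n
    for start, end in on_periods:
        i = int(start // interval)
        # Spread the ON period over every interval it overlaps.
        while i < n and i * interval < end:
            lo = max(start, i * interval)
            hi = min(end, (i + 1) * interval)
            busy[i] += max(0.0, hi - lo)
            i += 1
    return [b / interval for b in busy]


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100.0))
    return ordered[rank - 1]
```

With usage samples in hand, `percentile(samples, 99)` gives a high-percentile requirement and `percentile(samples, 100)` gives the peak.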

15
Application Profiles
  • Observation: Resource usage can be bursty
  • Peak requirement much higher than a high percentile
  • Insight: Provisioning for the tail can save resources!
  • Under-provisioning of resources
  • Occasional violations of resource guarantees

16
Controlled Resource Under-provisioning
  • Allow applications to specify a violation
    tolerance V
  • Provision for the (100-V)th percentile of
    resource usage
  • Requirements do not necessarily peak
    simultaneously
  • Probability of violations even less than V
  • Similar to resource overbooking in airline
    industry
  • Determine which servers have enough capacity
  • $s_k$: $(100-V)$th percentile of application $k$'s usage; $C$: server capacity
  • $\sum_k s_k^{cpu} \le C^{cpu}$ and $\sum_k s_k^{net} \le C^{net}$
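
The capacity check can be sketched as follows, assuming each application profile is a set of usage samples per resource (as a fraction of one server) and that the requirement is the (100-V)th percentile of those samples. The data layout and the names `percentile` and `fits` are hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical profile layout: resource name -> usage samples (fraction of a server).
Profile = Dict[str, List[float]]


def percentile(samples: List[float], p: float) -> float:
    """Nearest-rank percentile of the samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100.0))
    return ordered[rank - 1]


def fits(hosted: List[Profile], candidate: Profile,
         capacity: Dict[str, float], v: float) -> bool:
    """True if the server can host the candidate with violation tolerance v (percent).

    For every resource, the (100 - v)th percentile requirements of all
    applications on the server must sum to at most the server capacity.
    """
    for res, cap in capacity.items():
        total = sum(percentile(p[res], 100.0 - v) for p in hosted + [candidate])
        if total > cap:
            return False
    return True
```

For a bursty profile whose CPU usage is 0.2 in 99 of 100 intervals and 0.9 in one, `v=0` leaves room for a single copy per server, while `v=1` admits several.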
17
Resource Utilization Gains
[Figure: two plots of applications placed versus data center size (0 to 140), each with curves labeled "No Viol" and "Viol1". Placement of Apache Web servers: y-axis Web servers placed, 0 to 1400. Placement of streaming media servers: y-axis media servers placed, 0 to 350.]
  • 1% violations can more than double the number of applications!
  • Small under-provisioning can yield large gains
  • Bursty applications yield larger benefits

18
Impact of Under-provisioning on Application
Performance
  • Provisioning for the tail results in tolerable
    degradation
  • Large resource savings possible with small
    degradation

19
Application Placement Summary
  • Server applications tend to have bursty usage
  • Save resources in shared data centers running
    small applications
  • Determine resource usage behavior
  • Under-provision resources
  • Controlled performance degradation
  • Theoretical properties of application placement
  • NP-hard, approximation algorithms
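
Because exact placement is NP-hard, a practical placer usually falls back on a heuristic. The sketch below is a generic first-fit heuristic over per-resource demands. It is not the approximation algorithm from the cited work, and the demand layout is an assumption for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical demand layout: resource name -> required fraction of a server.
Demand = Dict[str, float]


def first_fit(demands: List[Demand], capacity: Demand,
              num_servers: int) -> List[Optional[int]]:
    """Place each application on the first server with room in every resource.

    Returns one server index per application, or None if it fits nowhere.
    """
    free = [dict(capacity) for _ in range(num_servers)]
    placement: List[Optional[int]] = []
    for demand in demands:
        chosen = None
        for idx, room in enumerate(free):
            if all(demand[r] <= room[r] for r in capacity):
                for r in capacity:
                    room[r] -= demand[r]
                chosen = idx
                break
        placement.append(chosen)
    return placement
```

First-fit offers no optimality guarantee; it only shows how the multi-dimensional capacity constraint shapes each placement decision.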

[OSDI '02, PDCS '04]
21
Dynamic Capacity Provisioning
[Figure: monitor workload, compute future demand, adjust allocation]
  • Key idea: increase or decrease allocated resources to handle workload fluctuations
  • To handle increased workload
  • Shared hosting: increase resource share
  • Dedicated hosting: start replicas on additional servers
  • Focus: Dedicated hosting, large applications
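
A minimal sketch of the monitor, predict, adjust cycle for dedicated hosting is shown below. The predictor (maximum over a short sliding window) and the fixed per-server capacity are placeholder assumptions chosen for brevity; the talk does not specify them here.

```python
import math
from collections import deque


class Provisioner:
    """Toy monitor/predict/adjust loop; all names and policies are hypothetical."""

    def __init__(self, per_server_rate: float, window: int = 3):
        self.per_server_rate = per_server_rate  # requests/s one server can sustain
        self.history = deque(maxlen=window)     # most recent observed arrival rates

    def observe(self, arrival_rate: float) -> None:
        """Monitor: record the latest measured arrival rate."""
        self.history.append(arrival_rate)

    def predict(self) -> float:
        """Compute future demand: here, the maximum rate in the window."""
        return max(self.history) if self.history else 0.0

    def servers_needed(self) -> int:
        """Adjust allocation: servers required for the predicted demand (at least 1)."""
        return max(1, math.ceil(self.predict() / self.per_server_rate))
```

Calling `observe` once per monitoring period and then `servers_needed` yields the target allocation; the allocator would add or release replicas to reach it.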

[Chandra03, Chase01]
23
Dynamic Capacity Provisioning


[Figure: provisioning architecture with Predictors, Allocator, Monitor, and Servers]
24
Internet Application Architecture
[Figure: request processing in an online bookstore. The query "search moby" passes through the HTTP, J2EE, and Database tiers; the response lists Melville's Moby Dick and music CDs by Moby.]
  • Multi-tier architecture
  • Each tier uses services provided by its successor
  • Session-based workloads
  • Caching, replication

25
Existing Application Models
  • Models for Web servers [Chandra03, Doyle03]
  • Do not model Java server, database, etc.
  • Black-box models [Kamra04, Ranjan02]
  • Unaware of bottleneck tier
  • Extensions of single-tier models [Welsh03]
  • Fail to capture interactions between tiers
  • Existing models inadequate for multi-tier Internet applications

26
Baseline Application Model
[SIGMETRICS '05]
[Figure: clients interacting with the application]
  • Model consists of two components
  • Sub-system to capture behavior of clients
  • Sub-system to capture request processing inside
    the application

27
Modeling Clients
[Figure: clients 1 through N, each with think time Z, forming queue Q0 in front of the application]
  • Clients think between successive requests
  • Infinite server system to capture think time Z
  • Captures independence of Z from processing in
    application

28
Modeling Request Processing
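[Figure: queues Q1, Q2, ..., QM for tiers 1 through M, with service times S1, S2, ..., SM, transition probability labels p1, p2, p3, ..., pM1, and N sessions]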
  • Transitions defined to capture circulation of
    requests
  • Request may move to next queue or previous queue
  • Multiple requests are processed concurrently at
    tiers
  • Processor sharing scheduling discipline
  • Caching effects get captured implicitly!

29
Putting It All Together
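[Figure: the combined model. Clients with think time Z (queue Q0) and N sessions circulating through tier queues Q1, Q2, ..., QM with service times S1, S2, ..., SM and transition probability labels p1, p2, p3, ..., pM1]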
  • A closed-queuing model that captures a given
    number of simultaneous sessions being served

30
Model Solution and Parameter Estimation
[SIGMETRICS '05]
  • Mean Value Analysis (MVA) Algorithm
  • Computes mean response time
  • Visit ratios
  • Equivalent to transition probabilities for MVA
  • $V_i = \lambda_i / \lambda_{req}$; $\lambda_{req}$ measured at the policer, $\lambda_i$ from logs
  • Service times
  • Use residence time $X_i$ logged at tier $i$
  • For the last tier, $S_M = X_M$
  • $S_i = X_i - (V_{i+1} / V_i) X_{i+1}$
  • Think time
  • Measured at the entry point of application
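
The estimation rules and the MVA recursion can be sketched together as below. This is the textbook exact MVA for a closed network of load-independent queues plus a think-time (delay) station, not necessarily the exact variant used in the talk. It reads the service-time rule as S_i = X_i - (V_{i+1}/V_i) X_{i+1}, which assumes that a tier's logged residence time includes the time spent in the next tier. Function names and inputs are hypothetical.

```python
from typing import List, Tuple


def visit_ratios(tier_rates: List[float], request_rate: float) -> List[float]:
    """V_i = lambda_i / lambda_req (tier request rate over rate at the policer)."""
    return [lam / request_rate for lam in tier_rates]


def service_times(residence: List[float], visits: List[float]) -> List[float]:
    """S_M = X_M for the last tier; S_i = X_i - (V_{i+1} / V_i) * X_{i+1} otherwise."""
    m = len(residence)
    service = [0.0] * m
    service[m - 1] = residence[m - 1]
    for i in range(m - 1):
        service[i] = residence[i] - (visits[i + 1] / visits[i]) * residence[i + 1]
    return service


def mva(visits: List[float], service: List[float],
        think_time: float, sessions: int) -> Tuple[float, float]:
    """Exact MVA: returns (mean response time, throughput) for `sessions` sessions."""
    demands = [v * s for v, s in zip(visits, service)]  # per-request demand at each tier
    queue = [0.0] * len(demands)                        # mean queue length per tier
    resp, tput = 0.0, 0.0
    for n in range(1, sessions + 1):
        # An arriving request sees the queue left by a network with n - 1 sessions.
        per_tier = [d * (1.0 + q) for d, q in zip(demands, queue)]
        resp = sum(per_tier)
        tput = n / (think_time + resp)
        queue = [tput * r for r in per_tier]
    return resp, tput
```

A typical use would estimate `visits` and `service` from logs, then call `mva` with the measured think time and the number of concurrent sessions to predict mean response time.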

31
Evaluation of Baseline Model
  • Auction site: RUBiS
  • One server per tier

[Figure: baseline model evaluation with Apache, JBOSS, and Mysql tiers; values 75 and 150 marked]
  • Concurrency limits not captured

32
Handling Concurrency Limits
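[Figure: the queueing model (Q0 with think time Z; tier queues Q1, Q2, ..., QM with service times S1, S2, ..., SM; N sessions) with dropped requests leaving the tiers]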
  • Requests may be dropped due to concurrency limits
  • Need to model the finiteness of queues!

33
Handling Concurrency Limits
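[Figure: the queueing model extended with a drop subsystem per tier. Requests dropped at tiers 1 through M (drop probabilities p1 drop, ..., pM drop) go to separate drop queues Q1 drop, ..., QM drop with service times S1 drop, ..., SM drop]
  • Approach: Subsystems to capture dropped requests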
  • Distinguish the processing of dropped requests

34
Response Time Prediction
  • Enhanced model can capture concurrency limits

35
Query Caching at the Database
  • Caching effects
  • Captured by tuning Vi and/or Si
  • Bulletin-board site: RUBBoS
  • 50 sessions
  • SELECT SQL_NO_CACHE causes Mysql to not cache the
    response to a query
  • More model enhancements
  • Replication at tiers
  • Multiple session classes

37
Handling Unanticipated Workloads
  • Allocation for an application may be insufficient
  • Short-term fluctuations are difficult to predict
  • Errors in parameter estimation may cause
    under-allocation
  • Reactor: Allocate additional servers over a time scale of a few minutes if
  • Observed workload exceeds predicted workload
  • Request drop rate exceeds a threshold
  • Repeated invocations may be needed
  • Policer: If incoming session rate > current capacity
  • Turn away excess sessions
  • Highly scalable policing
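
The two triggers can be sketched as below. The reactor test mirrors the two conditions listed above. The policer is a plain per-interval admission counter used only for illustration; it is not the scalable policing design of the cited work, and all names are hypothetical.

```python
def reactor_should_add(observed_rate: float, predicted_rate: float,
                       drop_rate: float, drop_threshold: float) -> bool:
    """Reactor: request more servers if either trigger condition holds."""
    return observed_rate > predicted_rate or drop_rate > drop_threshold


class Policer:
    """Admits new sessions up to the current capacity per interval, rejects the rest."""

    def __init__(self, capacity_per_interval: int):
        self.capacity = capacity_per_interval
        self.admitted = 0

    def new_interval(self) -> None:
        """Reset the admission count at the start of each policing interval."""
        self.admitted = 0

    def admit(self) -> bool:
        """True if the incoming session is admitted, False if it is turned away."""
        if self.admitted < self.capacity:
            self.admitted += 1
            return True
        return False
```

Since one invocation may not close the gap, the reactor check would be repeated every few minutes until neither condition holds.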

[World Wide Web '05]
38
Prototype Data Center
[Figure: prototype architecture. Server node: applications, request policer, resource monitoring, parameter estimation. Control plane: application placement, dynamic provisioning.]
  • 40 Linux servers
  • Gigabit switches
  • Multi-tier applications
  • Auction (RUBiS)
  • Bulletin-board (RUBBoS)
  • Apache, JBOSS (replicable)
  • Mysql database

39
Dynamic Capacity Provisioning
  • Auction application: RUBiS
  • Factor of 4 increase in 30 min

[Figure: plots of server allocations, workload, and response time]
  • Server allocations increased to match increased
    workload
  • Response time kept below 2 seconds

41
Summary
  • Dynamic resource management in data centers
  • Application Placement
  • Improve utilization by under-provisioning
  • Dynamic Capacity Provisioning
  • Analytical model for Internet applications
  • Predictive provisioning
  • Reactive provisioning
  • Handling Extreme Overloads
  • Scalable policing

42
Future Research Directions
Focus: Large-scale emerging distributed systems
  • Virtual machine based hosting
  • Trade-off between fast switching and VM overheads
  • Malicious flash crowds, DoS attacks
  • Security mechanisms
  • Sensor networks
  • Constrained environment
  • How to provide desired performance to overlying
    applications?
  • Mobile computing
  • Resource-deficient clients
  • How to design Internet servers for such clients?

43
Thank you!
More information at http://www.cs.umass.edu/~bhuvan
44
Agile Switching Using Virtual Machine Monitors
  • VMMs allow multiple virtual machines on a server
  • E.g., Xen, VMware

[Figure: two VMM configurations, each hosting VM1, VM2, and VM3, with VMs marked active or dormant]
  • Use VMMs to enable fast switching of servers
  • Switching time only limited by residual sessions
