Title: Continuous Resources Allocation in Internet Data Centers
1Continuous Resources Allocation in Internet Data Centers
- Youssef Hamadi
- Microsoft Research Cambridge
2Internet Data Center
Website hosting
- Total availability in hosting
- 24/24, 7/7
- Power plant
- Secured access
3Outline
- Constraint Programming overview
- Problem modelling
- Online problem solving
- Experiments
- Advanced Reservation in Grid Infrastructures
- Conclusion
4Constraint Programming
- Problem = Variables + Constraints
- Variables: X ∈ [lb..ub]
- Constraints
- Input/output variables
- Events: domain reduction
- Action
- Logic, e.g., a relation linking X, Y, and Z
- Operational: space reduction
- Algorithmic: complexity
- Resolution = Constraint propagation + Speculative search
5Constraint Programming
THINK
GUESS
Constraint propagation
Fix point
Speculative search
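The THINK/GUESS loop on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the solver used in the talk: the `less_than` propagator, the `propagate` and `solve` functions, and the variable names are all assumptions introduced for the example.

```python
from typing import Callable, Dict, List, Optional, Set

Domains = Dict[str, Set[int]]
# A propagator narrows domains in place and returns True if anything changed.
Propagator = Callable[[Domains], bool]


def less_than(x: str, y: str) -> Propagator:
    """Hypothetical propagator for the constraint x < y."""
    def run(d: Domains) -> bool:
        if not d[x] or not d[y]:
            return False
        new_x = {v for v in d[x] if v < max(d[y])}
        new_y = {v for v in d[y] if v > min(d[x])}
        changed = new_x != d[x] or new_y != d[y]
        d[x], d[y] = new_x, new_y
        return changed
    return run


def propagate(d: Domains, props: List[Propagator]) -> bool:
    """THINK: run the propagators until a fix point; False if a domain is empty."""
    changed = True
    while changed:
        changed = False
        for p in props:
            if p(d):
                changed = True
            if any(not dom for dom in d.values()):
                return False
    return True


def solve(d: Domains, props: List[Propagator]) -> Optional[Dict[str, int]]:
    """GUESS: fix one variable speculatively, propagate, backtrack on failure."""
    if not propagate(d, props):
        return None
    if all(len(dom) == 1 for dom in d.values()):
        return {v: next(iter(dom)) for v, dom in d.items()}
    var = min((v for v in d if len(d[v]) > 1), key=lambda v: len(d[v]))
    for value in sorted(d[var]):
        trial = {v: set(dom) for v, dom in d.items()}  # copy before guessing
        trial[var] = {value}
        result = solve(trial, props)
        if result is not None:
            return result
    return None


domains = {name: {1, 2, 3} for name in ("X", "Y", "Z")}
solution = solve(domains, [less_than("X", "Y"), less_than("Y", "Z")])
```

In this small instance propagation alone reaches the fix point X = 1, Y = 2, Z = 3, so no guess is needed; larger problems alternate between the two steps.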
9Problem Modelling
Internet Data Center
[Figure: Internet Data Center topology with edge switches (SE), rack switches (SR), and servers (C1 to C12).]
10Problem Modelling
Internet Data Center: set of limitations
- Mesh switch: BSM_limit
- Edge switch: BSE_limit
- Rack switch: BSR_limit
- Server: CPU_limit, Speed_limit, Memory_limit, Storage_limit, Disk_bandwidth_limit, BC_limit
11Problem Modelling
Multi-tier application
Internet
Web servers
Application servers
Databases
12Problem Modelling
Multi-tier application: set of requirements
Internet
Web servers
- Process: CPU_charge, Speed_charge, Memory_charge, Storage_charge, Disk_bandwidth_charge, Bandwidth_charge
Application servers
Databases
13Optimal Resource Allocation
Internet Data Center
Limitations
Requirements
Multi-tier application
14Modelling
- Variables?
- Values?
- Constraints?
15Modelling, variables
- Switch capacities
- Sm.in/Sm.out ∈ ℕ
- Sri.in/Sri.out ∈ ℕ
- Sei.in/Sei.out ∈ ℕ
- Network capacities
- Si.in/Si.out ∈ ℕ
- Allocated process
- Process ∈ ℕ
- Tier1, Tier2, Tier3: boolean
16Modelling, variables
- G = (X, E)
- X, set of constrained processes
- E, communication topology
- Allocated server
- Server ∈ ℕ
17Modelling, constraints
- 1. Static capacity filtering
- IDC servers keep only compatible processes
- ∀ Sk, ∀ Pk: k ∈ Sk.Process iff
- Sk.CPU ≥ Pk.CPU,
- Sk.Speed ≥ Pk.Speed,
- Sk.Memory ≥ Pk.Memory,
- Sk.Storage ≥ Pk.Storage,
- Sk.DiskSpeed ≥ Pk.DiskSpeed
18Modelling, constraints
- 1. Static capacity filtering
- Application processes keep only compatible servers
- ∀ Pk, ∀ Sk: k ∈ Pk.Server iff
- Pk.CPU ≤ Sk.CPU,
- Pk.Speed ≤ Sk.Speed,
- Pk.Memory ≤ Sk.Memory,
- Pk.Storage ≤ Sk.Storage,
- Pk.DiskSpeed ≤ Sk.DiskSpeed
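Static capacity filtering can be illustrated with a short Python sketch. It assumes that a server is compatible with a process when every server limit is at least the matching process charge; the `Resources` class, the `fits` and `static_filter` helpers, and all numeric values are hypothetical and only serve the example.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class Resources:
    """Limits of a server or charges of a process (same five fields)."""
    cpu: int
    speed: int
    memory: int
    storage: int
    disk_speed: int


def fits(limit: Resources, charge: Resources) -> bool:
    """True when every server limit covers the matching process charge."""
    return (limit.cpu >= charge.cpu
            and limit.speed >= charge.speed
            and limit.memory >= charge.memory
            and limit.storage >= charge.storage
            and limit.disk_speed >= charge.disk_speed)


def static_filter(
    servers: Dict[str, Resources],
    processes: Dict[str, Resources],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Initial domains: Sk.Process for each server, Pk.Server for each process."""
    server_domains = {
        s: {p for p, charge in processes.items() if fits(limit, charge)}
        for s, limit in servers.items()
    }
    process_domains = {
        p: {s for s, limit in servers.items() if fits(limit, charge)}
        for p, charge in processes.items()
    }
    return server_domains, process_domains


# Made-up values: S1 is a large server, S2 a small one.
servers = {"S1": Resources(4, 3000, 16, 500, 200),
           "S2": Resources(1, 2000, 2, 100, 50)}
processes = {"P1": Resources(2, 2500, 8, 200, 100),   # database-like process
             "P2": Resources(1, 1000, 1, 10, 10)}     # web-like process
server_domains, process_domains = static_filter(servers, processes)
```

With these values the small server S2 can only host P2, while P2 keeps both servers in its domain. Both views are computed once, before search starts, which is why the slides call the filtering static.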
19Modelling, constraints
- 2. Symmetrical referencing
Server = Si
Process = P1, Tier1 = 1, Tier2 = 0, Tier3 = 0
20Modelling, constraints
- 3. Mutual exclusion
- ∀ Sk, all_different(Sk.Process)
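Symmetrical referencing and mutual exclusion can be checked together on a complete assignment. The sketch below is an illustration under assumptions: an idle server is represented by `None`, and the function names `all_different` and `consistent` are introduced here, not taken from the talk.

```python
from typing import Dict, Iterable, Optional


def all_different(values: Iterable[str]) -> bool:
    """Mutual exclusion: no value appears twice."""
    seen = set()
    for v in values:
        if v in seen:
            return False
        seen.add(v)
    return True


def consistent(
    server_process: Dict[str, Optional[str]],
    process_server: Dict[str, str],
) -> bool:
    """Check both views of an allocation against each other."""
    hosted = [p for p in server_process.values() if p is not None]
    if not all_different(hosted):
        return False
    # Symmetrical referencing: Si.Process = Pk  <=>  Pk.Server = Si
    for s, p in server_process.items():
        if p is not None and process_server.get(p) != s:
            return False
    for p, s in process_server.items():
        if server_process.get(s) != p:
            return False
    return True


ok = consistent({"S1": "P1", "S2": None}, {"P1": "S1"})
clash = consistent({"S1": "P1", "S2": "P1"}, {"P1": "S1"})
```

Here `ok` holds because both views agree, while `clash` fails because P1 would be hosted twice.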
21Modelling, constraints
- 4. Tier propagation
- From a hosted process to the associated tier variables
        P0 P1 P2 P3 P4 P5 P6 P7
tier1 =  1  1  0  0  0  0  0  0
tier2 =  0  0  1  1  1  0  0  0
tier3 =  0  0  0  0  0  1  1  1
Sk.Tier1 = tier1[Sk.Process], Sk.Tier2 = tier2[Sk.Process], Sk.Tier3 = tier3[Sk.Process]
Example: Process ∈ {P0, P1} gives Tier1 = 1, Tier2 = 0, Tier3 = 0
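Tier propagation amounts to a table lookup from the domain of the process variable to the domains of the three tier variables. The following sketch uses the tier tables of this slide; the function name `tier_domains` and the encoding of processes as indices 0 to 7 are assumptions made for the example.

```python
from typing import Dict, Set

# Tier membership of processes P0..P7, as given on the slide.
TIER1 = [1, 1, 0, 0, 0, 0, 0, 0]
TIER2 = [0, 0, 1, 1, 1, 0, 0, 0]
TIER3 = [0, 0, 0, 0, 0, 1, 1, 1]


def tier_domains(process_domain: Set[int]) -> Dict[str, Set[int]]:
    """Values each tier variable can still take, given the candidate processes."""
    tables = {"Tier1": TIER1, "Tier2": TIER2, "Tier3": TIER3}
    return {name: {table[p] for p in process_domain}
            for name, table in tables.items()}


fixed = tier_domains({0, 1})   # Process in {P0, P1}
open_ = tier_domains({1, 2})   # Process in {P1, P2}
```

When the server can only host P0 or P1, all three tier variables are fixed, as in the example on the slide. With P1 or P2 still possible, Tier1 and Tier2 remain undecided and only Tier3 is fixed to 0.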
22Modelling, constraints
- 5. Bandwidth capacities, For each Rack Srk,
- Srk.in
- Srk.out
[Figure: example with a client and three tiers of n1 = 2, n2 = 3, and n3 = 3 processes (labelled a to h), linked by the bandwidths bc01, bc12, and bc23.]
23Modelling, soft constraints
[Figure: soft constraints on the bandwidths b01, b12, and b23; labels 2, 4, 6.]
24Extended Modelling
- Symmetry break in a Rack switch
Usually, Sj ? Sk If Sj.Process ?
Sk.Process, Sj.Process according to monotonic properties of filtered
domains)
[Figure: three domains {a, b, c} and the ordered assignment a, b, c.]
25Extended Modelling
- Symmetry break in the application
? Pi, Pj at the same tier, ?(Pi) ?(Pj) Ordering
constraint Pi.Server
26Optimal Resource Allocation
Internet Data Center
Limitations
Constraint Solver
Requirements (SLAs)
27Outline
- Constraint Programming an overview
- Problem modelling
- Online problem solving
- Experiments
- Conclusion
28The problem is Online (I)
- Web site annual load (Arlitt et al., ACM-TOIT01)
[Figure: requests over time, with peaks labelled advertising campaign, Christmas, competitor break, and bank holiday.]
29The problem is Online (II)
- Evolution of website traffic
Web servers
Static content
Application servers
Dynamic content
Databases
30The problem is Online (III)
31Online resource allocations in IDC
Most users use personalization!
32Online resource allocations in IDC
33Constraint Programming
[Figure: problems P, models m, CP modelling.]
[Dechter & Dechter 88]
34Constraint Programming
Component failure
Se2.capacity_in = 0, Se2.capacity_out = 0
Application lifecycle reduction
Pi.Server = -1
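The two events on this slide can be read as small updates to the constraint model, after which the solver is run again. The sketch below is one possible reading, with assumptions stated loudly: -1 is taken to mean "no longer allocated", and the `Switch` class, the `server_switch` map, and the `needs_migration` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

UNALLOCATED = -1  # assumed meaning of Pi.Server = -1


@dataclass
class Switch:
    capacity_in: int
    capacity_out: int


def fail_switch(switch: Switch) -> None:
    """Component failure: the switch can no longer carry traffic."""
    switch.capacity_in = 0
    switch.capacity_out = 0


def retire_process(process_server: Dict[str, int], process: str) -> None:
    """Application lifecycle reduction: the process releases its server."""
    process_server[process] = UNALLOCATED


def needs_migration(
    process_server: Dict[str, int],
    server_switch: Dict[int, str],   # hypothetical map: server index -> switch
    switches: Dict[str, Switch],
) -> List[str]:
    """Processes still placed behind a switch that has no capacity left."""
    stranded = []
    for p, s in process_server.items():
        if s == UNALLOCATED:
            continue
        sw = switches[server_switch[s]]
        if sw.capacity_in == 0 or sw.capacity_out == 0:
            stranded.append(p)
    return sorted(stranded)


switches = {"Se1": Switch(10, 10), "Se2": Switch(10, 10)}
server_switch = {0: "Se1", 1: "Se2"}
process_server = {"P0": 0, "P1": 1, "P2": 1}
fail_switch(switches["Se2"])
retire_process(process_server, "P2")
to_move = needs_migration(process_server, server_switch, switches)
```

After the failure of Se2 and the retirement of P2, only P1 still sits behind the failed switch and has to be reallocated by the next solver run.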
35Online Architecture
[Figure: online architecture around the IDC. Labels: Learning; Phase transition parameters; Search statistics; Search parameters; Add/remove constraints; Search control; Load/topology variations; Monitoring; Results; Search module; Heuristics; Setup; Contract negotiation, SLAs, cost; Solve; Management; Feasibility, Solve, State.]
36Experiments
- IDC with 1024 servers,
- 8 Edge switches, 8 Rack switches, 16 servers/Rack
- 3-tier application
- (3,1,1), (3,2,2)
- +/- (50000 constraints, 5000 variables)
37Advanced Reservation in Grid infrastructures
- Gridline Project
- Microsoft Research Cambridge
- York University
38Advanced Reservations
- Definition: the process of requesting various resources for use at a later time.
- GGF definition: "An advance reservation is a possibly limited or restricted delegation of a particular resource capability over a defined time interval, obtained by the requester from the resource owner through a negotiation process."
- Examples of resource capabilities: number of processors, amount of memory, disk space, software licences, network bandwidth, etc.
39Advanced Reservations
- Gridline
- Goal: maximize the utility of some resource broker
- How: compute an optimal subset of customers
40Advanced Reservations
- Order
- Start/End date,
- Proposed Price,
- Proposed Penalty,
- QoS
41Sample Instance of TKP
Uniform capacity: 10
Bid1: 6 units, 11
Bid2: 6 units, 10
Bid3: 5 units, 20
[Figure: the three bids placed on a time axis t1 to t5.]
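A brute-force search over subsets of bids shows how such an instance is evaluated. The time intervals of the bids are not recoverable from the slide, so the ones below are invented for illustration, and the second number of each bid is assumed to be its price. The `Bid` class and the `feasible` and `best_subset` functions are likewise hypothetical, and enumeration stands in for the constraint model of the talk.

```python
from itertools import combinations
from typing import Iterable, List, NamedTuple, Tuple


class Bid(NamedTuple):
    name: str
    start: int   # first time slot, inclusive (assumed)
    end: int     # last time slot, inclusive (assumed)
    demand: int  # units of the resource
    price: int   # assumed meaning of the second number on the slide


def feasible(bids: Iterable[Bid], capacity: int, horizon: range) -> bool:
    """True if the accepted bids never exceed the capacity at any time slot."""
    bids = list(bids)
    return all(
        sum(b.demand for b in bids if b.start <= t <= b.end) <= capacity
        for t in horizon
    )


def best_subset(bids: List[Bid], capacity: int) -> Tuple[int, Tuple[str, ...]]:
    """Enumerate all subsets and keep the feasible one with the highest revenue."""
    horizon = range(min(b.start for b in bids), max(b.end for b in bids) + 1)
    best: Tuple[int, Tuple[str, ...]] = (0, ())
    for r in range(1, len(bids) + 1):
        for subset in combinations(bids, r):
            if feasible(subset, capacity, horizon):
                value = sum(b.price for b in subset)
                if value > best[0]:
                    best = (value, tuple(b.name for b in subset))
    return best


bids = [
    Bid("Bid1", 1, 2, 6, 11),
    Bid("Bid2", 4, 5, 6, 10),
    Bid("Bid3", 2, 4, 5, 20),
]
revenue, accepted = best_subset(bids, capacity=10)
```

Under these assumed intervals Bid3 overlaps both other bids and cannot be combined with either (6 + 5 exceeds 10), so accepting Bid1 and Bid2 together for 21 beats the single most valuable bid at 20. Enumeration is exponential in the number of bids and is only meant to make the objective concrete.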
42Time
43Relative Quality
45Conclusion
- Online architecture for resource management in IDCs.
- Constraint Programming modelling, periodically refined.
- Future work
- Extend cost function
- Real price of migrations, cooling requirements
46Conclusion
- Gridline (joint work with York University)
- Resource allocation for the Grid
- Advanced reservation allocation CP-AI-OR05
- Workflow scheduling.
- http://www.cs.york.ac.uk/aig/constraints/Grid/