Title: Optimizing Cost and Performance for Multihoming
1 Optimizing Cost and Performance for Multihoming
Lili Qiu, Microsoft Research, liliq_at_microsoft.com
Joint work with D. K. Goldenberg, H. Xie, and Y. R. Yang (Yale University) and Y. Zhang (AT&T Labs Research)
ACM SIGCOMM 2004
2 Multihoming and Smart Routing
- Multihoming
  - A popular way of connecting to the Internet
- Smart routing
  - Intelligently distribute traffic among multiple external links
3 Potential Benefits
- Improve performance
  - Potential improvement: 25% [Akella03]
  - Similar to overlay routing [Akella04]
- Improve reliability
  - Two orders of magnitude improvement in fault tolerance of end-to-end paths [Akella04]
- Reduce cost
Q: How to realize the potential benefits?
4 Our Goals
- Goal
  - Design effective smart routing algorithms to realize the potential benefits of multihoming
- Questions
  - How to assign traffic to multiple ISPs to optimize cost?
  - How to assign traffic to multiple ISPs to optimize both cost and performance?
  - What are the global effects of smart routing?
5 Related Work
- Techniques for implementing multihoming
  - BGP peering, DNS-based, NAT-based (e.g., RFC2260, Cisco, GCLC04, Radware, F5)
  - Complementary to our work
- Performance evaluation [Akella03, Akella04]
  - Quantify the potential benefits of multihoming
  - Unaddressed challenge: how to achieve this in practice
- Smart routing
  - Commercial products (e.g., RouteScience, Internap, Proficient, ...)
    - Technical details are unavailable
  - Hash-based load balancing [Cao01, Guo04]
    - Optimizes neither performance nor cost
6 Network Model
- Network performance metric
  - Latency (also an indicator for reliability)
  - Extend to alternative metrics
    - $\log(1/(1 - \text{lossRate}))$, or $\text{latency} + w \cdot \log(1/(1 - \text{lossRate}))$
- ISP charging models
  - Cost $= C_0 + C(x)$
    - $C_0$: a fixed subscription cost
    - $C$: a piece-wise linear non-decreasing function mapping $x$ to cost
    - $x$: charging volume
  - Total volume based charging
  - Percentile-based charging (95-th percentile)
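The charging model above can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `make_cost`, `loss_metric`, and `breakpoints` are hypothetical, and the price points are invented rather than taken from any real ISP contract.

```python
import math
from bisect import bisect_right

def loss_metric(loss_rate):
    """Alternative performance metric: log(1 / (1 - lossRate))."""
    return math.log(1.0 / (1.0 - loss_rate))

def make_cost(c0, breakpoints):
    """Build cost(x) = c0 + C(x) with C piece-wise linear and non-decreasing.

    breakpoints: hypothetical (volume, price) pairs sorted by volume,
    starting at (0, 0), with at least two entries; beyond the last pair
    the final segment's slope simply continues (an assumption).
    """
    xs = [v for v, _ in breakpoints]
    ys = [p for _, p in breakpoints]

    def cost(x):
        # Index of the segment containing x, clamped to the last segment.
        i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return c0 + ys[i] + slope * (x - xs[i])

    return cost

# Made-up example: subscription fee 100, cheaper marginal price above volume 10.
cost = make_cost(100, [(0, 0), (10, 50), (20, 80)])
```

Here `cost(x)` takes the charging volume `x`, which is either the total volume or a percentile of the per-interval volumes, depending on the charging model.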
7 Percentile Based Charging
[Figure: per-interval traffic volumes in sorted order; axes: Interval (marks at 95% × N and N), Sorted volume]
Charging volume = traffic in the (95% × N)-th sorted interval
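A minimal sketch of the percentile rule, assuming the rank is rounded up to the next whole interval (the slide does not say how a fractional rank is handled). The function name `charging_volume` is hypothetical.

```python
def charging_volume(volumes, percent=95):
    """Traffic in the ceil(percent% * N)-th smallest interval (1-indexed)."""
    ordered = sorted(volumes)
    n = len(ordered)
    # Integer ceiling of percent * n / 100, avoiding floating-point rounding.
    rank = max(1, -(-percent * n // 100))
    return ordered[rank - 1]

# 100 intervals: 95 carry 1 unit, 5 are bursts of 10 units.
trace = [1] * 95 + [10] * 5
```

With this trace the five bursts fall in the top 5% of intervals, so they do not raise the charging volume.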
8 Why cost optimization?
- A simple example
  - A user subscribes to 4 ISPs, whose latency is uniformly distributed
  - In every interval, the user generates one unit of traffic
- To optimize performance
  - ISP 1: 1, 0, 0, 0, ...
  - ISP 2: 0, 1, 0, 0, ...
  - ISP 3: 0, 0, 1, 0, ...
  - ISP 4: 0, 0, 0, 1, ...
- 95th-percentile = 1 for all 4 ISPs
- 95th-percentile = 1 using one ISP
- Cost(4 ISPs) = 4 × cost(1 ISP)
Optimizing performance alone could result in high cost!
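The example can be checked numerically. The sketch below assumes 100 intervals and the rotating pattern shown on the slide; `percentile_volume` is a hypothetical helper that rounds the rank up.

```python
def percentile_volume(volumes, percent=95):
    """Traffic in the ceil(percent% * N)-th smallest interval (1-indexed)."""
    ordered = sorted(volumes)
    rank = max(1, -(-percent * len(ordered) // 100))
    return ordered[rank - 1]

N = 100  # number of charging intervals (assumed)

# Performance-only routing: in interval t the single unit of traffic goes
# to ISP t mod 4, mimicking the 1,0,0,0 / 0,1,0,0 / ... pattern.
per_isp = [[1 if t % 4 == k else 0 for t in range(N)] for k in range(4)]
charges_four = [percentile_volume(v) for v in per_isp]

# Sending everything through one ISP instead.
charge_one = percentile_volume([1] * N)
```

Each of the four ISPs carries traffic in 25% of the intervals, well above the 5% that percentile charging ignores, so all four bill a charging volume of 1, the same as a single ISP carrying everything.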
9 Cost Optimization Problem Specification (2 ISPs)
[Figure: traffic volume over time; axes: Time, Volume; labels: 1, 2, N]
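10 Cost Optimization Problem Specification (2 ISPs)
[Figure: traffic volume over time and sorted volume for each ISP, with charging volumes $P_1$ and $P_2$ marked; axes: Time, Volume, Sorted volume]
Goal: minimize total cost $C_1(P_1) + C_2(P_2)$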
11 Issues and Insights
- Challenge: traditional optimization techniques do not work with percentiles
- Key: determine each ISP's charging volume
- Results
  - Let $V_0$ denote the sum of all ISPs' charging volumes
  - Theorem 1: Minimize cost $\Leftrightarrow$ Minimize $V_0$
  - Theorem 2: $V_0 = \left(1 - \sum_{k=1}^{N} (1 - q_k)\right)$-quantile of original traffic, where $q_k$ is ISP $k$'s charging percentile
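12 Cost Optimization Problem Specification (2 ISPs)
[Figure: traffic volume over time and sorted volume for each ISP, with charging volumes $P_1$ and $P_2$ marked; axes: Time, Volume, Sorted volume]
$P_1 + P_2 \ge$ 90-th percentile of original traffic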
13 Intuition for 2-ISP Case
- ISP 1 has at most 5% of intervals whose traffic exceeds $P_1$
- ISP 2 has at most 5% of intervals whose traffic exceeds $P_2$
- The original traffic (ISP 1 + ISP 2 traffic) has at most 10% of intervals whose traffic exceeds $P_1 + P_2$
- Hence $P_1 + P_2 \ge$ 90-th percentile of original traffic
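The bound can be checked on synthetic data. The sketch below assumes two ISPs that both charge at the 95th percentile, a made-up random trace, and an arbitrary split of each interval's traffic; `quantile` and `v0_bound` are hypothetical names.

```python
import random

def quantile(volumes, percent):
    """Value at rank ceil(percent% * N) of the sorted volumes (1-indexed)."""
    ordered = sorted(volumes)
    rank = max(1, -(-percent * len(ordered) // 100))
    return ordered[rank - 1]

def v0_bound(total, percents):
    """(1 - sum_k (1 - q_k))-quantile of the original traffic, q_k in percent."""
    combined = 100 - sum(100 - q for q in percents)
    return quantile(total, combined)

random.seed(0)
total = [random.randint(1, 100) for _ in range(200)]   # original traffic
isp1 = [random.randint(0, v) for v in total]           # arbitrary split
isp2 = [v - a for v, a in zip(total, isp1)]

p1, p2 = quantile(isp1, 95), quantile(isp2, 95)
v0 = v0_bound(total, [95, 95])                          # 90th percentile here
```

However the traffic is split, `p1 + p2` cannot fall below `v0`, which is the counting argument from the bullets above.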
14 Sketch of Our Algorithm
- Determine charging volume for each ISP
  - Compute $V_0$
  - Find $p_k$ that minimize $\sum_k c_k(p_k)$ subject to $\sum_k p_k = V_0$ using dynamic programming
- Assign traffic given charging volumes
  - Non-peak assignment: ISP $k$ is assigned $\le p_k$
  - Peak assignment
    - First let every ISP $k$ serve its charging volume $p_k$
    - Dump all the remaining traffic to an ISP $k$ that has bursted for fewer than $(1 - q_k)N$ intervals
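The two steps above can be sketched as follows. This is an illustration only, not the authors' implementation: it assumes volumes are discretized into integer units, ignores capacity constraints, and fills ISPs in list order during non-peak intervals. The function names and the `burst_left` budget list are hypothetical.

```python
def choose_charging_volumes(cost_fns, v0):
    """Pick integer p_k >= 0 with sum(p_k) == v0 that minimize sum(c_k(p_k))."""
    inf = float("inf")
    best = [0.0] + [inf] * v0   # best[v]: min cost for the ISPs seen so far to total v
    picks = []                  # picks[k][v]: p_k chosen when the first k+1 ISPs total v
    for cost in cost_fns:
        new_best = [inf] * (v0 + 1)
        pick = [0] * (v0 + 1)
        for v in range(v0 + 1):
            for p in range(v + 1):
                cand = best[v - p] + cost(p)
                if cand < new_best[v]:
                    new_best[v], pick[v] = cand, p
        best = new_best
        picks.append(pick)
    volumes, v = [], v0
    for pick in reversed(picks):   # backtrack from the last ISP to the first
        volumes.append(pick[v])
        v -= pick[v]
    return volumes[::-1], best[v0]

def assign_interval(volume, p, burst_left):
    """Split one interval's traffic; burst_left[k] counts ISP k's remaining bursts."""
    if volume <= sum(p):
        # Non-peak: no ISP is assigned more than its charging volume p_k.
        out, rest = [], volume
        for pk in p:
            take = min(pk, rest)
            out.append(take)
            rest -= take
        return out
    # Peak: every ISP serves p_k; the excess goes to one ISP that may still burst.
    # Raises StopIteration if no ISP has burst budget left.
    out = list(p)
    k = next(i for i, b in enumerate(burst_left) if b > 0)
    out[k] += volume - sum(p)
    burst_left[k] -= 1
    return out

# Made-up cost functions: ISP 0 is linear, ISP 1 is cheap up to 3 units.
costs = [lambda p: 2 * p, lambda p: p if p <= 3 else 3 + 5 * (p - 3)]
volumes, total_cost = choose_charging_volumes(costs, 5)
```

In this sketch `burst_left[k]` would start at roughly $(1 - q_k)N$, so an ISP stops absorbing peaks once further bursts would raise its charging volume.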
15 Additional Issues
- Deal with capacity constraints
- Perform integral assignment
  - Similar to bin packing (greedy heuristic)
- Make it online
  - Traffic prediction
    - Exponential weighted moving average (EWMA)
  - Accommodate prediction errors
    - Update $V_0$ conservatively
    - Add margins when computing charging volumes
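The EWMA prediction step can be sketched as below. The smoothing weight `alpha` and the choice to seed the predictor with the first observation are assumptions; the slides do not specify either.

```python
def ewma_forecast(volumes, alpha=0.5):
    """One-step-ahead EWMA forecasts.

    Returns (forecasts, next_prediction), where forecasts[t] is the
    prediction made for interval t before observing it, and
    next_prediction = alpha * last_actual + (1 - alpha) * last_prediction.
    """
    pred = volumes[0]          # seed with the first observation (assumption)
    forecasts = []
    for actual in volumes:
        forecasts.append(pred)
        pred = alpha * actual + (1 - alpha) * pred
    return forecasts, pred

history = [10, 20, 20]
forecasts, next_pred = ewma_forecast(history)
```

A margin for prediction errors could then be applied on top of `next_pred` before it is used to update the charging volumes.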
16 Optimizing Cost and Performance
- One possible approach: design a metric that is a weighted sum of cost and performance
  - How to determine relative weights?
- Our approach: optimize performance under cost constraints
  - Use cost optimization to derive upper bounds of traffic that can be assigned to each ISP
  - Assign traffic to optimize performance subject to the upper bounds
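A deliberately simplified sketch of the second step: it treats one interval's traffic as a single divisible aggregate with one latency value per ISP and fills the lowest-latency ISPs first, up to their bounds. Real traffic has per-destination latencies, so this greedy rule is an assumption for illustration; `assign_by_latency` is a hypothetical name.

```python
def assign_by_latency(demand, latency, upper_bound):
    """Greedy split of `demand` across ISPs, fastest first, within upper bounds."""
    remaining = demand
    assignment = [0] * len(latency)
    # Visit ISPs from lowest to highest latency.
    for k in sorted(range(len(latency)), key=lambda i: latency[i]):
        take = min(remaining, upper_bound[k])
        assignment[k] = take
        remaining -= take
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("demand exceeds the sum of the upper bounds")
    return assignment

# Made-up numbers: ISP 1 is fastest but may carry at most 4 units.
split = assign_by_latency(10, latency=[30, 10, 20], upper_bound=[5, 4, 8])
```

The upper bounds would come from the cost optimization step, so the resulting assignment should not raise any ISP's charging volume.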
17 Evaluation Methodology
- Traffic traces (Oct. 2003 to Jan. 2004)
  - Abilene traces (NetFlow data on Internet2)
    - RedHat, NASA/GSFC, NOAA Silver Springs Lab, NSF, National Library of Medicine
    - Univ. of Wisconsin, Univ. of Oregon, UCLA, MIT
  - MSNBC Web access logs
- Realistic cost functions (Feb. 2002 Blind RFP)
- Delay traces
  - NLANR traces: 3 months of RTT measurements between pairs of 140 universities
  - Map delay traces to hosts in traffic traces
18 Baseline Algorithms
- Round robin
  - In each interval, assign traffic to a single ISP
  - Rotate in a round robin fashion
- Equal split
  - In each interval, split traffic equally among ISPs
  - Similar to hash-based load balancing
- Offline local fractional
  - Minimize the total cost for each interval independently
- Dedicated links
  - Flat rate and independent of usage
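The first two baselines are simple enough to sketch directly. The function names are hypothetical; each returns one list of per-interval volumes per ISP.

```python
def round_robin(volumes, num_isps):
    """Interval t sends all of its traffic to ISP t mod num_isps."""
    return [
        [v if t % num_isps == k else 0 for t, v in enumerate(volumes)]
        for k in range(num_isps)
    ]

def equal_split(volumes, num_isps):
    """Every interval's traffic is divided evenly among the ISPs."""
    return [[v / num_isps for v in volumes] for _ in range(num_isps)]

trace = [4, 6, 8]
rr = round_robin(trace, 2)
eq = equal_split(trace, 2)
```

Neither baseline looks at latency or at the ISPs' cost functions, which is why they serve as a reference point for the cost comparison.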
19 Cost Comparison for Different Traces
Our algorithms significantly outperform the alternatives.
20 Cost Comparison for Varying Links
For all ISPs, our cost optimization performs well.
21 Cost and Performance Evaluation
Optimizing performance alone often doubles the cost.
22 Cost and Performance Evaluation (Cont.)
Our dual metric optimization achieves low cost and latency.
23 Global Effects of Smart Routing
- Selfish nature of smart routing
  - Each user optimizes its own cost and performance without considering its impact on other traffic
  - Need to understand its global effects
- Questions
  - How well does smart routing perform when traffic assignment affects link latency?
  - How well do different smart routing users co-exist?
  - How well do smart routing users co-exist with single-homed users?
24 Evaluation Methodology
- Abilene traffic traces
- Rocketfuel inter-domain topology
  - 170 nodes, 600 edges
  - With propagation delay and OSPF weights
- M/M/1 queuing model
- Routing
  - A user selects best performing ISP subject to cost constraints
  - Inter-domain: shortest AS hop count
  - Intra-domain: OSPF
- Compute traffic equilibria as in [QYZS03]
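The slides do not spell out how the M/M/1 model turns load into latency. A minimal sketch, assuming the standard M/M/1 mean time in system $1/(\mu - \lambda)$ added to the link's propagation delay; the function and parameter names are hypothetical.

```python
def mm1_link_latency(arrival_rate, service_rate, propagation_delay=0.0):
    """Propagation delay plus the M/M/1 mean time in system, 1 / (mu - lambda).

    Rates are in packets per second; the result is in seconds.
    A saturated link (lambda >= mu) has unbounded delay.
    """
    if arrival_rate >= service_rate:
        return float("inf")
    return propagation_delay + 1.0 / (service_rate - arrival_rate)

light = mm1_link_latency(50, 100)
heavy = mm1_link_latency(90, 100)
```

Latency grows sharply as load approaches capacity, which is how one user's traffic assignment can affect the latency seen by others.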
25 Global Effects Summary
- Impact of self-interference is small
- Smart routing users co-exist well with each other
- Smart routing users co-exist well with single-homed users
26 Conclusions
- Contributions
  - First paper on jointly optimizing cost and performance for multihoming
  - Propose a series of novel smart routing algorithms that achieve both low cost and good performance
  - Under traffic equilibria, smart routing improves performance without hurting other traffic
- Future work
  - Further evaluation through Internet experiments
  - Dynamics of interactions among different users
  - Design better charging models