Title: Designing a Predictable Backbone Network with Valiant Load Balancing
1. Designing a Predictable Backbone Network with Valiant Load Balancing
Nick McKeown, Stanford University
All the hard stuff was done by Rui Zhang-Shen
Clean Slate Design for the Internet: http://cleanslate.stanford.edu
NSF 100 x 100 Clean Slate Program: http://100x100network.org
2. Backbone network
3. US Backbone Networks: Observations
- 50 nodes interconnected by long-haul optical links
- Increasingly rich mesh topology
  - Built over mesh of WDM or TDM circuits and switches
  - Reduce hop count and delay
  - Fault tolerance
  - Load balancing
- Low utilization: links over-provisioned
  - Uncertainty in the traffic matrix the network is designed for
  - Headroom for future growth
  - Prepare to take over when links or routers fail
  - Minimize congestion and delay variation
4. Traffic Matrices
[Figure: traffic matrix with axes "From" and "To"]
Traffic matrix is hard to predict
5. What fraction of traffic matrices can they support?
[Figure: one panel each for Verio, Abilene, Sprint, and AT&T]
Verio, AT&T, and Sprint topologies courtesy of RocketFuel
6. Desired Characteristics
- Dependable
  - Continues to operate when traffic patterns change in the short and long term
  - Continues to operate under failure
  - Recovers quickly
- Efficient
- And at no extra cost
7. Why is this hard?
[Figure: N nodes, numbered 1, 2, 3, 4, ..., N, each with access rate r]
8. Why is this hard?
[Figure: same N-node network as on the previous slide]
9. Our Approach
- The operator already estimates $r_i$
  - Requires only local knowledge of users and market estimates
- Use Valiant Load Balancing (VLB)
  - Supports all traffic matrices
- History
  - L. G. Valiant, G. Brebner, 1981-82
    - Parallel communication
    - Statistical delay guarantee
  - C.-S. Chang et al., I. Keslassy et al., 2001-05
    - Switch scheduling
    - Throughput guarantee
    - Optimality
10. Valiant Load-Balancing
[Figure: N nodes, numbered 1, 2, 3, 4, ..., N, each with access rate r]
11. Valiant Load-Balancing
[Figure: nodes with access rate r (nodes 2 and 4 labeled); links labeled 2r/N]
- In practice
  - The mesh could be a mesh of lambdas or TDM circuits
  - Send on direct path, and only spread when network is congested
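The 2r/N link label can be checked numerically. The sketch below is only an illustration under stated assumptions: each node sends and receives at most r, every flow is split evenly over all N nodes as intermediates (so the direct link carries two of the N shares), and the names `vlb_link_loads` and `random_admissible_matrix` are hypothetical helpers, not part of the talk.

```python
import random


def vlb_link_loads(traffic):
    """Load on each directed link when every flow is split evenly over all N nodes."""
    n = len(traffic)
    load = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for d in range(n):
            if s == d:
                continue
            share = traffic[s][d] / n  # 1/N of the flow goes via each node k
            for k in range(n):
                if k != s:
                    load[s][k] += share  # first hop: source -> intermediate
                if k != d:
                    load[k][d] += share  # second hop: intermediate -> destination
    return load


def random_admissible_matrix(n, r, rng):
    """Random matrix with zero diagonal, scaled so no row or column sum exceeds r."""
    m = [[0.0 if s == d else rng.random() for d in range(n)] for s in range(n)]
    row_max = max(sum(row) for row in m)
    col_max = max(sum(m[s][d] for s in range(n)) for d in range(n))
    scale = r / max(row_max, col_max)
    return [[x * scale for x in row] for row in m]


n, r = 8, 10.0
tm = random_admissible_matrix(n, r, random.Random(0))
max_load = max(max(row) for row in vlb_link_loads(tm))
print(f"max link load {max_load:.3f} vs capacity {2 * r / n:.3f}")
```

Each link (i, j) carries at most r/N of first-hop traffic leaving i and at most r/N of second-hop traffic arriving at j, which is why the load never exceeds 2r/N regardless of the traffic matrix.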
12. Aside: Routers based on VLB
- Can you build a router switched backplane based on VLB?
- Appealing possibilities
  - 100% throughput for any arrival pattern
  - No per-packet arbitration and scheduling
  - Passive switch fabric consumes almost zero power
[Figure: switch rack (< 100W) with a 40 x 40 MEMS switch; linecards; labels 1, 2, ..., 55, 56. Source: "Scaling Routers using Optics", SIGCOMM 2003]
13. Failures
- Node failures
  - Takes away corresponding links and traffic
  - Still a full mesh network
- Link failures
  - Asymmetric network
  - Many scenarios
14. Fault Tolerance
- Load balance traffic over available paths
- To tolerate any k link or router failures, it is sufficient to increase the capacity of each link by [...]
- Example: A 50-node network requires 11% more capacity to withstand any 5 failures.
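The example figure can be reproduced with a short calculation. This is a sketch under an assumption that is not stated on the slide: that tolerating k failures means sizing each link as 2r/(N-k) instead of 2r/N, so the relative increase is k/(N-k). The function name `extra_capacity_fraction` is hypothetical.

```python
def extra_capacity_fraction(n, k):
    """Relative capacity increase when link capacity goes from 2r/N to 2r/(N-k).

    Assumption (not taken from the slide): surviving traffic is respread over
    the N-k remaining paths, so each link must be sized for 2r/(N-k).
    """
    if not 0 <= k < n:
        raise ValueError("need 0 <= k < n")
    return n / (n - k) - 1


print(f"{extra_capacity_fraction(50, 5):.1%}")
```

For N = 50 and k = 5 this gives roughly 11%, which agrees with the slide's example; other scaling rules could match the same number, so treat the formula as one plausible reading.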
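15. Heterogeneous Network
[Figure: mesh of nodes with access rates $r_1, r_2, r_3, r_4, \dots, r_i, \dots, r_N$]
- $R = \sum_i r_i$
- Homogeneous: $c = 2r/N$
- Gravity configuration: $c_{ij} = 2 r_i r_j / R$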
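The gravity configuration can be sketched the same way as the homogeneous case. The code below assumes, beyond what the slide shows, that node k receives a fraction r_k/R of every flow as an intermediate, and that node i sends and receives at most r_i. The function names and the example traffic matrix are hypothetical.

```python
def gravity_capacities(rates):
    """Link capacities c_ij = 2 * r_i * r_j / R, with R the sum of all rates."""
    total = sum(rates)
    n = len(rates)
    return [[0.0 if i == j else 2 * rates[i] * rates[j] / total
             for j in range(n)] for i in range(n)]


def gravity_link_loads(traffic, rates):
    """Link loads when each flow is spread over intermediates in proportion to r_k / R."""
    n = len(rates)
    total = sum(rates)
    load = [[0.0] * n for _ in range(n)]
    for s in range(n):
        for d in range(n):
            if s == d:
                continue
            for k in range(n):
                share = traffic[s][d] * rates[k] / total
                if k != s:
                    load[s][k] += share  # first hop: source -> intermediate
                if k != d:
                    load[k][d] += share  # second hop: intermediate -> destination
    return load


rates = [10.0, 20.0, 30.0, 40.0]
# Hypothetical traffic matrix: row i sums to r_i and column j sums to r_j.
tm = [
    [0.0, 0.0, 10.0, 0.0],
    [0.0, 0.0, 10.0, 10.0],
    [0.0, 0.0, 0.0, 30.0],
    [10.0, 20.0, 10.0, 0.0],
]
cap = gravity_capacities(rates)
load = gravity_link_loads(tm, rates)
fits = all(load[i][j] <= cap[i][j] + 1e-9 for i in range(4) for j in range(4))
print("all links within capacity:", fits)
```

With equal rates the capacity reduces to 2r/N, matching the homogeneous case.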
16. Heterogeneous Network
- As before, the total capacity we need with VLB is twice what we'd need if we knew the traffic matrix (and it was static).
- With oblivious routing we need an extra [...] capacity.
17. Is VLB efficient?
- Not knowing the traffic matrix means we need a total capacity twice as large as if we did.
- But we never know the traffic matrix, and it changes. So the cost is surprisingly small.
- Anecdotally, a network that can support all traffic matrices and behaves predictably on failure requires less capacity than existing networks.
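One way to see where the factor of two comes from is the rough sketch below. It assumes the homogeneous full mesh in which every directed link has capacity $2r/N$. The lower bound of $r$ outgoing capacity per node for a known traffic matrix is a simplifying assumption here, not a figure from the slides.

```latex
C_{\text{VLB}} = N (N-1) \cdot \frac{2r}{N} = 2r(N-1),
\qquad
C_{\text{known}} \ge N r,
\qquad
\frac{C_{\text{VLB}}}{N r} = \frac{2(N-1)}{N} < 2 .
```

Under these assumptions the ratio approaches 2 as N grows, which matches the "twice as large" claim.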
18. Interconnecting Backbones
- Peering parameters
  - $R_p$ is the maximum peering traffic
  - $q_i > 0$ for peering nodes, $q_i = 0$ for non-peering nodes, $\sum_i q_i = 1$
  - Peering link capacity: $R_p q_i$
19. Within a VLB Network
- Assume peering condition is fixed
  - Given: $R_p$, $q_i$
  - Variables: $p_i$
- Spread traffic over the peering links
20. Spread over peering links
- $c_{ij} = r_i p_j + r_j p_i + \min(r_i, R_p)\,(\max(p_j, q_j) - p_j) + \min(r_j, R_p)\,(\max(p_i, q_i) - p_i)$
- If $R_p > r_i$, optimal solution: $p_i = q_i$, $c_{ij} = r_i q_j + r_j q_i$
- Efficient use of peering links
- Supports all traffic matrices as before
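The link-capacity expression on this slide can be evaluated directly. The sketch below takes the terms of the slide's c_ij expression to be added together (an assumption about the intended operators) and checks that setting p = q makes the two correction terms vanish. The function name and the example numbers are hypothetical.

```python
def peering_link_capacity(r_i, r_j, p_i, p_j, q_i, q_j, r_p):
    """c_ij read as a sum of a VLB term and two peering correction terms.

    Assumption: the four terms are added; the operators are inferred, not
    quoted from the slide.
    """
    vlb_term = r_i * p_j + r_j * p_i
    peering_term = (min(r_i, r_p) * (max(p_j, q_j) - p_j)
                    + min(r_j, r_p) * (max(p_i, q_i) - p_i))
    return vlb_term + peering_term


# With p_i = q_i the correction terms are zero and c_ij = r_i*q_j + r_j*q_i.
c = peering_link_capacity(r_i=10.0, r_j=20.0, p_i=0.25, p_j=0.75,
                          q_i=0.25, q_j=0.75, r_p=30.0)
print(c)
```

When p differs from q, the max(...) factors add capacity only on links toward nodes whose peering share q exceeds their spreading share p.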
21. Other questions
- Delay-sensitive applications
  - How much does it matter?
  - It may matter for interactive voice, video, gaming
  - Dealing with it: express paths, adaptive load-balancing