Title: Interview talk at various universities and labs
1Staying Connected in a Connected World
Dina Katabi, Nate Kushman, Srikanth Kandula, and
Bruce Maggs
2Losing Connectivity Because of BGP Dynamics
- Route changes cause up to 30% packet loss for more than 2 minutes [Labovitz00]
- Routing events can cause multiple loss bursts, and one loss burst can last for up to 20s [Wang06]
- Popular and unpopular prefixes experience losses due to BGP dynamics [Wang05]
- VoIP outages are highly correlated with BGP updates [Kushman06]
3Links, Links Everywhere But Not a Path to
Forward!
- We keep ASs connected as long as the graph is
connected
4 Focus on Forwarding
- Don't worry about BGP's routing
- During convergence, BGP can create transient loops and no-paths
- Address transient loops or no-paths by forwarding packets on pre-computed failover paths
- This talk describes our solution to the transient no-path problem
5Why Forwarding?
- Convergence is unlikely to be fast enough
- Even a few seconds of disconnectivity affects real-time apps such as VoIP and gaming
- Strict timing constraints limit innovation
- E.g., they prevent a future BGP that considers path capacity
6Transient No-Path Problem
ATT.
Sprint.
Jen
Tim
All of Tim's neighbors are using him to get to
MIT → Nobody tells Tim an alternate path
MIT
7Transient No-Path Problem
ATT.
Sprint.
Tim knows no alternate path to MIT
Jen
Tim drops ATT's and Jen's packets to MIT, and
his own
Tim
LOSS!
MIT
8Transient No-Path Problem
Eventually, Tim withdraws path from ATT and Jen
ATT.
Sprint.
Jen
ATT and Jen stop sending packets to Tim
Tim
MIT
9Transient No-Path Problem
Eventually, Tim withdraws path from ATT and Jen
ATT.
Sprint.
Jen
ATT and Jen stop sending packets to Tim
Tim
ATT announces the Sprint path to Tim and Jen →
Traffic flows
MIT
Transient No-Path causes temporary
disconnectivity
10How do we solve Tim's problem?
- Tell Tim a failover path before the link fails, rather than after it, as is often the case in current BGP
11Help Tim Help You!
ATT advertises to Tim "ATT → Sprint → MIT" as a
failover path
ATT.
Sprint.
Jen
Link fails → Tim immediately sends traffic on
failover path
Tim
Internally, ATT tunnels Tim's traffic toward
Sprint
MIT
No loss!
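The failover behavior on this slide can be sketched as a per-destination forwarding decision. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `next_hop`, the representation of links as unordered AS pairs, and the convention that `None` means "drop" are all hypothetical.

```python
from typing import FrozenSet, Optional, Set


def link(a: str, b: str) -> FrozenSet[str]:
    """Represent an inter-AS link as an unordered pair (assumed model)."""
    return frozenset((a, b))


def next_hop(local_as: str,
             primary: str,
             failover: Optional[str],
             down_links: Set[FrozenSet[str]]) -> Optional[str]:
    """Pick the next-hop AS for a packet, or None if it must be dropped.

    The primary next hop is used while its link is up. If that link is
    down and a failover next hop was learned in advance, traffic switches
    to it immediately, without waiting for routing to converge.
    """
    if link(local_as, primary) not in down_links:
        return primary
    if failover is not None and link(local_as, failover) not in down_links:
        return failover
    return None  # no usable path known: the transient no-path case


# Tim's direct link to MIT has failed.
down = {link("Tim", "MIT")}
print(next_hop("Tim", "MIT", None, down))   # without a failover path: drop
print(next_hop("Tim", "MIT", "ATT", down))  # with ATT's failover path
```

With no failover path Tim has nothing to forward on, which is the loss shown on the earlier slides; with the path advertised by ATT he keeps forwarding.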
12Can ATT advertise a failover path to every
neighbor?
- No, because
- Excessive overhead
- ATT can't tell whether packets are for the primary or the failover path
Constraint: An AS can advertise only one failover
path, and only to its next-hop AS
13Goal: Staying Connected
ATT.
Sprint.
- If Tim's link to the destination fails
- and
- after convergence Tim will have a path to the destination
X
Tim should have a failover path to the
destination when the link fails
14How do we achieve the goal given the constraint?
We can only pick which failover path an AS
advertises to its next-hop AS
15ATT.
Jen
x
Tim
Nick
Dest
The most disjoint path protects against more link
failures
16Resilient BGP (R-BGP)
- Each AS advertises to its next-hop AS a failover path: the path most disjoint from its primary
Theorem 1: If any AS using the down link will
have a path after convergence, then R-BGP
guarantees that the AS immediately above the down
link knows a failover path when the link fails.
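The "most disjoint from its primary" rule can be sketched as follows. The slides do not give the exact disjointness measure, so this sketch assumes it means sharing the fewest links with the primary path, with ties broken by shorter path length; both choices, the function names, and the example topology are assumptions made for illustration.

```python
from typing import FrozenSet, List, Optional, Sequence, Set


def path_links(path: Sequence[str]) -> Set[FrozenSet[str]]:
    """Return the set of (unordered) AS-to-AS links along a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def most_disjoint(primary: Sequence[str],
                  candidates: List[Sequence[str]]) -> Optional[Sequence[str]]:
    """Pick the candidate sharing the fewest links with the primary path.

    Ties are broken in favor of the shorter path (an assumption).
    Returns None when there is no candidate to advertise.
    """
    if not candidates:
        return None
    primary_links = path_links(primary)
    return min(candidates,
               key=lambda p: (len(path_links(p) & primary_links), len(p)))


# Hypothetical topology: ATT's primary path to Dest goes through Tim.
primary = ["ATT", "Tim", "Dest"]
candidates = [
    ["ATT", "Jen", "Tim", "Dest"],  # reuses the Tim-Dest link
    ["ATT", "Nick", "Dest"],        # shares no link with the primary
]
print(most_disjoint(primary, candidates))
```

In this example the path through Nick is chosen: it survives a failure of either primary link, whereas the path through Jen fails together with the Tim-Dest link.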
17 Theorem 2: All ASs that will eventually learn a
valley-free path to the destination are
guaranteed no BGP-caused packet loss during
convergence
A path is valley-free if no AS transits between
two non-customer ASs
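The valley-free definition above can be checked mechanically. This is a minimal sketch, assuming business relationships are given as a map from each AS to the set of its customers; the function name and the example relationships are hypothetical.

```python
from typing import Dict, Sequence, Set


def is_valley_free(path: Sequence[str],
                   customers: Dict[str, Set[str]]) -> bool:
    """Check that no AS on the path transits between two non-customers.

    For every intermediate AS, at least one of its two neighbors on the
    path must be one of its customers.
    """
    for prev_as, transit_as, next_as in zip(path, path[1:], path[2:]):
        own_customers = customers.get(transit_as, set())
        if prev_as not in own_customers and next_as not in own_customers:
            return False
    return True


# Hypothetical relationships: Tim buys transit from ATT and from Sprint,
# and MIT buys transit from Sprint.
customers = {"ATT": {"Tim"}, "Sprint": {"Tim", "MIT"}}

print(is_valley_free(["Tim", "ATT", "Sprint", "MIT"], customers))
print(is_valley_free(["ATT", "Tim", "Sprint"], customers))
```

The first path is valley-free because each transit AS carries traffic for one of its customers. The second is not: Tim would be carrying traffic between his two providers.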
18Experimental Results
- Event-driven simulation
- Dual-homed AS loses one link
- Find percentage of ASs that see temporary
disconnectivity to the dual-homed AS
X
MIT
- State-of-the-art policy inference based on [Xia04] and [Subramanian02]
19Compared Schemes
- Current BGP
- Most-disjoint failover path
- The most-disjoint path may not be policy compliant. Still, an AS may want to advertise it because
- It is temporary
- The AS protects its own traffic
20Compared Schemes
- Current BGP
- Most-disjoint failover path
- Most-disjoint policy-compliant failover path
21Results
Percentage of ASs with transient disconnectivity
9% with current BGP
0% with most-disjoint path
22Results
Percentage of ASs with transient disconnectivity
9% with current BGP
0.5% with policy-compliant most-disjoint path
0% with most-disjoint path
Policy-compliant failover paths may be sufficient
23Conclusion
- BGP loses connectivity even when the graph is connected
- R-BGP solves this problem by advertising a single failover path downstream
- BGP's convergence stays unaffected
- Simple and powerful
24Questions?
25Characteristics of R-BGP
- One path per neighbor, just like BGP
- No change of BGP convergence properties
26 ATT
B
Tunnel
A
Failover traffic is tunneled inside an AS
27ATT.
Sprint.
Tim withdraws primary path from ATT
Jen
ATT moves to second best path through Jen
Tim
ATT withdraws the failover path from Tim because its
next hop is now Jen, and sends Tim Jen's path
Outage!
MIT
Tim drops packets!
The problem is avoided if Tim's withdrawal
indicates the down link