Understanding the Large-Scale Dynamics of Internet Routing Protocols


1
Understanding the Large-Scale Dynamics of Internet Routing Protocols
The Global Internet: Measurement, Modeling and Analysis
Leiden, the Netherlands, September 2000
Craig Labovitz, Roger Wattenhofer, Srinivasan Venkatachary
Microsoft Research
{labovit, rogerwar, cheenu}@microsoft.com
Abha Ahuja, Farnam Jahanian, Abhijit Bose
University of Michigan
{ahuja, farnam, abose}@umich.edu
2
Motivation
  • Large-scale, distributed protocols/systems often
    exhibit unexpected behaviors in deployment
  • Self-synchronization
  • BGP/TCP pathologies
  • Global TCP flow synchronization
  • Modeling routing behaviors critical for improved
    end-to-end performance and reliability
  • Can we model Internet routing dynamics?
  • What are the properties of fault propagation (and
    recovery) of Internet paths?

3
Conventional Routing Wisdom (IETF, IAB, Books,
ISPs, etc.)
  • Internet routing is robust under faults
  • Supports path re-routing and restoral on the
    order of seconds
  • BGP has good convergence properties
  • Does not exhibit looping/bouncing problems of RIP
  • Internet fail-over will improve with faster
    routers and faster links
  • More redundant connections (multi-homing) to
    Internet will always improve site fault-tolerance
  • Internet topology/diameter of all paths small
    (< 3)

4
In This Talk
  • We will show that most of the conventional wisdom
    about routing convergence is not accurate
  • Measurement of BGP failures
  • Measurement of BGP dynamics following failures
  • Analysis/intuition behind delayed BGP routing
    convergence
  • Impact of policy and topology on BGP convergence

5
Basic Methodology
  • Deploy probe machines at IXPs around the world
  • Write home-brewed Unix software tools to collect
    (and later, inject) BGP (and OSPF/ISIS/RIP)
    routing information from lots of commercial
    providers

6
Internet BGP Update Volume
  • Withdraws in millions until 2/1998 due to
    withdraw looping/Cisco bug. Dramatic drop after
    IOS release
  • Announcements growing after 6/98 due to MED
    policy and convergence?

7
Open Question
  • After a fault in a path to multi-homed site, how
    long does it take for the majority of Internet
    routers to fail-over to the secondary path?
  • Routing table convergence (backbone routers reach
    steady-state) after a fault
  • End-to-end paths stable (normal levels of loss
    and latency)

[Figure: a Customer site connected via BGP to a Primary ISP and via BGP to a Backup ISP]
8
Experiments
  • Inject BGP faults (announcements/withdraws) of
    varied prefix and ASPath lengths into
    topologically and geographically diverse ISP
    peering sessions
  • Monitor the impact of faults through 1) recordings of
    default-free BGP peering sessions with 20
    tier1/tier2 ISPs and 2) active ICMP measurements
    (512 bytes/second to 100 random web sites)
  • Wait two years (and 250,000 faults)

9
Fault Scenarios
  • Tup -- A new route is advertised
  • Tdown -- A route is withdrawn (i.e. single-homed
    failure)
  • Tshort -- Advertise a shorter/better ASPath (i.e.
    primary path repaired)
  • Tlong -- Advertise a longer/worse ASPath
    (i.e. primary path fails)
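
The four fault scenarios above can be sketched as a small classifier. This is only an illustration: the function name `classify_event` and the representation of a route as a list of AS numbers (or `None` for "no route") are assumptions, not part of the original experiments, and "shorter/better" is approximated here by ASPath length alone.

```python
def classify_event(old_path, new_path):
    """Label a routing change using the slide's four scenarios.

    old_path / new_path are lists of AS numbers, or None when no route exists.
    Hypothetical helper: ASPath length is the only "better/worse" criterion.
    """
    if old_path is None and new_path is None:
        raise ValueError("no route before or after: nothing to classify")
    if old_path is None:
        return "Tup"      # a new route is advertised
    if new_path is None:
        return "Tdown"    # the route is withdrawn
    if len(new_path) < len(old_path):
        return "Tshort"   # shorter ASPath, e.g. primary path repaired
    if len(new_path) > len(old_path):
        return "Tlong"    # longer ASPath, e.g. primary path failed
    return "same-length"  # not one of the four scenarios


print(classify_event(None, [5696, 2129]))             # Tup
print(classify_event([5696, 2129], [1, 5696, 2129]))  # Tlong
```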

10
Major Convergence Results
  • Routing convergence requires an order of
    magnitude longer than expected (10s of minutes)
  • Routes converge more quickly following Tup/Repair
    than Tdown/Failure events (bad news travels more
    slowly)
  • Curiously, withdrawals (Tdown) generate several
    times as many announcements as new-route
    advertisements (Tup) do

11
Example of BGP Convergence
  • TIME     BGP Message/Event
  • 10:40:30 Route Fails/Withdrawn by AS2129
  • 10:41:08 2117 announce 5696 2129
  • 10:41:32 2117 announce 1 5696 2129
  • 10:41:50 2117 announce 2041 3508 3508 4540 7037
    1239 5696 2129
  • 10:42:17 2117 announce 1 2041 3508 3508 4540 7037
    1239 5696 2129
  • 10:43:05 2117 announce 2041 3508 3508 4540 7037
    1239 6113 5696 2129
  • 10:43:35 2117 announce 1 2041 3508 3508 4540 7037
    1239 6113 5696 2129
  • 10:43:59 2117 sends withdraw
  • BGP log of updates from AS2117 for route via
    AS2129
  • One BGP withdrawal triggers 6 announcements and
    one withdrawal from 2117
  • Increasing ASPath length until final withdraw
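
The log on this slide can be checked mechanically. The sketch below copies the entries from the slide and assumes the timestamps are wall-clock HH:MM:SS values; the tuple layout and variable names are illustrative choices, not the authors' tooling.

```python
from datetime import datetime

# (timestamp, event, ASPath) tuples copied from the slide's log.
# Assumption: timestamps are wall-clock HH:MM:SS values.
LOG = [
    ("10:40:30", "fail", ""),
    ("10:41:08", "announce", "5696 2129"),
    ("10:41:32", "announce", "1 5696 2129"),
    ("10:41:50", "announce", "2041 3508 3508 4540 7037 1239 5696 2129"),
    ("10:42:17", "announce", "1 2041 3508 3508 4540 7037 1239 5696 2129"),
    ("10:43:05", "announce", "2041 3508 3508 4540 7037 1239 6113 5696 2129"),
    ("10:43:35", "announce", "1 2041 3508 3508 4540 7037 1239 6113 5696 2129"),
    ("10:43:59", "withdraw", ""),
]


def to_time(stamp):
    """Parse an HH:MM:SS string into a datetime (date part is irrelevant here)."""
    return datetime.strptime(stamp, "%H:%M:%S")


announcements = [entry for entry in LOG if entry[1] == "announce"]
# ASPath length counts every entry, including the prepended duplicate 3508.
path_lengths = [len(path.split()) for _, _, path in announcements]
# Time from the original failure to AS2117's final withdrawal.
convergence_seconds = int((to_time(LOG[-1][0]) - to_time(LOG[0][0])).total_seconds())

print(len(announcements), path_lengths, convergence_seconds)
```

Under these assumptions the script reports 6 announcements, ASPath lengths of 2, 3, 8, 9, 9 and 10 (never decreasing), and 209 seconds between the failure and the final withdrawal.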

12
CDF of BGP Routing Table Convergence Times
[Figure: CDF curves labeled New Route, Long->Short Fail-over, Short->Long Fail-over, and Failure]
  • Less than half of Tdown events converge within
    two minutes
  • Tup/Tshort and Tdown/Tlong form equivalence
    classes
  • Long tailed distribution (up to 15 minutes)

13
Impact of Delayed Convergence
  • Why do we care about routing table convergence?
    It deleteriously impacts end-to-end Internet
    paths
  • ICMP experiment results
  • Loss of connectivity, packet loss, latency, and
    packet re-ordering for an average of 3-5 minutes
    after a fault
  • Why? Routers drop packets for which they do not
    have a valid next hop. Also problems with cache
    flushing in some older routers.

14
End-to-End Impact of Failover
  • ICMP loss to 100 randomly chosen web sites with
    VIF source address of our probe
  • Tlong/Tshort exhibit similar relationship as
    before

15
Delayed Convergence Background
  • Well known that distance vector protocols exhibit
    poor convergence behaviors
  • Counting to infinity, looping, bouncing problem
  • RIP redefines infinity and adds split-horizon,
    poison reverse, etc.
  • Still, slow convergence (N^3) and not scalable
  • BGP advertises ASPaths instead of distance
  • ASPath solves the counting-to-infinity and RIP
    looping problems, but
  • BGP can still explore invalid paths during
    convergence (i.e. the bouncing problem)
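
The counting-to-infinity behavior mentioned above can be illustrated with a toy simulation. It is a minimal sketch under stated assumptions: a hypothetical three-node line topology R -- A -- B, no split horizon or poison reverse, strictly alternating updates, and RIP's convention that a metric of 16 means unreachable.

```python
INFINITY = 16  # RIP treats a metric of 16 as "unreachable"


def count_to_infinity(infinity=INFINITY):
    """Simulate A and B after the A-R link fails (hypothetical topology R -- A -- B).

    Before the fault A reaches R at cost 1 and B reaches R via A at cost 2.
    Without split horizon, A believes B's stale route and the two routers
    push each other's metric up until both reach infinity.
    """
    cost_a, cost_b = 1, 2
    history = []
    while cost_a < infinity or cost_b < infinity:
        cost_a = min(cost_b + 1, infinity)  # A re-learns R through B
        history.append(("A", cost_a))
        cost_b = min(cost_a + 1, infinity)  # B's route goes through A
        history.append(("B", cost_b))
    return history


for router, metric in count_to_infinity():
    print(router, metric)
```

Each router raises its metric by two per exchange, so the invalid route survives many update intervals before both sides agree the destination is gone.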

16
Problems with Distance Vector Protocols: Counting
to Infinity
[Figure: routers A and B and destination R, with route entries "R 5" and "R 7"]
17
BGP Convergence Example
18
N > 4?
[Figure: AS-level topology; each AS is labeled with its ASPath to AS237]
AS6453: 6453 1239 5696 237
AS2497: 2497 5696 237
AS6113: 6113 2914 237
AS6461: 6461 5696 237
AS1239: 1239 5696 237
AS5696: 5696 237
AS2914: 2914 237
AS237: 237
AS701: 701 6461 5696 237
AS5000: 5000 237
AS1: 1 5696 237
AS1673: 1673 5696 237
19
Intuition for Delayed BGP Convergence
  • There exists a possible ordering of messages such
    that BGP will explore ALL possible ASPaths of ALL
    possible lengths
  • BGP is O(N!), where N is the number of default-free BGP
    speakers in a complete graph with default policy
  • Although seemingly very different protocols, BGP
    and RIP share very similar convergence behaviors.
    Major difference:
  • RIP explores metrics (1..N)
  • BGP ASPath provides multiple ways to represent a
    metric (path) of length N, or (N-1)!
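
The contrast between N metrics and (N-1)! ASPaths can be checked by brute force for small N. The sketch below assumes a complete graph of N ASes numbered 0..N-1 with AS 0 as the origin; the function name is illustrative.

```python
from itertools import permutations
from math import factorial


def full_length_aspaths(n):
    """All loop-free ASPaths that visit each of n ASes exactly once
    and end at the origin AS 0, in a complete graph (assumed topology)."""
    others = range(1, n)
    return [order + (0,) for order in permutations(others)]


n = 5
paths = full_length_aspaths(n)
rip_metrics = list(range(1, n + 1))  # a distance-vector metric only takes values 1..N

print(len(rip_metrics), "possible RIP metrics")
print(len(paths), "distinct ASPaths of length", n, "=", factorial(n - 1))
```

For N = 5 there are only 5 metric values but 24 distinct full-length ASPaths, which is why path exploration grows so much faster than metric counting.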

20
BGP and RIP
  • Both exhibit routing table loops
  • Both learn invalid state from neighbors (based on
    incomplete knowledge) and propagate invalid state
    information to neighbors
  • Both employ hold-downs
  • RIP: 30-second timer
  • BGP: MinRouteAdver
  • Adds synchronization in best case

21
Lower Bound on BGP
  • If we assume optimal ordering of messages, what is
    the best we can expect from BGP?
  • In practice, BGP timers (MinRouteAdver) provide
    synchronization and limit possible orderings of
    messages
  • MinRouteAdver timer specifies interval between
    successive updates sent to a peer for a given
    prefix
  • Useful for bundling updates together
  • According to the RFC, MinRouteAdver applies only to
    announcements
  • But the interaction of MinRouteAdver and vendor
    ASPath loop detection implementations introduces
    artificial delay

22
MinRouteAdver
  • Minimum interval between successive updates sent
    to a peer for a given prefix
  • Allow for greater efficiency/packing of updates
  • Rate throttle
  • Applied only to announcements (at least according
    to BGP RFC)
  • Applied on (prefix destination, peer) basis, but
    implemented on (peer) basis
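
The timer described above can be sketched as a simple rate limiter keyed on (prefix, peer). This is a hypothetical illustration, not any vendor's implementation: the class name, method signature, and the 30-second default are assumptions, and withdrawals are exempted as the slide's reading of the RFC suggests.

```python
MIN_ROUTE_ADVER = 30.0  # seconds; assumed interval


class MinRouteAdverLimiter:
    """Hypothetical per-(prefix, peer) throttle for BGP announcements."""

    def __init__(self, interval=MIN_ROUTE_ADVER):
        self.interval = interval
        self.last_sent = {}  # (prefix, peer) -> time of last announcement sent

    def may_send(self, prefix, peer, now, is_withdrawal=False):
        """Return True if an update for prefix may go to peer at time now."""
        if is_withdrawal:
            return True  # withdrawals are not rate limited in this sketch
        key = (prefix, peer)
        last = self.last_sent.get(key)
        if last is not None and now - last < self.interval:
            return False  # still inside the hold period; caller must queue it
        self.last_sent[key] = now
        return True


limiter = MinRouteAdverLimiter()
print(limiter.may_send("192.0.2.0/24", "peerA", now=0.0))   # first announcement
print(limiter.may_send("192.0.2.0/24", "peerA", now=10.0))  # too soon
print(limiter.may_send("192.0.2.0/24", "peerA", now=30.0))  # interval elapsed
```

Keying the dictionary on the peer alone instead of (prefix, peer) would model the per-peer implementation the slide mentions, at the cost of delaying unrelated prefixes.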

23
MinRouteAdver
  • 30 * (N-3) second delay due to the creation of mutual
    dependencies. We provide a proof that N-3 rounds are
    necessarily created during bounded BGP
    MinRouteAdver convergence
  • Rounds due to
  • Ambiguity in the BGP RFC and lack of sender-side
    loop detection
  • Inclusion of BGP withdrawals with MinRouteAdver
    (in violation of RFC)

24
MinRouteAdver Rounds
  • Implementation of the MinRouteAdver timer and
    receiver-side loop detection timer leads to 30-second
    rounds: O(N-3) * 30 seconds time complexity
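
As a worked example of the bound on this slide, the following sketch evaluates (N-3) rounds of 30 seconds each. The function name is illustrative, and N is taken to be the number of BGP speakers in a complete graph, as on the earlier slides.

```python
def minrouteadver_delay_seconds(n, round_seconds=30):
    """Delay from (n - 3) MinRouteAdver rounds of round_seconds each.

    n is assumed to be the number of BGP speakers in a complete graph.
    """
    if n < 3:
        raise ValueError("the (N-3) bound is only meaningful for N >= 3")
    return (n - 3) * round_seconds


for n in (3, 5, 10):
    print(n, "speakers ->", minrouteadver_delay_seconds(n), "seconds")
```

For N = 5 this gives 60 seconds, and for N = 10 it gives 210 seconds.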

25
Impact of Policy and Topology
  • In practice, Internet is not a complete graph and
    ASes maintain complex routing policies
  • Given ISP policies and an Internet topology for a
    route, can we estimate the time required for
    convergence?
  • Most analysis of Internet topology is based on
    steady-state or low frequency snapshots
  • How does steady-state topology compare to set of
    all possible paths?

26
Comparing ISP Convergence Latencies
  • CDF of faults injected into three Mae-West
    providers and observed at Japanese ISP
  • Significant variations between providers

27
Observed Fault Injection Topologies
[Figure labels: ISP 4, MAE-WEST]
  • In steady-state, the topologies between ISP1, ISP2,
    and ISP3 are similar: all are direct BGP peers of ISP4.
    This does not explain the variation
  • Most studies report the steady-state diameter of the
    Internet to be relatively small (< 3 AS)

28
Factors Impacting BGP Propagation
  • Each AS router adds between 0-45 MinRouteAdver
    Delay
  • IBGP
  • MinRouteAdver race conditions

29
ISP1-ISP4 Paths During Failure
[Figure: paths between ISP 1 (R1) and ISP 4 in steady state and after the fault]
  • Only one back up path (length 3)

30
ISP2-ISP4 Paths During Failure
[Figure: paths between ISP 2 (R2) and ISP 4 in steady state and after the fault]
31
ISP3-ISP4 Paths During Failure
[Figure: paths between ISP 3 (R3) and ISP 4 in steady state and after the fault]
32
Relationship Between Backup Paths and Convergence
  • Convergence time is related to the length of the longest
    possible backup ASPath between two nodes
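
One way to explore this relationship is to enumerate every loop-free path between two nodes and take the longest. The sketch below does that by depth-first search over a small hypothetical topology (nodes A to D are invented for illustration and are not the measured ISPs), and it ignores routing policy entirely.

```python
def simple_paths(graph, src, dst, path=None):
    """Yield every loop-free path from src to dst as a list of nodes."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for neighbor in graph[src]:
        if neighbor not in path:  # never revisit a node: mirrors ASPath loop detection
            yield from simple_paths(graph, neighbor, dst, path)


# Hypothetical AS-level adjacency; A and D peer directly in steady state.
TOPOLOGY = {
    "A": ["B", "D"],
    "B": ["A", "C", "D"],
    "C": ["B", "D"],
    "D": ["A", "B", "C"],
}

paths = list(simple_paths(TOPOLOGY, "A", "D"))
backup_paths = [p for p in paths if len(p) > 2]  # everything except the direct link
longest_backup = max(backup_paths, key=len)

print(paths)
print("longest backup path:", longest_backup, "hops:", len(longest_backup) - 1)
```

In this toy graph the steady-state path is one hop, while the longest backup path A, B, C, D is three hops, which is the quantity the slide relates to convergence time.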

33
Conclusion and Next Steps
  • The Internet does not possess effective inter-domain
    fail-over (15 minutes is a long time for a phone
    call)
  • Majority of BGP convergence delay is due to vendor
    implementation decisions of MinRouteAdver and
    loop detection
  • In practice, the Internet is not a complete graph and
    the same degree of message re-ordering is unlikely. Our
    current work:
  • What is the impact of ISP policy and topology on
    BGP convergence?
  • Can we improve BGP convergence times?

34
MTTF of Backbone Networks
  • Informally: how long before a network is
    unreachable?
  • Majority of Internet routes unreachable within 30
    days

35
Mean Time to Fail-Over
  • How long before traffic is re-routed?
  • Majority of Internet routes which possess backup
    paths fail-over every 3 days

36
Internet Route Repair
  • How long before a network is reachable again?
  • Long-tailed distribution with plateau at 30
    minutes. Why this plateau?

37
Simulation Results