Routing Convergence - PowerPoint PPT Presentation

Slides: 54
Provided by: ackr

1
Routing Convergence
  • Global Routing

2
Internet Routing Convergence
  • An Experimental Study of Delayed Internet
    Routing Convergence
  • Craig Labovitz, Abha Ahuja, Farnam Jahanian,
    Abhijit Bose
  • ACM Sigcomm September 2000

3
Hierarchical Routing -- Review
  • Untruths about Internet Routing
  • all routers identical
  • network flat
  • not true in practice
  • administrative autonomy
  • internet = network of networks
  • each network admin may want to control routing in
    its own network
  • scale: with 50 million destinations
  • can't store all destinations in routing tables!
  • routing table exchange would swamp links!

4
Hierarchical Routing
  • aggregate routers into regions, autonomous
    systems (AS)
  • routers in same AS run same routing protocol
  • intra-AS routing protocol
  • routers in different AS can run different
    intra-AS routing protocol
  • gateway routers: special routers in AS
  • run intra-AS routing protocol with all other
    routers in AS
  • also responsible for routing to destinations
    outside AS
  • run inter-AS routing protocol with other gateway
    routers

5
Intra-AS and Inter-AS routing
  • Gateways
  • perform inter-AS routing amongst themselves
  • perform intra-AS routing with other routers in
    their AS

[Figure: ASes A, B, and C with their routers; gateway A.c runs inter-AS and intra-AS routing at the network layer, above the link and physical layers]
6
Intra-AS and Inter-AS routing
[Figure: path to Host h2, using intra-AS routing within AS A and intra-AS routing within AS B]
8
AS graphs obscure topology!
The AS graph may look like this.
Tim Griffin, Leiden 2000
9
Inter-AS routing (cont)
  • BGP (Border Gateway Protocol): the de facto
    standard
  • Path Vector protocol: an extension of Distance
    Vector
  • Each Border Gateway broadcasts to neighbors
    (peers) the entire path (i.e., sequence of ASs) to
    destination
  • For example, Gateway X may store the following
    path to destination Z:
  • Path (X,Z) = X, Y1, Y2, Y3, ..., Z

10
Inter-AS routing (cont)
  • Now, suppose Gwy X sends its path to peer Gwy W
  • Gwy W may or may not select the path offered by
    Gwy X, because of cost, policy, or loop
    prevention reasons.
  • If Gwy W selects the path advertised by Gwy X,
    then
  • Path (W,Z) = W, Path (X,Z)
  • Note: path selection is based not so much on cost
    (e.g., number of AS hops), but mostly on
    administrative and policy issues (e.g., do not
    route packets through a competitor's AS)
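
The prepend-and-check step described above can be sketched in a few lines of Python. This is a minimal illustration, not BGP itself: the function name `receive_path` and the list-of-strings path representation are assumptions made for the example, and cost and policy checks are left out.

```python
from typing import List, Optional


def receive_path(own_as: str, advertised: List[str]) -> Optional[List[str]]:
    """Return the path own_as would store, or None if it must reject the offer."""
    # Loop prevention: a path that already contains our own AS would loop.
    if own_as in advertised:
        return None
    # Otherwise prepend ourselves: Path(W,Z) = W, Path(X,Z).
    return [own_as] + advertised


# Hypothetical path from the slide, with three intermediate ASs.
path_x_z = ["X", "Y1", "Y2", "Y3", "Z"]
print(receive_path("W", path_x_z))   # W accepts and prepends itself
print(receive_path("Y2", path_x_z))  # Y2 is already on the path, so it rejects
```

Carrying the whole path is what lets a gateway detect a loop locally, which a plain distance value cannot do.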

11
Inter-AS routing (cont)
  • Peers exchange BGP messages using TCP.
  • OPEN msg: opens TCP connection to peer and
    authenticates sender
  • UPDATE msg: advertises new path (or withdraws old)
  • KEEPALIVE msg: keeps connection alive in absence
    of UPDATES; it also serves as ACK to an OPEN
    request
  • NOTIFICATION msg: reports errors in previous msg;
    also used to close a connection
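
As a small sketch, the four message types can be written as an enumeration. The numeric values are the message type codes from the BGP-4 specification (RFC 4271), not from the slide; the `PURPOSE` table simply restates the bullets above and is a hypothetical helper.

```python
from enum import IntEnum


class BgpMessageType(IntEnum):
    """BGP message types; values are the type codes used in the BGP header."""
    OPEN = 1
    UPDATE = 2
    NOTIFICATION = 3
    KEEPALIVE = 4


# Hypothetical lookup table restating the slide's one-line descriptions.
PURPOSE = {
    BgpMessageType.OPEN: "open connection to peer and authenticate sender",
    BgpMessageType.UPDATE: "advertise new path or withdraw old one",
    BgpMessageType.NOTIFICATION: "report error or close connection",
    BgpMessageType.KEEPALIVE: "keep connection alive; acknowledge OPEN",
}

for msg in BgpMessageType:
    print(int(msg), msg.name, "-", PURPOSE[msg])
```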

12
Why different Intra- and Inter-AS routing ?
  • Policy: Inter is concerned with policies (which
    provider we must select/avoid, etc.). Intra is
    contained in a single organization, so no policy
    decisions are necessary
  • Scale: Inter provides an extra level of routing
    table size and routing update traffic reduction
    above the Intra layer
  • Performance: Intra is focused on performance
    metrics and needs to keep costs low. In Inter it
    is difficult to propagate performance metrics
    efficiently (latency, privacy, etc.). Besides,
    policy-related information is more meaningful.
  • We need BOTH!

13
What is Routing Policy?
  • Description of the routing relationship between
    autonomous systems
  • Who are the peers?
  • What routes are
  • Originated by a peer?
  • Imported from each peer?
  • Exported to each peer?
  • Preferred when multiple routes exist?
  • What to do if no route exists?

14
The example I mentioned earlier
Date: Fri, 25 Apr 1997 20:16:47 -0500 (CDT)
Subject: ALERT Massive Routing Failures
At about 10:30 AM today, one of Sprint's
customers (AS7007, Florida Internet Exchange)
began announcing a /24 route for every CIDR block
in the core routing table. This was due to a
configuration problem in that they imported all
their routing into a classful interior routing
protocol and then redistributed the route back
into BGP, becoming a source for the first class C
network in every CIDR block. Sprint does no
border routing filters, so they happily accepted
these routes and gave them away to all
15
Motivation
  • Why should we care about convergence?
  • Routing reliability/fault-tolerance on small time
    scales (minutes) not previously a priority
  • Emerging transaction oriented and interactive
    applications (e.g. Internet Telephony) will
    require higher levels of end2end network
    reliability
  • How well does the Internet routing infrastructure
    tolerate faults?

16
Conventional Routing Wisdom
  • "The Internet is designed to survive a nuclear
    cataclysm." Internet routing is robust under
    faults
  • Supports path re-routing and restoral on the
    order of seconds
  • The internet supports fast path rerouting and
    restoral. BGP has good convergence properties
  • Does not exhibit looping/bouncing problems of RIP
  • Internet fail-over will improve with faster
    routers and faster links
  • More redundant connections (multi-homing) to
    Internet will always improve site fault-tolerance

17
Contribution
  • Labovitz et al. show that most of the conventional
    wisdom about routing convergence is not accurate
  • Measurement of BGP convergence in the Internet
  • Analysis/intuition behind delayed BGP routing
    convergence
  • Modifications to BGP implementations which would
    improve convergence times

18
Motivation
  • Why have fail-over and fault-tolerance not
    previously been a priority?
  • Applications like email not delay sensitive and
    possess fault-tolerance
  • TCP/IP fault-tolerance (resend)
  • Content replication helps improve reliability for
    static content
  • Network support is required for emerging
    transaction oriented and interactive applications
    (e.g. Internet Telephony, QoS)

19
Building a Reliable Internet
  • What Network support has been proposed already?
  • Significant recent improvement on data-link
    fail-over (e.g. SRP, Sonet). Solves some
    enterprise, intra-domain reliability problems
  • Also significant research on QoS and resource
    reservation protocols for the Internet
  • But, all of these protocols assume stable
    underlying IP forwarding path

20
Background
  • Internet sites multi-home, or purchase
    connectivity from multiple Internet providers to
    improve fault tolerance
  • Goal: tolerate a single link, router, or ISP
    failure
  • 35 Internet end-sites currently multi-homed

21
Background: Multi-homing
22
PSTN versus Internet
  • Public Switched Telephone Network (PSTN) is the
    other network in place.
  • Trade-off between
  • scalability/extensibility/low cost and
  • fault-tolerance/service guarantees/high cost
  • PSTN retains significant intermediate state (i.e.
    circuit setup) and services on relatively few
    nodes. A Smart Network
  • Internet places all intelligence on end-nodes. A
    Stupid Network

23
Trade-Offs
[Figure: PSTN trade-offs on two High/Low scales: (1) state, reliability, service guarantees, development time, switch cost, coordination; (2) scalability, flexibility, distributed operation]
24
Routing
  • Unlike circuit-switched PSTN, packet-switched
    Internet uses hop-by-hop forwarding and next-hop
    selection
  • Global state and circuit-setup used in PSTN
  • this is like owning an atlas and planning route
  • Internet routers only keep local knowledge and
    routes learned from neighbors
  • like asking directions at each stop

25
Internet Routing
  • Inter-domain Internet routing protocols are
    distance vector (i.e. Bellman-Ford) algorithms.
    Unlike PSTN, no pre-computed backup paths!
  • Distance vector protocols are problematic
  • Require time to converge
  • Suffer from counting to infinity

26
Problems with Distance Vector Protocols: Counting
to Infinity
[Figure: routers A and B with destination R; distance labels "R 5" and "R 7"]
27
Internet Routing
  • The Internet inter-domain routing protocol, BGP,
    solves count-to-infinity problem by keeping
    record of path the route announcement has
    traveled through network
  • Internet routing commonly (and incorrectly)
    believed to converge within 30 seconds

28
BGP Routing
29
Open Question
  • After a fault in a path to multi-homed site, how
    long does it take for the majority of Internet
    routers to fail-over to the secondary path?
  • Routing table convergence (backbone routers reach
    steady-state) after a fault
  • End-to-end paths stable (normal levels of loss
    and latency)

[Figure: a customer connected via BGP to a primary ISP and a backup ISP]
30
Internet Fail-Over Experiments
  • Instrument the Internet
  • Inject routes into geographically and
    topologically diverse provider BGP peering
    sessions (Mae-West, Japan, Michigan, London)
  • Periodically fail and change these routes (i.e.
    send withdraws or new attributes)
  • Monitor impact of faults through 1) recordings of
    BGP peering sessions with 20 tier1/tier2 ISPs and
    2) active ICMP ECHO measurements (512 byte/second
    to 100 random web sites)
  • Write lots of Perl scripts
  • Wait two years (125,000 routing events)

31
Experiment (For the Last Two Years)
32
Fault Scenarios
  • Tup -- A new route is advertised
  • Tdown -- A route is withdrawn (i.e. single-homed
    failure)
  • Tshort -- Advertise a shorter/better ASPath (i.e.
    primary path repaired)
  • Tlong -- Advertise a longer/worse ASPath
    (i.e. primary path fails)
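
The four scenarios can be told apart by comparing the route before and after an event. The sketch below is an illustration only: the function `classify_event` is hypothetical, and it uses ASPath length as a stand-in for "shorter/better" versus "longer/worse".

```python
from typing import Optional, Sequence


def classify_event(old: Optional[Sequence[int]],
                   new: Optional[Sequence[int]]) -> str:
    """Classify a routing event from the ASPath before (old) and after (new).

    None means "no route". Path length is used as a proxy for better/worse.
    """
    if old is None and new is None:
        raise ValueError("no route before or after: nothing happened")
    if old is None:
        return "Tup"      # a new route is advertised
    if new is None:
        return "Tdown"    # the route is withdrawn
    if len(new) < len(old):
        return "Tshort"   # shorter path, e.g. primary path repaired
    if len(new) > len(old):
        return "Tlong"    # longer path, e.g. primary path failed
    return "same-length"  # not one of the four scenarios


print(classify_event(None, [2117, 5696, 2129]))                 # Tup
print(classify_event([2117, 5696, 2129], None))                 # Tdown
print(classify_event([2117, 1, 5696, 2129], [2117, 5696, 2129]))  # Tshort
print(classify_event([2117, 5696, 2129], [2117, 1, 5696, 2129]))  # Tlong
```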

33
Major Convergence Results
  • Routing convergence requires an order of
    magnitude longer than expected (10s of minutes)
  • Routes converge more quickly following Tup/Repair
    than Tdown/Failure events (bad news travels more
    slowly)
  • Curiously, withdrawals (Tdown) generate several
    times as many announcements as announcements
    (Tup) do

34
Example of BGP Convergence
  • TIME     BGP Message/Event
  • 10:40:30 Route Fails/Withdrawn by AS2129
  • 10:41:08 2117 announce 5696 2129
  • 10:41:32 2117 announce 1 5696 2129
  • 10:41:50 2117 announce 2041 3508 3508 4540 7037
    1239 5696 2129
  • 10:42:17 2117 announce 1 2041 3508 3508 4540 7037
    1239 5696 2129
  • 10:43:05 2117 announce 2041 3508 3508 4540 7037
    1239 6113 5696 2129
  • 10:43:35 2117 announce 1 2041 3508 3508 4540 7037
    1239 6113 5696 2129
  • 10:43:59 2117 sends withdraw
  • BGP log of updates from AS2117 for route via
    AS2129
  • One BGP withdrawal triggers 6 announcements and
    one withdrawal from 2117
  • Increasing ASPath length until final withdraw

35
CDF of BGP Routing Table Convergence Times
[Figure: CDF curves for New Route, Long->Short Fail-over, Short->Long Fail-over, and Failure]
  • Less than half of Tdown events converge within
    two minutes
  • Tup/Tshort and Tdown/Tlong form equivalence
    classes
  • Long tailed distribution (up to 15 minutes)

36
Impact of Delayed Convergence
  • Why do we care about routing table convergence?
    It deleteriously impacts end-to-end Internet
    paths
  • ICMP experiment results
  • Loss of connectivity, packet loss, latency, and
    packet re-ordering for an average of 3-5 minutes
    after a fault
  • Why? Routers drop packets for which they do not
    have a valid next hop. Also problems with cache
    flushing in some older routers.

37
End-to-End Impact Failover
  • ICMP loss to 100 randomly chosen web sites with
    VIF source address of our probe
  • Tlong/Tshort exhibit similar relationship as
    before

38
Delayed Convergence Background
  • Well known that distance vector protocols exhibit
    poor convergence behaviors
  • Counting to infinity, looping, bouncing problem
  • RIP redefines infinity and adds split-horizon,
    poison reverse, etc.
  • Still, slow convergence and not scalable
  • BGP advertises ASPaths instead of distance
  • Solves counting to infinity and RIP looping
    problem, but
  • BGP can still explore invalid paths during
    convergence (i.e. the bouncing problem)

39
BGP Convergence Example
40
N > 4?
[Figure: AS graph; each AS is labeled with its ASPath to AS237]
AS6453: 6453 1239 5696 237
AS2497: 2497 5696 237
AS6113: 6113 2914 237
AS6461: 6461 5696 237
AS1239: 1239 5696 237
AS5696: 5696 237
AS2914: 2914 237
AS237:  237
AS701:  701 6461 5696 237
AS5000: 5000 237
AS1:    1 5696 237
AS1673: 1673 5696 237
41
MinRouteAdver Rounds
  • Implementation of MinRouteAdver timer and
    receiver-side loop detection timer leads to 30
    second rounds: O((n-3) * 30 seconds) time complexity
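
As a quick arithmetic sketch of this bound, the function below multiplies the number of rounds by the timer value. It assumes the constant hidden by the O() is 1 and clamps the round count at zero for very small n; the name `round_bound_seconds` is hypothetical.

```python
def round_bound_seconds(n: int, min_route_adver: int = 30) -> int:
    """(n - 3) rounds of min_route_adver seconds each, never negative."""
    return max(n - 3, 0) * min_route_adver


for n in (4, 10, 20, 50):
    print(n, "ASes ->", round_bound_seconds(n), "seconds")
```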

42
An Experiment with SSF.OS.BGP4
  • The Model
  • Topology full mesh of N ASes, each with just 1
    router
  • No route filtering
  • Shortest path is best
  • Advertise, Withdraw, Wait and Watch
  • Wait for system to reach stable state, then
  • AS 1 advertises a bogus destination to everyone
    else
  • Wait for system to reach a stable state again,
    then
  • AS 1 tells everyone that the bogus route is not
    reachable through it any more
  • Wait for system to reach a stable state again

43
[Figure: ASes 1-5 and the bogus destination]
N    longest path   convergence time after   avg updates due to
                    withdrawal (sec)         withdrawal (range)
10   9              150                      59.50 (35-84)
20   20             480                      269.55 (58-397)
30   28             720                      539.10 (118-892)
40   40             1080                     945.20 (160-1647)
50   46             1260                     1423.66 (196-2377)
44
. . .
1610.040778415 bgp_at_381 snd update to bgp_at_21 wdsbogus
1610.040778415 bgp_at_381 snd update to bgp_at_201 wdsbogus
1610.040778415 bgp_at_381 snd update to bgp_at_321 wdsbogus
1610.040778415 bgp_at_381 snd update to bgp_at_441 wdsbogus
1610.040890567 bgp_at_321 snd update to bgp_at_381 nlribogus,asp32 44 34 38 4 22 2 20 48 10 26 12 6 16 36 8 14 24 28 41 18 51 21 33 45 43 35 3 5 47 23 31 37 49 25 46 39 7 27 13 9 29 11 15 17 50 19 42 40 30 1
1610.040890567 bgp_at_321 snd update to bgp_at_441 wdsbogus
1610.040907352 bgp_at_441 snd update to bgp_at_381 wdsbogus
1610.040907352 bgp_at_441 snd update to bgp_at_341 nlribogus,asp44 38 34 32 4 22 2 20 48 10 26 12 6 16 36 8 14 24 28 41 18 51 21 33 45 43 35 3 5 47 23 31 37 49 25 46 39 7 27 13 9 29 11 15 17 50 19 42 40 30 1
1610.050930294 bgp_at_441 snd update to bgp_at_321 wdsbogus
. . .
45
The Problem with BGP
  • If we assume
  • unbounded delay on BGP processing and propagation
  • a full mesh of BGP peers
  • constrained shortest path first selection
    algorithm
  • then BGP is O(N!), where N is the number of
    default-free BGP speakers

There exists a possible ordering of messages such
that BGP will explore all possible ASPaths of all
possible lengths
46
BGP and RIP
  • RIP: precisely monotonically increasing. Can
    explore metrics (1..N)
  • BGP: monotonically increasing. Multiple (N!) ways
    to represent a path metric of N.
  • BGP solved RIP routing table loop problem by
    making it exponentially worse

2117 5696 2129
2117 1 5696 2129
2117 2041 3508 3508 4540 7037 1239 5696 2129
2117 1 2041 3508 3508 4540 7037 1239 5696 2129
2117 2041 3508 3508 4540 7037 1239 6113 5696 2129
2117 1 2041 3508 3508 4540 7037 1239 6113 5696 2129
47
BGP Best Case
  • What is the best we can expect from BGP?
  • Implementation of MinRouteAdver timer leads to 30
    second rounds
  • Time complexity is O((n-3) * 30 seconds)
  • State/Computational complexity O(n)
  • At its best, BGP performs as well as RIP2 (but
    uses exponentially more memory in the process)

48
MinRouteAdver
  • Minimum interval between successive updates sent
    to a peer for a given prefix
  • Allow for greater efficiency/packing of updates
  • Rate throttle
  • Applied only to announcements (at least according
    to BGP RFC)
  • Applied on (prefix destination, peer) basis, but
    implemented on (peer) basis
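
The rate throttle can be sketched as a small class keyed on (peer, prefix), which is the per-prefix behavior the slide attributes to the RFC. Everything here is illustrative: the class name, the method `may_send`, and the use of plain floats for time are assumptions, and real implementations differ (per the slide, often keeping one timer per peer).

```python
class MinRouteAdverThrottle:
    """Allow at most one announcement per (peer, prefix) per interval."""

    def __init__(self, interval: float = 30.0):
        self.interval = interval
        self.last_sent = {}  # (peer, prefix) -> time of last announcement

    def may_send(self, peer: str, prefix: str, now: float,
                 is_withdrawal: bool = False) -> bool:
        # Per the slide's reading of the RFC, withdrawals are not throttled.
        if is_withdrawal:
            return True
        key = (peer, prefix)
        last = self.last_sent.get(key)
        if last is not None and now - last < self.interval:
            return False  # too soon: hold the announcement back
        self.last_sent[key] = now
        return True


throttle = MinRouteAdverThrottle()
print(throttle.may_send("peerA", "bogus", now=0.0))   # first announcement
print(throttle.may_send("peerA", "bogus", now=10.0))  # inside the interval
print(throttle.may_send("peerA", "bogus", now=30.0))  # interval elapsed
```

Switching the dictionary key from `(peer, prefix)` to `peer` alone would model the per-peer variant mentioned in the last bullet.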

49
MinRouteAdver
  • 30(N-3) second delay due to creation of mutual
    dependencies. Provide proof that N-3 rounds are
    necessarily created during bounded BGP
    MinRouteAdver convergence
  • Rounds due to
  • Ambiguity in the BGP RFC and lack of receiver loop
    detection
  • Inclusion of BGP withdrawals with MinRouteAdver
    (in violation of RFC)

50
Simulation Results
51
Intuition for Delayed BGP Convergence
  • There exists a possible ordering of messages such
    that BGP will explore ALL possible ASPaths of ALL
    possible lengths
  • BGP is O(N!), where N is the number of default-free
    BGP speakers in a complete graph with default policy
  • Although seemingly very different protocols, BGP
    and RIP share very similar convergence behaviors.
    Major difference:
  • RIP explores metrics (1..N)
  • BGP ASPath provides multiple ways to represent
    metric (path) of length N, or (N-1)!
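
To get a feel for the factorial growth, the sketch below counts every loop-free ASPath from one AS to a destination AS in a full mesh of n ASes, and cross-checks the closed form by brute-force enumeration. This is an illustration of how quickly the candidate paths multiply, not the paper's exact bound; the function names are hypothetical, and `math.perm` requires Python 3.8 or later.

```python
from itertools import permutations
from math import perm


def count_simple_paths(n: int) -> int:
    """Loop-free paths between two fixed ASes in a full mesh of n ASes.

    A path picks k of the other n-2 ASes as intermediates, in some order.
    """
    return sum(perm(n - 2, k) for k in range(n - 1))


def enumerate_simple_paths(n: int) -> list:
    """Brute-force list of the same paths, from AS 0 to AS n-1."""
    middle = range(1, n - 1)
    return [(0, *mid, n - 1)
            for k in range(n - 1)
            for mid in permutations(middle, k)]


for n in (4, 5, 6, 10):
    print(n, "ASes:", count_simple_paths(n), "candidate ASPaths")
```

RIP, by contrast, only has the metric values 1 through N to step through, which is the difference the last two bullets point at.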

52
Lower Bound on BGP
  • If we assume optimal ordering of messages, what is
    the best we can expect from BGP?
  • In practice, BGP timers (MinRouteAdver) provide
    synchronization and limit possible orderings of
    messages
  • MinRouteAdver timer specifies interval between
    successive updates sent to a peer for a given
    prefix
  • Useful for bundling updates together
  • According to RFC, MinRouteAdver applies only to
    announcements
  • But, interaction of MinRouteAdver and vendor
    ASPath loop detection implementation introduces
    artificial delay

53
Conclusions
  • Internet does not possess effective inter-domain
    fail-over (15 minutes is a long time for a phone
    call)
  • Majority of BGP convergence delay due to vendor
    implementation decisions of MinRouteAdver and
    loop detection
  • In practice, the Internet is not a complete graph
    and the same degree of message re-ordering is
    unlikely. Our current work:
  • What is the impact of ISP policy and topology on
    BGP convergence?
  • Can we improve BGP convergence times?