Microloop Prevention Methods

Provided by: mikes55

1
Micro-loop Prevention Methods
  • draft-bryant-shand-lf-conv-frmwk-00.txt
  • draft-zinin-microloop-analysis-00.txt

2
Outline of Talk
  • Convergence Strategy and Motivation
  • Solution Taxonomy
  • Existing Solution Space
  • Summary

3
Traditional convergence strategy
  • Switch to new as fast as you can, independently
  • Required for failures
      • Strategy is optimized for this case
      • Traffic to the failed element is lost,
        so speed is essential
  • Used for everything else
      • Common method
      • Traffic can be lost due to loops
  • Fast-Reroute prevents traffic loss due to failure,
    but loops can still cause loss.

4
Micro-Loop Properties
  • Independent decisions can cause micro-loops.
  • Loops may occur between pairs of nodes or cycles
    of nodes.
  • Duration depends on relative time to update FIBs:
      • Implementation differences
      • Number of affected destinations
      • Propagation time

Loss caused by a micro-loop may last an order of
magnitude longer than loss during the Fast-Reroute
failover.

5
Controlled convergence
  • Made feasible for failure case by fast reroute
  • Traffic is not lost so can afford to take time
  • Can use common method for both failure and
    management change events
  • Traditional convergence is optimized for the
    failure case without fast-reroute.
  • We can do better
  • (but keep traditional convergence as a safe
    fall-back for when the single-failure assumption
    is violated.)

6
Solution taxonomy
  • Controlled Information Flow
      • Incremental cost change
  • Controlled Distributed Behavior
      • Synchronized FIB installation
      • Ordered FIB changes
      • Path locking

7
Method Comparison

8
Incremental cost change
  • A change in a link cost of x can only cause loops
    whose cyclic cost is < x
  • Minimum cycle cost is 2 (1 in each direction)
  • Hence a cost change of 1 can never cause a loop.
  • Where the minimum cycle is larger, larger
    increments can be used.
  • Once the cost reaches the cost of the alternate
    path, no more loops are possible.
  • No Cooperation Required
  • But Can Take Hours
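
The stepping rule above can be sketched in a few lines of Python. This is only an illustration, not the procedure from the drafts: the function name `cost_steps` and its parameters are hypothetical, the caller is assumed to already know the cost at which the alternate path takes over, and the increment is conservatively kept strictly below the minimum cycle cost.

```python
def cost_steps(current_cost, target_cost, min_cycle_cost=2):
    """Return the successive link costs to advertise, one per convergence round.

    Assumption: keeping each increment strictly below the minimum cycle
    cost is a conservative reading of the slide's loop-free condition.
    """
    if min_cycle_cost < 2:
        raise ValueError("a cycle costs at least 2 (1 in each direction)")
    step = min_cycle_cost - 1  # largest increment still below the minimum cycle cost
    costs = []
    cost = current_cost
    while cost < target_cost:
        cost = min(cost + step, target_cost)  # never overshoot the target
        costs.append(cost)
    return costs


default_steps = cost_steps(current_cost=10, target_cost=13)
larger_steps = cost_steps(current_cost=10, target_cost=16, min_cycle_cost=4)
print(default_steps)  # [11, 12, 13]
print(larger_steps)   # [13, 16]
```

Each returned value needs its own flooding and convergence round before the next one is advertised, which is why a large cost gap with small increments can take so long.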

9
Synchronized FIB swap
  • Network synchronized change-over at predetermined
    time
  • Signal/determine time to change
  • Network Synchronized Time (NTP is there)
  • Either Two FIBs for fast swap
  • Substantial hardware implications
  • Or FIB update fast-enough from change-over
    time.
  • Dependent on NTP
  • Conceptually simple with minimal signalling, but
    raises NTP-dependency and implementation concerns
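
As a rough illustration of the two-FIB variant, the sketch below keeps an active and a standby table and switches between them once an agreed change-over time is reached. The class `DualFib` and its methods are hypothetical names, the lookup is an exact match on a prefix string rather than a real longest-prefix match, and the caller is assumed to pass in a clock value that is already synchronized across routers.

```python
class DualFib:
    """Toy model of a router holding two FIBs for a synchronized swap."""

    def __init__(self, active):
        self.active = dict(active)
        self.standby = None
        self.swap_time = None

    def schedule(self, new_fib, swap_time):
        # Pre-install the new table; it stays unused until swap_time.
        self.standby = dict(new_fib)
        self.swap_time = swap_time

    def lookup(self, prefix, now):
        # Swap tables the first time a lookup happens at or after swap_time.
        if self.standby is not None and now >= self.swap_time:
            self.active, self.standby = self.standby, None
            self.swap_time = None
        return self.active.get(prefix)


fib = DualFib({"10.0.0.0/8": "B"})
fib.schedule({"10.0.0.0/8": "C"}, swap_time=100.0)
before = fib.lookup("10.0.0.0/8", now=99.9)
after = fib.lookup("10.0.0.0/8", now=100.0)
print(before, after)  # B C
```

The swap is only loop-free if every router's clock agrees closely enough, which is where the NTP dependency comes from.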

10
Ordering by signalling alone
  • On change, tell old primary neighbors to wait for
    you
  • Wait for all neighbors as instructed, install
    FIB, and tell your old primary neighbors.
  • Assumes a single non-SRLG failure
  • Otherwise communication per destination is
    required
  • No estimation required for FIB compute/install
  • Requires reliable, fast signalling and
    non-trivial protocol extensions

11
Ordered FIB changes
  • For any isolated link/node change
  • Determine a safe ordering for FIB installation
  • Bad news: update from edge to failure
  • Good news: update from change to edge
  • Each router computes its rank with respect to
    the change.
  • Each router delays its FIB install by a number of
    worst-case FIB compute/install times proportional
    to its rank.

12
Computing the ordering
  • Single Reverse SPF rooted at change node
  • Use old SPT to determine relevant node
  • For bad news: count maximum depth of sub-tree
    below you
  • For good news: count maximum hops to change
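
The two rank rules can be sketched as follows, assuming the reverse SPF has already produced a tree and that it is handed over as a parent map (each node mapped to its next hop toward the change, the change node mapped to `None`). The function name `compute_ranks`, the example tree, and the 0.5 second worst-case figure are all illustrative assumptions.

```python
from collections import defaultdict


def compute_ranks(parent):
    """Return (bad_news_rank, good_news_rank) for every node in the tree."""
    children = defaultdict(list)
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    # Good news: hops from the change node, found by walking down the tree.
    good, order = {}, []
    stack = [(root, 0)]
    while stack:
        node, hops = stack.pop()
        good[node] = hops
        order.append(node)
        stack.extend((child, hops + 1) for child in children[node])

    # Bad news: depth of the deepest sub-tree below the node. Visiting the
    # walk in reverse guarantees children are ranked before their parent.
    bad = {}
    for node in reversed(order):
        bad[node] = max((bad[c] + 1 for c in children[node]), default=0)
    return bad, good


# Hypothetical tree: R is the change node; A hangs off R; B and C off A; D off C.
tree = {"R": None, "A": "R", "B": "A", "C": "A", "D": "C"}
bad, good = compute_ranks(tree)

WORST_CASE_FIB_SECONDS = 0.5  # assumed worst-case compute/install time
bad_news_delay = {n: r * WORST_CASE_FIB_SECONDS for n, r in bad.items()}
print(bad_news_delay["A"])  # 1.0
```

For bad news the leaves (rank 0) update first and the nodes next to the failure last; for good news the order is reversed, with the node nearest the change going first.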

13
Delay Proportional to Network Diameter
  • For Good News, rSPF gives necessary depth.
  • For Bad News, rSPF is overly pessimistic for some
    topologies.
  • Strategies to reduce unnecessary delay
  • Prune rSPF by only considering the branch across
    the failure, but this is still too pessimistic.
  • Run SPF rooted at edge nodes to prune them
    correctly, but this doesn't scale.
  • Compare rSPFs before and after failure

Avoids all micro-loops and requires a single FIB
install. Delay depends on network diameter, so it
may be unacceptable.

14
Signalling optimization to Reduce delay
  • Use actual FIB compute/install instead of
    worst-case
  • In many cases, actual delay is 0 because no change
    is needed.
  • Signal to parents in rSPF when
  • Nothing to do, or
  • Completed FIB changes
  • Can change FIB once signals are received from all
    children (or when the delay expires)
  • Only an optimization
  • Loss of signals falls back to delay-based ordering
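
A minimal sketch of the install decision on one router, assuming it knows its children in the rSPF and its own delay-based deadline. The class `OrderedInstall` and its method names are hypothetical, and time is passed in explicitly so the example stays deterministic.

```python
class OrderedInstall:
    """Tracks whether a router may install its new FIB yet."""

    def __init__(self, children, deadline):
        self.children = set(children)
        self.deadline = deadline  # delay-based fallback (rank * worst-case time)
        self.signalled = set()

    def on_signal(self, child):
        # A child reports "nothing to do" or "FIB change completed".
        if child in self.children:
            self.signalled.add(child)

    def may_install(self, now):
        # Install as soon as every child has signalled, or at the deadline.
        return self.signalled >= self.children or now >= self.deadline


router = OrderedInstall(children={"B", "C"}, deadline=2.0)
router.on_signal("B")
early = router.may_install(now=0.1)    # still waiting for C
router.on_signal("C")
ready = router.may_install(now=0.2)    # all children have signalled

lossy = OrderedInstall(children={"B", "C"}, deadline=2.0)
lossy.on_signal("B")                   # signal from C is lost
fallback = lossy.may_install(now=2.0)  # deadline reached anyway
print(early, ready, fallback)          # False True True
```

Because the deadline check is always present, a lost signal costs time but never correctness, which matches the "only an optimization" point.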

15
SRLG Concerns
  • Diverse failures may require mutually
    incompatible ordering
  • Different orderings for individual destination
    sets may help
  • Need Rules to merge multiple rSPFs

16
Ordered SPF Summary
  • No forwarding changes required.
  • No signalling required at time of change.
  • Complete prevention of loops for isolated node or
    link changes.
  • Requires cooperation from all routers
  • May delay re-convergence for tens of seconds
    (unless optional signalling used)
  • SRLGs require per destination delays and may
    delay re-convergence more.

17
Path Locking Framework
  • Obtain a fixed convergence delay regardless of
    network.
  • Avoid ordering issue by providing transitional
    paths.
  • Handles SRLGs
  • Different methods to
  • Determine/Create transitional paths
  • Direct traffic to use transitional paths
  • Standard trade-off of complexity versus coverage.
  • Tunnels for Transitional Paths
  • Safe Neighbors for Transitional Next-Hops
  • Marked Packets to Use Transitional Topology
  • U-turn Packets to Use New Topology

18
Time-Line of Convergence
  • Change Discovery Time: At this point, all
    routers know about the change. Routers install
    transitional path support.
  • For some methods, immediately start use of
    self-determined transitional paths.
  • Use Transitional Paths Time (1 worst-case FIB
    compute/install later): All routers use
    transitional paths, if available, and new primary
    next-hops otherwise.
  • Lock to New Topology Time (1 worst-case FIB
    compute/install later): All routers use new
    primary next-hops.
  • All micro-loops avoided if a transitional path
    always exists.

19
Create Tunnels
  • Requires tunnel computation/creation at topology
    change
  • Old topology Locking
  • Tunnel to the upstream side of the failure
  • Single tunnel for all affected destinations (if
    link/node failure).
  • New topology locking
  • Tunnel to first unaffected router on new primary
    path
  • Tunnels provide a transitional path that can
    traverse non-supporting routers.
  • Non-supporting routers can only loop locally
    originated traffic.

20
Safe Neighbors
  • Find a safe neighbor to use as a transitional
    next-hop.
  • Safety condition is a neighbor that is loop-free
    on old topology and a downstream path on new
    topology.
  • If two neighboring routers don't have a safe
    neighbor, a micro-loop can form on that link.
  • Analysis of real topologies shows pretty good
    coverage.
  • Local micro-loops possible with non-supporting
    routers.
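
The safety condition can be checked with two sets of shortest-path distances. The sketch below is an assumption-laden illustration: it reads "loop-free on old topology" as dist_old(N, D) < dist_old(N, S) + dist_old(S, D) and "downstream on new topology" as dist_new(N, D) < dist_new(S, D), which are the usual loop-free and downstream inequalities but are not spelled out on the slide. The function names and the example topology are hypothetical.

```python
import heapq

INF = float("inf")


def dijkstra(graph, source):
    """Shortest-path distances from source; graph is {node: {neighbor: cost}}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, INF):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, INF):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def safe_neighbors(old, new, router, dest):
    """Neighbors of router usable as transitional next-hops toward dest."""
    old_self = dijkstra(old, router)
    new_self = dijkstra(new, router)
    safe = []
    for n in old[router]:
        if n not in new.get(router, {}):
            continue  # the link to this neighbor no longer exists
        old_n = dijkstra(old, n)
        new_n = dijkstra(new, n)
        loop_free = old_n.get(dest, INF) < old_n.get(router, INF) + old_self.get(dest, INF)
        downstream = new_n.get(dest, INF) < new_self.get(dest, INF)
        if loop_free and downstream:
            safe.append(n)
    return sorted(safe)


# Hypothetical topology: the S-D link (cost 1) fails.
old = {"S": {"D": 1, "A": 2, "B": 1}, "A": {"S": 2, "D": 2},
       "B": {"S": 1, "D": 5}, "D": {"S": 1, "A": 2, "B": 5}}
new = {"S": {"A": 2, "B": 1}, "A": {"S": 2, "D": 2},
       "B": {"S": 1, "D": 5}, "D": {"A": 2, "B": 5}}
result = safe_neighbors(old, new, "S", "D")
print(result)  # ['A']
```

In this example B fails both tests: on the old topology it reaches D through S, and on the new topology it is farther from D than S is.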

21
Typical Coverage

22
Packet Marking
  • Can mark packets to force forwarding according to
    a particular topology.
  • Topology can be new or old.
  • All marking starts at the Use Transitional Paths
    Time
  • If using the new topology, traffic is on the new
    topology after 1 worst-case FIB compute/install
    delay.

23
U-turn Packet
  • Create transitional next-hop by directing U-turn
    packets to the new primary next-hops.
  • At Use Transitional Paths Time, send traffic to
    new primaries (potentially explicitly marked as
    U-turn packets).
  • If U-turn packets are implicitly determined,
    marking isn't required.
  • Explicit method for signalling support of U-turns

24
Lots of Possibilities
  • What are important criteria?
  • Time to be converged
  • Affects single failure assumption
  • Network Stability
  • Ballpark requirement is 10s
  • Simplicity
  • Support for SRLGs
  • No additional mechanisms beyond IP (but coverage
    may suffer)
  • Common additional mechanisms for this and IPFRR
    advanced methods.
  • Should also work for LDP

25
Conclusions and Next Steps
  • Incremental Cost Change is impractical.
  • Synchronized FIB Swap: what is the
    implementation complexity? What are the
    implications of coupling NTP to routing?
  • Ordered SPF: long delay and poor SRLG support.
    Is that enough to be an issue?
  • Path Locking
  • Seems most promising
  • Many possibilities to get similar results
  • Please send suggestions and comments to the list.
    This solution set may not be complete.