Responsiveness vs. stability

Slides: 27
Provided by: csd6


1
Lecture 3
  • Responsiveness vs. stability
  • Brief refresh on router architectures
  • Protocol implementation
  • Quagga

2
To read
  • To present
  • Threads vs. events (belegrakis)
  • U-Loop prevention
  • Sub-millisecond convergence (lekakis)
  • REFS
  • Netlink
  • Quagga manual
  • Router architectures

3
How to be faster
  • Faster SPF
  • Better algorithms
  • Incremental SPF
  • Faster detection
  • Faster HELLOs
  • BFD!!!
  • In the line card instead of the control plane
  • many protocols can share
  • Faster FIB download
  • Download important prefixes first
  • Do things faster
  • Trigger SPF immediately
  • Trigger LSA origination immediately
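
As a baseline for the "Faster SPF" bullets above, here is a minimal sketch of a full SPF (Dijkstra) run. The graph format (a dict of neighbor-to-metric dicts) and the function name `spf` are assumptions made for illustration; they are not taken from any particular implementation. An incremental SPF would try to recompute only the part of this result affected by a change.

```python
import heapq

def spf(graph, root):
    """Return {node: (cost, first_hop)} for every node reachable from root."""
    dist = {root: (0, None)}
    heap = [(0, root, None)]
    done = set()
    while heap:
        cost, node, first_hop = heapq.heappop(heap)
        if node in done:
            continue            # stale heap entry, already settled
        done.add(node)
        for neigh, metric in graph.get(node, {}).items():
            new_cost = cost + metric
            # The first hop is the neighbor itself when leaving the root.
            hop = neigh if node == root else first_hop
            if neigh not in dist or new_cost < dist[neigh][0]:
                dist[neigh] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, neigh, hop))
    return dist

# Hypothetical three-router topology with symmetric link metrics.
topology = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
routes = spf(topology, "A")
```

Here `routes["C"]` ends up with cost 2 via first hop "B", because the two-hop path is cheaper than the direct link.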

4
How to be stable
  • SPF may be expensive
  • Cannot run SPF every time something minor
    changes; it may be better to run one SPF for a
    batch of changes
  • Avoid extra FIB downloads
  • Do not overload the CPU
  • Do not want to send too many updates at once
  • Receiver may get overloaded
  • Do not want to trigger updates too quickly
  • Link may be flapping
  • When CPU/links are loaded ensure that I do not
    miss important things
  • Do not miss HELLOs, will make things worse

5
Configuration Timers
  • Hello timer, dead timer
  • LSA update delay
  • LSA pacing
  • LSA retransmission pacing
  • SPF delay
  • Wait for this time before you do SPF
  • SPF hold-time
  • Do not do another SPF before this time passes
  • Can have dynamic timers
  • Be fast when CPU is idle
  • Be slow when CPU is loaded
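
The SPF delay and SPF hold-time timers above can be sketched as a small throttle. This is only an illustration under assumptions: the class name `SpfThrottle`, its methods, and the millisecond values are hypothetical and do not correspond to a specific router's configuration.

```python
class SpfThrottle:
    """Batch topology changes and space out SPF runs (times in ms)."""

    def __init__(self, delay, hold):
        self.delay = delay          # wait this long after the first change
        self.hold = hold            # minimum gap between two SPF runs
        self.last_run = None
        self.scheduled_at = None    # when the pending SPF should run

    def trigger(self, now):
        """A change arrived at `now`; return the time SPF will run."""
        if self.scheduled_at is not None:
            return self.scheduled_at        # batched into the pending SPF
        run_at = now + self.delay
        if self.last_run is not None:
            run_at = max(run_at, self.last_run + self.hold)
        self.scheduled_at = run_at
        return run_at

    def poll(self, now):
        """Return True if the pending SPF is due at `now`."""
        if self.scheduled_at is not None and now >= self.scheduled_at:
            self.last_run = now
            self.scheduled_at = None
            return True
        return False

throttle = SpfThrottle(delay=200, hold=1000)
first = throttle.trigger(0)       # first change schedules SPF at 200
second = throttle.trigger(100)    # second change rides along
ran = throttle.poll(200)          # SPF runs once for both changes
third = throttle.trigger(300)     # held back until 200 + 1000
```

A dynamic variant could shrink `delay` and `hold` when the CPU is idle and grow them under load, as the last bullets suggest.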

6
It is difficult
  • Speed and stability are conflicting goals
  • Alternatively, decouple convergence from the
    data plane
  • Avoid u-loops
  • See overview
  • Have alternate next-hops pre-computed and switch
    to them in case of failures
  • We will see this later

7
Anatomy of a protocol (routing or not)
  • Inputs
  • Static configuration
  • Other protocol instances in the network
  • Other components in my platform
  • Dynamic events on my platform (link state changes, etc.)
  • State
  • Transient (packets, queues)
  • Protocol
  • Computation
  • Triggered (process incoming packet)
  • Periodic (timers) (refresh state, LSA)
  • Outputs
  • To other protocol instances on the network
  • To other components in the platform
  • To FIB

8
Examples of protocol tasks
  • Receive, send protocol packet(s)
  • Schedule/process timers
  • Perform computations (e.g. SPF)
  • Communicate with other components
  • Download Routes to FIB
  • Process changes in the environment
  • Link state changes
  • Adjacency changes
  • Process configuration and configuration changes

9
Tasks
  • Can be complex and long running
  • Download 1,000 routes to FIB
  • Originate 500 LSAs
  • Perform an SPF in a large network
  • Usually protocol runs on one CPU
  • Have to multiplex tasks

10
Scheduling
  • Scheduling of tasks is what makes or breaks an
    implementation
  • Liveness
  • even when I download 100,000 routes to the FIB, I
    can receive and process LSAs
  • Stability
  • Prioritize tasks
  • Send the hellos first even under load
  • Never skip important tasks when overloaded
  • Shed excess load so that I do not collapse
  • Queue incoming packets and start dropping if
    queue becomes too long
  • Slow down the SPFs
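
The prioritization and load-shedding rules above could look roughly like this. It is a sketch under assumptions: the three task classes, the `Scheduler` name, and the policy of never dropping hellos are illustrative choices, not a description of a real implementation.

```python
from collections import deque

# Hypothetical task classes; a lower number means higher priority.
HELLO, LSA, SPF = 0, 1, 2

class Scheduler:
    def __init__(self, max_queue):
        self.queues = {HELLO: deque(), LSA: deque(), SPF: deque()}
        self.max_queue = max_queue
        self.dropped = 0

    def enqueue(self, prio, item):
        """Queue a task; shed non-hello work once its queue is full."""
        q = self.queues[prio]
        if prio != HELLO and len(q) >= self.max_queue:
            self.dropped += 1
            return False
        q.append(item)
        return True

    def next_task(self):
        """Return (priority, item) for the most important waiting task."""
        for prio in sorted(self.queues):
            if self.queues[prio]:
                return prio, self.queues[prio].popleft()
        return None

sched = Scheduler(max_queue=2)
accepted = [sched.enqueue(LSA, name) for name in ("lsa1", "lsa2", "lsa3")]
sched.enqueue(HELLO, "hello-to-neighbor")
order = [sched.next_task(), sched.next_task()]
```

The hello is served first even though it was queued last, and the third LSA is dropped instead of letting the queue grow without bound.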

11
The big question
  • How to implement/handle parallelism
  • Events vs. threads
  • Events
  • trigger event handlers that are essentially
    function calls
  • Run to completion, i.e. until the function
    returns
  • Threads
  • flows of execution with their own local
    state/stack
  • Can be suspended and resumed
  • With pre-emptive threads the system may switch
    to another thread along the way
  • With non-preemptive threads I have to yield
    explicitly

12
How does my protocol look with events
  • Assign events and event handlers
  • Packet receive, packet send, SPF, etc.
  • Event loop (a.k.a. the big select() loop)
  • Loop waiting for events
  • Incoming packet, timer, signal, or other event
  • Pick the next event to handle
  • According to my own scheduling
  • Call its event handler
  • When I want to initiate an action I post an event
  • Put the packet in a queue
  • Schedule a Packet_send event
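
A minimal sketch of such an event loop, using Python's standard `selectors` module in place of a raw select() call. The `EventLoop` class, its `post` and `run_once` methods, and the handler names are assumptions for illustration only.

```python
import selectors
import socket
from collections import deque

class EventLoop:
    def __init__(self):
        self.sel = selectors.DefaultSelector()
        self.posted = deque()       # events posted by handlers

    def add_reader(self, sock, handler):
        self.sel.register(sock, selectors.EVENT_READ, handler)

    def post(self, handler, arg):
        """Initiate an action later by posting an event for it."""
        self.posted.append((handler, arg))

    def run_once(self, timeout=0.1):
        # Do not block on sockets if posted events are already waiting.
        wait = 0 if self.posted else timeout
        for key, _mask in self.sel.select(wait):
            key.data(key.fileobj)   # call the packet-receive handler
        # Handle only what is queued now, so handlers that post new
        # events cannot starve the sockets.
        for _ in range(len(self.posted)):
            handler, arg = self.posted.popleft()
            handler(arg)

log = []
rx, tx = socket.socketpair()
loop = EventLoop()

def packet_send(pkt):
    log.append(("send", pkt))

def packet_receive(sock):
    pkt = sock.recv(1500)
    log.append(("recv", pkt))
    loop.post(packet_send, b"ack:" + pkt)   # schedule a Packet_send event

loop.add_reader(rx, packet_receive)
tx.send(b"hello")
loop.run_once()
rx.close()
tx.close()
```

Each handler runs to completion, so the order in which `run_once` picks events is the whole scheduling policy.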

13
How does my protocol look with threads
  • FIG!
  • Assign tasks to threads
  • Packet_rx thread, packet_tx thread, FIB_download
    thread
  • Thread blocks when there is no work to do
  • Packet_rx on the socket, FIB_download on a cond
    variable
  • It is unblocked when there is work
  • System handles the scheduling of the threads
  • I may not have control over it
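
The threaded layout above can be sketched with a worker that blocks until work arrives. This example assumes a single hypothetical FIB_download thread and uses `queue.Queue`, which wraps a lock and condition variables internally; the dict named `fib` merely stands in for a real forwarding table.

```python
import queue
import threading

fib = {}                    # stand-in for the forwarding table
work = queue.Queue()        # route updates waiting to be downloaded

def fib_download():
    while True:
        item = work.get()   # blocks while there is no work to do
        if item is None:    # sentinel: shut the thread down
            break
        prefix, next_hop = item
        fib[prefix] = next_hop

worker = threading.Thread(target=fib_download)
worker.start()

# Another thread (here the main one) hands over work and unblocks it.
work.put(("10.0.0.0/8", "192.0.2.1"))
work.put(("10.1.0.0/16", "192.0.2.2"))
work.put(None)
worker.join()
```

When the worker actually runs relative to other threads is decided by the system's thread scheduler, not by the protocol code.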

14
Events vs. threads
  • Events
  • Manage my own state
  • Manage my own scheduling
  • I explicitly handle parallelism by controlling
    when an event handler terminates
  • If I want to suspend an event handler, I must
    take care of its state
  • Threads
  • Can arbitrarily suspend/resume a thread
  • State is automatically managed in the thread
    stack
  • The thread scheduler has control
  • With pre-emptive threads the system handles
    parallelism
  • But I have to LOCK

15
Events
  • Pros
  • I have total control of everything and I can do
    what is best
  • Handle parallelism explicitly, so no need for
    locking, etc.
  • May be more efficient
  • No context switches and state saving there
  • Cons
  • I have total responsibility of everything, system
    does not help me
  • If I want to yield to another handler, I need to
    take care of the state myself, e.g. to stop a
    long SPF in the middle
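
To make the last point concrete, here is a sketch of an SPF whose state is kept explicitly so that an event handler can stop in the middle and resume later. The `ResumableSpf` class and its `run_slice` budget are assumptions for illustration; a real implementation would also have to cope with topology changes arriving between slices.

```python
import heapq

class ResumableSpf:
    """Dijkstra with explicit state, run in bounded slices."""

    def __init__(self, graph, root):
        self.graph = graph
        self.dist = {root: 0}
        self.heap = [(0, root)]
        self.done = set()

    def run_slice(self, budget):
        """Settle at most `budget` nodes; return True once SPF is complete."""
        while self.heap and budget > 0:
            cost, node = heapq.heappop(self.heap)
            if node in self.done:
                continue            # stale entry, costs no budget
            self.done.add(node)
            budget -= 1
            for neigh, metric in self.graph[node].items():
                new_cost = cost + metric
                if neigh not in self.dist or new_cost < self.dist[neigh]:
                    self.dist[neigh] = new_cost
                    heapq.heappush(self.heap, (new_cost, neigh))
        return not self.heap

topology = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "C": 1},
    "C": {"A": 5, "B": 1},
}
job = ResumableSpf(topology, "A")
pauses = 0
while not job.run_slice(budget=1):
    pauses += 1     # an event loop would handle other events here
```

Everything a thread would keep on its stack (the heap, the settled set, the distances) has to live in the object between slices.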

16
Threads
  • Pros
  • Parallelism is handled in a cleaner and more
    natural way
  • System helps a lot in scheduling, state copying
  • Cons
  • Real parallel programming is hard
  • Locking etc
  • State copying can be expensive
  • Thread scheduler may be making the wrong
    scheduling decisions
  • Not application specific

17
An example: Quagga
  • First some router architecture
  • Forwarding and control plane
  • Forwarding plane has to be fast
  • NPs, FPGAs, ASICs, little bit inflexible
  • Control plane is usually implemented in a
    commodity processor
  • Commodity OS, environment and tools

18
Big and Small Routers
  • How does a large router look?
  • EXAMPLE control vs. forwarding plane
  • line-cards, switch, FIB per-line-card, control
    processor
  • How does a PC router look?
  • EXAMPLE
  • Kernel for the forwarding
  • User space for the control plane

19
Distributed control planes
  • I want resiliency and minimal fate sharing
  • Break the control plane into components that are
    independent
  • Processes
  • One process per-protocol
  • It was a novelty 6 years ago, now everybody has
    it
  • May need to share some state
  • Need to prioritize between multiple routes
  • Redistribution later

20
Quagga: a distributed control plane for a PC router
  • Multiple processes
  • One per-protocol
  • Zebra
  • manage all the routes from all protocols
  • send routes to the FIB (kernel)
  • Centralize the management of local interfaces etc

21
Communication
  • EXAMPLE of system
  • Zebra and the protocols talk to each other
    through a private control protocol
  • Over a TCP socket
  • Protocols send their packets directly to the
    interfaces
  • But send their routes to zebra
  • Over a TCP socket
  • Zebra talks to the kernel through netlink

22
Paths
  • Interface down
  • Kernel to zebra through netlink
  • Zebra to protocols through private proto
  • Route download
  • Protocol to zebra through private proto
  • Zebra to kernel through netlink
  • OSPF Hellos
  • Directly from OSPF to interfaces and back
  • Data packets
  • Never leave the kernel

23
Zebra protocol
  • Interface
  • Add, delete, addr-add, addr-delete, up, down
  • Route
  • Ipv4-add, ipv4-del, ipv6-add, ipv6-del
  • Redistribute
  • Add, del

24
Netlink
  • Uses a special socket
  • Very powerful
  • Read and change interface state
  • Read and change interface configuration
  • Read and change routing tables
  • And MPLS, scheduling.
  • And efficient
  • Multicast some notifications

25
Configuration and management
  • Prompt based configuration and management
  • telnet localhost 2601 for zebra
  • telnet localhost 2604 for ospf

26
Implementation
  • Directories
  • Zebra, ospf, lib for common functions
  • Event based (but confusingly called threads)
  • Main loop in lib/thread.c thread_fetch()
  • Considers sockets, timers, signals
  • Timers are used as a general event mechanism
  • If I want to do something now, I schedule a timer
    with 0 expiration
  • Netlink interface in zebra/rt_netlink.c
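
The idea of using timers as a general event mechanism, with a zero expiration meaning "do this now", can be sketched as follows. This is not Quagga's code: `TimerQueue` and its methods are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
import heapq
import itertools

class TimerQueue:
    def __init__(self):
        self.heap = []
        self.counter = itertools.count()    # tie-breaker keeps FIFO order

    def add_timer(self, now, delay, callback):
        """Schedule `callback` to run `delay` time units after `now`."""
        heapq.heappush(self.heap, (now + delay, next(self.counter), callback))

    def run_expired(self, now):
        """Run every callback whose deadline has passed; return the count."""
        ran = 0
        while self.heap and self.heap[0][0] <= now:
            _, _, callback = heapq.heappop(self.heap)
            callback()
            ran += 1
        return ran

log = []
timers = TimerQueue()
timers.add_timer(now=0, delay=10, callback=lambda: log.append("send hello"))
timers.add_timer(now=0, delay=0, callback=lambda: log.append("run spf"))
ran_now = timers.run_expired(now=0)      # only the zero-delay event is due
ran_later = timers.run_expired(now=10)   # the hello timer fires
```

The zero-delay entry behaves like a posted event: it does not run inside the caller, but on the next pass of the main loop.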