Fast Incremental Updates for Pipelined Forwarding Engines

1
Fast Incremental Updates for Pipelined Forwarding
Engines
  • Authors: Anindya Basu, Girija Narlikar
  • Publisher: IEEE/ACM Transactions on Networking, 2005
  • Reporter: Yen Cheng Liu
  • Date: 11/30

2
Outline
  • Introduction
  • Background
  • Solving the pipelined architecture problem
  • Route update characteristics
  • Memory optimization
  • Reducing write bubbles

3
Introduction
  • The paper focuses on ASIC-based packet forwarding
    engines that use pipelining
  • Main issues with route updates:
  • The memory allocated must be balanced across stages
  • The memory locations that are modified must be
    limited in number and balanced across stages

4
Introduction
  • Main contributions of the paper:
  • An algorithm that builds a trie with balanced
    memory allocation across stages
  • Multiple optimizations aimed at reducing the
    number of modifications in each stage caused by
    route updates
  • A software-based scheme for processing updates
    (similar to a shadow trie)
  • Flexible
  • Cost-effective

5
Background
  • Leaf-pushed trie
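
As an illustration of leaf pushing, here is a minimal sketch on a 1-bit (binary) trie. The `Node` class, the bit-string prefixes, and the next-hop names "A" and "B" are assumptions made for this example; they do not come from the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    next_hop: Optional[str] = None
    left: Optional["Node"] = None    # child for bit 0
    right: Optional["Node"] = None   # child for bit 1


def insert(root: Node, prefix: str, next_hop: str) -> None:
    """Insert a prefix given as a bit string such as '10'."""
    node = root
    for bit in prefix:
        if bit == "0":
            if node.left is None:
                node.left = Node()
            node = node.left
        else:
            if node.right is None:
                node.right = Node()
            node = node.right
    node.next_hop = next_hop


def leaf_push(node: Node, inherited: Optional[str] = None) -> None:
    """Push next hops down so that only leaves carry them."""
    best = node.next_hop if node.next_hop is not None else inherited
    if node.left is None and node.right is None:
        node.next_hop = best         # leaf: longest match seen on this path
        return
    # Internal node: make sure both children exist, then push down.
    if node.left is None:
        node.left = Node()
    if node.right is None:
        node.right = Node()
    node.next_hop = None
    leaf_push(node.left, best)
    leaf_push(node.right, best)


def lookup(root: Node, address: str) -> Optional[str]:
    """Walk the leaf-pushed trie; the answer sits in the leaf ending the path."""
    node = root
    for bit in address:
        child = node.left if bit == "0" else node.right
        if child is None:
            break
        node = child
    return node.next_hop


root = Node()
insert(root, "1", "A")    # hypothetical route 1*  -> A
insert(root, "10", "B")   # hypothetical route 10* -> B
leaf_push(root)
```

After `leaf_push`, the internal node for prefix 1 no longer stores a next hop. Its value "A" has been copied to the leaf for 11, while the leaf for 10 keeps "B".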

6
Pipelined Lookups Using Tries
  • Each trie level is stored in a different pipeline
    stage
  • A leaf-pushed trie is used
  • The longest matching prefix is always in the leaf
    of the traversed path
  • Updates are applied using write bubbles
  • Each bubble consists of a sequence of (stage,
    location, value) triples, one triple per stage
  • Minimizing the number of write bubbles reduces
    the disruption to the lookup process
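
The lookup and update path described above can be sketched as follows. This is a toy model, not the paper's hardware design: the two-stage layout, the `("ptr", base)` / `("hop", name)` entry encoding, and the table contents are all assumptions chosen to keep the example small.

```python
from typing import List, Optional, Tuple, Union

Entry = Tuple[str, Union[int, str, None]]   # ("ptr", base) or ("hop", name)
Triple = Tuple[int, int, Entry]             # (stage, location, value)

STRIDES = [2, 2]  # hypothetical: two pipeline stages, 2 address bits each

# One memory array per pipeline stage (one trie level per stage).
stages: List[List[Entry]] = [
    [("hop", "A"), ("hop", "A"), ("ptr", 0), ("hop", None)],    # stage 0
    [("hop", "B"), ("hop", "B"), ("hop", "C"), ("hop", "A")],   # stage 1
]


def lookup(address: str) -> Optional[str]:
    """Visit one stage per trie level until a leaf entry is reached."""
    base, pos = 0, 0
    for stage, stride in enumerate(STRIDES):
        index = base + int(address[pos:pos + stride], 2)
        kind, value = stages[stage][index]
        if kind == "hop":
            return value                 # leaf: holds the longest match
        base, pos = value, pos + stride  # pointer into the next stage
    return None


def apply_bubble(bubble: List[Triple]) -> None:
    """Apply a write bubble, allowing no more than one write per stage."""
    seen = set()
    for stage, location, value in bubble:
        if stage in seen:
            raise ValueError("a bubble carries at most one write per stage")
        seen.add(stage)
        stages[stage][location] = value
```

For example, `lookup("1000")` follows the pointer in stage 0 and returns "B" from stage 1. After `apply_bubble([(1, 0, ("hop", "D"))])`, the same lookup returns "D".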

7
Solving the Pipelined Architecture Problem
  • Forwarding engine model:
  • A trie component that constructs and updates the
    routing trie
  • A packing component that packs writes from a batch
    of consecutive route updates into write bubbles
    that are sent down the pipeline
  • A pipeline component that simulates the
    traversal of these write bubbles through a
    multi-stage pipeline

8
Forwarding engine model

9
Solving the Pipelined Architecture Problem
  • Assumptions
  • The initial trie construction takes as input a
    snapshot of the entire table
  • Bubbles are processed by the pipeline in the same
    order as they are generated by the packing
    component
  • Only tries with fixed strides are considered
  • Focus on leaf-pushed tries
  • Writes to different pipeline stages can be
    combined into a single write bubble
  • The packing component is permitted to pack
    pipeline writes from multiple route updates into
    a single write bubble
  • The focus is on IPv4 lookups
  • The next hop information is stored in a separate
    Next Hop table that is distinct from the
    pipelined trie.
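
Under the assumptions above, a packing component could be sketched as a simple greedy pass: the k-th bubble takes the k-th pending write of every stage, so the number of bubbles equals the largest number of writes aimed at any single stage. This is an illustrative strategy, not necessarily the one used in the paper, and it ignores any ordering dependencies between writes to different stages. The `Write` tuple and the `pack` function are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Write = Tuple[int, int, str]  # (stage, location, value)


def pack(writes: List[Write]) -> List[List[Write]]:
    """Pack a batch of writes into bubbles with at most one write per stage."""
    per_stage: Dict[int, List[Write]] = defaultdict(list)
    for write in writes:
        per_stage[write[0]].append(write)   # keep per-stage order
    n_bubbles = max((len(queue) for queue in per_stage.values()), default=0)
    bubbles: List[List[Write]] = []
    for k in range(n_bubbles):
        # The k-th bubble carries the k-th pending write of each stage.
        bubble = [queue[k] for _, queue in sorted(per_stage.items())
                  if k < len(queue)]
        bubbles.append(bubble)
    return bubbles


# Writes produced by a hypothetical batch of two route updates.
batch = [(0, 5, "x"), (1, 3, "y"), (1, 4, "z"), (2, 0, "w")]
bubbles = pack(batch)
```

Here stage 1 receives two writes, so two bubbles are needed. This is why balancing the modified locations across stages keeps the bubble count low.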

10
Routing Table Observations
  • 24-bit prefixes dominate today's routing tables,
    so most routing updates affect 24-bit prefixes
  • The number of short prefixes is very low.
    However, the number of modifications per update
    to a short prefix is large (the first level often
    has a stride of 12-16 bits)
  • The address blocks allocated to an ISP's customers
    are sub-blocks of the address block allocated to
    the ISP
  • Prefixes corresponding to the customers of a
    given ISP are typically neighboring 24-bit
    prefixes
  • A link failure (recovery) in an ISP network
    disconnects (reconnects) some or all of its
    customer networks (represented by neighboring
    prefixes in the routing trie)
  • A large proportion of withdrawn routes are
    added back a few minutes later

11
Memory optimization
  • Designing non-pipelined tries
  • Use controlled prefix expansion (CPE) to construct
    memory-efficient tries for the set of prefixes in
    a routing table (using dynamic programming)
  • Controlled prefix expansion
  • $\mathrm{nodes}(i)$: the number of nodes at level $i$
    of the 1-bit trie
  • If one trie level terminates at bit position $i$ and
    the next level terminates at bit position $j$, with
    $j > i$, the next level needs
    $\mathrm{nodes}(i+1) \times 2^{j-i}$ memory locations
  • $T[j, r]$: the memory required to cover $j+1$ bits
    using $r$ levels

12
Designing non-pipelined tries
  • Here, the (r-1)th level is terminated at the bit
    position m that minimizes the total memory
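
The dynamic program can be sketched as below, assuming the recurrence as it is commonly stated for CPE: $T[j, 1] = 2^{j+1}$ and $T[j, r] = \min_{m \in \{r-1, \ldots, j-1\}} \left( T[m, r-1] + \mathrm{nodes}(m+1) \times 2^{j-m} \right)$. The helper `count_nodes`, its definition of a 1-bit trie node, and the four sample prefixes are assumptions made for illustration.

```python
from typing import List, Set, Tuple

INF = float("inf")


def count_nodes(prefixes: List[str], width: int) -> List[int]:
    """nodes[i] = number of 1-bit trie nodes at level i (the root is level 0).

    Assumption: a node exists at level i for every distinct i-bit string
    that is a proper prefix of some route of length <= width.
    """
    levels: List[Set[str]] = [set() for _ in range(width)]
    for prefix in prefixes:
        for i in range(len(prefix)):
            levels[i].add(prefix[:i])
    return [len(level) for level in levels]


def cpe(nodes: List[int], k: int) -> Tuple[int, List[int]]:
    """Return (total memory, strides) for a k-level fixed-stride trie."""
    width = len(nodes)
    # T[r][j]: minimum memory to cover bit positions 0..j with r levels.
    T = [[INF] * width for _ in range(k + 1)]
    choice = [[-1] * width for _ in range(k + 1)]
    for j in range(width):
        T[1][j] = 2 ** (j + 1)
    for r in range(2, k + 1):
        for j in range(r, width):
            for m in range(r - 1, j):   # m: last bit of level r-1
                cost = T[r - 1][m] + nodes[m + 1] * 2 ** (j - m)
                if cost < T[r][j]:
                    T[r][j], choice[r][j] = cost, m
    if T[k][width - 1] == INF:
        raise ValueError("no k-level trie covers all bit positions")
    # Walk the recorded choices backwards to recover the strides.
    strides, j = [], width - 1
    for r in range(k, 1, -1):
        m = choice[r][j]
        strides.append(j - m)
        j = m
    strides.append(j + 1)
    return T[k][width - 1], strides[::-1]


# Hypothetical 4-bit address space with four routes.
nodes = count_nodes(["0", "10", "1101", "1110"], width=4)
total, strides = cpe(nodes, k=2)
```

For this toy table the node counts per level are [1, 1, 1, 2], and the two-level optimum uses strides [2, 2] with a total of 8 memory locations.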

13
Implications for memory usage and update
performance
  • CPE does not attempt to distribute memory
    equally across stages

14
A New Algorithm for Pipelined Architectures
  • The new algorithm, MinMax, is based on CPE
  • Constraints
  • Each level of the fixed-stride trie must fit in a
    single pipeline stage
  • The maximum memory allocated to a stage (over all
    stages) is minimized.
  • The total memory used is minimized subject to the
    first two constraints

15
A New Algorithm for Pipelined Architectures
  • The first and third constraints are satisfied by
    the following equations

16
A New Algorithm for Pipelined Architectures
  • Memory allocated to the rth level in the
    multi-bit trie
  • Maximum memory allocated to any trie level
  • Find p, the minimum value of the above function

17
A New Algorithm for Pipelined Architectures
  • Applying the constraints
  • Main goal: reduce the maximum memory across stages
  • A memory-efficient trie typically has smaller
    strides and hence less replication of routes in
    the trie
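
The equations on these slides are not preserved in the transcript, so the following is only a sketch of one way to realize the two steps they describe. First find p, the smallest achievable value of the largest trie level. Then rerun the CPE recurrence while rejecting any level larger than p, which minimizes total memory subject to that bound. The function names and the node counts in the example are assumptions. A real design would also check p against the physical capacity of a pipeline stage.

```python
from typing import List, Tuple

INF = float("inf")


def min_max_stage(nodes: List[int], k: int) -> float:
    """Smallest achievable size of the largest level in a k-level trie."""
    width = len(nodes)
    # M[r][j]: min over stride choices of the largest level, bits 0..j, r levels.
    M = [[INF] * width for _ in range(k + 1)]
    for j in range(width):
        M[1][j] = 2 ** (j + 1)
    for r in range(2, k + 1):
        for j in range(r, width):
            for m in range(r - 1, j):
                level = nodes[m + 1] * 2 ** (j - m)
                M[r][j] = min(M[r][j], max(M[r - 1][m], level))
    return M[k][width - 1]


def minmax(nodes: List[int], k: int) -> Tuple[float, float, List[int]]:
    """Return (max stage memory p, total memory, strides)."""
    width = len(nodes)
    p = min_max_stage(nodes, k)
    # CPE recurrence, but levels larger than p are not allowed.
    T = [[INF] * width for _ in range(k + 1)]
    choice = [[-1] * width for _ in range(k + 1)]
    for j in range(width):
        if 2 ** (j + 1) <= p:
            T[1][j] = 2 ** (j + 1)
    for r in range(2, k + 1):
        for j in range(r, width):
            for m in range(r - 1, j):
                level = nodes[m + 1] * 2 ** (j - m)
                if level <= p and T[r - 1][m] + level < T[r][j]:
                    T[r][j], choice[r][j] = T[r - 1][m] + level, m
    if T[k][width - 1] == INF:
        raise ValueError("no k-level trie covers all bit positions")
    # Recover the strides from the recorded choices.
    strides, j = [], width - 1
    for r in range(k, 1, -1):
        m = choice[r][j]
        strides.append(j - m)
        j = m
    strides.append(j + 1)
    return p, T[k][width - 1], strides[::-1]


# Hypothetical node counts for levels 0..5 of a 1-bit trie.
p, total, strides = minmax([1, 2, 4, 4, 4, 4], k=2)
```

With these counts the sketch picks strides [4, 2]: both levels occupy 16 locations, so the largest stage holds 16 and the total is 32.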

18
A New Algorithm for Pipelined Architectures
  • Worst Case Memory Bound
  • The maximum memory per stage for a k-level trie is

19
Performance

20
Reducing write bubbles
  • Four optimizations help achieve this goal:
  • Separating out updates to short routes
  • Node pull-ups
  • Eliminating excess writes
  • Caching deleted subtrees

21
Separating out updates to short routes
  • Separating out updates to short routes
  • Example: the addition of a 7-bit route can cause up to
    211 writes (with a stride of 16)

22
Node pull-up

23
Node pull-up
  • State trie
  • The pull-up information (in the form of a changed
    stride length) is stored in the node where the
    pull-up occurred
  • A software-based state trie can store this
    information

24
Eliminating excess writes
  • Neighboring routes are often added with the same
    timestamp
  • Add

25
Eliminating excess writes
  • Withdraw
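
One plausible reading of this optimization, stated here as an assumption because the slide figures are not in the transcript: when several updates in the same batch touch the same memory location, only the final value has to reach the pipeline. A minimal sketch with a hypothetical `coalesce` helper:

```python
from typing import Dict, List, Tuple

Write = Tuple[int, int, str]  # (stage, location, value)


def coalesce(writes: List[Write]) -> List[Write]:
    """Keep only the last write to each (stage, location) in a batch."""
    last: Dict[Tuple[int, int], str] = {}
    for stage, location, value in writes:
        # Remove and re-insert so the output order follows the final writes.
        last.pop((stage, location), None)
        last[(stage, location)] = value
    return [(stage, location, value)
            for (stage, location), value in last.items()]


# Hypothetical batch: a new node is first filled with a pushed-down default,
# then two neighboring routes overwrite the same entries.
batch = [(2, 8, "default"), (2, 9, "default"), (2, 8, "A"), (2, 9, "B")]
reduced = coalesce(batch)
```

Here four writes shrink to two, which in turn means fewer write bubbles for the stage.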

26
Caching deleted subtrees

27
Caching deleted subtrees
  • When a route withdrawal causes a sub-tree to be
    deleted, the trie component caches the sub-tree
    in software and remembers the location of the
    cached trie in the pipeline memory
  • Therefore, the only information that must be
    stored with the cached subtree is the prefix that
    was pushed down, and the last route in the
    subtree that was withdrawn

28
Caching deleted subtrees
  • Memory requirements
  • The cache has a limited amount of memory
  • FIFO replacement is applied
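
A bounded FIFO cache of deleted subtrees might look like the sketch below. The `SubtreeCache` class, its method names, and the stored fields (pipeline location, pushed-down prefix, last withdrawn route) are assumptions modeled on the description above, not the paper's data structure.

```python
from collections import OrderedDict
from typing import Optional, Tuple

# (location in pipeline memory, pushed-down prefix, last withdrawn route)
CachedSubtree = Tuple[int, str, str]


class SubtreeCache:
    """Software-side cache of deleted subtrees with FIFO eviction."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._entries: "OrderedDict[str, CachedSubtree]" = OrderedDict()

    def put(self, root_prefix: str, location: int,
            pushed_prefix: str, last_withdrawn: str) -> None:
        """Remember a subtree that a route withdrawal has just deleted."""
        if root_prefix in self._entries:
            del self._entries[root_prefix]
        elif len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)   # FIFO: evict the oldest entry
        self._entries[root_prefix] = (location, pushed_prefix, last_withdrawn)

    def take(self, root_prefix: str) -> Optional[CachedSubtree]:
        """Return and remove the cached subtree when its route is re-added."""
        return self._entries.pop(root_prefix, None)


cache = SubtreeCache(capacity=2)
cache.put("10.1.1", location=40, pushed_prefix="10.1", last_withdrawn="10.1.1.0/24")
cache.put("10.1.2", location=44, pushed_prefix="10.1", last_withdrawn="10.1.2.0/24")
cache.put("10.1.3", location=48, pushed_prefix="10.1", last_withdrawn="10.1.3.0/24")
```

With a capacity of two, the third `put` evicts the oldest entry ("10.1.1"). A route that comes back while its subtree is still cached can be restored from the remembered pipeline location instead of being rebuilt.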

29
Reducing write bubbles
  • Benefits of applying each optimization
    individually, and together with the other
    optimizations

30
Reducing write bubbles
  • Three schemes are compared here

31
Reducing write bubbles
  • The experiments show that 4-6 stages perform better
    when all optimizations are applied

32
Prefix Table Dynamics
  • Performing a large number of incremental updates
    may cause the trie to gradually become
    unbalanced.
  • MinMax may need to be re-applied