Address Lookup and Classification (Presentation Transcript)

1
Address Lookup and Classification
EE384Y May 23, 2006
  • Pankaj Gupta
  • Principal Architect and Member of Technical Staff, Netlogic Microsystems
  • pankaj_at_netlogicmicro.com
  • http://klamath.stanford.edu/pankaj

2
Generic Router Architecture (Review from EE384x)
[Figure: generic router datapath. Header processing (lookup IP address, update header) uses a lookup table of about 1M prefixes held in off-chip DRAM; packet queueing uses a buffer of about 1M packets, also in off-chip DRAM.]
3
Lookups Must be Fast
Year | Aggregate line rate | Arriving rate of 40B POS packets (million pkts/sec)
1997 | 622 Mb/s | 1.56
1999 | 2.5 Gb/s | 6.25
2001 | 10 Gb/s  | 25
2003 | 40 Gb/s  | 100
2006 | 80 Gb/s  | 200
4
Memory Technology (2006)
Technology | Max single-chip density | $/chip ($/MByte) | Access speed | Watts/chip
Networking DRAM | 64 MB | $30-50 ($0.50-0.75) | 40-80 ns | 0.5-2 W
SRAM | 8 MB | $50-60 ($5-8) | 3-4 ns | 2-3 W
TCAM | 2 MB | $200-250 ($100-125) | 4-8 ns | 15-30 W
Note: Price, speed and power are manufacturer- and market-dependent.
5
Lookup Mechanism is Protocol Dependent
Networking Protocol | Lookup Mechanism | Techniques we will study
MPLS, ATM, Ethernet | Exact match search | Direct lookup; associative lookup; hashing; binary/multi-way search trie/tree
IPv4, IPv6 | Longest-prefix match search | Radix trie and variants; compressed trie; binary search on prefix intervals
6
Outline
  • Routing Lookups
  • Overview
  • Exact matching
  • Direct lookup
  • Associative lookup
  • Hashing
  • Trees and tries
  • Longest prefix matching
  • Why LPM?
  • Tries and compressed tries
  • Binary search on prefix intervals
  • References
  • Packet Classification

7
Exact Matches in ATM/MPLS
Direct Memory Lookup
[Figure: direct memory lookup. The incoming VCI/MPLS label is used directly as the memory address; the data read out is the (outgoing port, new VCI/label) pair.]
  • VCI/Label space is 24 bits
    - Maximum of 16M addresses; with 64b of data per entry, this is 1 Gb of memory (see the sketch below)
  • VCI/Label space is private to one link
  • Therefore, the table size can be negotiated
  • Alternatively, use a level of indirection
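To make the direct-lookup idea concrete, here is a minimal Python sketch (illustrative only; the class and field names are my own, not from the slides). The label value indexes the table directly, so a lookup is a single memory reference.

```python
LABEL_BITS = 24  # VCI/MPLS label space: 2^24 = 16M possible labels

class DirectLookupTable:
    """One slot per possible label; each slot holds (outgoing port, new label)."""
    def __init__(self):
        self.table = [None] * (1 << LABEL_BITS)   # None means "no entry"

    def insert(self, label, out_port, new_label):
        self.table[label] = (out_port, new_label)

    def lookup(self, label):
        return self.table[label]                  # a single memory reference

# Usage: label 0x00ABCD is forwarded on port 3 and rewritten to 0x001234
t = DirectLookupTable()
t.insert(0x00ABCD, out_port=3, new_label=0x001234)
print(t.lookup(0x00ABCD))   # (3, 4660)
```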

8
Exact Matches in Ethernet Switches
  • Layer-2 addresses are usually 48 bits long
  • The address is global, not just local to the link
  • The range/size of the address is not negotiable (unlike ATM/MPLS, where it is)
  • 2^48 > 10^12, therefore we cannot hold all addresses in a table and use direct lookup

9
Exact Matches in Ethernet Switches (Associative
Lookup)
  • Associative memory (aka Content Addressable
    Memory, CAM) compares all entries in parallel
    against incoming data.

[Figure: associative lookup. The 48-bit network address is compared against every entry of the associative memory (CAM) in parallel; the matching location then addresses a separate memory that returns the associated data.]
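As a rough software model of what the CAM does (the names are assumptions of mine; real CAM hardware compares all entries in the same cycle, which a sequential loop can only imitate):

```python
class Cam:
    """Toy model of a CAM: store (key, data) entries and search by key."""
    def __init__(self):
        self.entries = []                     # list of (key, data) pairs

    def insert(self, key, data):
        self.entries.append((key, data))

    def search(self, key):
        # Hardware compares every entry in parallel; software must iterate.
        for location, (k, data) in enumerate(self.entries):
            if k == key:
                return location, data
        return None

cam = Cam()
cam.insert(0x001122AABBCC, "port 7")          # a 48-bit Ethernet address
print(cam.search(0x001122AABBCC))             # (0, 'port 7')
```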
10
Exact Matches: Hashing
[Figure: hashing. The 48-bit network address is hashed down to a smaller value (16 bits, say) that indexes a pointer memory; the pointer leads to a bucket, i.e., a list of the network addresses (and their data) that hash to that value, which is then searched.]
  • Use a pseudo-random hash function (the scheme is relatively insensitive to the actual function chosen)
  • The bucket is searched linearly (or with binary search, etc.)
  • Leads to an unpredictable number of memory references (see the sketch below)
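A minimal sketch of hashing with buckets (the 16-bit fold below is an arbitrary illustrative choice, not the slides' function); note how the lookup cost depends on how many addresses land in the probed bucket:

```python
HASH_BITS = 16

def hash16(addr48):
    # Fold the 48-bit address into 16 bits; any pseudo-random function works.
    return (addr48 ^ (addr48 >> 16) ^ (addr48 >> 32)) & ((1 << HASH_BITS) - 1)

class BucketHashTable:
    def __init__(self):
        self.buckets = [[] for _ in range(1 << HASH_BITS)]

    def insert(self, addr, port):
        self.buckets[hash16(addr)].append((addr, port))

    def lookup(self, addr):
        for a, port in self.buckets[hash16(addr)]:   # linear search in the bucket
            if a == addr:
                return port
        return None

table = BucketHashTable()
table.insert(0x001122AABBCC, port=3)
print(table.lookup(0x001122AABBCC))   # 3
```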

11
Exact Matches Using Hashing: Number of Memory References
12
Exact Matches in Ethernet Switches: Perfect Hashing
[Figure: perfect hashing. The 48-bit network address is hashed (to, say, 16 bits) directly to a memory address whose entry gives the port.]
There always exists a perfect hash function.
Goal: with a perfect hash function, a lookup always takes O(1) memory references.
Problems:
  - Finding perfect hash functions (particularly a minimal perfect hash) is complex.
  - Updates make maintaining such a hash function even more complex.
Advanced techniques: multiple hash functions, Bloom filters.
13
Exact Matches in Ethernet Switches: Hashing
  • Advantages
  • Simple to implement
  • Expected lookup time is small
  • Updates are fast (except with perfect hash
    functions)
  • Disadvantages
  • Relatively inefficient use of memory
  • Non-deterministic lookup time (in rare cases)
  • Hence hashing is attractive for software-based switches. However, hardware platforms are moving to other techniques (though they can do well with a more sophisticated form of hashing)

14
Exact Matches in Ethernet Switches Trees and
Tries
[Figure: a binary search tree (branching on < / > comparisons against stored keys) next to a binary search trie (branching on the key bits 0/1); the trie shown stores the keys 010 and 111.]
Binary search trie: lookup time is bounded (by the address width) and independent of table size; storage is O(NW).
Binary search tree: lookup time depends on table size (O(log N)) but is independent of address length; storage is O(N).
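A minimal binary-search-trie sketch for exact matching on fixed-width keys (my own illustrative code, with W = 48 for Ethernet addresses). The walk visits one node per key bit, so the lookup depth is W regardless of how many entries are stored:

```python
W = 48  # key width in bits

class TrieNode:
    __slots__ = ("child", "data")
    def __init__(self):
        self.child = [None, None]   # child[0] and child[1]
        self.data = None

def insert(root, key, data):
    node = root
    for i in range(W - 1, -1, -1):          # most-significant bit first
        bit = (key >> i) & 1
        if node.child[bit] is None:
            node.child[bit] = TrieNode()
        node = node.child[bit]
    node.data = data

def lookup(root, key):
    node = root
    for i in range(W - 1, -1, -1):
        node = node.child[(key >> i) & 1]
        if node is None:
            return None
    return node.data

root = TrieNode()
insert(root, 0x001122AABBCC, "port 5")
print(lookup(root, 0x001122AABBCC))   # port 5
```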
15
Exact Matches in Ethernet Switches Multiway tries
16-ary Search Trie
[Figure: a 16-ary search trie storing the keys 000011110000 and 111111111111. Each node is an array of 16 (entry, ptr) slots; ptr = 0 means the entry has no children.]
Q: Why can't we just make it a 2^48-ary trie?
16
Exact Matches in Ethernet Switches Multiway tries
As the degree increases, more and more pointers are 0 (null).
[Table: null-pointer statistics produced from 2^15 randomly generated 48-bit addresses.]
17
Exact Matches in Ethernet Switches Trees and
Tries
  • Advantages
  • Fixed lookup time
  • Simple to implement and update
  • Disadvantages
  • Inefficient use of memory and/or requires large
    number of memory references

More sophisticated algorithms compress sparse
nodes.
18
Outline
  • Routing Lookups
  • Overview
  • Exact matching
  • Direct lookup
  • Associative lookup
  • Hashing
  • Trees and tries
  • Longest prefix matching
  • Why LPM?
  • Tries and compressed tries
  • Binary search on prefix intervals
  • References
  • Packet Classification

19
Longest Prefix Matching: IPv4 Addresses
  • 32-bit addresses
  • Dotted-quad notation, e.g., 12.33.32.1
  • Can be represented as integers on the IP number line [0, 2^32 - 1]: a.b.c.d denotes the integer a*2^24 + b*2^16 + c*2^8 + d (see the sketch below)

[Figure: the IP number line, running from 0.0.0.0 to 255.255.255.255.]
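A one-line illustration (the function name is mine) of mapping a dotted quad to its position on the IP number line:

```python
def dotted_quad_to_int(addr: str) -> int:
    a, b, c, d = (int(x) for x in addr.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

print(dotted_quad_to_int("12.33.32.1"))   # 203497473
```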
20
Class-based Addressing
[Figure: classes A through E laid out on the IP number line, with boundaries at 0.0.0.0, 128.0.0.0 and 192.0.0.0.]
Class | Range | MS bits | netid | hostid
A | 0.0.0.0 - 127.255.255.255 | 0 | bits 1-7 | bits 8-31
B | 128.0.0.0 - 191.255.255.255 | 10 | bits 2-15 | bits 16-31
C | 192.0.0.0 - 223.255.255.255 | 110 | bits 3-23 | bits 24-31
D (multicast) | 224.0.0.0 - 239.255.255.255 | 1110 | - | -
E (reserved) | 240.0.0.0 - 255.255.255.255 | 11110 | - | -
21
Lookups with Class-based Addresses
[Figure: class-based lookups use an exact match on the netid. The forwarding table maps netid to port: class A netid 23 -> Port 1, class B netid 186.21 -> Port 2, class C netid 192.33.32 -> Port 3; e.g., destination 192.33.32.1 matches the class C entry 192.33.32 exactly.]
22
Problems with Class-based Addressing
  • Fixed netid-hostid boundaries too inflexible
  • Caused rapid depletion of address space
  • Exponential growth in size of routing tables

23
Early Exponential Growth in Routing Table Sizes
[Figure: number of BGP routes advertised over time, showing early exponential growth.]
24
Classless Addressing (and CIDR)
  • Eliminated class boundaries
  • Introduced the notion of a variable-length prefix between 0 and 32 bits long
  • Prefixes are represented as P/l, e.g., 122/8, 212.128/13, 34.43.32/22, 10.32.32.2/32, etc.
  • An l-bit prefix represents an aggregation of 2^(32-l) IP addresses (see the sketch below)
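A small sketch (helper names are mine) of the block of 2^(32-l) addresses covered by a prefix P/l, plus a membership test:

```python
def dotted_to_int(addr: str) -> int:
    parts = [int(x) for x in addr.split(".")]
    parts += [0] * (4 - len(parts))          # pad short prefixes like "212.128"
    a, b, c, d = parts
    return (a << 24) | (b << 16) | (c << 8) | d

def prefix_range(prefix: str, length: int):
    """Return the (low, high) addresses covered by prefix/length."""
    low = dotted_to_int(prefix) & ~((1 << (32 - length)) - 1) & 0xFFFFFFFF
    high = low + (1 << (32 - length)) - 1
    return low, high

def matches(addr: str, prefix: str, length: int) -> bool:
    low, high = prefix_range(prefix, length)
    return low <= dotted_to_int(addr) <= high

# 192.2.0/22 aggregates 2^10 = 1024 addresses: 192.2.0.0 - 192.2.3.255
print(matches("192.2.1.77", "192.2.0", 22))   # True
```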

25
CIDR: Hierarchical Route Aggregation
[Figure: hierarchical route aggregation. Backbone routers R1-R4 connect to ISPs P and Q; ISP P advertises 192.2.0/22 and ISP Q advertises 200.11.0/22 to the backbone, while sites S and T behind ISP P use the more specific prefixes 192.2.1/24 and 192.2.2/24. The aggregates 192.2.0/22 and 200.11.0/22 are shown as intervals on the IP number line.]
26
Post-CIDR Routing Table sizes
Optional Exercise: What would this graph look like without CIDR? (Pick any one random AS and plot the two curves side by side.)
[Figure: number of active BGP prefixes over time. Source: http://bgp.potaroo.net]

27
Routing Lookups with CIDR
Optional Exercise: Find the nesting distribution for routes in your randomly-picked AS.
[Figure: a CIDR forwarding table with entries 192.2.2/24 -> R3, 192.2.0/22 -> R2 and 200.11.0/22 -> R4, shown as intervals on the IP number line; note that 192.2.2/24 nests inside 192.2.0/22. Example destinations: 192.2.0.1 matches only 192.2.0/22, and 200.11.0.33 matches 200.11.0/22.]
LPM: find the most specific route, i.e., the longest matching prefix among all the prefixes that match the destination address of an incoming packet (a brute-force version is sketched below).
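A brute-force rendering of this definition over the example table (illustrative only; real routers use the trie and interval structures covered next):

```python
def dotted_to_int(addr: str) -> int:
    parts = [int(x) for x in addr.split(".")]
    parts += [0] * (4 - len(parts))
    a, b, c, d = parts
    return (a << 24) | (b << 16) | (c << 8) | d

TABLE = [("192.2.2", 24, "R3"), ("192.2.0", 22, "R2"), ("200.11.0", 22, "R4")]

def lpm(addr: str):
    dst, best = dotted_to_int(addr), (-1, None)
    for prefix, length, nexthop in TABLE:
        mask = ~((1 << (32 - length)) - 1)
        if (dst & mask) == (dotted_to_int(prefix) & mask) and length > best[0]:
            best = (length, nexthop)   # keep the most specific match so far
    return best[1]

print(lpm("192.2.2.100"))   # R3 -- the /24 wins over the covering /22
print(lpm("192.2.0.1"))     # R2
```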
28
Longest Prefix Match is Harder than Exact Match
  • The destination address of an arriving packet
    does not carry with it the information to
    determine the length of the longest matching
    prefix
  • Hence, one needs to search among the space of all
    prefix lengths as well as the space of all
    prefixes of a given length

29
LPM in IPv4: Use 32 exact match algorithms for LPM!
[Figure: the destination address is matched exactly against the set of prefixes of each length 1, 2, ..., 32; a priority encoder then picks the longest length that matched and outputs its port.]
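A sketch of this scheme (the class name is mine): one exact-match table per prefix length, probed here from the longest length downward, which plays the role of the priority encoder:

```python
class Lpm32Tables:
    def __init__(self):
        # tables[l] maps the top l bits of an address to a next hop.
        self.tables = [dict() for _ in range(33)]

    def insert(self, prefix_bits: int, length: int, nexthop):
        self.tables[length][prefix_bits] = nexthop

    def lookup(self, addr32: int):
        for length in range(32, 0, -1):            # longest prefix length first
            key = addr32 >> (32 - length)
            if key in self.tables[length]:
                return self.tables[length][key]
        return None

t = Lpm32Tables()
t.insert(0b10, 2, "P2")       # prefix 10/2
t.insert(0b1010, 4, "P3")     # prefix 1010/4
print(t.lookup(0b10101111000000000000000000000000))   # P3
```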
30
Metrics for Lookup Algorithms
  • Speed (= number of memory accesses)
  • Storage requirements (= amount of memory)
  • Low update time (support >10K updates/s)
  • Scalability
  • With length of prefix: IPv4 unicast (32b), Ethernet (48b), IPv4 multicast (64b), IPv6 unicast (128b)
  • With size of routing table (sweet spot for today's designs: ~1 million prefixes)
  • Flexibility in implementation
  • Low preprocessing time

31
Radix Trie (Recap)
[Figure: a radix (1-bit) trie for the example prefix table P1 = 111 (H1), P2 = 10 (H2), P3 = 1010 (H3), P4 = 10101 (H4). Each trie node (A-H) holds a left-ptr, a right-ptr and a next-hop-ptr if the node corresponds to a prefix; the prefixes sit at the ends of the paths spelled out by their bits.]
32
Radix Trie
  • W-bit prefixes: O(W) lookup, O(NW) storage and O(W) update complexity (a lookup sketch follows below)
  • Advantages
  • Simplicity
  • Extensible to wider fields
  • Disadvantages
  • Worst-case lookup is slow
  • Storage is wasted in long chains of single-branch nodes
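A minimal LPM lookup on a binary trie, using the slides' example table (the code itself is my own sketch). The walk remembers the last next hop seen, which is the longest matching prefix when the walk stops:

```python
class Node:
    __slots__ = ("child", "nexthop")
    def __init__(self):
        self.child = [None, None]
        self.nexthop = None

def insert(root, prefix: str, nexthop):
    node = root
    for b in prefix:                         # prefix is a bit string like "1010"
        i = int(b)
        if node.child[i] is None:
            node.child[i] = Node()
        node = node.child[i]
    node.nexthop = nexthop

def lookup(root, addr_bits: str):
    node, best = root, None
    for b in addr_bits:
        node = node.child[int(b)]
        if node is None:
            break
        if node.nexthop is not None:
            best = node.nexthop              # longest match seen so far
    return best

root = Node()
for p, h in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    insert(root, p, h)
print(lookup(root, "10111"))   # H2 -- the longest matching prefix is P2 = 10
```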

33
Leaf-pushed Binary Trie
[Figure: the leaf-pushed binary trie for the same table (P1 = 111, P2 = 10, P3 = 1010, P4 = 10101). Each node slot holds either a pointer or a next hop (left-ptr or next-hop, right-ptr or next-hop), so next hops are pushed down to the leaves; P2's next hop is replicated at the leaves it covers.]
34
PATRICIA
[Figure: a PATRICIA tree for the same table (P1 = 111, P2 = 10, P3 = 1010, P4 = 10101). Internal nodes store only the bit position to test (2, 3 and 5 in the figure) plus left and right pointers; the prefixes sit at the leaves. The example lookup key is 10111 (bit positions numbered 1-5); after reaching a leaf, the key must be compared against the stored prefix, which may require backtracking.]
35
PATRICIA
  • W-bit prefixes: O(W^2) lookup, O(N) storage and O(W) update complexity
  • Advantages
  • Decreased storage
  • Extensible to wider fields
  • Disadvantages
  • Worst case lookup slow
  • Backtracking makes implementation complex

36
Path-compressed Tree
[Figure: a path-compressed tree for the same table. Each node stores a variable-length bitstring, a next hop if a prefix is present, the bit position to branch on, and left/right pointers: the root A holds (1, -, 2), one of its children holds P1, the other holds (10, P2, 4), which leads to (1010, P3, 5) and finally to a leaf holding P4.]
37
Path-compressed Tree
  • W-bit prefixes: O(W) lookup, O(N) storage and O(W) update complexity
  • Advantages
  • Decreased storage
  • Disadvantages
  • Worst case lookup slow

38
Multi-bit Tries
[Figure: a binary trie of depth W.]
Binary trie: depth = W, degree = 2, stride = 1 bit.
39
Prefix Expansion with Multi-bit Tries
If the stride is k bits, prefixes whose lengths are not a multiple of k need to be expanded. E.g., with k = 2:

Prefix | Expanded prefixes
0      | 00, 01
11     | 11

The maximum number of expanded prefixes corresponding to one non-expanded prefix is 2^(k-1). (A small expansion sketch follows below.)
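A small expansion helper (illustrative; the function name is mine) showing how a prefix is padded out to the next multiple of the stride k:

```python
def expand(prefix: str, k: int):
    """Expand a bit-string prefix to the next multiple of k bits."""
    target = -(-len(prefix) // k) * k        # round the length up to a multiple of k
    pad = target - len(prefix)
    return [prefix + format(i, f"0{pad}b") if pad else prefix
            for i in range(1 << pad)]

print(expand("0", 2))    # ['00', '01']
print(expand("11", 2))   # ['11']
```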
40
Four-ary Trie (k = 2)
[Figure: a four-ary trie (k = 2) for the expanded example table P1 = 111 (H1), P2 = 10 (H2), P3 = 1010 (H3), P4 = 10101 (H4). Each node (A-H) has four pointers (ptr00, ptr01, ptr10, ptr11) plus a next-hop-ptr if a prefix ends there; expansion replicates P1 into P11/P12 and P4 into P41/P42.]
41
Prefix Expansion Increases Storage Consumption
  • Replication of next-hop ptr
  • Greater number of unused (null) pointers in a node

Time: W/k memory accesses. Storage: (NW/k) * 2^(k-1).
Optional Exercise: The increase in the number of null pointers is a worse problem for LPM than for exact match. Why?
42
Generalization: Different Strides at Each Trie Level
  • 16-8-8 split
  • 4-10-10-8 split
  • 24-8 split
  • 21-3-8 split

Optional Exercise: Why does this not work well for IPv6?
43
Choice of Strides: Controlled Prefix Expansion [sri98]
  • Given a forwarding table and a desired number of
    memory accesses in the worst case (i.e., maximum
    tree depth, D)

A dynamic programming algorithm to compute the optimal sequence of strides that minimizes the storage requirement runs in O(W^2 * D) time. (A simplified sketch of such a dynamic program follows below.)
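A simplified rendering of this kind of fixed-stride dynamic program (my own sketch, under the assumption that each 1-bit trie node with children at a stride boundary becomes a multibit node of 2^stride entries; it is not the authors' code):

```python
from functools import lru_cache

def internal_counts(prefixes, W):
    """internal[m] = number of 1-bit trie nodes at level m that have children."""
    levels = [set() for _ in range(W + 1)]
    levels[0].add("")                        # the root
    for p in prefixes:
        for l in range(1, len(p) + 1):
            levels[l].add(p[:l])
    return [sum(1 for n in levels[m] if any(x[:-1] == n for x in levels[m + 1]))
            if m < W else 0 for m in range(W + 1)]

def min_storage(prefixes, W, D):
    internal = internal_counts(prefixes, W)

    @lru_cache(maxsize=None)
    def T(j, r):
        """Min number of array entries to cover trie levels 1..j with r strides."""
        if j == 0:
            return 0
        if r == 0:
            return float("inf")
        # The last stride covers levels m+1..j: each internal level-m node
        # gets an array of 2^(j-m) entries.
        return min(T(m, r - 1) + internal[m] * (1 << (j - m)) for m in range(j))

    return T(W, D)

# The slides' 5-bit example table with a depth budget of D = 2 memory accesses
print(min_storage(["111", "10", "1010", "10101"], W=5, D=2))   # 12 (strides 3 + 2)
```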
44
Binary Search on Prefix Intervals [lampson98]

Prefix | Interval (4-bit addresses)
P1: /0     | 0000 - 1111
P2: 00/2   | 0000 - 0011
P3: 1/1    | 1000 - 1111
P4: 1101/4 | 1101 - 1101
P5: 001/3  | 0010 - 0011
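A sketch of the interval construction and lookup for this 4-bit example (my own code; the slide only gives the table). The prefix intervals are cut into elementary intervals (I1-I6 on the next slide), each precomputed with its longest covering prefix, so a lookup is a single binary search:

```python
import bisect

W = 4
PREFIXES = {"P1": "", "P2": "00", "P3": "1", "P4": "1101", "P5": "001"}

def interval(bits):
    low = (int(bits, 2) << (W - len(bits))) if bits else 0
    return low, low + (1 << (W - len(bits))) - 1

# Cut points: each prefix interval's start, and the address just past its end.
cuts = sorted({c for b in PREFIXES.values()
               for c in (interval(b)[0], interval(b)[1] + 1) if c < (1 << W)} | {0})
# For each elementary interval [cuts[i], next cut), the longest covering prefix.
answers = [max((len(b), name) for name, b in PREFIXES.items()
               if interval(b)[0] <= lo <= interval(b)[1])[1] for lo in cuts]

def lookup(addr):
    return answers[bisect.bisect_right(cuts, addr) - 1]

print(lookup(0b0111))   # P1
print(lookup(0b1101))   # P4
print(lookup(0b0010))   # P5
```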
45
Alphabetic Tree
[Figure: an alphabetic tree over the interval endpoints. The root compares the address against 0111; lower nodes compare against 0011, 1101, 0001 and 1100 (branching on <= / >), and the leaves are the elementary intervals I1-I6.]
46
Another Alphabetic Tree
[Figure: a skewed alphabetic tree over the same endpoints (0001, 0011, 0111, 1100, 1101). The leaves I1-I6 sit at depths matching their weights 1/2, 1/4, 1/8, 1/16, 1/32 and 1/32, so heavily weighted intervals are reached in fewer comparisons.]
47
Multiway Search on Intervals
  • N W-bit prefixes: O(log N) lookup, O(N) storage
  • Advantages
  • Storage is linear
  • Can be balanced
  • Lookup time independent of W
  • Disadvantages
  • But, lookup time is dependent on N
  • Incremental updates are more complex than for tries
  • Each node is big, which requires higher memory bandwidth

48
Routing Lookups: References
  • [lulea98] A. Brodnik, S. Carlsson, M. Degermark, S. Pink, "Small forwarding tables for fast routing lookups," Sigcomm 1997, pp. 3-14. An example of techniques for decreasing storage consumption.
  • [gupta98] P. Gupta, S. Lin, N. McKeown, "Routing lookups in hardware at memory access speeds," Infocom 1998, pp. 1241-1248, vol. 3. An example of a hardware-optimized trie with increased storage consumption.
  • P. Gupta, B. Prabhakar, S. Boyd, "Near-optimal routing lookups with bounded worst case performance," Proc. Infocom, March 2000. An example of deliberately skewing alphabetic trees.
  • P. Gupta, "Algorithms for routing lookups and packet classification," PhD thesis, Ch. 1 and 2, Dec 2000, available at http://yuba.stanford.edu/pankaj/phd.html. Background and introduction to LPM.

49
Routing Lookups: References (contd.)
  • [lampson98] B. Lampson, V. Srinivasan, G. Varghese, "IP lookups using multiway and multicolumn search," Infocom 1998, pp. 1248-1256, vol. 3. Multiway search.
  • [PerfHash] Y. Lu, B. Prabhakar, F. Bonomi, "Perfect hashing for network applications," ISIT, July 2006.
  • [LC-trie] S. Nilsson, G. Karlsson, "Fast address lookup for Internet routers," IFIP Intl Conf on Broadband Communications, Stuttgart, Germany, April 1-3, 1998.
  • [sri98] V. Srinivasan, G. Varghese, "Fast IP lookups using controlled prefix expansion," Sigmetrics, June 1998.
  • [wald98] M. Waldvogel, G. Varghese, J. Turner, B. Plattner, "Scalable high speed IP routing lookups," Sigcomm 1997, pp. 25-36.