Title: 29 March 2014
1 Packet Classification using Extended TCAMs
- Edward W. Spitznagel, Jonathan S. Turner, David E. Taylor
- Supported by NSF ANI-9813723, DARPA N660001-01-1-8930
2 Packet Classification Problem
- Suppose you are a firewall, a QoS router, or a network monitor ...
- You are given a list of rules (filters) that determine how to process incoming packets, based on the packet header fields
- Some fields in the rules are specified with bit masks, others with ranges
- Goal: when a packet arrives, find the first rule that matches the packet's header fields
3 Packet Classification Problem
- Example: a packet arrives with header (0101, 0010, 3, 5, UDP)
- Classification result: filter b is matched
- Filter c also matches, but b occurs before c in the list
- Easy to do when we have only a few rules; very difficult when we have 100,000 rules and packets arrive at 40 Gb/s
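The first-match rule above can be sketched in software as a plain linear scan. This is only an illustration of the semantics: the field names (`src`, `dst`, `sport`, `dport`, `proto`) and the three filters are hypothetical, chosen so that b and c both match the example header, since the slide's own filter table is not reproduced in this text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


def ternary_match(pattern: str, bits: str) -> bool:
    """True if every pattern bit is 'x' (don't care) or equals the query bit."""
    return len(pattern) == len(bits) and all(
        p in ("x", b) for p, b in zip(pattern, bits)
    )


@dataclass(frozen=True)
class Filter:
    name: str
    src: str                  # bit-mask field, e.g. "01xx"
    dst: str                  # bit-mask field
    sport: Tuple[int, int]    # inclusive range (lo, hi)
    dport: Tuple[int, int]    # inclusive range (lo, hi)
    proto: Optional[str]      # None means "any protocol"

    def matches(self, header) -> bool:
        src, dst, sport, dport, proto = header
        return (
            ternary_match(self.src, src)
            and ternary_match(self.dst, dst)
            and self.sport[0] <= sport <= self.sport[1]
            and self.dport[0] <= dport <= self.dport[1]
            and self.proto in (None, proto)
        )


def classify(filters: Sequence[Filter], header) -> Optional[Filter]:
    """Return the first filter in list order that matches, or None."""
    for f in filters:
        if f.matches(header):
            return f
    return None


# Hypothetical filter list (assumption, not the slide's table).
FILTERS = [
    Filter("a", "1xxx", "0010", (0, 15), (0, 15), "TCP"),
    Filter("b", "01xx", "001x", (2, 4), (5, 5), "UDP"),
    Filter("c", "xxxx", "xxxx", (0, 15), (0, 15), None),
]

header = ("0101", "0010", 3, 5, "UDP")
print(classify(FILTERS, header).name)  # b
```

The scan costs one comparison per rule, which is why it stops being practical at the rule counts and line rates mentioned above.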
4 Geometric Representation
- Filters with K fields can be represented geometrically in K dimensions
- Example: [Figure: filters a, b, and c drawn as regions]
5 Related Work
- TCAM-based parallel classification
- CoolCAMs (Narlikar, Basu, Zane) for IP lookup
- SRAM-based sequential classification
- Recursive Flow Classification (Gupta, McKeown)
- HiCuts (Gupta, McKeown)
- Extended Grid of Tries (Baboescu, Singh, Varghese)
- HyperCuts (Singh, Baboescu, Varghese, Wang)
- SRAM: 6 transistors per bit (vs. 16 for TCAM), but the SRAM approaches use more bits per filter
6 Ternary CAMs
- Most popular practical approach to high-performance packet classification
- Hardware compares query word (packet header) to all stored words (filters) in parallel
- Each bit of a stored word can be 0, 1, or X (don't care)
- Very fast, but not without drawbacks
- High power consumption limits scalability
- Inefficient representation of ranges
7 Ternary CAM - Example
Entry 0 (filter a) is the first matching filter
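A minimal sketch of the ternary lookup, assuming each stored word is kept as a (value, care mask) pair. The stored words below are hypothetical and are not the table from the example slide. Hardware compares all entries in parallel and a priority encoder reports the lowest matching index; the loop only models that result.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, int]  # (value, care_mask); a 0 bit in care_mask means X


def parse_ternary(word: str) -> Entry:
    """Convert a string such as '01x1' into a (value, care_mask) pair."""
    value = care = 0
    for ch in word.lower():
        value <<= 1
        care <<= 1
        if ch in ("0", "1"):
            value |= int(ch)
            care |= 1
        elif ch != "x":
            raise ValueError(f"bad ternary digit: {ch!r}")
    return value, care


def tcam_lookup(entries: List[Entry], query: int) -> Optional[int]:
    """Index of the first matching entry, or None if nothing matches."""
    for index, (value, care) in enumerate(entries):
        # A mismatch only counts on bit positions the entry cares about.
        if ((query ^ value) & care) == 0:
            return index
    return None


# Hypothetical stored words (assumption, for illustration only).
table = [parse_ternary(w) for w in ["01xx", "0x10", "xxxx"]]
print(tcam_lookup(table, 0b0110))  # 0
print(tcam_lookup(table, 0b0010))  # 1
print(tcam_lookup(table, 0b1111))  # 2
```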
8 Range Matching in TCAMs
- Convert ranges into sets of prefixes
- 1-4 becomes 001, 01, and 100
- 3-5 becomes 011 and 10
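The conversion can be sketched as a greedy split into aligned power-of-two blocks. The function name and signature are illustrative choices, not something given in the slides; each prefix is returned as a bit string whose omitted trailing bits are don't-cares.

```python
from typing import List


def range_to_prefixes(lo: int, hi: int, width: int) -> List[str]:
    """Split the inclusive range [lo, hi] into a minimal list of prefixes."""
    if not 0 <= lo <= hi < (1 << width):
        raise ValueError("range does not fit in the given width")
    prefixes = []
    while lo <= hi:
        # Largest aligned power-of-two block that starts at lo ...
        size = (lo & -lo) if lo else (1 << width)
        # ... shrunk until it no longer runs past hi.
        while size > hi - lo + 1:
            size >>= 1
        free_bits = size.bit_length() - 1   # trailing don't-care bits
        kept = width - free_bits            # bits that stay in the prefix
        prefixes.append(format(lo >> free_bits, f"0{kept}b") if kept else "")
        lo += size
    return prefixes


print(range_to_prefixes(1, 4, 3))  # ['001', '01', '100']
print(range_to_prefixes(3, 5, 3))  # ['011', '10']
```

Each returned prefix becomes one TCAM entry, so a single range field can multiply the number of entries a rule occupies.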
9 Range Matching in TCAMs
- With two 16-bit range fields, a single rule could require up to 900 TCAM entries!
- Typical case: entire filter set expands by a factor of 2 to 6
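The 900 figure is consistent with the usual worst case for prefix expansion, which the slide does not spell out: a range over a w-bit field can need up to 2(w - 1) prefixes (for example the range from 1 to 2^w - 2), and a rule with two range fields needs one entry for every pair of prefixes. As a worked check under that reading:

```latex
\[
  2(w-1)\,\Big|_{w=16} = 30 \ \text{prefixes per field},
  \qquad
  30 \times 30 = 900 \ \text{TCAM entries for one rule.}
\]
```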
10 Extended TCAMs
- Extend standard TCAM architecture to enable classification with larger rulesets
- Partitioned TCAM, for reduced power
- Inspired by CoolCAMs
- Differences in indexing, search and partitioning algorithms
- Support range matching directly in hardware
11 Use of Partitioned TCAM
- Main component of power use in a TCAM search is proportional to the number of entries searched
- Partitioning the TCAM
- Divide the TCAM into blocks of entries
- Each block is enabled for search via an associated index filter
12 Use of Partitioned TCAM
- Example: suppose we are given the following filters
a. 1-13, 001x
b. 2-3, 00xx
c. 9-10, xxx1
d. 11-14, 011x
e. 12-13, 0xxx
f. 0-14, 1010
g. 7-7, 110x
h. 0-5, 1110
i. 1-2, 1x1x
j. 13-14, 11xx
k. 11-15, 111x
A real Extended TCAM would have more blocks, and more filters per block.
13 Use of Partitioned TCAM
- Example: classify packet with header values (2, 1010)
- Index block: second and fourth filters match
- Search second and fourth filter blocks
- Find matching filters (1-2, 1x1x) and (0-14, 1010)
[Figure: index filters and filter blocks]
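The two-stage search can be sketched as follows. The filters are the ones listed in the example, but the index filters and the assignment of filters to blocks are assumptions made for illustration, because the slide's figure is not available in this text. The layout was chosen so that the packet (2, 1010) enables the second and fourth blocks, as in the example.

```python
from typing import Dict, List, Tuple

Rule = Tuple[int, int, str]  # (range lo, range hi, ternary pattern)


def matches(rule: Rule, value: int, bits: str) -> bool:
    lo, hi, pattern = rule
    return lo <= value <= hi and all(p in ("x", b) for p, b in zip(pattern, bits))


# Filters from the example, in list (priority) order.
FILTERS: Dict[str, Rule] = {
    "a": (1, 13, "001x"), "b": (2, 3, "00xx"), "c": (9, 10, "xxx1"),
    "d": (11, 14, "011x"), "e": (12, 13, "0xxx"), "f": (0, 14, "1010"),
    "g": (7, 7, "110x"), "h": (0, 5, "1110"), "i": (1, 2, "1x1x"),
    "j": (13, 14, "11xx"), "k": (11, 15, "111x"),
}

# ASSUMED layout: (index filter, names of the filters stored in its block).
BLOCKS: List[Tuple[Rule, List[str]]] = [
    ((1, 13, "0xxx"), ["a", "b", "e"]),
    ((0, 5, "1x1x"), ["h", "i"]),
    ((7, 15, "11xx"), ["g", "j", "k"]),
    ((0, 14, "xxxx"), ["c", "d", "f"]),
]


def search(value: int, bits: str):
    """Return (enabled block numbers, matching filter names, entries searched)."""
    # Stage 1: search the index; only blocks whose index filter matches are enabled.
    enabled = [n for n, (index, _) in enumerate(BLOCKS, start=1)
               if matches(index, value, bits)]
    # Stage 2: search the filters stored in the enabled blocks only.
    found = [name for n in enabled for name in BLOCKS[n - 1][1]
             if matches(FILTERS[name], value, bits)]
    searched = len(BLOCKS) + sum(len(BLOCKS[n - 1][1]) for n in enabled)
    # Names run a..k in list order, so sorting by name is sorting by priority.
    return enabled, sorted(found), searched


print(search(2, "1010"))  # ([2, 4], ['f', 'i'], 9)
```

In this toy layout the lookup touches 9 entries (4 index filters plus 5 stored filters) instead of all 11, and the saving grows with more blocks and larger filter sets.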
14 Use of Partitioned TCAM
- The key to minimizing power consumption: organize filters so that only a few TCAM blocks must be searched to find the filters matching a packet.
- Use a filter grouping algorithm
[Figure: index filters and filter blocks]
15
a. 1-13, 001x
b. 2-3, 00xx
c. 9-10, xxxx
d. 11-14, 011x
e. 12-13, 0xxx
f. 0-14, 1010
g. 7-7, 110x
h. 0-5, 1110
i. 1-2, 11xx
j. 13-14, 11xx
k. 11-15, 111x
18
a. 1-13, 001x
b. 2-3, 00xx
c. 9-10, xxxx
d. 11-14, 011x
e. 12-13, 0xxx
f. 0-14, 1010
g. 7-7, 110x
h. 0-5, 1110
i. 1-2, 11xx
j. 13-14, 11xx
k. 11-15, 111x
Region (0-6, 1xxx): h, i
Region (7-15, 1xxx): g, j, k
Next phase
20 Creating a set of partitions
- At most k filters per region (k = block size)
- Regions within the same partition do not overlap
- Total number of regions equals the index size
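The three constraints can be written down as a checker. This is a sketch of the constraints only, not of the grouping algorithm itself, and the data layout (a partition as a mapping from region to filter names) is an assumption. The sample data treats the two regions from the grouping example, (0-6, 1xxx) and (7-15, 1xxx), as one partition; the block size of 3 and index size of 2 are also assumed values.

```python
from typing import Dict, List, Tuple

Region = Tuple[int, int, str]  # (range lo, range hi, ternary pattern)


def overlap(r: Region, s: Region) -> bool:
    """Two regions overlap if some (value, bits) pair lies in both."""
    ranges_meet = r[0] <= s[1] and s[0] <= r[1]
    bits_meet = all(p == q or "x" in (p, q) for p, q in zip(r[2], s[2]))
    return ranges_meet and bits_meet


def check_partitions(partitions: List[Dict[Region, List[str]]],
                     block_size: int, index_size: int) -> List[str]:
    """Return a list of violated constraints (empty if all three hold)."""
    problems = []
    total_regions = 0
    for number, partition in enumerate(partitions, start=1):
        regions = list(partition)
        total_regions += len(regions)
        # Constraint 1: at most block_size filters per region.
        for region, names in partition.items():
            if len(names) > block_size:
                problems.append(f"partition {number}: {region} holds too many filters")
        # Constraint 2: regions within one partition do not overlap.
        for i, r in enumerate(regions):
            for s in regions[i + 1:]:
                if overlap(r, s):
                    problems.append(f"partition {number}: {r} overlaps {s}")
    # Constraint 3: total number of regions equals the index size.
    if total_regions != index_size:
        problems.append("number of regions differs from index size")
    return problems


example = [{(0, 6, "1xxx"): ["h", "i"], (7, 15, "1xxx"): ["g", "j", "k"]}]
print(check_partitions(example, block_size=3, index_size=2))  # []
```

Because regions in one partition are disjoint, a packet can enable at most one block per partition, which is what bounds the number of entries searched.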
21 Range Matching
- Store a pair of values (lo, hi) for each range match field
- Range check circuitry compares query values against lo and hi to determine if the query is in range
- Transistors per bit of a range field is twice that of an ordinary TCAM
- But, for typical IPv4 applications, this results in just a 22% increase in overall transistor count
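One way to arrive at a figure of this size, under assumptions the slide does not state: take a 144-bit TCAM entry in which the two 16-bit port fields are the only range fields. Doubling the transistor cost of those 32 bits then adds the equivalent of 32 ordinary bits:

```latex
\[
  \frac{(144 - 32) + 2 \cdot 32}{144} = \frac{176}{144} \approx 1.22
\]
```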
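22 Performance Metrics
- Power Fraction
- A measure of power usage, relative to a standard TCAM
- Smaller is better
\[ \text{power fraction} = \frac{\text{index size} + (\text{\# of partitions})(\text{block size})}{\text{number of filters}} \]
- Storage Efficiency
- Higher is better; 1 is optimal
\[ \text{storage efficiency} = \frac{\text{number of filters}}{\text{index size} + (\text{\# of blocks})(\text{block size})} \]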
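Reading the two ratios on this slide as (index size + partitions x block size) / filters and filters / (index size + blocks x block size), both metrics are easy to compute. The configuration below is hypothetical and chosen only to exercise the formulas; it is not a result from the talk.

```python
def power_fraction(index_size: int, partitions: int, block_size: int,
                   num_filters: int) -> float:
    """Entries searched per lookup, relative to searching every filter."""
    return (index_size + partitions * block_size) / num_filters


def storage_efficiency(index_size: int, blocks: int, block_size: int,
                       num_filters: int) -> float:
    """Filters stored per TCAM entry provisioned (index plus all blocks)."""
    return num_filters / (index_size + blocks * block_size)


# Hypothetical configuration (assumption, not measured data).
n, block, blocks, parts = 10_000, 64, 200, 4
index = blocks  # assumes one index filter per block

print(round(power_fraction(index, parts, block, n), 4))       # 0.0456
print(round(storage_efficiency(index, blocks, block, n), 4))  # 0.7692
```

The two metrics pull against each other: smaller blocks reduce the entries searched per partition but need more index entries, and partly filled blocks lower storage efficiency.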
23 Different Block Sizes
[Figure labels: block size 16, 32, 64, 128, 256]
24 Results: Power Fraction
[Figure labels: Basic Algorithm, Refined; block size 32, 64, 128, 256]
25 Results: Storage Efficiency
[Figure labels: Basic Algorithm, Refined; block size 32, 64, 128, 256]
26 Current/Future Work
- Computational complexity of filter grouping problem
- Filter updates (add/delete operations)
- Multi-level indices
- Different partitioning algorithms
- Application to SRAM/DRAM-based classification
techniques
27 Summary
- Packet classification is important for many advanced network services
- TCAMs scale poorly due to power consumption and inefficient range match representations
- Extended TCAMs solve these issues by using partitioned TCAM and hardware support for range matching
- Power consumption greatly reduced (typically to 5% or less of the power used by a standard TCAM)
- Range match hardware avoids inefficiency in representing ranges
28 Questions?