Title: A Low-Power Network Search Engine Based on Statistical Partitioning
1. A Low-Power Network Search Engine Based on
Statistical Partitioning
- Taskin Kocak and Faysal Basci
- Dept. of Electrical and Computer Engineering
- University of Central Florida, Orlando, FL 32816
- tkocak_at_cpe.ucf.edu
2. Introduction
- Stringent memory access and search speed
requirements are two of the main bottlenecks in
wire-speed packet processing.
- Most viable search engines are implemented in
Content Addressable Memories (CAMs).
- In networking applications, CAMs are generally
used in VLAN lookup tables (layer 2), packet
forwarding lookup tables (layer 3), session lookup
tables (layer 4), and filtering (layers 5-7).
- CAMs have a high operational speed advantage over
other memory search methods, such as look-aside
tag buffers and binary or tree-based searches.
- However, this performance advantage comes at the
price of higher silicon area and higher power
consumption (3-5 W/chip).
3. CAM Architecture
4. TCAMs in IP forwarding
- The don't-care storage capability of TCAMs is
favorable in lookup engines.
- e.g., for a 24-bit prefix, the last 8 bits will
be x
- Routing table entries stored in TCAMs are
ordered by prefix length, longest first.
- In the case of multiple matches, the priority
encoder chooses the one with the lowest address,
which is the longest matching prefix.
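The lookup behavior described on this slide can be illustrated with a small software model. This is only a sketch, not the hardware design: the function names `build_tcam` and `tcam_lookup` and the sample routes are hypothetical, and the parallel row comparison of a real TCAM is emulated with a loop.

```python
from ipaddress import ip_address, ip_network


def build_tcam(routes):
    """Build TCAM rows as (value, mask, next_hop), longest prefix first.

    A mask bit of 0 plays the role of a don't-care (x) bit.
    """
    nets = [(ip_network(cidr), next_hop) for cidr, next_hop in routes]
    # Longest prefixes get the lowest addresses (row indices).
    nets.sort(key=lambda item: -item[0].prefixlen)
    return [(int(n.network_address), int(n.netmask), nh) for n, nh in nets]


def tcam_lookup(tcam, addr):
    """Return the next hop of the lowest-address matching row, or None."""
    key = int(ip_address(addr))
    # Hardware compares every row at once; here we collect all matches.
    matches = [i for i, (value, mask, _) in enumerate(tcam)
               if (key & mask) == value]
    if not matches:
        return None
    # Priority encoder: the lowest address wins, i.e. the longest prefix.
    return tcam[min(matches)][2]


# Hypothetical routing table for illustration only.
routes = [("10.0.0.0/8", "A"), ("10.1.2.0/24", "C"), ("10.1.0.0/16", "B")]
tcam = build_tcam(routes)
```

With these sample routes, `tcam_lookup(tcam, "10.1.2.3")` matches all three rows and the priority encoder step selects the /24 entry.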
5. 32-bit IP Prefix Distribution
6. Partitioned TCAM
Perform a search in TCAM1
If there is a match
    DONE
otherwise
    search TCAM2
    If there is a match
        DONE
    otherwise
        issue mismatch
end
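The search procedure above can be sketched in software as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, and the rule used to split entries (prefixes of length 24 or more go to TCAM1) is an assumption chosen so that a hit in TCAM1 is still the longest match; the slides do not state how entries are assigned to partitions.

```python
from ipaddress import ip_address, ip_network


def make_rows(routes):
    """Return (value, mask, next_hop) rows, longest prefix first."""
    nets = [(ip_network(cidr), nh) for cidr, nh in routes]
    nets.sort(key=lambda item: -item[0].prefixlen)
    return [(int(n.network_address), int(n.netmask), nh) for n, nh in nets]


def search(rows, key):
    """Return the next hop of the first (lowest-address) matching row."""
    for value, mask, next_hop in rows:
        if (key & mask) == value:
            return next_hop
    return None


def partitioned_lookup(tcam1, tcam2, addr):
    """Search TCAM1 first and TCAM2 only on a miss.

    Returns (next_hop, partitions_searched); next_hop is None on mismatch.
    """
    key = int(ip_address(addr))
    hit = search(tcam1, key)
    if hit is not None:
        return hit, 1          # DONE after one search
    hit = search(tcam2, key)
    if hit is not None:
        return hit, 2          # DONE after two searches
    return None, 2             # issue mismatch


# Hypothetical routes; the length-24 threshold is an assumption.
routes = [("10.0.0.0/8", "A"), ("10.1.0.0/16", "B"), ("10.1.2.0/24", "C")]
tcam1 = make_rows([r for r in routes if ip_network(r[0]).prefixlen >= 24])
tcam2 = make_rows([r for r in routes if ip_network(r[0]).prefixlen < 24])
```

In this sketch TCAM2 is never activated when TCAM1 hits, which is where the power saving would come from.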
7. Power Consumption
$P_{TCAM} = P_{STATIC} + P_{CLOCK} + P_{MATCH} + P_{MISS} + P_X$
Here, we are concerned with the dynamic power
consumption caused by the search (comparison)
operation,
where ? represents the probability that a match
will occur in TCAM1
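To see how the TCAM1 match probability affects search power, the following sketch may help. It is not the authors' power equation, which does not survive in this text. It assumes, as a simplification, that the dynamic power of one search is proportional to the number of entries compared, and that a partition is activated only if all earlier partitions missed. The function names and the numbers are hypothetical.

```python
def expected_search_power(sizes, hit_probs, power_per_entry=1.0):
    """Expected dynamic power of one lookup over ordered partitions.

    sizes[i]     : number of entries in partition i (search order)
    hit_probs[i] : probability that the lookup is resolved in partition i
    Assumption: searching a partition costs size * power_per_entry.
    """
    expected = 0.0
    reach = 1.0  # probability that partition i is searched at all
    for size, p in zip(sizes, hit_probs):
        expected += reach * size * power_per_entry
        reach -= p
    return expected


def relative_saving(sizes, hit_probs):
    """Fractional saving versus searching all entries on every lookup."""
    full = float(sum(sizes))
    return 1.0 - expected_search_power(sizes, hit_probs) / full


# Hypothetical example: two equal partitions, 75% of lookups hit TCAM1.
saving = relative_saving([500, 500], [0.75, 0.25])
```

Under these assumptions the example searches 625 entries on average instead of 1000, a saving of 0.375.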
8. Experimental Results: dynamic power
An example TCAM circuit was implemented in Cadence
for TSMC 0.18-µm CMOS. Simulations were run at 100
MHz.
$P_{MATCH}$ = 397 nW, $P_{MISS}$ = 515 nW, $P_X$ = 336 nW
9. Experimental Results: dynamic power
10. Dynamic power
11. Power consumption with different miss rates for
Telstra
12. Experimental Results: power-latency
E[L] = 2 - ?0, where L is the search latency in clock cycles
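The expected latency can be worked out with a short sketch. It assumes one clock cycle per partition searched, and that a lookup missing every partition still pays for all of them; the function name `expected_latency` and the probabilities are hypothetical. For two partitions this reduces to 2 - p, where p is the probability of a match in TCAM1.

```python
def expected_latency(hit_probs):
    """Expected search latency in clock cycles for ordered partitions.

    hit_probs[i] is the probability that the lookup is resolved in
    partition i. Assumption: each partition searched costs one cycle.
    """
    n = len(hit_probs)
    miss_all = 1.0 - sum(hit_probs)  # lookups that match nowhere
    hit_cycles = sum((i + 1) * p for i, p in enumerate(hit_probs))
    return hit_cycles + n * miss_all


# Hypothetical probabilities for illustration only.
two_way = expected_latency([0.75, 0.25])
three_way = expected_latency([0.5, 0.25, 0.25])
```

The same function covers three partitions, where a larger share of lookups resolved in the first partition keeps the average close to one cycle.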
13. Experimental Results
Expected power savings and average latency when
partitioning into three TCAMs
14. Reconfigurable Architecture
15. Conclusions
- We presented a TCAM partitioning scheme that
utilizes the statistical distribution of prefixes.
- We showed that partitioning indeed helps reduce
the power consumption in IP lookup applications.
- Partitioning into two or three TCAMs reduces
power consumption considerably, whereas further
partitioning shows little improvement.
16. ANCHOR workshop at ISCA 2004