Title: System-on-Chip Packet Processor for an Experimental Network Services Platform
1 System-on-Chip Packet Processor for an Experimental Network Services Platform
- David Taylor, Alex Chandra, Yuhua Chen,
- Sarang Dharmapurikar, John Lockwood,
- Wenjing Tang, Jonathan Turner
2 Advanced Network Services
- Mobile client services
- Content filtering for lightweight clients
- Wireless transport protocol bridging
- Admission control
- Internet telephony
- Audio compression
- Audio encryption
- Multi-media services
- Video conferencing
- Audio/video transcoding
- Audio/video bridging
3 Motivation and Design Goals
- Need open-platform systems for extensible networking research
- Shift focus from raw performance to delivery of advanced applications
- Provide realistic prototyping environment
- Support Gigabit links and switch fabrics with 2:1 speed advantage
- Modern FPGAs provide a flexible platform for SOC development without loss of flexibility
- Consolidate many packet processing functions on a single chip
- Employ new packet classification algorithms or queuing disciplines
- Minimize data movement
- Off-chip data transfers are often the performance bottleneck
- Provide efficient support for multicast and monitoring applications
- Reliable operation under extreme traffic conditions
- Ex. mechanisms to prevent tail-dropping during overload
- Utilize internal header fields (shims) to pass information between components and ports
- Prevent redundant computations
4 Support for Efficient Multicast
- Extension of a unique binary tree multicast algorithm
- First employed in Gigabit ATM switch
- Multicast flow is broken into binary copy steps and processed in multiple passes
- Scales favorably in all dimensions of interest
- Memory space and bandwidth at participating ports
- Switch fabric bandwidth
- Control overhead for adding/removing nodes
- Allow ports to participate in multiple binary copy steps
- Encode the position in the multicast tree using a shim field (MTP)
- Allows a port processor to store multiple multicast filters for a single multicast session by including the MTP in classification
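The binary copy idea on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the platform's implementation: the heap-style numbering used for the MTP (root = 1, children of k are 2k and 2k+1), the filter table layout, and the names `build_copy_filters` and `deliver` are all hypothetical. It shows how keying filters on (port, session, MTP) lets one port take part in several copy steps of the same session.

```python
from collections import deque


def build_copy_filters(session, source_port, dest_ports):
    """Return {(port, session, mtp): [(next_port, next_mtp), ...]}.

    Assumption: tree nodes are numbered heap-style and that number is
    carried as the MTP. next_mtp is None when the copy leaves the tree
    and goes out on the destination port's link.
    """
    filters = {}
    # Work item: (port holding the packet, its MTP, destinations below it)
    work = deque([(source_port, 1, list(dest_ports))])
    while work:
        port, mtp, dests = work.popleft()
        if len(dests) <= 2:
            # Final step: at most two copies, each sent to an output link.
            filters[(port, session, mtp)] = [(d, None) for d in dests]
            continue
        half = len(dests) // 2
        copies = []
        for child_mtp, group in ((2 * mtp, dests[:half]),
                                 (2 * mtp + 1, dests[half:])):
            if len(group) == 1:
                copies.append((group[0], None))
            else:
                # Send one copy to the first port of the group, which
                # performs the next binary copy step in a later pass.
                copies.append((group[0], child_mtp))
                work.append((group[0], child_mtp, group))
        filters[(port, session, mtp)] = copies
    return filters


def deliver(filters, session, source_port):
    """Follow the copy steps; return (output ports, number of copy steps)."""
    outputs, steps = [], 0
    work = deque([(source_port, 1)])
    while work:
        port, mtp = work.popleft()
        steps += 1
        for next_port, next_mtp in filters[(port, session, mtp)]:
            if next_mtp is None:
                outputs.append(next_port)
            else:
                work.append((next_port, next_mtp))
    return outputs, steps


if __name__ == "__main__":
    table = build_copy_filters("s1", "S", list("ABCDEFGH"))
    print(deliver(table, "s1", "S"))
    # Port A holds two filters for the same session, told apart by MTP.
    print(sorted(k for k in table if k[0] == "A"))
```

In this sketch no step ever produces more than two copies, so the per-port work stays constant as the session grows; only the number of passes increases.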
5 Network Services Platform
6 Port Processor Architecture
[Block diagram: Control; Header Processing; Data Path; Packet Storage Manager (PSM); SW; ISAR; OSAR; LC]
7 Segmentation and Reassembly
8 Packet Storage Manager
- Buffers variable length IP packets in SDRAM
- Maintains context identifiers for reassembly contexts
- Receives/forwards fixed-size chunks of packets to/from SAR blocks
- Dynamically allocates/deallocates fixed-size chunks of memory
- Maintains a free list of chunk pointers
- Maintains copy counts for multicast packets
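The free list and copy counts listed above can be sketched in software. This is a minimal model under assumptions, not the hardware design: the class name `ChunkStore`, the linked-chunk layout, and the chunk size are hypothetical, and reassembly contexts are not modeled. It shows fixed-size chunks being taken from a free list and returned only after the last multicast copy has been read.

```python
from collections import deque


class ChunkStore:
    """Toy packet buffer built from fixed-size chunks (hypothetical layout)."""

    def __init__(self, num_chunks, chunk_size=64):
        self.chunk_size = chunk_size
        self.memory = [b""] * num_chunks
        self.next_chunk = [None] * num_chunks   # link to the following chunk
        self.free_list = deque(range(num_chunks))
        self.copy_count = {}                    # head chunk -> copies still to send

    def store(self, packet, copies=1):
        """Store a packet; return its head chunk pointer, or None if full."""
        needed = max(1, -(-len(packet) // self.chunk_size))  # ceiling division
        if needed > len(self.free_list):
            return None
        chunks = [self.free_list.popleft() for _ in range(needed)]
        for i, c in enumerate(chunks):
            start = i * self.chunk_size
            self.memory[c] = packet[start:start + self.chunk_size]
            self.next_chunk[c] = chunks[i + 1] if i + 1 < needed else None
        self.copy_count[chunks[0]] = copies
        return chunks[0]

    def read(self, head):
        """Return the packet; free its chunks once the last copy is read."""
        data, chain, c = [], [], head
        while c is not None:
            data.append(self.memory[c])
            chain.append(c)
            c = self.next_chunk[c]
        self.copy_count[head] -= 1
        if self.copy_count[head] == 0:
            del self.copy_count[head]
            self.free_list.extend(chain)
        return b"".join(data)


if __name__ == "__main__":
    psm = ChunkStore(num_chunks=4, chunk_size=4)
    head = psm.store(b"abcdefghij", copies=2)   # occupies 3 chunks
    print(psm.read(head), len(psm.free_list))   # first copy: chunks stay allocated
    print(psm.read(head), len(psm.free_list))   # second copy: chunks are freed
```

Keeping one stored copy with a count, rather than duplicating the packet per destination, matches the stated goal of minimizing data movement.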
9 Classification and Route Lookup (CARL)
- Determines the processing and queuing actions to be performed for each packet
- Employs 3 parallel search engines: general filters, exact match filters, route lookup
- Supports 32 general filters on the IP/transport 5-tuple
- Supports 20k exact match filters on the IP/transport 5-tuple and the Multicast Tree Position (MTP) shim field
- Supports 50k routes, longest prefix match on the IP destination address
- Efficient implementation of compressed multi-bit trie algorithm, Tree Bitmap
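As an illustration of longest prefix match on the IP destination address, here is a minimal sketch using a plain one-bit-per-level binary trie. It is deliberately not Tree Bitmap: the compressed multi-bit trie used by CARL is not reproduced here, and the class name `BinaryTrie` and the next-hop labels are hypothetical. The lookup semantics are the same, only the data structure is simpler and slower.

```python
import ipaddress


class BinaryTrie:
    """Unibit trie for IPv4 longest prefix match (illustrative only)."""

    def __init__(self):
        # Each node is a dict with optional keys "0", "1" (children)
        # and "hop" (next hop stored at this prefix).
        self.root = {}

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = format(int(net.network_address), "032b")[:net.prefixlen]
        node = self.root
        for b in bits:
            node = node.setdefault(b, {})
        node["hop"] = next_hop

    def lookup(self, address):
        """Return the next hop of the longest matching prefix, or None."""
        bits = format(int(ipaddress.ip_address(address)), "032b")
        node = self.root
        best = node.get("hop")
        for b in bits:
            node = node.get(b)
            if node is None:
                break
            # Remember the deepest prefix seen so far along this path.
            best = node.get("hop", best)
        return best


if __name__ == "__main__":
    routes = BinaryTrie()
    routes.insert("0.0.0.0/0", "default")
    routes.insert("10.0.0.0/8", "port-A")
    routes.insert("10.1.0.0/16", "port-B")
    for addr in ("10.1.2.3", "10.2.3.4", "192.168.1.1"):
        print(addr, routes.lookup(addr))
```

A multi-bit trie examines several address bits per memory access instead of one, which is what makes a hardware implementation with off-chip memory practical.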
10 Queue Manager
- Manages three sets of queues for packets bound for the switch, link, or Processing Elements
- Packets destined for link are inserted in a per-flow queue (for flows with reserved bandwidth) or one of 64 best-effort queues
- Per-flow queues scheduled with Self-Clocked Fair Queueing
- Best effort queues scheduled with Queue State Deficit Round Robin
- Packets destined for switch are inserted in virtual output queues and scheduled using a distributed queueing algorithm
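To give a feel for how the best-effort queues might share a link, the sketch below implements plain Deficit Round Robin. This is an assumption-laden illustration: it is the textbook algorithm, not the Queue State variant named on the slide, and the function name `drr_schedule`, the quantum, and the packet lengths are hypothetical.

```python
from collections import deque


def drr_schedule(queues, quantum, rounds):
    """Serve `queues` (deques of packet lengths in bytes) for `rounds` passes.

    Returns the send order as a list of (queue_index, packet_length).
    """
    deficit = [0] * len(queues)
    sent = []
    for _ in range(rounds):
        for i, q in enumerate(queues):
            if not q:
                deficit[i] = 0      # idle queues do not bank credit
                continue
            deficit[i] += quantum
            # Send packets while the head packet fits in the credit.
            while q and q[0] <= deficit[i]:
                length = q.popleft()
                deficit[i] -= length
                sent.append((i, length))
            if not q:
                deficit[i] = 0
    return sent


if __name__ == "__main__":
    big = deque([300, 300])
    small = deque([100] * 6)
    order = drr_schedule([big, small], quantum=300, rounds=2)
    print(order)
    # Both queues get the same number of bytes per round despite
    # very different packet sizes.
    print(sum(n for i, n in order if i == 0), sum(n for i, n in order if i == 1))
```

Unused credit carries over to the next round only while a queue stays backlogged, which lets a packet larger than one quantum eventually be sent without letting idle queues accumulate a burst allowance.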
11 Control Cell Processor
- Central control block of system
- Receives and processes control messages sent as fixed-sized cells
- Transmits acknowledgements and status updates
- Manages register set of control variables
- Manages system counters and flags
- Ex. Input and output packet counters, drop counters, etc.
- Interface to Queue Manager provides DQ status rate control updates and queue length status
- Manages filter and route updates
12 Resource Utilization
- Xilinx FPGA logic resources are grouped as slices containing FF/LUT pairs
- Each pair contains one flip-flop and one 4-input lookup table (LUT)
- Xilinx FPGA embedded memory resources are grouped as BlockRAM
- 4096 bit, dual port embedded memories with configurable widths (1 to 16 bits)
- Total resource usage (xcv2000e)
- 20,646 FFs (53%)
- 19,947 4-LUTs (51%)
- 19,198 Slices (99%)
- 82% contain unrelated logic
- 125 BlockRAMs (78%)
- High slice usage makes 75 MHz target difficult to meet
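The parenthesized figures above read as percentages of device capacity, and a quick calculation shows how they could be derived. The capacities below are assumptions, not stated on the slide: they are what the XCV2000E is believed to provide (19,200 slices with two FF/LUT pairs each, and 160 BlockRAMs) and should be checked against the Virtex-E data sheet. The function name `utilization` is hypothetical.

```python
# Assumed XCV2000E capacities (not from the slide; verify against the data sheet).
SLICES = 19_200
FF_LUT_PAIRS = 2 * SLICES      # two flip-flop/LUT pairs per slice
BLOCKRAMS = 160

# Usage figures taken from the slide.
USED = {
    "FFs": (20_646, FF_LUT_PAIRS),
    "4-LUTs": (19_947, FF_LUT_PAIRS),
    "Slices": (19_198, SLICES),
    "BlockRAMs": (125, BLOCKRAMS),
}


def utilization():
    """Return {resource: percent used}, truncated to a whole percent."""
    return {name: int(100 * used / total) for name, (used, total) in USED.items()}


if __name__ == "__main__":
    for name, pct in utilization().items():
        print(f"{name}: {pct}%")
```

Under these assumptions only about half of the flip-flops and LUTs are used while nearly every slice is occupied, which is consistent with the difficulty in meeting the clock target noted above: the placer has very little freedom left.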
13 Conclusions and Ongoing Efforts
- Designed and implemented an efficient System-on-Chip that provides all core packet processing functions and 6.4 Gb/s of total throughput
- Packet Processor and Processing Elements utilize open-platform research systems developed at Washington University in Saint Louis
- Primary challenges: defining and mitigating worst-case traffic patterns
- Investigating new packet classification and queue management algorithms
- Considering porting the design to the next generation of low-cost FPGAs (Xilinx Spartan-3)
- Utilizing the design and experience gained to architect packet processors and processing elements for 10 Gb/s links
14 Thank you.