Title: Block Design Review: ONL NP Router Multiplexer MUX
1. Block Design Review: ONL NP Router Multiplexer (MUX)
Mart Haitjema mah5_at_cse.wustl.edu
http://www.arl.wustl.edu/projects/techX/design/design.html
2. Revision History
3. ONL NP Router
[Figure: ONL NP Router block diagram. Blocks shown: Rx (2 MEs), Mux (1 ME), Parse, Lookup, Copy (3 MEs), QM (1 ME), HdrFmt (1 ME), Tx (1 ME), Plugin1 to Plugin5, xScale, and a TCAM with associated-data ZBT-SRAM. Interconnects shown: NN rings, SRAM rings (64KW, and 32KW each), and a scratch ring.]
(Slide modified from ONL_NProuter.ppt)
4. Contents
- Overview
  - MUX Function
  - Handling RX
  - Configurable Multiplexer Policy
- Design
  - Compute Latency Budget
  - Design Overview
- Implementation Status
5. Overview - Function
- Multiplex input from
  - RX → MUX
    - 2 words per pkt
    - 64KW SRAM ring
    - 64KW / 2 = 32K pkts
  - xScale → MUX
    - 3 words per pkt
    - 64KW SRAM ring
    - 64KW / 3 ≈ 21.3K pkts
  - Plugins → MUX
    - 3 words per pkt
    - 64KW SRAM ring
    - 64KW / 3 ≈ 21.3K pkts
- To Parse-Lookup-Copy
  - MUX → PLC
    - 3 words per pkt
    - 256-word scratch ring
    - 256 / 3 ≈ 85 pkts
[Figure: RX, xScale, and Plugins each feed Mux (1 ME) through a 64KW ring; Mux feeds PLC.]
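The ring capacities quoted on this slide follow from integer division of ring size by words per packet. A minimal sketch of that arithmetic, assuming 1 KW = 1024 words; the function name `ring_capacity_pkts` is illustrative and not taken from the router code:

```python
KW = 1024  # assumption: 1 KW = 1024 32-bit words


def ring_capacity_pkts(ring_words, words_per_pkt):
    """Number of whole packet entries that fit in a ring of ring_words words."""
    return ring_words // words_per_pkt


rx_ring = ring_capacity_pkts(64 * KW, 2)       # RX -> MUX, 2 words per packet
xscale_ring = ring_capacity_pkts(64 * KW, 3)   # xScale -> MUX, 3 words per packet
plugin_ring = ring_capacity_pkts(64 * KW, 3)   # Plugins -> MUX, 3 words per packet
plc_ring = ring_capacity_pkts(256, 3)          # MUX -> PLC scratch ring

print(rx_ring, xscale_ring, plugin_ring, plc_ring)
```

The output ring holds far fewer entries than any input ring, so under this arithmetic the MUX-to-PLC scratch ring is the tightest of the four.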
6. Overview - Handling RX
- Modify Header Buffer Descriptor from RX
[Figure: Rx (2 MEs), Plugin0, Plugin1, and xScale feed Mux (1 ME) over SRAM rings (64KW each); Mux feeds Parse, Lookup, Copy (3 MEs). Descriptor fields shown in the figure:
- Buffer Handle (24b), Rsv (8b)
- Buffer Handle (24b), Rsv (4b), Out Port (4b)
- L3 (IP, ARP, ...) Pkt Length (16b), QID (16b)
- Stats Index (16b), In Port (3b), Plugin Tag (5b), Flags (8b)
- Flags: Src (2b): 00 = Rx, 01 = xScale, 10 = Plugin, 11 = Undefined; PT (1b): PassThrough (1) / Classify (0); Reserved (5b)]
(Slide modified from ONL_NProuter.ppt)
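The Flags byte described on this slide (2-bit source, 1-bit PassThrough/Classify, 5 reserved bits) can be sketched as a pack/unpack pair. The slide gives the field widths and the Src encodings but not the bit positions, so the layout below (Src in bits 7..6, PT in bit 5, reserved in bits 4..0) is an assumption, and the function names are illustrative only:

```python
# Src encodings as listed on the slide.
SRC_RX = 0b00
SRC_XSCALE = 0b01
SRC_PLUGIN = 0b10


def pack_flags(src, passthrough):
    """Build the 8-bit Flags field.

    ASSUMED bit layout (not stated on the slide):
    bits 7..6 = Src, bit 5 = PT, bits 4..0 = reserved (written as zero).
    """
    return ((src & 0b11) << 6) | ((1 if passthrough else 0) << 5)


def unpack_flags(flags):
    """Return (src, passthrough) under the same assumed layout."""
    return (flags >> 6) & 0b11, bool((flags >> 5) & 1)


plugin_passthrough = pack_flags(SRC_PLUGIN, True)
print(hex(plugin_passthrough), unpack_flags(plugin_passthrough))
```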
7. Overview - Handling RX
- Mux Block writes
  - Buffer_size ← (frame length from Rx) - 14
  - Packet_size ← (frame length from Rx) - 14
  - Offset ← 0x18E
  - Freelist ← 0
  - Ref_cnt ← 1
(Slide from ONL_NProuter.ppt)
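The field assignments above can be sketched as a small function. This is an illustration under stated assumptions, not the microengine code: the dictionary representation and the name `rx_buffer_descriptor` are hypothetical, and the 14-byte adjustment is presumably the Ethernet header length, although the slide gives only the number.

```python
RX_LENGTH_ADJUST = 14  # from the slide; presumably the 14-byte Ethernet header
RX_OFFSET = 0x18E      # offset value written by the Mux block, from the slide


def rx_buffer_descriptor(frame_length):
    """Return the buffer descriptor fields the Mux block writes for an RX packet."""
    if frame_length < RX_LENGTH_ADJUST:
        raise ValueError("frame shorter than the 14-byte adjustment")
    adjusted = frame_length - RX_LENGTH_ADJUST
    return {
        "buffer_size": adjusted,
        "packet_size": adjusted,
        "offset": RX_OFFSET,
        "freelist": 0,
        "ref_cnt": 1,
    }


desc = rx_buffer_descriptor(64)
print(desc)
```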
8. Overview - Multiplexer Policy
- MUX should service input queues based on a configurable policy
- Round-Robin Policy
  - Queues are serviced in round-robin fashion
  - Each input queue is assigned a quantum, which specifies the number of packets (0 to 255) to be serviced from that queue (if available) before moving on to the next queue
  - A quantum value of 0 means skip the queue unless all other queues are empty
  - Quantum values are stored as 3 contiguous bytes in scratch memory
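The quantum policy can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the MUX implementation: the byte-to-queue order (RX, xScale, Plugins) is assumed, the function names are hypothetical, and because the slide does not say what happens when several zero-quantum queues are backlogged, the sketch lets a zero-quantum queue send one packet per visit whenever no queue with a nonzero quantum has packets waiting.

```python
from collections import deque


def unpack_quanta(raw):
    """Split 3 contiguous bytes into per-queue quanta (assumed order: RX, xScale, Plugins)."""
    if len(raw) != 3:
        raise ValueError("expected exactly 3 quantum bytes")
    return list(raw)


def mux_schedule(queues, quanta):
    """Drain the queues round-robin and return (queue_index, packet) in service order."""
    order = []
    n = len(queues)
    i = 0
    while any(queues):  # an empty deque is falsy
        budget = quanta[i]
        if budget == 0:
            # Zero quantum: skip while any nonzero-quantum queue is backlogged.
            backlog = any(queues[j] for j in range(n) if quanta[j] > 0)
            budget = 0 if backlog else 1
        for _ in range(min(budget, len(queues[i]))):
            order.append((i, queues[i].popleft()))
        i = (i + 1) % n
    return order


quanta = unpack_quanta(bytes([2, 1, 0]))
queues = [deque(["a", "b", "c"]), deque(["x", "y"]), deque(["p"])]
service_order = mux_schedule(queues, quanta)
print(service_order)
```

With quanta of 2, 1, and 0, the third queue is only served once the first two have drained.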
9. Compute Latency Budget
- What is our performance target?
  - To hit 5 Gb rate
  - Minimum Ethernet frame: 76B
    - 64B frame + 12B inter-frame spacing
  - 5 Gb/sec × 1B/8b × 1 packet/76B = 8.22 Mpkt/sec
- IXP ME processing
  - 1.4 GHz clock rate
  - 1.4 Gcycle/sec × 1 sec/8.22 Mpkt = 170.3 cycles per packet
  - Compute budget (1 ME): 170 cycles per packet
  - Latency budget (threads × 170)
    - 1 ME, 1 thread: 170 cycles
    - 1 ME, 4 threads: 680 cycles
    - 1 ME, 8 threads: 1360 cycles
(Slide modified from ONL_NProuter.ppt)
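The budget arithmetic on this slide can be reproduced in a few lines. The constants come from the slide; the variable and function names are illustrative. Carrying the unrounded packet rate through gives about 170.2 cycles per packet rather than 170.3 (the slide divides by the rounded 8.22 Mpkt/sec); both round down to the 170-cycle budget.

```python
LINE_RATE_BPS = 5e9       # 5 Gb/sec target rate
FRAME_BYTES = 64 + 12     # minimum frame plus inter-frame spacing
ME_CLOCK_HZ = 1.4e9       # 1.4 GHz microengine clock

pkts_per_sec = LINE_RATE_BPS / 8 / FRAME_BYTES
cycles_per_pkt = ME_CLOCK_HZ / pkts_per_sec
compute_budget = int(cycles_per_pkt)  # whole cycles per packet for 1 ME


def latency_budget(threads):
    """Latency budget in cycles for one ME running the given number of threads."""
    return threads * compute_budget


print(round(pkts_per_sec / 1e6, 2), "Mpkt/sec")
print(compute_budget, "cycles per packet")
for threads in (1, 4, 8):
    print(threads, "threads:", latency_budget(threads), "cycles")
```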
10. Design Overview
[Figure: flowchart of one MUX thread iteration. Steps shown, in order:
- Wait for prev. sig_start
- Read Quantum Values
- Read All Occupancy Counters
- Swap
- Select Queue, then one of three branches:
  - Service RX: Read RX Input Ring, Write RX Occupancy Counter, Signal next_start
  - Service xScale: Read xScale Input Ring, Write xScale Occupancy Counter, Signal next_start
  - Service Plugins: Read Plugins Input Ring, Write Plugins Occupancy Counter, Signal next_start
- Swap
- Format Write Buffer Descriptor
- Swap
- Update Stats Counter
- Write PLC Output Ring (dl_sink)
- Swap
Cycle annotations in the figure: 60, 300, 150, and 60 cycles. Latency Total: 420.]
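As a rough cross-check of the 420-cycle latency total against the latency budget model from the Compute Latency Budget slide (threads × 170 cycles), the sketch below computes the smallest thread count whose budget covers the total. It assumes only that simple model and says nothing about actual thread scheduling on the ME; the function name is illustrative.

```python
CYCLES_PER_PKT = 170   # compute budget per packet for 1 ME, from the latency budget slide
LATENCY_TOTAL = 420    # latency total, from the design overview figure


def min_threads(latency_cycles, cycles_per_pkt):
    """Smallest thread count with threads * cycles_per_pkt >= latency_cycles."""
    return -(-latency_cycles // cycles_per_pkt)  # integer ceiling division


needed = min_threads(LATENCY_TOTAL, CYCLES_PER_PKT)
print(needed, "threads give a budget of", needed * CYCLES_PER_PKT, "cycles")
```

Under this model, both the 4-thread (680 cycles) and 8-thread (1360 cycles) budgets listed on the latency budget slide exceed the 420-cycle total.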
11. Implementation Status
- MUX Assembly Stub
  - Currently reads only from RX
  - Performs most of functionality for RX
- Need to Implement
  - Thread ordering
  - Quantum Policy
  - Conditional block to process from Plugins and xScale
  - Read and Write Occupancy Counters
12. File locations (in /ONL_Router/)
- Code
  - src/mux/ONL/mux.c
- Includes
  - src/dispatch_loop/ONL/dl_source.h,c
    - dl_source() and dl_sink() functions