1. Techniques and problems for 100Tb/s routers
Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University
nickm@stanford.edu
http://www.stanford.edu/~nickm
2. Topics
- System architecture
  - Parallel Packet Switches: building big routers from lots of little routers. (Today)
  - TCP Switching: exposing (optical) circuits to IP.
  - Load-balancing and mis-sequencing.
- Specific problems
  - Fast IP lookup and classification.
  - Fast packet buffers: buffering packets at 40 Gb/s line rates. (Today)
  - Crossbar scheduling (other SNRC project).
3. Building a big router from lots of little routers
What we'd like:
[Figure: N external lines, each at rate R, connected through a single NxN switch.]
4. Why this might be a good idea
- Larger overall capacity
- Faster line rates
- Redundancy
- Familiarity
- After all, this is how the Internet is built
5. Parallel Packet Switch
[Figure: a Parallel Packet Switch with N = 4 external ports at line rate R. A demultiplexor at each input spreads cells over k = 3 parallel output-queued (OQ) switches whose internal links run at rate sR/k; a multiplexor at each output recombines the cells onto the outgoing line at rate R.]
6. Parallel Packet Switch: Results
- A PPS can precisely emulate an FCFS shared-memory switch with
  - S > 2k/(k+2) ≈ 2 and a centralized algorithm, or
  - S = 1, (N·k) buffers in the demultiplexors and multiplexors, and a distributed algorithm.
(A structural sketch of the demultiplexor/multiplexor roles follows below.)
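Since the figure on slide 5 survives only as labels, here is a minimal structural sketch of the demultiplexor/multiplexor roles, assuming simple round-robin spraying and per-source re-sequencing. The centralized and distributed emulation algorithms referred to above are more involved than this, and all class and method names here are invented for illustration.

    from collections import defaultdict

    class PPSDemux:
        """Illustrative demultiplexor: cells arriving at line rate R are sprayed,
        per destination, over k output-queued switches whose internal links run
        at only sR/k. Sequence numbers let the multiplexor restore order."""

        def __init__(self, k):
            self.k = k
            self.next_layer = defaultdict(int)   # round-robin pointer per destination
            self.next_seq = defaultdict(int)     # sequence number per destination

        def dispatch(self, dst):
            layer = self.next_layer[dst]
            seq = self.next_seq[dst]
            self.next_layer[dst] = (layer + 1) % self.k
            self.next_seq[dst] += 1
            return layer, seq                    # forward the cell to this slower layer

    class PPSMux:
        """Illustrative multiplexor at one output: re-sequences cells arriving
        from the k layers, per source linecard, before transmitting at rate R."""

        def __init__(self):
            self.expected = defaultdict(int)     # next expected seq, per source
            self.pending = defaultdict(dict)     # per source: seq -> cell

        def receive(self, src, seq, cell):
            self.pending[src][seq] = cell
            released = []
            while self.expected[src] in self.pending[src]:
                released.append(self.pending[src].pop(self.expected[src]))
                self.expected[src] += 1
            return released                      # cells released in per-source order

The point of the sketch is only the structure: each little switch sees traffic slowed by a factor of k, and mis-sequencing introduced by the spraying must be undone at the output.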
7. Topics
- System architecture
  - Parallel Packet Switches: building big routers from lots of little routers.
  - TCP Switching: exposing (optical) circuits to IP.
  - Load-balancing and mis-sequencing.
- Specific problems
  - Fast IP lookup and classification.
  - Fast packet buffers: buffering packets at 40 Gb/s line rates. (Today)
  - Crossbar scheduling (other SNRC project).
8. Optics and Routing Seminar - A Motivating System: Phictitious 100Tb/s IP Router
[Figure: a 100 Tb/s router built from 625 linecards (625 x 160 Gb/s = 100 Tb/s). Each linecard handles 160 Gb/s of traffic from 40 Gb/s external lines, performs line termination, IP packet processing, and packet buffering, and connects to the switch over 160-320 Gb/s links. A central arbiter coordinates access to the switch through request/grant messages.]
9. Basic Flow
[Figure: the path of a packet through the router. Ingress linecard: packet parse, IP lookup in the forwarding table, buffer management into the ingress buffer, then a request to the arbiter; on receiving a grant, the packet crosses the 625x625 switch over the 160-320 Gb/s links. Egress linecard: buffer management into the egress buffer, then transmission on the 160 Gb/s outgoing line.]
(A toy end-to-end sketch of this flow follows below.)
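As a reading aid, here is a toy sketch of that flow under my own simplifications: longest-prefix match, cell segmentation, and the arbiter itself are elided, and all names and the example prefix are invented.

    from collections import deque

    # Toy stand-ins for the blocks on the slide; all names here are invented.
    forwarding_table = {"10.0.0.0/8": 7}                 # prefix -> egress port
    voqs = {port: deque() for port in range(625)}        # ingress buffers (VOQs)
    egress_buffer = deque()
    requests = deque()                                   # outstanding requests to the arbiter

    def ingress_path(packet):
        """Packet parse -> IP lookup -> buffer management -> request to the arbiter."""
        prefix = packet["dst_prefix"]                    # parsing reduced to a field read
        port = forwarding_table[prefix]                  # IP lookup (LPM elided)
        voqs[port].append(packet)                        # ingress buffer management
        requests.append(port)                            # ask the arbiter for a crossing slot

    def on_grant(port):
        """Arbiter grant: move one packet across the 625x625 switch to egress."""
        if voqs[port]:
            egress_buffer.append(voqs[port].popleft())   # traverse the fabric

    def egress_path():
        """Egress buffer management, then transmit on the outgoing line."""
        return egress_buffer.popleft() if egress_buffer else None

    # Example: one packet end to end.
    ingress_path({"dst_prefix": "10.0.0.0/8", "payload": b"..."})
    on_grant(requests.popleft())
    print(egress_path())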
10. Packet buffers
[Figure: buffer memory (5 ns access time for SRAM, 50 ns for DRAM) connected to the external line over a 64-byte wide bus.]
- Rough estimate (worked through below):
  - 5 ns (SRAM) or 50 ns (DRAM) per memory operation.
  - Two memory operations (one write, one read) per packet.
  - Therefore, a maximum of roughly 50 Gb/s (SRAM) or 5 Gb/s (DRAM).
- Aside: buffers need to be large for TCP to work well, so DRAM is usually required.
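The rough estimate follows directly from the 64-byte-wide bus and the write-plus-read per packet; this small calculation (mine, not from the slides) reproduces the numbers.

    def max_buffer_throughput_gbps(access_time_ns, bus_width_bytes=64, ops_per_packet=2):
        """Peak buffering rate when every packet needs one write and one read
        to a memory with the given random-access time."""
        bits_per_access = bus_width_bytes * 8
        time_per_packet_ns = ops_per_packet * access_time_ns
        return bits_per_access / time_per_packet_ns   # bits/ns == Gb/s

    print(max_buffer_throughput_gbps(5))    # SRAM: ~51 Gb/s
    print(max_buffer_throughput_gbps(50))   # DRAM: ~5 Gb/s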
11. Packet buffers
- Example: ingress queues on an OC768c linecard.
  - OC768c = 40 Gb/s; buffer size = RTT x BW = 10 Gb; 64-byte cells.
  - One memory operation every 12.8 ns (the arithmetic is spelled out below).
[Figure: buffer memory connected to the external line over a 64-byte wide bus, with reads driven by a scheduler or arbiter.]
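The 12.8 ns figure is the transmission time of one 64-byte cell at 40 Gb/s, and the 10 Gb buffer is the RTT x BW rule of thumb. The round-trip time of 0.25 s used below is my assumption, chosen because it matches the slide's 10 Gb.

    line_rate_gbps = 40          # OC768c
    cell_bytes = 64
    rtt_s = 0.25                 # assumed RTT that yields the slide's 10 Gb

    cell_time_ns = cell_bytes * 8 / line_rate_gbps      # 12.8 ns per cell
    buffer_bits = line_rate_gbps * 1e9 * rtt_s          # RTT x BW = 1e10 bits

    print(cell_time_ns, buffer_bits / 1e9)              # 12.8 ns, 10 Gb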
12. This is not possible today
- SRAM:
  - fast enough random access time, but
  - too expensive, and
  - too low density to store 10 Gb of data.
- DRAM:
  - high density means we can store the data, but
  - too slow (typically 50 ns random access time).
13. FIFO caches: Memory Hierarchy
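The slide's figure is lost in this transcript; what follows is a minimal sketch, under my assumptions, of the FIFO-cache hierarchy the title refers to: the tail and head of each queue are cached in SRAM, the bulk sits in DRAM, and cells move between the two only in blocks of b, so the slow DRAM sees one access per b cells rather than one per cell. Class and method names are invented.

    from collections import deque

    class HierarchicalFIFO:
        """One queue whose tail and head are cached in SRAM while the middle
        lives in DRAM; cells migrate SRAM -> DRAM -> SRAM in blocks of b."""

        def __init__(self, b):
            self.b = b
            self.tail_sram = deque()   # most recently written cells
            self.dram = deque()        # bulk storage, block granularity only
            self.head_sram = deque()   # cells about to be read

        def write(self, cell):
            self.tail_sram.append(cell)
            if len(self.tail_sram) >= self.b:                 # a full block is ready
                block = [self.tail_sram.popleft() for _ in range(self.b)]
                self.dram.append(block)                       # one DRAM write per b cells

        def read(self):
            if not self.head_sram and self.dram:
                self.head_sram.extend(self.dram.popleft())    # one DRAM read per b cells
            if self.head_sram:
                return self.head_sram.popleft()
            if self.tail_sram:                                # short queue: bypass DRAM
                return self.tail_sram.popleft()
            return None

The design point is that the DRAM bandwidth requirement drops by roughly a factor of b, at the cost of a small amount of SRAM per queue and some added latency.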
14. Earliest Critical Queue First (ECQF)
15. Results: Single Address Bus
- Patient Arbiter: ECQF-MMA (earliest critical queue first) minimizes the size of the SRAM buffer to Q(b-1) cells and guarantees that requested cells are dispatched within Q(b-1)+1 cell slots. (A sketch of the queue-selection rule follows below.)
- Proof: pigeon-hole principle (the same argument used for the strictly non-blocking conditions of a Clos network).
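The deck does not spell ECQF out here, so the following is only a sketch of the "earliest critical queue" idea as I read it from the result above, with invented names: among the Q head caches sharing one DRAM address bus, refill next (with a block of b cells) the queue whose cache would run dry soonest given the reads already announced. The Q(b-1) pigeon-hole bound itself is not re-derived here.

    def earliest_critical_queue(head_occupancy, pending_reads):
        """Pick which queue's head cache to refill next from DRAM.

        head_occupancy[q]: cells of queue q currently in the SRAM head cache
        pending_reads[q]:  future cell slots (ascending) at which q will be read

        A queue becomes 'critical' at the slot where the announced reads would
        drain its cache; serve the queue that goes critical first.
        """
        best_q, best_slot = None, float("inf")
        for q, reads in pending_reads.items():
            if len(reads) <= head_occupancy[q]:
                continue                                # cache covers all known reads
            runs_dry_at = reads[head_occupancy[q]]      # slot of first uncovered read
            if runs_dry_at < best_slot:
                best_q, best_slot = q, runs_dry_at
        return best_q                                   # refill this queue with b cells

For example, with head caches holding {0: 2, 1: 1} cells and announced reads at slots {0: [3, 9, 20], 1: [5, 6]}, queue 1 would run dry first (at slot 6), so it is replenished next.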
16. Implementation Numbers (64-byte cells, b = 8, DRAM T = 50 ns)
- VOQ Switch, 32 ports:
  - Brute Force: egress SRAM = 10 Gb, no DRAM
  - Patient Arbiter: egress SRAM = 115 kb, latency = 2.9 µs, DRAM = 10 Gb
  - Impatient Arbiter: egress SRAM = 787 kb, DRAM = 10 Gb
  - Patient Arbiter (MA): no SRAM, latency = 3.2 µs, DRAM = 10 Gb
- VOQ Switch, 512 ports:
  - Brute Force: egress SRAM = 10 Gb, no DRAM
  - Patient Arbiter: egress SRAM = 1.85 Mb, latency = 45.9 µs, DRAM = 10 Gb
  - Impatient Arbiter: egress SRAM = 18.9 Mb, DRAM = 10 Gb
  - Patient Arbiter (MA): no SRAM, latency = 51.2 µs, DRAM = 10 Gb
(The Patient Arbiter rows are reproduced by the small calculation below.)
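The Patient Arbiter rows follow from the Q(b-1) bound on the previous slide, reading Q as the port count, with b = 8, 64-byte cells, and a 12.8 ns cell slot; that reading of the parameters is my assumption, and the calculation below reproduces the table to rounding.

    def patient_arbiter_sram_and_latency(Q, b=8, cell_bits=64 * 8, cell_slot_ns=12.8):
        """SRAM cache of Q(b-1) cells and worst-case latency of Q(b-1)+1 cell slots."""
        sram_cells = Q * (b - 1)
        sram_bits = sram_cells * cell_bits
        latency_us = (sram_cells + 1) * cell_slot_ns / 1000
        return sram_bits, latency_us

    print(patient_arbiter_sram_and_latency(32))    # ~115 kb, ~2.9 µs
    print(patient_arbiter_sram_and_latency(512))   # ~1.8 Mb, ~45.9 µs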
17. Some Next Steps
- Generalizing results and design rules for:
  - Selectable degrees of impatience,
  - Deterministic vs. statistical bounds.
- Application to the design of large shared-memory switches.