1
Techniques and problems for 100Tb/s routers
Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University
nickm@stanford.edu, http://www.stanford.edu/nickm
2
Topics
  • System architecture
  • Parallel Packet Switches: Building big routers from lots of little routers.
  • TCP Switching: Exposing (optical) circuits to IP.
  • Load-balancing and mis-sequencing.
  • Specific problems
  • Fast IP lookup and classification.
  • Fast packet buffers: buffering packets at 40Gb/s line rates.
  • Crossbar scheduling (other SNRC project).

3
Building a big router from lots of little routers
What we'd like:
[Diagram: a single large NxN switch, with each of the N external lines running at rate R.]
4
Why this might be a good idea
  • Larger overall capacity
  • Faster line rates
  • Redundancy
  • Familiarity
  • After all, this is how the Internet is built

5
Parallel Packet Switch
[Diagram: N = 4 external lines, each at rate R. Each input has a demultiplexor that spreads arriving packets across k = 3 parallel output-queued (OQ) switches; each output has a multiplexor that recombines them. Every link between a demultiplexor or multiplexor and an OQ switch runs at rate sR/k, where s is the speedup.]
6
Parallel Packet Switch: Results
  • A PPS can precisely emulate a FCFS shared-memory switch with either:
  • S > 2k/(k+2) ≈ 2 and a centralized algorithm (see the worked number below), or
  • S = 1 with N·k buffers in the demultiplexor and multiplexor, and a distributed algorithm.
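For concreteness, evaluating the stated bound at a sample value of k:

    S > 2k/(k+2);  for k = 10, S > 20/12 ≈ 1.67;  as k grows, 2k/(k+2) approaches 2.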

7
Topics
  • System architecture
  • Parallel Packet Switches: Building big routers from lots of little routers.
  • TCP Switching: Exposing (optical) circuits to IP.
  • Load-balancing and mis-sequencing.
  • Specific problems
  • Fast IP lookup and classification.
  • Fast packet buffers: buffering packets at 40Gb/s line rates.
  • Crossbar scheduling (other SNRC project).

8
Optics and Routing Seminar - A Motivating System: A Phictitious 100Tb/s IP Router
[Diagram: 625 linecards connect to a central switch. Each linecard terminates a 40Gb/s line, performs IP packet processing and packet buffering, and connects to the switch over a 160-320Gb/s link. A central arbitration unit exchanges request/grant messages with each linecard.]
9
Basic Flow
[Diagram: On the ingress linecard, a packet is parsed, its destination is looked up in the forwarding table (IP lookup), and it is held in a buffer under buffer management. The linecard sends a request to the central arbitration unit and, on receiving a grant, forwards the packet at 160Gb/s across the 625x625 switch. On the egress linecard the packet is buffered again (buffer management) before transmission. Linecard-to-switch links run at 160-320Gb/s.]
10
Packet buffers
[Diagram: packets from the external line are written to and read from buffer memory over a 64-byte wide bus; random access time is 5ns for SRAM, 50ns for DRAM.]
  • Rough estimate:
  • 5 or 50ns per memory operation.
  • Two memory operations (one write, one read) per packet.
  • Therefore, a maximum of roughly 50 or 5 Gb/s (checked in the sketch below).

Aside: buffers need to be large for TCP to work well, so DRAM is usually required.
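A quick back-of-the-envelope check of that estimate; a minimal Python sketch assuming a 64-byte wide bus (one memory operation moves one 64-byte packet) and one write plus one read per packet. Names are illustrative.

    def max_buffer_rate_gbps(access_time_ns, bus_width_bytes=64, ops_per_packet=2):
        # Bits moved per memory operation: 64 bytes = 512 bits.
        bits_per_op = bus_width_bytes * 8
        # Each packet occupies the memory for one write plus one read.
        time_per_packet_ns = ops_per_packet * access_time_ns
        # Bits per nanosecond equal gigabits per second.
        return bits_per_op / time_per_packet_ns

    print(max_buffer_rate_gbps(5))    # SRAM, 5ns  -> 51.2 Gb/s (roughly 50 Gb/s)
    print(max_buffer_rate_gbps(50))   # DRAM, 50ns ->  5.12 Gb/s (roughly 5 Gb/s)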
11
Packet buffers
Example: ingress queues on an OC768c linecard. OC768c = 40Gb/s; RTT × BW = 10Gb of buffering; 64-byte cells; one memory operation every 12.8ns.
[Diagram: packets from the external line are written to and read from buffer memory over 64-byte wide buses, under the control of a scheduler or arbiter.]
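Working those numbers through (the 0.25s round-trip time is implied by the figures above rather than stated): buffer size = RTT × BW = 0.25s × 40Gb/s = 10Gb, and one 64-byte cell is 512 bits, so at 40Gb/s a cell slot lasts 512 / 40 = 12.8ns.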
12
This is not possible today
  • SRAM
  •  + fast enough random access time, but
  •  - too expensive, and
  •  - too low density to store 10Gb of data.
  • DRAM
  •  + high density means we can store the data, but
  •  - too slow (typically 50ns random access time).

13
FIFO Caches: Memory Hierarchy
14
Earliest Critical Queue First (ECQF)
15
Results: Single Address Bus
  • Patient Arbiter: ECQF-MMA (earliest critical queue first) minimizes the size of the SRAM buffer to Q(b-1) cells and guarantees that requested cells are dispatched within Q(b-1)+1 cell slots. (A sketch of the selection rule follows this list.)
  • Proof: pigeon-hole principle (same argument used for strictly non-blocking conditions for a Clos network).
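A minimal sketch of the ECQF selection rule, assuming per-queue SRAM head caches replenished from DRAM in blocks of b cells and a lookahead window of pending read requests. The function only chooses which queue to refill next; names and structure are illustrative, not taken from the original algorithm.

    def ecqf_pick(sram_cells, pending_requests):
        """Return the queue to replenish from DRAM next.

        sram_cells       -- dict: queue id -> cells currently cached in SRAM
        pending_requests -- future read requests, oldest first (one queue id per cell slot)

        A queue is 'critical' when, without a refill, its SRAM cache would be
        empty when a pending request reaches it; ECQF refills the queue that
        goes critical earliest.
        """
        remaining = dict(sram_cells)
        for q in pending_requests:
            if remaining[q] == 0:
                return q              # q would underflow here: refill it first
            remaining[q] -= 1
        # No queue goes critical within the lookahead: refill the emptiest one.
        return min(remaining, key=remaining.get)

    # Example: queue 1 runs dry on the third request, so it is chosen.
    print(ecqf_pick({0: 2, 1: 1}, [0, 1, 1, 0]))   # -> 1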

16
Implementation Numbers (64-byte cells, b = 8, DRAM T = 50ns)
  • VOQ Switch, 32 ports
  • Brute force: egress SRAM = 10Gb, no DRAM.
  • Patient arbiter: egress SRAM = 115kb, latency = 2.9µs, DRAM = 10Gb.
  • Impatient arbiter: egress SRAM = 787kb, DRAM = 10Gb.
  • Patient arbiter (MA): no SRAM, latency = 3.2µs, DRAM = 10Gb.
  • VOQ Switch, 512 ports
  • Brute force: egress SRAM = 10Gb, no DRAM.
  • Patient arbiter: egress SRAM = 1.85Mb, latency = 45.9µs, DRAM = 10Gb.
  • Impatient arbiter: egress SRAM = 18.9Mb, DRAM = 10Gb.
  • Patient arbiter (MA): no SRAM, latency = 51.2µs, DRAM = 10Gb.
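A quick sanity check of the patient-arbiter rows above; a minimal sketch assuming Q equals the number of ports, the Q(b-1) bound from the previous slide, and 64-byte cells at a 40Gb/s line rate (12.8ns per cell slot). Illustrative only.

    CELL_BITS = 64 * 8             # 512 bits per 64-byte cell
    CELL_SLOT_NS = CELL_BITS / 40  # 12.8 ns per cell slot at 40 Gb/s

    def patient_arbiter(ports, b=8):
        # SRAM holds Q(b-1) cells; dispatch latency is Q(b-1)+1 cell slots.
        sram_cells = ports * (b - 1)
        sram_bits = sram_cells * CELL_BITS
        latency_us = (sram_cells + 1) * CELL_SLOT_NS / 1000
        return sram_bits, latency_us

    print(patient_arbiter(32))   # (114688, 2.88)    -> ~115kb SRAM, ~2.9µs
    print(patient_arbiter(512))  # (1835008, 45.888) -> ~1.85Mb SRAM, ~45.9µs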

17
Some Next Steps
  • Generalizing results and design rules for:
  • Selectable degrees of "impatience",
  • Deterministic vs. statistical bounds.
  • Application to the design of large shared-memory switches.