Scaling Internet Routers Using Optics UW, October 16th, 2003 - PowerPoint PPT Presentation

About This Presentation
Description:

Joint work with research groups of David Miller, Mark Horowitz, Olav Solgaard. ... Sheer number. Optical WAN components. Per packet processing and buffering. ...

Slides: 51
Provided by: nickmc

Transcript and Presenter's Notes

1
Scaling Internet Routers Using Optics
UW, October 16th, 2003
Nick McKeown
Joint work with research groups of David Miller, Mark Horowitz, Olav Solgaard.
Students: Isaac Keslassy, Shang-Tse Chuang, Kyoungsik Yu.
Department of Electrical Engineering, Stanford University
Paper: http://klamath.stanford.edu/nickm/papers/sigcomm2003.pdf
Web site: http://klamath.stanford.edu/or
2
Backbone router capacity
[Chart: capacity axis from 1Gb/s to 1Tb/s. Router capacity per rack: 2x every 18 months.]
3
Backbone router capacity
[Chart: capacity axis from 1Gb/s to 1Tb/s. Traffic: 2x every year. Router capacity per rack: 2x every 18 months.]
4
Extrapolating
[Chart: capacity axis from 1Tb/s to 100Tb/s. Traffic: 2x every year. Router capacity: 2x every 18 months. 2015: 16x disparity.]
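The 16x figure can be checked by compounding the two growth rates. This is a minimal sketch, assuming the extrapolation starts in 2003 (the year of the talk; the slide does not state a baseline) and that traffic and capacity start from the same point. The names `growth`, `BASE_YEAR` and `TARGET_YEAR` are illustrative only.

```python
def growth(years: float, doubling_period_years: float) -> float:
    """Multiplicative growth after `years` for a quantity that doubles every period."""
    return 2 ** (years / doubling_period_years)

BASE_YEAR = 2003    # assumption: the year of the talk
TARGET_YEAR = 2015  # from the slide

years = TARGET_YEAR - BASE_YEAR
traffic = growth(years, 1.0)    # traffic doubles every year
capacity = growth(years, 1.5)   # router capacity doubles every 18 months
disparity = traffic / capacity

print(f"traffic x{traffic:.0f}, capacity x{capacity:.0f}, disparity x{disparity:.0f}")
```

Under these assumptions traffic grows by 2^12 and capacity by 2^8 over the 12 years, which gives the 16x gap quoted on the slide.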
5
Consequence
  • Unless something changes, operators will need:
  • 16 times as many routers, consuming
  • 16 times as much space,
  • 256 times the power,
  • costing 100 times as much.
  • Actually need more than that.

6
Stanford 100Tb/s Internet Router
  • Goal: Study scalability
  • Challenging, but not impossible
  • Two orders of magnitude faster than deployed
    routers
  • We will build components to show feasibility

7
Throughput Guarantees
  • Operators increasingly demand throughput
    guarantees
  • To maximize use of expensive long-haul links
  • For predictability and planning
  • Despite lots of effort and theory, no commercial
    router today has a throughput guarantee.

8
Requirements of our router
  • 100Tb/s capacity
  • 100% throughput for all traffic
  • Must work with any set of linecards present
  • Use technology available within 3 years
  • Conform to RFC 1812

9
What limits router capacity?
Approximate power consumption per rack
Power density is the limiting factor today
10
Trend: Multi-rack routers
Reduces power density
11
[Photos of multi-rack routers: Juniper TX8/T640, Alcatel 7670 RSP, Avici TSR, Chiaro.]
12
Limits to scaling
  • Overall power is dominated by linecards
  • Sheer number
  • Optical WAN components
  • Per packet processing and buffering.
  • But power density is dominated by switch fabric

13
Trend: Multi-rack routers
Reduces power density
14
Multi-rack routers
[Diagram: linecards, each with WAN In and Out ports, connected to a switch fabric.]
15
Question
  • Instead, can we use an optical fabric at 100Tb/s
    with 100% throughput?
  • Conventional answer: No.
  • Need to reconfigure switch too often
  • 100% throughput requires complex electronic
    scheduler.

16
Outline
  • How to guarantee 100% throughput?
  • How to eliminate the scheduler?
  • How to use an optical switch fabric?
  • How to make it scalable and practical?

17
100% Throughput
[Diagram: three inputs labeled In.]
18
If traffic is uniform
[Diagram: three inputs labeled In, each at rate R.]
19
Real traffic is not uniform
20
Two-stage load-balancing switch
[Diagram: inputs (In) and outputs (Out) at rate R; a load-balancing stage followed by a switching stage, with internal links at rate R/N.]
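The rate labels in the diagram suggest why uniform spreading helps: if the first stage spreads each input's traffic evenly over all N intermediate linecards, no intermediate-to-output link needs to carry more than R/N, whatever the original pattern. The sketch below illustrates that argument under the assumption that no input or output is oversubscribed; `second_stage_load` and the traffic matrix are hypothetical names for illustration, not part of the talk.

```python
def second_stage_load(traffic, R):
    """Rate each intermediate linecard must forward to each output.

    traffic[i][j] is the rate arriving at input i destined for output j.
    The first stage spreads every input's traffic evenly over all N
    intermediate linecards, regardless of destination.
    """
    n = len(traffic)
    for i in range(n):
        assert sum(traffic[i]) <= R + 1e-9, "input oversubscribed"
    for j in range(n):
        assert sum(traffic[i][j] for i in range(n)) <= R + 1e-9, "output oversubscribed"
    # load[k][j]: rate from intermediate linecard k to output j
    return [[sum(traffic[i][j] for i in range(n)) / n for j in range(n)]
            for _ in range(n)]

R = 1.0
N = 4
# A deliberately non-uniform pattern: input i sends everything to output (i + 1) % N.
skewed = [[R if j == (i + 1) % N else 0.0 for j in range(N)] for i in range(N)]
load = second_stage_load(skewed, R)
worst = max(max(row) for row in load)
print(f"worst second-stage link load: {worst} (limit R/N = {R / N})")
```

Even for this fully skewed pattern, every second-stage link carries exactly R/N, so fixed links at rate R/N are enough in this model.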
21
[Diagram: the two-stage load-balancing switch with packets labeled 1, 2 and 3; external links at rate R, internal links at rate R/N.]
22
[Diagram: the two-stage load-balancing switch with packets labeled 1, 2 and 3; external links at rate R, internal links at rate R/N.]
23
Chang's load-balanced switch
Good properties
  • 100% throughput for broad class of traffic
  • No scheduler needed → Scalable

24
Chang's load-balanced switch
Bad properties
  • Packet mis-sequencing
  • Pathological traffic patterns → Throughput
    1/N-th of capacity
  • Uses two switch fabrics → Hard to package
  • Doesn't work with some linecards missing →
    Impractical

25
Single Mesh Switch
[Diagram: inputs labeled In, interconnected by a mesh of links at rate 2R/N.]
26
Packaging
[Diagram: three inputs labeled In, each at rate R.]
27
Many fabric options
N channels each at rate 2R/N
Any permutation network
Options:
  • Space: Full uniform mesh
  • Time: Round-robin crossbar
  • Wavelength: Static WDM
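The time-domain option can be sketched as a fixed cyclic sequence of permutations. This is an illustrative model only: the rule "input i connects to output (i + t) mod N in slot t" is an assumed round-robin pattern, and `round_robin_schedule` and `channel_rate` are hypothetical helper names.

```python
def round_robin_schedule(n):
    """schedule[t][i] is the output connected to input i in time slot t."""
    return [[(i + t) % n for i in range(n)] for t in range(n)]

def channel_rate(schedule, link_rate, src, dst):
    """Average rate of the src -> dst channel over one period of the schedule."""
    slots = sum(1 for perm in schedule if perm[src] == dst)
    return link_rate * slots / len(schedule)

N = 4
R = 1.0
schedule = round_robin_schedule(N)
# In the single-mesh arrangement each linecard drives the fabric at 2R.
rates = [[channel_rate(schedule, 2 * R, s, d) for d in range(N)] for s in range(N)]
print(rates[0])
```

Each input meets each output once per N slots, so every one of the N channels per linecard gets 2R/N, the same as a full uniform mesh with no scheduler involved.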
28
Static WDM switching
Array Waveguide Router (AWGR): passive and almost zero power
[Diagram: ports labeled A, B, C, D.]
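A passive AWGR can act as a fixed permutation network because the output port depends only on the input port and the wavelength. The sketch below uses an idealized cyclic rule, output = (input + wavelength index) mod N; that rule is an assumption for illustration (the slide does not give the port mapping), and `awgr_output_port` is a hypothetical name.

```python
def awgr_output_port(input_port: int, wavelength: int, n: int) -> int:
    """Idealized cyclic AWGR routing (assumed model, not taken from the slide)."""
    return (input_port + wavelength) % n

N = 4
ports = "ABCD"
for i in range(N):
    # Each input reaches every output by choosing a different wavelength.
    reached = [awgr_output_port(i, w, N) for w in range(N)]
    assert sorted(reached) == list(range(N))
    print(ports[i], "->", [ports[o] for o in reached])

for out in range(N):
    # Signals arriving at one output all use distinct wavelengths, so they do not collide.
    wavelengths = [w for i in range(N) for w in range(N)
                   if awgr_output_port(i, w, N) == out]
    assert sorted(wavelengths) == list(range(N))
```

Under this model, a linecard that transmits on all N wavelengths at once has a static channel to every other linecard, with no switching element to reconfigure.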
29
Linecard dataflow
[Diagram: input (In) at rate R feeding a WDM stage with wavelengths λ1, λ2, ..., λN; packets labeled 1 to 4.]
30
Problems of scale
  • For N < 64, WDM is a good solution.
  • We want N = 640.
  • Need to decompose.

31
Decomposing the mesh
[Diagram: nodes numbered 1 to 8 on each side, with links at rate 2R/8.]
32
Decomposing the mesh
[Diagram: nodes numbered 1 to 8 on each side, with links labeled 2R/8 and 2R/4.]
33
When N is too large
Decompose into groups (or racks)
[Diagram: Group/Rack 1 through Group/Rack G, each with links at rate 2R carrying wavelengths λ1, λ2, ..., λG into an Array Waveguide Router (AWGR).]
34
When a linecard is missing
  • Each linecard spreads its data equally over every
    other linecard.
  • Problem: If one is missing, or failed, then the
    spreading no longer works.

35
When a linecard fails
[Diagram: three inputs labeled In, meshed by links at rate 2R/3.]
  • Solution
  • Move light beams
  • Replace AWGR with MEMS switch.
  • Reconfigure when linecard added, removed or
    fails.
  • Finer channel granularity
  • Multiple paths.

36
Solution: Use transparent MEMS switches
[Diagram: Group/Rack 1 through Group/Rack G = 40, connected by links at rate 2R through MEMS switches.]
MEMS switches reconfigured only when linecard added, removed or fails.
Theorems:
1. Require L+G-1 MEMS switches
2. Polynomial time reconfiguration algorithm
37
Hybrid Architecture Logical View
38
Hybrid Electro-Optical Architecture
39
Number of MEMS Switches
[Diagram: two panels, each showing Linecards 1 to 4 at rate R connected through crossbars and static MEMS.]
40
Number of MEMS Switches
[Diagram: two panels showing Linecards 1 to 3 connected through crossbars and static MEMS; link rates labeled R, 4R/3, 2R/3 and R/3.]
41
Number of MEMS needed for a schedule
  • L_i: number of linecards in group i, 1 ≤ i ≤ G.
    Group i needs to send to group j: [equation not preserved]
  • Assume each group can send at most R to each
    MEMS. Number of MEMS needed between groups i and
    j: [equation not preserved]

42
Number of MEMS needed for a schedule
  • The number of MEMS needed for group i to send to
    group j is A_ij.
  • The total number of MEMS needed for group i is
    the sum of the A_ij over all j.

43
Constraints for the TDM Schedule
  • Latin Square: In any period N, each transmitting
    linecard is connected to each receiving linecard
    exactly once.
  • MEMS constraint: In any time-slot, there are at
    most A_ij connections between transmitting group i
    and receiving group j, where [equation not preserved]
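Both constraints can be checked mechanically for a candidate schedule. The sketch below is illustrative: `is_latin_square` and `meets_mems_constraint` are hypothetical helpers, the group layout (3, 2 and 1 linecards) is borrowed from the example slide, and the `limit` matrix is an assumed set of A_ij values chosen to agree with that slide's statement that at most 2 packets may go from the first group to the first group per time-slot.

```python
from collections import Counter

def is_latin_square(schedule):
    """schedule[t][i] is the receiver connected to transmitter i in slot t."""
    n = len(schedule)
    if any(sorted(perm) != list(range(n)) for perm in schedule):
        return False  # some slot is not a permutation
    # Over the period, each transmitter must meet every receiver exactly once.
    return all(sorted(schedule[t][i] for t in range(n)) == list(range(n))
               for i in range(n))

def meets_mems_constraint(schedule, group_of, limit):
    """True if no slot uses more than limit[gi][gj] connections from group gi to gj."""
    for perm in schedule:
        used = Counter((group_of[src], group_of[dst]) for src, dst in enumerate(perm))
        if any(count > limit[gi][gj] for (gi, gj), count in used.items()):
            return False
    return True

group_of = [0, 0, 0, 1, 1, 2]                  # linecard index -> group index
limit = [[2, 1, 1], [1, 1, 1], [1, 1, 1]]      # assumed A_ij values
round_robin = [[(i + t) % 6 for i in range(6)] for t in range(6)]

print(is_latin_square(round_robin))                         # True
print(meets_mems_constraint(round_robin, group_of, limit))  # False
```

A plain round-robin schedule is a valid Latin square but fails the MEMS check here: in slot 0 all three linecards of the first group send within their own group, exceeding the limit of 2. That is the kind of conflict the configuration algorithm has to avoid.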

44
Example
  • Assume L_1 = 3, L_2 = 2, L_3 = 1
  • Then [matrix not preserved]
  • E.g., at most 2 packets from the first group to
    the first group at each time-slot
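The matrix for this example did not survive in the transcript. One reading that is consistent with the surviving text is A_ij = ceil(L_i * L_j / N): group i sends L_i * L_j * R / N to group j when traffic is spread uniformly, and each MEMS carries at most R. This formula is an assumption, not a quotation from the slides, and `mems_matrix` is a hypothetical name.

```python
def mems_matrix(group_sizes):
    """Assumed rule: A[i][j] = ceil(L_i * L_j / N), with N the total number of linecards."""
    n = sum(group_sizes)
    # (a + n - 1) // n is integer ceiling division for positive a and n.
    return [[(li * lj + n - 1) // n for lj in group_sizes] for li in group_sizes]

A = mems_matrix([3, 2, 1])
for row in A:
    print(row)
print("MEMS per transmitting group:", [sum(row) for row in A])
```

With L_1 = 3 and N = 6, the first entry is ceil(9/6) = 2, matching the slide's statement about the first group sending to itself; the remaining entries all come out as 1 under this assumed rule.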

45
Bad TDM Transmit Schedule
46
Good TDM Transmit Schedule
47
Configuration Algorithm
  • Assign connections between groups, so MEMS
    constraint is satisfied.
  • Assign group connections to specific linecards,
    so there is exactly one connection per linecard
    pair in the schedule.
  • Comments:
  • Algorithm is surprisingly complex.
  • Best running time so far: 40 seconds for 640
    linecards.

48
Challenges
[Diagram: linecard dataflow from In through Address Lookup and WDM stages (wavelengths λ1, λ2, ..., λG) to Out; line rate R = 160Gb/s; items numbered 1 to 4.]
49
What we are building
[Diagram labels: 250ms DRAM; 320Gb/s; Chip 1: 160Gb/s Packet Buffer; Buffer Manager, 90nm ASIC; 160Gb/s; 160Gb/s; Optical Detector; Optical Modulator.]
50
100Tb/s Load-Balanced Router
L = 16 160Gb/s linecards
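The headline numbers fit together arithmetically. The sketch below assumes G = 40 groups (reading the "Group/Rack G40" label on the MEMS slide as G = 40), L = 16 linecards per group and R = 160Gb/s per linecard; the variable names are illustrative.

```python
G = 40          # groups (racks); assumed reading of the "G40" label
L = 16          # linecards per group
R_GBPS = 160    # line rate per linecard, in Gb/s

n_linecards = G * L
total_tbps = n_linecards * R_GBPS / 1000
mesh_channel_gbps = 2 * R_GBPS / n_linecards  # 2R/N per linecard pair

print(f"N = {n_linecards} linecards")
print(f"aggregate capacity = {total_tbps} Tb/s")
print(f"per-pair mesh channel = {mesh_channel_gbps} Gb/s")
```

These values reproduce N = 640 from the scaling slide and an aggregate of about 100Tb/s. They also show that a full mesh would need 640 channels of only 0.5Gb/s per linecard, which is one way to see why the design decomposes the mesh into groups.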