EE384Y: Packet Switch Architectures
1
EE384Y: Packet Switch Architectures
Part II: Scaling Crossbar Switches
Nick McKeown
Professor of Electrical Engineering and Computer Science, Stanford University
nickm@stanford.edu
http://www.stanford.edu/~nickm
2
Outline
  • Up until now, we have focused on high performance
    packet switches with
  • A crossbar switching fabric,
  • Input queues (and possibly output queues as
    well),
  • Virtual output queues, and
  • Centralized arbitration/scheduling algorithm.
  • Today we'll talk about the implementation of the crossbar switch fabric itself. How are crossbars built, how do they scale, and what limits their capacity?

3
Crossbar switch: Limiting factors
  • N² crosspoints per chip, or N × (N-to-1) multiplexors.
  • It's not obvious how to build a crossbar from multiple chips.
  • Capacity of I/Os per chip.
  • State of the art: about 300 pins, each operating at 3.125 Gb/s, i.e. roughly 1 Tb/s per chip (worked out after this list).
  • About 1/3 to 1/2 of this capacity is available in practice because of overhead and speedup.
  • Crossbar chips today are limited by I/O capacity.
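A quick sanity check on the per-chip I/O budget, using only the pin count, line rate, and overhead fractions quoted above (a back-of-the-envelope sketch, not measured data):

```python
# Per-chip I/O capacity from the figures on the slide.
pins = 300                      # serial I/O pins per chip
rate_gbps = 3.125               # per-pin line rate (Gb/s)

raw_tbps = pins * rate_gbps / 1000          # raw aggregate I/O bandwidth
usable_lo = raw_tbps * (1 / 3)              # after overhead and speedup (pessimistic)
usable_hi = raw_tbps * (1 / 2)              # after overhead and speedup (optimistic)

print(f"raw:    {raw_tbps:.2f} Tb/s")                    # ~0.94 Tb/s, i.e. "about 1 Tb/s"
print(f"usable: {usable_lo:.2f}-{usable_hi:.2f} Tb/s")   # ~0.31-0.47 Tb/s of switching capacity
```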

4
Scaling the number of outputs: Trying to build a crossbar from multiple chips
(Figure: a 4-input × 4-output crossbar chip as the building block; eight inputs and eight outputs are required.)
5
Scaling line-rate: Bit-sliced parallelism
  • Cell is striped across multiple identical planes (a sketch follows the figure below).
  • Crossbar-switched bus.
  • Scheduler makes the same decision for all slices.
(Figure: each linecard stripes a cell across k identical crossbar planes; a single scheduler controls all k planes.)
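As a concrete illustration, here is a minimal sketch of the striping step in Python. The slice width and cell size are made-up parameters, and slicing is shown at byte rather than bit granularity for readability; the point is that every plane carries the same fraction of each cell and is configured identically.

```python
def stripe(cell: bytes, k: int) -> list[bytes]:
    """Split one fixed-size cell into k equal slices, one per crossbar plane."""
    assert len(cell) % k == 0, "cell size must be a multiple of k"
    w = len(cell) // k
    return [cell[i * w:(i + 1) * w] for i in range(k)]

def reassemble(slices: list[bytes]) -> bytes:
    """Inverse of stripe(): the egress linecard concatenates the k slices."""
    return b"".join(slices)

# Example: a 64-byte cell striped over k = 8 planes.
cell = bytes(range(64))
slices = stripe(cell, 8)            # every plane carries 8 bytes of this cell
assert reassemble(slices) == cell   # all planes switch in lockstep, so the
                                    # scheduler makes one decision per cell
```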
6
Scaling line-rate: Time-sliced parallelism
  • A cell carried by one plane takes k cell times (a sketch follows the figure below).
  • Scheduler is unchanged.
  • Scheduler makes a decision for each slice in turn.
(Figure: each cell is carried whole by one of the k planes; successive cells go to the planes in turn.)
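A minimal sketch of the time-sliced alternative, again with made-up parameters: whole cells are dealt out to the k planes in round-robin order, and each plane then needs k cell times to forward its cell.

```python
from itertools import cycle

def assign_planes(cells, k):
    """Round-robin whole cells over k slower planes (time-sliced parallelism).
    Returns (cell, plane) pairs; each plane takes k cell times per cell."""
    return list(zip(cells, cycle(range(k))))

# Example: 8 consecutive cells over k = 4 planes.
print(assign_planes([f"cell{i}" for i in range(8)], 4))
# [('cell0', 0), ('cell1', 1), ('cell2', 2), ('cell3', 3), ('cell4', 0), ...]
```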
7
Scaling a crossbar
  • Conclusion: scaling the capacity is relatively straightforward (although the chip count and power may become a problem).
  • What if we want to increase the number of ports?
  • Can we build a crossbar-equivalent from multiple
    stages of smaller crossbars?
  • If so, what properties should it have?

8
3-stage Clos Network
(Figure: a 3-stage Clos network. First stage: m switches, each n × k; middle stage: k switches, each m × m; third stage: m switches, each k × n. N = n × m, k ≥ n.)
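To see why multiple stages of smaller crossbars are attractive, here is a rough crosspoint count in Python. It uses the stage sizes from the figure above; the example port counts (N = 1024, n = 32) are mine, chosen only for illustration.

```python
def crosspoints_flat(N):
    """Single N x N crossbar."""
    return N * N

def crosspoints_clos(N, n, k):
    """3-stage Clos: m first-stage n x k switches, k middle m x m switches,
    and m third-stage k x n switches, with m = N / n."""
    m = N // n
    return m * n * k + k * m * m + m * k * n

N, n = 1024, 32
print(crosspoints_flat(N))        # 1,048,576 crosspoints
print(crosspoints_clos(N, n, n))  # 98,304 crosspoints with k = n
```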
9
With k = n, is a Clos network non-blocking like a crossbar?
Consider this example: the scheduler chooses to match (1,1), (2,4), (3,3), (4,2).
10
With k = n, is a Clos network non-blocking like a crossbar?
Consider this example: the scheduler chooses to match (1,1), (2,2), (4,4), (5,3).
By rearranging existing matches, the connections could be added. Q: Is this Clos network rearrangeably non-blocking?
11
With k = n, a Clos network is rearrangeably non-blocking
  • Routing matches is equivalent to edge-coloring in
    a bipartite multigraph.
  • Colors correspond to middle-stage switches.

(Figure: the match (1,1), (2,4), (3,3), (4,2) drawn as a bipartite multigraph. Each vertex corresponds to an n × k or k × n switch, and no two edges at a vertex may be colored the same.)
[Vizing '64]: a bipartite graph of degree D can be edge-colored with D colors. Therefore, if k = n, a 3-stage Clos network is rearrangeably non-blocking (and can therefore perform any permutation).
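To make the equivalence concrete, here is a small sketch (the function name and the choice of n = 2 are mine, not from the slides) that turns a set of input-port/output-port matches into the bipartite multigraph to be colored: each edge joins the first-stage switch holding the input port to the third-stage switch holding the output port.

```python
def request_multigraph(matches, n):
    """Map (input_port, output_port) pairs (ports numbered from 1) to edges
    (first_stage_switch, third_stage_switch); parallel edges are expected."""
    return [((i - 1) // n, (j - 1) // n) for i, j in matches]

# The match from the slide, with n = 2 ports per first/third-stage switch.
print(request_multigraph([(1, 1), (2, 4), (3, 3), (4, 2)], n=2))
# [(0, 0), (0, 1), (1, 1), (1, 0)]
```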
12
How complex is the rearrangement?
  • Method 1: Find a maximum-size bipartite matching for each of D colors in turn: O(DN^2.5).
  • Method 2: Partition the graph into Euler sets: O(N log D) [Cole et al. '00].

13
Edge-Coloring using Euler sets
  • Make the graph regular: modify the graph so that every vertex has the same degree, D (combine vertices and add edges): O(E).
  • For D = 2^i, perform i Euler splits and 1-color each resulting graph. This is log D operations, each of O(E).

14
Euler partition of a graph
  • Euler partition of graph G:
  • Each odd-degree vertex is at the end of one open path.
  • Each even-degree vertex is at the end of no open path.

15
Euler split of a graph
(Figure: a graph G split into two halves, G1 and G2.)
  • Euler split of G into G1 and G2:
  • Scan each path in an Euler partition.
  • Place alternate edges into G1 and G2.
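Here is a minimal Python sketch of an Euler split (my own illustration, assuming every vertex already has even degree, as in the regularized graph above): walk each closed trail and deal alternate edges to the two halves, so every vertex keeps exactly half of its degree in each half.

```python
from collections import defaultdict

def euler_split(edges):
    """Euler split of a bipartite multigraph whose vertices all have even degree.

    `edges` is a list of (u, v) pairs (u a left/first-stage vertex, v a
    right/third-stage vertex).  Returns (g1, g2): two lists of edge indices
    such that every vertex keeps exactly half of its degree in each half.
    """
    inc = defaultdict(list)                 # incidence lists: vertex -> edge ids
    for eid, (u, v) in enumerate(edges):
        inc[("L", u)].append(eid)
        inc[("R", v)].append(eid)

    used = [False] * len(edges)
    ptr = defaultdict(int)                  # per-vertex scan position in inc[]
    g1, g2 = [], []

    def next_unused(vertex):
        """Claim and return an unused edge at `vertex`, or None."""
        lst = inc[vertex]
        while ptr[vertex] < len(lst) and used[lst[ptr[vertex]]]:
            ptr[vertex] += 1
        if ptr[vertex] == len(lst):
            return None
        eid = lst[ptr[vertex]]
        used[eid] = True
        return eid

    for start in list(inc):
        while True:
            eid = next_unused(start)
            if eid is None:
                break                       # all trails through `start` are done
            side, vertex = 0, start
            while eid is not None:          # walk one closed trail from `start`
                (g1 if side == 0 else g2).append(eid)   # alternate edges
                side ^= 1
                u, v = edges[eid]
                vertex = ("R", v) if vertex == ("L", u) else ("L", u)
                eid = next_unused(vertex)
    # Closed trails in a bipartite graph have even length, so the edges at
    # every vertex split evenly between g1 and g2.
    return g1, g2
```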

16
Edge-Coloring using Euler sets
  • Make the graph regular: modify the graph so that every vertex has the same degree, D (combine vertices and add edges): O(E).
  • For D = 2^i, perform i Euler splits and 1-color each resulting graph. This is log D operations, each of O(E).
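Putting the pieces together, a sketch of the recursive scheme described above (assuming, as the slide does, a D-regular graph with D a power of two): split log2(D) times, and each resulting degree-1 group is a perfect matching that gets one color, i.e. one middle-stage switch. The driver below reuses the euler_split() sketch from the earlier slide.

```python
def edge_color(edges, D):
    """Color the edges of a D-regular bipartite multigraph (D a power of two)
    with D colors via repeated Euler splits.  Returns {edge_id: color}."""
    groups = [list(range(len(edges)))]      # edge ids, initially one group
    while D > 1:
        new_groups = []
        for g in groups:
            h1, h2 = euler_split([edges[i] for i in g])
            new_groups.append([g[i] for i in h1])   # map back to original edge ids
            new_groups.append([g[i] for i in h2])
        groups, D = new_groups, D // 2
    # Each remaining group is a perfect matching = one middle-stage switch.
    return {eid: color for color, g in enumerate(groups) for eid in g}

# Example: the request multigraph built earlier (2-regular, so D = 2).
edges = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(edge_color(edges, 2))
# {0: 0, 2: 0, 3: 1, 1: 1}: edges 0 and 2 use middle switch 0; edges 1 and 3 use switch 1
```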

17
Implementation
(Figure: the request graph is handed to the Scheduler, which computes a permutation; connections are then routed as paths through the middle stage.)
18
Implementation
  • Pros:
  • A rearrangeably non-blocking switch can perform any permutation.
  • A cell switch is time-slotted, so all connections are rearranged every time slot anyway.
  • Cons:
  • Rearrangement algorithms are complex (in addition to the scheduler).
  • Can we eliminate the need to rearrange?

19
Strictly non-blocking Clos Network
Clos Theorem: If k ≥ 2n - 1, then a new connection can always be added without rearrangement.
20
(Figure: the 3-stage Clos network redrawn with first-stage switches I1 … Im (each n × k), middle-stage switches M1 … Mk (each m × m), and third-stage switches O1 … Om (each k × n); N = n × m.)
21
Clos Theorem
(Figure: a new connection x is to be added between input switch Ia and output switch Ob; each already has n - 1 connections in use.)
  1. Consider adding the n-th connection between 1st-stage Ia and 3rd-stage Ob.
  2. We need to ensure that there is always some center-stage M available.
  3. If k > (n - 1) + (n - 1), then there is always an M available, i.e. we need k ≥ 2n - 1.
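The counting argument in step 3, written out (a restatement of the slide, not new material): the new connection can be blocked from at most n - 1 middle switches by Ia and at most n - 1 by Ob, so one free middle switch is guaranteed when

```latex
k \;\ge\; \underbrace{(n-1)}_{\text{busy via } I_a}
      \;+\; \underbrace{(n-1)}_{\text{busy via } O_b} \;+\; 1
  \;=\; 2n - 1 .
```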

22
Scaling Crossbars Summary
  • Scaling capacity through parallelism (bit-slicing and time-slicing) is straightforward.
  • Scaling the number of ports is harder:
  • Clos network:
  • Rearrangeably non-blocking with k = n, but routing is complicated.
  • Strictly non-blocking with k ≥ 2n - 1, so routing is simple, but more bisection bandwidth is required.