Title: Static and Dynamic Networks
1Static and Dynamic Networks
- L.N. Bhuyan
- Partly from Berkeley Notes
2Hypercubes
- Also called binary n-cubes. Number of nodes: N = 2^n
- O(log N) hops
- Good bisection BW
- Complexity
- Out degree is n = log N
- Routing: correct dimensions in order
- with random comm. 2 ports per processor
[Figure: hypercubes of dimension 0-D through 5-D]
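The "correct dimensions in order" rule can be sketched as follows. This is a minimal illustration, assuming nodes are labeled with n-bit integers and that dimensions are corrected from bit 0 upward; the function name `dimension_order_route` and that bit order are assumptions, not taken from the slides.

```python
def dimension_order_route(src, dst, n):
    """Return the list of nodes visited from src to dst in an n-cube."""
    if not (0 <= src < 2 ** n and 0 <= dst < 2 ** n):
        raise ValueError("node labels must fit in n bits")
    node = src
    path = [node]
    for dim in range(n):                 # assumed order: bit 0 first
        if (node ^ dst) & (1 << dim):    # labels differ in this dimension
            node ^= 1 << dim             # cross the link in this dimension
            path.append(node)
    return path


if __name__ == "__main__":
    print(dimension_order_route(0b000, 0b101, 3))
```

The number of hops equals the number of bit positions in which the two labels differ, so it never exceeds n = log N.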
6Topology Summary
Topology      Degree     Diameter         Ave Dist      Bisection   D (D ave) @ P=1024
1D Array      2          N-1              N/3           1           huge
1D Ring       2          N/2              N/4           2
2D Mesh       4          2(N^(1/2) - 1)   2/3 N^(1/2)   N^(1/2)     63 (21)
2D Torus      4          N^(1/2)          1/2 N^(1/2)   2N^(1/2)    32 (16)
k-ary n-cube  2n         nk/2             nk/4          nk/4        15 (7.5) @ n=3
Hypercube     n = log N  n                n/2           N/2         10 (5)
- All have some bad permutations
- many popular permutations are very bad for meshes (transpose)
- randomness in wiring or routing makes it hard to find a bad one!
7How Many Dimensions?
- n = 2 or n = 3
- Short wires, easy to build
- Many hops, low bisection bandwidth
- Requires traffic locality
- n >= 4
- Harder to build, more wires, longer average length
- Fewer hops, better bisection bandwidth
- Can handle non-local traffic
- k-ary d-cubes provide a consistent framework for comparison
- N = k^d
- scale dimension (d) or nodes per dimension (k)
- assume cut-through
8Traditional Scaling Latency(P)
- Assumes equal channel width
- independent of node count or dimension
- dominated by average distance
9Average Distance
ave dist = d (k-1)/2
- but, equal channel width is not equal cost!
- Higher dimension => more channels
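To see how the average distance d (k-1)/2 changes when the node count N = k^d is held fixed, the formula can be evaluated for a few dimensions. This is a small sketch; the helper names `average_distance` and `radix_for` are illustrative assumptions.

```python
def average_distance(k, d):
    """Average distance d (k-1)/2 of a k-ary d-cube, as given on the slide."""
    return d * (k - 1) / 2


def radix_for(n_nodes, d):
    """Return the integer k with k**d == n_nodes, or raise ValueError."""
    k = round(n_nodes ** (1 / d))
    if k ** d != n_nodes:
        raise ValueError(f"{n_nodes} is not a perfect power of degree {d}")
    return k


if __name__ == "__main__":
    for d in (2, 5, 10):
        k = radix_for(1024, d)
        print(f"d={d:2d} k={k:2d} ave dist={average_distance(k, d)}")
```

For N = 1024 the formula gives 31 hops at d = 2, 7.5 at d = 5, and 5 at d = 10, which is the "fewer hops" side of the trade-off.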
10Latency under Contention
- Optimal packet size? Channel utilization?
11Dynamic Networks
- L.N. Bhuyan
- Partly from Berkeley Notes
12What is a Dynamic Network?
- A dynamic network is a network that can connect any input to any output by enabling or disabling some switches in the network
- Examples
- Shared Bus: The bus arbiter connects a processor to a memory
- Crossbar: Consists of many switching elements, which can be enabled to connect many inputs to many outputs simultaneously
- Multistage Network: Consists of several stages of switches that are enabled to set up connections
- The nodes in static networks (like Mesh) also consist of dynamic crossbars
13Crossbar Switch Design
- Complexity: O(N^2) for an N x N crossbar. Why? See next page
14How do you build a crossbar
[Figure: crossbar; control inputs labeled "From Control"]
N^2 switches => Cost O(N^2)
Time taken by the arbiter: O(N^2)
The multiplexors are controlled by the controller
15Crossbar Contd.
- An N x N crossbar allows all N inputs to be connected simultaneously to all N outputs
- It allows all one-to-one mappings, called permutations. No. of permutations = N!
- When two or more inputs request the same output, only one of them is connected and the others are either dropped or buffered
- When processors access memories through a crossbar, this situation is called a memory access conflict
- Given p as the probability of a request by a processor per cycle, and assuming that the requests of each of the N processors are uniformly directed to all N memories, the average number of connections allowed per cycle, called Bandwidth (BW), is
- BW = N [1 - (1 - p/N)^N]. Derive this!!!
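One way to reach the bandwidth expression: a given memory is requested by a given processor with probability p/N, so it receives no request at all with probability (1 - p/N)^N, and the expected number of busy memories is N [1 - (1 - p/N)^N]. The sketch below compares that closed form with a simple Monte Carlo run. The function names, the cycle count, and the fixed seed are assumptions made for illustration.

```python
import random


def crossbar_bandwidth(n, p):
    """Closed form: expected number of distinct memories requested per cycle."""
    return n * (1 - (1 - p / n) ** n)


def simulate_bandwidth(n, p, cycles=20000, seed=0):
    """Estimate the same quantity by simulating independent uniform requests."""
    rng = random.Random(seed)
    served = 0
    for _ in range(cycles):
        requested = set()
        for _ in range(n):
            if rng.random() < p:                 # this processor issues a request
                requested.add(rng.randrange(n))  # to a uniformly chosen memory
        served += len(requested)                 # one winner per requested memory
    return served / cycles


if __name__ == "__main__":
    print(crossbar_bandwidth(8, 0.5), simulate_bandwidth(8, 0.5))
```

The simulation assumes that rejected requests are dropped rather than retried, which matches the independence assumption behind the formula.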
16Input buffered switch
- Independent routing logic per input: FSM
- Scheduler logic arbitrates each output: priority, FIFO, random
- Head-of-line blocking problem: The head packet in a buffer cannot depart because the output is busy with another packet. The second packet may be destined to an output that is free, but cannot depart due to blocking by the first packet => One solution is to create multiple input queues, one per output, called Virtual Output Queuing, adopted in most routers.
- Scheduler Design: How to ensure maximum simultaneous connections is a challenging research area.
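Head-of-line blocking can be illustrated with a toy model that counts the cycles needed to drain the same traffic with FIFO input queues and with virtual output queues. This is a sketch under simplifying assumptions: one packet per input and per output per cycle, ties resolved in favor of the lower-numbered input, and a greedy scheduler for the VOQ case. The function names are hypothetical.

```python
from collections import deque


def drain_fifo(queues):
    """Cycles to drain when each input holds one FIFO of destination ports."""
    queues = [deque(q) for q in queues]
    cycles = 0
    while any(queues):
        claimed = set()
        for q in queues:                       # lower-numbered inputs win ties
            if q and q[0] not in claimed:      # only the head packet may compete
                claimed.add(q.popleft())
        cycles += 1
    return cycles


def drain_voq(queues, n_outputs):
    """Cycles to drain when each input keeps one queue per output."""
    voq = [[0] * n_outputs for _ in queues]    # voq[i][o] = packets from i to o
    for i, q in enumerate(queues):
        for o in q:
            voq[i][o] += 1
    cycles = 0
    while any(any(row) for row in voq):
        busy_inputs = set()
        for o in range(n_outputs):             # greedy: first free input wins
            for i, row in enumerate(voq):
                if row[o] and i not in busy_inputs:
                    row[o] -= 1
                    busy_inputs.add(i)
                    break
        cycles += 1
    return cycles


if __name__ == "__main__":
    traffic = [[0, 1], [0, 2]]                 # input 1's packet for port 2 waits
    print(drain_fifo(traffic), drain_voq(traffic, 3))
```

With this traffic the FIFO switch needs three cycles because input 1's packet for output 2 waits behind its blocked head packet, while the VOQ switch finishes in two.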
17Problems with Input-Buffered Switch
- FIFO input buffers give rise to the Head-of-Line (HOL) blocking problem
- Current routers employ a separate input queue for each output, called a virtual output queue (VOQ)
- Then how to schedule the packets from different VOQs for transmission?
18VOQ-based Input Buffered Switch
19Scheduling in Input Buffered Switch
- n independent arbitration problems?
- static priority, random, round-robin
- simplifications due to routing algorithm?
- general case is max bipartite matching
- Iterative algorithms: iSLIP in Cisco
20Iterative Matching A 3-step Procedure
Request
Grant
Accept
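A simplified request-grant-accept matcher can be sketched as below. It is not iSLIP itself: it uses a fixed lowest-index priority in place of iSLIP's round-robin pointers, and the function name `iterative_match` and the input format are assumptions for illustration.

```python
def iterative_match(requests, n_outputs):
    """Match inputs to outputs; requests[i] is the set of outputs input i wants."""
    match = {}                                  # input -> output
    taken = set()                               # outputs already matched
    while True:
        # Request: every unmatched input asks each free output it has traffic for.
        asks = {o: [] for o in range(n_outputs) if o not in taken}
        for i in sorted(requests):
            if i in match:
                continue
            for o in requests[i]:
                if o in asks:
                    asks[o].append(i)
        # Grant: every free output grants to its lowest-numbered requester.
        grants = {}
        for o, inputs in asks.items():
            if inputs:
                grants.setdefault(min(inputs), []).append(o)
        if not grants:                          # no progress possible: done
            return match
        # Accept: every input accepts its lowest-numbered grant.
        for i, outputs in grants.items():
            o = min(outputs)
            match[i] = o
            taken.add(o)


if __name__ == "__main__":
    print(iterative_match({0: {0, 1}, 1: {0}, 2: {0, 2}}, 3))
```

In this example the procedure stops with two connections although three are possible (0 to 1, 1 to 0, 2 to 2). Iterative matching yields a maximal matching, not necessarily the maximum bipartite matching of the general case.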
21Output/Shared Buffered Switch
Shared Buffer
RAM speed has to be N times the link speed.
An Output Buffered Switch has buffers at the outputs to store packets. There is always a minimal transmitting buffer at the input. What happens if there are 2 or more packets to the same output at the same time? In order to capture all of them, the switch speed has to be N times the link speed => Difficult to design.
22Shared Buffer Switch IBM SP Vulcan switch
- Many gigabit Ethernet switches use a similar design without the cut-through
- 128 8-byte chunks in central queue, LRU per output
23SGI SPIDER IEEE Micro Jan 1997
24Multistage Interconnection Network
- A network consisting of multiple stages of crossbar switches has the following properties.
- N x N network for N = 2^n
- Consists of log2 N stages of 2 x 2 switches
- Has N/2 2 x 2 switches per stage
- Cost O(N log N) instead of O(N^2) for a crossbar
- For N = a^n, a MIN can be similarly designed with a x a switches
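The cost comparison can be made concrete by counting switches and crosspoints. This is a sketch assuming N is an exact power of the switch radix a and that each a x a switch is counted as a^2 crosspoints; the function names are illustrative.

```python
def min_switch_count(n_ports, radix=2):
    """Return (stages, switches per stage, total switches) for an N x N MIN."""
    stages, size = 0, 1
    while size < n_ports:                # integer log: avoids float rounding
        size *= radix
        stages += 1
    if size != n_ports:
        raise ValueError("n_ports must be an exact power of radix")
    per_stage = n_ports // radix
    return stages, per_stage, stages * per_stage


def crosspoints(n_ports, radix=2):
    """Crosspoints in the MIN versus in a single N x N crossbar."""
    _, _, total = min_switch_count(n_ports, radix)
    return total * radix * radix, n_ports * n_ports


if __name__ == "__main__":
    print(min_switch_count(8))           # three stages of four 2 x 2 switches
    print(crosspoints(1024))
```

For N = 1024 with 2 x 2 switches this gives 20480 crosspoints in the MIN against 1048576 in the crossbar.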
25Multistage interconnection networks
[Figure: 8 x 8 Omega network; inputs and outputs labeled 0 to 7 (000 to 111)]
Omega Network Complexity: O(N log2 N)
26Perfect Shuffle
[Figure: (a) Perfect shuffle and (b) Inverse perfect shuffle on 8 nodes labeled 000 to 111 (0 to 7)]
Shuffle interconnection: S(a_{n-1} a_{n-2} ... a_1 a_0) = (a_{n-2} a_{n-3} ... a_0 a_{n-1})
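The perfect shuffle is a one-bit left rotation of the n-bit label, and the inverse shuffle is the matching right rotation. A minimal sketch, assuming labels are stored as Python integers (the function names are illustrative):

```python
def shuffle(x, n):
    """Perfect shuffle: rotate the n-bit label x left by one position."""
    mask = (1 << n) - 1
    return ((x << 1) & mask) | (x >> (n - 1))


def inverse_shuffle(x, n):
    """Inverse perfect shuffle: rotate the n-bit label x right by one position."""
    return (x >> 1) | ((x & 1) << (n - 1))


if __name__ == "__main__":
    for x in range(8):
        print(f"{x:03b} -> {shuffle(x, 3):03b}")
```

For n = 3, inputs 0 through 7 map to 0, 2, 4, 6, 1, 3, 5, 7.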
27Omega Network
- Every stage of switches is preceded by a perfect shuffle interconnection
- S(a_{n-1} a_{n-2} ... a_1 a_0) = (a_{n-2} a_{n-3} ... a_0 a_{n-1})
- An input can be connected to a straight or exchange output in a 2 x 2 switch.
- E(a_{n-1} a_{n-2} ... a_1 a_0) = (a_{n-1} a_{n-2} ... a_1 a_0'), where a_0' is the complement of a_0
- To route a message/packet in an Omega network, the destination tag, which is the binary equivalent of the destination, is used: (d_{n-1} d_{n-2} ... d_1 d_0). The ith bit d_i is used to control the routing at the ith stage counted from the right, with 0 <= i <= n-1. If d_i = 0, the input is connected to the upper output. If d_i = 1, it is connected to the lower output.
28Self Routing
- A processor generates a tag that is the binary equivalent of the destination
- MSB controls the leftmost stage and the LSB controls the rightmost stage of the Omega network. A small controller inside the 2 x 2 switch senses this bit and enables the connection
- If bit c_i = 0, the request is to the upper output; if it is 1, the request is to the lower output.
- Based on digit if switch size is greater than 2
- Network conflict: select round robin
- Less bandwidth than crossbar, but more cost effective
- What about QoS? Future research
29Theorem: The Omega network is self-routing
- Let the source be (s_{n-1} s_{n-2} ... s_2 s_1 s_0) and the destination be (d_{n-1} d_{n-2} ... d_2 d_1 d_0). Before Stage 1, the source is switched to the position (s_{n-2} s_{n-3} ... s_1 s_0 s_{n-1}) due to the perfect shuffle connection. After Stage 1 it is switched to (s_{n-2} s_{n-3} ... s_1 s_0 d_{n-1}) as per the (n-1)th bit of the destination.
- Before the 2nd stage of switches, the source is connected to (s_{n-3} ... s_0 d_{n-1} s_{n-2}), and after the 2nd stage it becomes (s_{n-3} ... s_0 d_{n-1} d_{n-2})
- If we continue like this for n stages, the source matches (d_{n-1} d_{n-2} ... d_i ... d_1 d_0), which is the destination.
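The argument can be checked mechanically by tracing a message's position through the stages: shuffle, then overwrite the low bit with the next destination bit, most significant bit first. This is a sketch assuming n-bit integer labels; the function name `omega_route` is an assumption.

```python
def omega_route(src, dst, n):
    """Return the position of a message after each of the n Omega stages."""
    mask = (1 << n) - 1
    pos = src
    trace = []
    for stage in range(n):
        # Perfect shuffle before the stage: rotate the n-bit position left by one.
        pos = ((pos << 1) & mask) | (pos >> (n - 1))
        # The 2 x 2 switch sets the low bit: 0 = upper output, 1 = lower output.
        bit = (dst >> (n - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        trace.append(pos)
    return trace


if __name__ == "__main__":
    print([f"{p:03b}" for p in omega_route(0b110, 0b011, 3)])
```

After n stages every source bit has been rotated out and replaced by a destination bit, so the last entry of the trace is always the destination, whatever the source.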
30Example SP
- 8-port switch, 40 MB/s per link, 8-bit phit, 16-bit flit, single 40 MHz clock
- packet sw, cut-through, no virtual channel, source-based routing
- variable packet <= 255 bytes, 31 byte fifo per input, 7 bytes per output, 16 phit links
31Summary
- Routing Algorithms restrict the set of routes within the topology
- simple mechanism selects turn at each hop
- arithmetic, selection, lookup
- Deadlock-free if channel dependence graph is acyclic
- limit turns to eliminate dependences
- add separate channel resources to break dependences
- combination of topology, algorithm, and switch design
- Deterministic vs. adaptive routing
- Switch design issues
- input/output/pooled buffering, routing logic, selection logic
- Flow control
- Real networks are a package of design choices
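The deadlock condition can be tested by looking for a cycle in the channel dependence graph. The sketch below uses a depth-first search and assumes the graph is a dictionary mapping each channel to the channels a packet holding it may request next. The channel names and the four-channel ring are hypothetical examples.

```python
def has_cycle(graph):
    """Return True if the directed graph {node: [successors]} contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2           # unvisited, on current path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:              # back edge: a dependence cycle
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(graph))


if __name__ == "__main__":
    ring = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": ["c0"]}
    restricted = {"c0": ["c1"], "c1": ["c2"], "c2": ["c3"], "c3": []}
    print(has_cycle(ring), has_cycle(restricted))
```

Removing the c3 to c0 dependence, for example by forbidding that turn or by moving such packets to a separate channel, leaves an acyclic graph.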