Interconnection Networks Contd. - PowerPoint PPT Presentation

Slides: 23
Provided by: david3085
Learn more at: http://www.cs.ucr.edu

1
Interconnection Networks Contd.
  • L.N. Bhuyan
  • Partly from Berkeley Notes

2
More Static Networks: Linear Arrays and Rings
  • Linear Array
  • Diameter?
  • Average Distance?
  • Bisection bandwidth?
  • Route A -> B given by relative address R = B - A
  • Torus?
  • Examples: FDDI, SCI, FiberChannel Arbitrated
    Loop, KSR1
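
The diameter and average-distance questions above can be explored by brute force. The following is a minimal sketch, assuming nodes are numbered 0..n-1 and the ring is bidirectional; the function names are made up for illustration and are not from the slides.

```python
from itertools import product

def linear_distance(a, b, n):
    """Hops between nodes a and b on an n-node linear array."""
    return abs(b - a)

def ring_distance(a, b, n):
    """Hops on an n-node bidirectional ring (assumed): take the shorter way."""
    r = (b - a) % n              # relative address R = B - A, taken mod n
    return min(r, n - r)

def diameter_and_average(distance, n):
    """Brute-force diameter and average distance over all ordered pairs a != b."""
    dists = [distance(a, b, n)
             for a, b in product(range(n), repeat=2) if a != b]
    return max(dists), sum(dists) / len(dists)

if __name__ == "__main__":
    for name, fn in (("linear array", linear_distance), ("ring", ring_distance)):
        diam, avg = diameter_and_average(fn, 8)
        print(f"{name}: diameter={diam}, average distance={avg:.3f}")
```

For 8 nodes this gives diameter 7 and average distance 3.0 for the linear array, against diameter 4 and average distance 16/7 for the ring.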

3
Multidimensional Meshes and Tori
[Figure: 2D Grid and 3D Cube]
  • d-dimensional array
  • N = k_{d-1} x ... x k_0 nodes
  • described by d-vector of coordinates (i_{d-1}, ...,
    i_0)
  • d-dimensional k-ary mesh: N = k^d
  • k = N^(1/d) (the d-th root of N)
  • described by d-vector of radix-k coordinates
  • d-dimensional k-ary torus (or k-ary d-cube)?
  • Ex: Intel Paragon (2D), SGI Origin (Hypercube),
    Cray T3E (3D Mesh)
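
The radix-k coordinate description can be sketched as follows. This is an illustrative example only: it assumes nodes are numbered 0..k^d - 1 and that a torus adds a wraparound link in every dimension; the function names are hypothetical.

```python
def to_coords(node, k, d):
    """Node number -> d-vector of radix-k digits (i_{d-1}, ..., i_0)."""
    digits = []
    for _ in range(d):
        digits.append(node % k)   # peel off the least significant digit
        node //= k
    return tuple(reversed(digits))

def mesh_distance(a, b, k, d):
    """Hops in a k-ary d-dimensional mesh: sum of per-dimension offsets."""
    return sum(abs(x - y)
               for x, y in zip(to_coords(a, k, d), to_coords(b, k, d)))

def torus_distance(a, b, k, d):
    """Hops in a k-ary d-cube: each dimension may wrap around (assumed)."""
    total = 0
    for x, y in zip(to_coords(a, k, d), to_coords(b, k, d)):
        r = (y - x) % k
        total += min(r, k - r)
    return total

if __name__ == "__main__":
    # 4-ary 2D example: node 0 is (0, 0), node 15 is (3, 3)
    print(to_coords(15, 4, 2), mesh_distance(0, 15, 4, 2), torus_distance(0, 15, 4, 2))
```

In the 4 x 4 case, opposite corners are 6 hops apart in the mesh but only 2 hops apart in the torus.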

4
Hypercubes
  • Also called binary n-cubes. Number of nodes: N =
    2^n.
  • O(log N) hops
  • Good bisection BW
  • Complexity
  • Out degree is n = log N
  • Correct dimensions in order
  • With random comm., 2 ports per processor

[Figure: hypercubes of dimension 0-D, 1-D, 2-D, 3-D, 4-D, and 5-D]
5
Routing in Hypercube
  • N = 2^6 nodes
  • S = (s_{n-1} s_{n-2} ... s_i ... s_2 s_1 s_0)
  • D = (d_{n-1} d_{n-2} ... d_i ... d_2 d_1 d_0)
  • E-cube routing: for i = 0 to n-1, compare s_i and
    d_i; route along dimension i if they differ.
  • Distance = Hamming distance between S and D = the
    number of dimensions in which S and D differ.
  • Diameter = maximum distance = n = log2 N =
    dimension of the hypercube
  • No. of alternate paths = n
  • Fault tolerance = (n-1) = O(log2 N)
  • Paths from 000 to 111:
    000 -> 001 -> 011 -> 111
    000 -> 010 -> 110 -> 111
    000 -> 100 -> 101 -> 111
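
The e-cube rule can be sketched in a few lines. This is a minimal illustration, assuming node addresses are plain integers whose bit i is the coordinate in dimension i; the function names are hypothetical.

```python
def ecube_route(src, dst, n):
    """Correct differing dimensions in order i = 0 .. n-1; return visited nodes."""
    path = [src]
    cur = src
    for i in range(n):
        if (cur ^ dst) & (1 << i):   # s_i and d_i differ
            cur ^= 1 << i            # traverse the link in dimension i
            path.append(cur)
    return path

def hamming_distance(src, dst):
    """Number of dimensions in which src and dst differ."""
    return bin(src ^ dst).count("1")

if __name__ == "__main__":
    route = ecube_route(0b000, 0b111, 3)
    print(" -> ".join(format(node, "03b") for node in route))
    print("hops:", hamming_distance(0b000, 0b111))
```

Routing from 000 to 111 this way visits 000, 001, 011, 111, which is the first of the three paths listed on the slide; the other two correct the dimensions in a different order.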
6
Origin Network
  • Each router has six pairs of 1.56 GB/s
    unidirectional links
  • Two to nodes, four to other routers
  • latency 41ns pin to pin across a router
  • Flexible cables up to 3 ft long
  • Four virtual channels: request, reply, other
    two for priority or I/O

7
Case Study: Cray T3D
  • Build up info in shell
  • Remote memory operations encoded in address

8
Trees
  • Diameter and average distance are logarithmic
  • k-ary tree, height d = log_k N
  • address specified as d-vector of radix-k coordinates
    describing path down from root
  • Fixed degree
  • Route up to common ancestor and down
  • R = B xor A
  • let i be position of most significant 1 in R,
    route up i+1 levels
  • down in direction given by low i+1 bits of B
  • H-tree space is O(N) with O(sqrt(N)) long wires
  • Bisection BW?
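
The up-then-down rule can be sketched for the binary case. This example assumes k = 2 and processors at the leaves, numbered left to right so that an address lists the branch taken at each level from the root; the function name and the 0/1 encoding of left/right are assumptions, not from the slides.

```python
def tree_route(a, b):
    """Return (levels to climb, branch bits to follow back down) from leaf a to leaf b."""
    r = a ^ b                      # R = B xor A
    if r == 0:
        return 0, []               # same leaf, nothing to do
    i = r.bit_length() - 1         # position of the most significant 1 in R
    up = i + 1                     # climb to the common ancestor
    # descend using the low i+1 bits of B, most significant first
    down = [(b >> bit) & 1 for bit in range(i, -1, -1)]
    return up, down

if __name__ == "__main__":
    print(tree_route(0b010, 0b011))   # sibling leaves: up 1, down [1]
    print(tree_route(0b010, 0b110))   # different halves: up 3, down [1, 1, 0]
```

Sibling leaves meet at their parent after one level, while leaves in different halves of the tree have to climb all the way to the root.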

9
Real Machines
Machine        Topology   Cycle Time (ns)  Channel Width (bits)  Routing Delay (cycles)  Flit (data bits)
nCUBE/2        Hypercube  25               1                     40                      32
TMC CM-5       Fat-Tree   25               4                     10                      4
IBM SP-2       Banyan     25               8                     5                       16
Intel Paragon  2D Mesh    11.5             16                    2                       16
Meiko CS-2     Fat-Tree   20               8                     7                       8
CRAY T3D       3D Torus   6.67             16                    2                       16
DASH           Torus      30               16                    2                       16
J-Machine      3D Mesh    31               8                     2                       8
Monsoon        Butterfly  20               16                    2                       16
SGI Origin     Hypercube  2.5              20                    16                      160
Myricom        Arbitrary  6.25             16                    50                      16
  • Wide links, smaller routing delay
  • Tremendous variation

10
What Is a Dynamic Network?
  • A dynamic network is a network that can connect
    any input to any output by enabling or disabling
    some switches in the network
  • Examples:
  • - Shared bus: the bus arbiter connects a
    processor to a memory
  • - Crossbar: consists of many switching
    elements, which can be enabled to connect many
    inputs to many outputs simultaneously
  • - Multistage network: consists of several
    stages of switches that are enabled to set up
    connections
  • - The nodes in static networks (like a mesh)
    also contain dynamic crossbars

11
Dynamic Network Consists of Switches: Switch
Components
  • Output ports
  • transmitter (typically drives clock and data)
  • Input ports
  • synchronizer aligns data signal with local clock
    domain
  • essentially FIFO buffer
  • Crossbar
  • connects each input to any output
  • degree limited by area or pinout
  • Buffering
  • Control logic
  • complexity depends on routing logic and
    scheduling algorithm
  • determine output port for each incoming packet
  • arbitrate among inputs directed at same output

12
Crossbar Switch Design
  • Complexity: O(N^2) for an N x N crossbar. Why?
    See next page

13
How do you build a crossbar?
[Figure label: From Control]
  • N^2 switches -> cost O(N^2)
  • Time taken by the arbiter: O(N^2)
  • Multiplexors are controlled from the controller
14
Crossbar Contd.
  • An N x N crossbar allows all N inputs to be
    connected simultaneously to all N outputs
  • It allows all one-to-one mappings, called
    permutations. No. of permutations = N!
  • When two or more inputs request the same output,
    only one of them is connected and the others are
    either dropped or buffered
  • When processors access memories through a crossbar,
    this situation is called a memory access conflict
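
Output conflicts in a crossbar can be illustrated with a small arbiter sketch. The fixed-priority policy here (lowest input index wins) is an assumption chosen for simplicity, since the slides do not specify an arbitration policy; the function names are hypothetical.

```python
import math

def arbitrate(requests):
    """requests maps input -> requested output.

    Returns (granted, blocked): granted maps input -> output, blocked lists
    the inputs that lost arbitration and must be dropped or buffered.
    Assumed policy: the lowest-numbered input wins each output.
    """
    granted = {}
    blocked = []
    taken = set()
    for inp in sorted(requests):
        out = requests[inp]
        if out in taken:
            blocked.append(inp)    # output already connected to another input
        else:
            taken.add(out)
            granted[inp] = out
    return granted, blocked

def permutation_count(n):
    """Number of one-to-one input/output mappings of an n x n crossbar."""
    return math.factorial(n)

if __name__ == "__main__":
    print(arbitrate({0: 2, 1: 0, 2: 3, 3: 1}))   # a permutation: nothing blocked
    print(arbitrate({0: 1, 1: 1, 2: 3, 3: 1}))   # inputs 0, 1, 3 all want output 1
    print(permutation_count(4))
```

A permutation request is granted in full, while three inputs competing for one output leave two of them blocked.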

15
Multistage Interconnection Network
  • A network consisting of multiple stages of
    crossbar switches has the following properties.
  • N x N network for N = 2^n
  • Consists of log2 N stages of 2 x 2 switches
  • Has N/2 2 x 2 switches per stage
  • Cost O(N log N) instead of O(N^2) for a crossbar
  • For N = a^n, a MIN can be similarly designed with
    a x a switches
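
The cost comparison can be made concrete with a short sketch. It assumes that the a x a generalization uses log_a N stages of N/a switches each, and it counts each a x a switch as a^2 crosspoints so the two designs are measured in the same unit; both are assumptions for illustration, and the function names are hypothetical.

```python
def min_stage_count(n, a=2):
    """Number of stages, log_a(n), computed with integers only."""
    stages, size = 0, 1
    while size < n:
        size *= a
        stages += 1
    if size != n:
        raise ValueError("n must be a power of a")
    return stages

def min_switch_count(n, a=2):
    """Total a x a switches: n/a per stage times log_a(n) stages (assumed)."""
    return (n // a) * min_stage_count(n, a)

def min_crosspoints(n, a=2):
    """Crosspoints if every a x a switch is itself a small crossbar (assumed)."""
    return min_switch_count(n, a) * a * a

def crossbar_crosspoints(n):
    """Crosspoints in a single n x n crossbar."""
    return n * n

if __name__ == "__main__":
    for n in (8, 64, 1024):
        print(n, min_switch_count(n), min_crosspoints(n), crossbar_crosspoints(n))
```

For N = 1024 the multistage network needs 5120 two-by-two switches (20480 crosspoints under the counting assumption), against 1048576 crosspoints for a single crossbar.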

16
Multistage interconnection networks
[Figure: Omega Network and Self Routing, with ports 0-7 labeled 000-111]
Note: Complexity O(N log2 N). Conflicts, less BW than
crossbar, but cost effective
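
Self-routing in an Omega network can be sketched as destination-tag routing. This is an illustrative model, assuming each stage is a perfect shuffle followed by a column of 2 x 2 switches, and that stage s uses bit s of the destination, most significant first, with 0 selecting the upper output and 1 the lower one; the function names are hypothetical.

```python
def omega_route(src, dst, n):
    """Trace one message through an n-stage Omega network (N = 2^n ports).

    Returns a list of (stage, routing bit, line position after the stage).
    """
    mask = (1 << n) - 1
    pos = src
    hops = []
    for stage in range(n):
        # perfect shuffle: rotate the n-bit line number left by one
        pos = ((pos << 1) | (pos >> (n - 1))) & mask
        # destination bit for this stage, most significant bit first
        bit = (dst >> (n - 1 - stage)) & 1
        pos = (pos & ~1) | bit     # switch output: 0 = upper, 1 = lower
        hops.append((stage, bit, pos))
    return hops

def has_conflict(pairs, n):
    """True if two (src, dst) messages need the same switch output at some stage."""
    used = set()
    for src, dst in pairs:
        for stage, _, pos in omega_route(src, dst, n):
            if (stage, pos) in used:
                return True
            used.add((stage, pos))
    return False

if __name__ == "__main__":
    for stage, bit, pos in omega_route(0b101, 0b010, 3):
        print(f"stage {stage}: bit {bit} -> line {pos:03b}")
    print(has_conflict([(0, 0), (4, 1)], 3))
```

Each message arrives at its destination after n stages, but two messages can still collide on an internal link (here 0 -> 0 and 4 -> 1 collide in the first stage), which is why the network has less bandwidth than a crossbar.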
17
Example: SP
  • 8-port switch, 40 MB/s per link, 8-bit phit,
    16-bit flit, single 40 MHz clock
  • packet sw, cut-through, no virtual channel,
    source-based routing
  • variable packet < 255 bytes, 31 byte fifo per
    input, 7 bytes per output, 16 phit links
  • 128 8-byte chunks in central queue, LRU per
    output
  • run in shadow mode

18
Switching Techniques
  • Circuit switching: a control message is sent from
    source to destination and a path is reserved.
    Communication then starts, and the path is released
    when communication is complete.
  • Store-and-forward policy (packet switching): each
    switch waits for the full packet to arrive
    before sending it to the next switch (good
    for WAN).
  • Cut-through routing or wormhole routing: the switch
    examines the header, decides where to send the
    message, and then starts forwarding it
    immediately.
  • In wormhole routing, when the head of the message is
    blocked, the message stays strung out over the
    network, potentially blocking other messages
    (each switch needs to buffer only the piece of the
    packet that is sent between switches). The CM-5 uses
    it, with each switch buffer being 4 bits per port.
  • Cut-through routing lets the tail continue when
    the head is blocked, storing the whole message in
    an intermediate switch (requires a buffer large
    enough to hold the largest packet).

20
Store and Forward vs. Cut-Through
  • Advantage
  • Latency reduces from a function of (number of
    intermediate switches x size of the packet)
    to (time for the 1st part of the packet to
    negotiate the switches) + (packet size /
    interconnect BW)

21
Store & Forward vs Cut-Through Routing
  • h(n/b + Δ) vs n/b + hΔ
  • what if message is fragmented?
  • wormhole vs virtual cut-through
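
The two latency expressions can be compared numerically. The slide does not define its symbols, so this sketch assumes the usual reading: h is the number of hops, n the message size, b the link bandwidth, and Δ the routing delay per hop. The numbers in the usage lines are made up for illustration.

```python
def store_and_forward_latency(h, n, b, delta):
    """h (n/b + delta): every hop receives the whole message before forwarding."""
    return h * (n / b + delta)

def cut_through_latency(h, n, b, delta):
    """n/b + h delta: the header pays the per-hop delay, the body pipelines behind it."""
    return n / b + h * delta

if __name__ == "__main__":
    # Hypothetical values: 4 hops, 128-byte message, 1 byte per cycle, 2-cycle routing delay
    h, n, b, delta = 4, 128, 1, 2
    print("store and forward:", store_and_forward_latency(h, n, b, delta), "cycles")
    print("cut-through:      ", cut_through_latency(h, n, b, delta), "cycles")
```

With these values store-and-forward takes 520 cycles and cut-through 136, and the gap widens as the hop count or the message size grows.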

22
Example
  • Q. Compare the efficiency of store-and-forward
    (packet switching) vs. wormhole routing for
    transmission of a 20-byte packet between a
    source and destination that are d nodes apart.
    Each node takes 0.25 microsecond and the link
    transfer rate is 20 MB/sec.
  • Answer: Time to transfer 20 bytes over a link =
    20 bytes / (20 MB/sec) = 1 microsecond.
  • Packet switching: # nodes x (node delay +
    transfer time) = d x (0.25 + 1) = 1.25 d
    microseconds
  • Wormhole: (# nodes x node delay) + transfer time
  • = 0.25 d + 1 microseconds
  • Book: For d = 7, packet switching takes 8.75
    microseconds vs. 2.75 microseconds for wormhole
    routing