Multiprocessors and the Interconnect - PowerPoint PPT Presentation

Slides: 38
Provided by: www2EngrA
1
Multiprocessors and the Interconnect
2
Scope
  • Taxonomy
  • Metrics
  • Topologies
  • Characteristics
    • cost
    • performance

3
Interconnection
  • Carry data between processors and to memory
  • Interconnect components
    • switches
    • links (wires, fiber)
  • Interconnection network flavors
    • static networks: point-to-point communication links
      (AKA direct networks)
    • dynamic networks: switches and communication links
      (AKA indirect networks)

4
Static vs. Dynamic
5
Dynamic Networks
  • Switch: maps a fixed number of inputs to outputs
  • Number of ports on a switch = degree of the switch
  • Switch cost
    • grows as the square of switch degree
    • peripheral hardware grows linearly with switch degree
    • packaging cost grows linearly with the number of pins
  • Key property: blocking vs. non-blocking
    • blocking: the path from p to q may conflict with the path from r to s,
      for independent p, q, r, s
    • non-blocking: disjoint paths exist between each pair of independent
      sources and sinks

6
Network Interface
  • Processor nodes link to the interconnect
  • Network interface responsibilities
    • packetizing communication data
    • computing routing information
    • buffering incoming/outgoing data
  • Network interface connection
    • I/O bus: PCI or PCIx on many modern systems
    • memory bus: e.g. AMD HyperTransport, Intel QuickPath
      (higher bandwidth and tighter coupling than I/O bus)
  • Network performance
    • depends on relative speeds of I/O and memory buses

7
Topologies
  • Many network topologies
  • Tradeoff: performance vs. cost
  • Machines often implement hybrids of multiple topologies, for reasons of
    • packaging
    • cost
    • available components

8
Metrics
  • Degree
    • number of links per node
  • Diameter
    • longest shortest-path distance between any two nodes in the network
  • Bisection width
    • minimum number of wire cuts needed to divide the network into 2 halves
  • Cost
    • number of links or switches
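
As a rough illustration of these metrics, the sketch below computes degree, diameter, and bisection width for a small network given as an adjacency list. The helper names (`degree`, `diameter`, `bisection_width`) and the 8-node ring are illustrative choices, not taken from the slides, and the brute-force bisection search is only practical for very small networks.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, src):
    """Hop counts from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def degree(adj):
    """Largest number of links at any node."""
    return max(len(nbrs) for nbrs in adj.values())

def diameter(adj):
    """Longest shortest-path distance over all node pairs."""
    return max(max(bfs_distances(adj, u).values()) for u in adj)

def bisection_width(adj):
    """Fewest links cut by any split into two equal halves (brute force)."""
    nodes = sorted(adj)
    best = None
    for half in combinations(nodes, len(nodes) // 2):
        half = set(half)
        cut = sum(1 for u in half for v in adj[u] if v not in half)
        best = cut if best is None else min(best, cut)
    return best

# Hypothetical example network: an 8-node ring.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
print(degree(ring), diameter(ring), bisection_width(ring))  # 2 4 2
```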

9
Topologies: Bus
  • All processors access a common bus for exchanging data
  • Used in simplest and earliest parallel machines
  • Advantages
    • distance between any two nodes is O(1)
    • provides a convenient broadcast medium
  • Disadvantages
    • bus bandwidth is a performance bottleneck

10
Bus Systems
  • A bus system is a hierarchy of buses connecting various system and
    subsystem components.
    • Each bus has a complement of control, signal, and power lines.
  • A system contains a variety of buses
    • Local bus (usually integral to a system board): connects various
      major system components (chips)
    • Memory bus: used within a memory board to connect the interface,
      the controller, and the memory cells
    • Data bus: might be used on an I/O board or VLSI chip to connect
      various components
    • Backplane: like a local bus, but with connectors to which other
      boards can be attached

11
Bridges
  • The term bridge denotes a device that connects two (or possibly more)
    buses.
  • The interconnected buses may use the same standards, or they may be
    different (e.g. PCI in a modern PC).
  • Bridge functions include
    • Communication protocol conversion
    • Interrupt handling
    • Serving as cache and memory agents

12
Bus
  • Since much of the data accessed by processors is
    local to the processor, cache is critical for the
    performance of bus-based machines

13
Bus Replacement: Direct Connect
  • Intel QuickPath Interconnect (2009 - present)

14
Direct Connect: 4 Node Configurations
  • 4N FC (fully connected): XFIRE BW 29.9 GB/s, Diam 1, Avg 0.75
  • 4N SQ (square): XFIRE BW 14.9 GB/s, Diam 2, Avg 1
Figure credit: The Opteron CMP NorthBridge Architecture, Now and in the Future, Pat Conway and Bill Hughes, AMD, HOT CHIPS 2006
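
The diameter and average figures quoted for the two 4-node configurations can be reproduced with a short calculation. This sketch assumes that "Avg" means the mean hop count over all source/destination pairs, including a node paired with itself; that reading is an assumption, though it does yield the quoted 0.75 and 1. The function name `hop_stats` is illustrative.

```python
from collections import deque

def hop_stats(adj):
    """Return (diameter, average hops) over all ordered pairs, self included.

    Assumes the network is connected.
    """
    total, worst = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        worst = max(worst, max(dist.values()))
    return worst, total / (len(adj) ** 2)

# Four nodes, every node linked to every other node.
fully_connected = {i: [j for j in range(4) if j != i] for i in range(4)}
# Four nodes in a square: each node linked to its two neighbors.
square = {i: [(i - 1) % 4, (i + 1) % 4] for i in range(4)}

print(hop_stats(fully_connected))  # (1, 0.75)
print(hop_stats(square))           # (2, 1.0)
```
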
15
Direct Connect: 8 Node Configurations
16
Crossbar Network
  • A crossbar network uses a p × m grid of switches
    to connect p inputs to m outputs in a
    non-blocking manner
  • A non-blocking crossbar network connecting p
    processors to b memory banks
  • Cost of a crossbar: O(p^2)
  • Generally difficult to scale for large values of
    p
  • Earth Simulator: custom 640-way single-stage
    crossbar
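
A minimal sketch of the crossbar idea, assuming one switch (crosspoint) per input/output pair: a connection succeeds whenever its input and its output are both free, regardless of what else is connected. The class name `Crossbar` and its methods are hypothetical and only model the bookkeeping, not real hardware.

```python
class Crossbar:
    """Toy model of a p x m crossbar with one crosspoint per (input, output)."""

    def __init__(self, p, m):
        self.p, self.m = p, m
        self.out_of = {}  # input  -> output currently connected
        self.in_of = {}   # output -> input currently connected

    @property
    def crosspoints(self):
        # Switch count grows as p * m, i.e. O(p^2) when m is about p.
        return self.p * self.m

    def connect(self, i, j):
        """Close crosspoint (i, j); fails only if i or j is already in use."""
        if i in self.out_of or j in self.in_of:
            return False
        self.out_of[i] = j
        self.in_of[j] = i
        return True

xbar = Crossbar(4, 4)
# Any one-to-one mapping of inputs to outputs can be set up at the same time.
results = [xbar.connect(i, (i + 1) % 4) for i in range(4)]
print(results, xbar.crosspoints)  # [True, True, True, True] 16
```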

17
Assessing Network Alternatives
  • Buses
    • excellent cost scalability
    • poor performance scalability
  • Crossbars
    • excellent performance scalability
    • poor cost scalability
  • Multistage interconnects
    • compromise between these extremes

18
Multistage Network
19
Multistage Omega Network
  • Organization
    • log p stages
    • p inputs/outputs
  • At each stage, input i is connected to output j if
      j = 2i           for 0 ≤ i ≤ p/2 − 1
      j = 2i + 1 − p   for p/2 ≤ i ≤ p − 1

20
Omega Network Stage
  • Each Omega stage is connected in a perfect shuffle
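
A perfect shuffle sends input i to the position obtained by rotating the (log p)-bit binary representation of i left by one bit. The sketch below shows that mapping, assuming p is a power of two with p ≥ 2; the function name `perfect_shuffle` is illustrative.

```python
def perfect_shuffle(i, p):
    """Output that input i connects to in one stage (p a power of two, p >= 2)."""
    n = p.bit_length() - 1  # n = log2(p) bits per address
    # Rotate the n-bit address left by one position.
    return ((i << 1) | (i >> (n - 1))) & (p - 1)

print([perfect_shuffle(i, 8) for i in range(8)])  # [0, 2, 4, 6, 1, 3, 5, 7]
```

The rotation is equivalent to the arithmetic form j = 2i for the lower half of the inputs and j = 2i + 1 − p for the upper half.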

21
Omega Network Switches
  • 2×2 switches connect perfect shuffles
  • Each switch operates in one of two modes: pass-through or crossover

22
Multistage Omega Network
  • Cost: p/2 × log p switching nodes → O(p log p)
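
To give a feel for the gap between the two cost curves, this sketch tabulates the Omega switch count, (p/2) log p, next to a p × p crossbar's p^2 crosspoints. The function names and the chosen values of p are illustrative, and the comparison is loose because a 2×2 switch and a single crosspoint are not the same unit of hardware.

```python
def omega_switches(p):
    """Number of 2x2 switches: p/2 per stage times log2(p) stages."""
    stages = p.bit_length() - 1  # log2(p) for p a power of two
    return (p // 2) * stages

def crossbar_crosspoints(p):
    """Number of crosspoints in a p x p crossbar."""
    return p * p

for p in (8, 64, 1024):
    print(p, omega_switches(p), crossbar_crosspoints(p))
# 8 12 64
# 64 192 4096
# 1024 5120 1048576
```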

23
Omega Network Routing
  • Let
    • s = binary representation of the source processor
    • d = binary representation of the destination processor or memory
  • The data traverses the link to the first switching node
    • if the most significant bits of s and d are the same
      • route data in pass-through mode by the switch
    • else
      • use crossover path
  • Strip off leftmost bit of s and d
  • Repeat for each of the log p switching stages
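
The routing rule above can be sketched as a small simulation. It assumes p is a power of two with p ≥ 2, models each stage as a perfect shuffle followed by a 2×2 switch, and tracks the message's position to confirm that it ends at d. The function name `omega_route` is hypothetical.

```python
def omega_route(s, d, p):
    """Return (switch modes per stage, final position) for a message s -> d."""
    n = p.bit_length() - 1  # number of stages, log2(p)
    pos = s
    modes = []
    for k in range(n):
        # Perfect shuffle: rotate the n-bit position left by one bit.
        pos = ((pos << 1) | (pos >> (n - 1))) & (p - 1)
        # Compare the k-th most significant bits of s and d.
        s_bit = (s >> (n - 1 - k)) & 1
        d_bit = (d >> (n - 1 - k)) & 1
        if s_bit == d_bit:
            modes.append("pass-through")
        else:
            modes.append("crossover")
            pos ^= 1  # leave the switch on its other output
    return modes, pos

print(omega_route(0b010, 0b111, 8))
# (['crossover', 'pass-through', 'crossover'], 7)
```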

24
Omega Network Routing
25
Blocking in an Omega Network
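
A hedged sketch of how blocking can be detected: two messages block each other if they need the same switch output link at the same stage. The sketch assumes p is a power of two and uses the fact that, after stage k, a message sits on the link whose address is its previous address shifted left with bit k of the destination appended. The function names and the example source/destination pairs are illustrative choices.

```python
def omega_links(s, d, p):
    """Link address occupied after each stage by a message s -> d."""
    n = p.bit_length() - 1
    pos, links = s, []
    for k in range(n):
        d_bit = (d >> (n - 1 - k)) & 1
        # Shuffle, then exit the switch on the output selected by d_bit.
        pos = ((pos << 1) & (p - 1)) | d_bit
        links.append(pos)
    return links

def conflicts(route_a, route_b, p):
    """List of (stage, link) pairs that both messages would need at once."""
    a = omega_links(route_a[0], route_a[1], p)
    b = omega_links(route_b[0], route_b[1], p)
    return [(k, x) for k, (x, y) in enumerate(zip(a, b)) if x == y]

# Messages 010 -> 111 and 110 -> 100 collide right after the first stage.
print(conflicts((0b010, 0b111), (0b110, 0b100), 8))  # [(0, 5)]
# Messages 000 -> 000 and 001 -> 001 never share a link.
print(conflicts((0, 0), (1, 1), 8))                  # []
```
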
26
Clos Network (non-blocking)
27
Star Connected Network
  • Static counterparts of buses
  • Every node connected only to a common node at the
    center
  • Distance between any pair of nodes is O(1)

28
Completely Connected Network
  • Each processor is connected to every other
    processor
  • Static counterparts of crossbars
  • Number of links in the network scales as O(p^2)

29
Linear Array
  • Each node has two neighbors: left and right
  • If the nodes at the two ends are also connected: 1D torus
    (ring)

30
Meshes and k-d Meshes
  • Mesh: generalization of linear array to 2D
    • nodes have 4 neighbors: north, south, east, and west
  • k-d mesh
    • d-dimensional mesh
    • nodes have 2d neighbors
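
The neighbor counts can be checked with a small helper. This sketch assumes nodes are addressed by coordinate tuples; the function name `mesh_neighbors` and the `wrap` flag (which adds wraparound links, as in a ring or torus) are illustrative. Without wraparound, only interior nodes reach the full 2d neighbors.

```python
def mesh_neighbors(coord, shape, wrap=False):
    """Neighbors of node `coord` in a mesh with the given size per dimension."""
    nbrs = []
    for axis, size in enumerate(shape):
        for step in (-1, 1):
            c = coord[axis] + step
            if wrap:
                c %= size             # wraparound link (torus)
            elif not 0 <= c < size:
                continue              # off the edge of the mesh
            nbrs.append(coord[:axis] + (c,) + coord[axis + 1:])
    return nbrs

print(len(mesh_neighbors((1, 1), (3, 3))))           # 4: interior node, 2D
print(len(mesh_neighbors((0, 0), (3, 3))))           # 2: corner node, 2D
print(len(mesh_neighbors((1, 1, 1), (3, 3, 3))))     # 6: interior node, 3D
print(len(mesh_neighbors((0, 0), (4, 4), wrap=True)))  # 4: corner with wraparound
```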

31
Hypercubes
  • Special d-dimensional mesh: p nodes, d = log p

32
Hypercube Properties
  • Distance between any two nodes is at most log p.
  • Each node has log p neighbors
  • Distance between two nodes = number of bit positions
    that differ between node numbers

33
Trees
34
Tree Properties
  • Distance between any two nodes is no more than 2
    log p
  • Trees can be laid out in 2D with no wire
    crossings
  • Problem
    • links closer to the root carry more traffic than those at lower levels
  • Solution: fat tree
    • widen links as depth gets shallower
    • copes with higher traffic on links near root
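
The 2 log p bound on tree distance can be checked with a short calculation. This sketch assumes a complete binary tree with the p processors at the leaves, numbered 0 to p − 1 from left to right; that layout and the function name `leaf_distance` are assumptions made for illustration. A message climbs to the lowest common ancestor and then descends, and the height of that ancestor is the bit length of the XOR of the two leaf numbers.

```python
def leaf_distance(a, b):
    """Hops between leaves a and b of a complete binary tree."""
    # Height of the lowest common ancestor above the leaf level.
    height = (a ^ b).bit_length()
    return 2 * height  # up to the ancestor, then back down

p = 16  # log p = 4
print(leaf_distance(0, 1))   # 2: siblings share a parent
print(leaf_distance(0, 2))   # 4
print(max(leaf_distance(a, b) for a in range(p) for b in range(p)))  # 8 = 2 log p
```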

35
Fat Tree Network
  • Fat tree network for 16 processing nodes
  • Can judiciously choose fatness of links
    • take full advantage of technology and packaging constraints

36
Metrics for Interconnection Networks
37
Metrics for Dynamic Interconnection Networks