Current major high performance networking technologies - PowerPoint PPT Presentation


Author: Rachel Fang. Last modified by: surfing. Created: 5/25/2005 12:43:22 AM. Presentation format: On-screen Show (4:3).



1
Current major high performance networking
technologies
  • InfiniBand
  • 10G-Ethernet

2
InfiniBand
  • Was originally designed as a system area
    network connecting CPUs and I/O devices.
  • A larger role: replacing all I/O standards for
    data centers (PCI, Fibre Channel, and Ethernet),
    so that everything connects through InfiniBand.
  • A lesser role: a low-latency, high-bandwidth,
    low-overhead interconnect for commercial data
    centers, between servers and storage.
  • Can form local area or even wide area networks.
  • Has become the de facto interconnect for high
    performance clusters (100 systems in the Top 500
    supercomputer list).

3
  • InfiniBand architecture
  • Specification (InfiniBand Architecture
    Specification, release 1.2.1, January 2008/Oct.
    2006) available from the InfiniBand Trade
    Association (http://www.infinibandta.org)

4
  • Infiniband architecture overview

5
  • Infiniband architecture overview
  • Components
  • Links, channel adapters, switches, routers
  • The specification allows an InfiniBand wide area
    network, but it is mostly adopted as a
    system/storage area network.
  • Topology
  • Irregular
  • Regular: fat tree, hypercube, etc.

6
  • Infiniband architecture overview
  • Link speed (signal rate)
  • Single data rate (SDR): 2.5 Gbps (1X), 10 Gbps
    (4X), and 30 Gbps (12X)
  • Double data rate (DDR): 5 Gbps (1X), 20 Gbps (4X),
    60 Gbps (12X)
  • Quad data rate (QDR): 10 Gbps (1X), 40 Gbps (4X),
    120 Gbps (12X)
  • Fourteen data rate (FDR): 14 Gbps (1X), 56 Gbps
    (4X), 168 Gbps (12X)
  • Enhanced data rate (EDR): 25 Gbps (1X),
    100 Gbps (4X), 300 Gbps (12X)
  • 8b/10b encoding in SDR, DDR, and QDR
  • 64b/66b encoding in FDR and EDR
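
The rates above are signal rates, so the usable data rate is lower by the encoding overhead. A minimal Python sketch of that arithmetic, using the rounded per-lane figures from the slide; the names `SIGNAL_RATE_GBPS`, `ENCODING`, and `data_rate_gbps` are made up for illustration:

```python
from fractions import Fraction

# Per-lane (1X) signal rates in Gbps, as listed on the slide (rounded values).
SIGNAL_RATE_GBPS = {
    "SDR": Fraction(5, 2),
    "DDR": Fraction(5),
    "QDR": Fraction(10),
    "FDR": Fraction(14),
    "EDR": Fraction(25),
}

# Line-code efficiency: data bits carried per transmitted bit.
ENCODING = {
    "SDR": Fraction(8, 10),
    "DDR": Fraction(8, 10),
    "QDR": Fraction(8, 10),
    "FDR": Fraction(64, 66),
    "EDR": Fraction(64, 66),
}


def data_rate_gbps(generation, lanes):
    """Usable data rate in Gbps for a link generation and width (1, 4, or 12 lanes)."""
    if lanes not in (1, 4, 12):
        raise ValueError("link width must be 1, 4, or 12")
    return float(SIGNAL_RATE_GBPS[generation] * lanes * ENCODING[generation])


if __name__ == "__main__":
    for generation in SIGNAL_RATE_GBPS:
        print(generation, [round(data_rate_gbps(generation, w), 2) for w in (1, 4, 12)])
```

With 8b/10b encoding, for example, a 4X SDR link signals at 10 Gbps but carries 8 Gbps of data.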

7
InfiniBand link speed
InfiniBand roadmap from the InfiniBand Trade
Association: http://www.infinibandta.org/content/pages.php?pg=technology_overview
8
  • Layered architecture, somewhat similar to TCP/IP
  • Physical layer
  • Link layer
  • Error detection (CRC checksum)
  • Flow control (credit based)
  • Switching, virtual lanes (VL)
  • Forwarding table computed by the subnet manager
  • Not adaptive
  • Network layer: routing across subnets
  • Not needed in the cluster environment
  • Transport layer
  • Reliable/unreliable, connection/datagram
  • Verbs: the interface between adapters and the
    OS/users

9
  • Link layer: packet format
  • Local Route Header (LRH): 8 bytes. Used for local
    routing by switches within an IBA subnet
  • Global Route Header (GRH): 40 bytes. Used for
    routing between subnets
  • Base Transport Header (BTH): 12 bytes, for IBA
    transport
  • Reliable Datagram Extended Transport Header
    (RDETH): 4 bytes, just for reliable datagram
  • Datagram Extended Transport Header (DETH): 8
    bytes
  • RDMA Extended Transport Header (RETH): 16 bytes
  • Atomic, ACK, and Atomic ACK extended transport
    headers
  • Immediate Data extended transport header: 4
    bytes, optimized for small packets
  • Invalidate extended transport header
  • Invariant CRC and variant CRC
  • One CRC for the fields that do not change in
    transit and one for the fields that can change.
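
To get a feel for the header overhead, the sizes listed above can be summed per packet. The sketch below is only an illustration: the header combinations (LRH + BTH + RETH for an RDMA write within one subnet, LRH + BTH + DETH for a datagram) are assumptions about typical packets, the two CRCs are left out because the slide gives no sizes for them, and the function names are invented here.

```python
# Header sizes in bytes, as listed on the slide. "ImmDt" is a label chosen
# here for the Immediate Data extended transport header.
HEADER_BYTES = {
    "LRH": 8,
    "GRH": 40,
    "BTH": 12,
    "RDETH": 4,
    "DETH": 8,
    "RETH": 16,
    "ImmDt": 4,
}


def header_overhead(headers):
    """Total size in bytes of the named headers (CRCs not included)."""
    return sum(HEADER_BYTES[name] for name in headers)


def payload_fraction(headers, payload_bytes):
    """Fraction of the packet (headers + payload, CRCs ignored) that is payload."""
    return payload_bytes / (payload_bytes + header_overhead(headers))


if __name__ == "__main__":
    local_rdma_write = ["LRH", "BTH", "RETH"]          # assumed header set
    routed_rdma_write = ["LRH", "GRH", "BTH", "RETH"]  # same, crossing subnets
    print(header_overhead(local_rdma_write))
    print(header_overhead(routed_rdma_write))
    print(payload_fraction(["LRH", "BTH", "DETH"], 228))
```

Adding the GRH more than doubles the header bytes, which is one reason it matters that packets staying inside a subnet do not need it.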

10
  • Local Route Header
  • Switching based on the destination port address
    (LID)
  • Multipath switching by allocating multiple LIDs
    to one port

11
  • Local Route Header
  • Switching based on the destination port address
    (LID).
  • Forwarding table entry: (LID, outgoing port)
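
Destination-based switching and the multiple-LID trick can be sketched with a plain lookup table. This is a toy model, not switch firmware; the class name `Switch`, the LID values, and the port numbers are all made up for illustration.

```python
class Switch:
    """Toy switch that forwards on the destination LID only."""

    def __init__(self, name):
        self.name = name
        self.forwarding_table = {}  # destination LID -> outgoing port

    def set_route(self, dlid, out_port):
        """Install one forwarding table entry (LID, outgoing port)."""
        self.forwarding_table[dlid] = out_port

    def forward(self, dlid):
        """Return the outgoing port for a packet addressed to dlid."""
        try:
            return self.forwarding_table[dlid]
        except KeyError:
            raise LookupError(f"{self.name}: no route for LID {dlid}") from None


# Hypothetical example: one end port owns LIDs 8 and 9. Because the switch
# looks only at the destination LID, giving the two LIDs different outgoing
# ports yields two distinct paths to the same end port.
sw = Switch("sw0")
sw.set_route(8, 1)
sw.set_route(9, 2)

if __name__ == "__main__":
    print(sw.forward(8), sw.forward(9))
```

The sender picks a path simply by choosing which of the destination's LIDs to put in the LRH.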

12
  • Local Route Header
  • Multipath switching by allocating multiple LIDs
    to one port, see the previous example.
  • GRH: addresses use the same format as IPv6
    addresses (16 bytes each)

13
Subnet management
  • Discover the subnet topology and topology
    changes, compute the paths, assign LIDs,
    distribute the routes, configure devices
  • Not well-defined in the specification
  • The forwarding tables must be computed such that
    all devices in the network can be reached.
  • References
  • A. Bermudez, R. Casado, F. J. Quiles, T. M.
    Pinkston, J. Duato, "Evaluation of a Subnet
    Management Mechanism for InfiniBand Networks,"
    ICPP 2003.
  • A. Vishnu, A. R. Mamidala, H. Jin, D. K. Panda,
    "Performance Modeling of Subnet Management on Fat
    Tree InfiniBand Networks using OpenSM," Workshop
    on System Management Tools on Large Scale
    Parallel Systems, held in conjunction with IPDPS
    2005.

14
  • InfiniBand devices and entities related to subnet
    management
  • Devices: channel adapters (CA), host channel
    adapters (HCA), switches, routers
  • Subnet manager (SM): discovers, configures,
    activates, and manages the subnet
  • A subnet management agent (SMA) in every device
    generates and responds to control packets (subnet
    management packets, SMPs) and configures local
    components for subnet management
  • The SM exchanges control packets with the SMAs
    through the subnet management interface (SMI).

15
(No Transcript)
16
  • Subnet management packets (SMP)
  • 256 bytes of data
  • Use the unreliable datagram service on the
    management virtual lane (VL 15)
  • Two routing schemes
  • LID routed: uses the lookup table for forwarding
  • Used after the subnet is set up, e.g. to check
    the status of an active port
  • Direct routed: carries the output port for each
    intermediate hop
  • Used for subnet discovery, before the subnet is
    set up
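
Direct routing works without any forwarding tables because the packet itself names the output port at every hop. A minimal sketch of that idea follows; the topology, node names, and the function `deliver_direct_routed` are hypothetical and not taken from the specification.

```python
def deliver_direct_routed(links, start, port_path):
    """Follow a direct route and return the list of nodes visited.

    links maps (node, output port) to the neighbouring node.
    port_path[i] is the output port to take at hop i, as carried in the packet.
    """
    node = start
    visited = [node]
    for out_port in port_path:
        node = links[(node, out_port)]  # KeyError means the port is not cabled
        visited.append(node)
    return visited


# Hypothetical cabling: the SM host hangs off switch sw0.
LINKS = {
    ("sm_host", 1): "sw0",
    ("sw0", 2): "sw1",
    ("sw0", 3): "hostA",
    ("sw1", 3): "hostB",
}

if __name__ == "__main__":
    print(deliver_direct_routed(LINKS, "sm_host", [1, 2, 3]))
```

No switch needs a configured table for this to work, which is why the scheme suits discovery.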

17
  • Subnet management packets (SMP)
  • Define the operation to be performed by the SM
  • Get: get information about a CA, switch, or port
  • Set: set an attribute of a port (e.g. LID)
  • GetResp: response to a Get
  • Trap: inform the SM about the state of a local
    node
  • An SMA keeps sending a Trap message until it
    receives a TrapRepress packet.
  • Topology information can be obtained by a sweep
    and by periodic Traps.

18
  • Subnet management phases
  • Topology discovery: sending direct-routed SMPs to
    every port and processing the responses
  • Path computation: computing valid paths between
    each pair of end nodes
  • Path distribution: configuring the forwarding
    tables

19
  • Subnet discovery
  • The SM starts by sending a direct-routed Get SMP
    to its local node. Upon receiving the response,
    the SM sends SMPs to progressively greater depth.
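
The sweep described above behaves like a breadth-first search in which the direct route to each newly found node is the route to its parent plus one more output port. The sketch below simulates this over a hypothetical link table; in a real subnet the SM would learn each node's ports from Get responses rather than from a ready-made dictionary.

```python
from collections import deque


def discover(links, sm_node):
    """Breadth-first sweep from the SM's node.

    links maps (node, output port) to the neighbouring node.
    Returns {node: direct route}, where a direct route is the list of
    output ports to take, hop by hop, starting at the SM's node.
    """
    routes = {sm_node: []}
    queue = deque([sm_node])
    while queue:
        node = queue.popleft()
        # Stand-in for "send a Get SMP along routes[node] and read the reply".
        for (owner, port), neighbor in sorted(links.items()):
            if owner == node and neighbor not in routes:
                routes[neighbor] = routes[node] + [port]
                queue.append(neighbor)
    return routes


# Hypothetical topology; every cable is listed once per direction.
TOPOLOGY = {
    ("sm", 1): "sw0", ("sw0", 1): "sm",
    ("sw0", 2): "sw1", ("sw1", 1): "sw0",
    ("sw0", 3): "hostA", ("hostA", 1): "sw0",
    ("sw1", 2): "hostB", ("hostB", 1): "sw1",
}

if __name__ == "__main__":
    for node, route in discover(TOPOLOGY, "sm").items():
        print(node, route)
```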

20
  • Path computation
  • Compute paths between all pairs of nodes
  • For irregular topologies
  • Up/Down routing does not work directly
  • It needs information about both the incoming
    interface and the destination, but InfiniBand
    forwards on the destination only
  • Potential solution
  • Find all possible paths
  • Remove every up link that follows a down link in
    each node
  • Find one output port for each destination
  • Other solutions: destination renaming
  • Fat tree topology
  • The best that can be achieved (optimal routing)
    is also not clear.
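
The Up/Down rule itself is easy to state in code: once a path has taken a down link it may not take an up link again. The sketch below checks that rule for a given path. It assumes the common convention of orienting links by breadth-first level from a chosen root, with ties broken by node name; the graph and all names are hypothetical, and this is not the path computation of any particular subnet manager.

```python
from collections import deque


def bfs_levels(adjacency, root):
    """Distance of every node from the root, in hops."""
    levels = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in levels:
                levels[neighbor] = levels[node] + 1
                queue.append(neighbor)
    return levels


def is_up(levels, src, dst):
    """A hop is 'up' if it moves toward the root (ties broken by node name)."""
    return (levels[dst], dst) < (levels[src], src)


def is_legal_up_down(levels, path):
    """True if the path never takes an up link after a down link."""
    seen_down = False
    for src, dst in zip(path, path[1:]):
        if is_up(levels, src, dst):
            if seen_down:
                return False
        else:
            seen_down = True
    return True


# Hypothetical irregular topology with root "r".
ADJACENCY = {
    "r": ["a", "b"],
    "a": ["r", "b", "c"],
    "b": ["r", "a", "c"],
    "c": ["a", "b"],
}
LEVELS = bfs_levels(ADJACENCY, "r")

if __name__ == "__main__":
    print(is_legal_up_down(LEVELS, ["c", "a", "r", "b"]))  # up, up, down
    print(is_legal_up_down(LEVELS, ["a", "c", "b"]))       # down, then up
```

Whether the hop out of node c is allowed depends on how the packet arrived there, which is exactly the information a destination-only forwarding table does not have.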

21
  • Path distribution
  • Ordering issue: the network may be in an
    inconsistent state while partially updated, which
    may result in deadlock during this period.
  • Traditional solution: send no data packets for a
    period of time
  • Deadlock-free reconfiguration schemes
  • How to do this correctly, effectively, and
    incrementally is still open.

22
  • Base transport header

23
  • Verbs
  • The OS/users access the adapter through verbs
  • Communication mechanism: Queue Pair (QP)
  • Users can queue up a set of instructions that the
    hardware executes.
  • A pair of queues in each QP: one for send, one
    for receive.
  • Users can post send requests to the send queue
    and receive requests to the receive queue.
  • Three types of send operations: SEND,
    RDMA (WRITE, READ, ATOMIC), MEMORY-BINDING
  • One receive operation (matching SEND)

24
(No Transcript)
25
(No Transcript)
26
  • Queue Pair
  • The status of the result of an operation
    (send/receive) is stored in the completion queue.
  • Send and receive queues can bind to different
    completion queues.
  • Related system-level verbs
  • Open QP, create completion queue, open HCA, open
    protection domain, register memory, allocate
    memory window, etc.
  • User-level verbs
  • Post send/receive request, poll for completion.

27
  • To communicate
  • Make system calls to set up everything (open QP,
    bind QP to port, bind completion queues, connect
    local QP to remote QP, register memory, etc.).
  • Post send/receive requests.
  • Check completion.
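
The three steps (set up, post, check completion) can be illustrated with a small in-memory model. This is a toy simulation in plain Python and not the verbs API: the classes `CompletionQueue` and `QueuePair`, their methods, and the `process` call that stands in for the adapter hardware are all invented for illustration.

```python
from collections import deque


class CompletionQueue:
    """Holds the status of finished operations until the user polls for them."""

    def __init__(self):
        self._entries = deque()

    def push(self, entry):
        self._entries.append(entry)

    def poll(self):
        """Return the oldest completion, or None if there is none yet."""
        return self._entries.popleft() if self._entries else None


class QueuePair:
    """A send queue and a receive queue, each bound to a completion queue."""

    def __init__(self, send_cq, recv_cq):
        self.send_queue = deque()
        self.recv_queue = deque()
        self.send_cq = send_cq
        self.recv_cq = recv_cq
        self.remote = None

    def connect(self, remote):
        self.remote = remote
        remote.remote = self

    def post_recv(self, buffer):
        """Queue a receive request: a bytearray the next SEND may land in."""
        self.recv_queue.append(buffer)

    def post_send(self, data):
        """Queue a SEND request."""
        self.send_queue.append(bytes(data))

    def process(self):
        """Stand-in for the hardware: match queued SENDs with posted receives."""
        while self.send_queue and self.remote.recv_queue:
            data = self.send_queue.popleft()
            buffer = self.remote.recv_queue.popleft()
            if len(data) > len(buffer):
                status = "length_error"
            else:
                buffer[:len(data)] = data
                status = "success"
            self.send_cq.push(("send", status, len(data)))
            self.remote.recv_cq.push(("recv", status, len(data)))


# 1. Set up: create the queues and connect the local QP to the remote QP.
sender = QueuePair(CompletionQueue(), CompletionQueue())
receiver = QueuePair(CompletionQueue(), CompletionQueue())
sender.connect(receiver)

# 2. Post requests: the receiver must post a buffer to match the SEND.
buf = bytearray(16)
receiver.post_recv(buf)
sender.post_send(b"hello")
sender.process()

# 3. Check completion by polling the completion queues.
send_done = sender.send_cq.poll()
recv_done = receiver.recv_cq.poll()
```

After step 2 the user code never blocks in the "operating system"; it only posts work and polls, which mirrors the division of labour the slides describe.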

28
  • InfiniBand has an almost perfect software/network
    interface (Chien'94 paper)
  • The network subsystem realizes all user level
    functionality.
  • User-level access to the network interface: a
    few machine instructions accomplish the
    transmission task without involving the OS.
  • The network supports in-order delivery and fault
    tolerance.
  • Buffer management is pushed out to the user.

29
  • Mellanox product brief: SwitchX-2 Virtual
    Protocol Interconnect, optimized for SDN

30
  • Mellanox product brief: SwitchX-2 Virtual
    Protocol Interconnect, optimized for SDN
  • Virtual protocol interconnect
  • Automatically senses InfiniBand, Ethernet,
    Fibre Channel, and data center bridging
  • Flexible port configuration
  • 36 IB FDR ports or 40/56GbE ports
  • 64 10GbE ports
  • 24 2/4/5Gb FC ports
  • SDN support
  • Complete support for OpenFlow and subnet
    management
  • Remotely configurable routing table, overlay,
    control plane.