Local - PowerPoint PPT Presentation

About This Presentation
Title:

Local

Description:

Module #4 Storage Area Networks, Fibre Channel, & High Performance ... Example iSCSI HBA/NIC (Qlogic 4052C) 100/1000 Full Duplex Ethernet. 133-MHz PCI-X ... – PowerPoint PPT presentation

Slides: 60
Provided by: JR98
Category:
Tags: local | nic | offload


Transcript and Presenter's Notes

Title: Local


1
Local & Wide Area Networking
Johns Hopkins University Course 635.412.71
  • Module 4: Storage Area Networks, Fibre Channel,
    & High Performance Computing Interconnects

2
SANs, Fibre Channel, & HPC: Introduction
  • As computing & network technologies evolve, these
    technologies are intersecting in new ways
  • High Performance Storage: Storage Area Networks
    (SANs)
  • High Performance Computing: Clusters
  • Reasons for the rise of High Performance Storage
    & SANs
  • Growth of Storage Needs
  • Growth of Physical Storage Capacity
  • Need to solve issues of
  • Performance
  • Reliability
  • Cost Efficiency
  • Management
  • Security

3
SANs, Fibre Channel, & HPC: Terminology
  • Common Storage Terminology
  • Channel
  • Interconnect
  • PCI, PCI-X, PCI-e (Peripheral Component
    Interconnect)
  • SCSI/iSCSI (Small Computer Systems Interface)
  • ATA/SATA
  • RAID (Redundant Array of Inexpensive Disks)
  • HBA (Host Bus Adapter)
  • DAS (Direct Attached Storage)
  • JBOD (Just a Bunch of Disks)
  • NAS (Network Attached Storage)
  • And, of course SAN

4
SANs, Fibre Channel, & HPC: So what does a SAN
look like?
5
Low-Cost SAN iSCSI: Introduction
  • Introduction
  • iSCSI (i for internet) is a basic protocol that
    allows storage links to traverse TCP/IP-based
    networks
  • Standardized by the IETF in 2004 (RFC 3720)
  • Encapsulates block-level SCSI commands for
    transport across TCP/IP
  • An initiator uses iSCSI to access a remote
    target (typically a LUN)
  • Design goal: match the performance of existing
    SCSI transports
  • Key Uses
  • Storage Virtualization (especially SMB)
  • Storage Consolidation
  • Disaster Recovery

6
Low-Cost SAN iSCSI: Technology
  • Technical Details (1)
  • Encapsulates block-level SCSI (CDB) commands for
    transport
  • Appears as application-level TCP/IP traffic
  • Uses TCP and (optionally) IPsec for transport
  • Multiple TCP connections can be used for an iSCSI
    session
  • CHAP can be used for initial authentication of a
    session
  • Follows the SCSI Remote Procedure Invocation
    model
  • SCSI Commands carried in iSCSI request PDUs
  • SCSI responses, data, status reports carried in
    iSCSI response PDUs
  • Uses DNS, SLP, and iSNS for resource
    location/discovery
  • SLP: Service Location Protocol (RFC 2608/4018)
  • iSNS: Internet Storage Name Service (RFC 4171)

7
Low-Cost SAN iSCSI: Technology
  • Technical Details (2)
  • Examples of iSCSI PDUs carrying SCSI Payloads
  • SCSI Command/Response
  • SCSI Data-Out/Data-In
  • Examples of iSCSI PDUs with iSCSI-only payload
  • Login Request/Response
  • Logout Request/Response
  • SNACK Request (Retransmission)
  • Example iSCSI PDU -> Next Slide
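
The PDU families listed above can be told apart by the opcode in the first byte of the iSCSI header. The sketch below is illustrative only: the opcode values are the ones recalled from RFC 3720 and should be checked against the RFC before use, and the names `IscsiOpcode`, `SCSI_PAYLOAD`, and `describe` are hypothetical helpers, not part of any library.

```python
from enum import IntEnum


class IscsiOpcode(IntEnum):
    # Initiator -> target (request) opcodes, as recalled from RFC 3720
    SCSI_COMMAND = 0x01
    LOGIN_REQUEST = 0x03
    SCSI_DATA_OUT = 0x05
    LOGOUT_REQUEST = 0x06
    SNACK_REQUEST = 0x10
    # Target -> initiator (response) opcodes
    SCSI_RESPONSE = 0x21
    LOGIN_RESPONSE = 0x23
    SCSI_DATA_IN = 0x25
    LOGOUT_RESPONSE = 0x26


# PDUs that the slide lists as carrying SCSI payloads; the rest are iSCSI-only.
SCSI_PAYLOAD = {
    IscsiOpcode.SCSI_COMMAND,
    IscsiOpcode.SCSI_RESPONSE,
    IscsiOpcode.SCSI_DATA_OUT,
    IscsiOpcode.SCSI_DATA_IN,
}


def describe(first_byte: int) -> str:
    """Classify a PDU from the first header byte (low 6 bits hold the opcode)."""
    opcode = IscsiOpcode(first_byte & 0x3F)
    direction = "target->initiator" if opcode & 0x20 else "initiator->target"
    payload = "SCSI payload" if opcode in SCSI_PAYLOAD else "iSCSI-only payload"
    return f"{opcode.name}: {direction}, {payload}"


if __name__ == "__main__":
    for byte in (0x41, 0x25, 0x03, 0x10):
        print(describe(byte))
```

Masking with 0x3F drops the reserved bit and the "immediate delivery" bit, so 0x41 and 0x01 both classify as a SCSI Command.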

8
Low-Cost SAN iSCSI: Technology
  • Example is a SCSI Data-In PDU (READ operation;
    WRITE data travels in Data-Out PDUs)

9
Low-Cost SAN iSCSI: Implementation
  • Most iSCSI implementations use TCP/IP over
    Ethernet
  • Cost advantages
  • Familiarity & Ease of Troubleshooting
  • Storage infrastructure parallels the rest of LAN
    infrastructure
  • For most low/mid-range applications performance
    is not an issue (especially when Jumbo frames
    are used)
  • iSCSI Gateways allow low-cost iSCSI-enabled
    servers to talk to high-end Fibre Channel-based
    storage resources
  • Windows 2003, Linux 2.6, VMware have iSCSI
    support
  • Example iSCSI HBA/NIC (Qlogic 4052C)
  • 100/1000 Full Duplex Ethernet
  • 133-MHz PCI-X
  • Full TOE iSCSI Offload
  • QoS & VLAN enabled
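
The remark about Jumbo frames can be made concrete with a rough wire-efficiency estimate. This is a back-of-the-envelope sketch under stated assumptions: IPv4 and TCP headers without options, standard Ethernet framing overhead, and no accounting for the iSCSI header or for per-packet CPU cost (which is often the larger benefit). The function name `payload_efficiency` is hypothetical.

```python
# Per-frame Ethernet overhead in bytes: preamble+SFD, MAC header, FCS, inter-frame gap
ETH_OVERHEAD = 8 + 14 + 4 + 12
# IPv4 + TCP headers without options (assumption)
IP_TCP_HEADERS = 20 + 20


def payload_efficiency(mtu: int) -> float:
    """Fraction of on-the-wire bytes that carry TCP payload for full-size frames."""
    return (mtu - IP_TCP_HEADERS) / (mtu + ETH_OVERHEAD)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {payload_efficiency(mtu):.1%} of line rate is payload")
```

Under these assumptions the gain from a 9000-byte MTU is a few percent of line rate; the reduction in frames per second that the host must process is usually the more visible effect.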

10
Fibre Channel: Introduction
  • Originally developed for mainframe &
    supercomputing environments to connect
    high-speed clusters & storage
  • Development began in 1988 under the auspices of
    the ANSI T11 committee (device level interfaces);
    approved as a standard in 1994
  • Besides its use as a very high bandwidth I/O
    channel technology, there is interest in Fibre
    Channel as a LAN technology because of its high
    speed and unique combination of channel- &
    network-oriented properties
  • Data-type qualifiers for routing data into
    specific interface buffers
  • Link-level constructs designed to support
    individual I/O operations
  • Support for existing I/O interface specifications
    (SCSI, HIPPI, etc.)
  • Full multiplexing capabilities
  • Peer-to-peer connectivity between any two ports
    in a FC network
  • Ability to internetwork with other LAN, WAN, &
    I/O technologies
  • This reflects the book ca. 2000 but does not
    appear to be the case

11
Fibre Channel: Introduction
  • Comparison of Fibre Channel with Gigabit Ethernet
    and ATM (Table 9.1, with updates)

12
Fibre Channel: Architecture
  • Designed to provide a common, efficient,
    high-speed transport to a wide variety of devices
    through a single port type
  • Requirements outlined by the Fibre Channel
    Association
  • Full-duplex links over a fiber pair (one
    transmit/one receive)
  • Bi-directional performance up to 6.4-Gbps on a
    single link
  • Support over distances up to 10 kilometers
  • Small connectors for high density applications
  • High-capacity utilization with distance
    insensitivity
  • Greater connectivity than existing multi-drop
    channels
  • Broad availability at reasonable cost
  • Support for multiple cost/performance levels,
    from PCs to clusters
  • Ability to carry multiple protocols and command
    sets
  • The best way to meet such demanding requirements
    was to develop a transport mechanism based on
    simple point-to-point links & a switching network

13
Fibre Channel: Terminology
  • Fibre Channel, having a different heritage than
    other LAN/WAN technologies, has different
    terminology (Table 9.2)
  • Dedicated Connection: A circuit guaranteed and
    retained by the fabric for two specified N_Ports
  • Exchange: The basic mechanism that transfers
    information, consisting of one or more related
    non-concurrent sequences in one or both
    directions
  • Fabric: The entity that interconnects the various
    N_Ports attached to it and handles the routing of
    frames
  • Intermix: A mode of service that reserves the
    full FC capacity for a dedicated (Class 1)
    connection but allows the transport of additional
    connectionless data if space is available
  • Node: A collection of one or more N_Ports

14
Fibre Channel: Terminology (continued)
  • Fibre Channel, having a different heritage than
    other LAN/WAN technologies, has different
    terminology (Table 9.2)
  • Operation: A set of one or more (possibly
    concurrent) exchanges associated with a logical
    construct above the FC-2 layer
  • Originator: The logical function associated with
    an N_Port that initiates an exchange
  • Port: The hardware entity within a node that
    performs data communications over an FC link
  • Responder: The logical function in an N_Port
    responsible for supporting an exchange initiated
    by an originator
  • Sequence: A set of one or more data frames with
    a common sequence ID transmitted unidirectionally
    from one N_Port to another N_Port, with a
    corresponding response, if applicable,
    transmitted in response to each data frame

15
Fibre Channel: Terminology
  • Fibre Channel Elements
  • The key elements of a FC network are the end
    devices called nodes and the collection of
    switching elements called the fabric
  • Communication between FC-attached nodes consists
    of transmission of frames across point-to-point
    links or fabric
  • Each node has one or more N_Ports for connection
    to the fabric
  • Nodes connect to F_Ports on the fabric via
    bi-directional point-to-point links
  • Fabrics can be a single switch or a general set
    of switching elements
  • Frames may be buffered within the fabric, making
    it possible for nodes to connect to the fabric at
    different data rates
  • The fabric is a switched architecture, not a
    shared access medium, so no MAC issues are
    encountered and no MAC sublayer is necessary
  • The FC network scales easily in terms of ports,
    data rate, and distance covered and through its
    layered protocol architecture interworks with
    existing LAN and I/O protocols

16
Fibre Channel: Terminology
  • Basic Fibre Channel Architectural Diagram

17
Fibre Channel: Example Architecture
18
Fibre Channel: Protocol Specifications
  • Fibre Channel Protocol Architecture
  • The Fibre Channel standard reference model is
    organized into five levels (Figure 9.3 and Table
    9.3)
  • These are not levels in the strict sense of the
    OSI model but are instead functional groupings of
    services and/or definitions
  • The standard does not dictate actual
    implementations, relationships between the
    levels, or the specific interfaces between levels
  • Levels FC-0, FC-1, and FC-2 are defined together
    in a standard called the Fibre Channel Physical
    and Signaling Interface (FC-PH)
  • No final standard has been issued for FC-3
  • A number of standards have been developed at FC-4
    specifying how Fibre Channel interfaces to
    existing LAN and I/O technologies

19
Fibre Channel: Protocol Specifications
  • Fibre Channel Protocol Architecture (continued)

20
Fibre Channel: Protocol Specifications
  • Fibre Channel Protocol Architecture (continued)
  • Details on the FC-0 level
  • A variety of physical media and data rates are
    allowed
  • Data rates: 100-Mbps to 3.2-Gbps
  • Media: fiber optic, coaxial cable, and STP
  • Distance: 50 m to 10 km depending on data rate
    and media
  • The FC-1 level uses an 8B/10B encoding scheme in
    which 8 bits of data from the FC-2 level are
    encoded into a 10-bit binary symbol
  • Note: the gap between the raw and effective
    speeds quoted is due to the 8B/10B scheme (a
    4-Gbps raw stream actually carries 3.2-Gbps of
    data)
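
The raw-versus-effective arithmetic in the note above can be written out as a one-line calculation. A minimal sketch; the function name `effective_rate_gbps` is a hypothetical helper.

```python
def effective_rate_gbps(raw_gbps: float) -> float:
    """8B/10B sends 10 line bits for every 8 data bits, so 80% of the raw rate is data."""
    return raw_gbps * 8 / 10


if __name__ == "__main__":
    # The slide's example: a 4-Gbps raw stream
    print(f"{effective_rate_gbps(4.0):.1f} Gbps of data")
```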

21
Fibre Channel: Protocol Specifications
  • Fibre Channel Protocol Architecture (continued)
  • The FC-2 level is responsible for the
    transmission of data between N_Ports, which
    requires the following
  • Addressing of N_Ports
  • Permissible topologies of the fabric
  • Classes of service
  • Segmentation and reassembly of frames as well as
    higher level grouping of frames (sequences and
    exchanges)
  • Sequencing, flow control, and error control
  • The FC-3 level provides a common set of services
    across multiple N_Ports
  • Striping: the process of using multiple ports to
    transmit a single data unit in parallel
  • Hunt groups: allow a connection to any
    available N_Port in the group
  • Multicast (and broadcast)

22
Fibre Channel: Protocol Specifications
  • Fibre Channel Protocol Architecture (continued)
  • The FC-4 level defines how other protocols
    interoperate with Fibre Channel (specifically
    FC-PH)
  • SCSI: a common device interface standard for
    computer peripherals
  • HIPPI: a high speed I/O channel used in
    mainframe and supercomputing environments
  • IEEE 802: how IEEE 802 MAC frames map to Fibre
    Channel frames
  • ATM
  • IP: how to map packets into Fibre Channel frames
    (RFC 4338)

23
Fibre Channel: Physical Media and Topologies
  • One FC strength is the range of allowed options
    for the physical medium, the data rate, and
    network topology
  • Transmission Media
  • A special shorthand nomenclature has been
    developed for FC media; it basically consists of
    the following
  • Speed-Medium-Transmitter-Distance
  • FC-0 options are listed in Figure 9.4
  • Allowable Media Types
  • Fiber Optic: SM, plus both 50 µm and 62.5 µm MM
  • Coaxial Cable: three 75 ohm cable types are
    specified: a thick RG-6/U, a thinner RG-59/U, and
    a miniature coax cable 0.1 in. in diameter
  • Shielded Twisted Pair: two types of 150 ohm
    cables are specified for use over short distances
    at data rates up to 200-Mbps: EIA-568 Type 1 STP
    (two shielded twisted pairs) or EIA-568 Type 2
    STP (four-pair STP)

24
Fibre Channel: Physical Media and Topologies
  • Topologies
  • The most general FC topology is the (switched)
    fabric
  • Four basic topologies are available:
    point-to-point, fabric, arbitrated loop (no hub),
    and arbitrated loop with hub
  • Point-to-point connects two end nodes with no
    switches or routing
  • The fabric topology can contain an arbitrary
    number of switches, some connecting to nodes and
    others that just provide transport between other
    switches
  • The fabric topology allows for easy scalability
  • In the fabric topology the overhead on nodes is
    minimized; they are only responsible for managing
    the point-to-point link to their local switch
  • Each port requires a unique address to allow
    frames to be delivered to the proper destination

25
Fibre Channel: Physical Media and Topologies
  • Topologies (continued)
  • The arbitrated loop topology allows up to 126
    nodes to be connected in a simple, low-cost loop
  • The ports on the loop are a special kind called
    NL_Ports because they must perform loop
    management functions
  • Operation is roughly equivalent to other token
    ring protocols
  • A token acquisition protocol controlling loop
    access is required
  • The fabric & loop topologies can be connected as
    long as one node can act as both an arbitrated
    loop node & a fabric node that participates in
    routing decisions on the fabric
  • The topology of a given FC network is discovered
    automatically as part of network initialization

26
Fibre Channel: Physical Media and Topologies
  • Fibre Channel Topologies (continued)

27
Fibre Channel: Framing & Classes of Service
  • Framing Protocol
  • The FC-2 layer defines the rules for the transfer
    of frames between nodes, comparable to the OSI
    data link layer
  • FC-2 specifies frame types, procedures for frame
    exchange, frame formats, flow control, and
    classes of service
  • FC-2 Classes of Service
  • Multiple classes of service are defined by the
    way communication is established between two
    ports and their flow/error control capabilities
  • Five classes of service are currently defined
  • Class 1: Acknowledged Connection-oriented
    service
  • Class 2: Acknowledged Connectionless service
  • Class 3: Unacknowledged Connectionless service
  • Class 4: Fractional Bandwidth Connection-oriented
    service
  • Class 6: Unidirectional Connection service

28
Fibre Channel: Framing & Classes of Service
  • FC-2 Classes of Service
  • Class 1 Service
  • Provides a dedicated path through the fabric
    which behaves to the end nodes like a
    point-to-point link
  • Also provides a guaranteed data rate with
    sequenced delivery of frames
  • The end node requests the setup of a Class 1
    service connection using a special start-of-frame
    delimiter (SOFc1)
  • Class 1 service is advantageous for long,
    constant-bandwidth transfers of data (e.g.,
    streaming backups over a network)

29
Fibre Channel: Framing & Classes of Service
  • FC-2 Classes of Service (continued)
  • Class 2 Service
  • Provides an acknowledged data transmission
    service without connection setup overhead
  • Acknowledgement frames are returned by the
    receiving port; if a delivery cannot be made due
    to congestion, a busy frame is returned
  • This is not the case with frames that cannot be
    delivered due to frame errors
  • Sequenced delivery is not guaranteed; frames can
    take different paths through the fabric if
    possible
  • Multiplexing of frames from different sources
    and/or destinations is allowed
  • Class 2 service is good for Storage Area Networks
    (SANs)

30
Fibre Channel: Framing & Classes of Service
  • FC-2 Classes of Service (continued)
  • Class 3 Service
  • Provides a basic datagram service (no connection
    setup)
  • Delivery is neither guaranteed nor acknowledged
  • Good for short data bursts or multicast/broadcast
    data
  • Class 4 Service
  • Provides service similar to Class 1 but adds
    Quality of Service (QoS) guarantees and
    reservations
  • Allows the specification of guaranteed bandwidth
    & bounded latency
  • QoS parameters established separately for each
    direction
  • Good for time-critical real-time applications
    (e.g., VTC)
  • Class 6 Service
  • Provides the reliable unicast delivery found in
    Class 1 but also supports reliable multicast and
    preemption
  • Good for video streaming and broadcasting

31
Fibre Channel: Frame Types and Uses
  • There are two general types of frames: data and
    control
  • Three types of data frames are used to transfer
    higher level information between N_Ports
  • FC-4 Device Data: used to transfer higher-layer
    data units from protocols specified in FC-4
    standards (IP, SCSI, etc.)
  • FC-4 Video Data: used to transmit streamed video
    between buffers without intermediate storage
  • Link Data: used to support higher level control
    information between N_Ports
  • Three types of link control frames are currently
    defined
  • Link Continue: functions as an acknowledgement
    in Fibre Channel sliding-window based data
    transfer
  • Link Response: used as a negative
    acknowledgement in FC sliding-window based data
    transfer
  • Link Command: a reset command used to
    reinitialize the sliding-window based transfer
    mechanism

32
Fibre Channel: Frames, sequences, and exchanges
  • There is much more to the FC-2 layer than frames
    & classes of service; it defines a set of
    functional building blocks for higher layer
    services
  • Also defines a number of protocols used to
    implement services at a port
  • Typical protocols are creating or terminating a
    connection, transferring data, etc.
  • Protocols consist of an exchange of information
    between N_Ports, which in turn consists of
    sequences, and sequences are composed of a
    related set of frames

33
Fibre Channel: Frames, sequences, and exchanges
(continued)
34
Fibre Channel: Frames, sequences, and exchanges
(continued)
  • Sequences
  • With Fibre Channel a maximum frame size is
    imposed at the FC-2 layer but is transparent to
    higher layers
  • Higher layers send down chunks of data to FC-2,
    which may need to break them up into a sequence
    of frames
  • The sequence of data frames needed to carry a
    single higher-layer chunk of data may also be
    accompanied by one or more link control frames
    for acknowledgement
  • FC-2 provides segmentation & reassembly that
    supports the transmission of sequences as well as
    error control
  • An error in a frame that belongs to a sequence
    causes the retransmission of that whole sequence
    (and any others transmitted after it: go-back-N
    ARQ)
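
The segmentation and reassembly described above can be sketched as follows. This is a simplified illustration, not the FC-2 procedure itself: it uses the 2112-byte maximum data field quoted later in this deck, represents a frame as a plain tuple, and the names `segment` and `reassemble` are hypothetical.

```python
MAX_PAYLOAD = 2112  # maximum FC data field size in bytes (from the frame format slide)


def segment(chunk: bytes, seq_id: int) -> list:
    """Split one higher-layer chunk into (seq_id, seq_cnt, payload) frames."""
    return [
        (seq_id, seq_cnt, chunk[offset:offset + MAX_PAYLOAD])
        for seq_cnt, offset in enumerate(range(0, len(chunk), MAX_PAYLOAD))
    ]


def reassemble(frames: list) -> bytes:
    """Rebuild the chunk by ordering the frames on their sequence count."""
    return b"".join(payload for _, _, payload in sorted(frames, key=lambda f: f[1]))


if __name__ == "__main__":
    data = bytes(range(256)) * 20            # 5120 bytes of sample data
    frames = segment(data, seq_id=7)
    print([len(payload) for _, _, payload in frames])
    print(reassemble(list(reversed(frames))) == data)
```

Because every frame carries the sequence ID and a sequence count, the receiver can rebuild the chunk even if frames arrive out of order, and can name the sequence that must be resent after an error.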

35
Fibre Channel: Frames, sequences, and exchanges
(continued)
  • Exchanges
  • Exchanges are mechanisms for organizing multiple
    sequences into a higher-level construct to allow
    easier interfacing to applications
  • Examples of exchanges are SCSI disk operations
    like a read or write
  • Can involve either a unidirectional or
    bi-directional transfer of sequences
  • Within a given exchange, only a single sequence
    can be active (though sequences from different
    exchanges can be simultaneously active)

36
Fibre Channel: Frames, sequences, and exchanges
(continued)
  • Protocols
  • An exchange is tied to a protocol that provides a
    specific service for higher levels
  • Some common protocols may be used by any
    higher application
  • Fabric Login: executed upon initialization of an
    N_Port; requires the exchange of the N_Port
    address, classes of service supported, and
    flow-control parameters
  • N_Port Login: the exchange of service parameters
    between a pair of N_Ports before data exchange
    (buffer space, service classes supported, etc.)
  • N_Port Logout: the termination of a connection
    between a pair of N_Ports

37
Fibre Channel: Framing & Classes of Service
  • Flow Control
  • Fibre Channel provides a sophisticated set of
    flow control mechanisms at two levels:
    end-to-end and buffer-to-buffer
  • Key concept is credit: negotiated at login, it
    denotes the number of unacknowledged frames
    allowed at any time
  • End-to-End Flow Control
  • Paces the flow of frames between N_Ports
  • Requires acknowledgements to operate, so
    end-to-end flow control can be used only with
    Class 1 and Class 2 services
  • Acknowledgement Types (Class 1 or Class 2
    service)
  • ACK_1: ACKs one data frame; decrements the credit
    count by 1
  • ACK_N: ACKs N data frames; decrements the credit
    count by N
  • ACK_0: acknowledges a whole sequence,
    decrementing the credit count by the number of
    frames in the sequence
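
The end-to-end credit bookkeeping above can be sketched as a small counter. This is an illustration under simplifying assumptions (one sender, no timeouts or retransmission); the class `EndToEndCredit` and its method names are hypothetical.

```python
class EndToEndCredit:
    """Sender-side view of end-to-end credit between two N_Ports."""

    def __init__(self, credit: int):
        self.credit = credit      # limit negotiated at N_Port login
        self.outstanding = 0      # credit count: frames sent but not yet acknowledged

    def can_send(self) -> bool:
        return self.outstanding < self.credit

    def send_frame(self) -> None:
        if not self.can_send():
            raise RuntimeError("credit exhausted; wait for an acknowledgement")
        self.outstanding += 1

    def ack(self, frames: int = 1) -> None:
        # ACK_1 -> frames=1, ACK_N -> frames=N, ACK_0 -> frames in the whole sequence
        self.outstanding = max(0, self.outstanding - frames)


if __name__ == "__main__":
    port = EndToEndCredit(credit=3)
    for _ in range(3):
        port.send_frame()
    print(port.can_send())   # window is closed
    port.ack(frames=3)       # e.g. one ACK_0 covering a 3-frame sequence
    print(port.can_send())   # window is open again
```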

38
Fibre Channel: Flow Control (continued)
  • End-to-End Flow Control (continued)
  • Acknowledgement types cannot be mixed; if ACK_1
    is initially used for a Class 1 connection then
    it must be used for the entire connection
  • Busy & Reject control frames are also used for
    flow control
  • The F_BSY frame indicates the fabric is busy and
    cannot deliver a frame
  • The P_BSY frame indicates the destination port is
    busy and cannot accept a frame; the sender will
    try a predefined number of times to retransmit
    the frame
  • With the Reject (F_RJT and P_RJT) frames,
    delivery of the data frame is being denied (for
    some reason other than congestion)
  • When a frame belonging to a sequence is rejected,
    the whole sequence must be retransmitted

39
Fibre Channel: Flow Control (continued)
  • Buffer-to-buffer Flow Control
  • Operates across a pair of ports connected by a
    point-to-point link; assures that buffers are
    available at either end of the link
  • Applicable to all classes of service (including
    Class 3)
  • A single type of control signal, the R_RDY frame,
    is used for buffer-to-buffer flow control
  • As a data frame is transmitted across the link,
    the sender increments its credit count for the
    link
  • At the receiving port the data frame is buffered
    as received
  • Once the data frame is switched to another port's
    buffer on the switch, the receiving port sends
    back the R_RDY frame to the sending port
  • When the sending port receives the R_RDY frame it
    decrements the credit count, opening its window
    by a frame
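
The buffer-to-buffer steps above can be traced with a toy model of one link. This sketch collapses the sender and the receiving switch port into a single object and treats R_RDY delivery as instantaneous; the class `Link` and its method names are hypothetical.

```python
from collections import deque


class Link:
    """One point-to-point link with buffer-to-buffer credit."""

    def __init__(self, bb_credit: int):
        self.bb_credit = bb_credit    # receive buffers advertised at login
        self.credit_count = 0         # frames sent but not yet answered by R_RDY
        self.rx_buffers = deque()     # frames held at the receiving port

    def send(self, frame) -> bool:
        """Transmit a frame if the window is open; return False otherwise."""
        if self.credit_count >= self.bb_credit:
            return False
        self.credit_count += 1        # sender increments on transmission
        self.rx_buffers.append(frame)
        return True

    def forward_one(self):
        """Receiver switches one frame onward, frees its buffer, and returns R_RDY."""
        frame = self.rx_buffers.popleft()
        self.credit_count -= 1        # sender decrements on receiving R_RDY
        return frame


if __name__ == "__main__":
    link = Link(bb_credit=2)
    print(link.send("frame-a"), link.send("frame-b"), link.send("frame-c"))
    link.forward_one()
    print(link.send("frame-c"))
```

With two buffers advertised, the third send is refused until the receiver has forwarded a frame and returned R_RDY.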

40
Fibre Channel: Framing & Classes of Service
  • Frame Format (Figure 9.10)
  • The Fibre Channel Frame contains five general
    fields
  • Start Delimiter
  • Frame Header
  • Data
  • Cyclic Redundancy Check (CRC)
  • End Delimiter

41
Fibre Channel: Framing & Classes of Service
  • Frame Format - Start of Frame Delimiter
  • The Start of Frame (SOF) delimiter includes a
    four-byte set of non-data symbols denoting the
    start of a frame and allowing synchronization
  • The SOF delimiter comes in several varieties,
    each of which specifies the frame's type and
    class of service
  • Examples are SOF Class 1 connection (SOFc1), SOF
    normal (for data frames), and SOF fabric (for
    control frames in the fabric)

42
Fibre Channel: Framing & Classes of Service
  • Frame Format - FC-2 Frame Header
  • Contains the control data required at this level;
    consists of
  • Routing control: contains two subfields, one for
    frame type (device data, link control, etc.) and
    one for data type in the frame
  • Destination Identifier: destination N_Port or
    F_Port
  • FC uses two levels of addressing: a globally
    unique identifier (world wide port/node names) &
    a lower level port identifier
  • World wide port/node name is used by higher
    layers and for network management
  • Port identifier is the 3-byte address that is
    used for frame routing; it consists of three
    parts: domain, area, and port
  • The hierarchical addressing structure facilitates
    routing and management of the fabric
  • A mechanism for mapping between the two addresses
    is necessary
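
The three-part port identifier can be illustrated by splitting a 24-bit value into its parts. The sketch assumes one byte per part with the domain in the most significant byte, which is the usual convention but is not stated on the slide; `PortId` and `parse_port_id` are hypothetical names.

```python
from typing import NamedTuple


class PortId(NamedTuple):
    domain: int
    area: int
    port: int


def parse_port_id(port_id: int) -> PortId:
    """Split a 3-byte port identifier into domain, area, and port (one byte each, assumed)."""
    if not 0 <= port_id <= 0xFFFFFF:
        raise ValueError("port identifier must fit in 3 bytes")
    return PortId(port_id >> 16, (port_id >> 8) & 0xFF, port_id & 0xFF)


if __name__ == "__main__":
    print(parse_port_id(0x010203))
```

A switch can route on the domain byte alone until a frame reaches the destination switch, which is what makes the hierarchy useful for fabric routing.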

43
Fibre Channel: Framing & Classes of Service
  • Frame Format - FC-2 Frame Header (continued)
  • Contains the control data required at this level;
    consists of
  • Source Identifier: source N_Port or F_Port
  • Type: if the routing control field specifies an
    FC-4 frame, then this field specifies the payload
    protocol (SCSI, IP, etc.)
  • This field and the Routing control field allow
    the destination N_Port to deliver the data to the
    correct higher layer user
  • Frame control: control information relating to
    frame content
  • Is frame a retransmission? Is frame part of a
    sequence?
  • Sequence ID: unique identifier for a sequence,
    used for all frames belonging to it
  • Data Field control: specifies which, if any, of
    four optional headers are present

44
Fibre Channel: Framing & Classes of Service
  • Frame Format - Frame Header (continued)
  • Contains the control data required at this level;
    consists of
  • Sequence count: a unique number assigned
    sequentially to each frame in a sequence (for
    flow control and proper reassembly of frames
    within a sequence)
  • Originator Exchange Identifier: a unique
    identifier assigned to the higher layer initiator
    of an exchange
  • Responder Exchange Identifier: a unique
    identifier assigned to the higher layer
    destination of an exchange
  • Parameter: used in different ways for link
    control and data frames
  • Link control frames carry information specific to
    the control function in this field
  • Data frames may carry an address meaningful to
    the upper layer protocol
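
The header fields listed on the last three slides can be packed into a fixed-size structure. The slides name the fields but not their widths, so the 24-byte layout below (six 32-bit words, with a class-specific control byte ahead of the source identifier) is an assumption recalled from FC-PH and should be checked against the standard; `pack_header`, `unpack_header`, and the sample values are hypothetical.

```python
import struct

# Assumed layout: R_CTL(1) D_ID(3) | CS_CTL(1) S_ID(3) | TYPE(1) F_CTL(3)
#                 | SEQ_ID(1) DF_CTL(1) SEQ_CNT(2) | OX_ID(2) RX_ID(2) | PARAMETER(4)
HEADER_FORMAT = ">IIIBBHHHI"


def pack_header(r_ctl, d_id, s_id, fc_type, f_ctl, seq_id, df_ctl,
                seq_cnt, ox_id, rx_id, parameter, cs_ctl=0) -> bytes:
    """Pack the FC-2 header fields into 24 big-endian bytes (layout is an assumption)."""
    word0 = (r_ctl << 24) | d_id
    word1 = (cs_ctl << 24) | s_id
    word2 = (fc_type << 24) | f_ctl
    return struct.pack(HEADER_FORMAT, word0, word1, word2,
                       seq_id, df_ctl, seq_cnt, ox_id, rx_id, parameter)


def unpack_header(raw: bytes) -> dict:
    """Inverse of pack_header; returns the fields as a dictionary."""
    w0, w1, w2, seq_id, df_ctl, seq_cnt, ox_id, rx_id, parameter = struct.unpack(
        HEADER_FORMAT, raw)
    return {
        "r_ctl": w0 >> 24, "d_id": w0 & 0xFFFFFF,
        "cs_ctl": w1 >> 24, "s_id": w1 & 0xFFFFFF,
        "type": w2 >> 24, "f_ctl": w2 & 0xFFFFFF,
        "seq_id": seq_id, "df_ctl": df_ctl, "seq_cnt": seq_cnt,
        "ox_id": ox_id, "rx_id": rx_id, "parameter": parameter,
    }


if __name__ == "__main__":
    header = pack_header(r_ctl=0x06, d_id=0x010203, s_id=0x0A0B0C, fc_type=0x08,
                         f_ctl=0x290000, seq_id=1, df_ctl=0, seq_cnt=5,
                         ox_id=0x1234, rx_id=0xFFFF, parameter=0)
    print(len(header), unpack_header(header)["d_id"] == 0x010203)
```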

45
Fibre Channel: Framing & Classes of Service
  • Frame Format - Data Field
  • Contains user data in multiples of four bytes,
    up to a maximum of 2112 bytes
  • Can also include one or more optional headers
    whose presence is denoted in the Data Field
    control field
  • Optional Expiration Security header: can carry a
    frame expiration date plus other security data
    over and above the FC-PH standard
  • Optional Network Header: may be used by a bridge
    or gateway node interfacing to an external
    network to allow tunneling (includes 8-byte
    source and destination network addresses)
  • Optional Association Header: may help specify an
    upper layer process (or group of processes)
    associated with an exchange
  • Optional Device Header: if used, the format is
    specified by the upper layer protocol used with
    the frame

46
Fibre Channel: Framing & Classes of Service
  • Frame Format - CRC & End Delimiter
  • CRC field: the error detection algorithm is the
    same 32-bit CRC used with FDDI and IEEE 802
  • End of Frame Delimiter
  • A four-byte field denoting the end of the frame
  • The EOF field may be modified by a switch in the
    fabric if it finds an error in the frame or some
    other condition that invalidates the frame
  • There are three different EOF delimiters for
    valid frames
  • EOFt: denotes the end of a valid sequence
  • EOFdt: used with Class 1 service to indicate
    that the frame is the last frame on the logical
    connection (i.e., the connection is being
    terminated)
  • EOFn: used to denote successful transmission of
    frames not covered by the first two
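
The role of the CRC field can be illustrated with Python's `zlib.crc32`, which implements the 32-bit CRC polynomial used by IEEE 802. This is a sketch of the error-detection idea only: it does not reproduce the bit ordering or the exact coverage of the CRC on a real Fibre Channel link, and `append_crc` and `crc_ok` are hypothetical helpers.

```python
import zlib


def append_crc(frame_body: bytes) -> bytes:
    """Append a 4-byte CRC-32 computed over the header and data field."""
    return frame_body + zlib.crc32(frame_body).to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC at the receiver and compare it with the trailing 4 bytes."""
    body, received = frame[:-4], int.from_bytes(frame[-4:], "big")
    return zlib.crc32(body) == received


if __name__ == "__main__":
    frame = append_crc(b"header+payload")
    print(crc_ok(frame))

    damaged = bytearray(frame)
    damaged[0] ^= 0x01                     # flip one bit in transit
    print(crc_ok(bytes(damaged)))
```

A switch that sees the check fail can mark the frame invalid by changing the EOF delimiter, as the slide notes, rather than silently forwarding bad data.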

47
Fibre Channel: Examples of Equipment
  • Fibre Channel Equipment Manufacturers
  • High-end (Director-Class) Switches
  • Brocade Silkworm 2400
    (http://www.brocade.com/products-solutions/products/directors/product-details/48000-director/index.page)
  • Cisco MDS 9513
    (http://www.cisco.com/en/US/products/ps6780/index.html)
  • Low-end (Edge) Switches
  • EMC DS-300B
    (http://www.emc.com/collateral/hardware/specification-sheet/h5528-connectrix-ds300b-ss.pdf)
  • Cisco MDS 9134
    (http://www.cisco.com/en/US/products/ps8414/index.html)
  • Host-Bus Adapters (HBA)
  • HP Storageworks 8-Gbps PCIe HBA
    (http://h18006.www1.hp.com/products/storageworks/fc81q_pci/index.html)
  • Qlogic QLA2340 2-Gbps PCI-X
    (http://www.qlogic.com/Products/SAN_products_FCHBA_QLA2340.aspx)

48
High Performance Computing: Introduction
  • Throughout the 1960s, 1970s, & 1980s,
    supercomputers defined the high end of computing
    performance
  • Within the past decade the notion of gluing
    collections of lower powered computers together
    to harness their collective power has been a
    force in the HPC arena
  • Come in two general forms
  • Clusters
  • Grids
  • Clusters are usually homogeneous resources owned
    and maintained by a single organization
  • Grids are usually heterogeneous and dynamic
    resources, distributed & many times shared among
    organizations
  • The key in both situations is network
    interconnections!

49
High Performance Computing: Introduction to
Infiniband
  • Several high performance network technologies
    have been developed for HPC connectivity
  • Infiniband
  • A promising technology in both HPC & SAN
    environments
  • Developed by the Infiniband Trade Association, a
    vendor consortium with 190 members (notably
    Dell, Sun, IBM, HP)
  • Goal is to provide an extensible high-speed,
    low-latency interconnection platform that is cost
    effective in a number of scenarios
  • Version 1.0 ratified in Sept 2000, currently at
    version 1.2.1
  • Specification provides a comprehensive protocol
    architecture through the transport layer
    (including management)

50
High Performance Computing Interconnects:
Infiniband Protocol Architecture
51
High Performance Computing Interconnects:
Infiniband
  • Physical Layer
  • Overall architecture designed on switch-based P2P
    links
  • Individual links based on 4-wire 2.5-Gbps (1x)
    full-duplex connection
  • Higher speed interfaces use multiple 1x links
    (e.g., a 4x link has 16 wires and runs at 10-Gbps
    full-duplex)
  • Links use 8B/10B encoding; 1x throughput is
    2-Gbps
  • Copper Links
  • PCB/bus links run at maximum of 30 inches
  • TP copper links run up to 17 m; currently spec'd
    for 4x and 12x
  • Mechanical connectors and cables defined in
    specification
  • Fiber Links
  • SX (850nm MM) and LX (1310nm SM) versions of 1x,
    4x, 12x
  • Use multiple lanes like copper except for 4x-LX
    (10GBase-LR)
  • Variety of Physical Connectors specified (MPO,
    SC, etc.)
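
The link-width arithmetic on this slide (wires, raw rate, and data rate after 8B/10B) can be tabulated for the 1x, 4x, and 12x widths. A minimal sketch using only the figures quoted above; `link_figures` is a hypothetical helper.

```python
LANE_RAW_GBPS = 2.5    # 1x signalling rate per direction
WIRES_PER_LANE = 4     # a 1x link is a 4-wire connection


def link_figures(width: int) -> dict:
    """Wire count, raw rate, and 8B/10B data rate for an N-x link."""
    raw = width * LANE_RAW_GBPS
    return {
        "wires": width * WIRES_PER_LANE,
        "raw_gbps": raw,
        "data_gbps": raw * 8 / 10,   # 8 data bits per 10 line bits
    }


if __name__ == "__main__":
    for width in (1, 4, 12):
        print(f"{width}x: {link_figures(width)}")
```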

52
High Performance Computing Interconnects:
Infiniband
  • Data Link Layer
  • Defines Management & Data Packets (Max. size of
    4kB)
  • Within a subnet, packets are forwarded using the
    LID assigned to the interface
  • Each link has up to 16 Virtual Lanes (VLs),
    VL-0 to VL-15
  • Subnet Management packets have highest priority
    (VL-15)
  • Allows QoS schemes; only VL-0 and VL-15 are
    mandatory
  • Each data packet has a Service Level (SL) used to
    map it to a VL
  • Unicast and Multicast support
  • Uses a credit-based flow control scheme
  • Two CRCs for comprehensive error detection
  • Link-based 16-bit CRC checked hop-to-hop
  • End-to-end invariant 32-bit CRC checks fields
    that do not change hop-by-hop
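
The Service Level to Virtual Lane mapping can be sketched as a table lookup. The mapping table here is invented for illustration (in a real subnet it is configured by the Subnet Manager), and falling back to VL-0 for unmapped levels is an assumption; `select_vl` is a hypothetical helper.

```python
MGMT_VL = 15   # subnet management traffic always uses VL-15


def select_vl(service_level: int, sl_to_vl: dict, is_subnet_mgmt: bool = False) -> int:
    """Pick the virtual lane for a packet from its service level."""
    if is_subnet_mgmt:
        return MGMT_VL
    # Unmapped levels fall back to VL-0, the one data lane every port must support
    return sl_to_vl.get(service_level, 0)


if __name__ == "__main__":
    # Hypothetical table for a port that implements data lanes VL-0 and VL-1
    table = {0: 0, 1: 0, 2: 1, 3: 1}
    print(select_vl(2, table))
    print(select_vl(9, table))
    print(select_vl(0, table, is_subnet_mgmt=True))
```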

53
High Performance Computing Interconnects:
Infiniband
  • Network Layer
  • Defines Management & Data Packets (Max. size of
    4kB)
  • Within a subnet, packets are forwarded using the
    LID assigned to the interface (Max. 65k nodes per
    subnet)
  • Outside a subnet, packets are routed using
    GRH/GID
  • Routers vs. Switches
  • Transport Layer
  • Defines five different services & associated PDUs
  • Reliable Connection
  • Reliable Datagram
  • Unreliable Connection
  • Unreliable Datagram
  • Raw Datagram

54
High Performance Computing Interconnects:
Infiniband
  • Application Layer
  • A number of service interfaces and verbs are
    defined
  • User interfaces follow the VIA (Virtual Interface
    Architecture) specification
  • Management
  • Includes two general management
    packages/protocols
  • Subnet Manager (SM)
  • Configures the local subnet & provides essential
    services
  • LID assignment, SL-to-VL mapping, Link failover,
    etc.
  • All devices must talk to SM, can have standby SMs
  • SM traffic is highest priority (VL-15) with no
    flow control
  • General Services Interface (GSI)
  • Other operations like out-of-band mgmt., chassis
    mgmt.
  • Lower priority and subject to flow control

55
High Performance Computing Interconnects:
Infiniband
  • Product Examples (Low-End): Cisco SFS-7000P
  • Provides 24 fixed Infiniband 4x ports (10-Gbps)
    in 1U enclosure
  • 480-Gbps switch fabric, non-blocking, 200ns
    latency
  • Integrated Subnet Manager, IBTA v1.2 compliant
  • Managed via Web, Java clients, or command line
  • List price (2/2006): $14,495

56
High Performance Computing Interconnects:
Infiniband
  • Product Examples (High-End): Voltaire ISR 9288
  • Up to 288 Infiniband 4x ports in a 14U modular
    enclosure
  • Up to 11.5-Tbps bandwidth, non-blocking, with
    420ns latency
  • Integrated subnet manager, comprehensive mgmt.
    packages
  • Multiple redundant components & failover options
  • IBTA v1.1 compliant
  • GSA price: $43,000 for 24 ports

57
High Performance Computing Interconnects:
Infiniband
  • Uses

58
High Performance Computing: Other choices for
interconnects
  • Other Alternatives for Interconnect Technologies
  • Myrinet
  • ANSI standard protocol architecture with
    limited support beyond one company (Myricom)
  • Low-latency, full-duplex switch fabric; 2- &
    10-Gbps links
  • Older copper (HSSDC, LAN, SAN) connections;
    fiber preferred
  • Comprises 2% of current Top500 list (28% for
    Infiniband)
  • Q-net/Quadrics (1% of Top500)
  • Proprietary high-speed (900-MBps) low latency
    interconnect
  • Copper/Fiber connections up to 100m
  • Fat-tree switch fabric up to 4096 nodes
  • Gigabit/10-Gigabit Ethernet (56% of Top500)
  • The interconnect of choice because of cost &
    convenience
  • Latency can be an issue; so can frame size

59
SANs, Fibre Channel, & HPC Interconnects: Reading
  • Reading
  • This module's material: Stallings chapter 9
  • Next module: Last-Mile Technologies