iSCSI Extensions for RDMA (iSER)

1
iSCSI Extensions for RDMA (iSER)
  • draft-ko-iwarp-iser-02
  • Mike Ko
  • IBM
  • August 2, 2004

2
Agenda
  • What is iSER?
  • iSER connection setup
  • Open issues
  • iSER flow control
  • Open issues

3
iSCSI Datamover with RDMA Extensions
  • The Datamover Architecture defines an abstract
    model in which the movement of data between iSCSI
    end nodes is logically separated from the rest of
    the iSCSI protocol
  • Allows a datamover protocol layer to offload the
    tasks of data movement and placement from the
    iSCSI layer
  • The iSCSI Extensions for RDMA (iSER) protocol is
    one such datamover protocol
  • Applies the Datamover Architecture in extending
    the data transfer capabilities of iSCSI to
    include RDMA (Remote Direct Memory Access) as
    defined in the iWARP protocol suite
  • Allows iSCSI implementations to have data
    transfers which achieve true zero copy behavior
    using generic RDMA network interface controllers
    (RNICs)

[Diagram: protocol stack, top to bottom: SCSI, iSCSI, Datamover Interface, iSER, Verbs, RDMAP, DDP, MPA, TCP; the RDMAP, DDP, and MPA layers are labeled iWARP]

4
Connection Setup for iSER-assisted Mode at the
Initiator
  • Negotiated key values may be passed by the iSCSI
    layer to the iSER layer by invoking the
    Notice_Key_Values Operational Primitive
  • Before sending the final Login Request, the iSCSI
    layer invokes the Allocate_Connection_Resources
    Operational Primitive to request the iSER layer
    to allocate the iWARP resources for the
    connection
  • After the target returns the final Login
    Response, the iSCSI layer at the initiator
    invokes the Enable_Datamover Operational
    Primitive to request the iSER layer to transition
    into iSER-assisted mode
  • The first message sent by the iSER layer at the
    initiator to the target is the iSER Hello Message

5
Connection Setup for iSER-assisted Mode at the
Target
  • Negotiated key values may be passed by the iSCSI
    layer to the iSER layer by invoking the
    Notice_Key_Values Operational Primitive
  • Before sending the final Login Response, the
    iSCSI layer invokes the
    Allocate_Connection_Resources Operational
    Primitive to request the iSER layer to allocate
    the iWARP resources for the connection
  • The iSCSI layer invokes the Enable_Datamover
    Operational Primitive to enable the iSER mode
    qualified with the final Login Response PDU
  • The iSER layer sends the final Login Response PDU
    in byte stream mode and then transitions into
    iSER-assisted mode
  • After receiving the iSER Hello Message from the
    initiator, the iSER layer at the target responds
    by sending the iSER HelloReply Message

6
Example of Successful iSER Connection Setup
  • A. iSCSI Login Request PDU with RDMAExtensions=Yes
  • B. iSCSI Login Response PDU with RDMAExtensions=Yes
  • C. Optional Notice_Key_Values to pass values of
    negotiated keys
  • D. Allocate_Connection_Resources to set up iWARP
    resources
  • E. iSCSI Login Request PDU with T=1 and
    NSG=FullFeaturePhase
  • F. Enable_Datamover to go into iSER mode (send
    last iSCSI PDU in byte stream mode)
  • G. iSCSI Login Response PDU in byte stream mode
    with T=1 and NSG=FullFeaturePhase
  • H. iWARP Send Message containing iSER Hello
  • J. iWARP Send Message containing iSER HelloReply

[Diagram: exchange between the iSCSI Layer and iSER Layer at the initiator and at the target; the steps appear in the order A, B, ..., C, D, E, C, D, F, G, F, H, J]

7
Negotiation of RDMAExtensions in Leading
Connection Only
  • From section 2.3 of the iSER draft: iSER-assisted
    mode is negotiated during the iSCSI Login for
    each connection, but an entire iSCSI session MUST
    operate in one mode ...
  • Question: Since RDMAExtensions is leading-only,
    this statement is incorrect
  • Proposed change:
  • Replace the sentence with: iSER-assisted mode is
    negotiated during the iSCSI Login for each
    session, and an entire iSCSI session MUST operate
    in one mode ...

8
CRC32C Protection in the Layer Below iSER
  • From section 5.1 of the iSER draft: when the
    RDMAExtensions key is negotiated to "Yes", the
    HeaderDigest and the DataDigest keys MUST be
    negotiated to "None" ... because ... the iWARP
    protocol suite provides a CRC32c-based error
    detection for all iWARP Messages
  • Recent updates to the MPA draft render the use
    of CRC optional
  • Disabling of CRCs should only be done when it is
    clear that the connection through the network has
    data integrity at least as good as a CRC
  • The RDDP WG's position is that all ULPs can
    assume CRC-level or equivalent data protection
  • Proposed change: Add the explicit requirement
    that end-to-end CRC32C-based error detection or
    equivalent be provided in a layer below iSER
9
Order of RDMAExtensions Key Negotiation and
Allocate_Connection_Resources
  • From section 5.1.1 (and similarly for section
    5.1.2): If the outcome of the iSCSI negotiation
    is to enable iSER-assisted mode, then on the
    initiator side, ... the iSCSI Layer MUST invoke
    the Allocate_Connection_Resources Operational
    Primitive
  • Question: The alternative approach of invoking
    Allocate_Connection_Resources before negotiating
    for iSER-assisted mode should be allowed
  • The current approach results in the connection
    being torn down if the required resources cannot
    be allocated
  • The alternative approach avoids this problem
  • Resources must be deallocated if login fails
  • Resources may have to be deallocated if the
    negotiated values are less than the allocated
    value
  • Proposed change: Update the draft to allow the
    alternative approach with the proviso that it is
    the responsibility of the implementation to
    deallocate the resources if the login fails or if
    the negotiated values are less than the
    allocated value
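
The alternative approach and its proviso can be sketched as follows. `ResourcePool`, `LoginFailed`, and the `negotiate` callback are hypothetical names introduced for illustration; the sketch assumes resources are counted in simple units and that negotiation never yields more than was offered.

```python
class LoginFailed(Exception):
    """Raised by the negotiate callback when login does not complete."""


class ResourcePool:
    """Hypothetical counter of iWARP resources available to connections."""

    def __init__(self, capacity):
        self.free = capacity

    def allocate(self, count):
        if count > self.free:
            raise MemoryError("insufficient iWARP resources")
        self.free -= count
        return count

    def release(self, count):
        self.free += count


def login_with_early_allocation(pool, wanted, negotiate):
    """Allocate first, negotiate second, then give back any excess."""
    # Allocation failure surfaces here, before iSER-assisted mode is offered.
    held = pool.allocate(wanted)
    try:
        agreed = negotiate(held)
    except LoginFailed:
        pool.release(held)  # login failed: deallocate everything
        raise
    if agreed < held:
        pool.release(held - agreed)  # negotiated value is below the allocation
        held = agreed
    return held
```

With a pool of 16 units, a request for 8, and a negotiation that settles on 4, the function returns 4 and leaves 12 units free.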

10
Clarification on the Usage of the
Notice_Key_Values Primitive
  • From section 5.1.1: Optionally, the iSCSI Layer
    MAY invoke the Notice_Key_Values Operational
    Primitive before invoking the
    Allocate_Connection_Resources Operational
    Primitive
  • Question: The word "Optionally" is ambiguous
  • It could mean the iSCSI layer may choose to
    invoke the primitive
  • Or that the iSCSI layer may choose to use that
    primitive, or some other defined or undefined
    primitive
  • Proposed change: Remove the word "Optionally"

11
Requiring the Use of the Notice_Key_Values
Primitive
  • From section 5.1.1: The iSCSI Layer MAY invoke
    the Notice_Key_Values Operational Primitive to
    request the iSER Layer to take note of the
    negotiated values of the iSCSI keys for the
    Connection
  • Question: The word MAY should be replaced with
    MUST to enforce the invocation of the primitive
  • Proposed change: None
  • If the default values are accepted for all the
    negotiated keys, then there is no new information
    to be passed from the iSCSI layer to the iSER
    layer
  • Requiring a "MUST" instead of a "MAY" would
    require this primitive to be invoked even though
    it is not necessary
  • Also, it is not architecturally required for the
    iSCSI layer to issue the Notice_Key_Values
    primitive

12
HeaderDigest, DataDigest, OFMarker, IFMarker in
iSER-assisted Mode
  • From sections 6.1 and 6.6: These 4 keys must be
    negotiated to "None" or "No" if the
    RDMAExtensions key is negotiated to "Yes"
  • Question: The draft seems to imply that these 4
    keys must be negotiated even for the defaults
  • Suggestion: Negotiations resulting in
    RDMAExtensions=Yes for a session imply
    HeaderDigest=None, DataDigest=None, OFMarker=No,
    and IFMarker=No on all connections in that
    session
  • This overrides both the default and explicit
    settings
  • Proposed change: Update the draft to reflect the
    suggested change
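
The suggested rule amounts to a simple override applied after negotiation. A minimal sketch, assuming negotiated keys are held in a plain dictionary of strings (the function and constant names are hypothetical):

```python
# Values implied on every connection once RDMAExtensions=Yes is negotiated.
IMPLIED_BY_RDMA_EXTENSIONS = {
    "HeaderDigest": "None",
    "DataDigest": "None",
    "OFMarker": "No",
    "IFMarker": "No",
}


def effective_keys(negotiated):
    """Return the key values in force, applying the implied overrides."""
    keys = dict(negotiated)  # leave the caller's dictionary untouched
    # RDMAExtensions defaults to "No" when it was never negotiated.
    if keys.get("RDMAExtensions", "No") == "Yes":
        # Overrides both defaults and explicitly negotiated settings.
        keys.update(IMPLIED_BY_RDMA_EXTENSIONS)
    return keys
```

For example, `effective_keys({"RDMAExtensions": "Yes", "HeaderDigest": "CRC32C"})` yields `HeaderDigest` set to `"None"`, while a session without RDMAExtensions keeps whatever was negotiated.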

13
Scope of RDMAExtensions Key
  • From section 6.3: The RDMAExtensions key has
    session-wide scope
  • Question: Should iSER support mixed mode
    sessions?
  • Argument for:
  • Open an iSCSI connection when there are
    insufficient resources to support an
    iSER-assisted connection in allegiance
    reassignment and the session is in iSER-assisted
    mode
  • Flexibility on general principles
  • Argument against:
  • RFC 3720 assumes homogeneous connections in a
    session
  • Introducing mixed mode sessions would require
    that the RFC 3720 semantics be carefully thought
    through to ensure correctness
  • The task states maintained by an iSCSI connection
    may be different from those for an iSER-assisted
    connection
  • An iSER-assisted connection may require different
    LO (leading-only) key values for optimization
    compared with an iSCSI connection
  • Test and debug effort will increase 2x to 3x for
    mixed mode support
  • Proposed change: None

14
Clarification on the Order of RDMAExtensions Key
Negotiation
  • From section 6.3: If the RDMAExtensions key is
    to be negotiated, it must be offered only on the
    initial Login Request PDU or Login Response PDU
    of the leading connection, and if offered, the
    response must be sent in the immediately
    following Login Response or Login Request PDU
    respectively.
  • Question: Clarify when the negotiation response
    is to be returned if the key is offered in a PDU
    where the C-bit is set
  • Question: Clarify that the negotiation takes
    place in the LoginOperationalNegotiation stage of
    the leading connection
  • Question: Section 5.2.2 of RFC 3720 states that a
    response is optional if the Boolean function is
    "AND" and the value "No" is received
  • The iSER draft always requires a response to be
    returned
  • However, since the default for RDMAExtensions is
    "No", it is unlikely that the key=value pair of
    RDMAExtensions=No will be offered
15
Clarification on the Order of RDMAExtensions Key
Negotiation (cont.)
  • Proposed change: Replace the sentence with:
    However, if the RDMAExtensions key is to be
    negotiated, an initiator MUST offer the key on
    the first Login Request PDU in the
    LoginOperationalNegotiation stage of the leading
    connection, and a target MUST offer the key on
    the first Login Response PDU with which it is
    allowed to do so (i.e., the first Login Response
    issued after the first Login Request with the C
    bit set to 0) in the LoginOperationalNegotiation
    stage of the leading connection. In response to
    the offered key=value pair of RDMAExtensions=Yes,
    an initiator MUST respond on the next Login
    Request PDU with which it is allowed to do so,
    and a target MUST respond on the next Login
    Response PDU with which it is allowed to do so.
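
The target-side rule in the proposed text ("the first Login Response issued after the first Login Request with the C bit set to 0") can be illustrated with a short helper. The function name and the list-of-bits input are assumptions made for the sketch; the list is taken to hold the C bits of the Login Requests received in the LoginOperationalNegotiation stage of the leading connection, in arrival order.

```python
def target_offer_position(request_c_bits):
    """Index of the Login Request whose Login Response carries the offer.

    Returns None if every request seen so far still has the C bit set,
    meaning the target is not yet allowed to offer the key.
    """
    for index, c_bit in enumerate(request_c_bits):
        if c_bit == 0:
            return index
    return None
```

For requests with C bits `[1, 1, 0, 0]`, the target's offer belongs in the response to the third request (index 2).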

16
Order of RDMAExtensions Key Negotiation Response
  • From section 6.3: The RDMAExtensions key must be
    offered for negotiation in the first PDU in
    which a node is allowed to do so, and the
    response must be returned in the immediately
    following PDU in which a node is allowed to
    respond
  • Question: Why must the RDMAExtensions key be
    negotiated first?
  • Negotiating the RDMAExtensions key first allows a
    node to optimally negotiate the value of other
    keys
  • Certain iSCSI keys such as MaxBurstLength,
    MaxOutstandingR2T, ErrorRecoveryLevel,
    InitialR2T, ImmediateData, etc., may have
    different optimization points depending on
    whether iSER-assisted mode is to be enabled in
    the iSCSI session
  • Proposed change: Update the draft to include the
    rationale for the order requirement

17
Key Ordering Within a PDU
  • From section 6.3: The RDMAExtensions key must
    precede any other login keys which may be
    affected by the outcome of the negotiation of the
    RDMAExtensions key
  • Question: This can be interpreted as requiring
    key ordering within a PDU, which is contrary to
    RFC 3720
  • Proposed change: Remove the sentence from the
    draft

18
iSER Flow Control
  • For RDMA Send Type Messages:
  • The iSER protocol does not provide additional
    flow control beyond that provided by the iSCSI
    layer on control-type PDUs
  • An implementation should be able to take
    advantage of iWARP Verbs mechanisms such as the
    Shared Receive Queue mechanism to effectively
    address the Send Message flow control question
  • For RDMA Read Resources:
  • In the iSER Hello Message, the iSER layer at the
    initiator declares the maximum number of RDMA
    Read Requests that the initiator can receive on
    the particular RDMAP Stream (iSER-IRD) to the
    target
  • This allows the iSER layer at the target to
    adjust its resources if it can issue more RDMA
    Read Requests than the initiator can handle
  • In the iSER HelloReply Message, the iSER layer at
    the target declares the maximum number of RDMA
    Read Requests that the target can issue on a
    particular RDMAP Stream (iSER-ORD) to the
    initiator
  • This allows the iSER layer at the initiator to
    adjust its resources if it can handle more RDMA
    Read Requests than the target can issue
  • The iSER layer at the target will flow control
    the RDMA Read Request Messages to not exceed
    iSER-ORD
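
The RDMA Read flow control described above can be sketched as a target-side limiter. The class name, the queueing of excess requests, and the choice to cap iSER-ORD at the initiator's iSER-IRD are assumptions made for illustration; the slide states only that the target adjusts its resources and does not exceed iSER-ORD.

```python
from collections import deque


class RdmaReadThrottle:
    """Hypothetical target-side limiter for outstanding RDMA Read Requests."""

    def __init__(self, target_max_reads, initiator_ird):
        # Assumed adjustment: never plan to issue more reads than the
        # initiator declared it can receive in its Hello Message.
        self.iser_ord = min(target_max_reads, initiator_ird)
        self.outstanding = 0
        self.pending = deque()

    def submit(self, request):
        """Issue the request now if allowed; otherwise queue it."""
        if self.outstanding < self.iser_ord:
            self.outstanding += 1
            return True
        self.pending.append(request)
        return False

    def complete(self):
        """Record one completed read and issue the next queued one, if any."""
        if self.outstanding == 0:
            raise RuntimeError("no RDMA Read Request is outstanding")
        self.outstanding -= 1
        if self.pending:
            self.outstanding += 1
            return self.pending.popleft()
        return None
```

With `RdmaReadThrottle(8, 2)`, the third submitted request is queued rather than issued, and it goes out only when one of the first two completes.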

19
Flow Control for Control-Type PDU
  • From section 8.1: The iSER Layer SHOULD
    provision enough Untagged buffers for handling
    incoming RDMAP Send Message Types to prevent a
    buffer underrun condition
  • Question: Should some form of send side flow
    control be established for iSCSI control-type
    PDUs?
  • The latest DDP draft, draft-ietf-rddp-ddp-02, no
    longer mandates that a DDP stream be disabled for
    a buffer underrun condition
  • Proposed change: Further discussion is needed