Bug 600481

Summary: REGRESSION: F13 Fails to Boot with Conforming iSCSI Root
Product: [Fedora] Fedora
Reporter: Mike Hayward <mh-fedora>
Component: iscsi-initiator-utils
Assignee: Mike Christie <mchristi>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Priority: low
Version: 13
CC: agrover, hdegoede, mchristi
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2010-06-07 09:59:30 UTC

Description Mike Hayward 2010-06-04 19:32:16 UTC
The iSCSI initiator implements connection reinstatement incorrectly.

F12 boots fine with an iSCSI root, presumably because it never drops a
connection and so never exercises reinstatement.  The F13 initramfs
mounts the iSCSI root, but while the rc scripts run, iscsid drops the
connection to the target root device, and connection reinstatement
then sends an illegal iSCSI frame.  Perhaps some targets perform
reinstatement when a new session is requested, but such behavior is
counter to the iSCSI spec and could lead to data corruption.

Below are selected, annotated tshark traces which show the protocol
violation.  I'd love to be able to use F13 or maybe a future version
with my iSCSI SAN.

Presumably this reinstatement defect also exists in F12 and prior
versions but simply isn't encountered there, since reinstatement never
runs.  This should probably be reported upstream to the open-iscsi
developers.

I tried reinstalling twice with updates and updates-testing, but the
installer fails with complaints about "Unable to read package metadata
from repository".  The same update install worked fine with F12.  The
network is configured correctly, since the iSCSI volume is clearly
formatted before the installer reaches this step.  So I have no way to
test whether any updates have resolved the issue.

Can someone confirm with the latest iscsi-initiator-utils that the
initiator is incorrectly sending a zero TSIH when attempting to
reinstate a connection?

----------------------------------------------------------------------
Frame 6 (526 bytes on wire, 526 bytes captured)
...
Transmission Control Protocol, Src Port: 48022 (48022), Dst Port: iscsi-target (3260), Seq: 49, Ack: 1, Len: 460
...
iSCSI (Login Command)
    Opcode: Login Command (0x03)
    1... .... = T: Transit to next login stage
    .0.. .... = C: Text is complete
    .... 01.. = CSG: Operational negotiation (0x01)
    .... ..11 = NSG: Full feature phase (0x03)
    VersionMax: 0x00
    VersionMin: 0x00
    TotalAHSLength: 0x00
    DataSegmentLength: 0x000001c9
    ISID: 00023D010000
        00.. .... = ISID_t: IEEE OUI (0x00)
        ..00 0000 = ISID_a: 0x00
        ISID_b: 0x023d
        ISID_c: 0x01
        ISID_d: 0x0000
    TSIH: 0x0000
    InitiatorTaskTag: 0x00000000
    CID: 0x0000
    CmdSN: 0x00000000
    ExpStatSN: 0x00000000
    Key/Value Pairs
        KeyValue: InitiatorName=iqn.1994-05.com.fedora:01.739587
        KeyValue: InitiatorAlias=fc13san
        KeyValue: TargetName=iqn.2003-08.com.sanify:vid.f72bca2a57ecf55e.1
        KeyValue: SessionType=Normal
        KeyValue: HeaderDigest=None
        KeyValue: DataDigest=None
        KeyValue: DefaultTime2Wait=2
        KeyValue: DefaultTime2Retain=0
        KeyValue: IFMarker=No
        KeyValue: OFMarker=No
        KeyValue: ErrorRecoveryLevel=0
        KeyValue: InitialR2T=No
        KeyValue: ImmediateData=Yes
        KeyValue: MaxBurstLength=16776192
        KeyValue: FirstBurstLength=262144
        KeyValue: MaxOutstandingR2T=1
        KeyValue: MaxConnections=1
        KeyValue: DataPDUInOrder=Yes
        KeyValue: DataSequenceInOrder=Yes
        KeyValue: MaxRecvDataSegmentLength=131072
    Padding: 000000

Frame 8 (446 bytes on wire, 446 bytes captured)
...
iSCSI (Login Response)
    Opcode: Login Response (0x23)
    1... .... = T: Transit to next login stage
    .0.. .... = C: Text is complete
    .... 01.. = CSG: Operational negotiation (0x01)
    .... ..11 = NSG: Full feature phase (0x03)
    VersionMax: 0x00
    VersionActive: 0x00
    TotalAHSLength: 0x00
    DataSegmentLength: 0x0000014c
    ISID: 00023D010000
        00.. .... = ISID_t: IEEE OUI (0x00)
        ..00 0000 = ISID_a: 0x00
        ISID_b: 0x023d
        ISID_c: 0x01
        ISID_d: 0x0000
    TSIH: 0x0041
    InitiatorTaskTag: 0x00000000
    StatSN: 0x00000000
    ExpCmdSN: 0x00000000
    MaxCmdSN: 0x00000040
    Status: Success (0x0000)
    Key/Value Pairs
        KeyValue: DataDigest=None
        KeyValue: DataPDUInOrder=Yes
        KeyValue: DataSequenceInOrder=Yes
        KeyValue: DefaultTime2Retain=0
        KeyValue: DefaultTime2Wait=0
        KeyValue: ErrorRecoveryLevel=0
        KeyValue: FirstBurstLength=262144
        KeyValue: HeaderDigest=None
        KeyValue: IFMarker=No
        KeyValue: ImmediateData=Yes
        KeyValue: InitialR2T=Yes
        KeyValue: MaxBurstLength=524288
        KeyValue: MaxConnections=1
        KeyValue: MaxOutstandingR2T=1
        KeyValue: MaxRecvDataSegmentLength=65536
        KeyValue: OFMarker=No
        KeyValue: TargetPortalGroupTag=1

Frame 11772 (66 bytes on wire, 66 bytes captured)
...
Transmission Control Protocol, Src Port: 48022 (48022), Dst Port: iscsi-target (3260), Seq: 286862, Ack: 27753774, Len: 0
    Source port: 48022 (48022)
    Destination port: iscsi-target (3260)
    [Stream index: 0]
    Sequence number: 286862    (relative sequence number)
    Acknowledgement number: 27753774    (relative ack number)
    Header length: 32 bytes
    Flags: 0x10 (ACK)
        0... .... = Congestion Window Reduced (CWR): Not set
        .0.. .... = ECN-Echo: Not set
        ..0. .... = Urgent: Not set
        ...1 .... = Acknowledgement: Set
        .... 0... = Push: Not set
        .... .0.. = Reset: Not set
        .... ..0. = Syn: Not set
        .... ...0 = Fin: Not set

The TCP session is hung; no further frame is ever seen on it.

Frame 11778 (230 bytes on wire, 230 bytes captured)
...
Transmission Control Protocol, Src Port: 48023 (48023), Dst Port: iscsi-target (3260), Seq: 49, Ack: 1, Len: 164
    Source port: 48023 (48023)
    Destination port: iscsi-target (3260)
    [Stream index: 1]
    Sequence number: 49    (relative sequence number)
    [Next sequence number: 213    (relative sequence number)]
...
iSCSI (Login Command)
    Opcode: Login Command (0x03)
    0... .... = T: Stay in current login stage
    .0.. .... = C: Text is complete
    .... 00.. = CSG: Security negotiation (0x00)
    VersionMax: 0x00
    VersionMin: 0x00
    TotalAHSLength: 0x00
    DataSegmentLength: 0x000000a2
    ISID: 00023D010000
        00.. .... = ISID_t: IEEE OUI (0x00)
        ..00 0000 = ISID_a: 0x00
        ISID_b: 0x023d
        ISID_c: 0x01
        ISID_d: 0x0000
    TSIH: 0x0000          <-------------- (1) Indicates new session.
    InitiatorTaskTag: 0x00000000
    CID: 0x0000           <-------------- (2)
    CmdSN: 0x000003a3
    ExpStatSN: 0x000003a4 <-------------- (3)
    Key/Value Pairs
        KeyValue: InitiatorName=iqn.1994-05.com.fedora:01.739587
        KeyValue: InitiatorAlias=fc13san
        KeyValue: TargetName=iqn.2003-08.com.sanify:vid.f72bca2a57ecf55e.1
        KeyValue: SessionType=Normal
        KeyValue: AuthMethod=CHAP
    Padding: 0000

(1) This should be 0x0041 (the TSIH returned in Frame 8) to reinstate
    a connection in the same session.
(2) If TSIH were 0x0041, this CID would indicate connection
    reinstatement.
(3) StatSN 0x000003a3 was the last one acked on the prior TCP
    connection, so presumably the initiator is trying to reinstate.
    If it were really trying to create a new session with TSIH 0x0000,
    this field would not be valid.
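
The reading in notes (1) to (3) can be sketched as a small classifier.
This is only an illustration of the argument above, not code from
open-iscsi or from any target; the names LoginRequest and
classify_login, and the "unknown session" outcome, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginRequest:
    """Fields of a first Login Request that matter for this bug (hypothetical type)."""
    tsih: int
    cid: int
    exp_stat_sn: int


def classify_login(req, session_tsih, session_cids):
    """Classify a first Login Request the way notes (1)-(3) read it.

    session_tsih is the TSIH the target handed out for the existing
    session; session_cids is the set of CIDs already used in it.
    """
    if req.tsih == 0:
        # Reserved value: first connection of a new session.
        return "new session"
    if req.tsih != session_tsih:
        # Assumption for this sketch: no other sessions exist.
        return "unknown session"
    if req.cid in session_cids:
        # Same session, same CID: the connection is being reinstated.
        return "connection reinstatement"
    return "new connection in existing session"


# Frame 11778 as captured, and the same frame with the TSIH from Frame 8.
observed = LoginRequest(tsih=0x0000, cid=0x0000, exp_stat_sn=0x3A4)
expected = LoginRequest(tsih=0x0041, cid=0x0000, exp_stat_sn=0x3A4)

print(classify_login(observed, 0x0041, {0x0000}))
print(classify_login(expected, 0x0041, {0x0000}))
```

With the values from the trace, the captured frame classifies as
"new session" and the corrected frame as "connection reinstatement".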

----------------------------------------------------------------------
RFC3720 EXCERPTS

10.12.6.  TSIH

   TSIH must be set in the first Login Request.  The reserved value 0
   MUST be used on the first connection for a new session.  Otherwise,
   the TSIH sent by the target at the conclusion of the successful login
   of the first connection for this session MUST be used.  The TSIH
   identifies to the target the associated existing session for this new
   connection.

   All Login Requests within a Login Phase MUST carry the same TSIH.
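
For anyone who wants to check these fields in a raw capture rather
than in tshark output, here is a minimal sketch that unpacks a Login
Request header.  The byte offsets are my reading of the 48-byte Login
Request layout in RFC 3720 and are not part of the excerpts quoted
here, so treat them as an assumption; parse_login_bhs and padded_len
are hypothetical helper names.  The sample header is rebuilt from the
Frame 11778 values, with the immediate bit (0x40) assumed set in the
first byte.

```python
import struct

# Opcode, flags, VersionMax, VersionMin, TotalAHSLength, DataSegmentLength(3),
# ISID(6), TSIH, InitiatorTaskTag, CID, reserved, CmdSN, ExpStatSN, reserved(16)
BHS_FORMAT = ">BBBBB3s6sHIHHII16s"
BHS_SIZE = struct.calcsize(BHS_FORMAT)  # 48 bytes


def padded_len(data_len):
    """Round a data segment length up to the next 4-byte boundary."""
    return (data_len + 3) & ~3


def parse_login_bhs(bhs):
    """Unpack the Login Request fields discussed in this report."""
    (op, flags, _vmax, _vmin, _ahs, dsl, isid, tsih,
     itt, cid, _r1, cmd_sn, exp_stat_sn, _r2) = struct.unpack(
        BHS_FORMAT, bhs[:BHS_SIZE])
    data_len = int.from_bytes(dsl, "big")
    return {
        "opcode": op & 0x3F,          # low 6 bits; 0x40 is the immediate bit
        "transit": bool(flags & 0x80),
        "csg": (flags >> 2) & 0x3,
        "nsg": flags & 0x3,
        "isid": isid.hex().upper(),
        "tsih": tsih,
        "itt": itt,
        "cid": cid,
        "cmd_sn": cmd_sn,
        "exp_stat_sn": exp_stat_sn,
        "data_len": data_len,
        "padded_data_len": padded_len(data_len),
    }


# Header rebuilt from the Frame 11778 dissection (not raw captured bytes).
sample = struct.pack(
    BHS_FORMAT, 0x43, 0x00, 0, 0, 0, (0xA2).to_bytes(3, "big"),
    bytes.fromhex("00023D010000"), 0x0000, 0, 0x0000, 0,
    0x000003A3, 0x000003A4, bytes(16))

fields = parse_login_bhs(sample)
print(hex(fields["tsih"]), hex(fields["cid"]), hex(fields["exp_stat_sn"]))
print(fields["data_len"], fields["padded_data_len"])
```

The padded data length also cross-checks the trace: 0xa2 is 162 bytes,
which pads to the 164 bytes of TCP payload in Frame 11778, and 0x1c9
(457) pads to the 460 bytes in Frame 6.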

10.12.9.  ExpStatSN

   For the first Login Request on a connection this is ExpStatSN for the
   old connection and this field is only valid if the Login Request
   restarts a connection (see Section 5.3.4 Connection Reinstatement).

   For subsequent Login Requests it is used to acknowledge the Login
   Responses with their increasing StatSN values.

10.14.  Logout Request

   If an initiator intends to start recovery for a failing connection,
   it MUST use the Logout Request to "clean-up" the target end of a
   failing connection and enable recovery to start, or the Login Request
   with a non-zero TSIH and the same CID on a new connection for the
   same effect (see Section 10.14.3 CID).  In sessions with a single
   connection, the connection can be closed and then a new connection
   reopened.  A connection reinstatement login can be used for recovery
   (see Section 5.3.4 Connection Reinstatement).
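
Put together, the excerpts suggest what I would expect the initiator
to send when it reinstates the connection.  The sketch below is a
guess at that logic under the assumption that the initiator keeps the
TSIH, CID and last received StatSN of the old connection;
SessionState and reinstatement_login_fields are hypothetical names
and do not correspond to anything in iscsi-initiator-utils.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    """What the initiator would need to remember (hypothetical type)."""
    isid: bytes
    tsih: int          # TSIH returned by the target at first login
    cid: int           # CID of the connection that failed
    last_stat_sn: int  # last StatSN received on that connection


def reinstatement_login_fields(state):
    """Header fields for a connection reinstatement login."""
    if state.tsih == 0:
        # Without a TSIH from the target there is no session to rejoin.
        raise ValueError("no TSIH recorded; only a new session login is possible")
    return {
        "isid": state.isid,
        "tsih": state.tsih,  # non-zero: refers to the existing session
        "cid": state.cid,    # same CID: reinstates that connection
        # ExpStatSN for the old connection, with 32-bit wraparound.
        "exp_stat_sn": (state.last_stat_sn + 1) & 0xFFFFFFFF,
    }


# Values taken from Frame 8 and from note (3) on Frame 11778.
state = SessionState(isid=bytes.fromhex("00023D010000"),
                     tsih=0x0041, cid=0x0000, last_stat_sn=0x000003A3)
login = reinstatement_login_fields(state)
print(hex(login["tsih"]), hex(login["cid"]), hex(login["exp_stat_sn"]))
```

For the session in the trace this gives TSIH 0x41, CID 0x0 and
ExpStatSN 0x3a4, which differs from Frame 11778 only in the TSIH.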

----------------------------------------------------------------------

Comment 1 Hans de Goede 2010-06-07 09:59:30 UTC

*** This bug has been marked as a duplicate of bug 589250 ***