Bug 800041

Summary: iSER (iscsi rdma) connection can get broken as of missing receive buffers
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.3
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Reporter: Or Gerlitz <ogerlitz>
Assignee: Mike Christie <mchristi>
QA Contact: Bruno Goncalves <bgoncalv>
CC: dledford, fge, mchristi
Target Milestone: rc
Fixed In Version: kernel-2.6.32-252.el6
Doc Type: Bug Fix
Last Closed: 2012-06-20 08:33:03 UTC

Description Or Gerlitz 2012-03-05 16:29:28 UTC
Description of problem: An iSER target may send iSCSI NO-OP PDUs as soon as it marks the iSER iSCSI session as fully operational. This leaves a window of time in which no receive buffers are posted on the initiator side, so the iSER RC connection can break because of RNR NAK / retry errors. To solve that, rely on the flags bits in the login request having FFP (0x3) in the lower nibble as a marker for the final login request, and post an initial chunk of receive buffers before sending that login request instead of after getting the login response.
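
The ordering change described above can be illustrated with a small Python model. This is only a sketch, not the kernel patch: `ToyInitiator`, `post_rx_bufs`, and `send_login_request` are hypothetical names, and the example flag values (0x81, 0x87) assume the usual iSCSI login flags layout with the transit bit in 0x80 and the next-stage field in the two lowest bits.

```python
from dataclasses import dataclass, field

# Assumption: 0x3 in the low bits of the login flags byte means the next
# stage is full feature phase (FFP), as stated in the description.
ISCSI_FULL_FEATURE_PHASE = 0x3


def is_final_login_request(flags: int) -> bool:
    """Return True when the login flags name FFP as the next stage."""
    return (flags & ISCSI_FULL_FEATURE_PHASE) == ISCSI_FULL_FEATURE_PHASE


@dataclass
class ToyInitiator:
    """Hypothetical stand-in for the initiator; it only records event order."""
    events: list = field(default_factory=list)

    def post_rx_bufs(self) -> None:
        self.events.append("post_rx_bufs")

    def send_login_request(self, flags: int) -> None:
        # Post receive buffers BEFORE the final login request goes out, so
        # the target cannot hit an empty receive queue once it enters FFP.
        if is_final_login_request(flags):
            self.post_rx_bufs()
        self.events.append(f"send_login(flags={flags:#04x})")


if __name__ == "__main__":
    ini = ToyInitiator()
    ini.send_login_request(0x81)  # intermediate login step, not final
    ini.send_login_request(0x87)  # final login step, next stage is FFP
    print(ini.events)
```

The point of the model is that the buffers are posted on the request path rather than on the response path, which closes the window in which a NO-OP from the target could arrive with nothing posted.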

A patch that solves this was submitted upstream; see http://marc.info/?l=linux-rdma&m=133096464331279&w=2

We actually hit this bug in practice; here are the prints from the target side that show it:

tgtd0: iser_cm_conn_established(1540) conn:0x8943e0 cm_id:0x8a1c90, 192.168.20.11 -> 192.168.20.17, established

tgtd0: handle_wc_error(2960) conn:0x8943e0 task:0x1790670 tag:0xffffffff wr_id:0x0x1790748 op:send err:RNR retry counter exceeded vendor_err:0x87

tgtd0: iser_conn_close(1219) conn:0x8943e0 cm_id:0x0x8a1c90 state: CLOSE, refcnt:395

tgtd0: iser_cm_disconnected(1560) conn:0x8943e0 cm_id:0x8a1c90 event:10, RDMA_CM_EVENT_DISCONNECTED

Easily reproducible by running login/logout in a loop against a bunch (say five or more) of iSER targets exported by one tgt instance.

Comment 2 Doug Ledford 2012-03-05 16:50:52 UTC
I've got this one Mike.

Comment 3 RHEL Program Management 2012-03-05 17:10:44 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 4 Mike Christie 2012-03-05 19:33:32 UTC
(In reply to comment #2)
> I've got this one Mike.

If you do not have time, let me know and I can do it. I have been working with Or on this upstream. So it is tested and reviewed on my side. I will watch for your posting and ack it.

Comment 5 Or Gerlitz 2012-03-05 21:04:19 UTC
The patch was picked up by Roland, so it should be present in 3.4-rc1 and in -stable immediately following that - http://git.kernel.org/?p=linux/kernel/git/roland/infiniband.git;a=commitdiff;h=89e984e2c2cd14f77ccb26c47726ac7f13b70ae8

Comment 7 Doug Ledford 2012-03-06 15:19:41 UTC
Sorry Mike, I knew you said you weren't planning on an iSCSI update this release so figured you weren't doing anything with this at the moment.  I'll let you handle it.

Comment 8 Aristeu Rozanski 2012-03-20 14:21:20 UTC
Patch(es) available on kernel-2.6.32-252.el6

Comment 11 Bruno Goncalves 2012-05-30 10:05:20 UTC
Ran login/logout more than 50 times and no error was reported.

tgtd was configured with 10 iser targets.

uname -r
2.6.32-274.el6.x86_64

rpm -q iscsi-initiator-utils
iscsi-initiator-utils-6.2.0.872-41.el6.x86_64

rpm -q scsi-target-utils
scsi-target-utils-1.0.24-2.el6.x86_64

[root@rdma1 ~]# iscsiadm -m discovery -p 172.31.1.2 -t st -I iser
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-1000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-2000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-3000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-4000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-5000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-6000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-7000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-8000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-9000
172.31.1.2:3260,1 iqn.2010-10.com.example:storage-A000
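
A login/logout loop like the one used for this verification could be scripted roughly as follows. This is a sketch under assumptions, not the script that was actually run: the helper names `login_logout_commands` and `run` are hypothetical, the portal and IQNs are taken from the discovery output above, and the loop defaults to a dry run that only prints the `iscsiadm` commands.

```python
import subprocess


def login_logout_commands(portal, targets, iterations):
    """Build iscsiadm argument lists: log in to all targets, then log out, repeated."""
    cmds = []
    for _ in range(iterations):
        for action in ("--login", "--logout"):
            for iqn in targets:
                cmds.append(["iscsiadm", "-m", "node", "-T", iqn,
                             "-p", portal, "-I", "iser", action])
    return cmds


def run(cmds, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop on the first failed login/logout.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    targets = [f"iqn.2010-10.com.example:storage-{n}000" for n in "123456789A"]
    run(login_logout_commands("172.31.1.2", targets, iterations=50))
```

Stopping on the first failure is a deliberate choice here: with the bug present, a broken connection should surface as a failed login rather than be silently retried.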

Comment 13 errata-xmlrpc 2012-06-20 08:33:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0862.html