
Bug 548556

Summary: Infortrend S16E-R1130 network weirdness, TCP Dup ACKs
Product: Red Hat Enterprise Linux 5
Reporter: matt-rhbugzilla
Component: iscsi-initiator-utils
Assignee: Chris Leech <cleech>
Status: CLOSED WONTFIX
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Priority: low
Version: 5.4
CC: coughlan, matt-rhbugzilla, mchristi
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2014-06-02 13:25:48 UTC
Attachments:
Description Flags
Packet dump co-inciding with error logs none

Description matt-rhbugzilla 2009-12-17 19:21:12 UTC
Description of problem:

http://groups.google.com/group/open-iscsi/browse_thread/thread/c5b3cdce3ac22d88/11f100ba8075db34?q=infortrend&lnk=ol&

I’m seeing TCP “weirdness”, including many duplicate ACKs.

Nov  2 08:15:14 backup kernel:  connection28:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295808782, last ping 4295813782, now 4295818782
Nov  2 08:15:14 backup kernel:  connection28:0: detected conn error (1011)
Nov  2 08:15:14 backup kernel:  connection27:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295808790, last ping 4295813790, now 4295818790
Nov  2 08:15:14 backup kernel:  connection27:0: detected conn error (1011)
Nov  2 08:15:14 backup kernel:  connection33:0: ping timeout of 5 secs expired, recv timeout 5, last rx 4295808796, last ping 4295813796, now 4295818796
Nov  2 08:15:14 backup kernel:  connection33:0: detected conn error (1011)
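
The counters in the ping timeout lines can be checked with a little arithmetic. The sketch below is an illustration, not part of the original report. It assumes the `last rx`, `last ping` and `now` values are kernel jiffies and that the kernel ticks at HZ = 1000; both are assumptions, and the helper name `ping_timeout_gaps` is made up for this example. Under those assumptions, the connection had received nothing for 5 seconds when the ping was sent, and a further 5 seconds passed without a reply.

```python
import re

# Assumption: the counters are jiffies and the kernel runs at HZ = 1000.
HZ = 1000

# First ping timeout line from the log above, joined onto one line.
LINE = (
    "Nov  2 08:15:14 backup kernel:  connection28:0: ping timeout of 5 secs "
    "expired, recv timeout 5, last rx 4295808782, last ping 4295813782, "
    "now 4295818782"
)

PATTERN = re.compile(
    r"ping timeout of (?P<ping_tmo>\d+) secs expired, "
    r"recv timeout (?P<recv_tmo>\d+), "
    r"last rx (?P<last_rx>\d+), last ping (?P<last_ping>\d+), "
    r"now (?P<now>\d+)"
)


def ping_timeout_gaps(line, hz=HZ):
    """Return the two gaps, in seconds, encoded in a ping timeout line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a ping timeout line")
    fields = {key: int(value) for key, value in match.groupdict().items()}
    return {
        # Time with no received data before the ping was sent.
        "idle_before_ping_s": (fields["last_ping"] - fields["last_rx"]) / hz,
        # Time spent waiting for the ping reply before giving up.
        "wait_after_ping_s": (fields["now"] - fields["last_ping"]) / hz,
    }


print(ping_timeout_gaps(LINE))
```

Both gaps come out equal to the 5 second values printed in the log line itself, which is consistent with the HZ = 1000 assumption.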


Nov 10 09:08:04 backup kernel:  connection12:0: detected conn error (1011)
Nov 10 09:08:05 backup iscsid: Kernel reported iSCSI connection 12:0 error (1011) state (3)
Nov 10 09:08:08 backup iscsid: connection12:0 is operational after recovery (1 attempts)
Nov 10 09:09:43 backup kernel:  connection11:0: detected conn error (1011)
Nov 10 09:09:43 backup kernel:  connection12:0: detected conn error (1011)
Nov 10 09:09:44 backup kernel:  connection11:0: detected conn error (1011)
Nov 10 09:09:44 backup iscsid: Kernel reported iSCSI connection 11:0 error (1011) state (3)
Nov 10 09:09:44 backup iscsid: Kernel reported iSCSI connection 12:0 error (1011) state (3)
Nov 10 09:09:44 backup iscsid: Kernel reported iSCSI connection 11:0 error (1011) state (1)
Nov 10 09:09:46 backup kernel:  session11: target reset succeeded
Nov 10 09:09:47 backup iscsid: connection11:0 is operational after recovery (1 attempts)
Nov 10 09:09:47 backup iscsid: connection12:0 is operational after recovery (1 attempts)
Nov 10 09:09:56 backup kernel: sd 18:0:0:2: SCSI error: return code = 0x000e0000
Nov 10 09:09:56 backup kernel: end_request: I/O error, dev sdv, sector 60721248
Nov 10 09:09:56 backup kernel: device-mapper: multipath: Failing path 65:80.
Nov 10 09:09:56 backup kernel: sd 18:0:0:2: SCSI error: return code = 0x000e0000
Nov 10 09:09:56 backup kernel: end_request: I/O error, dev sdv, sector 60727648
Nov 10 09:09:56 backup kernel: sd 18:0:0:2: SCSI error: return code = 0x000e0000
Nov 10 09:10:31 backup kernel: device-mapper: multipath: Failing path 65:112.
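
For readers unfamiliar with the `SCSI error: return code` values, here is a rough sketch of how the 32-bit result splits into four bytes (status, message, host, driver, from low to high). It is not from the original report. The symbolic names in the two lookup tables are taken from the upstream Linux SCSI headers, and it is an assumption that the RHEL 5 kernel uses the same values; the function name `decode_scsi_result` is made up for this example.

```python
# Assumption: names and values follow the upstream Linux SCSI headers;
# the RHEL 5 kernel may differ. Only the codes seen in this bug are listed.
HOST_BYTE_NAMES = {0x00: "DID_OK", 0x0E: "DID_TRANSPORT_DISRUPTED"}
DRIVER_BYTE_NAMES = {0x00: "DRIVER_OK", 0x06: "DRIVER_TIMEOUT"}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result into its four one-byte fields."""
    host = (result >> 16) & 0xFF
    driver = (result >> 24) & 0xFF
    return {
        "status": result & 0xFF,        # SCSI status from the target
        "msg": (result >> 8) & 0xFF,    # message byte
        "host": host,                   # set by the host adapter/transport
        "driver": driver,               # set by the mid-layer/driver
        "host_name": HOST_BYTE_NAMES.get(host, "unknown"),
        "driver_name": DRIVER_BYTE_NAMES.get(driver, "unknown"),
    }


# 0x000e0000 appears in the description, 0x060e0000 in comment 1.
for code in (0x000E0000, 0x060E0000):
    print(hex(code), decode_scsi_result(code))
```

Read this way, both codes carry host byte 0x0e, which points at the transport rather than the disk, and the second code additionally carries driver byte 0x06, matching the "timing out command" lines logged with it.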

Comment 1 matt-rhbugzilla 2009-12-18 15:37:34 UTC
Created attachment 379234 [details]
Packet dump co-inciding with error logs

Dec 18 09:33:04 backup kernel:  connection7:0: detected conn error (1011)
Dec 18 09:33:05 backup iscsid: Kernel reported iSCSI connection 7:0 error (1011) state (3)
Dec 18 09:33:07 backup kernel:  session7: target reset succeeded
Dec 18 09:33:08 backup iscsid: connection7:0 is operational after recovery (1 attempts)
Dec 18 09:33:17 backup kernel: sd 14:0:0:0: timing out command, waited 60s
Dec 18 09:33:17 backup kernel: sd 14:0:0:0: SCSI error: return code = 0x060e0000
Dec 18 09:33:17 backup kernel: end_request: I/O error, dev sdk, sector 197133352
Dec 18 09:33:17 backup kernel: device-mapper: multipath: Failing path 8:160.
Dec 18 09:33:17 backup kernel: sd 14:0:0:2: timing out command, waited 60s
Dec 18 09:33:17 backup kernel: sd 14:0:0:2: SCSI error: return code = 0x060e0000
Dec 18 09:33:17 backup kernel: end_request: I/O error, dev sdz, sector 1732248320
Dec 18 09:33:17 backup kernel: device-mapper: multipath: Failing path 65:144.
Dec 18 09:33:17 backup kernel: sd 14:0:0:2: timing out command, waited 60s
Dec 18 09:33:17 backup kernel: sd 14:0:0:2: SCSI error: return code = 0x060e0000
Dec 18 09:33:17 backup kernel: end_request: I/O error, dev sdz, sector 1736442240
Dec 18 09:33:17 backup kernel: sd 14:0:0:0: timing out command, waited 300s
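
The multipath messages name failed paths by major:minor number, while the I/O errors name sd devices. As a cross-check, the sketch below maps one to the other. It is an illustration only and assumes the standard Linux SCSI disk numbering (majors 8, 65-71 and 128-135, 16 minors per disk); the function name `sd_name` is made up for this example.

```python
# Assumption: standard Linux SCSI disk majors, 16 minors per disk.
SD_MAJORS = [8] + list(range(65, 72)) + list(range(128, 136))
MINORS_PER_DISK = 16
DISKS_PER_MAJOR = 256 // MINORS_PER_DISK


def sd_name(major, minor):
    """Map a major:minor pair to an sd device name such as 'sdv'."""
    block = SD_MAJORS.index(major)  # raises ValueError for non-sd majors
    index = block * DISKS_PER_MAJOR + minor // MINORS_PER_DISK
    partition = minor % MINORS_PER_DISK

    # Disk letters run a..z, then aa..zz, and so on.
    letters = ""
    n = index
    while True:
        letters = chr(ord("a") + n % 26) + letters
        n = n // 26 - 1
        if n < 0:
            break
    return "sd" + letters + (str(partition) if partition else "")


# Pairs taken from the "Failing path" lines in this bug.
for major, minor in ((65, 80), (8, 160), (65, 144), (65, 112)):
    print(f"{major}:{minor} -> {sd_name(major, minor)}")
```

Under these assumptions 65:80, 8:160 and 65:144 map to sdv, sdk and sdz, the same devices named in the adjacent `end_request` lines.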

Comment 2 Mike Christie 2009-12-20 01:10:26 UTC
Hi Matt,

One of our network experts thinks it could be packet loss or packet reordering. They are looking at your trace. I will update the bugzilla when I know more.

Comment 3 matt-rhbugzilla 2009-12-21 20:39:34 UTC
The server and storage unit are connected to the same switch. There's nothing to indicate that the switch is losing any packets (it works fine with the EqualLogic unit connected to it too), and there are no bad counters on any of the ports.

Comment 4 Mike Christie 2009-12-22 14:30:04 UTC
That is what the network people said too. Here is what they said about the Wireshark trace:

OK, what Wireshark indicated wasn't a lost segment as seen by the receiver, but a segment that was lost on the way to Wireshark.

Comment 5 RHEL Program Management 2014-03-07 13:57:39 UTC
This bug/component is not included in scope for RHEL-5.11.0 which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Comment 6 RHEL Program Management 2014-06-02 13:25:48 UTC
Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).