Bug 208017

Summary: iscsistart session hangs with equallogic PS300 array
Product: [Fedora] Fedora
Reporter: Mark C. Davis <davismc>
Component: iscsi-initiator-utils
Assignee: Mike Christie <mchristi>
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Version: 6
CC: agrover, cgarde, davismc, mattdm, triage
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Whiteboard: bzcl34nup
Doc Type: Bug Fix
Last Closed: 2008-05-06 16:24:33 UTC
Attachments:
Description: shell script, commands output, wireshark output, tcpdump capture file
Flags: none

Description Mark C. Davis 2006-09-25 21:01:59 UTC
Description of problem:
iSCSI sessions to an EqualLogic PS300 array started using "iscsistart" hang.
The same software works with Enterprise ISCSI Target.

Version-Release number of selected component (if applicable): 684

How reproducible: Boot FC 6 T2. Use iscsistart to create a session.  Attempt
to do IO (read) on the session.  (Also occurs if you use iscsistart in initrd,
but it is easier to reproduce when booting from HDD.)


Steps to Reproduce:
1. Boot FC6 T2
2. Issue the iscsistart command to create a session
3. Read from the new disk

Actual results: After a few seconds, the reading application hangs.


Expected results: The read should succeed.


Additional info: This software works with iscsid/iscsiadm.  This software
works against Enterprise ISCSI Target.

Preliminary troubleshooting of the enclosed files by EqualLogic indicated this was
an iscsistart/iscsi initiator problem.

Comment 1 Mark C. Davis 2006-09-25 21:02:00 UTC
Created attachment 137088 [details]
shell script, commands output, wireshark output, tcpdump capture file

Comment 2 Mike Christie 2006-09-25 21:18:55 UTC
What arch are you using?

And you are using iscsistart from the Fedora rpms, correct? If so, could you try
the tools and kernel from the current development snapshot?

The "development" dir, like the one on this mirror, has some bug fixes and updates:

ftp://ftp.linux.ncsu.edu/pub/fedora/linux/core/development/i386/os/Fedora/RPMS/

Could you also run iscsistart with "-d 8"?

And was there anything in your kernel logs?

Comment 3 Mark C. Davis 2006-09-25 23:29:06 UTC
Quick answers:
arch == x86 (32 bit)
I will update to the latest.  This will take me one or two days
OK, I will use iscsistart -d 8
There was nothing in /var/log/messages

Thanks for your patience and help.  I will provide new info in bugzilla on
about 9/27.
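
For anyone repeating the kernel log check mentioned above, a minimal Python sketch follows. It only filters lines that mention iSCSI; the function name `iscsi_log_lines` is hypothetical, and the sketch assumes nothing about the exact message format the initiator logs.

```python
from typing import Iterable, List


def iscsi_log_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention iSCSI, matched case-insensitively."""
    # Only a substring match: no particular log message format is assumed.
    return [line.rstrip("\n") for line in lines if "iscsi" in line.lower()]
```

It can be fed an open file object for /var/log/messages (the file checked in this report), which usually requires root to read.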

Comment 4 Mike Christie 2006-09-27 18:46:10 UTC
Mark, do not bother sending me info. I got access to an EqualLogic target now and
can reproduce this here. It seems that we log in ok and are redirected ok, but then
for some reason there is a connection failure a couple of seconds after the disk is
found, and since iscsistart is only there to connect, the IO hangs.

This works with iscsiadm + iscsid because iscsid can handle the error and relogin.

I suspect the problem is that the EqualLogic target is sending us a nop, we do not
respond (iscsid handles this and is not up), and then the target drops the
connection.
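
To illustrate the exchange suspected above, here is a rough Python sketch of the reply an initiator is expected to send when a target pings it. It is not the iscsid or kernel code. It assumes the 48-byte header layout from RFC 3720 (NOP-In opcode 0x20, NOP-Out opcode 0x00, Target Transfer Tag at bytes 20-23), and the function name `build_nop_out_reply` is hypothetical.

```python
import struct
from typing import Optional

BHS_LEN = 48               # size of the iSCSI Basic Header Segment
OP_NOP_OUT = 0x00          # initiator -> target
OP_NOP_IN = 0x20           # target -> initiator
IMMEDIATE_BIT = 0x40       # "I" bit in byte 0
FINAL_BIT = 0x80           # always set in byte 1 of NOP PDUs
RESERVED_TAG = 0xFFFFFFFF  # tag value meaning "not valid"


def build_nop_out_reply(nop_in: bytes, cmd_sn: int,
                        exp_stat_sn: int) -> Optional[bytes]:
    """Return the NOP-Out header answering a NOP-In, or None if no reply is due."""
    if len(nop_in) < BHS_LEN:
        raise ValueError("header shorter than 48 bytes")
    if nop_in[0] & 0x3F != OP_NOP_IN:
        raise ValueError("not a NOP-In PDU")

    # A valid Target Transfer Tag means the target expects an answer.
    (ttt,) = struct.unpack_from(">I", nop_in, 20)
    if ttt == RESERVED_TAG:
        return None

    reply = bytearray(BHS_LEN)
    reply[0] = OP_NOP_OUT | IMMEDIATE_BIT
    reply[1] = FINAL_BIT
    reply[8:16] = nop_in[8:16]  # LUN is copied from the NOP-In
    # Initiator Task Tag is reserved (no response wanted), TTT is echoed back.
    struct.pack_into(">IIII", reply, 16, RESERVED_TAG, ttt, cmd_sn, exp_stat_sn)
    return bytes(reply)
```

If nothing on the initiator side builds and sends a reply like this, which is the situation described when only iscsistart has run and iscsid is not up, the target sees its ping go unanswered.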

I will send a patch shortly.

Comment 5 Mike Christie 2006-09-27 20:26:41 UTC
As a temporary workaround, I think you can disable the target from sending nops. I do
not have admin access to the target here, so I am not sure though. I cc'd someone
from EqualLogic who does know or can find out.

Comment 6 Cesar Garde 2006-09-27 20:35:45 UTC
Unfortunately, you cannot disable the keepalive NOPs from the EqualLogic array.
 In the most recent release (3.0.5), we do not send a keepalive within the first
300 seconds after a connection is established.

- Ces

Comment 7 Mike Christie 2006-09-27 20:40:21 UTC
Thanks for the reply, Cesar. The 300 sec behavior should be enough to get us
booted normally.

I will continue to work on supporting nops in the kernel instead of userspace so
there is a foolproof solution.

Comment 8 Mark C Davis 2006-09-27 21:35:21 UTC
I did reproduce using today's code (2.6.18-1.2693.fc6), but it looks like you 
are way past that.  At least I am set up to test the patch.

Comment 9 Mark C Davis 2006-10-02 19:24:27 UTC
I upgraded my array to 3.0.5 and the problem did go away (my application does
not wait 300 seconds between accesses), so this is a valid workaround.

Comment 10 Matthew Miller 2007-04-06 17:07:12 UTC
Fedora Core 5 and Fedora Core 6 are, as we're sure you've noticed, no longer
test releases. We're cleaning up the bug database and making sure important bug
reports filed against these test releases don't get lost. It would be helpful if
you could test this issue with a released version of Fedora or with the latest
development / test release. Thanks for your help and for your patience.

[This is a bulk message for all open FC5/FC6 test release bugs. I'm adding
myself to the CC list for each bug, so I'll see any comments you make after this
and do my best to make sure every issue gets proper attention.]


Comment 11 Bug Zapper 2008-04-04 03:50:50 UTC
Fedora apologizes that these issues have not been resolved yet. We're
sorry it's taken so long for your bug to be properly triaged and acted
on. We appreciate the time you took to report this issue and want to
make sure no important bugs slip through the cracks.

If you're currently running a version of Fedora Core between 1 and 6,
please note that Fedora no longer maintains these releases. We strongly
encourage you to upgrade to a current Fedora release. In order to
refocus our efforts as a project we are flagging all of the open bugs
for releases which are no longer maintained and closing them.
http://fedoraproject.org/wiki/LifeCycle/EOL

If this bug is still open against Fedora Core 1 through 6, thirty days
from now, it will be closed 'WONTFIX'. If you can reproduce this bug in
the latest Fedora version, please change to the respective version. If
you are unable to do this, please add a comment to this bug requesting
the change.

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we are following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

And if you'd like to join the bug triage team to help make things
better, check out http://fedoraproject.org/wiki/BugZappers

Comment 12 Bug Zapper 2008-05-06 16:24:30 UTC
This bug is open for a Fedora version that is no longer maintained and
will not be fixed by Fedora. Therefore we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora, please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.