Bug 149016 - iSCSI host is hanging during port disables
Summary: iSCSI host is hanging during port disables
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: iscsi-initiator-utils
Version: 3.0
Hardware: i386
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Tom Coughlan
QA Contact: Brock Organ
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-02-17 22:32 UTC by Heather Conway
Modified: 2008-04-07 04:42 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-05-06 17:47:11 UTC
Target Upstream Version:
Embargoed:


Attachments
serial console output from a host with no PowerPath at all (96.37 KB, text/plain)
2005-02-17 22:36 UTC, Heather Conway
panic report from a host with PowerPath v4.3.2 (4.58 KB, text/plain)
2005-02-17 22:38 UTC, Heather Conway
iscsi.conf that was being used (20.90 KB, text/plain)
2005-02-17 22:44 UTC, Heather Conway
panic report from host with PowerPath v4.3.2 (1.08 KB, text/plain)
2005-02-18 15:06 UTC, Heather Conway

Description Heather Conway 2005-02-17 22:32:15 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; EMC IS 55; .NET CLR 1.0.3705; .NET CLR 1.1.4322)

Description of problem:
A RHEL 3.0 U4 host hangs while the Cisco switch ports for the CLARiiON SPs are being disabled.  The host is a Dell PE4600 using the on-board e1000 NIC and running RHEL 3.0 U4 with the 2.4.21-27.ELsmp kernel.  The driver in use is the integrated 3.6.1 iSCSI driver, attached to a CLARiiON iSCSI array via a Cisco switch.
After a variable amount of time (anywhere from 30 minutes to a day and a half), the host hung while the port disable script was running (5-minute disable with a 10-minute wait).  The host was running Iozone (filesystem I/O) at the time.
Nothing of substance was reported in /var/log/messages, and the host was completely hung, so a power cycle was required.
The NMI watchdog was then enabled to force a panic on a hang.  When the system hung, it produced a similar panic with PowerPath v4.3.1 (release), with PowerPath v4.3.2 (beta), and without PowerPath at all.

Version-Release number of selected component (if applicable):
iscsi_sfnet v3.6.1

How reproducible:
Always

Steps to Reproduce:
1. Attach your RHEL 3.0 U4 host via iSCSI to a CLARiiON array.
2.  Run I/O.
3.  Disable the switch ports.
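
The disable/enable cycle in step 3 can be sketched roughly as follows. This is only an illustration of the timing described above (5-minute disable, 10-minute wait), not the script that was actually run. The `set_port_enabled` callback and the port names are hypothetical placeholders for whatever mechanism drives the switch, and disabling the ports one at a time is an assumption.

```python
import time
from typing import Callable, Iterable

DISABLE_SECONDS = 5 * 60   # port held down for 5 minutes (timing from the report)
WAIT_SECONDS = 10 * 60     # port left up for 10 minutes before the next disable


def flap_ports(
    ports: Iterable[str],
    set_port_enabled: Callable[[str, bool], None],
    cycles: int,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Disable each port in turn, re-enable it, then wait before moving on.

    set_port_enabled is a hypothetical hook: it is expected to talk to the
    switch and bring the named port down (False) or up (True).
    """
    port_list = list(ports)
    for _ in range(cycles):
        for port in port_list:
            set_port_enabled(port, False)   # take the SP port down
            sleep(DISABLE_SECONDS)
            set_port_enabled(port, True)    # bring it back up
            sleep(WAIT_SECONDS)             # let the host recover its paths
```

Passing `sleep` in as a parameter makes it possible to dry-run the sequence without waiting for the real intervals.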
  

Actual Results:  The host will hang.  

Expected Results:  The host should not hang and should continue running.

Additional info:

Comment 1 Heather Conway 2005-02-17 22:36:48 UTC
Created attachment 111180
serial console output from a host with no PowerPath at all

Attached is the serial console output from the host running through the port
disables without PowerPath.

Comment 2 Heather Conway 2005-02-17 22:38:47 UTC
Created attachment 111181
panic report from a host with PowerPath v4.3.2

Attached is the panic report from the host running through the port
disables with PowerPath v4.3.2.

Comment 3 Heather Conway 2005-02-17 22:44:52 UTC
Created attachment 111182
iscsi.conf that was being used

In case it's needed, I've attached a copy of the iscsi.conf file that was being
used on this host.  It was the same for both PowerPath and non-PowerPath
testing.
Thanks.

Comment 4 Heather Conway 2005-02-18 15:06:50 UTC
Created attachment 111203
panic report from host with PowerPath v4.3.2

Attached is the clean panic report from the host running through the port
disables with PowerPath v4.3.2.

Comment 5 Tom Coughlan 2005-02-25 15:51:16 UTC
In the opening comment you said that the on-board e1000 NIC is being used. The
crash output indicates that tg3 and e100 are loaded, but not e1000.

Assuming that this failure was with the tg3, would you be able to re-test with
an e1000 NIC, so we can see if the problem is specific to the tg3?

Also, would you be able to re-test on tg3 without iSCSI? Just run a network load
over the tg3. 

Thanks.

Tom  

Comment 6 Heather Conway 2005-05-06 12:51:15 UTC
Per Wayne, this problem is no longer occurring since moving to RHEL 3.0 U5 and 
using a different iscsi.conf file.  Waiting for an update on his testing before 
closing the bugzilla.

Comment 7 Heather Conway 2005-05-06 12:51:38 UTC
Per Wayne, this problem is no longer occurring since moving to RHEL 3.0 U5 and 
using a different iscsi.conf file.  Waiting for an update on his testing before 
closing the bugzilla.

Comment 8 Heather Conway 2005-05-06 17:46:47 UTC
Closing the Bugzilla as the problem hasn't been replicated with RHEL 3.0 U5.

