Bug 499017

Summary: iSCSI cable pulls results in Host reboot.
Product: Red Hat Enterprise Linux 5 Reporter: Hector Arteaga <hector.arteaga>
Component: iscsi-initiator-utilsAssignee: Mike Christie <mchristi>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Storage QE <storage-qe>
Severity: high Docs Contact:
Priority: low    
Version: 5.2CC: bdonahue, bzeranski, coughlan, herbert.xu, jwest, kris.strecker, mchristi, phinchman, qcai, rick.bieber
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-02-01 14:11:27 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Full messages log and requested command output.
none
Serial Log with Oops
none
RH5.3 messages log none

Description Hector Arteaga 2009-05-04 20:13:32 UTC
Description of problem:
In testing some of our arrays, I perform some cable pulls to make sure failover/failback work as expected.  During long duration cable pulls (2 minutes plus) after about the 4 or 5 one my server reboots unexpectedly.  The last messages logged before restarting are:

Apr 29 18:35:10 cb-xen-srv13 kernel: sd 1:0:1:3: SCSI error: return code = 0x00020000
Apr 29 18:35:10 cb-xen-srv13 kernel: end_request: I/O error, dev sdap, sector 54984488
Apr 29 18:35:10 cb-xen-srv13 kernel: device-mapper: multipath: Failing path 66:144.
Apr 29 18:35:17 cb-xen-srv13 kernel: mptbase: ioc1: LogInfo(0x31170000): Originator={PL}, Code={IO Device Miss
ing Delay Retry}, SubCode(0x0000)
Apr 29 18:35:17 cb-xen-srv13 kernel: sd 1:0:1:6: SCSI error: return code = 0x00020000
Apr 29 18:35:17 cb-xen-srv13 kernel: end_request: I/O error, dev sdas, sector 20956463
Apr 29 18:35:37 cb-xen-srv13 kernel:  rport-5:0-0: blocked FC remote port time out: saving binding
Apr 29 18:35:37 cb-xen-srv13 kernel:  rport-5:0-1: blocked FC remote port time out: saving binding
Apr 29 18:35:39 cb-xen-srv13 kernel: mptsas: ioc1: removing ssp device, channel 0, id 4, phy 3
Apr 29 18:35:40 cb-xen-srv13 iscsid: connect failed (113)
Apr 29 18:41:02 cb-xen-srv13 syslogd 1.4.1: restart.

which are device mapper entries for my iscsi devices.  My OS is booting from a Fibre Channel array that does not have its paths interfered with.  Here are my iSCSI session settings:

[root@cb-xen-srv13 ~]# iscsiadm -m session -r 1
node.name = iqn.1986-03.com.hp:storage.msa2312i.0819d56ed0
node.tpgt = 3
node.startup = automatic
iface.hwaddress = default
iface.iscsi_ifacename = default
iface.net_ifacename = default
iface.transport_name = tcp
node.discovery_address = 11.1.203.1
node.discovery_port = 3260
node.discovery_type = send_targets
node.session.initial_cmdsn = 0
node.session.initial_login_retry_max = 4
node.session.cmds_max = 128
node.session.queue_depth = 32
node.session.auth.authmethod = None
node.session.auth.username = <empty>
node.session.auth.password = <empty>
node.session.auth.username_in = <empty>
node.session.auth.password_in = <empty>
node.session.timeo.replacement_timeout = 15
node.session.err_timeo.abort_timeout = 15
node.session.err_timeo.lu_reset_timeout = 30
node.session.err_timeo.host_reset_timeout = 60
node.session.iscsi.FastAbort = Yes
node.session.iscsi.InitialR2T = Yes
node.session.iscsi.ImmediateData = No
node.session.iscsi.FirstBurstLength = 262144
node.session.iscsi.MaxBurstLength = 16776192
node.session.iscsi.DefaultTime2Retain = 0
node.session.iscsi.DefaultTime2Wait = 2
node.session.iscsi.MaxConnections = 1
node.session.iscsi.MaxOutstandingR2T = 1
node.session.iscsi.ERL = 0
node.conn[0].address = 11.2.203.2
node.conn[0].port = 3260
node.conn[0].startup = manual
node.conn[0].tcp.window_size = 524288
node.conn[0].tcp.type_of_service = 0
node.conn[0].timeo.logout_timeout = 15
node.conn[0].timeo.login_timeout = 15
node.conn[0].timeo.auth_timeout = 45
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.conn[0].iscsi.MaxRecvDataSegmentLength = 131072
node.conn[0].iscsi.HeaderDigest = None,CRC32C
node.conn[0].iscsi.IFMarker = No
node.conn[0].iscsi.OFMarker = No


Version-Release number of selected component (if applicable):
RedHat 5.2 (Xen)
iscsi-initiator-utils-6.2.0.868-0.7.el5
All components are default inbox versions.

How reproducible:
100%

Steps to Reproduce:
1. Setup a Server with iSCSI targets
2. Start I/O (not sure if I/O is necessary, but that how I've been reproducing it)
3. Pull the cable from 1 path and wait 3 or 4 minutes
4. Reconnect the cable and wait 10 minutes making sure that Device mapper has regained the paths.
5. Repeat 3 and 4 with the remaining paths.
  
Actual results:
Server reboots after about the 4th or 5th itteration

Expected results:
I/O should failover between the paths but no sever reboots.

Additional info:
I've tried this with an iSCSI HBA (Hardware iSCSI) and the problem does not occur.)  The behavior I'm seeing is somewhat similar to bug: 429469

Comment 1 Mike Christie 2009-05-05 18:30:44 UTC
(In reply to comment #0)
> Description of problem:
> In testing some of our arrays, I perform some cable pulls to make sure
> failover/failback work as expected.  During long duration cable pulls (2
> minutes plus) after about the 4 or 5 one my server reboots unexpectedly.  The
> last messages logged before restarting are:
> 
> Apr 29 18:35:10 cb-xen-srv13 kernel: sd 1:0:1:3: SCSI error: return code =
> 0x00020000
> Apr 29 18:35:10 cb-xen-srv13 kernel: end_request: I/O error, dev sdap, sector
> 54984488
> Apr 29 18:35:10 cb-xen-srv13 kernel: device-mapper: multipath: Failing path
> 66:144.
> Apr 29 18:35:17 cb-xen-srv13 kernel: mptbase: ioc1: LogInfo(0x31170000):
> Originator={PL}, Code={IO Device Miss
> ing Delay Retry}, SubCode(0x0000)
> Apr 29 18:35:17 cb-xen-srv13 kernel: sd 1:0:1:6: SCSI error: return code =
> 0x00020000
> Apr 29 18:35:17 cb-xen-srv13 kernel: end_request: I/O error, dev sdas, sector
> 20956463
> Apr 29 18:35:37 cb-xen-srv13 kernel:  rport-5:0-0: blocked FC remote port time
> out: saving binding
> Apr 29 18:35:37 cb-xen-srv13 kernel:  rport-5:0-1: blocked FC remote port time
> out: saving binding
> Apr 29 18:35:39 cb-xen-srv13 kernel: mptsas: ioc1: removing ssp device, channel
> 0, id 4, phy 3
> Apr 29 18:35:40 cb-xen-srv13 iscsid: connect failed (113)
> Apr 29 18:41:02 cb-xen-srv13 syslogd 1.4.1: restart.
> 
> which are device mapper entries for my iscsi devices.  My OS is booting from a
> Fibre Channel array that does not have its paths interfered with.  Here are my
> iSCSI session settings:
> 

Which ones are the iscsi paths and which ones are the fc paths?

iscsid: connect failed (113) indicates that we cannot connect at the iscsi/tcp level.

However, it looks like some fc paths are getting disrupted. The "blocked FC remote port time out" messages means that something happened to some fc paths and eventually the kernel had to block the paths and that timed out so it eventually did the equivalent of marking them offline (any IO sent to the fc paths would fail).


If the root fc paths are gone, then software iscsi may be hitting some issue where it is trying to access the root disk for something and failing.


> [root@cb-xen-srv13 ~]# iscsiadm -m session -r 1
> node.name = iqn.1986-03.com.hp:storage.msa2312i.0819d56ed0
> node.tpgt = 3
> node.startup = automatic
> iface.hwaddress = default
> iface.iscsi_ifacename = default
> iface.net_ifacename = default
> iface.transport_name = tcp
> node.discovery_address = 11.1.203.1
> node.discovery_port = 3260
> node.discovery_type = send_targets
> node.session.initial_cmdsn = 0
> node.session.initial_login_retry_max = 4
> node.session.cmds_max = 128
> node.session.queue_depth = 32
> node.session.auth.authmethod = None
> node.session.auth.username = <empty>
> node.session.auth.password = <empty>
> node.session.auth.username_in = <empty>
> node.session.auth.password_in = <empty>
> node.session.timeo.replacement_timeout = 15


The 2 minutes is sort of odd. For iscsi we should start failing IO withint 15 seconds or sooner after you see something like a conn error or nop/ping timeout.

2 minutes though, is close to the fc remote port timeout default value.

Could you send all the log output? Could you also run multipath -ll and iscsiadm -m session -P 3 before you pull any cables?

Comment 2 Hector Arteaga 2009-05-05 21:56:54 UTC
Created attachment 342546 [details]
Full messages log and requested command output.

The log that messages that I copied were from one of my initial runs in which I was pulling both iSCSI and FC paths.  I was able to verify that the issue occurs when I pull only the iSCSI paths and does not occur when I pull only the FC paths.  
I'm attaching a the full /var/log/messages log in which the latest run was only with iSCSI cable pulls.  The test was run between the timestamps of "May  1 12:18:42" and "May  1 16:50:44" (the "May  1 16:50:44" restart was unexpected).  The zip file also contains the iscsiadm and multipath output that you requested.
When I initially encoutered this issue, my node.session.timeo.replacement_timeout setting was set to 120 (which might explain the 2 minute thing).  I reduced it to 15 as it is recommended that this be lowered if using multipath.

Comment 3 Mike Christie 2009-05-06 15:43:24 UTC
Are you running the iscsi initiator in a xen guest/vm/domU?

Could you also try this using RHEL 5.3 or at least the kernel from 5.3? If you do not have access to 5.3, then you can try this kernel
http://people.redhat.com/dzickus/el5/141.el5/

And did you take this log directly from /var/log/messages or did you have a serial line attached? I think if you have a serial line and you do
tail -f /var/log/messages
you might catch a oops.


And could you try the new kernel test without multipath? I do not think it is multipath, but it will reduce the log output.

Comment 4 Hector Arteaga 2009-05-06 16:18:23 UTC
I'm not running iscsi initiator from the VM.  I'm running it only on the Xen Server (Dom0).
The log I attached was taken directly from /var/log/messages.  I can setup the serial port to see if I can catch the oops.
I can also try RH5.3 but I don't know if I'll be able to run it without mulitpath as I'm not sure if our test bench will like seeing multiple paths to a device withough DM.  This might take a little longer to setup though.

Comment 5 Paul Hinchman 2009-05-06 18:00:51 UTC
IT # 293622 also created at recommendation of engineering alliance manager.

Comment 6 Hector Arteaga 2009-05-07 16:22:19 UTC
Created attachment 342873 [details]
Serial Log with Oops

I was able to setup serial port logging and reproduce the issue capturing the Oops.  I've attached the log.  Still working on setting up the 5.3 setup.

Comment 7 Mike Christie 2009-05-07 18:32:26 UTC
Ok, do not worry about the multipath comment. If it is easier just run with it.

Comment 8 Hector Arteaga 2009-05-12 17:33:51 UTC
I've loaded RH5.3 and presented some storage to it.  So far I've tried a couple of times to reproduce unsuccessfully, although the first time I the test did fail but in a different mannter.  The server did not reboot, but it was very unresponsive and any multipath related commands would hang.  This caused most of the test processes from our test bench to time out.

Comment 9 Hector Arteaga 2009-05-15 01:14:25 UTC
Created attachment 344068 [details]
RH5.3 messages log

Third time's a charm. :)  I was able to reproduced the issue on RH5.3.  The difference between this run and the other runs is that on the other runs I was generating IO directly from Dom0 and did not have VMs installed.  This last run I created VMs to make the setup more like the original failing setup and was generating IO through the VMs.  So there may be some xen/hypervisor interaction that may be causing this.  I've attached the /var/log/messages from my RH5.3 server that has the unexpected reboot (at May 14 14:35:55).  Also, I started the test at around 1:00 pm.

Comment 12 Mike Christie 2009-06-08 15:45:56 UTC
Hector,

What net driver are you using? If it is not bnx2x, could you try this kernel:
http://people.redhat.com/dzickus/el5/152.el5/

Comment 13 Hector Arteaga 2009-06-09 02:15:59 UTC
I installed Kernel 152.el5 from the source you specified.  All I did was run:
rpm -ivh kernel-xen-2.6.18-152.el5.x86_64.rpm

I seem to recall that updating the kernel was more involved than that, but its been a while since I've done a kernel updae.  Is there something else that needs to be done?
The kernel seems to have loaded fine and I was able to boot the systme up with that kernel:

# uname -a
Linux cb-xen-vm220.caslbahn.com 2.6.18-152.el5xen #1 SMP Wed Jun 3 19:14:54 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

The problem I am having now is that when I try to start the VMs that I had previously created I get the following error message:

# xm create vm221_Win2003
Using config file "/etc/xen/vm221_Win2003".
Error: Device 768 (vbd) could not be connected. Hotplug scripts not working.

Do I have to recreate the VMs?  Any ideas?

Comment 15 Mike Christie 2009-06-09 16:03:36 UTC
(In reply to comment #12)
> Hector,
> 
> What net driver are you using? If it is not bnx2x, could you try this kernel:
> http://people.redhat.com/dzickus/el5/152.el5/  

Could you also try this iscsi rpm
http://people.redhat.com/mchristi/iscsi/rhel5.4/iscsi-initiator-utils/
with the updated kernel?

Comment 16 Mike Christie 2009-06-09 16:07:18 UTC
(In reply to comment #13)
> I installed Kernel 152.el5 from the source you specified.  All I did was run:
> rpm -ivh kernel-xen-2.6.18-152.el5.x86_64.rpm
> 
> I seem to recall that updating the kernel was more involved than that, but its
> been a while since I've done a kernel updae.  Is there something else that
> needs to be done?

I am not sure with XEN. On some of my test systems I have just updated the kernel. Sometimes you might have to update other rpms. With XEN though, I am not sure.


> The kernel seems to have loaded fine and I was able to boot the systme up with
> that kernel:
> 
> # uname -a
> Linux cb-xen-vm220.caslbahn.com 2.6.18-152.el5xen #1 SMP Wed Jun 3 19:14:54 EDT
> 2009 x86_64 x86_64 x86_64 GNU/Linux
> 
> The problem I am having now is that when I try to start the VMs that I had
> previously created I get the following error message:
> 
> # xm create vm221_Win2003
> Using config file "/etc/xen/vm221_Win2003".
> Error: Device 768 (vbd) could not be connected. Hotplug scripts not working.
> 

I will do some digging to see if other rpms are needed.

> Do I have to recreate the VMs?  Any ideas?

Comment 18 Hector Arteaga 2009-06-09 17:16:46 UTC
I figured out why my VMs weren't starting.  Whenever I reboot my server, my LUNs are discovered and LVM finds all the VGs and LVs but all the LVs are not in an active state.  I have to manually activate them everytime I reboot using vgchange -ay VGname.  I've seen this before and should have caught this yesterday.  
I've installed the new kernel and updated iscsi-tools.  I'll be re-running the test today and I'll let you know how it goes.

Comment 19 Hector Arteaga 2009-06-09 19:57:51 UTC
No luck.
I was able to reproduce the system crash with the new kernel and iscsi-tools.

Jun  9 11:25:09 cb-xen-vm220 multipathd: sdw: tur checker reports path is down
Jun  9 11:25:09 cb-xen-vm220 multipathd: sdy: tur checker reports path is down
Jun  9 11:27:48 cb-xen-vm220 syslogd 1.4.1: restart.
Jun  9 11:27:48 cb-xen-vm220 kernel: klogd 1.4.1, log source = /proc/kmsg started.

Comment 21 Mike Christie 2009-06-26 00:38:04 UTC
(In reply to comment #20)
> (In reply to comment #17)
> >
> > If I do a close() and the last sockfd_put/sock_put on the sock/sock->sk on the
> > socket, when those return should the network layer be touching any of the pages
> > I sent to it? 
> 
> If you simply closed the fd and dropped *your* reference count on the socket,
> then yes of course the network lay can still be touching your page since it may
> be part of an inflight skb.
> 
> However, if the socket's own refcount hits zero and is about to be freed, then
> there should be no skbs referring to the socket.  However, the pages may still
> be in use.  This does not appear to be the case in this crash though.
> 
> So the only safe time when you can touch the pages is when the page refcount
> hits 1 (if you still have your own reference, otherwise zero).  


I wanted to narrow down the problem and verify the above, so I sent Hector a kernel/patch that has the iscsi code just use kernel_sendmsg(). This way the pages the network layer is accessing and using are all allocated and owned by the network layer.

He is still getting the same problem though. Is this a valid test and can we say it might not be the iscsi layer fudging up, or is this problem still possible when we use kernel_sendmsg()? And if so, would it be a problem with the network layer's memory/page management?

Comment 23 Mike Christie 2009-06-26 00:44:00 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > (In reply to comment #17)
> > >
> > > If I do a close() and the last sockfd_put/sock_put on the sock/sock->sk on the
> > > socket, when those return should the network layer be touching any of the pages
> > > I sent to it? 
> > 
> > If you simply closed the fd and dropped *your* reference count on the socket,
> > then yes of course the network lay can still be touching your page since it may
> > be part of an inflight skb.
> > 
> > However, if the socket's own refcount hits zero and is about to be freed, then
> > there should be no skbs referring to the socket.  However, the pages may still
> > be in use.  This does not appear to be the case in this crash though.
> > 
> > So the only safe time when you can touch the pages is when the page refcount
> > hits 1 (if you still have your own reference, otherwise zero).  
> 
> 
> I wanted to narrow down the problem and verify the above, so I sent Hector a
> kernel/patch that has the iscsi code just use kernel_sendmsg().


Herbert, ignore this. I think the kernel might have been bad. I will update the bz when I have Hector try a new kernel.

Comment 24 Herbert Xu 2009-06-26 01:10:24 UTC
Can you attach a copy of the patch here? Thanks! Also please attach the new oops message with the patch applied.

Comment 25 Mike Christie 2009-10-21 15:24:20 UTC
Waiting on results from Hector at HP. Setting devel ack for 5.5 incase we get the info in time.

Comment 28 RHEL Program Management 2010-08-09 19:11:19 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 30 Tom Coughlan 2011-02-01 14:11:27 UTC
waiting for test results since 2009-10-21
closing