Created attachment 407233 [details]
Screenshot showing connection and timeout issues.

Description of problem:
The changes made to the /etc/init.d/iscsid script are preventing shutdown or reboot of the machine.

Version-Release number of selected component (if applicable):
iscsi-initiator-utils-6.2.0.871-0.16.el5

How reproducible:
Always

Steps to Reproduce:
1. Add an iSCSI device to a RHEL 5.5 machine through iscsiadm
2. Reboot machine

Actual results:
Machine hangs waiting for sync of iSCSI devices after the network is down.

Expected results:
Machine reboots.

Additional info:
Info from the job logged with Red Hat support

16-APR-2010 04:38:33 Matt Clark
File shutdown-screen-old-iscsid.jpg attached

16-APR-2010 04:38:32 Matt Clark
OK, it is now working with the old 5.4 shutdown scripts, which delete the network shutdown symlinks from the rc0.d and rc6.d directories.

Attached is a screenshot of the behaviour now that the network is not getting shut down as part of runlevel 0 or 6. The screenshot is from a fresh 5.5 build of Red Hat with the 2.6.18-194 kernel and the /etc/init.d/iscsid script from the iscsi-initiator-utils package on Red Hat 5.4 (iscsi-initiator-utils-6.2.0.871-0.10.el5.x86_64.rpm).

It seems to fail at a different layer and is obviously not ideal; this probably goes back to the thread that I pasted in an earlier entry. Something is missing to allow this all to happen gracefully. The error that now occurs is returned immediately instead of after a timeout. It could be something like the network layer still being there but the ports being down, causing an immediate error on attempts to contact the iSCSI device.

16-APR-2010 04:15:52 Matt Clark
Sorry, that should be the /etc/rc6.d directory that I would like a listing of.

Anyway, I had a look at the iscsi and iscsid kill order for 5.4 and they match what I have for 5.5. So this is not the problem....
I did find a difference in the two iscsi shutdown scripts:

[matt@axiom tmp]$ diff iscsid-5.4 iscsid-5.5
19,26d18
< echo -n $"Turning off network shutdown. "
< # we do not want iscsi or network to run during system shutdown
< # incase there are RAID or multipath devices using
< # iscsi disks
< chkconfig --level 06 network off
< rm /etc/rc0.d/*network
< rm /etc/rc6.d/*network
<
31a24
> modprobe -q be2iscsi
71a65,66
> rmmod be2iscsi 2>/dev/null

Looking at the comment above, it seems pretty clear that this is exactly my case. I will try using the 5.4 shutdown script and see what happens, but obviously the long-term solution would be to get this fixed in the release, as I don't want to be using any non-standard Red Hat scripts on these machines.

Are you able to find out the reasoning for the change?

Thanks,
Matt.

16-APR-2010 03:26:55 Matt Clark
Hi Kenji,

To answer your questions:

1. There was no I/O on the system (devices not even mounted).
2. Sync is not necessary as the devices are not mounted.
3. The connection errors occur because the network has been shut down by the init scripts and the iSCSI device is therefore unreachable.
4. Yes, we are using the Red Hat iscsi-initiator-utils.
5. I downgraded the kernel and now find that I am having the same problem on 2.6.18-164. So maybe it's related to changes in the iscsi package. That explains why you didn't see the problem with 2.6.18-194 installed on your 5.4 system.

Just to reiterate, this problem wasn't there in 5.4, and I use exactly the same kickstart to rebuild these machines as fresh 5.5 machines. I did not do a yum upgrade from 5.4.

Hopefully this is a simple issue with rcX.d shutdown ordering. I'll try to reinstall a 5.4 image. Actually, if you could give me an ls of the /etc/rc5.d dir from your test system, that could possibly help.

Thanks,
Matt.

16-APR-2010 03:26:55 Matt Clark
Status changes from "Waiting on Customer" to "Waiting on Red Hat".
15-APR-2010 06:30:24 Suzuoki, Kenji
Status changes from "Waiting on Red Hat" to "Waiting on Customer".

15-APR-2010 06:05:19 Suzuoki, Kenji
Hello,

I could not reproduce the issue by simply upgrading the kernel to 2.6.18-194.el5. The "Synchronizing SCSI cache for disk <disk name>" message suggests that the kernel sends an instruction to the device through the normal SCSI command structure and waits for the command to complete.

Could you provide us with the following information?

1. Was the system doing any I/O to the iSCSI device when shutting down?
2. Can you execute the sync command before shutting down or rebooting the system to see if you still encounter the issue?
3. There are many connection-error messages in the screenshot. Are you aware of their cause and where they come from?
4. Does the system use the iscsi-initiator-utils package provided by Red Hat?
5. Reproducer: since we did not manage to reproduce the issue by simply updating the kernel, could you provide the detailed steps to reproduce it?

Best Regards,
Kenji Suzuoki
Red Hat Global Support Services

15-APR-2010 04:16:49 Suzuoki, Kenji
Hi Matt,

My name is Kenji Suzuoki, a technician in the APAC region. I have taken ownership of the issue.

From the URL and the symptom, it seems like a bug. I am currently trying to upgrade my RHEL 5.4 system from 2.6.18-164.el5 to 2.6.18-194.el5 to see if it is reproducible on my end as well. I will get back to you once I have more information or manage to reproduce the issue.

It would also be greatly appreciated if you could switch the kernel to the RHEL 5.4 version (2.6.18-164.el5) and confirm that the problem does NOT appear with the same settings on your end.

Best Regards,
Kenji Suzuoki
Red Hat Global Support Services

14-APR-2010 04:16:17 Matt Clark
Sorry, that should read that the VMs are NOT set to autostart on this machine and hence are not part of the issue.
14-APR-2010 04:14:26 Matt Clark
File sosreport-mclark.2011844-996904-82cf37.tar.bz2 attached

14-APR-2010 04:14:25 Matt Clark
Hi Dominic,

The machine doesn't reboot after this. Because there are 4 interfaces on the iSCSI device and 2 LUNs shared, I would expect that it might take 24 (4x2x3) minutes to reboot. I did leave a machine in this state for quite some time and I am pretty sure it was longer than this, so the answer is I am not sure. Even at 24 minutes, that is just too long.

As for virtual machines, yes, there are virtual machines on this physical host; however, my tests were done without firing up any of the virtual machines, i.e. a reboot directly after the machine started (and the VM's are set to autostart).

SOS report attached.

A thread where something that sounds like this issue is discussed:
http://osdir.com/ml/linux.iscsi.open-iscsi/2008-05/msg00198.html

The RHEL version is 5.5 and I can confirm this was not happening in 5.4. Unfortunately I need 5.5 to solve bug 487763 (multiple MACs on a bonded interface with respect to the bridging interface).

Thanks,
Matt.

14-APR-2010 04:14:25 Matt Clark
Status changes from "Waiting on Customer" to "Waiting on Red Hat".

13-APR-2010 13:10:36 Dominic Padinjattumkara Geevarghese
Status changes from "Waiting on Red Hat" to "Waiting on Customer".

13-APR-2010 13:04:43 Dominic Padinjattumkara Geevarghese
Dear Sir,

Thank you for contacting Red Hat Support.

Are you using iSCSI on VMs? Please provide details.

Also, I can see the line "timing out command, waited 180s" in the screenshot. Does the machine reboot after 180s, or does it keep waiting beyond that time?

What RHEL version are you using? It would be great if you could provide a sosreport so that we can understand the general configuration details, logs, etc. Please refer to http://kbase.redhat.com/faq/docs/DOC-2366

Thanks,
Dominic

13-APR-2010 05:07:14 Dominic Padinjattumkara Geevarghese
Status changes from "Open" to "Waiting on Red Hat".
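The 24-minute estimate in the 14-APR-2010 04:14:25 entry follows from 4 interfaces x 2 LUNs x a 180-second command timeout. A minimal sketch of that arithmetic is below; the function name is hypothetical, and it assumes every path to every LUN times out strictly one after another, which the logs do not confirm.

```shell
# Hypothetical helper: worst-case wait in minutes if each path to each LUN
# waits out the full command timeout in sequence.
# Arguments: number of paths, number of LUNs, per-command timeout in seconds.
worst_case_sync_minutes() {
    echo $(( $1 * $2 * $3 / 60 ))
}

# 4 interfaces, 2 LUNs, 180 s timeout
worst_case_sync_minutes 4 2 180
```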
13-APR-2010 03:52:48 Matt Clark
There seems to be an issue with the ordering of the removal of the network and the flushing of the SCSI cache. Looking at a few posts, it seems that the flushing of the SCSI cache is a kernel feature (and not something that can be run before the network is taken down). Basically the network is taken down, and then as one of the very last steps the md devices are stopped, which triggers a SCSI cache sync, and this hangs as there is no access to the iSCSI device.

Is there something I can do to avoid this?

13-APR-2010 03:52:48 Matt Clark
File iscsi-cache-issue.bmp attached.
Created attachment 407234 [details] screenshot of shutdown when using the redhat 5.4 iscsid script This screen shot is from a fresh build of redhat 5.5 with only the /etc/init.d/iscsid script replaced with the one from the redhat 5.4 iscsi-initiator-utils.
(In reply to comment #0)
> Is there something I can do to avoid this?
> 13-APR-2010 03:52:48 Matt Clark
> File iscsi-cache-issue.bmp attached.

You can work around this problem by changing the cache settings on the target so that it does not require a cache sync to be sent on shutdown. I think you would set the cache settings to something like write-through. If this is not possible, I think you can run

chkconfig --level 06 network off

by hand. However, I am working on a fix and should be done shortly. I think all we need is an

iscsiadm -m node --logoutall=all

call added to the "stop" section of the /etc/init.d/iscsi script, but I have to double-check that for boot it is setting node.startup=boot so that boot/root sessions do not get shut down too.
I am a bit lacking in understanding of how iscsiadm persistency works, so this may be an irrelevant question, but wouldn't that mean you would have to log in again to each of the iSCSI portals at boot? Or does the automatic login still function as a result of the entries in /var/lib/iscsi/send_targets? I don't have a test machine to play with for the next couple of days, so I can't try this myself...
(In reply to comment #3) > I am a bit lacking in the understanding of how the iscsiadm persistency works, > so this may be an irrelevant question but wouldn't that mean you would have to > re-login to the each of the iscsi portals at boot? Or does the automatic login We already log into all the targets at boot. Currently when you shutdown/reboot, the session does not get a complete shutdown. There is no iscsi logout sent. But the disks are synced if needed. On startup then the initiator sends a login command and the target recognizes this as being a continuation of the old session or starts a new one if it has cleaned up the old one.
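For anyone who wants to check which node records are set to log in at boot, the sketch below pulls the node.name and node.startup fields out of node-record text. It assumes the "key = value" layout that iscsiadm prints when showing a record (for example from iscsiadm -m node -T <target> -p <portal>); the function name is hypothetical and not part of iscsi-initiator-utils.

```shell
# Hypothetical helper: read iscsiadm node-record text on stdin and print
# "<target name> <startup mode>" for each record found.
# Assumes lines of the form "node.name = ..." and "node.startup = ...".
show_node_startup() {
    awk -F' = ' '
        $1 == "node.name"    { name = $2 }
        $1 == "node.startup" { print name, $2 }
    '
}
```

A record would be piped in as: iscsiadm -m node -T <target> -p <portal> | show_node_startup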
Just wanted to update with some status. The first fix I tried, from comment #2, broke setups that use iSCSI root. I thought they used the startup=boot flags, but they do not. I am working on a more complex fix.
*** Bug 590173 has been marked as a duplicate of this bug. ***
Hi,

I am still working on a fix for this. I just wanted to add a temporary workaround.

You can just run the same commands that the iscsi script was running. However, you only need to run them when you have made changes to the net init scripts (such as when you update your system or the init scripts rpm). The iscsi script ran them every time it ran, in case a user updated the net init scripts settings after installing the iscsi tools.

So after you have installed iscsi-initiator-utils and the init scripts, just run:

chkconfig --level 06 network off
rm /etc/rc0.d/*network
rm /etc/rc6.d/*network
*** Bug 584912 has been marked as a duplicate of this bug. ***
Created attachment 429730 [details] Patch for /etc/init.d/network to check iSCSI sessions. Hi, I made a modification in a /etc/init.d/network to check if there is an existing iSCSI session during reboot/shutdown. If there is one, the network service does not stop.
(In reply to comment #10) > Created an attachment (id=429730) [details] > Patch for /etc/init.d/network to check iSCSI sessions. > > Hi, > > I made a modification in a /etc/init.d/network to check if there is an existing > iSCSI session during reboot/shutdown. If there is one, the network service does > not stop. Nice. Thanks for the patch. I will check with the net scripts maintainer to see if it is ok with them. It seems to handle all the setups/scenarios.
(In reply to comment #10)
> Created an attachment (id=429730) [details]
> Patch for /etc/init.d/network to check iSCSI sessions.
>
> Hi,
>
> I made a modification in a /etc/init.d/network to check if there is an existing
> iSCSI session during reboot/shutdown. If there is one, the network service does
> not stop.

A good patch, but it fails if there is more than one iSCSI session open.
This would do the job for multiple sessions:

if [ `find /sys/class/iscsi_session/ -mindepth 1 -maxdepth 1 -type d | wc -l` -ge 1 ]; then
(In reply to comment #15)
> (In reply to comment #10)
> > Created an attachment (id=429730) [details] [details]
> > Patch for /etc/init.d/network to check iSCSI sessions.
> >
> > Hi,
> >
> > I made a modification in a /etc/init.d/network to check if there is an existing
> > iSCSI session during reboot/shutdown. If there is one, the network service does
> > not stop.
>
> A good patch, but fails if there is more than one iSCSI session open.
> This would do the job for multiple sessions:
>
> if [ `find /sys/class/iscsi_session/ -mindepth 1 -maxdepth 1 -type d | wc -l` -ge 1 ]; then

Yes, I can confirm that. I tested your patch with

find /sys/class/iscsi_session/ -mindepth 1 -maxdepth 1 -type d | wc -l

and it works very well. I use an MSA2312i with 4 sessions. If I use this:

[ -d /sys/class/iscsi_session/session* ] && echo "OK"

I get:

-bash: [: too many arguments

Is there any chance that this patch will appear in a future release?
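The "too many arguments" error arises because the shell expands session* into several words before [ runs, so -d receives more than one operand. Below is a sketch of a check that tolerates zero, one, or many sessions and stays quiet when the sysfs directory does not exist. The function name and the optional directory argument are illustrative only and are not part of the attached patch.

```shell
# Hypothetical helper: succeed if at least one iSCSI session directory exists.
# The optional argument overrides the sysfs path (useful for testing).
iscsi_sessions_active() {
    dir=${1:-/sys/class/iscsi_session}
    for s in "$dir"/session*; do
        # An unmatched glob stays literal, so -d also filters out that case.
        [ -d "$s" ] && return 0
    done
    return 1
}
```

It could be used as: if iscsi_sessions_active; then echo "iSCSI sessions present"; fi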
(In reply to comment #16)
> Is there any chance that this patch we'll appear in a next release.

For the net script patch in this bz, I am waiting on the init script maintainer to review the patch and OK it.

I made an iscsi-initiator-utils z-stream release that adds some code to turn off the network shutdown when the iscsi rpm is installed (basically it does what the iscsi init script was doing before). It is not perfect and is not a complete fix, but it is better than what we have now. It is being tested now. Hopefully it will just be a band-aid until we hear back from the init script maintainer.
(In reply to comment #17) > I made a iscsi-initiator-utils z stream release that added some code to turn > off the network shutdown when iscsi rpm is installed (basically does what the > iscsi init script was doing before). Oh yeah, I put the rpm I mentioned here: http://people.redhat.com/mchristi/iscsi/rhel5.6/iscsi-initiator-utils/
I just moved my iSCSI setup to a CentOS 5.5 system (the 5.4->5.5 system was fine), ran into this bug, and tried iscsi-initiator-utils-6.2.0.871-0.18.el5.x86_64.rpm per comment 18, but it failed to fix the issue. The system hangs on reboot while syncing the SCSI cache for sde, which is in /etc/fstab as:

/dev/sde1 /b xfs noatime,_netdev,nodev 0 4

Boot-up is fine; iscsi logs in and /b is mounted. Root is on a local disk.
What does:

chkconfig --list network

output?

Are there /etc/rc0.d/*network or /etc/rc6.d/*network links?

If you run:

chkconfig --level 06 network off
rm /etc/rc0.d/*network
rm /etc/rc6.d/*network

by hand, does it work (try several reboots to make sure nothing is resetting the network init scripts to on)?
chkconfig --list network
network 0:off 1:off 2:on 3:on 4:on 5:on 6:off

ls -1 /etc/rc[06].d/*network
/etc/rc0.d/K90network
/etc/rc6.d/K90network

sudo chkconfig --level 06 network off
ls -1 /etc/rc[06].d/*network
/etc/rc0.d/K90network
/etc/rc6.d/K90network

sudo rm /etc/rc[06].d/*network
ls -1 /etc/rc[06].d/*network
ls: /etc/rc[06].d/*network: No such file or directory

Rebooted twice (the first time was okay) with no hang.

It seems the /etc/rc[06].d/*network links need to be removed manually.

Thanks, Mike! Let me know if I can be of further assistance in coming to the final resolution.
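As the session above shows, chkconfig --level 06 network off leaves the K90network kill links in place, so the links have to be removed separately. The sketch below lists any network kill links that remain for runlevels 0 and 6. The function name and the optional base-directory argument are hypothetical, added only so the check can be tried against a scratch directory.

```shell
# Hypothetical helper: print any network kill links for runlevels 0 and 6.
# The optional argument is the directory holding rc0.d and rc6.d (default /etc).
network_kill_links() {
    base=${1:-/etc}
    for link in "$base"/rc0.d/K*network "$base"/rc6.d/K*network; do
        # -e is false for an unmatched (literal) glob; -L also catches
        # dangling symlinks.
        if [ -e "$link" ] || [ -L "$link" ]; then
            echo "$link"
        fi
    done
}
```

Empty output from network_kill_links suggests the network service will not be stopped by the rc scripts at halt or reboot.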
Adding Dell's request for a 5.5-z and RHEL 5.6 fix. Dell will be testing the fix.
We are also seeing this issue with a Dell MD3000i using their delivered MPP multi-path drivers. The patch above using the find variation fixed the problem. It seems to me that these _netdev, non-root, iSCSI devices SHOULD be removed before the network is stopped. The following in /etc/init.d/iscsi is what is causing the iscsi scripts to not remove the devices.

# If this is a final shutdown/halt, do nothing since
# lvm/dm, md, power path, etc do not always handle this
if [ "$RUNLEVEL" = "6" -o "$RUNLEVEL" = "0" -o "$RUNLEVEL" = "1" ]; then
    success
    return
fi

Which script should be monitoring these network-dependent devices? Not sure. Should we just leave the network up until the plug is pulled?

Thanks for all the info! If there is anything I can do, or any information I can provide, please let me know.

-Chris
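The early return quoted above can be isolated as a small predicate, which makes it easier to see which runlevels skip the rest of stop(). This is only a sketch: the function name is hypothetical, and the real script reads $RUNLEVEL from the environment rather than taking an argument.

```shell
# Hypothetical helper: succeed for the runlevels the iscsi init script treats
# as a final shutdown/halt (0, 1 and 6), fail for anything else.
is_final_shutdown() {
    case "$1" in
        0|1|6) return 0 ;;
        *)     return 1 ;;
    esac
}
```

Anything placed after such a check in stop(), including device removal or a logout, is never reached during halt or reboot.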
Created attachment 449573 [details]
Don't turn off net on shutdown if iscsi is running

This combines the patch from comment #10 with the change suggested in comment #15.

initscript devs, is this patch OK? The previous fix I tried in this bz does not work when the initscripts are installed before iscsi.
Given that we silently exit regardless of the runlevel if root is on a network block device, not sure why we'd test the runlevel here. But it's a reasonable fix.
I've tried the patch from comment 38, and it works very well on reboot (I used shutdown -r now). I just patched my /etc/init.d/network and rebooted my host.
Just my two cents: I remember that even when shutdown worked on the client, the TCP connections (or on ISCSI level too) according to target were not closed, preventing target reboot. Are they closed properly now?
(In reply to comment #43)
> Just my two cents:
>
> I remember that even when shutdown worked on the client, the TCP connections
> (or on ISCSI level too) according to target were not closed, preventing target
> reboot.
>
> Are they closed properly now?

No. In RHEL 6 we do an explicit logout on shutdown/reboot, but in RHEL 5 we still leave them open because apps using iSCSI are not prepared for the devices to be removed (in RHEL 5, apps assumed it would work like Fibre Channel, where the /dev/sdX devices are not removed during shutdown/reboot).
(In reply to comment #31)
> We are also seeing this issue with a Dell MD3000i using their delivered MPP
> multi-path drivers. The patch above using the find variation fixed the problem.
> It seems to me that these _netdev, non root, iSCSI devices SHOULD be removed
> before network is stopped. The following in /etc/init.d/iscsi is what is
> causing the iscsi scripts to not remove the devices.

Hi Chris, same issue here. This is what worked for me:

The MPP driver install actually does handle this situation properly, using the less than elegant method of adding a few commands to stop() in /etc/init.d/iscsi. It adds the following code between the check for root-on-iscsi and 'iscsiadm -m node --logoutall=all':

#BEGIN_MPP_ADDITION
# added by MPP/RDAC driver to prevent filesystem corruption on mpp iscsi devices.
if [ -x /opt/mpp/mppiscsi_umountall ] ; then
    /opt/mpp/mppiscsi_umountall -tkur5
fi
#END_MPP_ADDITION

The problem is that since the RUNLEVEL check from the comments above was added, stop() returns before it gets there. I moved the MPP addition above the RUNLEVEL check so it gets executed before stop() returns, which seems to work. I'm tempted to remove the RUNLEVEL check so iscsiadm logs out properly, but I'm not sure I want to change more than I have to.

So, the stop() function in /etc/init.d/iscsi on my system starts like this:

stop()
{
    rm -f /var/lock/subsys/iscsi

    #BEGIN_MPP_ADDITION
    # added by MPP/RDAC driver to prevent filesystem corruption on mpp iscsi devices.
    if [ -x /opt/mpp/mppiscsi_umountall ] ; then
        /opt/mpp/mppiscsi_umountall -tkur5
    fi
    #END_MPP_ADDITION

    # If this is a final shutdown/halt, do nothing since
    # lvm/dm, md, power path, etc do not always handle this
    ....

The system reboots properly now, no longer hanging on "Syncing disk cache".
Update:
My above fix worked until I actually had a filesystem mounted, then it was back to hanging on Syncing disk cache. The filesystem was mounted with _netdev, so it was unmounted early in the shutdown sequence (checked by printing the output of 'mount' during the process).

The next workaround was to revert my above changes and stop the physical interfaces from shutting down, which works. Is there a disadvantage to leaving the network adapters up until power-off/reboot?

/etc/init.d/network:
246c246,249
< for i in $vpninterfaces $xdslinterfaces $bridgeinterfaces $vlaninterfaces $remaining; do
---
> # MAP 20101013 - remove 'remaining' set (physical) since it hoses up iscsi
> # shutdown / mpp
> #for i in $vpninterfaces $xdslinterfaces $bridgeinterfaces $vlaninterfaces $remaining; do
> for i in $vpninterfaces $xdslinterfaces $bridgeinterfaces $vlaninterfaces ; do

One issue with this would be if the iSCSI route was on a vpn, xdsl, or bridge interface, since those still get shut down.
(In reply to comment #47) > Update: > My above fix worked until I actually had a filesystem mounted, then back to > hanging on Syncing disk cache. The filesystem was mounted with _netdev, so it > was unmounted early in the shutdown sequence (checked with a 'mount' to print > out during the process). > > The next workaround was to revert my above changes and stop the physical > interfaces from shutting down, which works. Is there a disadvantage to leaving > the network adapters up until power-off/reboot? > That is what we were doing prior to RHEL 5.5 which is why we are hitting this problem now. See the patch in comment #38 which leaves the network on if iscsi is running. Also for nfs and iscsi root we do this now.
Mike, I tried this patch from comment #38 and it worked - Do you know when it will be released?
(In reply to comment #49) > Mike, > > I tried this patch from comment #38 and it worked - Do you know when it will be > released? It looks like it is checked in and being QAd for 5.6.
I take my comment #49 back. It only appeared to work because I had tried it without mapping any volumes to the host. Once I mapped some volumes and rebooted, the host showed the soft panic below and never came back up; session logout did not help. The host was still accessible via ssh, so I used that to disable the iSCSI ports and then reboot, which worked. I then re-enabled the ports and re-established the sessions.

iscsi package version: iscsi-initiator-utils-6.2.0.871-0.16.el5

Oct 22 17:40:15 kswc-warden shutdown[5304]: shutting down for system reboot
Oct 22 17:40:16 kswc-warden kernel: INFO: task events/0:14 blocked for more than 120 seconds.
Oct 22 17:40:16 kswc-warden kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 17:40:16 kswc-warden kernel: events/0 D ffff81012ff64000 0 14 1 15 13 (L-TLB)
Oct 22 17:40:16 kswc-warden kernel: ffff810037f35a40 0000000000000046 ffffffff880755a6 0000000000000000
Oct 22 17:40:16 kswc-warden kernel: ffff81012ff64000 000000000000000a ffff81012fb4b080 ffff81010271b080
Oct 22 17:40:16 kswc-warden kernel: 000000257f0dc7dd 00000000000033e7 ffff81012fb4b268 0000000000000001
Oct 22 17:40:16 kswc-warden kernel: Call Trace:
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff880755a6>] :scsi_mod:scsi_done+0x0/0x18
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8006417d>] wait_for_completion+0x8f/0xa2
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8008e16d>] default_wake_function+0x0/0xe
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff80064c6f>] __mutex_lock_slowpath+0x60/0x9b
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff80064cb9>] .text.lock.mutex+0xf/0x14
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8009ecdc>] flush_workqueue+0x3f/0x87
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8014e897>] cfq_exit_queue+0x14/0xf4
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8014371a>] elevator_exit+0x29/0x45
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff801461f6>] blk_cleanup_queue+0x37/0x42
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8807d6dd>] :scsi_mod:scsi_device_dev_release_usercontext+0x8f/0xd9
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8009ebb9>] execute_in_process_context+0x23/0x5a
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff801519ef>] kobject_cleanup+0x53/0x7e
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff80151a1a>] kobject_release+0x0/0x9
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff80035748>] kref_put+0x6f/0x7a
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8807c707>] :scsi_mod:scsi_probe_and_add_lun+0x9a0/0x9c9
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8807ac4d>] :scsi_mod:scsi_execute_req+0x78/0xce
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8807d00f>] :scsi_mod:__scsi_scan_target+0x410/0x5c7
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff880cc729>] :mppUpper:mpp_SynchronousIo+0x104/0x13d
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8807d20b>] :scsi_mod:scsi_scan_channel+0x45/0x70
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8807d2f6>] :scsi_mod:scsi_scan_host_selected+0xc0/0xfa
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff882fa9b9>] :mppVhba:mppLnx_vhba_regVirtualHost+0x673/0x691
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff882faaf4>] :mppVhba:mppLnx_register_virtual_hosts+0x11d/0x168
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff882fab81>] :mppVhba:mppLnx_vhbaScanHost+0x42/0x6f
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff882fae9e>] :mppVhba:mppLnx_vdAddWorkHandler+0x2f0/0x32b
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff882fabae>] :mppVhba:mppLnx_vdAddWorkHandler+0x0/0x32b
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8004dc37>] run_workqueue+0x94/0xe4
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8004a472>] worker_thread+0x0/0x122
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8004a562>] worker_thread+0xf0/0x122
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8008e16d>] default_wake_function+0x0/0xe
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff80032bdc>] kthread+0xfe/0x132
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8005efb1>] child_rip+0xa/0x11
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff80032ade>] kthread+0x0/0x132
Oct 22 17:40:16 kswc-warden kernel: [<ffffffff8005efa7>] child_rip+0x0/0x11
Oct 22 17:40:16 kswc-warden kernel:
Oct 22 17:40:16 kswc-warden kernel: INFO: task hald-probe-seri:4179 blocked for more than 120 seconds.
Oct 22 17:40:16 kswc-warden kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 17:40:17 kswc-warden kernel: hald-probe-se D ffff810080057aa0 0 4179 4077 4422 4177 (NOTLB)
Oct 22 17:40:17 kswc-warden kernel: ffff81012d5f7db8 0000000000000082 0000000000000000 0000000000000001
Oct 22 17:40:17 kswc-warden kernel: 0000000000000296 0000000000000009 ffff81012d968820 ffff81012fc0c7a0
Oct 22 17:40:17 kswc-warden kernel: 00000024695de668 00000000000be794 ffff81012d968a08 000000032e08b180
Oct 22 17:40:17 kswc-warden kernel: Call Trace:
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff8009ec6f>] flush_cpu_workqueue+0x7f/0xad
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff800a1ba4>] autoremove_wake_function+0x0/0x2e
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff80064b05>] mutex_lock+0xd/0x1d
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff8009ecfd>] flush_workqueue+0x60/0x87
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff801a9f79>] release_dev+0x503/0x67b
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff80067b88>] do_page_fault+0x4fe/0x874
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff80053ca3>] tty_release+0x11/0x1a
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff80012ac5>] __fput+0xd3/0x1bd
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff80023bd1>] filp_close+0x5c/0x64
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff8001dff3>] sys_close+0x88/0xbd
Oct 22 17:40:17 kswc-warden kernel: [<ffffffff8005e28d>] tracesys+0xd5/0xe0
Oct 22 17:40:17 kswc-warden kernel:
(In reply to comment #51)
> I take my comment#49 back: Actually this did not work as I tried it without
> mapping any volumes to the host but once I mapped some volumes and rebooted,
> the host showed the soft panic below and the host never came back up - session
> logout did not help. The host was accessible via ssh. I used that to disable
> the iscsi ports then reboot and it worked then renabled them back again and
> restablish the sessions

Did this ever work for you, or did the problem just start in RHEL 5.5?

We never logged out of sessions before. In RHEL 5.4 and earlier we just left them running with the network up. In RHEL 5.5 we brought down the network. The patch in this bz just adds back the behavior of leaving the network up.

> Oct 22 17:40:15 kswc-warden shutdown[5304]: shutting down for system reboot
> :scsi_mod:__scsi_scan_target+0x410/0x5c7

Why are you scanning the target at shutdown?
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: Prior to this update, an attempt to reboot or shut down a system with a running Internet Small Computer System Interface (iSCSI) daemon may have caused the system to stop responding. This was caused by the fact that the system was waiting for iSCSI devices to sync, even though the network was already shut down. With this update, the /etc/rc.d/init.d/network startup script has been modified not to deactivate network interfaces when the iSCSI daemon is running, and the system can be shut down or rebooted as expected.
To avoid this:

....
Shutting down system logger:
find: /sys/class/iscsi_session/: No such file or directory
Shutting down interface eth0:
...

you could replace

if [ `find /sys/class/iscsi_session/ -mindepth 1 -maxdepth 1 -type d | wc -l` -ge 1 ]; then

with:

if [ $(ls -d /sys/class/iscsi_session/*/. 2>/dev/null | wc -l) -ge 1 ]; then
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0075.html