Bug 170434

Summary:

Deadlock in fc_target_unblock while shutting down the system

Product:

Red Hat Enterprise Linux 4

Reporter:

Bino J Sebastian <bino.sebastian>

Component:

kernel

Assignee:

Doug Ledford <dledford>

Status:

CLOSED ERRATA

QA Contact:

Brian Brock <bbrock>

Severity:

medium

Docs Contact:

Priority:

medium

Version:

4.0

CC:

bino.sebastian, coughlan, james.smart, jbaron

Target Milestone:

---

Target Release:

---

Hardware:

i386

OS:

Linux

Whiteboard:

Fixed In Version:

RHSA-2006-0575

Doc Type:

Bug Fix

Doc Text:

Last Closed:

2006-08-10 21:25:02 UTC

Bug Blocks:

181409, 185624, 192916

Attachments:

Description	Flags
customer's console messages with lpfc_log_verbose=0xffff	none
console messages from my local reproducer attempt	none
first stab at a patch (untested)	none
respun patch based on upstream mailing list thread	none
respun patch	none

Description Bino J Sebastian 2005-10-11 18:13:21 UTC

Description of problem:
If the link goes down on an lpfc HBA during, or just before, a system
shutdown, the system deadlocks and the shutdown never completes. This was
observed with the lpfc 8.0.6.x2 driver on the RHEL4 U1 (2.6.9-11.EL) and
RHEL4 U2 kernels.


Version-Release number of selected component (if applicable):
2.6.9-11.EL

How reproducible:
Every time

Steps to Reproduce:
1. Unplug the Fibre Channel cable and execute "shutdown -h now".
  
Actual results:
The system deadlocks and the shutdown does not complete.

Expected results:
Successful completion of the shutdown process.

Additional info:
	After investigating the upstream Fibre Channel block/unblock implementations,
it looks like the issue does not exist in the upstream 
kernel.

The following patch fixed this issue in the upstream kernel:
http://marc.theaimsgroup.com/?l=linux-scsi&m=109776477201239

This bug was created to request that the above patch be merged into the
next RHEL 4 update release. If you need any help or more information
regarding this issue, please feel free to contact
James Smart     Email: james.smart
Bino Sebastian  Email: bino.sebastian

Comment 1 Bino J Sebastian 2005-10-11 18:16:33 UTC

Here is more information about the deadlock:
Test Environment:
==============
OS: RHEL4 U1(2.6.9-11.EL) / RHEL4 U2
Server: i386 system
HBA:  LP1050 
FW: 1.91a1 
Driver: lpfc 8.0.16.17(topology=6, no switch) 
Storage:  NEC iStorageS2300/S100 
Test procedure: Unplug the Fibre Channel cable and execute the "shutdown -h now"
command. The shutdown does not complete and the system deadlocks.

Root cause of the problem:
====================
	When the link goes down on an lpfc HBA, the driver calls
fc_target_block for every target. After the nodev timeout expires, the
driver calls fc_target_unblock to unblock the devices. New commands
cannot complete until fc_target_unblock has finished.
============
void
fc_target_unblock(struct scsi_target *starget)
{
        /*
         * Stop the target timer first. Take no action on the del_timer
         * failure as the state machine state change will validate the
         * transaction.
         */
        if (cancel_delayed_work(&fc_starget_dev_loss_work(starget)))
                flush_scheduled_work();

        /* Walks the children; this is where devices_subsys.rwsem is needed. */
        device_for_each_child(&starget->dev, NULL, fc_device_unblock);
}

int device_for_each_child(struct device * dev, void * data,
                     int (*fn)(struct device *, void *))
{
        struct device * child;
        int error = 0;

        /* Read side of the semaphore that device_shutdown() takes for writing. */
        down_read(&devices_subsys.rwsem);
        list_for_each_entry(child, &dev->children, node) {
                if ((error = fn(child, data)))
                        break;
        }
        up_read(&devices_subsys.rwsem);
        return error;
}
==========
	fc_target_unblock calls device_for_each_child, which must take the
devices_subsys.rwsem semaphore for reading. This semaphore is the cause of
the deadlock: the halt process acquires it for writing during shutdown.
The device_shutdown code and the stack traces of the lpfc_dpc and halt
threads, which are the two parties to the deadlock, are shown below.

The halt process holds devices_subsys.rwsem and waits for the SYNC SCSI
command to complete. The lpfc driver cannot complete the SYNC command
until fc_target_unblock finishes, and fc_target_unblock needs
devices_subsys.rwsem.

========
void device_shutdown(void)
{
        struct device * dev;

        /* Holding the semaphore while calling sd->shutdown() */
        down_write(&devices_subsys.rwsem);
        list_for_each_entry_reverse(dev, &devices_subsys.kset.list, kobj.entry) {
                pr_debug("shutting down %s: ", dev->bus_id);
                if (dev->driver && dev->driver->shutdown) {
                        pr_debug("Ok\n");
                        dev->driver->shutdown(dev);
                } else
                        pr_debug("Ignored.\n");
        }
        up_write(&devices_subsys.rwsem);

        sysdev_shutdown();
}
========
lpfc_dpc_0    D C0407F1B  3444   209      1           225   208 (L-TLB)
f7497edc 00000046 0000270f c0407f1b ffffff00 0000007b 01d0c09e 039c4853
       f7497ee4 c03fd168 c202dd60 00000002 001484c4 14c08856 0000e712 f7e00b30
       f75e56b0 f75e581c f75e56b0 00000008 ffff0001 c03348e8 c03348e8 00000000
Call Trace:
 [<c02c5d35>] rwsem_down_read_failed+0x143/0x162 <=== Waiting for devices_subsys.rwsem semaphore
 [<c02135cd>] .text.lock.core+0x47/0x5e
 [<f8829533>] fc_device_unblock+0x0/0xd [scsi_transport_fc]
 [<f88a63fa>] lpfc_target_unblock+0xa0/0xba [lpfc]
 [<f88a6551>] lpfc_target_remove+0x82/0xbe [lpfc]
 [<f889ee95>] lpfc_freenode+0x2b3/0x2c7 [lpfc]
 [<f889ef75>] lpfc_nlp_remove+0xcc/0xde [lpfc]
 [<f88a494e>] lpfc_disc_state_machine+0xe0/0x102 [lpfc]
 [<f889cfb0>] lpfc_process_nodev_timeout+0xa5/0xaa [lpfc]
 [<f889d0e3>] lpfc_disc_done+0x12e/0x1e4 [lpfc]
 [<f889d2a8>] lpfc_do_dpc+0x10f/0x138 [lpfc]
 [<f889d199>] lpfc_do_dpc+0x0/0x138 [lpfc]
 [<c01041f1>] kernel_thread_helper+0x5/0xb
halt          D 00000001  2464 18712      1                 245 (NOTLB)
f2c47de0 00000082 c02172f8 00000001 f7e562b0 f75a3c00 f88441e0 00000001
       f7e562b0 c22813f8 c2035d60 00000003 0041b6ea d8da1e3d 0000e70f f7e005b0
       f7227630 f722779c 00000000 00000004 f2c47e50 f2c47e4c f2c47dfc f2c47e34
Call Trace:
 [<c02172f8>] elv_next_request+0xbe/0xce  <== Waiting for SYNC command to complete
 [<f88441e0>] scsi_request_fn+0x2f9/0x30d [scsi_mod]
 [<c02c56ce>] wait_for_completion+0x94/0xcb
 [<c011dc6f>] default_wake_function+0x0/0xc
 [<f88441e0>] scsi_request_fn+0x2f9/0x30d [scsi_mod]
 [<c011dc6f>] default_wake_function+0x0/0xc
 [<f8843307>] scsi_wait_req+0x5e/0x8c [scsi_mod]
 [<f8843257>] scsi_wait_done+0x0/0x52 [scsi_mod]
 [<c02c597f>] __cond_resched+0x14/0x39
 [<f8822a28>] sd_sync_cache+0x74/0xd3 [sd_mod]
 [<f8823d50>] sd_shutdown+0x25/0x31 [sd_mod] 
 [<c021614b>] device_shutdown+0x56/0x74 <== Holding the devices_subsys.rwsem semaphore
 [<c012d146>] sys_reboot+0x17c/0x2a3
 [<c012a36f>] signal_wake_up+0x1e/0x2c
 [<c012ab81>] __group_send_sig_info+0x8f/0x98
 [<c012ac52>] group_send_sig_info+0x59/0x61
 [<c012ad3f>] kill_proc_info+0x3c/0x42
 [<c012c2d9>] sys_kill+0x4b/0x50
 [<c016b5af>] destroy_inode+0x36/0x45
 [<c0169cfc>] dput+0x34/0x19b
 [<c0156d33>] __fput+0xda/0x100
 [<c01558f5>] filp_close+0x59/0x5f
 [<c02c7377>] syscall_call+0x7/0xb
============
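
The dependency chain in the two traces can also be written down as a
wait-for table and checked for a cycle. This is only an illustrative sketch
of the reasoning in this comment: the enum, the tables and in_wait_cycle are
hypothetical names, and the link_up_waits table is an assumed contrast case
(link up, nothing blocked) that is not taken from any trace in this report.

```c
#include <assert.h>
#include <stdbool.h>

/* The three parties in the traces; NONE means "not waiting on anything". */
enum party { HALT, SYNC_CMD, LPFC_DPC, NPARTIES, NONE = -1 };

/* Dependencies as described in this report. */
const int reported_waits[NPARTIES] = {
        [HALT]     = SYNC_CMD,  /* sd_shutdown waits for SYNC with the rwsem held */
        [SYNC_CMD] = LPFC_DPC,  /* SYNC needs fc_target_unblock to finish */
        [LPFC_DPC] = HALT,      /* unblock needs the rwsem that halt holds */
};

/* Assumed contrast case: link up, so SYNC depends on nobody else. */
const int link_up_waits[NPARTIES] = {
        [HALT]     = SYNC_CMD,
        [SYNC_CMD] = NONE,
        [LPFC_DPC] = NONE,
};

/*
 * Follow the wait-for chain from 'start'. Each party waits on at most one
 * other party, so a cycle through 'start' is found within NPARTIES steps.
 */
bool in_wait_cycle(const int waits_for[NPARTIES], int start)
{
        int cur = start;

        assert(start >= 0 && start < NPARTIES);
        for (int step = 0; step < NPARTIES; step++) {
                cur = waits_for[cur];
                if (cur == NONE)
                        return false;   /* chain ends: someone can make progress */
                if (cur == start)
                        return true;    /* came back around: circular wait */
        }
        return false;
}
```

in_wait_cycle(reported_waits, HALT) is true, matching the hang seen in the
traces, while in_wait_cycle(link_up_waits, HALT) is false. Removing any one
edge from the reported table, for example the lpfc_dpc dependency on the
semaphore held by halt, breaks the cycle.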

Comment 5 Jeff Layton 2006-01-10 18:55:20 UTC

Created attachment 123006 [details]
customer's console messages with lpfc_log_verbose=0xffff

Log messages from customer's machine with lpfc_log_verbose=0xffff

Comment 6 Jeff Layton 2006-01-10 18:57:32 UTC

Created attachment 123007 [details]
console messages from my local reproducer attempt

my attempt at a reproducer here, also with lpfc_log_verbose=0xffff

This box rebooted fine.

Comment 7 Jeff Layton 2006-01-11 18:06:11 UTC

Created attachment 123062 [details]
first stab at a patch (untested)

Since I'm not able to reproduce this here, I'm going to try to backport the fix
described in the email discussion linked in an earlier post. The 2.6.14 code
is actually a bit different (there is a helper function), so I may do another
iteration that's closer to what the current upstream code looks like.

Note that this is not at all tested yet.

Comment 8 Bino J Sebastian 2006-01-11 19:34:54 UTC

In the Emulex lab we have a test environment that reliably reproduces this
issue. I will test this patch in that environment.

Comment 9 Jeff Layton 2006-01-11 20:16:07 UTC

Created attachment 123068 [details]
respun patch based on upstream mailing list thread

Here's a respun version of the patch I posted earlier. This should apply
cleanly to the latest RHEL4 CVS tree. I've done some brief testing with it, and
don't see any obvious regressions, but this really needs to be reviewed by
someone with more familiarity with this code.

To the Emulex folks, please let me know how this works in your test lab.

Comment 10 Jeff Layton 2006-01-11 20:45:08 UTC

Having trouble getting to theaimsgroup.com today -- an alternate archive of this
thread is here:

http://www.archivum.info/linux-scsi@vger.kernel.org/2004-10/msg00315.html

I'm still reading up on it. It looks like there were some later messages that I
missed, so the patch I just posted may not be everything we need here.

Comment 11 Bino J Sebastian 2006-01-11 22:11:25 UTC

I tested the latest patch against the 2.6.9-22ELsmp kernel. The patch fixed the
deadlock in fc_target_unblock.
I will do some more regression testing in the Emulex lab.
Which Red Hat release will contain this patch?

Thanks for your help in resolving this issue.

Comment 12 Bino J Sebastian 2006-01-12 03:07:05 UTC

My regression testing with this patch is successful.

Comment 13 Jeff Layton 2006-01-12 13:43:59 UTC

Thanks for testing the patch. Placing this bug on the RHEL4U4Proposed list.

The earliest it would show up is Update 4. The patch still needs to go through
peer review and internal regression testing before it makes its way to an actual
release.

I'll be posting an updated patch in a bit that more closely matches the one in
the original mailing list post. It's essentially the same as the last one, but
removes some extraneous functions as well.

I'll post some test kernels with this a little later today.

Comment 14 Jeff Layton 2006-01-12 13:46:27 UTC

Created attachment 123112 [details]
respun patch

Newest respin of the patch. This one also removes 2 extraneous functions that
are no longer needed.

Comment 15 Jeff Layton 2006-01-12 15:20:39 UTC

I've built a test kernel with the latest patch and have placed it here:

http://people.redhat.com/jlayton/BZ170434/

I only bothered to build the i686 SMP kernel. Please test this as well, and let
me know if it still solves the issue.

Comment 19 Tom Coughlan 2006-01-17 17:02:07 UTC

Jeff, thanks very much for getting to the bottom of this. 

James, this fix will be checked in after U3 ships. It will then be available as
a U3 hot-fix for specific customers who request it. It will also be in U4, along
with whatever other driver updates we do for U4. 

Would you like us to change the driver version string (currently 8.0.16.18) when
we add this patch? Usually we would add "-rh1" or something, so we can identify
what we are dealing with in the field. Do you have a preference?

Comment 20 Tom Coughlan 2006-01-17 17:09:25 UTC

Whoops, never mind. The patch is against scsi_transport_fc, not the driver, so
the driver rev. stays the same. Sorry I missed that.

Comment 21 Jeff Layton 2006-01-17 17:47:52 UTC

But since this is against scsi_transport_fc, we also need to do regression
testing with the qlogic driver. Doing a cursory code inspection first to see if
the qlogic driver touches any of this.

Comment 24 Jason Baron 2006-04-03 17:45:33 UTC

committed in stream U4 build 34.11. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/

Comment 26 Bob Johnson 2006-04-11 16:12:42 UTC

This issue is on Red Hat Engineering's list of planned work items 
for the upcoming Red Hat Enterprise Linux 4.4 release.  Engineering 
resources have been assigned and barring unforeseen circumstances, Red 
Hat intends to include this item in the 4.4 release.

Comment 29 Red Hat Bugzilla 2006-08-10 21:25:04 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html