Bug 680140

Summary:

emc_clariion error handler panics with multiple failures

Product:

Red Hat Enterprise Linux 6

Reporter:

Eddie Williams <eddie.williams>

Component:

kernel

Assignee:

Mike Snitzer <msnitzer>

Status:

CLOSED ERRATA

QA Contact:

Storage QE <storage-qe>

Severity:

high

Docs Contact:

Priority:

urgent

Version:

6.0

CC:

agk, bdonahue, berthiaume_wayne, bmarzins, dhoward, dwysocha, eddie.williams, heinzm, jwest, mbroz, msnitzer, prajnoha, prockai, zkabelac

Target Milestone:

Keywords:

ZStream

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

kernel-2.6.32-128.el6

Doc Type:

Bug Fix

Doc Text:

Deleting an SCSI (Small Computer System Interface) device attached to a device handler caused applications running in user space, which were performing I/O operations on that device, to become unresponsive. This was due to the fact that the SCSI device handler's activation did not propagate the SCSI device deletion via an error code and a callback to the Device-Mapper Multipath. With this update, deletion of an SCSI device attached to a device handler is properly handled and no longer causes certain applications to become unresponsive.

Story Points:

---

Clone Of:

Clones:

680480 (view as bug list)

Environment:

Last Closed:

2011-05-19 12:05:36 UTC

Type:

---

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

Bug Blocks:

696889

Attachments:

Description	Flags
serial port output captured from panic after FC switch powered off	none
patch to scsi_dh_emc.c to avoid panic in clariion_activate()	none
gdb output from multipathd coredump	none

Description Eddie Williams 2011-02-24 13:19:55 UTC

Description of problem:
While testing an EMC Clariion configured with device mapper multipath using SCSI-3 reservations to lock the storage (setting the hardware handler to "3 emc 0 1" to honor reservations) if multiple paths fail at the same time (powering off FC switch) then frequently a panic occurs (not 100% of the time).  The stack trace follows:

Modules linked in: ext3 jbd tg3 autofs4 sunrpc p4_clockmod freq_table speedstep_lib ipv6 dm_mirror dm_region_hash dm_log dm_r]
Modules linked in: ext3 jbd tg3 autofs4 sunrpc p4_clockmod freq_table speedstep_lib ipv6 dm_mirror dm_region_hash dm_log dm_r]Pid: 2511, comm: kmpath_handlerd Not tainted 2.6.32-71.el6.x86_64 #1 eserver xSeries 346 -[8840A3Z]-
RIP: 0010:[<ffffffff8125f9d9>]  [<ffffffff8125f9d9>] strnlen+0x9/0x40
RSP: 0018:ffff880232de3b50  EFLAGS: 00010086
RAX: ffffffff8165722e RBX: ffffffff81ad9f80 RCX: 0000000000000002
RDX: 779425f86d9de564 RSI: ffffffffffffffff RDI: 779425f86d9de564
RBP: ffff880232de3b50 R08: 0000000000000073 R09: 00000000fffffffe
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff81ad9ba5
R13: 779425f86d9de564 R14: 00000000ffffffff R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00000037164d41c0 CR3: 0000000226252000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kmpath_handlerd (pid: 2511, threadinfo ffff880232de2000, task ffff880234da0040)
Stack:
 ffff880232de3b90 ffffffff81261820 0000000000000001 ffffffff81ad9ba5
<0> ffffffffa02c977a ffffffffa02c9778 ffff880232de3d00 ffffffff81ad9f80
<0> ffff880232de3c30 ffffffff812628e0 0000000000000004 0000000affffffff
Call Trace:
 [<ffffffff81261820>] string+0x40/0x100
 [<ffffffff812628e0>] vsnprintf+0x310/0x5f0
 [<ffffffff81262cd1>] vscnprintf+0x11/0x30
 [<ffffffff8106c463>] vprintk+0xc3/0x4f0
 [<ffffffff814c7c23>] ? printk+0x41/0x46
 [<ffffffffa0265fb0>] ? pg_init_done+0x0/0x220 [dm_multipath]
 [<ffffffff814c7c23>] printk+0x41/0x46
 [<ffffffffa02c8c7e>] clariion_activate+0xde/0x1b0 [scsi_dh_emc]
 [<ffffffffa0265fb0>] ? pg_init_done+0x0/0x220 [dm_multipath]
 [<ffffffff8135277b>] scsi_dh_activate+0x9b/0xe0
 [<ffffffffa0265f80>] ? activate_path+0x0/0x30 [dm_multipath]
 [<ffffffffa0265fae>] activate_path+0x2e/0x30 [dm_multipath]
 [<ffffffff8108c610>] worker_thread+0x170/0x2a0
[<ffffffff81091ca0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8108c4a0>] ? worker_thread+0x0/0x2a0
 [<ffffffff81091936>] kthread+0x96/0xa0
 [<ffffffff810141ca>] child_rip+0xa/0x20
 [<ffffffff810918a0>] ? kthread+0x0/0xa0
 [<ffffffff810141c0>] ? child_rip+0x0/0x20
Code: 66 90 48 83 c2 01 80 3a 00 75 f7 48 89 d0 48 29 f8 c9 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 85 f6 48 89
RIP  [<ffffffff8125f9d9>] strnlen+0x9/0x40
 RSP <ffff880232de3b50>
do_IRQ: 0.169 No irq handler for vector (irq -1)
do_IRQ: 0.73 No irq handler for vector (irq -1)
\uffffirq 16: nobody cared (try booting with the "irqpoll" option)
handlers:
[<ffffffff81383700>] (usb_hcd_irq+0x0/0x90)
Disabling IRQ #16
lpfc 0000:04:04.0: 1:1303 Link Up Event x1 received Data: x1 xf7 x8 x0 x0 x0 0
[drm:radeon_legacy_get_ext_tmds_info_from_combios] *ERROR* Unsupported gpio 0
[drm:radeon_combios_external_tmds_setup] *ERROR* Unknown id 1
[drm:radeon_combios_external_tmds_setup] *ERROR* Unknown id 3
[drm:radeon_combios_external_tmds_setup] *ERROR* Unknown id 7
lpfc 0000:04:04.0: 1:(0):2754 PRLI failure DID:011600 Status:x9/x90000
Kernel panic - not syncing: Waiting TID List traversal

panic occurred, switching back to text console

The panic appears to occur in scsi_dh_emc.c in clariion_activate() where sdev_printk() is called after a failure.  It seems that the TRESPASS command is being attempted on a path that is in the process of being failed.  The device seems to be cleared before it can gather the information to log.  It seems that in the failed cases it should just call the fh() and return.  I build a module that does that and I could not then reproduce the failure (I was able to successfully perform the same test multiple times without a failure).  

I found in the bugzilla database 467709 that also deals with TRESPASS issues.  I tried the updated device-mapper-multipath referenced in that bugzilla (0.4.9-37.el6.bz467709).  With this version the problem did not seem to duplicate as easily but I was still able to duplicate the panic after several attempts.

Version-Release number of selected component (if applicable):  I am using stock RHEL 6.0 kernel/modules initially to duplicate the problem and duplicated the failure with 0.4.9-37.el6.bz467709 version of device-mapper-multipath and kpartx.

In my testing I have the LUN locked with SCSI-3 persistent reservations.  I am not sure if this is required to duplicate.  I will do some testing without the reservations to see if the problem continues to duplicate.


How reproducible:
With stock RHEL 6.0 packages the panic occurs probably 90% of the time when there are multiple failures at the same time where an active path is one of them.

Steps to Reproduce:
1. start activity to the Clariion array using device-mapper-multipath
2. power off fibre switch where the switch has one connection to the server and two connections to the Clariion array.  Note that there is a second fibre channel switch with similar connections for from device-mapper-multipath's perspective there are 4 paths to the storage.  Also note that the fibre switch that is powered off needs to be the one where the IO path is active.
3. when device-mapper-multipath detects the failed path it will try to activate the next path.
  
Actual results:  System panics when the path it tries to activate also fails.


Expected results:  The next path should be activated and IO's rerouted.


Additional info:  With the stock RHEL 6.0 bits I used the following multipath.conf setting for the Clariion:

        device {
                vendor "DGC"
                product ".*"
                product_blacklist "LUNZ"
                path_grouping_policy group_by_prio
                getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
                path_selector "round-robin 0"
                path_checker emc_clariion
                features "1 queue_if_no_path"
                #hardware_handler "0"
                hardware_handler "3 emc 0 1"
                prio emc
                #failback immediate
                failback 10
                rr_weight uniform
                #no_path_retry 60
                no_path_retry 12
                rr_min_io 1000
        }

With the 0.4.9-37.el6.bz467709 version I used the following multipath.conf:

        device {
                vendor "DGC"
                product ".*"
                product_blacklist "LUNZ"
                path_grouping_policy group_by_prio
                getuid_callout "/lib/udev/scsi_id --whitelisted --device=/dev/%n"
                path_selector "round-robin 0"
                path_checker emc_clariion
                features "1 queue_if_no_path"
                #hardware_handler "0"
                hardware_handler "3 emc 0 1"
                prio emc
                #failback immediate
                #failback 10
                failback followover
                rr_weight uniform
                #no_path_retry 60
                no_path_retry 12
                rr_min_io 1000
        }

Comment 2 Eddie Williams 2011-02-24 14:02:10 UTC

Created attachment 480763 [details]
serial port output captured from panic after FC switch powered off

I duplicated the scenario where I powered a FC switch off and the system paniced.  Note that the panic was different in that I did not get a stack trace.  I have the log level set to 7 (echo 7 > /proc/sys/kernel/printk).  I have the kernel configured to output to the serial console:

title Red Hat Enterprise Linux (2.6.32-71.el6.x86_64) with serial console
        root (hd0,0)
        kernel /vmlinuz-2.6.32-71.el6.x86_64 ro root=/dev/mapper/vg_fiji-lv_root rd_LVM_LV=vg_fiji/lv_root rd_LVM_LV=vg_fiji/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYBOARDTYPE=pc KEYTABLE=us crashkernel=auto rhgb quiet console=ttyS0,9600n8 console=tty0
        initrd /initramfs-2.6.32-71.el6.x86_64.img

Comment 3 Eddie Williams 2011-02-24 15:02:19 UTC

I turned reservations off and was able to duplicate the same panic in clariion_activate.  I will put my patched scsi_dh_emc.ko and verify that with the new device-mapper-multipath this still resolves the panic.

Comment 4 Mike Snitzer 2011-02-24 15:19:26 UTC

(In reply to comment #3)
> I turned reservations off and was able to duplicate the same panic in
> clariion_activate.  I will put my patched scsi_dh_emc.ko and verify that with
> the new device-mapper-multipath this still resolves the panic.

OK, thanks for all your work on this.

Please attach your fix to this bz once you've verified all works as expected.

Comment 5 Eddie Williams 2011-02-24 20:53:37 UTC

Created attachment 480855 [details]
patch to scsi_dh_emc.c to avoid panic in clariion_activate()

So for with the attached patch to scsi_dh_emc.c I have not had the panic.  I am instead seeing a core on multipathd on occasion though.  I have not looked into the core yet.  If the daemon is restarted then everything continues OK.  

The patch is to simply avoid printing the message in clariion_activate when there is a failure.  This is perhaps fixing the symptom rather than the problem.  The issue here is that it looks like the code it touching the device after it has been freed.  Should there be a lock held to avoid this so rather than changing the message the right fix is to lock the device so it does not disappear while we are working on it?

Comment 6 Eddie Williams 2011-02-25 15:46:49 UTC

Created attachment 481021 [details]
gdb output from multipathd coredump

during double-path failure testing either by physically powering off a fibre channel switch or simulating it the multipathd will sometimes coredump.  The failure is a segmentation fault in sysfs_get_timeout() in libmultipath/discovery.c

Should I write a separate bugzilla for this?

Comment 7 Mike Snitzer 2011-02-25 17:42:31 UTC

(In reply to comment #6)
> Created attachment 481021 [details]
> gdb output from multipathd coredump
> 
> during double-path failure testing either by physically powering off a fibre
> channel switch or simulating it the multipathd will sometimes coredump.  The
> failure is a segmentation fault in sysfs_get_timeout() in
> libmultipath/discovery.c
> 
> Should I write a separate bugzilla for this?

I created bug 680480, thanks.

Comment 8 Eddie Williams 2011-03-16 14:52:20 UTC

Has there been any work on this issue at Red Hat?  Is the patch I provided accepted as the fix or is there a better fix that Red Hat will implement?

Thanks

Comment 9 Mike Snitzer 2011-03-21 20:03:24 UTC

I posted a slightly revised patch (based on comment#5) to linux-scsi:
http://marc.info/?l=linux-scsi&m=130073744426591&w=2

It'll hopefully get included in 2.6.39.  This patch should be posted for inclusion in RHEL6.1 by this Thursday (making it eligible for RHEL6.1 snapshot2).

Comment 10 Mike Snitzer 2011-03-24 13:28:45 UTC

(In reply to comment #9)
> I posted a slightly revised patch (based on comment#5) to linux-scsi:
> http://marc.info/?l=linux-scsi&m=130073744426591&w=2
> 
> It'll hopefully get included in 2.6.39.  This patch should be posted for
> inclusion in RHEL6.1 by this Thursday (making it eligible for RHEL6.1
> snapshot2).

The above referenced patch has proven unnecessary.  But the following refcounting fix is needed (discovered while chasing this issue): http://marc.info/?l=linux-scsi&m=130074943509100&w=2

Comment 12 Aristeu Rozanski 2011-03-30 14:33:56 UTC

Patch(es) available on kernel-2.6.32-128.el6

Comment 17 Martin Prpič 2011-05-05 08:54:22 UTC

    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Deleting an SCSI (Small Computer System Interface) device attached to a device handler caused applications running in user space, which were performing I/O operations on that device, to become unresponsive. This was due to the fact that the SCSI device handler's activation did not propagate the SCSI device deletion via an error code and a callback to the Device-Mapper Multipath. With this update, deletion of an SCSI device attached to a device handler is properly handled and no longer causes certain applications to become unresponsive.

Comment 18 errata-xmlrpc 2011-05-19 12:05:36 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0542.html