Bug 658854 - [NetApp 6.1 bug] RHEL6.0 FC host hits kernel panic at scsi_error_handler [rhel-6.0.z]
Summary: [NetApp 6.1 bug] RHEL6.0 FC host hits kernel panic at scsi_error_handler [rhe...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.0
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Frantisek Hrbata
QA Contact: Gris Ge
URL:
Whiteboard:
Duplicates: 692670
Depends On: 636771
Blocks: 580566 683532
 
Reported: 2010-12-01 14:10 UTC by RHEL Program Management
Modified: 2018-11-14 14:34 UTC

Fixed In Version: kernel-2.6.32-71.16.1.el6
Doc Type: Bug Fix
Doc Text:
A Red Hat Enterprise Linux 6.0 host (with root on a local disk) with dm-multipath configured on multiple LUNs (Logical Unit Numbers) hit a kernel panic (at scsi_error_handler) when target controller faults occurred during I/O on the dm-multipath devices. This was caused by multipath using the blk_abort_queue() function to allow lower-latency path deactivation. The call to blk_abort_queue() proved to be unsafe due to a race between blk_abort_queue() and scsi_request_fn(). With this update, the race has been resolved and the kernel panic no longer occurs on Red Hat Enterprise Linux 6.0 hosts.
Clone Of:
Environment:
Last Closed: 2011-02-22 17:38:44 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2011:0283 | 0 | normal | SHIPPED_LIVE | Moderate: kernel security, bug fix, and enhancement update | 2011-02-22 17:38:22 UTC

Description RHEL Program Management 2010-12-01 14:10:04 UTC
This bug has been copied from bug #636771 and has been proposed
to be backported to 6.0 z-stream (EUS).

Comment 7 Gris Ge 2011-02-16 06:41:03 UTC
Frantisek,

I could not find upstream commit 224cb3e981f1b2f9f93dbd49eaef505d17d894c2 (mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=636771#c36) in 2.6.32-71.18.1.el6.

static void deactivate_path(struct work_struct *work) is missing.

Comment 9 Gris Ge 2011-02-16 12:30:40 UTC
OK.
This patch just reverts 224cb3e981f1b2f9f93dbd49eaef505d17d894c2.

Code reviewed. kernel 2.6.32-71.18.1.el6 has reverted the commit.

Set as Sanity Only.

Comment 10 Mike Snitzer 2011-02-16 14:47:57 UTC
(In reply to comment #9)
> OK.
> This patch just reverts 224cb3e981f1b2f9f93dbd49eaef505d17d894c2.
> 
> Code reviewed. kernel 2.6.32-71.18.1.el6 has reverted the commit.
> 
> Set as Sanity Only.

You may have only done "Sanity Only", but FYI: Barry Donahue and I have done exhaustive testing (using NetApp's test scripts) on NetApp storage in Westford.

Comment 11 errata-xmlrpc 2011-02-22 17:38:44 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0283.html
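
To confirm that a host is running a kernel at or above the "Fixed In Version" listed above (kernel-2.6.32-71.16.1.el6), a version comparison along these lines may help. This is only a minimal sketch: the names FIXED_IN, release_key and has_fix are hypothetical, the parsing is a simplified numeric comparison that assumes strings shaped like RHEL 6 kernel releases (it is not a full RPM version comparison), and a higher number in another release stream does not by itself prove the revert is present.

```python
import re

# Taken from the "Fixed In Version" field of this bug.
FIXED_IN = "kernel-2.6.32-71.16.1.el6"


def release_key(version):
    """Convert e.g. '2.6.32-71.16.1.el6.x86_64' into a tuple of ints.

    Assumption: the string contains a '.elN' dist tag and everything
    before it is made of numeric fields separated by '.' or '-'.
    """
    if version.startswith("kernel-"):
        version = version[len("kernel-"):]
    # Keep only the part before the dist tag (".el6", ".el6.x86_64", ...).
    base = re.split(r"\.el\d+", version, maxsplit=1)[0]
    return tuple(int(part) for part in re.split(r"[.-]", base))


def has_fix(version):
    """Return True if 'version' compares at or above FIXED_IN."""
    return release_key(version) >= release_key(FIXED_IN)
```

On a live system the argument would typically be the output of `uname -r`, for example has_fix(platform.release()) after importing the standard platform module. Strings that do not match the assumed shape raise ValueError instead of returning a guess.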

Comment 12 Martin Prpič 2011-02-23 15:04:26 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
A Red Hat Enterprise Linux 6.0 host (with root on a local disk) with dm-multipath configured on multiple LUNs (Logical Unit Numbers) hit a kernel panic (at scsi_error_handler) when target controller faults occurred during I/O on the dm-multipath devices. This was caused by multipath using the blk_abort_queue() function to allow lower-latency path deactivation. The call to blk_abort_queue() proved to be unsafe due to a race between blk_abort_queue() and scsi_request_fn(). With this update, the race has been resolved and the kernel panic no longer occurs on Red Hat Enterprise Linux 6.0 hosts.

Comment 13 Don Howard 2011-04-13 17:20:08 UTC
*** Bug 692670 has been marked as a duplicate of this bug. ***

