Bug 701951

Summary: System Hang when there is smart error on IBM platform
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.1
Hardware: All
OS: Unspecified
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Target Milestone: rc
Target Release: 6.1
Keywords: ZStream
Reporter: kashyap <kashyap.desai>
Assignee: Tomas Henzl <thenzl>
QA Contact: Gris Ge <fge>
CC: coughlan, eric.moore, fge, jwest, kzhang, rlary, sathya.prakash, syeghiay, thenzl
Fixed In Version: kernel-2.6.32-153.el6
Doc Type: Bug Fix
Doc Text:
A kernel panic in the mpt2sas driver could occur on an IBM system using a drive with SMART (Self-Monitoring, Analysis and Reporting Technology) issues. This was because the driver was sending an SEP request while the kernel was in the interrupt context, causing the driver to enter the sleep state. With this update, a fake event is not executed from the interrupt context, assuring the SEP request is properly issued.
Last Closed: 2011-12-06 13:21:48 UTC
Bug Blocks: 714189, 714190    

Description kashyap 2011-05-04 12:03:28 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:
Reproducible with specific system/hardware.


Steps to Reproduce:

1. Requires an IBM system and a drive with SMART issues.
2. Boot the system. After some time, the kernel crashes in the mpt2sas driver.

  
Actual results:
The kernel crashes in the mpt2sas driver.


Expected results:
There should not be a kernel crash.


Additional info:
This patch has been tested thoroughly by LSI. Because this issue causes a kernel crash, it would be good to see a fix in RHEL 6.1.

Comment 1 kashyap 2011-05-04 12:04:28 UTC
Please find the patch for this issue at the link below:

http://marc.info/?l=linux-scsi&m=130450584627653&w=2

~ Kashyap

Comment 3 RHEL Program Management 2011-05-04 13:29:47 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 11 RHEL Program Management 2011-05-16 18:59:42 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 12 Gris Ge 2011-06-14 13:49:56 UTC
It is hard for us to find a disk with a SMART error at Red Hat.
I will try shaking a powered-on disk and hope it reports a "Current_Pending_Sector" SMART error.

Would the reporter or any partner be willing to help us test once the RPM build is ready?

Comment 13 kashyap 2011-06-15 04:02:35 UTC
(In reply to comment #12)
> It is hard for us to find a disk with a SMART error at Red Hat.
> I will try shaking a powered-on disk and hope it reports a
> "Current_Pending_Sector" SMART error.
> 
> Would the reporter or any partner be willing to help us test once the RPM build is ready?

Gris,
The original fix was tested by IBM. We do not have the required hardware to test the fix.
The best I can do is verify that the patch is part of RHEL 6.1.

Comment 17 Martin Prpič 2011-07-12 11:37:13 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
A kernel panic in the mpt2sas driver could occur on an IBM system using a drive with SMART (Self-Monitoring, Analysis and Reporting Technology) issues. This was because the driver was sending an SEP request while the kernel was in the interrupt context, causing the driver to enter the sleep state. With this update, a fake event is not executed from the interrupt context, assuring the SEP request is properly issued.
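
The note above can be illustrated with a small user-space model. This is only a sketch and not the mpt2sas code: `ToyDriver`, `send_sep_request`, `handle_smart_event_buggy`, `handle_smart_event_fixed`, and `run_pending` are hypothetical names, and the `in_interrupt` flag merely stands in for the kernel's interrupt context. It shows the general pattern the note describes: the handler that runs in interrupt context only queues the event, and the request that may sleep is issued later from a context where sleeping is allowed.

```python
from collections import deque


class ToyDriver:
    """Toy model of deferring a blocking request out of interrupt context."""

    def __init__(self):
        self.in_interrupt = False      # stand-in for "running in interrupt context"
        self.pending = deque()         # events queued for later processing
        self.sep_requests_sent = []    # drives for which a request was issued

    def send_sep_request(self, drive):
        # Stand-in for a request that may sleep; it must not be
        # called while the "interrupt context" flag is set.
        if self.in_interrupt:
            raise RuntimeError("blocking request issued from interrupt context")
        self.sep_requests_sent.append(drive)

    def handle_smart_event_buggy(self, drive):
        # Models the failure: the blocking request is issued
        # directly from the interrupt handler.
        self.in_interrupt = True
        try:
            self.send_sep_request(drive)
        finally:
            self.in_interrupt = False

    def handle_smart_event_fixed(self, drive):
        # Models the fix: the interrupt handler only queues the event.
        self.in_interrupt = True
        try:
            self.pending.append(drive)
        finally:
            self.in_interrupt = False

    def run_pending(self):
        # Called outside interrupt context; issues the deferred requests.
        while self.pending:
            self.send_sep_request(self.pending.popleft())
```

In this model, `handle_smart_event_buggy("sda")` raises `RuntimeError`, while `handle_smart_event_fixed("sda")` followed by `run_pending()` records the request. In a real kernel driver the deferred step would typically run from a work queue rather than an explicit call.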

Comment 19 Aristeu Rozanski 2011-08-11 19:47:37 UTC
Patch(es) available on kernel-2.6.32-153.el6

Comment 22 Gris Ge 2011-08-12 05:16:59 UTC
No hardware available to reproduce.
Patch found in kernel-2.6.32-183.el6.
Sanity check only.
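
For a similar sanity check without the hardware, a kernel package name can be compared against the "Fixed In Version" build. The following is a rough sketch, not an official tool: `parse_release` and `at_or_after_fix` are hypothetical helpers, the parsing assumes names of the form `kernel-2.6.32-N.el6`, and it does not implement full RPM version comparison. It also cannot tell whether a z-stream build with a lower build number carries the fix as a backport.

```python
import re

# Build named in the "Fixed In Version" field of this bug.
FIXED_IN = "kernel-2.6.32-153.el6"


def parse_release(name):
    """Split a name like kernel-2.6.32-153.el6 into (version, build) tuples."""
    m = re.fullmatch(
        r"(?:kernel-)?(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el6(?:\..*)?", name
    )
    if m is None:
        raise ValueError(f"unrecognized kernel name: {name!r}")
    version = tuple(int(p) for p in m.group(1).split("."))
    build = tuple(int(p) for p in m.group(2).split("."))
    return version, build


def at_or_after_fix(name, fixed_in=FIXED_IN):
    """Return True if the build is the same as or later than fixed_in."""
    return parse_release(name) >= parse_release(fixed_in)
```

For example, `at_or_after_fix("kernel-2.6.32-183.el6")` returns `True`, and a name that does not match the assumed pattern raises `ValueError` instead of guessing.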

Comment 23 errata-xmlrpc 2011-12-06 13:21:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html