Bug 142976

Summary: RHEL 3.0 v2.4.21-20.0.1.EL kernel panics after several hours of running raw I/O
Product: Red Hat Enterprise Linux 3
Component: kernel
Version: 3.0
Hardware: x86_64
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: medium
Reporter: Heather Conway <conway_heather>
Assignee: Larry Woodman <lwoodman>
QA Contact: Brian Brock <bbrock>
CC: acjohnso, lwoodman, petrides, riel, sct
Doc Type: Bug Fix
Last Closed: 2007-10-19 19:11:13 UTC
Attachments:
  Description                                             Flags
  trace of Oops without PowerPath running on the system   none
  trace of the Oops that occurred with PowerPath          none
  Oops with no PP in text format                          none
  Oops with PP in text format                             none

Description Heather Conway 2004-12-15 15:44:42 UTC
Description of problem:
On an Opteron-based system, the RHEL 3.0 v2.4.21-20.0.1.EL kernel 
panics after several hours of running raw I/O.  This panic occurs 
both with and without PowerPath.  Per the PowerPath team:
One of the dd processes issues an IO and calls kiobuf_wait_for_io(), 
which in turn calls schedule(). In schedule(), the kernel attempts to 
perform a context switch, and panics because the task struct pointer 
passed in for the new process is NULL. This points to kernel memory 
corruption, more specifically corruption of the CPU runqueues.

Version-Release number of selected component (if applicable):
kernel-source-2.4.21-20.0.1.EL x86_64

How reproducible:
Run raw I/O for several hours on an AMD64 Opteron-based system
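
The report does not give the exact workload beyond dd processes doing raw I/O, so the following is only a rough sketch of a comparable read loop. The function name read_blocks(), the 512-byte alignment, and any device path passed to it (for example a raw character device bound to a disk) are assumptions, not details from the original test setup.

```c
/* Sketch of a sector-aligned sequential read loop, similar in spirit
 * to running dd against a raw device. Names and sizes are assumptions. */
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define SECTOR_SIZE 512 /* assumed alignment for buffer and transfer size */

/* Reads up to `count` blocks of `block_size` bytes from `path`.
 * Returns the number of full blocks read, or -1 on a setup error
 * (unaligned block size, allocation failure, or open failure). */
long read_blocks(const char *path, size_t block_size, long count)
{
    void *buf = NULL;
    long done = 0;
    int fd;

    assert(path != NULL);
    if (block_size == 0 || block_size % SECTOR_SIZE != 0)
        return -1;
    if (posix_memalign(&buf, SECTOR_SIZE, block_size) != 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    while (done < count) {
        ssize_t n = read(fd, buf, block_size);
        if (n != (ssize_t)block_size)
            break; /* short read, end of device, or error */
        done++;
    }

    close(fd);
    free(buf);
    return done;
}
```

Calling read_blocks() repeatedly from several processes for hours would approximate the load described here; whether it triggers the same panic is untested.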

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Heather Conway 2004-12-15 15:46:01 UTC
Created attachment 108621
trace of Oops without PowerPath running on the system

Enclosing a trace of the Oops that occurred without PowerPath running on the
system.

Comment 2 Heather Conway 2004-12-15 15:49:23 UTC
Created attachment 108623
trace of the Oops that occurred with PowerPath 

Enclosing a trace of the Oops that occurred with PowerPath running on the
system.

Comment 3 Heather Conway 2004-12-15 15:49:54 UTC
I neglected to mention that this is PowerPath v4.3.1.

Comment 4 Ernie Petrides 2004-12-15 23:04:07 UTC
Heather, please attach a trace (oops output) from a non-tainted kernel.
Also, please do not attach Microsoft Word documents in the future.


Comment 5 Heather Conway 2005-01-28 14:32:02 UTC
I have not been able to replicate this and have not received any 
feedback from the PowerPath team so I am considering this issue as 
NOTABUG.

Comment 6 Heather Conway 2005-01-28 20:29:30 UTC
Created attachment 110369
Oops with no PP in text format

Comment 7 Heather Conway 2005-01-28 20:30:39 UTC
oops - I closed the wrong Bugzilla.  The Oops output is being 
attached in text format.

Comment 8 Heather Conway 2005-01-28 20:32:05 UTC
Created attachment 110370
Oops with PP in text format

Attaching text document of Oops with PowerPath installed.

Comment 9 AJ Johnson 2005-04-30 00:00:52 UTC
I am seeing a very similar issue, but the system doesn't panic.  What is the
status of this bug?

Comment 13 Larry Woodman 2005-09-16 14:36:39 UTC
Is this still a problem with the latest RHEL3-U6 update?  The reason I ask is
that several generic kernel changes and changes to the x86_64-specific code have
been made to RHEL3 since 2.4.21-20.  Can someone please verify that this problem
still occurs with the latest kernel?

Thanks, Larry Woodman


Comment 15 RHEL Program Management 2007-10-19 19:11:13 UTC
This bug is filed against RHEL 3, which is in maintenance phase.
During the maintenance phase, only security errata and select mission
critical bug fixes will be released for enterprise products. Since
this bug does not meet those criteria, it is now being closed.
 
For more information on the RHEL errata support policy, please visit:
http://www.redhat.com/security/updates/errata/
 
If you feel this bug is indeed mission critical, please contact your
support representative. You may be asked to provide detailed
information on how this bug is affecting you.