Bug 151210 - kswapd kernel oopses with multithreaded applications
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: 2
Hardware: All
OS: Linux
Priority: medium
Severity: high
Assigned To: Dave Jones
QA Contact: Brian Brock
Duplicates: 151211
Reported: 2005-03-15 19:48 EST by Bevan Bennett
Modified: 2015-01-04 17:17 EST (History)
Doc Type: Bug Fix
Last Closed: 2005-04-16 00:26:25 EDT


Attachments
Combined syslog grepped for oopsing process (12.96 KB, text/plain)
2005-03-15 19:53 EST, Bevan Bennett
Oops syslog details for one system "vanadium" (10.41 KB, text/plain)
2005-03-15 20:01 EST, Bevan Bennett
Combined syslog grepped for kernel versions (10.19 KB, text/plain)
2005-03-15 20:08 EST, Bevan Bennett

Description Bevan Bennett 2005-03-15 19:48:14 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020

Description of problem:
Across a variety of server hardware (IBM eServer 330, Dell PE1750, Dell PE2650, Dell PE2550, AMD64 dual opteron system) I've been seeing a large number of kernel Oopses over the past several weeks.  The systems will keep running at first, but usually end up locking up once Oopsing has started.  Looking over my logs, I have 134 logged oopses since Feb 28th spread over 28 different systems.

Of those 134, 49 appear to originate in kswapd; the others are reported from seemingly random other processes (often multithreaded Java-based applications) after a kswapd oops has already occurred on the running kernel.

I've read through the current list of 'kernel oops' bugs (I really have) and haven't found anything substantially similar, but I've missed dups before, and I apologize if that's the case. If nothing else, I have a large quantity of data available on these oopses.

Version-Release number of selected component (if applicable):
kernel-2.6.10-1.9_FC2smp  kernel-2.6.10-1.14_FC2smp  2.6.10-1.770_FC2smp

How reproducible:
Sometimes

Steps to Reproduce:
1. Run memory intensive, multithreaded applications
2. Wait

Actual Results:  Kernel Oopses and eventual hang.

Expected Results:  User app can crash if necessary, but the kernel should go on.

Additional info:

I'll attach one set of representative log messages, plus a combined syslog listing grepped for 'Process: ' in case someone would like to request others.
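
For anyone wanting to reproduce the kind of tally quoted above (oopses per process, spread over a number of hosts), here is a minimal sketch in Python. It is not the script used for the attachments. The line layout it expects (a standard syslog prefix followed by "kernel: Process <name> ...") is an assumption, as are the names `PROCESS_RE` and `tally_oopses`; adjust the pattern to whatever your logs actually contain.

```python
import re
from collections import Counter

# ASSUMPTION: each oops contributes one syslog line shaped roughly like
#   "<Mon> <day> <hh:mm:ss> <host> kernel: Process <name> (pid: ...)"
# This layout is illustrative and is not copied from the attached logs.
PROCESS_RE = re.compile(
    r"^\S+\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+kernel:.*\bProcess:?\s+(?P<proc>\S+)"
)


def tally_oopses(lines):
    """Count matching 'Process' lines per process name and collect the hosts seen."""
    by_proc = Counter()
    hosts = set()
    for line in lines:
        m = PROCESS_RE.search(line)
        if m is None:
            continue  # not an oops 'Process' line under the assumed layout
        by_proc[m.group("proc")] += 1
        hosts.add(m.group("host"))
    return by_proc, hosts
```

Feeding it the lines of a combined syslog (for example `tally_oopses(open(path))`, with `path` pointing at your own file) returns a `Counter` keyed by process name and the set of affected hosts. That is enough to see what fraction of oopses are attributed to kswapd.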
Comment 1 Bevan Bennett 2005-03-15 19:52:29 EST
*** Bug 151211 has been marked as a duplicate of this bug. ***
Comment 2 Bevan Bennett 2005-03-15 19:53:29 EST
Created attachment 112040 [details]
Combined syslog grepped for oopsing process
Comment 3 Bevan Bennett 2005-03-15 20:01:13 EST
Created attachment 112041 [details]
Oops syslog details for one system "vanadium"
Comment 4 Bevan Bennett 2005-03-15 20:08:21 EST
Created attachment 112042 [details]
Combined syslog grepped for kernel versions
Comment 5 Dave Jones 2005-04-16 00:26:25 EDT
Fedora Core 2 has now reached end of life, and no further updates will be
provided by Red Hat.  The Fedora legacy project will be producing further kernel
updates for security problems only.

If this bug has not been fixed in the latest Fedora Core 2 update kernel, please
try to reproduce it under Fedora Core 3, and reopen if necessary, changing the
product version accordingly.

Thank you.
Comment 6 Bevan Bennett 2005-04-18 20:17:11 EDT
Congratulations Kernel! It turns out that, in a roundabout way, there was a
hardware problem affecting my entire machine room. It seems our power out here
is not very good and that, combined with a few too many servers on the same
circuit, caused hardware flutter and kernel Oopsing under load.  The
multi-threaded correlation came in because those apps were more likely to
generate near maximum CPU utilization. Not the easiest problem to diagnose...

I recently got in a huge shipment of UPSs, moved all the servers to them, and
we've been crash free for a week now. (*knocks on wood*)

Closing the bug with appropriate commentary.
