Bug 146344 - kernel oops in kjournald on 2.6.10- smp kernels
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 3
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Stephen Tweedie
Reported: 2005-01-27 04:03 EST by Vincent Schonau
Modified: 2007-11-30 17:10 EST (History)

Doc Type: Bug Fix
Last Closed: 2005-09-05 03:33:59 EDT


Attachments
kernel oops output (39.86 KB, text/plain)
2005-01-27 04:05 EST, Vincent Schonau
similar oops from kernel-smp-2.6.9-1.667 (1.62 KB, text/plain)
2005-02-07 05:41 EST, Vincent Schonau
same oops on 2.6.10-1.766_FC3smp (1.83 KB, text/plain)
2005-02-23 02:52 EST, Vincent Schonau
dmsg for the system on 2.6.10-1.766_FC3smp (13.61 KB, text/plain)
2005-02-23 02:53 EST, Vincent Schonau
Patch to fix race in journal_unmap_buffer() (1.36 KB, patch)
2005-03-18 10:07 EST, Stephen Tweedie

Description Vincent Schonau 2005-01-27 04:03:11 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en) AppleWebKit/125.5.6 (KHTML, 
like Gecko) Safari/125.12

Description of problem:
My FC3 system has been suffering apparently random crashes since the upgrade to 2.6.10 
FC3 kernels. I was finally able to capture an oops by logging the serial console.

This is for kernel-smp-2.6.10-1.741_FC3. The crashes have been occurring since FC3 
went from 2.6.9 to 2.6.10.

The system is a Supermicro 6012P-i:
http://www.supermicro.nl/products/system/1U/6012/SYS-6012P-i.cfm
with an Intel E7501 chipset, 2x P4 at 2.4GHz . 

There's no indication in logs or anywhere else about what triggers this oops, but perhaps 
I'm overlooking something.

The system appears to continue to respond to 'pings', but as far as I can see, all other 
processes hang or die.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.10-1.741_FC3

How reproducible:
Always

Steps to Reproduce:
No specific steps to reproduce. Occurs at apparently random times. 

Actual Results:  System crash.

Additional info:
Comment 1 Vincent Schonau 2005-01-27 04:05:21 EST
Created attachment 110281 [details]
kernel oops output
Comment 2 jjaakkol 2005-02-01 09:43:18 EST
I have the same problem on IBM xserver 435 (dual xeon system with
hyperthreading on both processors). My stack trace looks exactly the
same, with the same symptoms. The server is a heavily loaded mail server,
but does not use iptables modules. Seems to be ext3 and kjournald
related. I was using data=writeback on one of our filesystems.
Comment 3 Vincent Schonau 2005-02-07 05:41:02 EST
Created attachment 110713 [details]
similar oops from kernel-smp-2.6.9-1.667

It appears this problem originates before 2.6.10; I went back to
kernel-smp-2.6.9-1.667, which resulted in the attached oops.
Comment 4 Vincent Schonau 2005-02-23 02:52:11 EST
Created attachment 111322 [details]
same oops on 2.6.10-1.766_FC3smp
Comment 5 Vincent Schonau 2005-02-23 02:53:05 EST
Created attachment 111323 [details]
dmsg for the system on 2.6.10-1.766_FC3smp
Comment 6 Stephen Tweedie 2005-03-18 10:07:12 EST
Created attachment 112127 [details]
Patch to fix race in journal_unmap_buffer()

This patch fixes a race condition between journal_unmap_buffer() and
journal_commit_transaction().  It involves journal_put_journal_head() being
called without any locking, and thus hitting a small window in kjournald where
the buffer's b_transaction can be temporarily NULL.  If that triggers, the
journal_unmap_buffer() ends up throwing away the journal_head that is still in
use by journal_commit_transaction.
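
As a rough illustration of the window described above, here is a toy model in Python rather than kernel C. It is not the attached patch and does not reproduce kernel code: the classes, the state_lock field, and the function names are hypothetical, and the only assumption it encodes is that the NULL check and the refiling must be serialized by the same lock. A generator stands in for the commit thread so the interleaving is deterministic.

```python
import threading


class JournalHead:
    """Toy stand-in for a journal_head (hypothetical model, not kernel code)."""

    def __init__(self, transaction):
        self.b_transaction = transaction
        self.freed = False


class Buffer:
    """Toy buffer with a per-buffer lock (the lock name is an assumption)."""

    def __init__(self, jh):
        self.jh = jh
        self.state_lock = threading.Lock()


def commit_refile(bh, next_transaction, use_lock):
    """Models the commit side: b_transaction is briefly None while refiling."""
    if use_lock:
        bh.state_lock.acquire()
    bh.jh.b_transaction = None
    yield  # the small window in which another thread may run
    bh.jh.b_transaction = next_transaction
    if use_lock:
        bh.state_lock.release()


def unmap_buffer(bh, use_lock):
    """Models the unmap side: discards the journal head if it looks unused."""
    if use_lock and not bh.state_lock.acquire(blocking=False):
        # Real code would block here; the model just reports "did not discard".
        return False
    try:
        if bh.jh.b_transaction is None:
            bh.jh.freed = True  # journal head thrown away
            return True
        return False
    finally:
        if use_lock:
            bh.state_lock.release()


def run(use_lock):
    """Runs unmap inside the commit window and reports what happened."""
    bh = Buffer(JournalHead("T1"))
    commit = commit_refile(bh, "T2", use_lock)
    next(commit)                            # commit enters the window
    discarded = unmap_buffer(bh, use_lock)  # unmap runs inside the window
    next(commit, None)                      # commit leaves the window
    return discarded, bh.jh.freed, bh.jh.b_transaction
```

Without the lock, run(use_lock=False) reports that the journal head was discarded even though the commit side goes on to assign a transaction to it, which is the use of a discarded journal_head described above. With the lock, the unmap side cannot observe the temporary None, so nothing is discarded.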
Comment 7 Stephen Tweedie 2005-03-18 16:07:41 EST
This patch has been committed to CVS and will be in the next update release.
