Description of problem:
Two oopses on an AS box (e.25 kernel) while trying to write to the journal, with a failing assertion of a locked buffer. Comments in journal.c at line 372 suggest that the assertion is possibly incorrect. The oopses can be viewed in issue tracker as number 27638.

Version-Release number of selected component (if applicable):

How reproducible:
Unknown. Waiting on customer for details on reproducibility.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
Please post that oops information here.
Assertion failure in journal_write_metadata_buffer() at journal.c:372: "buffer_jdirty(jh2bh(jh_in))"
------------[ cut here ]------------
kernel BUG at journal.c:372!
invalid operand: 0000
Kernel 2.4.9-e.25enterprise
CPU:    3
EIP:    0010:[<f8899674>]    Tainted: P
EFLAGS: 00010286
EIP is at journal_write_metadata_buffer [jbd] 0x74
eax: 00000020   ebx: 00000000   ecx: c02f7844   edx: 0001e30e
esi: 00000000   edi: e34eb420   ebp: d0b27160   esp: f3323e08
ds: 0018   es: 0018   ss: 0018
Process kjournald (pid: 38, stackpage=f3323000)
Stack: f889e3e1 00000174 0000013a f3348400 00000000 00000000 f2b37730 00000000
       e34eb420 d0b27160 f8896c73 e34eb420 f2b37730 f3323e58 0000030d 00000000
       00000fd4 e3c5302c 00000003 e34eb420 eb663550 0000030d f6314a00 00000000
Call Trace:
[<f889e3e1>] .LC63 [jbd] 0x28b
[<f8896c73>] journal_commit_transaction [jbd] 0x773
[<f8820d8f>] rw_intr [sd_mod] 0x20f
[<c01255d0>] process_timeout [kernel] 0x0
[<c0118c7b>] wake_up_process [kernel] 0xb
[<c0124d41>] __run_timers [kernel] 0xd1
[<c0125384>] run_local_timers [kernel] 0x94
[<c0114288>] smp_apic_timer_interrupt [kernel] 0xb8
[<c0119945>] schedule [kernel] 0x385
[<f88994a6>] kjournald [jbd] 0x146
[<f8899340>] commit_timeout [jbd] 0x0
[<c0105836>] arch_kernel_thread [kernel] 0x26
[<f8899360>] kjournald [jbd] 0x0
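For anyone comparing this oops against other reports, the call trace entries above follow a regular `[<address>] symbol [module] offset` pattern. The following is only a minimal sketch of pulling those fields out with a regular expression; the names `TRACE_ENTRY` and `parse_call_trace` are hypothetical helpers made up for illustration, and the pattern assumes the 8-digit addresses of this 32-bit 2.4 trace format rather than any general oops format.

```python
import re

# Matches entries such as "[<f8896c73>] journal_commit_transaction [jbd] 0x773".
# Assumption: 8 hex digit addresses, as in the 32-bit trace quoted above.
TRACE_ENTRY = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\S+)\s+\[(\w+)\]\s+(0x[0-9a-f]+)"
)


def parse_call_trace(text):
    """Return (address, symbol, module, offset) tuples found in an oops trace."""
    return [
        (int(addr, 16), symbol, module, int(offset, 16))
        for addr, symbol, module, offset in TRACE_ENTRY.findall(text)
    ]


# First three entries of the call trace from this report.
sample = (
    "Call Trace: [<f889e3e1>] .LC63 [jbd] 0x28b "
    "[<f8896c73>] journal_commit_transaction [jbd] 0x773 "
    "[<f8820d8f>] rw_intr [sd_mod] 0x20f"
)

entries = parse_call_trace(sample)

# Keep only the frames that belong to the jbd module.
jbd_frames = [symbol for _, symbol, module, _ in entries if module == "jbd"]
```

Filtering by module this way makes it quick to see which frames are in jbd and which are unrelated interrupt or timer frames left on the stack.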
A patch in later 2.4 kernels (for transaction.c:612 oopses) introduced a bug that triggered this specific assert failure rather frequently. We identified a flaw in the way that the buffer_jdirty state was being handled, and that is fixed in upstream kernels. I'm not certain whether the same flaw could be triggered differently by AS-2.1 ext3. The window is certainly much smaller in that kernel: the flaw was reported very frequently once the later-2.4 kernel transaction.c:612 fix was added, but we have only a few isolated reports of it on AS-2.1. Still, there's a good chance that the later buffer_jdirty fix will fix this on AS-2.1 too. We have patches back-ported to AS-2.1 to fix both of these issues, and those are in testing internally.
When will this update be through QA?
The fully-supported, fully-QAed release is scheduled to be part of the forthcoming U3 major AS-2.1 update release. We've got the kernel in testing, though, and we hope that a beta engineering build will be available as soon as Monday for customers to evaluate.
Is the beta now available?
An unsupported engineering kernel containing this fix is now available for testing and evaluation at http://people.redhat.com/~jbaron/.private/testing/2.4.9-e.27.18.test/