Bug 104526
| Summary: | oops in journaling code (journal.c:372) | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 2.1 | Reporter: | Neil Horman <nhorman> |
| Component: | kernel | Assignee: | Stephen Tweedie <sct> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | 2.1 | CC: | bnocera, sct, summer, tao |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | QU3 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2004-01-09 00:14:47 UTC | Type: | --- |
| Bug Blocks: | 106054 | ||

Please post that oops information here.

Assertion failure in journal_write_metadata_buffer() at journal.c:372:
"buffer_jdirty(jh2bh(jh_in))"
------------[ cut here ]------------
kernel BUG at journal.c:372!
invalid operand: 0000
Kernel 2.4.9-e.25enterprise
CPU: 3
EIP: 0010:[<f8899674>] Tainted: P
EFLAGS: 00010286
EIP is at journal_write_metadata_buffer [jbd] 0x74
eax: 00000020 ebx: 00000000 ecx: c02f7844 edx: 0001e30e
esi: 00000000 edi: e34eb420 ebp: d0b27160 esp: f3323e08
ds: 0018 es: 0018 ss: 0018
Process kjournald (pid: 38, stackpage=f3323000)
Stack: f889e3e1 00000174 0000013a f3348400 00000000 00000000 f2b37730 00000000
e34eb420 d0b27160 f8896c73 e34eb420 f2b37730 f3323e58 0000030d 00000000
00000fd4 e3c5302c 00000003 e34eb420 eb663550 0000030d f6314a00 00000000
Call Trace: [<f889e3e1>] .LC63 [jbd] 0x28b
[<f8896c73>] journal_commit_transaction [jbd] 0x773
[<f8820d8f>] rw_intr [sd_mod] 0x20f
[<c01255d0>] process_timeout [kernel] 0x0
[<c0118c7b>] wake_up_process [kernel] 0xb
[<c0124d41>] __run_timers [kernel] 0xd1
[<c0125384>] run_local_timers [kernel] 0x94
[<c0114288>] smp_apic_timer_interrupt [kernel] 0xb8
[<c0119945>] schedule [kernel] 0x385
[<f88994a6>] kjournald [jbd] 0x146
[<f8899340>] commit_timeout [jbd] 0x0
[<c0105836>] arch_kernel_thread [kernel] 0x26
[<f8899360>] kjournald [jbd] 0x0
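
When triaging a trace like the one above, it can help to split each call-trace entry into its address, symbol, module and offset, for example to pick out the jbd frames. The following is only a minimal sketch: it assumes the entry layout seen in this report (`[<address>] symbol [module] 0xoffset`), and the names `parse_trace_line` and `jbd_frames` are hypothetical helpers, not part of any kernel tool.

```python
import re

# Assumed layout, taken from the trace in this report:
#   [<f8896c73>] journal_commit_transaction [jbd] 0x773
TRACE_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{8})>\]\s+"   # return address in hex
    r"(?P<symbol>\S+)\s+"               # symbol name
    r"\[(?P<module>[^\]]+)\]\s+"        # owning module, or "kernel"
    r"0x(?P<offset>[0-9a-f]+)"          # offset into the symbol
)


def parse_trace_line(line):
    """Return a dict for one call-trace entry, or None if the line has none."""
    m = TRACE_RE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "symbol": m.group("symbol"),
        "module": m.group("module"),
        "offset": int(m.group("offset"), 16),
    }


def jbd_frames(lines):
    """Keep only the entries that belong to the jbd module."""
    frames = (parse_trace_line(line) for line in lines)
    return [f for f in frames if f is not None and f["module"] == "jbd"]


if __name__ == "__main__":
    sample = [
        "Call Trace: [<f889e3e1>] .LC63 [jbd] 0x28b",
        "[<f8896c73>] journal_commit_transaction [jbd] 0x773",
        "[<f8820d8f>] rw_intr [sd_mod] 0x20f",
    ]
    for frame in jbd_frames(sample):
        print(frame["symbol"], hex(frame["offset"]))
```

Filtering by module this way only narrows down which frames to read first; it does not replace decoding the oops against the matching kernel's symbol map.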
A patch in later 2.4 kernels (for transaction.c:612 oopses) introduced a bug that triggered this specific assertion failure rather frequently. We identified a flaw in the way the buffer_jdirty state was being handled, and that is fixed in upstream kernels.

I'm not certain whether the same flaw could be triggered differently by AS-2.1 ext3. Certainly the window is much smaller in that kernel: the flaw was reported very frequently once the later-2.4 kernel transaction.c:612 fix was added, but we have only a few isolated reports of it on AS-2.1. Still, there's a good chance that the later buffer_jdirty fix will fix this on AS-2.1 too.

We have patches back-ported to AS-2.1 to fix both of these issues, and those are in testing internally.

When will this update be through QA?

The fully supported, fully QAed release is scheduled to be part of the forthcoming U3 major AS-2.1 update release. We've got the kernel in testing, though, and we hope that a beta engineering build will be available as soon as Monday for customers to evaluate.

Is the beta now available?

An unsupported engineering kernel containing this fix is now available for testing and evaluation at http://people.redhat.com/~jbaron/.private/testing/2.4.9-e.27.18.test/

Description of problem:
Two oopses on an AS box (e.25 kernel) while trying to write to the journal, with a failing assertion of a locked buffer. Comments in journal.c at line 372 suggest that the assertion is possibly incorrect. The oopses can be viewed in the issue tracker as number 27638.

Version-Release number of selected component (if applicable):

How reproducible:
Unknown; waiting on customer for details on reproducibility.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info: