Bug 237815
| Summary: | Assertion failure in journal_commit_transaction() at fs/jbd/commit.c:793: "jh->b_next_transaction == ((void *)0)" | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Matthew Coffey <mcoffey> |
| Component: | kernel | Assignee: | Eric Sandeen <esandeen> |
| Status: | CLOSED DUPLICATE | QA Contact: | Martin Jenner <mjenner> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | ||
| Version: | 4.4 | CC: | jbaron |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2007-05-31 18:14:35 UTC | Type: | --- |

This looks like a dup of Bug 158363: Assert panic in fs/jbd/commit.c:790:journal_commit_transaction(), which is fixed in kernel 2.6.9-42.5 and later, i.e. RHEL4U5.

Dup'ing to bug 158363, the fix for which is available in kernels 2.6.9-50.EL and beyond (i.e., RHEL4U5). If the problem persists with that update, please re-open this bug.

Thanks,
-Eric

*** This bug has been marked as a duplicate of 158363 ***
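
To check whether a given machine is already running a kernel at or beyond the 2.6.9-50.EL build named in the comment above, something along these lines might do. It is only a rough sketch: `release_key` and `has_fix` are hypothetical helper names, the threshold is taken from the comment as-is, and the comparison looks only at the leading numeric fields of the release string. It is not RPM's own version-comparison logic.

```python
import re

# Threshold taken from the comment above; treat it as an assumption.
FIXED_IN = "2.6.9-50.EL"


def release_key(release):
    """Return the leading numeric fields of a kernel release as a tuple of ints."""
    fields = []
    for part in re.split(r"[.-]", release):
        if not part.isdigit():
            break  # stop at the first non-numeric field, e.g. "EL" or "ELsmp"
        fields.append(int(part))
    return tuple(fields)


def has_fix(release, fixed_in=FIXED_IN):
    """Naive check: is `release` numerically at or beyond `fixed_in`?"""
    return release_key(release) >= release_key(fixed_in)


if __name__ == "__main__":
    # Kernel release string reported in the oops for this bug.
    print(has_fix("2.6.9-42.0.2.ELsmp"))
```

On a live system the release string could be taken from `platform.release()` or `uname -r` instead of being hard-coded.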
Description of problem:
Kernel panic with the message:
Assertion failure in journal_commit_transaction() at fs/jbd/commit.c:793: "jh->b_next_transaction == ((void *)0)"
The server is an HP DL360g5 running Oracle RAC.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux AS release 4 (Nahant Update 4)

How reproducible:
Only one occurrence to date.

Steps to Reproduce:
1. Server had been up for 58 hours and the load on the system at the time of the kernel panic was minimal.

Actual results:
Full OOPS

Assertion failure in journal_commit_transaction() at fs/jbd/commit.c:793: "jh->b_next_transaction == ((void *)0)"
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at commit:793
invalid operand: 0000 [1] SMP
CPU 2
Modules linked in: hangcheck_timer md5 ipv6 qioctlmod sunrpc ds yenta_socket pcmcia_core dm_mirror dm_round_robin dm_multipath dm_mod button battery ac ohci_hcd hw_random bonding(U) e1000 tg3 floppy ext3 jbd qla2300 qla2xxx scsi_transport_fc cciss sd_mod scsi_mod
Pid: 569, comm: kjournald Not tainted 2.6.9-42.0.2.ELsmp
RIP: 0010:[<ffffffffa00956bd>] <ffffffffa00956bd>{:jbd:journal_commit_transaction+4073}
RSP: 0018:00000101fd5ffbb8 EFLAGS: 00010212
RAX: 0000000000000075 RBX: 0000000000000000 RCX: ffffffff803e1fe8
RDX: ffffffff803e1fe8 RSI: 0000000000000246 RDI: ffffffff803e1fe0
RBP: 0000010085ec00e0 R08: ffffffff803e1fe8 R09: 0000000000000000
R10: 0000000100000000 R11: ffffffff8011e884 R12: 0000010034434348
R13: 00000100df633580 R14: 00000101fe3e6a00 R15: 0000000000000000
FS: 0000002a970416e0(0000) GS:ffffffff804e5280(0000) knlGS:00000000f7f248e0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a959add44 CR3: 0000000008032000 CR4: 00000000000006e0
Process kjournald (pid: 569, threadinfo 00000101fd5fe000, task 00000101fe34a7f0)
Stack: 00000ed400000000 00000100a2b8d12c 00000000800327e0 00000100a126d660
 00000100a9147818 0000000000001568 0000000000000000 00000101fe34a7f0
 ffffffff80135752 00000101fd5ffc30
Call Trace:<ffffffff80135752>{autoremove_wake_function+0} <ffffffff8013271e>{try_to_wake_up+876}
 <ffffffff80135752>{autoremove_wake_function+0} <ffffffff8030a1c1>{thread_return+0}
 <ffffffff8030a219>{thread_return+88} <ffffffff8013fdf3>{del_timer+107}
 <ffffffffa0097914>{:jbd:kjournald+250} <ffffffff80135752>{autoremove_wake_function+0}
 <ffffffff80135752>{autoremove_wake_function+0} <ffffffffa0097814>{:jbd:commit_timeout+0}
 <ffffffff80110f47>{child_rip+8} <ffffffffa009781a>{:jbd:kjournald+0}
 <ffffffff80110f3f>{child_rip+0}
Code: 0f 0b d0 a8 09 a0 ff ff ff ff 19 03 4c 89 e7 e8 b5 e3 ff ff
RIP <ffffffffa00956bd>{:jbd:journal_commit_transaction+4073} RSP <00000101fd5ffbb8>
<0>Kernel panic - not syncing: Oops
Uhhuh. NMI received. Dazed and confused, but trying to continue
Uhhuh. NMI received. Dazed and confused, but trying to continue
You probably have a hardware problem with your RAM chips
Uhhuh. NMI received. Dazed and confused, but trying to continue
You probably have a hardware problem with your RAM chips
Uhhuh. NMI received. Dazed and confused, but trying to continue
You probably have a hardware problem with your RAM chips
You probably have a hardware problem with your RAM chips

This could be caused by a memory error, but the DIMMs all report OK according to hpasm.

Expected results:

Additional info:
This looks very much like bugzilla # 161101, except that the line in question has moved from 790 to 793. The description of the new kernel that the ERRATA cites as fixing the bug makes no mention of this bug being fixed. I tried attaching the sysreport from the server, but it is too big for the upload form.