Bug 76275
Summary: | Assertion failure in unmap_underlying_metadata() at buffer.c:1542: "!buffer_jlist_eq(old_bh, 3)" | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 2.1 | Reporter: | Philip Pokorny <ppokorny> |
Component: | kernel | Assignee: | Larry Woodman <lwoodman> |
Status: | CLOSED NOTABUG | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 2.1 | CC: | epierce, jike.zhou |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2005-09-28 11:13:32 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Philip Pokorny
2002-10-18 23:41:10 UTC
which kernel modules are you using ? [root@golem root]# uname -a Linux golem.mydomain.com 2.4.9-e.8enterprise #1 SMP Fri Jul 19 15:27:35 EDT 2002 i686 unknown [root@golem root]# lsmod Module Size Used by Tainted: P e1000 52740 1 eepro100 21968 1 usb-uhci 26948 0 (unused) usbcore 68896 1 [usb-uhci] ext3 73536 16 jbd 55048 16 [ext3] raid1 16548 13 megaraid 28160 19 aic7xxx 127200 26 sd_mod 13468 45 scsi_mod 125420 3 [megaraid aic7xxx sd_mod] Has this bug fixed? Not to my knowledge. It has not happened again, but the conditions that triggered the problem are not the kind of things I'm interested in inducing on a 24x7 production server. I cant reproduce this problem internally. There were several changes in this area over the past several AS2.1 kernel errata. If you can reproduce this problem with the latest kernel errata(e.24) I'd like to close this bug. Larry Woodman Kernel has been upgraded and SCSI cables were shortened. Problem has not happened again, so close the ticket. Sorry to add to this, I know this bug is over a year old, but I just had the same error with a new (2.4.9-e.48smp) kernel. EXT3-fs error (device sd(8,18)): ext3_free_blocks: bit already cleared for block 53500615 EXT3-fs error (device sd(8,18)): ext3_free_blocks: bit already cleared for block 253599764 EXT3-fs error (device sd(8,18)): ext3_free_blocks: bit already cleared for block 357537704 Assertion failure in unmap_underlying_metadata() at buffer.c:1542: "! buffer_jlist_eq(old_bh, 3)" ------------[ cut here ]------------ kernel BUG at buffer.c:1542! invalid operand: 0000 Kernel 2.4.9-e.48smp CPU: 2 EIP: 0010:[<c0148c1b>] Not tainted EFLAGS: 00010286 EIP is at unmap_underlying_metadata [kernel] 0xdb eax: 00000020 ebx: efb97500 ecx: c02f6ec4 edx: 00005a3a esi: f501e740 edi: d5f0aee0 ebp: 00000800 esp: f55fdd24 ds: 0018 es: 0018 ss: 0018 Process nfsd (pid: 965, stackpage=f55fd000) Stack: c02670c2 00000606 00000001 00000800 00001000 c0148ff1 00000000 00001000 c0149012 efb97500 efb97920 fe15c000 f55fdd64 00000800 00000001 efb97500 00000070 00000014 f887bfd0 f8878cc7 00000014 00000070 c2419934 dc216740 Call Trace: [<c02670c2>] .rodata.str1.1 [kernel] 0x301d (0xf55fdd24) [<c0148ff1>] __block_prepare_write [kernel] 0x121 (0xf55fdd38) [<c0149012>] __block_prepare_write [kernel] 0x142 (0xf55fdd44) [<f887bfd0>] .LC13 [jbd] 0x0 (0xf55fdd6c) [<f8878cc7>] __jbd_kmalloc [jbd] 0x27 (0xf55fdd70) [<c0149932>] block_prepare_write [kernel] 0x22 (0xf55fdd8c) [<f88849e0>] ext3_get_block [ext3] 0x0 (0xf55fdda0) [<f8884ed1>] ext3_prepare_write [ext3] 0xb1 (0xf55fddac) [<f88849e0>] ext3_get_block [ext3] 0x0 (0xf55fddbc) [<c0130f47>] __find_lock_page [kernel] 0x97 (0xf55fddd0) [<c0133d44>] generic_file_write [kernel] 0x434 (0xf55fdde4) [<f8a10c87>] nfsd_open [nfsd] 0x27 (0xf55fde38) [<f8882a82>] ext3_file_write [ext3] 0x22 (0xf55fde54) [<f8a1131c>] nfsd_write [nfsd] 0x14c (0xf55fde74) [<c01da03c>] kfree_skbmem [kernel] 0xc (0xf55fdeb8) [<f8892060>] ext3_file_operations [ext3] 0x0 (0xf55fdedc) [<f8a167b0>] nfsd3_proc_write [nfsd] 0xf0 (0xf55fdf3c) [<f8a20240>] nfsd_procedures3 [nfsd] 0xe0 (0xf55fdf60) [<f8a20240>] nfsd_procedures3 [nfsd] 0xe0 (0xf55fdf68) [<f8a0d5b1>] nfsd_dispatch [nfsd] 0xc1 (0xf55fdf6c) [<f89fd999>] svc_process_Rsmp_d8cce609 [sunrpc] 0x349 (0xf55fdf88) [<f8a1fbd8>] nfsd_version3 [nfsd] 0x0 (0xf55fdf9c) [<f8a1fbf8>] nfsd_program [nfsd] 0x0 (0xf55fdfa0) [<f8a0d39b>] nfsd [nfsd] 0x20b (0xf55fdfbc) [<c0105856>] arch_kernel_thread [kernel] 0x26 (0xf55fdff0) [<f8a0d190>] nfsd [nfsd] 0x0 (0xf55fdff8) Code: 0f 0b 59 5b b8 01 00 00 00 f0 0f b3 46 18 19 c0 85 c0 74 07 <0>Kernel panic: not continuing This machine is part of a NFS server cluster. The cluster ran as it should and transferred the service to the second machine, however, it also kernel panicked with the same error message. When the machines were rebooted, they would come up fine until they started the cluster service and mounted this filesystem, when they would kernel panic. I booted with the rescue disc and disabled the cluster service, then ran fsck on the partition. 72 duplicate blocks were found and repaired and the filesystem has been remounted. |