Bug 196914 - kernel BUG at fs/buffer.c:2789!
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: ia64 Linux
Priority: high  Severity: high
Assigned To: Kernel Maintainer List
QA Contact: Brian Brock
Duplicates: 204008
Depends On:
Blocks: fedora-ia64 208404

Reported: 2006-06-27 12:38 EDT by Doug Chapman
Modified: 2007-11-30 17:11 EST
Doc Type: Bug Fix
Last Closed: 2006-10-02 20:37:48 EDT

Attachments: None

Description Doug Chapman 2006-06-27 12:38:15 EDT
Description of problem:
This is seen occasionally in the LTP test suite using the fsx-linux test.  If I
run that specific test in a loop I can reproduce this 100% on my HP Integrity
servers.

The test hangs after the trace is printed. Sometimes I can log in after this, but
any filesystem access usually hangs at that point, so the system is unusable.

kernel BUG at fs/buffer.c:2789!
kjournald[458]: bugcheck! 0 [1]
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc
ip_conntrack_netbios_ns ipt_REJECT iptable_filter ip_tables xt_state
ip_conntrack nfnetlink xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 vfat
fat button parport_pc lp parport qla2xxx ide_cd e1000 sg ohci_hcd
scsi_transport_fc ehci_hcd cdrom dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd
mptspi scsi_transport_spi mptscsih sd_mod scsi_mod mptbase

Pid: 458, CPU 1, comm:            kjournald
psr : 0000101008526030 ifs : 8000000000000309 ip  : [<a0000001001417a0>]    Not tainted
ip is at submit_bh+0xa0/0x380
unat: 0000000000000000 pfs : 0000000000000309 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr  : 0000000000005541
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a0000001001417a0 b6  : a0000001000a63a0 b7  : a0000001002a5f00
f6  : 1003e0000000000000090 f7  : 1003e20c49ba5e353f7cf
f8  : 1003e0000000000000465 f9  : 1003e000000000e100000
f10 : 1003e0000000035a4e900 f11 : 1003e431bde82d7b634db
r1  : a000000100b73fa0 r2  : a00000010098b9b8 r3  : e000004045128ff4
r8  : 0000000000000023 r9  : a0000001009756d0 r10 : a00000010098b9e8
r11 : a00000010098b9e8 r12 : e00000404512fce0 r13 : e000004045128000
r14 : a00000010098b9b8 r15 : 0000000000000000 r16 : ffffffffdead4ead
r17 : 00000000dead4ead r18 : a0000001008bcdec r19 : 0000000000000000
r20 : 0000000000000074 r21 : a000000100974cc8 r22 : 0000000000000001
r23 : a0000001007c3080 r24 : a000000100974cc8 r25 : a00000010098b9c0
r26 : a00000010098b9c0 r27 : e000000035de0040 r28 : e000000035de0038
r29 : 00000027ffffffd8 r30 : 00000000000000c8 r31 : 000000000027ffd8

Call Trace:
 [<a000000100013ae0>] show_stack+0x40/0xa0
                                sp=e00000404512f870 bsp=e000004045129300
 [<a0000001000143e0>] show_regs+0x840/0x880
                                sp=e00000404512fa40 bsp=e0000040451292a8
 [<a000000100037280>] die+0x1c0/0x2c0
                                sp=e00000404512fa40 bsp=e000004045129260
 [<a0000001000373d0>] die_if_kernel+0x50/0x80
                                sp=e00000404512fa60 bsp=e000004045129230
 [<a0000001005f6710>] ia64_bad_break+0x270/0x4a0
                                sp=e00000404512fa60 bsp=e000004045129208
 [<a00000010000c640>] ia64_leave_kernel+0x0/0x280
                                sp=e00000404512fb10 bsp=e000004045129208
 [<a0000001001417a0>] submit_bh+0xa0/0x380
                                sp=e00000404512fce0 bsp=e0000040451291b8
 [<a000000100144e50>] ll_rw_block+0x210/0x280
                                sp=e00000404512fce0 bsp=e000004045129178
 [<a00000020064b660>] journal_commit_transaction+0xa20/0x2d40 [jbd]
                                sp=e00000404512fce0 bsp=e000004045129108
 [<a000000200656770>] kjournald+0x1b0/0x500 [jbd]
                                sp=e00000404512fd20 bsp=e0000040451290b0
 [<a0000001000a6100>] kthread+0x220/0x2a0
                                sp=e00000404512fd50 bsp=e000004045129068
 [<a000000100011ff0>] kernel_thread_helper+0x30/0x60
                                sp=e00000404512fe30 bsp=e000004045129040
 [<a0000001000090c0>] start_kernel_thread+0x20/0x40
                                sp=e00000404512fe30 bsp=e000004045129040



Version-Release number of selected component (if applicable):
kernel-2.6.17-1.2307_FC6

How reproducible:
100% if the specific test is run in a loop


Steps to Reproduce:
1. obtain the latest LTP test suite from ltp.sf.net
2. unpack the suite:
          # tar zxvf ltp-full-20060515.tgz
3. build just this test with:
          # make testcases/kernel/fs/fsx-linux/fsx-linux
4. run the test in a loop:
          # while true; do
            testcases/kernel/fs/fsx-linux/fsx-linux -N 10000 /tmp/testfile
            done
5. panic/oops usually happens within 2 minutes on my systems
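
The loop in step 4 runs until it is interrupted. A bounded variant can be
sketched as follows. This is only an illustration: run_until_failure is a
hypothetical helper, not part of LTP, and the fsx-linux path and options are
the ones from the steps above. Because the failure reported here is a kernel
BUG followed by a hang, the helper may never see a non-zero exit status on an
affected system; it mainly caps the number of iterations on a system that does
not hit the bug.

```shell
#!/bin/sh
# Hypothetical helper (not part of LTP): run a command up to MAX times and
# stop at the first iteration that exits with a non-zero status.
run_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "iteration $i failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "completed $max iterations without a failure"
    return 0
}

# Example invocation, using the test path and options from step 4:
# run_until_failure 100 testcases/kernel/fs/fsx-linux/fsx-linux -N 10000 /tmp/testfile
```

The iteration count of 100 in the example is an arbitrary choice; watch the
console for the BUG trace while the loop runs.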


  
Actual results:


Expected results:


Additional info:
Comment 1 Prarit Bhargava 2006-09-18 13:11:51 EDT
Doug, care to retest and let me know if this is still occurring on the latest
builds (or at least 2630, fc6-test3)?

Thanks,

P.
Comment 2 Doug Chapman 2006-09-18 15:14:25 EDT
Just re-tested this with 2.6.17-1.2647.fc6 and I still see the error:

kernel BUG at fs/buffer.c:2793!
kjournald[572]: bugcheck! 0 [1]
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc
ip_conntrack_netbios_ns ipt_REJECT iptable_filter ip_tables xt_state
ip_conntrack nfnetlink xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 vfat
fat dm_multipath button parport_pc lp parport sr_mod cdrom joydev sg tg3 shpchp
dm_snapshot dm_zero dm_mirror dm_mod usb_storage mptsas mptscsih mptbase
scsi_transport_sas sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd

Pid: 572, CPU 3, comm:            kjournald
psr : 00001010085a6010 ifs : 8000000000000309 ip  : [<a00000010015aba0>]    Not tainted
ip is at submit_bh+0xa0/0x380
unat: 0000000000000000 pfs : 0000000000000309 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr  : 0000000000009541
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70433f
csd : 0000000000000000 ssd : 0000000000000000
b0  : a00000010015aba0 b6  : a0000001000112e0 b7  : a000000100296c80
f6  : 1003e00000000000000a0 f7  : 1003e20c49ba5e353f7cf
f8  : 1003e00000000000004e2 f9  : 1003e000000000fa00000
f10 : 1003e000000003b9aca00 f11 : 1003e431bde82d7b634db
r1  : a000000100bb1800 r2  : a0000001009c92f0 r3  : e00001007c289034
r8  : 0000000000000023 r9  : a0000001009c5a10 r10 : a0000001009c9320
r11 : a0000001009c9320 r12 : e00001007c28fce0 r13 : e00001007c288000
r14 : a0000001009c92f0 r15 : 0000000000000000 r16 : ffffffffdead4ead
r17 : 00000000dead4ead r18 : a0000001008f886c r19 : a0000001009c5a08
r20 : 0000000000000000 r21 : a0000001009b1e98 r22 : 0000000000000004
r23 : a0000001007ff100 r24 : a0000001009b1e98 r25 : a0000001009c92f8
r26 : a0000001009c92f8 r27 : a0000001007a1020 r28 : a0000001007a0008
r29 : e000010076ae8060 r30 : a0000001007a002c r31 : e000010076ae802c

Call Trace:
 [<a000000100013e80>] show_stack+0x40/0xa0
                                sp=e00001007c28f870 bsp=e00001007c289340
 [<a000000100014780>] show_regs+0x840/0x880
                                sp=e00001007c28fa40 bsp=e00001007c2892e8
 [<a000000100037b60>] die+0x1c0/0x2a0
                                sp=e00001007c28fa40 bsp=e00001007c2892a0
 [<a000000100037c90>] die_if_kernel+0x50/0x80
                                sp=e00001007c28fa60 bsp=e00001007c289270
 [<a000000100625d90>] ia64_bad_break+0x270/0x4a0
                                sp=e00001007c28fa60 bsp=e00001007c289248
 [<a00000010000c700>] __ia64_leave_kernel+0x0/0x280
                                sp=e00001007c28fb10 bsp=e00001007c289248
 [<a00000010015aba0>] submit_bh+0xa0/0x380
                                sp=e00001007c28fce0 bsp=e00001007c289200
 [<a00000010015e230>] ll_rw_block+0x210/0x280
                                sp=e00001007c28fce0 bsp=e00001007c2891b8
 [<a00000020073f6a0>] journal_commit_transaction+0xa20/0x2d80 [jbd]
                                sp=e00001007c28fce0 bsp=e00001007c289148
 [<a00000020074a790>] kjournald+0x1b0/0x500 [jbd]
                                sp=e00001007c28fd20 bsp=e00001007c2890f0
 [<a0000001000adea0>] kthread+0x220/0x2a0
                                sp=e00001007c28fd50 bsp=e00001007c2890a8
 [<a0000001000123f0>] kernel_thread_helper+0x30/0x60
                                sp=e00001007c28fe30 bsp=e00001007c289080
 [<a0000001000090c0>] start_kernel_thread+0x20/0x40
                                sp=e00001007c28fe30 bsp=e00001007c289080

Comment 3 Matthew Wilcox 2006-09-25 13:17:06 EDT
We've seen what I assume to be the same bug (line 2784 in our case, but I note
there are three BUG statements in a row) with RHEL U4 Beta 3 using a Smart Array
P400 Controller on a Montecito-based machine.  We hit it 23 hours into a reboot
test.
Comment 4 Eric Sandeen 2006-09-25 15:50:11 EDT
*** Bug 204008 has been marked as a duplicate of this bug. ***
Comment 5 Eric Sandeen 2006-09-25 16:13:40 EDT
See also bug 207739 and the patch attached to it...
Comment 6 Matthew Wilcox 2006-09-25 16:53:28 EDT
I'm not authorised to access that bug.
Comment 7 Eric Sandeen 2006-09-25 17:12:15 EDT
Whoops sorry!

It's referencing an upstream patch:

http://lkml.org/lkml/2006/9/7/236

Perhaps you could test with that in place?
Comment 8 Eric Sandeen 2006-09-27 18:27:07 EDT
easier-to-get-to patch:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.18-rc7/2.6.18-rc7-mm1/broken-out/jbd-fix-commit-of-ordered-data-buffers.patch

Should apply cleanly to FC6 as it is today.

Any reports of success/failure w/ that patch would be appreciated.
Comment 9 Doug Chapman 2006-09-27 20:00:18 EDT
Eric,

I built a kernel using kernel-2.6.18-1.2699.fc6.src.rpm + the patch above and it
does indeed appear to fix the problem.

Comment 10 Dave Jones 2006-10-02 20:37:48 EDT
Current rawhide kernels have this patch, so I'm going to close this.
I did manage to trip a BUG at fs/buffer.c:2793 today, however. That looks to be
a different bug, which Eric is chasing in bug 209005.