Description of problem:
When using mkisofs a BUG() was hit in mm/slab.c. The bug appears to be ext3 related.

Version-Release number of selected component (if applicable):
kernel-2.6.20-1.3088.fc7

How reproducible:
Has happened twice on Altix3 (64p/128G).

Steps to Reproduce:
1. Attempt to create an iso using mkisofs.

Actual results:
altix3.rhts.boston.redhat.com login: kernel BUG at mm/slab.c:2381!
mkisofs[18405]: bugcheck! 0 [1]
Modules linked in: loop autofs4 hidp nfs lockd nfs_acl rfcomm l2cap bluetooth sunrpc nf_conntrack_netbios_ns nf_conntrack_ipv4 xt_state nf_conntrack nfnetlink xt_tcpudp ipt_REJECT iptable_filter ip_tables ip6_tables x_tables vfat fat dm_multipath button parport_pc lp parport tg3 sg dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc mptspi mptscsih scsi_transport_spi mptbase qla1280 sd_mod scsi_mod ext3 jbd mbcache ehci_hcd ohci_hcd uhci_hcd

Pid: 18405, CPU 0, comm: mkisofs
psr : 0000101008522030 ifs : 8000000000000610 ip : [<a000000100155ae0>] Not tainted
ip is at cache_alloc_refill+0x2c0/0x5c0
unat: 0000000000000000 pfs : 0000000000000610 rsc : 0000000000000003
rnat: ccccccccc8c00000 bsps: 000000000000fffc pr : 0000000000599559
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a000000100155ae0 b6 : a0000001004a2be0 b7 : a0000001003cc2a0
f6 : 1003e00000000000000a0 f7 : 1003e20c49ba5e353f7cf
f8 : 1003e00000000000004e2 f9 : 1003e000000000fa00000
f10 : 1003e000000003b9aca00 f11 : 1003e431bde82d7b634db
r1 : a000000100d85c90 r2 : a000000100b9d680 r3 : e00009b004358ca4
r8 : 0000000000000021 r9 : a000000100b9d6b8 r10 : a000000100b9d6b8
r11 : a000000100b99790 r12 : e00009b00435fc50 r13 : e00009b004358000
r14 : a000000100b9d680 r15 : 0000000000000000 r16 : ffffffffdead4ead
r17 : 00000000dead4ead r18 : a000000100aaf324 r19 : a000000100b99760
r20 : e000003003107310 r21 : ffffffffffff7310 r22 : e000003003110000
r23 : ffffffffffff0028 r24 : 8000010800000000 r25 : 0000010000000000
r26 : 0fd0000000000000 r27 : 0000000000000004 r28 : 0000000000000026
r29 : 80000001fdc01040 r30 : 0000000000000000 r31 : 0000000000001040

Call Trace:
 [<a000000100013ee0>] show_stack+0x40/0xa0
                                sp=e00009b00435f7e0 bsp=e00009b004359510
 [<a0000001000147e0>] show_regs+0x840/0x880
                                sp=e00009b00435f9b0 bsp=e00009b0043594b8
 [<a000000100037f00>] die+0x1c0/0x2a0
                                sp=e00009b00435f9b0 bsp=e00009b004359470
 [<a000000100038030>] die_if_kernel+0x50/0x80
                                sp=e00009b00435f9d0 bsp=e00009b004359440
 [<a0000001005ffb10>] ia64_bad_break+0x270/0x4a0
                                sp=e00009b00435f9d0 bsp=e00009b004359418
 [<a00000010000bf80>] ia64_leave_kernel+0x0/0x280
                                sp=e00009b00435fa80 bsp=e00009b004359418
 [<a000000100155ae0>] cache_alloc_refill+0x2c0/0x5c0
                                sp=e00009b00435fc50 bsp=e00009b004359390
 [<a000000100156900>] kmem_cache_alloc+0x1a0/0x280
                                sp=e00009b00435fc50 bsp=e00009b004359360
 [<a000000236404090>] journal_add_journal_head+0x50/0x560 [jbd]
                                sp=e00009b00435fc50 bsp=e00009b004359330
 [<a0000002363f8ff0>] journal_dirty_data+0x70/0x700 [jbd]
                                sp=e00009b00435fc50 bsp=e00009b0043592e8
 [<a0000002364c3050>] ext3_journal_dirty_data+0x30/0xc0 [ext3]
                                sp=e00009b00435fc50 bsp=e00009b0043592b8
 [<a0000002364c0e70>] walk_page_buffers+0xd0/0x160 [ext3]
                                sp=e00009b00435fc50 bsp=e00009b004359250
 [<a0000002364c7960>] ext3_ordered_commit_write+0xc0/0x1e0 [ext3]
                                sp=e00009b00435fc50 bsp=e00009b004359200
 [<a000000100102250>] generic_file_buffered_write+0x9b0/0xdc0
                                sp=e00009b00435fc50 bsp=e00009b004359130
 [<a000000100102d50>] __generic_file_aio_write_nolock+0x6f0/0x800
                                sp=e00009b00435fce0 bsp=e00009b0043590b8
 [<a000000100102f40>] generic_file_aio_write+0xe0/0x1e0
                                sp=e00009b00435fd20 bsp=e00009b004359058
 [<a0000002364bdc80>] ext3_file_write+0x60/0x200 [ext3]
                                sp=e00009b00435fd20 bsp=e00009b004359020
 [<a000000100166f80>] do_sync_write+0x180/0x200
                                sp=e00009b00435fd20 bsp=e00009b004358fe0
 [<a0000001001685c0>] vfs_write+0x200/0x3a0
                                sp=e00009b00435fe20 bsp=e00009b004358f90
 [<a000000100169110>] sys_write+0x70/0xe0
                                sp=e00009b00435fe20 bsp=e00009b004358f18
 [<a00000010000bd30>] ia64_trace_syscall+0xd0/0x110
                                sp=e00009b00435fe30 bsp=e00009b004358f18
 [<a000000000010620>] __kernel_syscall_via_break+0x0/0x20
                                sp=e00009b004360000 bsp=e00009b004358f18

Expected results:

Additional info:
Seen on "F7-test4" for ia64.

P.
This kind of bug also occurs while installing fedora 7 test4 on SAS disks on a Cell/B.E. based system (ppc64).

ERROR: via: reading /dev/sdb[Input/output error]
No handlers could be found for logger "yum.YumBase"
------------[ cut here ]------------
cpu 0x2: Vector: 700 (Program Check) at [c00000007d34f300]
    pc: c0000000000f6670: .cache_alloc_refill+0x150/0x2d4
    lr: c0000000000f6638: .cache_alloc_refill+0x118/0x2d4
    sp: c00000007d34f580
   msr: 9000000000029032
  current = 0xc00000007dd660c0
  paca    = 0xc0000000005de880
    pid   = 1073, comm = anaconda
kernel BUG at mm/slab.c:2381!
enter ? for help
[c00000007d34f640] c0000000000f6ee4 .kmem_cache_alloc+0x130/0x1b0
[c00000007d34f6e0] d00000000015280c .ext3_alloc_inode+0x24/0x68 [ext3]
[c00000007d34f760] c00000000011a68c .alloc_inode+0x50/0x1ec
[c00000007d34f7f0] c00000000011abf0 .new_inode+0x30/0xc4
[c00000007d34f880] d000000000147718 .ext3_new_inode+0x8c/0xc54 [ext3]
[c00000007d34f9a0] d000000000150b18 .ext3_create+0xbc/0x160 [ext3]
[c00000007d34fa60] c00000000010b234 .vfs_create+0x158/0x200
[c00000007d34fb00] c00000000010f184 .open_namei+0x1cc/0x848
[c00000007d34fbe0] c0000000000fe338 .do_filp_open+0x38/0x70
[c00000007d34fd00] c0000000000fe3e4 .do_sys_open+0x74/0x130
[c00000007d34fdb0] c00000000013c600 .compat_sys_open+0x24/0x38
[c00000007d34fe30] c0000000000087c8 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 000000000fe1da54
SP (ffa7f2a0) is in userspace
Reports are now from ia64 & ppc64 -- switching arch to "All". P.
The current builds have various debugging options disabled. There are, however, kernel-debug packages that are otherwise equivalent but have the debugging turned back on, available at http://people.redhat.com/davej/kernels/Fedora/fc7/RPMS.kernel/i686/

While diagnosing this bug further, it would be good if you could run those instead of the rawhide kernels. (There's a .repo file a few directories up from that URL; with it in place you can just yum install kernel-debug.)
Sorry, but we are no longer able to reproduce the bug. All we changed was the FW level and the setup of the SAS disks. Now it works fine. If the bug appears again, I'll be happy to try the debug kernel.
2377 static void check_spinlock_acquired(struct kmem_cache *cachep)
2378 {
2379 #ifdef CONFIG_SMP
2380         check_irq_off();
2381         assert_spin_locked(&cachep->nodelists[numa_node_id()]->list_lock);
2382 #endif
2383 }

is what we hit.

So it's checking a lock on the cache... well... this probably isn't ext3's fault is it? Though I don't see anything in newer kernels that resolves it (... I don't claim to be a mm/* expert...)

Has this ever been tested on newer kernels?

-Eric
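For anyone following along who is not familiar with this kind of check, here is a minimal user-space sketch in C of the pattern: a routine that works on a shared list refuses to proceed unless the list's lock is already held. Everything named toy_* is hypothetical and made up for illustration; it is not the kernel's spinlock_t or slab code, and where the kernel's assertion would trigger BUG(), this sketch just returns -1.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock (hypothetical): 'held' is true while someone owns it. */
struct toy_spinlock {
    atomic_bool held;
};

static void toy_spin_lock(struct toy_spinlock *l)
{
    /* Spin until we are the caller that flips 'held' from false to true. */
    while (atomic_exchange(&l->held, true))
        ;
}

static void toy_spin_unlock(struct toy_spinlock *l)
{
    atomic_store(&l->held, false);
}

static bool toy_spin_is_locked(struct toy_spinlock *l)
{
    return atomic_load(&l->held);
}

/* Hypothetical stand-in for one per-node free list of a cache. */
struct toy_node_list {
    struct toy_spinlock list_lock;
    int free_objects;
};

/*
 * Analogue of a refill step that must run with list_lock held.
 * Returns the number of objects taken, or -1 if the lock is not
 * held (the point where a kernel assertion would fire instead).
 */
static int toy_refill(struct toy_node_list *n, int wanted)
{
    if (!toy_spin_is_locked(&n->list_lock))
        return -1;

    int taken = wanted < n->free_objects ? wanted : n->free_objects;
    n->free_objects -= taken;
    return taken;
}
```

Calling toy_refill() on a list whose lock was never taken returns -1; taking the lock with toy_spin_lock() first lets it proceed. Like the check quoted above, this only verifies that the lock is held by someone, not that the caller is the holder.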
Just tried this on F8T1. It works :) Switching to NOTABUG. P.