Bug 204625

Summary: ia64: modprobe dying with a bugcheck
Product: [Fedora] Fedora
Reporter: Prarit Bhargava <prarit>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: 6
CC: dchapman, wtogami
Target Milestone: ---
Target Release: ---
Hardware: ia64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2006-09-20 12:04:42 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 163350

Description Prarit Bhargava 2006-08-30 14:47:24 UTC
Description of problem:

modprobe hits a kernel bugcheck, which causes the install to fail.

Version-Release number of selected component (if applicable): 20060830


How reproducible: 100%


Steps to Reproduce:
1. Attempt to install via NFS
  
Actual results:

After attempting to identify what video hardware, etc., is in the system, the 
kernel dies with:

modprobe[4819]: bugcheck! 0 [1]                                                 
Modules linked in: dm_emc dm_round_robin dm_multipath dm_snapshot dm_mirror
dm_zero dm_mod xfs jfs reiserfs lock_nolock gfs2 ext3 jbd msdos raid456 xor
raid1 raid0 qla1280 qla2xxx scsi_transport_fc mptspi scsi_transport_spi mptscsih
mptbase tg3 ioc4 iscsi_tcp libiscsi scsi_transport_iscsi sr_mod sd_mod scsi_mod
ide_cd cdrom ipv6 squashfs loop nfs nfs_acl fscache lockd sunrpc vfat fat cramfs
      
                                                                                
Pid: 4819, CPU 13, comm:             modprobe                                   
psr : 0000101008522030 ifs : 800000000000038c ip  : [<a000000100140ff0>]    Not tainted
ip is at check_slabp+0x210/0x240                                                
unat: 0000000000000000 pfs : 000000000000038c rsc : 0000000000000003            
rnat: 0000000044f56bbd bsps: 000000002ebae400 pr  : 999a156aa95a6555            
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f            
csd : 0000000000000000 ssd : 0000000000000000                                   
b0  : a000000100140ff0 b6  : a000000100066fa0 b7  : a000000100240660            
f6  : 0fffbccccccccc8c00000 f7  : 0ffdaa200000000000000                         
f8  : 100008000000000000000 f9  : 10002a000000000000000                         
f10 : 0fffcccccccccc8c00000 f11 : 1003e0000000000000000                         
r1  : a000000100ba13c0 r2  : a0000001009b8ba0 r3  : e00008b0796a9044            
r8  : 0000000000000021 r9  : a0000001009b6d20 r10 : a0000001009b8bd0            
r11 : a0000001009b8bd0 r12 : e00008b0796afb50 r13 : e00008b0796a8000            
r14 : a0000001009b8ba0 r15 : 0000000000000000 r16 : ffffffffdead4ead            
r17 : 00000000dead4ead r18 : a000000100841b64 r19 : a0000001009b6d18           
r20 : 0000000000000000 r21 : a0000001009a1a58 r22 : a0000001009b8ed8
r23 : a0000001007f3100 r24 : a0000001009a1a58 r25 : a0000001009b8ba8
r26 : a0000001009b8ba8 r27 : a0000001009a1be8 r28 : a000000100841b68
r29 : 000000000000000d r30 : a000000100841b70 r31 : e00008b0796a9044

Call Trace:
 [<a000000100013e80>] show_stack+0x40/0xa0
                                sp=e00008b0796af6e0 bsp=e00008b0796a95e8
 [<a000000100014780>] show_regs+0x840/0x880
                                sp=e00008b0796af8b0 bsp=e00008b0796a9590
 [<a000000100037b80>] die+0x1c0/0x2a0
                                sp=e00008b0796af8b0 bsp=e00008b0796a9548
 [<a000000100037cb0>] die_if_kernel+0x50/0x80
                                sp=e00008b0796af8d0 bsp=e00008b0796a9518
 [<a00000010061e350>] ia64_bad_break+0x270/0x4a0
                                sp=e00008b0796af8d0 bsp=e00008b0796a94f0
 [<a00000010000c700>] __ia64_leave_kernel+0x0/0x280
                                sp=e00008b0796af980 bsp=e00008b0796a94f0
 [<a000000100140ff0>] check_slabp+0x210/0x240
                                sp=e00008b0796afb50 bsp=e00008b0796a9490
 [<a000000100144260>] __cache_alloc_node+0x100/0x300
                                sp=e00008b0796afb50 bsp=e00008b0796a9448
 [<a000000100144580>] alternate_node_alloc+0x120/0x160
                                sp=e00008b0796afb50 bsp=e00008b0796a9418
 [<a0000001001458e0>] kmem_cache_zalloc+0xa0/0x260
                                sp=e00008b0796afb50 bsp=e00008b0796a93e0
 [<a00000010011fab0>] do_brk+0x390/0x520
                                sp=e00008b0796afb50 bsp=e00008b0796a9398
 [<a0000001001c2330>] set_brk+0x70/0x120
                                sp=e00008b0796afb80 bsp=e00008b0796a9368
 [<a0000001001c3d50>] load_elf_binary+0x10f0/0x2860
                                sp=e00008b0796afb80 bsp=e00008b0796a9268
 [<a000000100170af0>] search_binary_handler+0x210/0x6e0
                                sp=e00008b0796afb90 bsp=e00008b0796a9218
 [<a000000100173660>] do_execve+0x2c0/0x4e0
                                sp=e00008b0796afb90 bsp=e00008b0796a91a8
 [<a000000100012c20>] sys_execve+0x60/0xa0
                                sp=e00008b0796afb90 bsp=e00008b0796a9170
 [<a00000010000bc50>] ia64_execve+0x30/0x140
                                sp=e00008b0796afb90 bsp=e00008b0796a9120
 [<a00000010000c490>] __ia64_trace_syscall+0xd0/0x110
                                sp=e00008b0796afb90 bsp=e00008b0796a9120
 [<a00000010000c380>] execve+0x0/0x20
                                sp=e00008b0796afd60 bsp=e00008b0796a9120
 [<a0000001000a3820>] __exec_usermodehelper+0x140/0x160
                                sp=e00008b0796afd60 bsp=e00008b0796a90e0
 [<a0000001000a3940>] ____call_usermodehelper+0x100/0x140
                                sp=e00008b0796afd60 bsp=e00008b0796a90b8
 [<a0000001000123f0>] kernel_thread_helper+0x30/0x60
                                sp=e00008b0796afe30 bsp=e00008b0796a9090
 [<a0000001000090c0>] start_kernel_thread+0x20/0x40
                                sp=e00008b0796afe30 bsp=e00008b0796a9090
 <0>BUG: spinlock lockup on CPU#2, events/2/196, e00000b0030696e0 (Not tainted)

Call Trace:
 [<a000000100013e80>] show_stack+0x40/0xa0
                                sp=e00005b005187b30 bsp=e00005b005181280
 [<a000000100013f10>] dump_stack+0x30/0x60
                                sp=e00005b005187d00 bsp=e00005b005181268
 [<a0000001002acf80>] _raw_spin_lock+0x200/0x260
                                sp=e00005b005187d00 bsp=e00005b005181230
 [<a00000010061c970>] _spin_lock_irq+0x30/0x60
                                sp=e00005b005187d00 bsp=e00005b005181210
 [<a000000100147530>] drain_array+0xb0/0x200
                                sp=e00005b005187d00 bsp=e00005b0051811b8
 [<a00000010014ad90>] cache_reap+0x270/0x580
                                sp=e00005b005187d00 bsp=e00005b005181170
 [<a0000001000a4060>] run_workqueue+0x1c0/0x280
                                sp=e00005b005187d00 bsp=e00005b005181130
 [<a0000001000a5f40>] worker_thread+0x1a0/0x240
                                sp=e00005b005187d00 bsp=e00005b005181100
 [<a0000001000adc60>] kthread+0x220/0x2a0
                                sp=e00005b005187d50 bsp=e00005b0051810b8
 [<a0000001000123f0>] kernel_thread_helper+0x30/0x60
                                sp=e00005b005187e30 bsp=e00005b005181090
 [<a0000001000090c0>] start_kernel_thread+0x20/0x40
                                sp=e00005b005187e30 bsp=e00005b005181090

Expected results:  No panic/BUG should be seen.

Comment 1 Prarit Bhargava 2006-08-30 15:16:49 UTC
Issue is not seen with the latest kernel from 20060830 ...

P.

Comment 2 Prarit Bhargava 2006-08-30 15:36:27 UTC
Ouch ... this is being caused by slab corruption -- the BUG was triggered by:

static void check_slabp(struct kmem_cache *cachep, struct slab *slabp)
{
        kmem_bufctl_t i;
        int entries = 0;

        /* Check slab's freelist to see if this obj is there. */
        for (i = slabp->free; i != BUFCTL_END; i = slab_bufctl(slabp)[i]) {
                entries++;
                if (entries > cachep->num || i >= cachep->num)
                        goto bad;
        }
        if (entries != cachep->num - slabp->inuse) {
bad:
                printk(KERN_ERR "slab: Internal list corruption detected in "
                                "cache '%s'(%d), slabp %p(%d). Hexdump:\n",
                        cachep->name, cachep->num, slabp, slabp->inuse);
                for (i = 0;
                     i < sizeof(*slabp) + cachep->num * sizeof(kmem_bufctl_t);
                     i++) {
                        if (i % 16 == 0)
                                printk("\n%03x:", i);
                        printk(" %02x", ((unsigned char *)slabp)[i]);
                }
                printk("\n");
                BUG();
        }
}
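
For readers unfamiliar with this check, here is a minimal user-space sketch of the same freelist walk. It is not kernel code: `toy_slab`, `toy_check_slab`, `toy_demo`, `TOY_NUM` and `TOY_BUFCTL_END` are hypothetical names invented for illustration, and the layout is simplified on the assumption that the free-index chain lives in a fixed array. It only shows which kinds of damage make the walk bail out (a cycle, an out-of-range index, or a free count that disagrees with `inuse`).

```c
#include <assert.h>   /* only so callers can assert on toy_demo() */

#define TOY_BUFCTL_END 0xffffffffu  /* hypothetical end-of-list sentinel */
#define TOY_NUM 4u                  /* objects per toy slab */

/* Simplified stand-in for a slab descriptor (illustration only). */
struct toy_slab {
    unsigned int inuse;             /* objects currently allocated */
    unsigned int free;              /* index of the first free object */
    unsigned int bufctl[TOY_NUM];   /* bufctl[i] = next free index after i */
};

/* Returns 1 if the freelist looks consistent, 0 if it looks corrupted. */
static int toy_check_slab(const struct toy_slab *s, unsigned int num)
{
    unsigned int i, entries = 0;

    /* The bounds test runs before bufctl[i] is read for the next step. */
    for (i = s->free; i != TOY_BUFCTL_END; i = s->bufctl[i]) {
        entries++;
        if (entries > num || i >= num)
            return 0;               /* cycle or wild index */
    }
    /* Free entries plus in-use objects must account for every object. */
    return entries == num - s->inuse;
}

/* Checks one healthy and three damaged slabs; returns the number of
 * cases where the checker disagreed with the expected verdict. */
int toy_demo(void)
{
    struct toy_slab ok    = { 1, 1, { 0, 2, 3, TOY_BUFCTL_END } };
    struct toy_slab cycle = { 1, 1, { 0, 2, 3, 1 } };              /* 3 -> 1 loops */
    struct toy_slab wild  = { 1, 9, { 0, 2, 3, TOY_BUFCTL_END } }; /* free >= num */
    struct toy_slab count = { 2, 1, { 0, 2, 3, TOY_BUFCTL_END } }; /* inuse wrong */
    int failures = 0;

    failures += (toy_check_slab(&ok, TOY_NUM) != 1);
    failures += (toy_check_slab(&cycle, TOY_NUM) != 0);
    failures += (toy_check_slab(&wild, TOY_NUM) != 0);
    failures += (toy_check_slab(&count, TOY_NUM) != 0);
    return failures;
}
```

In the kernel function quoted above, any of the three damaged cases would reach the `bad:` label, dump the slab descriptor and hit BUG(), which matches the bugcheck at check_slabp+0x210 in the trace.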


P.

Comment 3 Prarit Bhargava 2006-08-30 15:51:08 UTC
Clarifying comment #1:  The issue only seems to hit during the installation -- I
can load the kernel via rpm and boot without any issues.

Comment 4 David Lawrence 2006-09-05 16:00:26 UTC
Changing to proper owner, kernel-maint.

Comment 5 Prarit Bhargava 2006-09-20 12:04:42 UTC
I'm willing to bet that this was caused by the squashfs corruption issue, bug 204638.

It definitely has the footprint of that corruption ... marking as a duplicate.

P.

*** This bug has been marked as a duplicate of 204638 ***