Hide Forgot
Description of problem: If the driver is part of the initrd, there is a kernel crash as seen in the attachment. The crash happens in scsi_alloc_host which is the very first major call made. Also, the crash does not happen when the system has booted up and insmod is done. Version-Release number of selected component (if applicable): kernel-2.6.32-114.0.1.el6 How reproducible: Every time Steps to Reproduce: 1.install the kernel 2. reboot 3. Actual results: Expected results: Additional info:
Red Hat Enterprise Linux Server release 6.0 (Santiago) Kernel 2.6.32-95.el6.scsiall2.x86_64 on an x86_64 DellSlot12.lab.sj.emulex.com login: bnx2i: dev eth1 does not support iSCSI bnx2i: dev eth0 does not support iSCSI No Boot Session for this pci_func,session Hndl = -1 scsi host9: beiscsi_dev_probe-No Boot Devices !!!!! ------------[ cut here ]------------ kernel BUG at mm/slab.c:3067! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/class CPU 2 Modules linked in: be2iscsi(+) iscsi_boot_sysfs bnx2i cnic uio cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse autofs4 sunrpc ipv6 dm_mirror dm_region_hash dm_log uinput power_meter hwmon sr_mod cdrom sg dcdbas iTCO_wdt iTCO_vendor_support i7core_edac edac_core be2net bnx2 ext3 jbd mbcache sd_mod crc_t10dif usb_storage pata_acpi ata_generic ata_piix dm_mod [last unloaded: scsi_transport_iscsi] Modules linked in: be2iscsi(+) iscsi_boot_sysfs bnx2i cnic uio cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi fuse autofs4 sunrpc ipv6 dm_mirror dm_region_hash dm_log uinput power_meter hwmon sr_mod cdrom sg dcdbas iTCO_wdt iTCO_vendor_support i7core_edac edac_core be2net bnx2 ext3 jbd mbcache sd_mod crc_t10dif usb_storage pata_acpi ata_generic ata_piix dm_mod [last unloaded: scsi_transport_iscsi] Pid: 5762, comm: modprobe Not tainted 2.6.32-95.el6.scsiall2.x86_64 #1 PowerEdge M610 RIP: 0010:[<ffffffff8114f3c4>] [<ffffffff8114f3c4>] cache_alloc_refill+0x1e4/0x240 RSP: 0018:ffff8801a0df1b78 EFLAGS: 00010082 RAX: 0000000000000004 RBX: ffff8801afd50540 RCX: 00000000ffffffff RDX: 0000000000000168 RSI: 0000000000000000 RDI: ffff8801afc21a00 RBP: ffff8801a0df1bd8 R08: 0000000000000246 R09: 0000000000000040 R10: 0000000000000003 R11: 0000000000000000 R12: ffff8801a8dc51c0 R13: ffff8801afc219c0 R14: 0000000000000004 R15: ffff8801a5dc5e00 FS: 00007fa5d2f44700(0000) GS:ffff880031c40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00000000020f8108 CR3: 00000001a8e65000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process modprobe (pid: 5762, threadinfo ffff8801a0df0000, task ffff8801a509d560) Stack: ffff8801a0df1bb8 0000000000000093 ffff8801afc21a00 000492d0813f37dc <0> ffff8801afc219e0 ffff8801afc219d0 0000000000000001 00000000000022e8 <0> 0000000000000010 00000000000080d0 ffff8801afd50540 ffffffff8133ebb0 Call Trace: [<ffffffff8133ebb0>] ? scsi_host_alloc+0x40/0x320 [<ffffffff81150269>] __kmalloc+0x1a9/0x220 [<ffffffff8133ebb0>] scsi_host_alloc+0x40/0x320 [<ffffffff811d9390>] ? sysfs_ilookup_test+0x0/0x20 [<ffffffffa0209a97>] iscsi_host_alloc+0x27/0xe0 [libiscsi] [<ffffffffa00f2ea2>] beiscsi_dev_probe+0x11d/0xd7b [be2iscsi] [<ffffffff811da68c>] ? sysfs_add_one+0x2c/0x130 [<ffffffff81271817>] local_pci_probe+0x17/0x20 [<ffffffff81272a01>] pci_device_probe+0x101/0x120 [<ffffffff8132a0a2>] ? driver_sysfs_add+0x62/0x90 [<ffffffff8132a240>] driver_probe_device+0xa0/0x2a0 [<ffffffff8132a4eb>] __driver_attach+0xab/0xb0 [<ffffffff8132a440>] ? __driver_attach+0x0/0xb0 [<ffffffff813294a4>] bus_for_each_dev+0x64/0x90 [<ffffffff81329fde>] driver_attach+0x1e/0x20 [<ffffffff813298e0>] bus_add_driver+0x200/0x300 [<ffffffff8132a816>] driver_register+0x76/0x140 [<ffffffff81272c66>] __pci_register_driver+0x56/0xd0 [<ffffffffa00fb000>] ? beiscsi_module_init+0x0/0xa8 [be2iscsi] [<ffffffffa00fb000>] ? beiscsi_module_init+0x0/0xa8 [be2iscsi] [<ffffffffa00fb068>] beiscsi_module_init+0x68/0xa8 [be2iscsi] [<ffffffff8100204c>] do_one_initcall+0x3c/0x1d0 [<ffffffff810a7a6f>] sys_init_module+0xdf/0x250 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b Code: 89 ff e8 a0 48 11 00 eb 99 66 0f 1f 44 00 00 41 c7 45 60 01 00 00 00 4d 8b 7d 20 4c 39 7d c0 0f 85 f2 fe ff ff eb 84 0f 0b eb fÿMounting proc filesystem Mounting sysfs filesystem
It's not entirely clear to me why the output is showing a login prompt at the beginning: Red Hat Enterprise Linux Server release 6.0 (Santiago) Kernel 2.6.32-95.el6.scsiall2.x86_64 on an x86_64 DellSlot12.lab.sj.emulex.com login: bnx2i: dev eth1 does not support iSCSI bnx2i: dev eth0 does not support iSCSI No Boot Session for this pci_func,session Hndl = -1 scsi host9: beiscsi_dev_probe-No Boot Devices !!!!! ------------[ cut here ]------------ kernel BUG at mm/slab.c:3067! invalid opcode: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000 But I'm presuming that all of the output from "bnx2i: dev eth1" and beyond is captured console output of a crash that occurred during boot? The crash is happening when scsi_alloc_host() was making this allocation: shost = kzalloc(sizeof(struct Scsi_Host) + privsize, gfp_mask); I'm not sure what the "privsize" would be, but given that the Scsi_Host struture itself is 1504 bytes, I'm guessing that it would come from the size-2048 slab cache and that the slab structure associated with it woud be "off-slab". Anyway, the kzallod() eventually reaches cache_alloc_refill(), where it BUG's here: /* * The slab was either on partial or free list so * there must be at least one object available for * allocation. */ BUG_ON(slabp->inuse >= cachep->num); From the registers, the "slabp->inuse" value is in the lower 32 bits of the RCX register or ffffffff, which is a nonsensical value. That inuse value tracks the number of objects within a slab page (or pages) that are currently in use. It's initialized to zero and then incremented or decremented as objects in the slab are allocated/deallocated: struct slab { struct list_head list; long unsigned int colouroff; void *s_mem; unsigned int inuse; kmem_bufctl_t free; short unsigned int nodeid; } Anyway to get to ffffffff either: (1) the slab structure itself was corrupted by some entity overwriting it, or (2) if it got decremented to -1 here: static void slab_put_obj(struct kmem_cache *cachep, struct slab *slabp, void *objp, int nodeid) { unsigned int objnr = obj_to_index(cachep, slabp, objp); #if DEBUG /* Verify that the slab belongs to the intended node */ WARN_ON(slabp->nodeid != nodeid); if (slab_bufctl(slabp)[objnr] + 1 <= SLAB_LIMIT + 1) { printk(KERN_ERR "slab: double free detected in cache " "'%s', objp %p\n", cachep->name, objp); BUG(); } #endif slab_bufctl(slabp)[objnr] = slabp->free; slabp->free = objnr; slabp->inuse--; } DEBUG is not turned on, so the double-free detection is not there. Given all of the error messages that preceded the final BUG(), I don't know which of the failure modes would be most likely. Without a dump, we don't know the contents of the rest of the slab data structure to see if they are also nonsensical. I don't know where the "2.6.32-95.el6.scsiall2.x86_64" kernel came from, but maybe it would be helpful to put a printk in slab_put_obj() above to at least verify that the slabp->inuse-- caused it? And if so, dump out the other structure contents? Again without a dump, it's hard to pinpoint anything else. Other than that, I don't know what else to suggest, as I declare complete cluelessness w/respect to initrd or scsi driver issues.
I would suggest the first thing to do would be to run this "2.6.32-95.el6.scsiall2" with its debug kernel. If kernel-debug-6.32-95.el6.scsiall2.x86_64.rpm was built with the existing kernel, then just install that that one and see what happens. Otherwise it would have to be rebuilt without turning off the debug kernel line in the kernel.spec file. Anyway, the debug kernel has CONFIG_SLAB_DEBUG turned on, and may catch the earlier slab-cache offense.
And while you are at it, set up to capture a kdump of the failure.
I already have the kdump vmcore. How can I send it to you. It's about 60 MB compressed
> I already have the kdump vmcore. How can I send it to you. Actually, I don't know -- usually vmcores come via GSS intermediary. But since this isn't an issue-tracker BZ, I don't know. Tom?
I have put the vmcore file in ftp.serverengines.com. I will send the username/passwd in a private email
OK, thanks, I'm downloading the vmcore... However, the vmcore is useless without the associated vmlinux file. As I mentioned before, I have no clue where you got this "2.6.32-95.el6.scsiall2" kernel that you're running? Can you put the kernel-debuginfo-2.6.32-95.el6.scsiall2.x86_64.rpm and kernel-debuginfo-common-x86_64-2.6.32-95.el6.scsiall2.x86_64.rpm files in the same place?
Yep. I forgot to upload the vmlnux. Only worry is I have recompiled the kernel but don't remember making any changes other than my driver for debugging. I do not have the rpm's you are referring to. I built this from src rpm. Pl let me know if you need any further info
If you built it from the src rpm, then the RPMS subdirectory should contain the debuginfo rpms. (That is, unless you specifically turned off the "with-debuginfo" %define in the kernel.spec file.) If there's any need to debug any kernel module code, then the full kernel-debuginfo package is required. Anyway, I see that this is a different crash path from the one that whose log is cut-and-pasted above: crash> bt PID: 8883 TASK: ffff88012783cb00 CPU: 0 COMMAND: "work_for_cpu" #0 [ffff8801298fb9d0] machine_kexec at ffffffff8103080b #1 [ffff8801298fba30] crash_kexec at ffffffff810b58b2 #2 [ffff8801298fbb00] oops_end at ffffffff814daa80 #3 [ffff8801298fbb30] die at ffffffff8100f2fb #4 [ffff8801298fbb60] do_trap at ffffffff814da374 #5 [ffff8801298fbbc0] do_invalid_op at ffffffff8100ceb5 #6 [ffff8801298fbc60] invalid_op at ffffffff8100bf5b [exception RIP: cache_alloc_refill+0x1e4] RIP: ffffffff81158b64 RSP: ffff8801298fbd10 RFLAGS: 00010086 RAX: 0000000000000004 RBX: ffff88012fd50540 RCX: 00000000fffffffd RDX: 0000000000000115 RSI: 0000000000000000 RDI: ffff88012fc21a00 RBP: ffff8801298fbd70 R8: 0000000000000246 R9: 0000000000000040 R10: 0000000000000002 R11: 0000000000000000 R12: ffff88012fe96940 R13: ffff88012fc219c0 R14: 0000000000000004 R15: ffff880127a73500 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffff8801298fbd68] scsi_host_alloc at ffffffff8134d7e0 #8 [ffff8801298fbdc8] scsi_host_alloc at ffffffff8134d7e0 #9 [ffff8801298fbdf8] iscsi_host_alloc at ffffffffa04d2a97 [libiscsi] #10 [ffff8801298fbe28] beiscsi_dev_probe at ffffffffa027eeb2 [be2iscsi] #11 [ffff8801298fbeb8] local_pci_probe at ffffffff8127e4a7 #12 [ffff8801298fbec8] do_work_for_cpu at ffffffff81088598 #13 [ffff8801298fbee8] kthread at ffffffff8108d476 #14 [ffff8801298fbf48] kernel_thread at ffffffff8100c1ca crash> It runs into the similar kind of slab corruption, though, where the "privsize" tacked onto the end of the Scsi_Host structure must be pretty big, because the slab-size that has a bad inuse counter of -3 (4294967293 unsigned) is the size-16384 slab: crash> kmem -s CACHE NAME OBJSIZE ALLOCATED TOTAL SLABS SSIZE ... [ cut ] ... ffff88012fd705c0 size-32768 32768 5 5 5 32k ffff88012fd60580 size-16384(DMA) 16384 0 0 0 16k kmem: size-16384: partial list: slab: ffff880127a73500 bad inuse counter: 4294967293 ffff88012fd50540 size-16384 16384 2 2 2 16k ffff88012fd40500 size-8192(DMA) 8192 0 0 0 8k ffff88012fd304c0 size-8192 8192 65 65 65 8k ... The whole slab structure is not corrupt, so it looks like a multiple-free situation: crash> slab ffff880127a73500 struct slab { list = { next = 0xffff880129c83dc0, prev = 0xffff88012fc219c0 }, colouroff = 0x0, s_mem = 0xffff8801298cc000, inuse = 0xfffffffd, free = 0x0, nodeid = 0x0 } crash> I look at this some more tomorrow, but you really should run the kernel-debuginfo kernel which has CONFIG_SLAB_DEBUG turned on. If you left the "with_debug" %define in the kernel-spec file as is, then that kernel should have been built, and its rpm files can be found in the RPM subdirectory. Note the difference in the kernel.spec file: This one builds the kernel-debug kernel, which has CONFIG_SLAB_DEBUG turned on: %define with_debug %{?_without_debug: 0} %{?!_without_debug: 1} This one, for each kernel built, will also build the associated kernel-debuginfo packages: %define with_debuginfo %{?_without_debuginfo: 0} %{?!_without_debuginfo: 1}
With respect to the kernel.spec file settings that I suggested above, I guess I'm even more confused as to the genesis of this second kernel: > Pid: 8883, comm: work_for_cpu Not tainted 2.6.32 #1 PowerEdge M805 where the kernel version string is: Linux version 2.6.32 (root.sj.emulex.com) (gcc version 4.4.4 20100726 (Red Hat 4.4.4-13) (GCC) ) #1 SMP Wed Feb 23 14:57:50 PST 2011 Was this kernel built from a generic 2.6.32 tree, i.e., not even built from the "2.6.32-95.el6" base like the first one was? I can only suggest things if you follow an rpmbuild procedure that starts with a RHEL6 src.rpm, where you do something like this: # rpm -ivh kernel-2.6.32-71.el6.src.rpm # cd $HOME/rpmbuild/SPECS Then modify the SPECS/kernel.spec in two places: (1) Create a .buildid %define as it suggests, and (2) It's probably also worth changing "with_firmware" to "1": # kernel-firmware %define with_firmware %{?_with_firmware: 1} %{?!_with_firmware: 0} Then lay down the source tree: # rpm -bp kernel.spec Then go to $HOME/rpmbuild/BUILD/kernel-2.6.32/linux-2.6.32.x86_64. For each file that you modify, first copy the original file to a "file.orig" file. For each file that you create from scratch, create an empty "file.orig" file. Then make your changes. Then go to $HOME/rpmbuild/BUILD/kernel-2.6.32, and enter: # gendiff linux-2.6.32.x86_64 .orig > ../../SOURCES/linux-kernel-test.patch Sanity-check the linux-kernel-test.patch, and then build everything; # cd $HOME/rpmbuild/SPECS # rpmbuild -ba kernel.spec That will build the base kernel, the kernel-debug kernel, the kernel-firmware and debuginfo rpms (and a couple others). "rpm -ivh" the base kernel, the kernel-debug kernel, and the kernel-firmware rpms. Reboot into the base kernel, and if the problem still occurs, then reboot with the kernel-debug kernel -- which will hopefully catch the first double-free offender. That's about all I can suggest at this point.
I have tried to repro this bug using RHEL6.1 Alpha release and haven't been able to. So, closing this bug