Description of problem:
While running modprobe + rmmod of the lpfc driver in a loop on an IA64 system, the system panicked with the following stack trace:

=====================
scsi_id[10711]: NaT consumption 2216203124768 [1]
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 vfat fat button parport_pc lp parport joydev sg shpchp ide_cd e1000 cdrom dm_snapshot dm_zero dm_mirror dm_mod scsi_transport_fc mptspi mptscsih mptbase scsi_transport_spi sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd

Pid: 10711, CPU 1, comm: scsi_id
psr : 00001210085a6010 ifs : 8000000000000309 ip : [<a0000001003d9041>] Not tainted
ip is at attribute_container_device_trigger+0x81/0x260
unat: 0000000000000000 pfs : 0000000000000309 rsc : 0000000000000003
rnat: 0000000000000000 bsps: 0000000000000000 pr : 0000000000556959
ldrs: 0000000000000000 ccv : 0000000000000000 fpsr: 0009804c8a70033f
csd : 0000000000000000 ssd : 0000000000000000
b0 : a0000001003d9060 b6 : a000000202097de0 b7 : a000000100010830
f6 : 1003e0000000000000000 f7 : 0ffdd8000000000000000
f8 : 000000000000000000000 f9 : 1000c8000000000000000
f10 : 000000000000000000000 f11 : 000000000000000000000
r1 : a000000100bfe220 r2 : a0000002021adb58 r3 : a0000002020a0248
r8 : 0000000000000000 r9 : a000000202097de0 r10 : e00000011aeb10a8
r11 : e0000040fd610998 r12 : e0000040d1b4fe00 r13 : e0000040d1b48000
r14 : a00000020217cc60 r15 : a00000020217cd70 r16 : e0000040fd610c30
r17 : e0000040fd610c38 r18 : e0000040fd610c40 r19 : a000000100d0fd10
r20 : a000000100d0fd10 r21 : 0000000000000000 r22 : 0000000000000080
r23 : 0000000000004000 r24 : 0000000000004000 r25 : 00000000071b9a5d
r26 : e0000000ccecfc01 r27 : 0000000000004000 r28 : 000000000103f17b
r29 : e0000001040cc418 r30 : 0000000000000000 r31 : e0000001f8fe6050

Call Trace:
 [<a000000100014140>] show_stack+0x40/0xa0
     sp=e0000040d1b4f820 bsp=e0000040d1b495f0
 [<a000000100014a40>] show_regs+0x840/0x880
     sp=e0000040d1b4f9f0 bsp=e0000040d1b49598
 [<a000000100037ce0>] die+0x1c0/0x2c0
     sp=e0000040d1b4f9f0 bsp=e0000040d1b49550
 [<a000000100037e30>] die_if_kernel+0x50/0x80
     sp=e0000040d1b4fa10 bsp=e0000040d1b49520
 [<a000000100617870>] ia64_fault+0x10f0/0x1200
     sp=e0000040d1b4fa10 bsp=e0000040d1b494c8
 [<a00000010000c700>] __ia64_leave_kernel+0x0/0x280
     sp=e0000040d1b4fc30 bsp=e0000040d1b494c8
 [<a0000001003d9040>] attribute_container_device_trigger+0x80/0x260
     sp=e0000040d1b4fe00 bsp=e0000040d1b49480
 [<a0000001003d9490>] transport_remove_device+0x30/0x60
     sp=e0000040d1b4fe20 bsp=e0000040d1b49460
 [<a000000202166220>] scsi_target_reap_usercontext+0xc0/0x200 [scsi_mod]
     sp=e0000040d1b4fe20 bsp=e0000040d1b49428
 [<a0000001000a5ec0>] execute_in_process_context+0x80/0x100
     sp=e0000040d1b4fe20 bsp=e0000040d1b493f0
 [<a0000002021642e0>] scsi_target_reap+0x1c0/0x220 [scsi_mod]
     sp=e0000040d1b4fe20 bsp=e0000040d1b493c0
 [<a000000202166850>] scsi_device_dev_release_usercontext+0x150/0x1e0 [scsi_mod]
     sp=e0000040d1b4fe20 bsp=e0000040d1b49378
 [<a0000001000a5ec0>] execute_in_process_context+0x80/0x100
     sp=e0000040d1b4fe20 bsp=e0000040d1b49348
 [<a0000002021666d0>] scsi_device_dev_release+0x30/0x60 [scsi_mod]
     sp=e0000040d1b4fe20 bsp=e0000040d1b49328
 [<a0000001003cc640>] device_release+0x60/0x100
     sp=e0000040d1b4fe20 bsp=e0000040d1b49308
 [<a0000001002a1e80>] kobject_cleanup+0x100/0x180
     sp=e0000040d1b4fe20 bsp=e0000040d1b492d0
 [<a0000001002a1f20>] kobject_release+0x20/0x40
     sp=e0000040d1b4fe20 bsp=e0000040d1b492b0
 [<a0000001002a3d60>] kref_put+0x160/0x1a0
     sp=e0000040d1b4fe20 bsp=e0000040d1b49288
 [<a0000001002a1d50>] kobject_put+0x30/0x60
     sp=e0000040d1b4fe20 bsp=e0000040d1b49268
 [<a0000001001f0ab0>] sysfs_release+0xb0/0x1a0
     sp=e0000040d1b4fe20 bsp=e0000040d1b49240
 [<a000000100159420>] __fput+0x1a0/0x420
     sp=e0000040d1b4fe30 bsp=e0000040d1b49200
 [<a0000001001596e0>] fput+0x40/0x60
     sp=e0000040d1b4fe30 bsp=e0000040d1b491d8
 [<a000000100153050>] filp_close+0x110/0x140
     sp=e0000040d1b4fe30 bsp=e0000040d1b491a8
 [<a0000001001560c0>] sys_close+0x140/0x1a0
     sp=e0000040d1b4fe30 bsp=e0000040d1b49130
 [<a00000010000c490>] __ia64_trace_syscall+0xd0/0x110
     sp=e0000040d1b4fe30 bsp=e0000040d1b49130
 [<a000000000010620>] __start_ivt_text+0xffffffff00010620/0x400
     sp=e0000040d1b50000 bsp=e0000040d1b49130
 <0>Kernel panic - not syncing: Fatal exception

Version-Release number of selected component (if applicable):
2.6.18-8.el5

How reproducible:
This is reproducible with an overnight run.

Steps to Reproduce:
1. Run modprobe lpfc ; rmmod lpfc in a loop overnight.

Actual results:
System panicked.

Expected results:
Run for 24 hours with no panic.
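The reproduction step above can be sketched as a small shell helper. This is only an illustration, not the reporter's actual script: the function name stress_loop, the LOAD_CMD and UNLOAD_CMD variables, and the explicit iteration count are assumptions (the report only says the loop ran overnight).

```shell
#!/bin/bash
# Hypothetical reproduction helper. The function name and the variables
# below are illustrative and do not come from the original report.

# Commands used to load and unload the driver. They default to the commands
# named in the report and can be overridden (for example with "true") to
# dry-run the loop on a machine without the hardware.
LOAD_CMD="${LOAD_CMD:-modprobe lpfc}"
UNLOAD_CMD="${UNLOAD_CMD:-rmmod lpfc}"

# stress_loop ITERATIONS
# Runs the load/unload cycle ITERATIONS times and stops at the first failure.
# Prints the number of completed cycles on stdout; returns non-zero on failure.
stress_loop() {
    local iterations="$1"
    local completed=0
    while [ "$completed" -lt "$iterations" ]; do
        # Unquoted on purpose so "modprobe lpfc" splits into command + argument.
        if ! $LOAD_CMD || ! $UNLOAD_CMD; then
            echo "cycle $((completed + 1)) failed" >&2
            echo "$completed"
            return 1
        fi
        completed=$((completed + 1))
    done
    echo "$completed"
}
```

Run as root on the affected system, a call such as stress_loop 100000 would approximate the overnight run. The count is arbitrary, and a panic would of course end the loop before it reports anything.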
Is it reproducible with the recent RHEL 5 kernel? Do I need specific hardware to reproduce this bug?
My test loop of modprobe lpfc; rmmod lpfc has now run for over 24 hours and is still running happily. So I guess I cannot reproduce it without the required hardware attached to my tiger4 box, which means I cannot chase the root cause or provide a fix without that hardware.
I just came across a similar report on i386, bug 234898, and it sure looks like a generic modprobe/rmmod issue with lpfc. If the two issues are the same, please close one as a dup. Many thanks.
Yes, it should be the same as bug 234898. Thanks for the info. Closing it as a dup...

*** This bug has been marked as a duplicate of 234898 ***
In the Emulex lab, we tested the following patch with the RHEL5.1 kernel:
http://marc.info/?l=linux-kernel&m=118727599702702&w=2
The issue is not reproducible with this patch. We request that this patch be merged into the next RHEL5 kernel release.