Description of problem:
The kernel periodically panics during kfree operations. The caller of kfree is not consistent: I have seen attach_recursive_mnt, seq_release_private and key_cleanup. This has also occurred on multiple machines, so it is most likely not a hardware problem.

Version-Release number of selected component (if applicable):
2.6.35.10-74.fc14.x86_64

How reproducible:
It always seems to happen eventually, but I haven't been able to determine what triggers the problem.

Actual results:
Node crashes

Additional info:
One trace:
Jan 19 14:49:09 o12-31 [ 5501.824030] ------------[ cut here ]------------ Jan 19 14:49:09 o12-31 [ 5501.825001] kernel BUG at mm/slub.c:2834! Jan 19 14:49:09 o12-31 [ 5501.825001] invalid opcode: 0000 [#1] Jan 19 14:49:09 o12-31 SMP Jan 19 14:49:09 o12-31 Jan 19 14:49:09 o12-31 [ 5501.825001] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map Jan 19 14:49:09 o12-31 [ 5501.825001] CPU 0 Jan 19 14:49:09 o12-31 Jan 19 14:49:09 o12-31 [ 5501.825001] Modules linked in: Jan 19 14:49:09 nfsd Jan 19 14:49:09 lockd Jan 19 14:49:09 nfs_acl Jan 19 14:49:09 auth_rpcgss Jan 19 14:49:09 sunrpc Jan 19 14:49:09 exportfs Jan 19 14:49:09 ipt_REDIRECT Jan 19 14:49:09 xt_NOTRACK Jan 19 14:49:09 iptable_raw Jan 19 14:49:09 fuse Jan 19 14:49:09 ipmi_devintf Jan 19 14:49:09 ipmi_si Jan 19 14:49:09 ipmi_msghandler Jan 19 14:49:09 ipv6 Jan 19 14:49:09 iptable_nat Jan 19 14:49:09 nf_nat Jan 19 14:49:09 dm_multipath Jan 19 14:49:09 netconsole Jan 19 14:49:09 configfs Jan 19 14:49:09 radeon Jan 19 14:49:09 ttm Jan 19 14:49:09 mptsas Jan 19 14:49:09 drm_kms_helper Jan 19 14:49:09 drm Jan 19 14:49:09 mptscsih Jan 19 14:49:09 hed Jan 19 14:49:09 iTCO_wdt Jan 19 14:49:09 serio_raw Jan 19 14:49:09 i2c_algo_bit Jan 19 14:49:10 i2c_core Jan 19 14:49:10 mptbase Jan 19 14:49:10 i5100_edac Jan 19 14:49:10 dcdbas Jan 19 14:49:10 iTCO_vendor_support Jan 19 14:49:10 tg3 Jan 19 14:49:10 scsi_transport_sas Jan 19 14:49:10 edac_core Jan 19 14:49:10 [last
unloaded: scsi_wait_scan] Jan 19 14:49:10 o12-31 Jan 19 14:49:10 o12-31 [ 5501.825001] Jan 19 14:49:10 o12-31 [ 5501.825001] Pid: 15, comm: events/0 Not tainted 2.6.35.10-74.fc14.x86_64 #1 0TY179/PowerEdge R300 Jan 19 14:49:10 o12-31 [ 5501.825001] RIP: 0010:[<ffffffff81108a6e>] Jan 19 14:49:10 [<ffffffff81108a6e>] kfree+0x69/0xd7 Jan 19 14:49:10 o12-31 [ 5501.825001] RSP: 0018:ffff880226d4fde0 EFLAGS: 00010246 Jan 19 14:49:10 o12-31 [ 5501.825001] RAX: 0040000000080000 RBX: ffff880200000000 RCX: dead000000100100 Jan 19 14:49:10 o12-31 [ 5501.825001] RDX: 0040000000080000 RSI: ffffffff81dc7574 RDI: ffff880200000000 Jan 19 14:49:10 o12-31 [ 5501.825001] RBP: ffff880226d4fe10 R08: ffff880222092380 R09: ffff88021f6ffb80 Jan 19 14:49:10 o12-31 [ 5501.825001] R10: ffff88021fddb230 R11: 0000000000000202 R12: ffff880226d60000 Jan 19 14:49:10 o12-31 [ 5501.825001] R13: ffffffff811d5f3d R14: ffffea0007000000 R15: ffff880002018588 Jan 19 14:49:10 o12-31 [ 5501.825001] FS: 0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000 Jan 19 14:49:11 o12-31 [ 5501.825001] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Jan 19 14:49:11 o12-31 [ 5501.825001] CR2: 00007f67f17c8dc8 CR3: 000000021f4b4000 CR4: 00000000000406f0 Jan 19 14:49:11 o12-31 [ 5501.825001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 19 14:49:11 o12-31 [ 5501.825001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jan 19 14:49:11 o12-31 [ 5501.825001] Process events/0 (pid: 15, threadinfo ffff880226d4e000, task ffff880226d60000) Jan 19 14:49:11 o12-31 [ 5501.825001] Stack: Jan 19 14:49:11 o12-31 [ 5501.825001] ffff880226d4fdf0 Jan 19 14:49:11 ffff880222092300 Jan 19 14:49:11 ffff880226d60000 Jan 19 14:49:11 ffff880226d60000 Jan 19 14:49:11 o12-31 Jan 19 14:49:11 o12-31 [ 5501.825001] <0> Jan 19 14:49:11 ffff880226d60000 Jan 19 14:49:11 ffff880002018588 Jan 19 14:49:11 ffff880226d4fe30 Jan 19 14:49:11 ffffffff811d5f3d Jan 19 14:49:11 o12-31 Jan 19 14:49:11 
o12-31 [ 5501.825001] <0> Jan 19 14:49:11 ffff880226d60000 Jan 19 14:49:11 ffff880002018580 Jan 19 14:49:11 ffff880226d4fee0 Jan 19 14:49:11 ffffffff8106277d Jan 19 14:49:11 o12-31 Jan 19 14:49:11 o12-31 [ 5501.825001] Call Trace: Jan 19 14:49:11 o12-31 [ 5501.825001] [<ffffffff811d5f3d>] key_cleanup+0xdb/0xef Jan 19 14:49:11 o12-31 [ 5501.825001] [<ffffffff8106277d>] worker_thread+0x1c5/0x251 Jan 19 14:49:12 o12-31 [ 5501.825001] [<ffffffff811d5e62>] ? key_cleanup+0x0/0xef Jan 19 14:49:12 o12-31 [ 5501.825001] [<ffffffff810665cf>] ? autoremove_wake_function+0x0/0x39 Jan 19 14:49:12 o12-31 [ 5501.825001] [<ffffffff810625b8>] ? worker_thread+0x0/0x251 Jan 19 14:49:12 o12-31 [ 5501.825001] [<ffffffff81066135>] kthread+0x7f/0x87 Jan 19 14:49:12 o12-31 [ 5501.825001] [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10 Jan 19 14:49:12 o12-31 [ 5501.825001] [<ffffffff810660b6>] ? kthread+0x0/0x87 Jan 19 14:49:12 o12-31 [ 5501.825001] [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10 Jan 19 14:49:12 o12-31 [ 5501.825001] Code: Jan 19 14:49:12 o12-31 ff Jan 19 14:49:12 o12-31 14 Jan 19 14:49:12 o12-31 24 Jan 19 14:49:12 o12-31 49 Jan 19 14:49:12 o12-31 83 Jan 19 14:49:12 o12-31 c4 Jan 19 14:49:12 o12-31 10 Jan 19 14:49:12 o12-31 49 Jan 19 14:49:12 o12-31 83 Jan 19 14:49:12 o12-31 3c Jan 19 14:49:12 o12-31 24 Jan 19 14:49:12 o12-31 00 Jan 19 14:49:12 o12-31 eb Jan 19 14:49:12 o12-31 e4 Jan 19 14:49:12 o12-31 48 Jan 19 14:49:12 o12-31 83 Jan 19 14:49:12 o12-31 fb Jan 19 14:49:12 o12-31 10 Jan 19 14:49:12 o12-31 76 Jan 19 14:49:12 o12-31 79 Jan 19 14:49:12 o12-31 48 Jan 19 14:49:12 o12-31 89 Jan 19 14:49:12 o12-31 df Jan 19 14:49:12 o12-31 e8 Jan 19 14:49:12 o12-31 1b Jan 19 14:49:12 o12-31 e9 Jan 19 14:49:12 o12-31 ff Jan 19 14:49:12 o12-31 ff Jan 19 14:49:12 o12-31 49 Jan 19 14:49:12 o12-31 89 Jan 19 14:49:12 o12-31 c6 Jan 19 14:49:12 o12-31 48 Jan 19 14:49:12 o12-31 8b Jan 19 14:49:12 o12-31 00 Jan 19 14:49:12 o12-31 a8 Jan 19 14:49:12 o12-31 80 Jan 19 14:49:12 o12-31 75 
Jan 19 14:49:12 o12-31 11 Jan 19 14:49:12 o12-31 f6 Jan 19 14:49:12 o12-31 c4 Jan 19 14:49:12 o12-31 c0 Jan 19 14:49:12 o12-31 75 Jan 19 14:49:12 o12-31 02 Jan 19 14:49:12 o12-31 f> Jan 19 14:49:12 o12-31 0b Jan 19 14:49:12 o12-31 4c Jan 19 14:49:12 o12-31 89 Jan 19 14:49:12 o12-31 f7 Jan 19 14:49:12 o12-31 e8 Jan 19 14:49:12 o12-31 a7 Jan 19 14:49:12 o12-31 37 Jan 19 14:49:12 o12-31 fd Jan 19 14:49:12 o12-31 ff Jan 19 14:49:12 o12-31 eb Jan 19 14:49:12 o12-31 56 Jan 19 14:49:12 o12-31 4d Jan 19 14:49:12 o12-31 8b Jan 19 14:49:12 o12-31 6e Jan 19 14:49:12 o12-31 10 Jan 19 14:49:12 o12-31 4c Jan 19 14:49:12 o12-31 8b Jan 19 14:49:12 o12-31 7d Jan 19 14:49:12 o12-31 08 Jan 19 14:49:12 o12-31 e8 Jan 19 14:49:12 o12-31 Jan 19 14:49:12 o12-31 [ 5501.825001] RIP Jan 19 14:49:12 [<ffffffff81108a6e>] kfree+0x69/0xd7 Jan 19 14:49:12 o12-31 [ 5501.825001] RSP <ffff880226d4fde0> Jan 19 14:49:12 o12-31 [ 5504.559212] general protection fault: 0000 [#2] Jan 19 14:49:12 o12-31 Jan 19 14:49:12 o12-31 [ 5505.378360] ---[ end trace 44f45a7cc66c8504 ]--- Another trace: Jan 18 17:38:04 o12-29 [ 9018.944644] ------------[ cut here ]------------ Jan 18 17:38:04 o12-29 [ 9018.945001] kernel BUG at mm/slub.c:2834! 
Jan 18 17:38:04 o12-29 [ 9018.945001] invalid opcode: 0000 [#1] Jan 18 17:38:04 o12-29 SMP Jan 18 17:38:04 o12-29 Jan 18 17:38:04 o12-29 [ 9018.945001] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map Jan 18 17:38:04 o12-29 [ 9018.945001] CPU 0 Jan 18 17:38:04 o12-29 Jan 18 17:38:04 o12-29 [ 9018.945001] Modules linked in: Jan 18 17:38:04 nfsd Jan 18 17:38:04 lockd Jan 18 17:38:04 nfs_acl Jan 18 17:38:04 auth_rpcgss Jan 18 17:38:04 sunrpc Jan 18 17:38:04 exportfs Jan 18 17:38:04 ipt_REDIRECT Jan 18 17:38:04 xt_NOTRACK Jan 18 17:38:04 iptable_raw Jan 18 17:38:04 fuse Jan 18 17:38:04 ipmi_devintf Jan 18 17:38:04 ipmi_si Jan 18 17:38:04 ipmi_msghandler Jan 18 17:38:04 ipv6 Jan 18 17:38:04 o12-31 [ 9005.343703] audit_printk_skb: 12 callbacks suppressed Jan 18 17:38:04 iptable_nat Jan 18 17:38:04 nf_nat Jan 18 17:38:04 dm_multipath Jan 18 17:38:04 netconsole Jan 18 17:38:04 configfs Jan 18 17:38:04 radeon Jan 18 17:38:04 ttm Jan 18 17:38:04 drm_kms_helper Jan 18 17:38:04 drm Jan 18 17:38:04 i2c_algo_bit Jan 18 17:38:04 mptsas Jan 18 17:38:04 mptscsih Jan 18 17:38:04 tg3 Jan 18 17:38:04 i5100_edac Jan 18 17:38:04 i2c_core Jan 18 17:38:04 dcdbas Jan 18 17:38:04 iTCO_wdt Jan 18 17:38:04 mptbase Jan 18 17:38:04 iTCO_vendor_support Jan 18 17:38:04 serio_raw Jan 18 17:38:04 hed Jan 18 17:38:04 edac_core Jan 18 17:38:04 scsi_transport_sas Jan 18 17:38:04 [last unloaded: scsi_wait_scan] Jan 18 17:38:04 o12-29 Jan 18 17:38:04 o12-29 [ 9018.945001] Jan 18 17:38:04 o12-29 [ 9018.945001] Pid: 7299, comm: irqbalance Not tainted 2.6.35.10-74.fc14.x86_64 #1 0TY179/PowerEdge R300 Jan 18 17:38:04 o12-29 [ 9018.945001] RIP: 0010:[<ffffffff81108a6e>] Jan 18 17:38:05 [<ffffffff81108a6e>] kfree+0x69/0xd7 Jan 18 17:38:05 o12-29 [ 9018.945001] RSP: 0018:ffff88021f831de8 EFLAGS: 00010246 Jan 18 17:38:05 o12-29 [ 9020.058633] RAX: 0040000000100068 RBX: ffff880200000000 RCX: ffff880225504f40 Jan 18 17:38:05 o12-29 [ 9020.058633] RDX: 0040000000100068 RSI: 
ffff880223287240 RDI: ffff880200000000 Jan 18 17:38:05 o12-29 [ 9020.058633] RBP: ffff88021f831e18 R08: ffff880225504858 R09: 0000000000000000 Jan 18 17:38:05 o12-29 [ 9020.058633] R10: 0000000000000000 R11: 0000000000000246 R12: ffff880221965700 Jan 18 17:38:05 o12-29 [ 9020.058633] R13: ffffffff8112f0fa R14: ffffea0007000000 R15: ffff880223287240 Jan 18 17:38:05 o12-29 [ 9020.058633] FS: 00007f231bd9d740(0000) GS:ffff880002000000(0000) knlGS:0000000000000000 Jan 18 17:38:05 o12-29 [ 9020.058633] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Jan 18 17:38:05 o12-29 [ 9020.058633] CR2: 00007f231bda5000 CR3: 000000021e9bc000 CR4: 00000000000406f0 Jan 18 17:38:05 o12-29 [ 9020.058633] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 18 17:38:06 o12-29 [ 9020.058633] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jan 18 17:38:06 o12-29 [ 9020.058633] Process irqbalance (pid: 7299, threadinfo ffff88021f830000, task ffff88021f0045c0) Jan 18 17:38:06 o12-29 [ 9020.058633] Stack: Jan 18 17:38:06 o12-29 [ 9020.058633] ffff88021f831df8 Jan 18 17:38:06 ffff8802236aefb0 Jan 18 17:38:06 ffff880221965700 Jan 18 17:38:06 ffff8802236aefb0 Jan 18 17:38:06 o12-29 Jan 18 17:38:06 o12-29 [ 9020.058633] <0> Jan 18 17:38:06 ffffffff81168a3e Jan 18 17:38:06 ffff880223287240 Jan 18 17:38:06 ffff88021f831e48 Jan 18 17:38:06 ffffffff8112f0fa Jan 18 17:38:06 o12-29 Jan 18 17:38:06 o12-29 [ 9020.058633] <0> Jan 18 17:38:06 ffffffff81105f49 Jan 18 17:38:06 ffff880223287240 Jan 18 17:38:06 ffff8802236aefb0 Jan 18 17:38:06 ffff880225504840 Jan 18 17:38:06 o12-29 Jan 18 17:38:06 o12-29 [ 9020.058633] Call Trace: Jan 18 17:38:06 o12-29 [ 9020.058633] [<ffffffff81168a3e>] ? seq_release_net+0x0/0x3d Jan 18 17:38:06 o12-29 [ 9020.058633] [<ffffffff8112f0fa>] seq_release_private+0x28/0x44 Jan 18 17:38:06 o12-29 [ 9020.058633] [<ffffffff81105f49>] ? 
__raw_local_irq_save+0x1d/0x23 Jan 18 17:38:06 o12-29 [ 9020.058633] [<ffffffff81168a72>] seq_release_net+0x34/0x3d Jan 18 17:38:06 o12-29 [ 9020.058633] [<ffffffff81168a3e>] ? seq_release_net+0x0/0x3d Jan 18 17:38:06 o12-29 [ 9020.058633] [<ffffffff81160853>] proc_reg_release+0xd1/0xee Jan 18 17:38:07 o12-29 [ 9020.058633] [<ffffffff81469bf6>] ? _raw_spin_lock+0xe/0x10 Jan 18 17:38:07 o12-29 [ 9020.058633] [<ffffffff811183e2>] fput+0x13b/0x1ed Jan 18 17:38:07 o12-29 [ 9020.058633] [<ffffffff81115613>] filp_close+0x66/0x70 Jan 18 17:38:07 o12-29 [ 9020.058633] [<ffffffff811156b2>] sys_close+0x95/0xd5 Jan 18 17:38:07 o12-29 [ 9020.058633] [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b Jan 18 17:38:07 o12-29 [ 9020.058633] Code: Jan 18 17:38:07 o12-29 ff Jan 18 17:38:07 o12-29 14 Jan 18 17:38:07 o12-29 24 Jan 18 17:38:07 o12-29 49 Jan 18 17:38:07 o12-29 83 Jan 18 17:38:07 o12-29 c4 Jan 18 17:38:07 o12-29 10 Jan 18 17:38:07 o12-29 49 Jan 18 17:38:07 o12-29 83 Jan 18 17:38:07 o12-29 3c Jan 18 17:38:07 o12-29 24 Jan 18 17:38:07 o12-29 00 Jan 18 17:38:07 o12-29 eb Jan 18 17:38:07 o12-29 e4 Jan 18 17:38:07 o12-29 48 Jan 18 17:38:07 o12-29 83 Jan 18 17:38:07 o12-29 fb Jan 18 17:38:07 o12-29 10 Jan 18 17:38:07 o12-29 76 Jan 18 17:38:07 o12-29 79 Jan 18 17:38:07 o12-29 48 Jan 18 17:38:07 o12-29 89 Jan 18 17:38:07 o12-29 df Jan 18 17:38:07 o12-29 e8 Jan 18 17:38:07 o12-29 1b Jan 18 17:38:07 o12-29 e9 Jan 18 17:38:07 o12-29 ff Jan 18 17:38:07 o12-29 ff Jan 18 17:38:07 o12-29 49 Jan 18 17:38:07 o12-29 89 Jan 18 17:38:07 o12-29 c6 Jan 18 17:38:07 o12-29 48 Jan 18 17:38:07 o12-29 8b Jan 18 17:38:07 o12-29 00 Jan 18 17:38:07 o12-29 a8 Jan 18 17:38:07 o12-29 80 Jan 18 17:38:07 o12-29 75 Jan 18 17:38:07 o12-29 11 Jan 18 17:38:07 o12-29 f6 Jan 18 17:38:07 o12-29 c4 Jan 18 17:38:07 o12-29 c0 Jan 18 17:38:07 o12-29 75 Jan 18 17:38:07 o12-29 02 Jan 18 17:38:07 o12-29 f> Jan 18 17:38:07 o12-29 0b Jan 18 17:38:07 o12-29 4c Jan 18 17:38:07 o12-29 89 Jan 18 17:38:07 o12-29 f7 Jan 18 
17:38:07 o12-29 e8 Jan 18 17:38:07 o12-29 a7 Jan 18 17:38:07 o12-29 37 Jan 18 17:38:07 o12-29 fd Jan 18 17:38:07 o12-29 ff Jan 18 17:38:07 o12-29 eb Jan 18 17:38:07 o12-29 56 Jan 18 17:38:07 o12-29 4d Jan 18 17:38:07 o12-29 8b Jan 18 17:38:07 o12-29 6e Jan 18 17:38:07 o12-29 10 Jan 18 17:38:07 o12-29 4c Jan 18 17:38:07 o12-29 8b Jan 18 17:38:07 o12-29 7d Jan 18 17:38:07 o12-29 08 Jan 18 17:38:07 o12-29 e8 Jan 18 17:38:07 o12-29 Jan 18 17:38:07 o12-29 [ 9020.058633] RIP Jan 18 17:38:07 [<ffffffff81108a6e>] kfree+0x69/0xd7 Jan 18 17:38:07 o12-29 [ 9020.058633] RSP <ffff88021f831de8> Jan 18 17:38:07 o12-29 [ 9022.648903] ---[ end trace e75aad0bae2dbc7c ]---
Created attachment 474367
Picture of another panic

I wasn't able to capture the full text since this occurred during shutdown.
In kfree():

    if (unlikely(!PageSlab(page))) {
        BUG_ON(!PageCompound(page));
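To make the check quoted above concrete, here is a minimal user-space sketch (not kernel code) that applies the same two tests to the RAX values printed in the traces in this report. The masks are read off the Code: bytes just before the trapping instruction (a8 80 is test $0x80,%al and f6 c4 c0 is test $0xc0,%ah). Treating those masks as the PageSlab and PageCompound tests, and assuming RAX still holds page->flags when the trap fires, are assumptions on my part. The names ASSUMED_SLAB_MASK, ASSUMED_COMPOUND_MASK, kfree_would_bug and check_reported_flags are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Masks taken from the oops Code: bytes; their mapping to page flags
 * is an assumption, not something confirmed from the kernel source. */
#define ASSUMED_SLAB_MASK     0x0080ULL  /* test $0x80,%al */
#define ASSUMED_COMPOUND_MASK 0xc000ULL  /* test $0xc0,%ah */

/* Returns 1 if a page with these flags would reach the BUG_ON in kfree(),
 * i.e. it is neither a slab page nor a compound page; 0 otherwise. */
int kfree_would_bug(uint64_t page_flags)
{
    if (page_flags & ASSUMED_SLAB_MASK)
        return 0;  /* slab page: takes the normal free path */
    if (page_flags & ASSUMED_COMPOUND_MASK)
        return 0;  /* compound page: passes the BUG_ON */
    return 1;      /* neither bit set: BUG_ON(!PageCompound(page)) fires */
}

/* Self-check against the RAX values copied from the traces in this report. */
void check_reported_flags(void)
{
    assert(kfree_would_bug(0x0040000000080000ULL) == 1);
    assert(kfree_would_bug(0x0040000000100068ULL) == 1);
    assert(kfree_would_bug(0x0040000000100078ULL) == 1);
}
```

Under those assumptions, every reported RAX value has both masks clear, which is consistent with kfree() being handed a pointer into a page that the allocator never owned. Calling check_reported_flags() from a small main() should return without tripping an assertion.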
Has this been happening all along or did it start with a recent kernel update?
(In reply to comment #3)
> Has this been happening all along or did it start with a recent kernel update?

I haven't tested thoroughly with the other 2.6.35 kernels, but it does look like the issue might exist in 2.6.35.9-64. I am about to start testing 2.6.35.6-48 to see if it has similar problems.
Not sure if this is helpful but I have another stack from 2.6.35.10-74. Jan 20 15:46:01 o12-30 [28597.628126] kernel BUG at mm/slub.c:2834! Jan 20 15:46:02 o12-30 [28597.628126] invalid opcode: 0000 [#1] Jan 20 15:46:02 o12-30 SMP Jan 20 15:46:02 o12-30 Jan 20 15:46:02 o12-30 [28597.628126] last sysfs file: /sys/module/ipmi_si/refcnt Jan 20 15:46:02 o12-30 [28597.628126] CPU 1 Jan 20 15:46:02 o12-30 Jan 20 15:46:02 o12-30 [28597.628126] Modules linked in: Jan 20 15:46:02 nfsd Jan 20 15:46:02 lockd Jan 20 15:46:02 nfs_acl Jan 20 15:46:02 auth_rpcgss Jan 20 15:46:02 sunrpc Jan 20 15:46:02 exportfs Jan 20 15:46:02 ipt_REDIRECT Jan 20 15:46:02 xt_NOTRACK Jan 20 15:46:02 iptable_raw Jan 20 15:46:02 fuse Jan 20 15:46:02 ipmi_si(-) Jan 20 15:46:02 ipmi_msghandler Jan 20 15:46:02 ipv6 Jan 20 15:46:02 iptable_nat Jan 20 15:46:02 nf_nat Jan 20 15:46:02 dm_multipath Jan 20 15:46:02 netconsole Jan 20 15:46:02 configfs Jan 20 15:46:02 radeon Jan 20 15:46:02 ttm Jan 20 15:46:02 drm_kms_helper Jan 20 15:46:02 drm Jan 20 15:46:02 mptsas Jan 20 15:46:02 i2c_algo_bit Jan 20 15:46:02 i2c_core Jan 20 15:46:02 iTCO_wdt Jan 20 15:46:02 tg3 Jan 20 15:46:02 mptscsih Jan 20 15:46:02 dcdbas Jan 20 15:46:02 iTCO_vendor_support Jan 20 15:46:02 hed Jan 20 15:46:02 mptbase Jan 20 15:46:02 i5100_edac Jan 20 15:46:02 serio_raw Jan 20 15:46:02 edac_core Jan 20 15:46:02 scsi_transport_sas Jan 20 15:46:02 [last unloaded: ipmi_devintf] Jan 20 15:46:02 o12-30 Jan 20 15:46:02 o12-30 [28597.628126] Jan 20 15:46:02 o12-30 [28597.628126] Pid: 21798, comm: modprobe Not tainted 2.6.35.10-74.fc14.x86_64 #1 0TY179/PowerEdge R300 Jan 20 15:46:02 o12-30 [28597.628126] RIP: 0010:[<ffffffff81108a6e>] Jan 20 15:46:02 [<ffffffff81108a6e>] kfree+0x69/0xd7 Jan 20 15:46:02 o12-30 [28597.628126] RSP: 0018:ffff88021e8f9d88 EFLAGS: 00010246 Jan 20 15:46:02 o12-30 [28597.628126] RAX: 0040000000100078 RBX: ffff880200000000 RCX: 000000000015000e Jan 20 15:46:02 o12-30 [28597.628126] RDX: 0040000000100078 RSI: 
ffffea00076d56d0 RDI: ffff880200000000 Jan 20 15:46:03 o12-30 [28597.628126] RBP: ffff88021e8f9db8 R08: ffff88021e8f9cc8 R09: ffff880002015500 Jan 20 15:46:03 o12-30 [28597.628126] R10: ffff880200000001 R11: ffff88021e8f9d18 R12: ffff88021f3d66f8 Jan 20 15:46:03 o12-30 [28597.628126] R13: ffffffff81219e79 R14: ffffea0007000000 R15: 000000000040f190 Jan 20 15:46:03 o12-30 [28597.628126] FS: 00007fe42bfa1720(0000) GS:ffff880002080000(0000) knlGS:0000000000000000 Jan 20 15:46:03 o12-30 [28597.628126] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Jan 20 15:46:03 o12-30 [28597.628126] CR2: 00007fe42bb411a0 CR3: 0000000225cdd000 CR4: 00000000000406e0 Jan 20 15:46:03 o12-30 [28597.628126] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 20 15:46:03 o12-30 [28597.628126] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jan 20 15:46:03 o12-30 [28597.628126] Process modprobe (pid: 21798, threadinfo ffff88021e8f8000, task ffff880225b2c5c0) Jan 20 15:46:03 o12-30 [28597.628126] Stack: Jan 20 15:46:03 o12-30 [28597.628126] ffff880200000000 Jan 20 15:46:03 ffff88021f3d66c0 Jan 20 15:46:03 ffff88021f3d66f8 Jan 20 15:46:03 ffffffff81a90210 Jan 20 15:46:03 o12-30 Jan 20 15:46:03 o12-30 [28597.628126] <0> Jan 20 15:46:03 ffff880200000000 Jan 20 15:46:04 000000000040f190 Jan 20 15:46:04 ffff88021e8f9de8 Jan 20 15:46:04 ffffffff81219e79 Jan 20 15:46:04 o12-30 Jan 20 15:46:04 o12-30 [28597.628126] <0> Jan 20 15:46:04 ffff88021f3d66f8 Jan 20 15:46:04 ffffffff81219d50 Jan 20 15:46:04 ffffffff81a7bf70 Jan 20 15:46:04 ffffffffa010f870 Jan 20 15:46:04 o12-30 Jan 20 15:46:04 o12-30 [28597.628126] Call Trace: Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81219e79>] kobject_release+0x129/0x1d9 Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81219d50>] ? 
kobject_release+0x0/0x1d9 Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff8121b382>] kref_put+0x43/0x4d Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81219cc7>] kobject_put+0x47/0x4b Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff812d7b08>] bus_remove_driver+0xbd/0xed Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff812d8463>] driver_unregister+0x6c/0x74 Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81233795>] pci_unregister_driver+0x44/0x89 Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffffa010d4dc>] cleanup_ipmi_si+0x30/0xa8 [ipmi_si] Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff8107af85>] sys_delete_module+0x1ca/0x23b Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff8146cfdd>] ? do_page_fault+0x238/0x265 Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b Jan 20 15:46:05 o12-30 [28597.628126] Code: Jan 20 15:46:05 o12-30 ff Jan 20 15:46:05 o12-30 14 Jan 20 15:46:05 o12-30 24 Jan 20 15:46:05 o12-30 49 Jan 20 15:46:05 o12-30 83 Jan 20 15:46:05 o12-30 c4 Jan 20 15:46:05 o12-30 10 Jan 20 15:46:05 o12-30 49 Jan 20 15:46:05 o12-30 83 Jan 20 15:46:05 o12-30 3c Jan 20 15:46:05 o12-30 24 Jan 20 15:46:05 o12-30 00 Jan 20 15:46:05 o12-30 eb Jan 20 15:46:05 o12-30 e4 Jan 20 15:46:05 o12-30 48 Jan 20 15:46:05 o12-30 83 Jan 20 15:46:05 o12-30 fb Jan 20 15:46:05 o12-30 10 Jan 20 15:46:05 o12-30 76 Jan 20 15:46:05 o12-30 79 Jan 20 15:46:05 o12-30 48 Jan 20 15:46:05 o12-30 89 Jan 20 15:46:05 o12-30 df Jan 20 15:46:05 o12-30 e8 Jan 20 15:46:05 o12-30 1b Jan 20 15:46:05 o12-30 e9 Jan 20 15:46:05 o12-30 ff Jan 20 15:46:05 o12-30 ff Jan 20 15:46:05 o12-30 49 Jan 20 15:46:05 o12-30 89 Jan 20 15:46:05 o12-30 c6 Jan 20 15:46:05 o12-30 48 Jan 20 15:46:05 o12-30 8b Jan 20 15:46:05 o12-30 00 Jan 20 15:46:05 o12-30 a8 Jan 20 15:46:05 o12-30 80 Jan 20 15:46:05 o12-30 75 Jan 20 15:46:05 o12-30 11 Jan 20 15:46:05 o12-30 f6 Jan 20 15:46:05 o12-30 c4 Jan 20 15:46:05 o12-30 c0 Jan 20 15:46:05 o12-30 75 Jan 20 15:46:05 o12-30 02 Jan 20 15:46:05 o12-30 f> 
Jan 20 15:46:05 o12-30 0b Jan 20 15:46:05 o12-30 4c Jan 20 15:46:05 o12-30 89 Jan 20 15:46:05 o12-30 f7 Jan 20 15:46:05 o12-30 e8 Jan 20 15:46:05 o12-30 a7 Jan 20 15:46:05 o12-30 37 Jan 20 15:46:05 o12-30 fd Jan 20 15:46:05 o12-30 ff Jan 20 15:46:05 o12-30 eb Jan 20 15:46:05 o12-30 56 Jan 20 15:46:05 o12-30 4d Jan 20 15:46:05 o12-30 8b Jan 20 15:46:05 o12-30 6e Jan 20 15:46:05 o12-30 10 Jan 20 15:46:05 o12-30 4c Jan 20 15:46:05 o12-30 8b Jan 20 15:46:05 o12-30 7d Jan 20 15:46:05 o12-30 08 Jan 20 15:46:05 o12-30 e8 Jan 20 15:46:05 o12-30 Jan 20 15:46:05 o12-30 [28597.628126] RIP Jan 20 15:46:05 [<ffffffff81108a6e>] kfree+0x69/0xd7 Jan 20 15:46:05 o12-30 [28597.628126] RSP <ffff88021e8f9d88> Jan 20 15:46:05 o12-30 [28601.071743] ---[ end trace 9dc380e62a5d64dc ]---
The issue exists in 2.6.35.6-48 Jan 21 02:35:26 o12-28 [ 554.258001] kernel BUG at mm/slub.c:2834! Jan 21 02:35:26 o12-28 [ 554.258001] invalid opcode: 0000 [#1] Jan 21 02:35:26 o12-28 SMP Jan 21 02:35:26 o12-28 Jan 21 02:35:26 o12-28 [ 554.258001] last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/0000:05:00.0/host4/port-4:1/end_device-4:1/target4:0:1/4:0:1:0/block/sdb/dev Jan 21 02:35:27 o12-28 [ 554.258001] CPU 0 Jan 21 02:35:27 o12-28 Jan 21 02:35:27 o12-28 [ 554.258001] Modules linked in: Jan 21 02:35:27 nfsd Jan 21 02:35:27 lockd Jan 21 02:35:27 nfs_acl Jan 21 02:35:27 auth_rpcgss Jan 21 02:35:27 sunrpc Jan 21 02:35:27 exportfs Jan 21 02:35:27 ipt_REDIRECT Jan 21 02:35:27 xt_NOTRACK Jan 21 02:35:27 iptable_raw Jan 21 02:35:27 fuse Jan 21 02:35:27 ipmi_devintf Jan 21 02:35:27 ipmi_si Jan 21 02:35:27 ipmi_msghandler Jan 21 02:35:27 ipv6 Jan 21 02:35:27 iptable_nat Jan 21 02:35:27 nf_nat Jan 21 02:35:27 dm_multipath Jan 21 02:35:27 netconsole Jan 21 02:35:27 configfs Jan 21 02:35:27 radeon Jan 21 02:35:27 ttm Jan 21 02:35:27 drm_kms_helper Jan 21 02:35:27 mptsas Jan 21 02:35:27 drm Jan 21 02:35:27 mptscsih Jan 21 02:35:27 mptbase Jan 21 02:35:27 i2c_algo_bit Jan 21 02:35:27 scsi_transport_sas Jan 21 02:35:27 i2c_core Jan 21 02:35:27 hed Jan 21 02:35:27 dcdbas Jan 21 02:35:27 serio_raw Jan 21 02:35:27 iTCO_wdt Jan 21 02:35:27 iTCO_vendor_support Jan 21 02:35:27 i5100_edac Jan 21 02:35:27 tg3 Jan 21 02:35:27 edac_core Jan 21 02:35:27 [last unloaded: scsi_wait_scan] Jan 21 02:35:27 o12-28 Jan 21 02:35:27 o12-28 [ 554.258001] Jan 21 02:35:27 o12-28 [ 554.258001] Pid: 15, comm: events/0 Not tainted 2.6.35.6-48.fc14.x86_64 #1 0TY179/PowerEdge R300 Jan 21 02:35:27 o12-28 [ 554.258001] RIP: 0010:[<ffffffff81108796>] Jan 21 02:35:27 [<ffffffff81108796>] kfree+0x69/0xd7 Jan 21 02:35:27 o12-28 [ 554.258001] RSP: 0018:ffff880226d4fde0 EFLAGS: 00010246 Jan 21 02:35:27 o12-28 [ 554.258001] RAX: 0040000000100068 RBX: ffff880200000000 RCX: dead000000100100 Jan 21 02:35:27 
o12-28 [ 554.258001] RDX: 0040000000100068 RSI: ffffffff81dc6574 RDI: ffff880200000000 Jan 21 02:35:27 o12-28 [ 554.258001] RBP: ffff880226d4fe10 R08: ffff8802255bfec0 R09: ffff8802224d7800 Jan 21 02:35:28 o12-28 [ 554.258001] R10: 0000000000000000 R11: 0000000000000246 R12: ffff880226d60000 Jan 21 02:35:28 o12-28 [ 554.258001] R13: ffffffff811d5b0d R14: ffffea0007000000 R15: ffff880002018588 Jan 21 02:35:28 o12-28 [ 554.258001] FS: 0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000 Jan 21 02:35:28 o12-28 [ 554.258001] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Jan 21 02:35:28 o12-28 [ 554.258001] CR2: 00007f7375a617e8 CR3: 000000022413a000 CR4: 00000000000406f0 Jan 21 02:35:28 o12-28 [ 554.258001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Jan 21 02:35:28 o12-28 [ 554.258001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Jan 21 02:35:28 o12-28 [ 554.258001] Process events/0 (pid: 15, threadinfo ffff880226d4e000, task ffff880226d60000) Jan 21 02:35:28 o12-28 [ 554.258001] Stack: Jan 21 02:35:28 o12-28 [ 554.258001] ffff880226d4fdf0 Jan 21 02:35:28 ffff8802255bfe40 Jan 21 02:35:28 ffff880226d60000 Jan 21 02:35:28 ffff880226d60000 Jan 21 02:35:28 o12-28 Jan 21 02:35:28 o12-28 [ 554.258001] <0> Jan 21 02:35:28 ffff880226d60000 Jan 21 02:35:28 ffff880002018588 Jan 21 02:35:28 ffff880226d4fe30 Jan 21 02:35:28 ffffffff811d5b0d Jan 21 02:35:28 o12-28 Jan 21 02:35:28 o12-28 [ 554.258001] <0> Jan 21 02:35:28 ffff880226d60000 Jan 21 02:35:28 ffff880002018580 Jan 21 02:35:28 ffff880226d4fee0 Jan 21 02:35:28 ffffffff81062571 Jan 21 02:35:28 o12-28 Jan 21 02:35:28 o12-28 [ 554.258001] Call Trace: Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff811d5b0d>] key_cleanup+0xdb/0xef Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff81062571>] worker_thread+0x1c5/0x251 Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff811d5a32>] ? key_cleanup+0x0/0xef Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff810663c3>] ? 
autoremove_wake_function+0x0/0x39 Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff810623ac>] ? worker_thread+0x0/0x251 Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff81065f29>] kthread+0x7f/0x87 Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10 Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff81065eaa>] ? kthread+0x0/0x87 Jan 21 02:35:29 o12-28 [ 554.258001] [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10 Jan 21 02:35:29 o12-28 [ 554.258001] Code: Jan 21 02:35:29 o12-28 ff Jan 21 02:35:29 o12-28 14 Jan 21 02:35:29 o12-28 24 Jan 21 02:35:29 o12-28 49 Jan 21 02:35:29 o12-28 83 Jan 21 02:35:29 o12-28 c4 Jan 21 02:35:29 o12-28 10 Jan 21 02:35:29 o12-28 49 Jan 21 02:35:29 o12-28 83 Jan 21 02:35:29 o12-28 3c Jan 21 02:35:29 o12-28 24 Jan 21 02:35:29 o12-28 00 Jan 21 02:35:29 o12-28 eb Jan 21 02:35:29 o12-28 e4 Jan 21 02:35:29 o12-28 48 Jan 21 02:35:29 o12-28 83 Jan 21 02:35:29 o12-28 fb Jan 21 02:35:29 o12-28 10 Jan 21 02:35:29 o12-28 76 Jan 21 02:35:29 o12-28 79 Jan 21 02:35:29 o12-28 48 Jan 21 02:35:29 o12-28 89 Jan 21 02:35:29 o12-28 df Jan 21 02:35:29 o12-28 e8 Jan 21 02:35:29 o12-28 1b Jan 21 02:35:29 o12-28 e9 Jan 21 02:35:29 o12-28 ff Jan 21 02:35:29 o12-28 ff Jan 21 02:35:29 o12-28 49 Jan 21 02:35:29 o12-28 89 Jan 21 02:35:29 o12-28 c6 Jan 21 02:35:29 o12-28 48 Jan 21 02:35:29 o12-28 8b Jan 21 02:35:29 o12-28 00 Jan 21 02:35:29 o12-28 a8 Jan 21 02:35:29 o12-28 80 Jan 21 02:35:29 o12-28 75 Jan 21 02:35:29 o12-28 11 Jan 21 02:35:29 o12-28 f6 Jan 21 02:35:29 o12-28 c4 Jan 21 02:35:29 o12-28 c0 Jan 21 02:35:29 o12-28 75 Jan 21 02:35:29 o12-28 02 Jan 21 02:35:29 o12-28 f> Jan 21 02:35:29 o12-28 0b Jan 21 02:35:29 o12-28 4c Jan 21 02:35:29 o12-28 89 Jan 21 02:35:29 o12-28 f7 Jan 21 02:35:29 o12-28 e8 Jan 21 02:35:29 o12-28 43 Jan 21 02:35:29 o12-28 38 Jan 21 02:35:29 o12-28 fd Jan 21 02:35:29 o12-28 ff Jan 21 02:35:29 o12-28 eb Jan 21 02:35:29 o12-28 56 Jan 21 02:35:29 o12-28 4d Jan 21 02:35:29 o12-28 8b Jan 21 02:35:29 o12-28 6e 
Jan 21 02:35:29 o12-28 10 Jan 21 02:35:29 o12-28 4c Jan 21 02:35:29 o12-28 8b Jan 21 02:35:29 o12-28 7d Jan 21 02:35:29 o12-28 08 Jan 21 02:35:29 o12-28 e8 Jan 21 02:35:29 o12-28 Jan 21 02:35:29 o12-28 [ 554.258001] RIP Jan 21 02:35:29 [<ffffffff81108796>] kfree+0x69/0xd7 Jan 21 02:35:29 o12-28 [ 554.258001] RSP <ffff880226d4fde0>
Is there anything you need from me to help resolve this issue?
I grabbed the 2.6.36.2-12.rc1.fc15.x86_64 kernel from http://kojipkgs.fedoraproject.org/packages/kernel/ and it seems to have resolved the issue. The same nodes that were hitting the BUG on 2.6.35 have lasted over 3x longer on 2.6.36. Hope this helps to narrow down what could be the cause.
Nate, have you moved your systems to f15 at this point?
(In reply to comment #9)
> Nate, have you moved your systems to f15 at this point?

The systems were moved to a vanilla 2.6.36.4 and have been running that for the past few months, so I cannot comment on f15's kernel.
Closing as fixed in F15 per comment #8
Hi,

Do we know exactly what fix was made for this in the kernel? Any patch or commit ID would be helpful.

Thanks,
Manish