Bug 671007

Summary: Kernel panics with kernel BUG at mm/slub.c:2834!
Product: [Fedora] Fedora
Component: kernel
Version: 14
Status: CLOSED CURRENTRELEASE
Severity: urgent
Priority: low
Reporter: Nate Clark <redhat>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: dm, gansalmon, itamar, jonathan, kernel-maint, madhu.chinakonda, manish.chopra, vicanderson
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2011-08-29 23:07:56 UTC
Attachments: Picture of another panic (flags: none)

Description Nate Clark 2011-01-19 21:59:48 UTC
Description of problem:
The kernel periodically panics in kfree(). The caller of kfree() is not consistent: I have seen attach_recursive_mnt, seq_release_private and key_cleanup. This has occurred on multiple machines, so it is most likely not a hardware problem.

Version-Release number of selected component (if applicable):
2.6.35.10-74.fc14.x86_64

How reproducible:
It seems to always happen eventually, but I haven't been able to determine what triggers it.

Actual results:
Node crashes

Additional info:
One trace:

Jan 19 14:49:09 o12-31 [ 5501.824030] ------------[ cut here ]------------
Jan 19 14:49:09 o12-31 [ 5501.825001] kernel BUG at mm/slub.c:2834!
Jan 19 14:49:09 o12-31 [ 5501.825001] invalid opcode: 0000 [#1] SMP
Jan 19 14:49:09 o12-31 [ 5501.825001] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Jan 19 14:49:09 o12-31 [ 5501.825001] CPU 0
Jan 19 14:49:09 o12-31 [ 5501.825001] Modules linked in: nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ipt_REDIRECT xt_NOTRACK iptable_raw fuse ipmi_devintf ipmi_si ipmi_msghandler ipv6 iptable_nat nf_nat dm_multipath netconsole configfs radeon ttm mptsas drm_kms_helper drm mptscsih hed iTCO_wdt serio_raw i2c_algo_bit i2c_core mptbase i5100_edac dcdbas iTCO_vendor_support tg3 scsi_transport_sas edac_core [last unloaded: scsi_wait_scan]
Jan 19 14:49:10 o12-31 [ 5501.825001] Pid: 15, comm: events/0 Not tainted 2.6.35.10-74.fc14.x86_64 #1 0TY179/PowerEdge R300
Jan 19 14:49:10 o12-31 [ 5501.825001] RIP: 0010:[<ffffffff81108a6e>]  [<ffffffff81108a6e>] kfree+0x69/0xd7
Jan 19 14:49:10 o12-31 [ 5501.825001] RSP: 0018:ffff880226d4fde0  EFLAGS: 00010246
Jan 19 14:49:10 o12-31 [ 5501.825001] RAX: 0040000000080000 RBX: ffff880200000000 RCX: dead000000100100
Jan 19 14:49:10 o12-31 [ 5501.825001] RDX: 0040000000080000 RSI: ffffffff81dc7574 RDI: ffff880200000000
Jan 19 14:49:10 o12-31 [ 5501.825001] RBP: ffff880226d4fe10 R08: ffff880222092380 R09: ffff88021f6ffb80
Jan 19 14:49:10 o12-31 [ 5501.825001] R10: ffff88021fddb230 R11: 0000000000000202 R12: ffff880226d60000
Jan 19 14:49:10 o12-31 [ 5501.825001] R13: ffffffff811d5f3d R14: ffffea0007000000 R15: ffff880002018588
Jan 19 14:49:10 o12-31 [ 5501.825001] FS:  0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000
Jan 19 14:49:11 o12-31 [ 5501.825001] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 19 14:49:11 o12-31 [ 5501.825001] CR2: 00007f67f17c8dc8 CR3: 000000021f4b4000 CR4: 00000000000406f0
Jan 19 14:49:11 o12-31 [ 5501.825001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 19 14:49:11 o12-31 [ 5501.825001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 19 14:49:11 o12-31 [ 5501.825001] Process events/0 (pid: 15, threadinfo ffff880226d4e000, task ffff880226d60000)
Jan 19 14:49:11 o12-31 [ 5501.825001] Stack:
Jan 19 14:49:11 o12-31 [ 5501.825001]  ffff880226d4fdf0 ffff880222092300 ffff880226d60000 ffff880226d60000
Jan 19 14:49:11 o12-31 [ 5501.825001] <0> ffff880226d60000 ffff880002018588 ffff880226d4fe30 ffffffff811d5f3d
Jan 19 14:49:11 o12-31 [ 5501.825001] <0> ffff880226d60000 ffff880002018580 ffff880226d4fee0 ffffffff8106277d
Jan 19 14:49:11 o12-31 [ 5501.825001] Call Trace:
Jan 19 14:49:11 o12-31 [ 5501.825001]  [<ffffffff811d5f3d>] key_cleanup+0xdb/0xef
Jan 19 14:49:11 o12-31 [ 5501.825001]  [<ffffffff8106277d>] worker_thread+0x1c5/0x251
Jan 19 14:49:12 o12-31 [ 5501.825001]  [<ffffffff811d5e62>] ? key_cleanup+0x0/0xef
Jan 19 14:49:12 o12-31 [ 5501.825001]  [<ffffffff810665cf>] ? autoremove_wake_function+0x0/0x39
Jan 19 14:49:12 o12-31 [ 5501.825001]  [<ffffffff810625b8>] ? worker_thread+0x0/0x251
Jan 19 14:49:12 o12-31 [ 5501.825001]  [<ffffffff81066135>] kthread+0x7f/0x87
Jan 19 14:49:12 o12-31 [ 5501.825001]  [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
Jan 19 14:49:12 o12-31 [ 5501.825001]  [<ffffffff810660b6>] ? kthread+0x0/0x87
Jan 19 14:49:12 o12-31 [ 5501.825001]  [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
Jan 19 14:49:12 o12-31 [ 5501.825001] Code: ff 14 24 49 83 c4 10 49 83 3c 24 00 eb e4 48 83 fb 10 76 79 48 89 df e8 1b e9 ff ff 49 89 c6 48 8b 00 a8 80 75 11 f6 c4 c0 75 02 <0f> 0b 4c 89 f7 e8 a7 37 fd ff eb 56 4d 8b 6e 10 4c 8b 7d 08 e8
Jan 19 14:49:12 o12-31 [ 5501.825001] RIP  [<ffffffff81108a6e>] kfree+0x69/0xd7
Jan 19 14:49:12 o12-31 [ 5501.825001]  RSP <ffff880226d4fde0>
Jan 19 14:49:12 o12-31 [ 5504.559212] general protection fault: 0000 [#2] 
Jan 19 14:49:12 o12-31 
Jan 19 14:49:12 o12-31 [ 5505.378360] ---[ end trace 44f45a7cc66c8504 ]---
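
The `Code:` bytes in the trace above can be checked mechanically. The following is only a rough sketch, not part of the original capture: it takes the bytes as one string, reads the marked byte as `<0f>` (the usual oops notation for the byte at RIP; treat that reading as an assumption), and tests whether the instruction at RIP is `ud2` (`0f 0b`), which would be consistent with the `invalid opcode` line and with a `BUG()` site. The helper name `trapping_bytes` is made up for illustration.

```python
# Code: bytes from the first trace; <..> marks the byte at RIP.
# Reading the marker as <0f> is an assumption about the captured log.
CODE = (
    "ff 14 24 49 83 c4 10 49 83 3c 24 00 eb e4 48 83 fb 10 76 79 "
    "48 89 df e8 1b e9 ff ff 49 89 c6 48 8b 00 a8 80 75 11 f6 c4 "
    "c0 75 02 <0f> 0b 4c 89 f7 e8 a7 37 fd ff eb 56 4d 8b 6e 10 "
    "4c 8b 7d 08 e8"
)


def trapping_bytes(code: str, count: int = 2) -> bytes:
    """Return `count` bytes starting at the one marked <xx> (hypothetical helper)."""
    tokens = code.split()
    # Index of the first token wrapped in angle brackets
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    cleaned = [tok.strip("<>") for tok in tokens[start:start + count]]
    return bytes(int(tok, 16) for tok in cleaned)


UD2 = bytes([0x0F, 0x0B])  # x86 encoding of the ud2 instruction

at_rip = trapping_bytes(CODE)
shown = " ".join(f"{b:02x}" for b in at_rip)
print(shown, "-> ud2" if at_rip == UD2 else "-> not ud2")
```

The same check can be run on the other traces by swapping in their `Code:` bytes.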

Another trace:

Jan 18 17:38:04 o12-29 [ 9018.944644] ------------[ cut here ]------------
Jan 18 17:38:04 o12-29 [ 9018.945001] kernel BUG at mm/slub.c:2834!
Jan 18 17:38:04 o12-29 [ 9018.945001] invalid opcode: 0000 [#1] SMP
Jan 18 17:38:04 o12-29 [ 9018.945001] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
Jan 18 17:38:04 o12-29 [ 9018.945001] CPU 0
Jan 18 17:38:04 o12-29 [ 9018.945001] Modules linked in: nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ipt_REDIRECT xt_NOTRACK iptable_raw fuse ipmi_devintf ipmi_si ipmi_msghandler ipv6 iptable_nat nf_nat dm_multipath netconsole configfs radeon ttm drm_kms_helper drm i2c_algo_bit mptsas mptscsih tg3 i5100_edac i2c_core dcdbas iTCO_wdt mptbase iTCO_vendor_support serio_raw hed edac_core scsi_transport_sas [last unloaded: scsi_wait_scan]
Jan 18 17:38:04 o12-31 [ 9005.343703] audit_printk_skb: 12 callbacks suppressed
Jan 18 17:38:04 o12-29 [ 9018.945001] Pid: 7299, comm: irqbalance Not tainted 2.6.35.10-74.fc14.x86_64 #1 0TY179/PowerEdge R300
Jan 18 17:38:04 o12-29 [ 9018.945001] RIP: 0010:[<ffffffff81108a6e>]  [<ffffffff81108a6e>] kfree+0x69/0xd7
Jan 18 17:38:05 o12-29 [ 9018.945001] RSP: 0018:ffff88021f831de8  EFLAGS: 00010246
Jan 18 17:38:05 o12-29 [ 9020.058633] RAX: 0040000000100068 RBX: ffff880200000000 RCX: ffff880225504f40
Jan 18 17:38:05 o12-29 [ 9020.058633] RDX: 0040000000100068 RSI: ffff880223287240 RDI: ffff880200000000
Jan 18 17:38:05 o12-29 [ 9020.058633] RBP: ffff88021f831e18 R08: ffff880225504858 R09: 0000000000000000
Jan 18 17:38:05 o12-29 [ 9020.058633] R10: 0000000000000000 R11: 0000000000000246 R12: ffff880221965700
Jan 18 17:38:05 o12-29 [ 9020.058633] R13: ffffffff8112f0fa R14: ffffea0007000000 R15: ffff880223287240
Jan 18 17:38:05 o12-29 [ 9020.058633] FS:  00007f231bd9d740(0000) GS:ffff880002000000(0000) knlGS:0000000000000000
Jan 18 17:38:05 o12-29 [ 9020.058633] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jan 18 17:38:05 o12-29 [ 9020.058633] CR2: 00007f231bda5000 CR3: 000000021e9bc000 CR4: 00000000000406f0
Jan 18 17:38:05 o12-29 [ 9020.058633] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 18 17:38:06 o12-29 [ 9020.058633] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 18 17:38:06 o12-29 [ 9020.058633] Process irqbalance (pid: 7299, threadinfo ffff88021f830000, task ffff88021f0045c0)
Jan 18 17:38:06 o12-29 [ 9020.058633] Stack:
Jan 18 17:38:06 o12-29 [ 9020.058633]  ffff88021f831df8 ffff8802236aefb0 ffff880221965700 ffff8802236aefb0
Jan 18 17:38:06 o12-29 [ 9020.058633] <0> ffffffff81168a3e ffff880223287240 ffff88021f831e48 ffffffff8112f0fa
Jan 18 17:38:06 o12-29 [ 9020.058633] <0> ffffffff81105f49 ffff880223287240 ffff8802236aefb0 ffff880225504840
Jan 18 17:38:06 o12-29 [ 9020.058633] Call Trace:
Jan 18 17:38:06 o12-29 [ 9020.058633]  [<ffffffff81168a3e>] ? seq_release_net+0x0/0x3d
Jan 18 17:38:06 o12-29 [ 9020.058633]  [<ffffffff8112f0fa>] seq_release_private+0x28/0x44
Jan 18 17:38:06 o12-29 [ 9020.058633]  [<ffffffff81105f49>] ? __raw_local_irq_save+0x1d/0x23
Jan 18 17:38:06 o12-29 [ 9020.058633]  [<ffffffff81168a72>] seq_release_net+0x34/0x3d
Jan 18 17:38:06 o12-29 [ 9020.058633]  [<ffffffff81168a3e>] ? seq_release_net+0x0/0x3d
Jan 18 17:38:06 o12-29 [ 9020.058633]  [<ffffffff81160853>] proc_reg_release+0xd1/0xee
Jan 18 17:38:07 o12-29 [ 9020.058633]  [<ffffffff81469bf6>] ? _raw_spin_lock+0xe/0x10
Jan 18 17:38:07 o12-29 [ 9020.058633]  [<ffffffff811183e2>] fput+0x13b/0x1ed
Jan 18 17:38:07 o12-29 [ 9020.058633]  [<ffffffff81115613>] filp_close+0x66/0x70
Jan 18 17:38:07 o12-29 [ 9020.058633]  [<ffffffff811156b2>] sys_close+0x95/0xd5
Jan 18 17:38:07 o12-29 [ 9020.058633]  [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
Jan 18 17:38:07 o12-29 [ 9020.058633] Code: ff 14 24 49 83 c4 10 49 83 3c 24 00 eb e4 48 83 fb 10 76 79 48 89 df e8 1b e9 ff ff 49 89 c6 48 8b 00 a8 80 75 11 f6 c4 c0 75 02 <0f> 0b 4c 89 f7 e8 a7 37 fd ff eb 56 4d 8b 6e 10 4c 8b 7d 08 e8
Jan 18 17:38:07 o12-29 [ 9020.058633] RIP  [<ffffffff81108a6e>] kfree+0x69/0xd7
Jan 18 17:38:07 o12-29 [ 9020.058633]  RSP <ffff88021f831de8>
Jan 18 17:38:07 o12-29 [ 9022.648903] ---[ end trace e75aad0bae2dbc7c ]---

Comment 1 Nate Clark 2011-01-19 22:29:44 UTC
Created attachment 474367
Picture of another panic

I wasn't able to capture the full text because this occurred during shutdown.

Comment 2 Chuck Ebbert 2011-01-20 15:20:41 UTC
In kfree():

        if (unlikely(!PageSlab(page))) {
                BUG_ON(!PageCompound(page));
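
To see why this check fires, the RAX values from the traces can be tested against the two masks that the dumped instructions use. The sketch below is illustrative only. It assumes RAX holds `page->flags` at the time of the trap (the preceding bytes `48 8b 00` decode as `mov (%rax),%rax`), and that the masks `0x80` (from `a8 80`, `test $0x80,%al`) and `0xc000` (from `f6 c4 c0`, `test $0xc0,%ah`) correspond to the slab bit and the compound head/tail bits on this kernel build. The function name `classify` is made up.

```python
# Masks read from the test instructions in the Code: bytes; mapping them
# to PageSlab / PageCompound is an assumption about this kernel build.
SLAB_MASK = 0x80        # a8 80    -> test $0x80,%al
COMPOUND_MASK = 0xC000  # f6 c4 c0 -> test $0xc0,%ah (AH is bits 8-15)


def classify(flags: int) -> str:
    """Mirror the two checks in the quoted kfree() excerpt (hypothetical helper)."""
    if flags & SLAB_MASK:
        return "slab page: normal free path"
    if flags & COMPOUND_MASK:
        return "compound page: handed back to the page allocator"
    return "neither: BUG_ON fires"


# RAX values as reported in the traces on this bug
RAX_VALUES = [0x0040000000080000, 0x0040000000100068, 0x0040000000100078]

for rax in RAX_VALUES:
    print(f"{rax:#018x}: {classify(rax)}")
```

Under those assumptions, every reported RAX value has neither bit set, so the page being freed looks like neither a slab page nor a compound page.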

Comment 3 Chuck Ebbert 2011-01-20 15:21:59 UTC
Has this been happening all along or did it start with a recent kernel update?

Comment 4 Nate Clark 2011-01-20 15:45:55 UTC
(In reply to comment #3)
> Has this been happening all along or did it start with a recent kernel update?

I haven't tested thoroughly with the other 2.6.35 kernels, but it does look like the issue might exist in 2.6.35.9-64. I am about to start testing 2.6.35.6-48 to see if it has similar problems.

Comment 5 Nate Clark 2011-01-20 20:49:00 UTC
Not sure if this is helpful but I have another stack from 2.6.35.10-74.

Jan 20 15:46:01 o12-30 [28597.628126] kernel BUG at mm/slub.c:2834!
Jan 20 15:46:02 o12-30 [28597.628126] invalid opcode: 0000 [#1] SMP
Jan 20 15:46:02 o12-30 [28597.628126] last sysfs file: /sys/module/ipmi_si/refcnt
Jan 20 15:46:02 o12-30 [28597.628126] CPU 1
Jan 20 15:46:02 o12-30 [28597.628126] Modules linked in: nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ipt_REDIRECT xt_NOTRACK iptable_raw fuse ipmi_si(-) ipmi_msghandler ipv6 iptable_nat nf_nat dm_multipath netconsole configfs radeon ttm drm_kms_helper drm mptsas i2c_algo_bit i2c_core iTCO_wdt tg3 mptscsih dcdbas iTCO_vendor_support hed mptbase i5100_edac serio_raw edac_core scsi_transport_sas [last unloaded: ipmi_devintf]
Jan 20 15:46:02 o12-30 [28597.628126] Pid: 21798, comm: modprobe Not tainted 2.6.35.10-74.fc14.x86_64 #1 0TY179/PowerEdge R300
Jan 20 15:46:02 o12-30 [28597.628126] RIP: 0010:[<ffffffff81108a6e>]  [<ffffffff81108a6e>] kfree+0x69/0xd7
Jan 20 15:46:02 o12-30 [28597.628126] RSP: 0018:ffff88021e8f9d88  EFLAGS: 00010246
Jan 20 15:46:02 o12-30 [28597.628126] RAX: 0040000000100078 RBX: ffff880200000000 RCX: 000000000015000e
Jan 20 15:46:02 o12-30 [28597.628126] RDX: 0040000000100078 RSI: ffffea00076d56d0 RDI: ffff880200000000
Jan 20 15:46:03 o12-30 [28597.628126] RBP: ffff88021e8f9db8 R08: ffff88021e8f9cc8 R09: ffff880002015500
Jan 20 15:46:03 o12-30 [28597.628126] R10: ffff880200000001 R11: ffff88021e8f9d18 R12: ffff88021f3d66f8
Jan 20 15:46:03 o12-30 [28597.628126] R13: ffffffff81219e79 R14: ffffea0007000000 R15: 000000000040f190
Jan 20 15:46:03 o12-30 [28597.628126] FS:  00007fe42bfa1720(0000) GS:ffff880002080000(0000) knlGS:0000000000000000
Jan 20 15:46:03 o12-30 [28597.628126] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 20 15:46:03 o12-30 [28597.628126] CR2: 00007fe42bb411a0 CR3: 0000000225cdd000 CR4: 00000000000406e0
Jan 20 15:46:03 o12-30 [28597.628126] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 20 15:46:03 o12-30 [28597.628126] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 20 15:46:03 o12-30 [28597.628126] Process modprobe (pid: 21798, threadinfo ffff88021e8f8000, task ffff880225b2c5c0)
Jan 20 15:46:03 o12-30 [28597.628126] Stack:
Jan 20 15:46:03 o12-30 [28597.628126]  ffff880200000000 ffff88021f3d66c0 ffff88021f3d66f8 ffffffff81a90210
Jan 20 15:46:03 o12-30 [28597.628126] <0> ffff880200000000 000000000040f190 ffff88021e8f9de8 ffffffff81219e79
Jan 20 15:46:04 o12-30 [28597.628126] <0> ffff88021f3d66f8 ffffffff81219d50 ffffffff81a7bf70 ffffffffa010f870
Jan 20 15:46:04 o12-30 [28597.628126] Call Trace:
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81219e79>] kobject_release+0x129/0x1d9
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81219d50>] ? kobject_release+0x0/0x1d9
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff8121b382>] kref_put+0x43/0x4d
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81219cc7>] kobject_put+0x47/0x4b
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff812d7b08>] bus_remove_driver+0xbd/0xed
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff812d8463>] driver_unregister+0x6c/0x74
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81233795>] pci_unregister_driver+0x44/0x89
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffffa010d4dc>] cleanup_ipmi_si+0x30/0xa8 [ipmi_si]
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff8107af85>] sys_delete_module+0x1ca/0x23b
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff8146cfdd>] ? do_page_fault+0x238/0x265
Jan 20 15:46:04 o12-30 [28597.628126] [<ffffffff81009cf2>] system_call_fastpath+0x16/0x1b
Jan 20 15:46:05 o12-30 [28597.628126] Code: ff 14 24 49 83 c4 10 49 83 3c 24 00 eb e4 48 83 fb 10 76 79 48 89 df e8 1b e9 ff ff 49 89 c6 48 8b 00 a8 80 75 11 f6 c4 c0 75 02 <0f> 0b 4c 89 f7 e8 a7 37 fd ff eb 56 4d 8b 6e 10 4c 8b 7d 08 e8
Jan 20 15:46:05 o12-30 [28597.628126] RIP  [<ffffffff81108a6e>] kfree+0x69/0xd7
Jan 20 15:46:05 o12-30 [28597.628126]  RSP <ffff88021e8f9d88>
Jan 20 15:46:05 o12-30 [28601.071743] ---[ end trace 9dc380e62a5d64dc ]---

Comment 6 Nate Clark 2011-01-24 16:04:20 UTC
The issue also exists in 2.6.35.6-48:

Jan 21 02:35:26 o12-28 [ 554.258001] kernel BUG at mm/slub.c:2834!
Jan 21 02:35:26 o12-28 [ 554.258001] invalid opcode: 0000 [#1] SMP
Jan 21 02:35:26 o12-28 [ 554.258001] last sysfs file: /sys/devices/pci0000:00/0000:00:04.0/0000:05:00.0/host4/port-4:1/end_device-4:1/target4:0:1/4:0:1:0/block/sdb/dev
Jan 21 02:35:27 o12-28 [ 554.258001] CPU 0
Jan 21 02:35:27 o12-28 [ 554.258001] Modules linked in: nfsd lockd nfs_acl auth_rpcgss sunrpc exportfs ipt_REDIRECT xt_NOTRACK iptable_raw fuse ipmi_devintf ipmi_si ipmi_msghandler ipv6 iptable_nat nf_nat dm_multipath netconsole configfs radeon ttm drm_kms_helper mptsas drm mptscsih mptbase i2c_algo_bit scsi_transport_sas i2c_core hed dcdbas serio_raw iTCO_wdt iTCO_vendor_support i5100_edac tg3 edac_core [last unloaded: scsi_wait_scan]
Jan 21 02:35:27 o12-28 [ 554.258001] Pid: 15, comm: events/0 Not tainted 2.6.35.6-48.fc14.x86_64 #1 0TY179/PowerEdge R300
Jan 21 02:35:27 o12-28 [ 554.258001] RIP: 0010:[<ffffffff81108796>]  [<ffffffff81108796>] kfree+0x69/0xd7
Jan 21 02:35:27 o12-28 [ 554.258001] RSP: 0018:ffff880226d4fde0  EFLAGS: 00010246
Jan 21 02:35:27 o12-28 [ 554.258001] RAX: 0040000000100068 RBX: ffff880200000000 RCX: dead000000100100
Jan 21 02:35:27 o12-28 [ 554.258001] RDX: 0040000000100068 RSI: ffffffff81dc6574 RDI: ffff880200000000
Jan 21 02:35:27 o12-28 [ 554.258001] RBP: ffff880226d4fe10 R08: ffff8802255bfec0 R09: ffff8802224d7800
Jan 21 02:35:28 o12-28 [ 554.258001] R10: 0000000000000000 R11: 0000000000000246 R12: ffff880226d60000
Jan 21 02:35:28 o12-28 [ 554.258001] R13: ffffffff811d5b0d R14: ffffea0007000000 R15: ffff880002018588
Jan 21 02:35:28 o12-28 [ 554.258001] FS:  0000000000000000(0000) GS:ffff880002000000(0000) knlGS:0000000000000000
Jan 21 02:35:28 o12-28 [ 554.258001] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jan 21 02:35:28 o12-28 [ 554.258001] CR2: 00007f7375a617e8 CR3: 000000022413a000 CR4: 00000000000406f0
Jan 21 02:35:28 o12-28 [ 554.258001] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jan 21 02:35:28 o12-28 [ 554.258001] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jan 21 02:35:28 o12-28 [ 554.258001] Process events/0 (pid: 15, threadinfo ffff880226d4e000, task ffff880226d60000)
Jan 21 02:35:28 o12-28 [ 554.258001] Stack:
Jan 21 02:35:28 o12-28 [ 554.258001]  ffff880226d4fdf0 ffff8802255bfe40 ffff880226d60000 ffff880226d60000
Jan 21 02:35:28 o12-28 [ 554.258001] <0> ffff880226d60000 ffff880002018588 ffff880226d4fe30 ffffffff811d5b0d
Jan 21 02:35:28 o12-28 [ 554.258001] <0> ffff880226d60000 ffff880002018580 ffff880226d4fee0 ffffffff81062571
Jan 21 02:35:28 o12-28 [ 554.258001] Call Trace:
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff811d5b0d>] key_cleanup+0xdb/0xef
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff81062571>] worker_thread+0x1c5/0x251
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff811d5a32>] ? key_cleanup+0x0/0xef
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff810663c3>] ? autoremove_wake_function+0x0/0x39
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff810623ac>] ? worker_thread+0x0/0x251
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff81065f29>] kthread+0x7f/0x87
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff8100aae4>] kernel_thread_helper+0x4/0x10
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff81065eaa>] ? kthread+0x0/0x87
Jan 21 02:35:29 o12-28 [ 554.258001]  [<ffffffff8100aae0>] ? kernel_thread_helper+0x0/0x10
Jan 21 02:35:29 o12-28 [ 554.258001] Code: ff 14 24 49 83 c4 10 49 83 3c 24 00 eb e4 48 83 fb 10 76 79 48 89 df e8 1b e9 ff ff 49 89 c6 48 8b 00 a8 80 75 11 f6 c4 c0 75 02 <0f> 0b 4c 89 f7 e8 43 38 fd ff eb 56 4d 8b 6e 10 4c 8b 7d 08 e8
Jan 21 02:35:29 o12-28 [ 554.258001] RIP  [<ffffffff81108796>] kfree+0x69/0xd7
Jan 21 02:35:29 o12-28 [ 554.258001]  RSP <ffff880226d4fde0>
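
One thing the four traces on this bug have in common can be checked with a few lines of Python. The sketch below copies the RBX, RDI and R14 values from each register dump and compares them across hosts and kernel builds. Treating RBX as the pointer passed to `kfree()` is an assumption, based on the dumped bytes `48 89 df` (`mov %rbx,%rdi`) just before a call. The table layout and the name `distinct_values` are for illustration only.

```python
# (RBX, RDI, R14) copied from the four oops register dumps on this bug.
TRACES = {
    "o12-31, 2.6.35.10-74": (0xFFFF880200000000, 0xFFFF880200000000, 0xFFFFEA0007000000),
    "o12-29, 2.6.35.10-74": (0xFFFF880200000000, 0xFFFF880200000000, 0xFFFFEA0007000000),
    "o12-30, 2.6.35.10-74": (0xFFFF880200000000, 0xFFFF880200000000, 0xFFFFEA0007000000),
    "o12-28, 2.6.35.6-48": (0xFFFF880200000000, 0xFFFF880200000000, 0xFFFFEA0007000000),
}
REGS = ("RBX", "RDI", "R14")


def distinct_values(index: int) -> set:
    """Set of values seen for one register across all traces (hypothetical helper)."""
    return {values[index] for values in TRACES.values()}


for i, name in enumerate(REGS):
    seen = sorted(distinct_values(i))
    print(name, [f"{v:016x}" for v in seen])

# Low half of the (assumed) freed pointer
rbx = next(iter(distinct_values(0)))
print("low 32 bits of RBX:", f"{rbx & 0xFFFFFFFF:08x}")
```

Each register holds a single value across all four dumps, and the low 32 bits of RBX are zero. The traces alone do not show where that value comes from.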

Comment 7 Nate Clark 2011-01-26 20:56:22 UTC
Is there anything you need from me to help resolve this issue?

Comment 8 Nate Clark 2011-02-01 16:36:57 UTC
I grabbed the 2.6.36.2-12.rc1.fc15.x86_64 kernel from http://kojipkgs.fedoraproject.org/packages/kernel/ and it seems to have resolved the issue. The same nodes that were hitting the BUG on 2.6.35 have lasted over 3x longer on 2.6.36. Hope this helps to narrow down what could be the cause.

Comment 9 Josh Boyer 2011-08-29 20:41:33 UTC
Nate, have you moved your systems to f15 at this point?

Comment 10 Nate Clark 2011-08-29 21:16:22 UTC
(In reply to comment #9)
> Nate, have you moved your systems to f15 at this point?

The systems were moved to a vanilla 2.6.36.4 kernel and have been running it for the past few months, so I cannot comment on f15's kernel.

Comment 11 Chuck Ebbert 2011-08-29 23:07:56 UTC
Closing as fixed in F15 per comment #8

Comment 12 manish 2018-01-23 00:13:19 UTC
Hi,

Do we know exactly which fix was made for this in the kernel?
Any patch or commit ID would be helpful.

Thanks,
Manish