Bug 201726
| Summary: | slab allocator lock recursion. | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Lonni J Friedman <netllama> |
| Component: | kernel | Assignee: | Peter Zijlstra <pzijlstr> |
| Status: | CLOSED ERRATA | QA Contact: | Brian Brock <bbrock> |
| Severity: | high | Priority: | medium |
| Version: | 6 | CC: | konradr, lwang, wtogami |
| Hardware: | x86_64 | OS: | Linux |
| Doc Type: | Bug Fix | Last Closed: | 2006-11-12 05:45:18 UTC |
| Bug Blocks: | 202141 | | |

Attachments:
Created attachment 133794
dmesg which includes the copious backtraces
Created attachment 133795
sysreport
Attaching sysreport generated after the spewage.
Created attachment 133803
Serial output showing similar problem.
I get stuff like this:
Loading scsi_mod.ko module
BUG: warning at kernel/lockdep.c:565/print_infinite_recursion_bug() (Not tainted)
Call Trace:
[<ffffffff8026e67d>] show_trace+0xae/0x319
[<ffffffff8026e8fd>] dump_stack+0x15/0x17
[<ffffffff802a71da>] print_infinite_recursion_bug+0x45/0x49
[<ffffffff802a7cad>] check_noncircular+0x30/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a7cf2>] check_noncircular+0x75/0x9c
[<ffffffff802a8558>] __lock_acquire+0x83f/0xa5f
[<ffffffff802a8d1b>] lock_acquire+0x4b/0x69
[<ffffffff80266009>] __mutex_lock_slowpath+0xe5/0x261
[<ffffffff802661af>] mutex_lock+0x2a/0x2e
[<ffffffff802ac2a4>] lock_cpu_hotplug+0x7a/0x85
[<ffffffff802bb1ed>] stop_machine_run+0x1a/0x4a
[<ffffffff802af2d3>] sys_init_module+0x16d6/0x18cc
[<ffffffff8026030e>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Leftover inexact backtrace:
-> #20 (&rq->rq_lock_key#16){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #19 (&rq->rq_lock_key#15){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #18 (&rq->rq_lock_key#14){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #17 (&rq->rq_lock_key#13){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #16 (&rq->rq_lock_key#12){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #15 (&rq->rq_lock_key#11){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #14 (&rq->rq_lock_key#10){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #13 (&rq->rq_lock_key#9){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #12 (&rq->rq_lock_key#8){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #11 (&rq->rq_lock_key#7){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #10 (&rq->rq_lock_key#6){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #9 (&rq->rq_lock_key#5){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #8 (&rq->rq_lock_key#4){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #7 (&rq->rq_lock_key#3){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #6 (&rq->rq_lock_key#2){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028d34e>] double_rq_lock+0x2d/0x33
[<ffffffff8028d61d>] __migrate_task+0x63/0xea
[<ffffffff802469c8>] migration_thread+0x1e0/0x23f
[<ffffffff802352e9>] kthread+0xff/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #5 (&rq->rq_lock_key){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8028dcc8>] task_rq_lock+0x41/0x74
[<ffffffff8024887f>] try_to_wake_up+0x26/0x418
[<ffffffff8028decc>] default_wake_function+0xc/0xf
[<ffffffff8028c1cc>] __wake_up_common+0x3d/0x68
[<ffffffff8028d30c>] complete+0x37/0x4c
[<ffffffff802352c4>] kthread+0xda/0x136
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #4 (&q->lock){++..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff80267a22>] _spin_lock_irqsave+0x2b/0x3c
[<ffffffff8023a752>] __wake_up_sync+0x1d/0x4e
[<ffffffff8029c27e>] do_notify_parent+0x197/0x1b5
[<ffffffff80216481>] do_exit+0x809/0x919
[<ffffffff8029feea>] ____call_usermodehelper+0x5e/0x5f
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #3 (&sighand->siglock){....}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff8022e1d9>] flush_old_exec+0x49c/0xb14
[<ffffffff80219053>] load_elf_binary+0x47d/0x18ee
[<ffffffff80242803>] search_binary_handler+0xcb/0x2c7
[<ffffffff80250a4a>] load_script+0x1ae/0x1c4
[<ffffffff80242803>] search_binary_handler+0xcb/0x2c7
[<ffffffff80241cc3>] do_execve+0x1a3/0x261
[<ffffffff802575ea>] sys_execve+0x35/0x4d
[<ffffffff802612cb>] execve+0x63/0xc8
[<ffffffff8026d7d3>] init+0x353/0x3cd
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #2 (init_sighand.siglock){....}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026764a>] _spin_lock+0x24/0x31
[<ffffffff80220e98>] copy_process+0x1196/0x167d
[<ffffffff8023363c>] do_fork+0x7e/0x17f
[<ffffffff802611f8>] kernel_thread+0x80/0xde
[<ffffffff8096e8bb>] start_kernel+0x249/0x24c
[<ffffffff8096e28a>] _sinittext+0x28a/0x292
[<ffffffffffffffff>] 0xffffffffffffffff
-> #1 (tasklist_lock){..--}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff8026771e>] _write_lock_irq+0x2a/0x37
[<ffffffff80220e00>] copy_process+0x10fe/0x167d
[<ffffffff80291bda>] fork_idle+0x36/0x61
[<ffffffff80278e08>] do_fork_idle+0x13/0x27
[<ffffffff8027919f>] __cpu_up+0x298/0x782
[<ffffffff802abdfd>] _cpu_up+0x80/0xdd
[<ffffffff802abe84>] cpu_up+0x2a/0x42
[<ffffffff8026d537>] init+0xb7/0x3cd
[<ffffffff8026125d>] child_rip+0x7/0x12
-> #0 (cpu_bitmask_lock){--..}:
[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69
[<ffffffff80266008>] __mutex_lock_slowpath+0xe4/0x261
[<ffffffff802661ae>] mutex_lock+0x29/0x2e
[<ffffffff802ac2a3>] lock_cpu_hotplug+0x79/0x85
[<ffffffff802bb1ec>] stop_machine_run+0x19/0x4a
[<ffffffff802af2d2>] sys_init_module+0x16d5/0x18cc
[<ffffffff8026030d>] system_call+0x7d/0x83
other info that might help us debug this:
1 lock held by insmod/583:
#0: (module_mutex){--..}, at: [<ffffffff80265de5>] mutex_lock_interruptible+0x2a/0x2e
stack backtrace:
Call Trace:
[<ffffffff8026e67d>] show_trace+0xae/0x319
[<ffffffff8026e8fd>] dump_stack+0x15/0x17
[<ffffffff802a6f67>] print_circular_bug_tail+0x6c/0x77
[<ffffffff802a8561>] __lock_acquire+0x848/0xa5f
[<ffffffff802a8d1b>] lock_acquire+0x4b/0x69
[<ffffffff80266009>] __mutex_lock_slowpath+0xe5/0x261
[<ffffffff802661af>] mutex_lock+0x2a/0x2e
[<ffffffff802ac2a4>] lock_cpu_hotplug+0x7a/0x85
[<ffffffff802bb1ed>] stop_machine_run+0x1a/0x4a
[<ffffffff802af2d3>] sys_init_module+0x16d6/0x18cc
[<ffffffff8026030e>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Leftover inexact backtrace:
SCSI subsystem initialized
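A dependency chain as long as the one above is easier to read once the repeated entries are collapsed. The following is a minimal sketch, not part of the original report: it assumes chain headers of the form `-> #N (name){flags}:` as they appear in this trace, and it assumes the trailing `#N` in a lock name only distinguishes lock classes that share a name. The function names `lock_chain`, `base_class`, and `summarize` are hypothetical helpers made up for illustration.

```python
import re

# Matches lockdep dependency-chain headers as they appear in this trace, e.g.
#   -> #5 (&rq->rq_lock_key){++..}:
CHAIN_RE = re.compile(r"^\s*->\s*#(\d+)\s+\((.+)\)\{([^}]*)\}:\s*$")


def lock_chain(log_text):
    """Return [(index, lock_name, usage_flags), ...] in the order reported."""
    chain = []
    for line in log_text.splitlines():
        m = CHAIN_RE.match(line)
        if m:
            chain.append((int(m.group(1)), m.group(2), m.group(3)))
    return chain


def base_class(lock_name):
    """Strip a trailing '#N' suffix (assumed to be a name-version marker)."""
    return re.sub(r"#\d+$", "", lock_name)


def summarize(chain):
    """Collapse consecutive entries with the same base name into (name, count)."""
    out = []
    for _, name, _ in chain:
        base = base_class(name)
        if out and out[-1][0] == base:
            out[-1] = (base, out[-1][1] + 1)
        else:
            out.append((base, 1))
    return out


if __name__ == "__main__":
    # A few header lines copied from the trace above; frames are ignored.
    sample = "\n".join([
        "-> #6 (&rq->rq_lock_key#2){++..}:",
        "[<ffffffff802a8d1a>] lock_acquire+0x4a/0x69",
        "-> #5 (&rq->rq_lock_key){++..}:",
        "-> #4 (&q->lock){++..}:",
        "-> #0 (cpu_bitmask_lock){--..}:",
    ])
    for name, count in summarize(lock_chain(sample)):
        print(f"{name} x{count}")
```

Run against the full dmesg attachment, this would reduce the sixteen `&rq->rq_lock_key` entries to a single line, leaving the remaining locks in the chain (`&q->lock`, `&sighand->siglock`, `init_sighand.siglock`, `tasklist_lock`, `cpu_bitmask_lock`) easier to spot.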
This should be fixed in 2.6.18-1.2849.fc6, now in updates.
Description of problem:

Version-Release number of selected component (if applicable):
2.6.17-1.2517.fc6 and 2.6.17-1.2530.fc6

How reproducible:

Steps to Reproduce:
1. Boot, and the fireworks begin early on at "checking if image is initramfs..."
2. Even after booting, there's a ton more spewage, seemingly at random when doing things as simple as ssh/scp, and also when running the attached sysreport.

Actual results:
#######
checking if image is initramfs...

=============================================
[ INFO: possible recursive locking detected ]
---------------------------------------------
swapper/1 is trying to acquire lock:
(&nc->lock){....}, at: [<ffffffff8020782c>] kmem_cache_free+0x1a1/0x26c
but task is already holding lock:
(&nc->lock){....}, at: [<ffffffff8020b47e>] kfree+0x1b3/0x27e
other info that might help us debug this:
2 locks held by swapper/1:
#0: (&nc->lock){....}, at: [<ffffffff8020b47e>] kfree+0x1b3/0x27e
#1: (&parent->list_lock){....}, at: [<ffffffff802dae22>] __drain_alien_cache+0x37/0x77
stack backtrace:
Call Trace:
[<ffffffff8026e77d>] show_trace+0xae/0x30e
[<ffffffff8026e9f2>] dump_stack+0x15/0x17
[<ffffffff802a7f23>] __lock_acquire+0x135/0xa54
[<ffffffff802a8de3>] lock_acquire+0x4b/0x69
[<ffffffff8026774b>] _spin_lock+0x25/0x31
[<ffffffff8020782c>] kmem_cache_free+0x1a1/0x26c
[<ffffffff802da9a8>] slab_destroy+0x12b/0x138
[<ffffffff802dab5f>] free_block+0x1aa/0x1ee
[<ffffffff802dae48>] __drain_alien_cache+0x5d/0x77
[<ffffffff8020b49b>] kfree+0x1d0/0x27e
[<ffffffff80968444>] free+0x9/0xb
[<ffffffff80968461>] huft_free+0x1b/0x27
[<ffffffff8096961c>] inflate_dynamic+0x4f0/0x525
[<ffffffff80969b18>] unpack_to_rootfs+0x4c7/0x930
[<ffffffff80969fe6>] populate_rootfs+0x65/0xe7
[<ffffffff8026d710>] init+0x190/0x3cd
[<ffffffff8026135e>] child_rip+0x8/0x12
DWARF2 unwinder stuck at child_rip+0x8/0x12
Leftover inexact backtrace:
[<ffffffff80267a32>] _spin_unlock_irq+0x2b/0x31
[<ffffffff8026099c>] restore_args+0x0/0x30
[<ffffffff8036c696>] acpi_os_acquire_lock+0x9/0xb
[<ffffffff8026d580>] init+0x0/0x3cd
[<ffffffff80261356>] child_rip+0x0/0x12
it is ############## ############
sibling task PC pid father child younger older
init S ffff8101442eda08 0 1 0 2 (NOTLB)
ffff8101442eda08 ffff8101442ed988 ffffffff806c0d80 0000000000000008
ffff81013fc6a040 ffffffff80567e80 000000a047ea2983 0000000000015fe9
ffff81013fc6a228 ffff810100000000 ffffffff806c0d80 ffff8101442eda08
Call Trace:
[<ffffffff80265886>] schedule_timeout+0x8c/0xb3
[<ffffffff8021204c>] do_select+0x470/0x4de
[<ffffffff802e705b>] core_sys_select+0x1b7/0x266
[<ffffffff80217106>] sys_select+0x147/0x172
[<ffffffff8026040e>] system_call+0x7e/0x83
DWARF2 unwinder stuck at system_call+0x7e/0x83
Leftover inexact backtrace:
migration/0 S ffff81013fc77e98 0 2 1 3 (L-TLB)
ffff81013fc77e98 0000000100000296 0000000000000002 0000000000000001
ffff8101442ea040 ffffffff80567e80 000000a0598b50f5 000000000000119a
ffff8101442ea228 ffff810100000000 0000000000000046 ffff81013fc77e78
Call Trace:
[<ffffffff80246a22>] migration_thread+0x1a2/0x23f
[<ffffffff802353fe>] kthread+0x100/0x136
[<ffffffff8026135e>] child_rip+0x8/0x12
DWARF2 unwinder stuck at child_rip+0x8/0x12
Leftover inexact backtrace:
[<ffffffff80267a32>] _spin_unlock_irq+0x2b/0x31
[<ffffffff8026099c>] restore_args+0x0/0x30
[<ffffffff802352fe>] kthread+0x0/0x136
[<ffffffff80261356>] child_rip+0x0/0x12
#########

Expected results:
No spewage

Additional info:
I'm seeing all of this behavior on an HP xw9300 workstation.