Bug 507500

Summary: "MAX_LOCK_DEPTH too low!" warning during kvm guest startup
Product: Fedora
Component: kernel
Version: 13
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: low
Reporter: Saikat Guha <sg266>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: berrange, clalance, ehabkost, gcosta, gozen, itamar, jforbes, kernel-maint, kmcmartin, markmc, selinux, virt-maint
Doc Type: Bug Fix
Bug Blocks: 498969, 546576 (view as bug list)
Last Closed: 2011-06-27 10:15:12 EDT

Description Saikat Guha 2009-06-22 21:32:22 EDT
During boot on this machine (hardware profile: http://www.smolts.org/client/show/pub_fb7fc672-cdea-44da-9462-668f9497c120) running F11 with kernel-2.6.30-6.fc12.x86_64:

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
Pid: 1750, comm: qemu-kvm Not tainted 2.6.30-6.fc12.x86_64 #1
Call Trace:
 [<ffffffff8108a068>] __lock_acquire+0xb80/0xc0a
 [<ffffffff8108a1e0>] lock_acquire+0xee/0x12e
 [<ffffffff810ffa5f>] ? mm_take_all_locks+0xf7/0x141
 [<ffffffff810ffa5f>] ? mm_take_all_locks+0xf7/0x141
 [<ffffffff814b9e16>] _spin_lock_nest_lock+0x45/0x8e
 [<ffffffff810ffa5f>] ? mm_take_all_locks+0xf7/0x141
 [<ffffffff814b834a>] ? mutex_lock_nested+0x4f/0x6b
 [<ffffffff810ffa5f>] mm_take_all_locks+0xf7/0x141
 [<ffffffff8111540b>] ? do_mmu_notifier_register+0xb4/0x192
 [<ffffffff81115413>] do_mmu_notifier_register+0xbc/0x192
 [<ffffffff81115548>] mmu_notifier_register+0x26/0x3e
 [<ffffffffa005ec76>] kvm_dev_ioctl+0x143/0x2d9 [kvm]
 [<ffffffff811347ff>] vfs_ioctl+0x31/0xaa
 [<ffffffff81134cf5>] do_vfs_ioctl+0x47d/0x4d4
 [<ffffffff81134db1>] sys_ioctl+0x65/0x9c
 [<ffffffff81013002>] system_call_fastpath+0x16/0x1b

# rpm -qa | grep "\(kvm\|qemu\|virt\|kernel\)" | sort
etherboot-zroms-kvm-5.4.4-13.fc11.noarch
kernel-2.6.30-6.fc12.x86_64
kernel-firmware-2.6.29.5-191.fc11.noarch
kernel-firmware-2.6.30-6.fc12.noarch
kernel-headers-2.6.29.4-167.fc11.x86_64
libvirt-0.6.2-11.fc11.x86_64
libvirt-python-0.6.2-11.fc11.x86_64
python-virtinst-0.400.3-8.fc11.noarch
qemu-0.10.5-3.fc11.x86_64
qemu-common-0.10.5-3.fc11.x86_64
qemu-img-0.10.5-3.fc11.x86_64
qemu-system-arm-0.10.5-3.fc11.x86_64
qemu-system-cris-0.10.5-3.fc11.x86_64
qemu-system-m68k-0.10.5-3.fc11.x86_64
qemu-system-mips-0.10.5-3.fc11.x86_64
qemu-system-ppc-0.10.5-3.fc11.x86_64
qemu-system-sh4-0.10.5-3.fc11.x86_64
qemu-system-sparc-0.10.5-3.fc11.x86_64
qemu-system-x86-0.10.5-3.fc11.x86_64
qemu-user-0.10.5-3.fc11.x86_64
Comment 1 Saikat Guha 2009-06-23 05:35:14 EDT
With kernel-2.6.31-0.24.rc0.git18.fc12.x86_64

BUG: MAX_STACK_TRACE_ENTRIES too low!
turning off the locking correctness validator.
Pid: 1333, comm: S24avahi-daemon Not tainted 2.6.31-0.24.rc0.git18.fc12.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff81092ac8>] save_trace+0x9d/0xba
 [<ffffffff8109376c>] mark_lock+0xd6/0x253
 [<ffffffff81094b03>] __lock_acquire+0x275/0xc17
 [<ffffffff8107c91e>] ? __kernel_text_address+0x86/0x98
 [<ffffffff81095593>] lock_acquire+0xee/0x12e
 [<ffffffff8127f4cd>] ? get_hash_bucket+0x3b/0x5d
 [<ffffffff8127f4cd>] ? get_hash_bucket+0x3b/0x5d
 [<ffffffff814eeaa2>] _spin_lock_irqsave+0x5d/0xab
 [<ffffffff8127f4cd>] ? get_hash_bucket+0x3b/0x5d
 [<ffffffff8127f4cd>] get_hash_bucket+0x3b/0x5d
 [<ffffffff81280a4e>] add_dma_entry+0x26/0x62
 [<ffffffff81280e33>] debug_dma_map_page+0xf3/0x116
 [<ffffffff8142dc26>] skb_dma_map+0xf4/0x242
 [<ffffffffa0032f3b>] tg3_start_xmit_dma_bug+0x34b/0x80e [tg3]
 [<ffffffff81432cf0>] dev_hard_start_xmit+0x24d/0x30c
 [<ffffffff814492e6>] ? __qdisc_run+0xe6/0x231
 [<ffffffff8144930a>] __qdisc_run+0x10a/0x231
 [<ffffffff81433158>] dev_queue_xmit+0x263/0x396
 [<ffffffff81433074>] ? dev_queue_xmit+0x17f/0x396
 [<ffffffff8106a4a8>] ? local_bh_enable_ip+0x21/0x37
 [<ffffffff8143c403>] neigh_resolve_output+0x268/0x2af
 [<ffffffff8146c51c>] ? ip_finish_output+0x0/0x98
 [<ffffffff8146c4d7>] ip_finish_output2+0x1f3/0x238
 [<ffffffff8146c59e>] ip_finish_output+0x82/0x98
 [<ffffffff8146c935>] ip_output+0xb3/0xce
 [<ffffffff8146b047>] dst_output+0x23/0x39
 [<ffffffff8146ca29>] ip_local_out+0x32/0x4d
 [<ffffffff814986bc>] igmpv3_sendpack+0x50/0x6c
 [<ffffffff8149acdd>] igmp_ifc_timer_expire+0x269/0x2b8
 [<ffffffff8149aa74>] ? igmp_ifc_timer_expire+0x0/0x2b8
 [<ffffffff81070179>] run_timer_softirq+0x1f7/0x29e
 [<ffffffff810700e8>] ? run_timer_softirq+0x166/0x29e
 [<ffffffff8101a205>] ? read_tsc+0x9/0x1b
 [<ffffffff8108971a>] ? clocksource_read+0x22/0x38
 [<ffffffff8106a6c5>] __do_softirq+0xf6/0x1f0
 [<ffffffff8101422c>] call_softirq+0x1c/0x30
 [<ffffffff81015d77>] do_softirq+0x5f/0xd7
 [<ffffffff81069fdc>] irq_exit+0x66/0xbc
 [<ffffffff8102c180>] smp_apic_timer_interrupt+0x99/0xbf
 [<ffffffff81013bf3>] apic_timer_interrupt+0x13/0x20
 <EOI>  [<ffffffff81093ce5>] ? debug_check_no_locks_freed+0x6/0x16a
 [<ffffffff8127809e>] ? __spin_lock_init+0x2e/0x7c
 [<ffffffff8106029c>] ? mm_init+0xd6/0x1a9
 [<ffffffff81060968>] ? dup_mm+0x8a/0x3ea
 [<ffffffff814ee5cb>] ? _spin_unlock_irq+0x3f/0x61
 [<ffffffff81061973>] ? copy_process+0xc44/0x148c
 [<ffffffff8106232e>] ? do_fork+0x173/0x37a
 [<ffffffff81093c6d>] ? trace_hardirqs_on_caller+0x139/0x175
 [<ffffffff81093cc9>] ? trace_hardirqs_on+0x20/0x36
 [<ffffffff81012f7a>] ? sysret_check+0x2e/0x69
 [<ffffffff810114d2>] ? sys_clone+0x3b/0x51
 [<ffffffff814edfde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff81013373>] ? stub_clone+0x13/0x20
 [<ffffffff81012f42>] ? system_call_fastpath+0x16/0x1b
Comment 2 Mark McLoughlin 2009-06-23 11:46:39 EDT
Please file the MAX_STACK_TRACE_ENTRIES issue separately.

The MAX_LOCK_DEPTH issue was discussed upstream here:

  http://kerneltrap.org/mailarchive/linux-kvm/2009/5/13/5778353

Sounds like MAX_LOCK_DEPTH does actually need to be increased for this case.
Comment 3 Saikat Guha 2009-06-23 13:44:03 EDT
Filed the MAX_STACK_TRACE_ENTRIES issue separately as bug #507673.
Comment 4 Tom London 2009-09-04 13:05:08 EDT
I'm seeing this as well with all recent rawhide kernels: after each boot, it happens the first time I run qemu-kvm.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
Pid: 2099, comm: qemu-kvm Not tainted 2.6.31-0.199.rc8.git2.fc12.x86_64 #1
Call Trace:
 [<ffffffff81098a5b>] __lock_acquire+0xb84/0xc0e
 [<ffffffff81098bd3>] lock_acquire+0xee/0x12e
 [<ffffffff8111b427>] ? mm_take_all_locks+0xf7/0x141
 [<ffffffff8111b427>] ? mm_take_all_locks+0xf7/0x141
 [<ffffffff8111b427>] ? mm_take_all_locks+0xf7/0x141
 [<ffffffff81506da6>] _spin_lock_nest_lock+0x45/0x8e
 [<ffffffff8111b427>] ? mm_take_all_locks+0xf7/0x141
 [<ffffffff8150561b>] ? mutex_lock_nested+0x4f/0x6b
 [<ffffffff8111b427>] mm_take_all_locks+0xf7/0x141
 [<ffffffff811313d8>] ? do_mmu_notifier_register+0xb4/0x192
 [<ffffffff811313e0>] do_mmu_notifier_register+0xbc/0x192
 [<ffffffff81131515>] mmu_notifier_register+0x26/0x3d
 [<ffffffffa022ceeb>] kvm_dev_ioctl+0x14a/0x2f7 [kvm]
 [<ffffffff81152a5f>] vfs_ioctl+0x31/0xaa
 [<ffffffff81153021>] do_vfs_ioctl+0x4aa/0x506
 [<ffffffff811530e2>] sys_ioctl+0x65/0x9c
 [<ffffffff81012f42>] system_call_fastpath+0x16/0x1b
kvm: emulating exchange as write

Is there some useful way of adjusting this at runtime via "sysctl" or "echo XXXX >/proc/sys/kernel/max_lock_depth"?
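A note on the runtime-tuning question (a hedged sketch, not part of the original thread): lockdep's MAX_LOCK_DEPTH is a compile-time constant in the kernel source, so it cannot be adjusted at runtime. Kernels built with CONFIG_RT_MUTEXES do expose a sysctl with a confusingly similar name, but it bounds rt-mutex deadlock-detection chain walks, not lockdep's held-lock table. A defensive probe:

```shell
# lockdep's MAX_LOCK_DEPTH is fixed at build time (include/linux/sched.h),
# so there is no sysctl for it.  The kernel.max_lock_depth sysctl, where
# present, is the rt-mutex deadlock-detection limit (default 1024) and
# is unrelated to the lockdep warning in this bug.
if [ -r /proc/sys/kernel/max_lock_depth ]; then
    cat /proc/sys/kernel/max_lock_depth
else
    echo "kernel.max_lock_depth sysctl not available"
fi
```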
Comment 5 Justin M. Forbes 2009-10-07 14:49:03 EDT
It appears that several attempts have been made at increasing MAX_LOCK_DEPTH, and they have met a lot of upstream resistance. I wonder whether there is a better solution for this.
Comment 6 Mark McLoughlin 2009-10-09 07:13:24 EDT
*** Bug 527680 has been marked as a duplicate of this bug. ***
Comment 7 Kyle McMartin 2009-10-20 14:33:07 EDT
Making MAX_LOCK_DEPTH one entry bigger means another ~110 bytes in task_struct... making it 48 bigger means task_struct grows by another full page (and a quarter)... this seems, ugh.

How deep does it need to be?
Comment 8 Bug Zapper 2010-03-15 08:40:23 EDT
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and the reason for this action are here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 9 Bug Zapper 2011-06-02 13:59:57 EDT
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 10 Bug Zapper 2011-06-27 10:15:12 EDT
Fedora 13 changed to end-of-life (EOL) status on 2011-06-25. Fedora 13 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.