Bug 523121 - [RHEL6 Xen]: kernel backtrace: possible recursive locking detected on Xen domU
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Xen Maintenance List
QA Contact: Martin Jenner
Depends On: 521800
Blocks:
Reported: 2009-09-14 04:13 EDT by Chris Lalancette
Modified: 2011-01-05 05:10 EST
CC List: 12 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 521800
Environment:
Last Closed: 2010-01-15 06:30:53 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Chris Lalancette 2009-09-14 04:13:10 EDT
+++ This bug was initially created as a clone of Bug #521800 +++

Description of problem:
During Xen domU boot there is a backtrace related to recursive locking; I copied and pasted it below from the domU dmesg.

Version-Release number of selected component (if applicable):
2.6.31-0.203.rc8.git2.fc12.i686.PAE

How reproducible:
Always.

Steps to Reproduce:
1. Boot the rawhide kernel as a Xen domU.
2. The domU appears to work, but the backtrace shows up in dmesg.

Actual results:
recursive locking related backtrace during kernel startup

Expected results:
No backtrace / locking errors.

Additional info:
This happens on a Fedora 11 Xen dom0, using xen-3.4.1-3 (rebuilt for F11), a pv_ops dom0 kernel, and libvirt from F11 updates-testing.

paste from domU dmesg:

Write protecting the kernel text: 4352k
Write protecting the kernel read-only data: 1800k

=============================================
[ INFO: possible recursive locking detected ]
2.6.31-0.203.rc8.git2.fc12.i686.PAE #1
---------------------------------------------
init/1 is trying to acquire lock:
 (&input_pool.lock){+.+...}, at: [<c043b30e>] __wake_up+0x2b/0x61

but task is already holding lock:
 (&input_pool.lock){+.+...}, at: [<c068e21b>] account+0x30/0xf0

other info that might help us debug this:
2 locks held by init/1:
 #0:  (&p->cred_guard_mutex){+.+.+.}, at: [<c0508756>] do_execve+0xa4/0x2ee
 #1:  (&input_pool.lock){+.+...}, at: [<c068e21b>] account+0x30/0xf0

stack backtrace:
Pid: 1, comm: init Not tainted 2.6.31-0.203.rc8.git2.fc12.i686.PAE #1
Call Trace:
 [<c08387c0>] ? printk+0x22/0x3a
 [<c0478b59>] __lock_acquire+0x7e9/0xb25
 [<c0478f4c>] lock_acquire+0xb7/0xeb
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c083b4f7>] _spin_lock_irqsave+0x45/0x89
 [<c043b30e>] ? __wake_up+0x2b/0x61
 [<c043b30e>] __wake_up+0x2b/0x61
 [<c068e2a0>] account+0xb5/0xf0
 [<c068e3ef>] extract_entropy+0x3e/0xac
 [<c0406b0b>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c04799d7>] ? lock_release+0x186/0x19f
 [<c068e56e>] get_random_bytes+0x29/0x3e
 [<c053bbd1>] load_elf_binary+0xab9/0x106c
 [<c050732d>] search_binary_handler+0xd7/0x27b
 [<c053b118>] ? load_elf_binary+0x0/0x106c
 [<c0539c76>] load_script+0x1a6/0x1c8
 [<c0507323>] ? search_binary_handler+0xcd/0x27b
 [<c0406199>] ? xen_force_evtchn_callback+0x1d/0x34
 [<c0507323>] ? search_binary_handler+0xcd/0x27b
 [<c0406b14>] ? check_events+0x8/0xc
 [<c0406b0b>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c04799d7>] ? lock_release+0x186/0x19f
 [<c050732d>] search_binary_handler+0xd7/0x27b
 [<c0539ad0>] ? load_script+0x0/0x1c8
 [<c050888b>] do_execve+0x1d9/0x2ee
 [<c0408359>] sys_execve+0x39/0x6e
 [<c0409ad0>] syscall_call+0x7/0xb
 [<c04f00d8>] ? sys_swapon+0x348/0xa98
 [<c040d76b>] ? kernel_execve+0x27/0x3e
 [<c04031e0>] ? run_init_process+0x2b/0x3e
 [<c0403275>] ? init_post+0x82/0xe9
 [<c0a9b566>] ? kernel_init+0x1f6/0x211
 [<c0a9b370>] ? kernel_init+0x0/0x211
 [<c040a6bf>] ? kernel_thread_helper+0x7/0x10  

-------
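
For context on what lockdep is reporting here: lockdep tracks locks by class rather than by individual lock object, and for spin_lock_init() the class is keyed by the call site, so two distinct spinlocks initialised through the same helper end up in the same class. Nesting two locks of one class without a nesting annotation produces exactly this kind of "possible recursive locking detected" report, even when no real deadlock is possible between the two objects. The module below is only a minimal sketch that reproduces the shape of the report on a kernel built with CONFIG_PROVE_LOCKING; the names (lock_a, lock_b, init_one, demo_init) are hypothetical, and this is not the drivers/char/random.c code nor a root-cause diagnosis of this bug.

/*
 * Minimal, hypothetical test module: two distinct spinlocks initialised at
 * the same spin_lock_init() call site share one lockdep class, so nesting
 * them triggers "possible recursive locking detected" under
 * CONFIG_PROVE_LOCKING even though they are different lock objects.
 */
#include <linux/module.h>
#include <linux/spinlock.h>

static spinlock_t lock_a;
static spinlock_t lock_b;

static void init_one(spinlock_t *lock)
{
	spin_lock_init(lock);	/* single call site => single lockdep class */
}

static int __init demo_init(void)
{
	unsigned long flags;

	init_one(&lock_a);
	init_one(&lock_b);

	spin_lock_irqsave(&lock_a, flags);
	spin_lock(&lock_b);		/* same class nested => lockdep report */
	spin_unlock(&lock_b);
	spin_unlock_irqrestore(&lock_a, flags);

	return 0;
}

static void __exit demo_exit(void)
{
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");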

--- Additional comment from jeremy@goop.org on 2009-09-11 13:03:14 EDT ---

Does this happen on native?  At first glance, I don't see anything Xen-specific there.

--- Additional comment from jeremy@goop.org on 2009-09-11 13:04:07 EDT ---

Does this kernel have CONFIG_PARAVIRT_SPINLOCKS enabled?

--- Additional comment from jforbes@redhat.com on 2009-09-11 17:01:38 EDT ---

No, CONFIG_PARAVIRT_SPINLOCKS is not set.

--- Additional comment from pasik@iki.fi on 2009-09-13 07:10:46 EDT ---

I just tried updating the rawhide domU to the latest packages. Here's the backtrace from 2.6.31-2.fc12.i686.PAE:

I'll try upgrading the dom0 to rawhide to check whether this also happens on bare metal.


dracut: dracut-001-9.git6f0e469d.fc12
udev: starting version 145
udevadm used greatest stack depth: 6072 bytes left

=============================================
[ INFO: possible recursive locking detected ]
2.6.31-2.fc12.i686.PAE #1
---------------------------------------------
udevd/68 is trying to acquire lock:
 (&input_pool.lock){+.+...}, at: [<c043b31e>] __wake_up+0x2b/0x61

but task is already holding lock:
 (&input_pool.lock){+.+...}, at: [<c068e3e7>] account+0x30/0xf0

other info that might help us debug this:
2 locks held by udevd/68:
 #0:  (nl_table_lock){.+.+..}, at: [<c07b4f8e>] netlink_table_grab+0x21/0xb8
 #1:  (&input_pool.lock){+.+...}, at: [<c068e3e7>] account+0x30/0xf0

stack backtrace:
Pid: 68, comm: udevd Not tainted 2.6.31-2.fc12.i686.PAE #1
Call Trace:
 [<c0838c08>] ? printk+0x22/0x3a
 [<c0478b69>] __lock_acquire+0x7e9/0xb25
 [<c0478f5c>] lock_acquire+0xb7/0xeb
 [<c043b31e>] ? __wake_up+0x2b/0x61
 [<c043b31e>] ? __wake_up+0x2b/0x61
 [<c083b93f>] _spin_lock_irqsave+0x45/0x89
 [<c043b31e>] ? __wake_up+0x2b/0x61
 [<c043b31e>] __wake_up+0x2b/0x61
 [<c068e46c>] account+0xb5/0xf0
 [<c068e5bb>] extract_entropy+0x3e/0xac
 [<c07b578c>] ? nl_pid_hash_zalloc+0x27/0x52
 [<c068e73a>] get_random_bytes+0x29/0x3e
 [<c07b5987>] nl_pid_hash_rehash+0x71/0xed
 [<c07b5a9f>] netlink_insert+0x9c/0x123
 [<c07b5bd0>] netlink_autobind+0xaa/0xce
 [<c07b5d34>] netlink_bind+0x8d/0x164
 [<c078bed6>] sys_bind+0x7e/0xb4
 [<c0476219>] ? lock_release_holdtime+0x39/0x143
 [<c0479762>] ? lock_release_non_nested+0xb6/0x1b5
 [<c0406b0b>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c0478f74>] ? lock_acquire+0xcf/0xeb
 [<c0406199>] ? xen_force_evtchn_callback+0x1d/0x34
 [<c04df3c0>] ? might_fault+0x56/0xa4
 [<c0406b14>] ? check_events+0x8/0xc
 [<c0406b0b>] ? xen_restore_fl_direct_end+0x0/0x1
 [<c04799e7>] ? lock_release+0x186/0x19f
 [<c04df3fb>] ? might_fault+0x91/0xa4
 [<c078c713>] sys_socketcall+0x8f/0x1a6
 [<c0409ad0>] syscall_call+0x7/0xb
dracut: Starting plymouth daemon
blkfront: xvda: barriers enabled
 xvda: xvda1 xvda2
blkid used greatest stack depth: 5968 bytes left

--- Additional comment from pasik@iki.fi on 2009-09-13 11:07:27 EDT ---

I just upgraded the host to rawhide/F12 and booted 2.6.31-2.fc12.i686.PAE on bare metal. There is no backtrace on bare metal, so this only happens on a Xen domU.
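
Both traces show the same inner nesting once get_random_bytes() is reached, regardless of what leads up to it (ELF loading via execve in the first trace, netlink autobind in the second): account() takes the pool lock with interrupts saved and, while still holding it, calls into __wake_up(), and the report names the spinlock acquired inside __wake_up() with the same &input_pool.lock class. The fragment below is only a rough sketch of that shape as read from the stack frames above, with hypothetical names (pool_sketch, account_sketch); it is not the actual drivers/char/random.c source and not a diagnosis of why the report appears only under Xen.

/*
 * Sketch of the nesting visible in both backtraces (hypothetical types and
 * names, not the real entropy-pool code): a wake-up is issued while the
 * pool's spinlock is still held.
 */
#include <linux/spinlock.h>
#include <linux/wait.h>

struct pool_sketch {
	spinlock_t lock;		/* corresponds to &input_pool.lock above */
	wait_queue_head_t writers;	/* wait queue woken from the accounting path */
};

static void pool_sketch_init(struct pool_sketch *r)
{
	spin_lock_init(&r->lock);
	init_waitqueue_head(&r->writers);
}

static size_t account_sketch(struct pool_sketch *r, size_t nbytes)
{
	unsigned long flags;

	spin_lock_irqsave(&r->lock, flags);	/* the lock already held in the report */
	/* ... entropy accounting ... */
	wake_up_interruptible(&r->writers);	/* __wake_up() grabs the wait queue
						 * head's own spinlock here; the
						 * reports above attribute that
						 * inner lock to the same class */
	spin_unlock_irqrestore(&r->lock, flags);

	return nbytes;
}
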
Comment 1 RHEL Product and Program Management 2009-09-14 04:19:47 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 2 Pasi Karkkainen 2010-01-15 05:25:50 EST
This bug is fixed in 2.6.31.5-127.fc12.i686.PAE and newer kernels. The original bug is closed already.
Comment 3 Chris Lalancette 2010-07-19 09:31:54 EDT
Clearing out old flags for reporting purposes.

Chris Lalancette
