| Summary: | libvirt deadlock after destroy & undefine domain | ||
|---|---|---|---|
| Product: | [Community] Virtualization Tools | Reporter: | Marc-Andre Lureau <marcandre.lureau> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| Status: | CLOSED NEXTRELEASE | QA Contact: | |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | unspecified | CC: | berrange, crobinso, dallan, xen-maint |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-05-25 07:35:30 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Michal, does the backtrace shed any light on what's going on here?

The problem is the virDomainEventStateFlush() method + free callbacks.
When dispatching callbacks it is careful to release the driver lock.
When purging deleted callbacks, though, we don't release the lock. So if the 'free callback' again uses the virDomainEventState, it will deadlock.

(In reply to comment #2)
> The problem is the virDomainEventStateFlush() method + free callbacks.
>
> When dispatching callbacks it is careful to release the driver lock.
>
> When purging deleted callbacks though, we don't release the lock. So if the
> 'free callback' again uses the virDomainEventState it will deadlock.

Will you submit a patch?

The patch has been proposed upstream: https://www.redhat.com/archives/libvir-list/2012-May/msg00995.html

Moving to POST:
commit 2cb0899eec72376629a0583647dcad39b00c5715
Author: Daniel P. Berrange <berrange>
AuthorDate: Mon May 21 12:10:53 2012 +0100
Commit: Daniel P. Berrange <berrange>
CommitDate: Mon May 21 18:50:47 2012 +0100
Fix potential events deadlock when unref'ing virConnectPtr
When the last reference to a virConnectPtr is released by
libvirtd, it was possible for a deadlock to occur in the
virDomainEventState functions. The virDomainEventStatePtr
holds a reference on virConnectPtr for each registered
callback. When removing a callback, the virUnrefConnect
function is run. If this causes the last reference on the
virConnectPtr to be released, then virReleaseConnect can
be run, which in turn calls qemudClose. This function has
a call to virDomainEventStateDeregisterConn which is intended
to remove all callbacks associated with the virConnectPtr
instance. This will try to grab a lock on virDomainEventState
but this lock is already held. Deadlock ensues
Thread 1 (Thread 0x7fcbb526a840 (LWP 23185)):
Since each callback associated with a virConnectPtr holds a
reference on virConnectPtr, it is impossible for the qemudClose
method to be invoked while any callbacks are still registered.
Thus the call to virDomainEventStateDeregisterConn must in fact
be a no-op. Thus it is possible to just remove all trace of
virDomainEventStateDeregisterConn and avoid the deadlock.
* src/conf/domain_event.c, src/conf/domain_event.h,
src/libvirt_private.syms: Delete virDomainEventStateDeregisterConn
* src/libxl/libxl_driver.c, src/lxc/lxc_driver.c,
src/qemu/qemu_driver.c, src/uml/uml_driver.c: Remove
calls to virDomainEventStateDeregisterConn
Oh, since this is reported against upstream, BZ hygiene calls for CLOSED NEXTRELEASE rather than POST. The referenced patch will be picked up by the next release (0.9.13). |
Description of problem:
libvirt deadlock'ed after I destroyed and undefined a domain, in qemu:///session.

Version-Release number of selected component (if applicable):
0.9.9

Not sure if the bug is reproducible.

Steps:
1. (optional) create a "live" f16 domain with the Boxes wizard
2. virsh -c qemu:///session destroy "Fedora 16"
3. virsh -c qemu:///session undefine "Fedora 16"

Actual results:

(gdb) thread apply all bt

Thread 11 (Thread 0x7f5257e1a700 (LWP 27214)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd8e0, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5970) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5a90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5257e1a700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 10 (Thread 0x7f5257619700 (LWP 27215)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd8e0, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5b00) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5b90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5257619700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 9 (Thread 0x7f5256e18700 (LWP 27216)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd8e0, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5970) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5a90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5256e18700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 8 (Thread 0x7f5256617700 (LWP 27217)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd8e0, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5b00) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5b90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5256617700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 7 (Thread 0x7f5255e16700 (LWP 27218)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd8e0, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5970) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5a90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5255e16700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 6 (Thread 0x7f5255615700 (LWP 27219)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd970, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5b00) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5b90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5255615700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 5 (Thread 0x7f5254e14700 (LWP 27220)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd970, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5970) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5a90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5254e14700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 4 (Thread 0x7f5254613700 (LWP 27221)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd970, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5b00) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5b90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5254613700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 3 (Thread 0x7f5253e12700 (LWP 27222)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd970, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5c20) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5a90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5253e12700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 2 (Thread 0x7f5253611700 (LWP 27223)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:165
#1 0x00007f525f5103b9 in virCondWait (c=0x24cd970, m=0x24cd8b8) at util/threads-pthread.c:117
#2 0x00007f525f5109c6 in virThreadPoolWorker (opaque=0x24c5b00) at util/threadpool.c:103
#3 0x00007f525f510593 in virThreadHelper (data=0x24c5b90) at util/threads-pthread.c:161
#4 0x0000003a17a07d90 in start_thread (arg=0x7f5253611700) at pthread_create.c:309
#5 0x0000003a176ef48d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:115

Thread 1 (Thread 0x7f525e688840 (LWP 27212)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1 0x0000003a17a09f97 in _L_lock_863 () from /lib64/libpthread.so.0
#2 0x0000003a17a09deb in __pthread_mutex_lock (mutex=0x7f524c08ab48) at pthread_mutex_lock.c:65
#3 0x00007f525f510277 in virMutexLock (m=0x7f524c08ab48) at util/threads-pthread.c:85
#4 0x00007f525f55afea in virDomainEventStateLock (state=0x7f524c08ab30) at conf/domain_event.c:591
#5 0x00007f525f55cebf in virDomainEventStateDeregisterConn (conn=0x7f5238053ba0, state=0x7f524c08ab30) at conf/domain_event.c:1510
#6 0x000000000045d13d in qemudClose (conn=0x7f5238053ba0) at qemu/qemu_driver.c:914
#7 0x00007f525f59ae53 in virReleaseConnect (conn=0x7f5238053ba0) at datatypes.c:114
#8 0x00007f525f59afc9 in virUnrefConnect (conn=0x7f5238053ba0) at datatypes.c:149
#9 0x00007f525f55a696 in virDomainEventCallbackListPurgeMarked (cbList=0x7f524c08d1a0) at conf/domain_event.c:347
#10 0x00007f525f55c9fb in virDomainEventStateFlush (state=0x7f524c08ab30) at conf/domain_event.c:1307
#11 0x00007f525f55b105 in virDomainEventTimer (timer=65, opaque=0x7f524c08ab30) at conf/domain_event.c:630
#12 0x00007f525f4f7821 in virEventPollDispatchTimeouts () at util/event_poll.c:440
#13 0x00007f525f4f83be in virEventPollRunOnce () at util/event_poll.c:633
#14 0x00007f525f4f64eb in virEventRunDefaultImpl () at util/event.c:247
#15 0x00007f525f60e021 in virNetServerRun (srv=0x24cd790) at rpc/virnetserver.c:736
#16 0x0000000000424234 in main (argc=1, argv=0x7fff11c45848) at libvirtd.c:1602
(gdb)