Bug 858611 - Libvirt deadlock when restarting libvirtd.
Product: Virtualization Tools
Classification: Community
Component: libvirt
Hardware: x86_64 Linux
Version: unspecified
Severity: low
Assigned To: Libvirt Maintainers
Reported: 2012-09-19 04:30 EDT by guozhonghua
Modified: 2012-09-25 09:48 EDT

Doc Type: Bug Fix
Last Closed: 2012-09-25 09:48:40 EDT
Type: Bug

Attachments: None
Description guozhonghua 2012-09-19 04:30:23 EDT
Description of problem:

Libvirt deadlocks when libvirtd is restarted.

Version-Release number of selected component (if applicable):

libvirtd version 0.9.10, qemu-kvm version 1.0.

How reproducible:
I run a shell script with a loop that queries virsh domstate. There are about 50 domains.
When I restart libvirtd, it often deadlocks.

Steps to Reproduce:
1. Run 50 domains;
2. Run the shell script that loops querying virsh domstate;
3. Restart the libvirtd service.
Actual results:

A deadlock appears after libvirtd is restarted.

Expected results:
No deadlock. 

Additional info:

Thread 1 stack:

#0  0x00007ff13603d89c in __lll_lock_wait () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007ff136039065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007ff136038eba in pthread_mutex_lock () from /lib/x86_64-linux-gnu/libpthread.so.0
#3  0x00007ff136efc702 in virMutexLock (m=0x7ff1280e04a0, func=0x5278b0 "qemuDriverLock", line=61) at util/threads-pthread.c:87
#4  0x00000000004aa6a6 in qemuDriverLock (driver=0x7ff1280e04a0, func=0x51ea7f "qemudClose", line=921) at qemu/qemu_conf.c:61
#5  0x0000000000458e0a in qemudClose (conn=0x7ff04c0230f0) at qemu/qemu_driver.c:921
#6  0x00007ff136f7227f in virReleaseConnect (conn=0x7ff04c0230f0) at datatypes.c:114
#7  0x00007ff136f7240d in virUnrefConnect (conn=0x7ff04c0230f0) at datatypes.c:149
#8  0x00007ff136f7bcbf in virConnectClose (conn=0x7ff04c0230f0) at libvirt.c:1471
#9  0x000000000043fdc2 in remoteClientFreeFunc (data=0x1a4b4a0) at remote.c:547
#10 0x00007ff136fd6635 in virNetServerClientFree (client=0x1877df0) at rpc/virnetserverclient.c:601
#11 0x00007ff136fd5691 in virNetServerClientEventFree (opaque=0x1877df0) at rpc/virnetserverclient.c:175
#12 0x00007ff136fe00c7 in virNetSocketEventFree (opaque=0x1a46e50) at rpc/virnetsocket.c:1329
#13 0x00007ff136ee4884 in virEventPollCleanupHandles () at util/event_poll.c:572
#14 0x00007ff136ee4a5a in virEventPollRunOnce () at util/event_poll.c:608
#15 0x00007ff136ee2def in virEventRunDefaultImpl () at util/event.c:247
#16 0x00007ff136fd4d17 in virNetServerRun (srv=0x186dcf0) at rpc/virnetserver.c:736
#17 0x0000000000420763 in main (argc=2, argv=0x7fffa1284e18) at libvirtd.c:1602 

Thread 2 stack:

#0  0x00007f7ed7c20d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1  0x00007f7ed8ae2885 in virCondWait (c=0x7f7eb00022b0, m=0x7f7eb0002250) at util/threads-pthread.c:121
#2  0x00000000004c2bbf in qemuMonitorSend (mon=0x7f7eb0002250, msg=0x7f7e6e7fba70) at qemu/qemu_monitor.c:794
#3  0x00000000004d36b2 in qemuMonitorJSONCommandWithFd (mon=0x7f7eb0002250, cmd=0x7f7eb00036d0, scm_fd=-1, reply=0x7f7e6e7fbb50) at qemu/qemu_monitor_json.c:230
#4  0x00000000004d37e2 in qemuMonitorJSONCommand (mon=0x7f7eb0002250, cmd=0x7f7eb00036d0, reply=0x7f7e6e7fbb50) at qemu/qemu_monitor_json.c:259
#5  0x00000000004d69be in qemuMonitorJSONGetBlockInfo (mon=0x7f7eb0002250, table=0x7f7eb00036f0) at qemu/qemu_monitor_json.c:1373
#6  0x00000000004c45ce in qemuMonitorGetBlockInfo (mon=0x7f7eb0002250) at qemu/qemu_monitor.c:1256
#7  0x00000000004a29de in qemuDomainCheckEjectableMedia (driver=0x7f7ec80102b0, vm=0x7f7ec802ab30) at qemu/qemu_hotplug.c:164
#8  0x00000000004b44d3 in qemuProcessReconnect (opaque=0x7f7ec815fc60) at qemu/qemu_process.c:2932
#9  0x00007f7ed8ae2a5f in virThreadHelper (data=0x7f7ec8103a10) at util/threads-pthread.c:165
#10 0x00007f7ed7c1ce9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#11 0x00007f7ed794a4bd in clone () from /lib/x86_64-linux-gnu/libc.so.6

I reviewed the code.

Thread 2 is blocked in qemuMonitorSend:

    while (!mon->msg->finished) {
        if (virCondWait(&mon->notify, &mon->lock) < 0) {

Thread 2 has called qemuDriverLock, sent a command to qemu, and is waiting for thread 1 to poll the monitor for the reply.
Thread 1 calls virEventPollCleanupHandles, which eventually calls qemudClose, which in turn calls qemuDriverLock.
So thread 2 waits for thread 1 to poll, while thread 1 waits for thread 2 to call qemuDriverUnlock.
That is how the deadlock happens.
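
Here is a minimal standalone sketch of that pattern. All names in it (driver_lock, mon_lock, mon_notify, reconnect_worker, event_loop) are hypothetical, not the actual libvirt code; it only shows the shape of the two-thread wait:

/*
 * Minimal sketch of the deadlock pattern described above.
 * Hypothetical names; not the actual libvirt code.
 */
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

static pthread_mutex_t driver_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mon_lock    = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  mon_notify  = PTHREAD_COND_INITIALIZER;
static bool reply_finished = false;

/* Thread 2: reconnect worker. Holds driver_lock, then blocks waiting
 * for a monitor reply that only the event loop can deliver. */
static void *reconnect_worker(void *arg)
{
    pthread_mutex_lock(&driver_lock);
    pthread_mutex_lock(&mon_lock);
    /* ... send command to qemu ... */
    while (!reply_finished)                  /* never becomes true */
        pthread_cond_wait(&mon_notify, &mon_lock);
    pthread_mutex_unlock(&mon_lock);
    pthread_mutex_unlock(&driver_lock);
    return arg;
}

/* Thread 1: event loop. During handle cleanup it reaches a close
 * callback that needs driver_lock, so it never returns to polling
 * the monitor fd and never signals mon_notify. */
static void *event_loop(void *arg)
{
    pthread_mutex_lock(&driver_lock);        /* blocks forever */
    /* ... close the connection, resume polling ... */
    pthread_mutex_unlock(&driver_lock);
    return arg;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t2, NULL, reconnect_worker, NULL);
    sleep(1);                  /* let the worker grab driver_lock first */
    pthread_create(&t1, NULL, event_loop, NULL);
    pthread_join(t2, NULL);    /* never returns: the two threads deadlock */
    return 0;
}

If this sketch is compiled with -pthread and run, it hangs in pthread_join, which matches the two stacks above.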

I would be glad to receive any answers, thanks.
Comment 1 Dave Allan 2012-09-21 09:52:24 EDT
Is this still reproducible with the current git head?
Comment 2 guozhonghua 2012-09-25 04:36:22 EDT
Thank you.

I tested with 0.10.2 R2; the issue cannot be reproduced.

The code has been corrected:

int qemuDomainCheckEjectableMedia(struct qemud_driver *driver,
                                  virDomainObjPtr vm,
                                  enum qemuDomainAsyncJob asyncJob)
{
    ...
    if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) == 0) {
        table = qemuMonitorGetBlockInfo(priv->mon);
        qemuDomainObjExitMonitorWithDriver(driver, vm);
    ...

Using qemuDomainObjEnterMonitorAsync instead of qemuDomainObjEnterMonitor releases the driver lock before waiting on the monitor, which avoids the deadlock with virEventPollCleanupHandles.
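
Continuing the hypothetical sketch above (again not the real libvirt code, only the shape of the fix), dropping the driver lock before the condition wait lets the event loop acquire it, finish its cleanup and keep polling:

/* Sketch of the fixed ordering (hypothetical names, see above): the
 * worker releases driver_lock before waiting, so thread 1 can lock it,
 * run its close callback, return to polling and signal mon_notify. */
static void *reconnect_worker_fixed(void *arg)
{
    pthread_mutex_lock(&driver_lock);
    pthread_mutex_lock(&mon_lock);
    pthread_mutex_unlock(&driver_lock);      /* "enter monitor async" */

    /* ... send command to qemu ... */
    while (!reply_finished)
        pthread_cond_wait(&mon_notify, &mon_lock);
    pthread_mutex_unlock(&mon_lock);

    pthread_mutex_lock(&driver_lock);        /* re-acquire afterwards */
    /* ... continue reconnect work ... */
    pthread_mutex_unlock(&driver_lock);
    return arg;
}

With this ordering, thread 1 can take driver_lock, close the stale connection, go back to the poll loop and eventually signal mon_notify, so thread 2 wakes up.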

That may be the fix for it.

Thanks a lot.
Comment 3 Dave Allan 2012-09-25 09:48:40 EDT
(In reply to comment #2)
> I tested with 0.10.2 R2; the issue cannot be reproduced.
> The code has been corrected.

Ok, I will close the BZ; thanks for reporting it.

