Bug 999765 - Race condition in libvirt causes hang with qemu 1.6
Product: Virtualization Tools
Classification: Community
Component: libvirt
Assigned To: Eric Blake
Duplicates: 1028728
Reported: 2013-08-22 00:59 EDT by Joseph Wang
Modified: 2016-04-26 09:41 EDT
Doc Type: Bug Fix
Last Closed: 2016-04-10 10:06:34 EDT
Type: Bug
Description Joseph Wang 2013-08-22 00:59:04 EDT
Description of problem:

On my setup, libvirtd hangs while computing the capabilities for qemu. This is due to a race condition in libvirt's virCommandRun. libvirt starts the qemu backends in daemonize mode.

If the qemu backend exits before libvirtd reaches the poll() call in virCommandProcessIO, the sockets are connected to a zombie process and libvirtd hangs.

The stack trace is as follows:

Breakpoint 3, virCommandProcessIO (cmd=cmd@entry=0x7fffd4103f20)
    at util/vircommand.c:1884
1884	        if (poll(fds, nfds, -1) < 0) {
(gdb) where
#0  virCommandProcessIO (cmd=cmd@entry=0x7fffd4103f20)
    at util/vircommand.c:1884
#1  0x00007ffff753cb32 in virCommandRun (cmd=cmd@entry=0x7fffd4103f20, 
    exitstatus=exitstatus@entry=0x7fffdb87a120) at util/vircommand.c:2100
#2  0x00007fffdd7f0a40 in virQEMUCapsInitQMP (runGid=0, runUid=0, 
    libDir=<optimized out>, qemuCaps=0x7fffd4088910)
    at qemu/qemu_capabilities.c:2529
#3  virQEMUCapsNewForBinary (
    binary=binary@entry=0x7fffd40acb30 "/usr/bin/qemu-system-cris", 
    libDir=<optimized out>, runUid=0, runGid=0)
    at qemu/qemu_capabilities.c:2677
#4  0x00007fffdd7f246b in virQEMUCapsCacheLookup (
    binary=0x7fffd40acb30 "/usr/bin/qemu-system-cris")
    at qemu/qemu_capabilities.c:2763
#5  0x00007fffdd7f2961 in virQEMUCapsInitGuest (guestarch=VIR_ARCH_CRIS, 
    hostarch=VIR_ARCH_X86_64, cache=0x7fffd40aca30, caps=0x7fffd40acde0)
    at qemu/qemu_capabilities.c:685
#6  virQEMUCapsInit (cache=0x7fffd40aca30) at qemu/qemu_capabilities.c:905
#7  0x00007fffdd8202bb in virQEMUDriverCreateCapabilities (
    driver=driver@entry=0x7fffd40a7240) at qemu/qemu_conf.c:569
#8  0x00007fffdd852fe4 in qemuStateInitialize (privileged=<optimized out>, 
    callback=<optimized out>, opaque=<optimized out>) at qemu/qemu_driver.c:748

How reproducible:

On my machine libvirtd locks up consistently with qemu 1.6 but works with qemu 1.5. However, since this is a race condition, it is likely to behave differently on different machines.

The solution is to check if the process is a zombie before attempting to poll its sockets.
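
A minimal sketch of that idea, assuming a plain POSIX parent/child setup rather than libvirt's actual code. The helper names child_has_exited and poll_unless_child_dead are hypothetical and are not part of vircommand.c. The sketch checks the child with waitpid(..., WNOHANG) before each poll() and uses a finite timeout in place of the -1 seen in the trace, so an exited child is noticed instead of being waited on forever.

```c
#include <poll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not libvirt API): returns 1 if the child has
 * already exited, 0 if it is still running, -1 on error.
 * A successful waitpid() reaps the zombie, so the exit status is
 * stored in *status for the caller. */
static int child_has_exited(pid_t pid, int *status)
{
    pid_t ret = waitpid(pid, status, WNOHANG);

    if (ret < 0)
        return -1;
    return ret == pid ? 1 : 0;
}

/* Hypothetical helper: wait for activity on fd, but stop as soon as
 * the child is found to have exited.
 * Returns 0 if fd is ready, 1 if the child exited, -1 on error. */
static int poll_unless_child_dead(pid_t pid, int fd, int *status)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    for (;;) {
        int exited = child_has_exited(pid, status);
        if (exited != 0)
            return exited;

        /* Assumption: a 100 ms timeout, chosen only for illustration,
         * so the child is re-checked periodically. */
        int n = poll(&pfd, 1, 100);
        if (n < 0)
            return -1;
        if (n > 0)
            return 0;
    }
}
```

Whether this is the right shape for libvirt depends on how the daemonized qemu process relates to the pid that virCommandRun tracks, which this sketch does not model.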
Comment 1 Cole Robinson 2016-04-10 10:06:34 EDT
Sorry this didn't receive a timely response. I recall a fix for this going into libvirt around the time of this bug, but I can't find the commit. Given the age of this bug I'm closing it as DEFERRED, but if anyone still hits similar issues with recent libvirt and qemu, please reopen.
Comment 2 Cole Robinson 2016-04-10 11:20:39 EDT
*** Bug 1028728 has been marked as a duplicate of this bug. ***
