Description of problem:
libvirtd hangs while determining the QEMU capabilities.

Version-Release number of selected component (if applicable):
Versions greater than 1.0.1; the current version is 1.1.4.

How reproducible:
Run libvirtd.

Steps to Reproduce:
1. Build libvirtd from source code:

./configure --prefix=/usr --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc --localstatedir=/var/lib --libdir=/usr/lib64 --disable-silent-rules --disable-dependency-tracking --with-libvirtd --with-avahi --without-xen --without-xen-inotify --without-xenapi --without-libxl --without-openvz --with-lxc --without-vbox --with-uml --with-qemu --with-yajl --with-phyp --with-esx --with-vmware --with-network --with-storage-fs --with-storage-lvm --with-storage-iscsi --with-storage-disk --with-storage-mpath --without-storage-rbd --without-numactl --without-numad --without-selinux --with-fuse --with-udev --with-capng --without-polkit --with-sasl --with-macvtap --with-libpcap --with-virtualport --without-firewalld --enable-nls --with-python --with-qemu-user=qemu --with-qemu-group=qemu --with-audit --without-netcf --without-hal --without-sanlock --with-init-script=systemd --disable-static --docdir=/usr/share/doc/libvirt-1.1.4 --with-remote --localstatedir=/var

2. Run libvirtd.

Workaround: apply the following patch:

--- libvirt-1.1.4.orig/src/util/vircommand.c 2013-10-29 16:27:14.000000000 +0800
+++ libvirt-1.1.4/src/util/vircommand.c 2013-11-10 10:47:02.967066588 +0800
@@ -1906,7 +1906,7 @@
             nfds++;
         }
 
-        if (nfds == 0)
+        if (nfds <= 1)
             break;
 
         if (poll(fds, nfds, -1) < 0)
Another workaround:

--- libvirt-1.0.5.orig/src/qemu/qemu_capabilities.c 2013-04-26 14:17:40.000000000 +0800
+++ libvirt-1.0.5/src/qemu/qemu_capabilities.c 2013-05-19 17:52:23.712369142 +0800
@@ -2452,16 +2452,19 @@ virQEMUCapsInitQMP(virQEMUCapsPtr qemuCa
                                "-M", "none",
                                "-qmp", monarg,
                                "-pidfile", pidfile,
-                               "-daemonize",
                                NULL);
     virCommandAddEnvPassCommon(cmd);
     virCommandClearCaps(cmd);
     virCommandSetGID(cmd, runGid);
     virCommandSetUID(cmd, runUid);
+    virCommandSetPidFile(cmd, pidfile);
+    virCommandDaemonize(cmd);
 
     if (virCommandRun(cmd, &status) < 0)
         goto cleanup;
Neither of the two patches makes sense. Which version of QEMU are you using, and could you attach debug logs from libvirtd?
If you post your proposed patches upstream on libvir-list, you will get a wider review from developers more familiar with the real root cause you are trying to solve.
I'm using QEMU 1.6.1, but this issue can be reproduced with previous versions (e.g. 1.4.x, 1.5.x).

QEMU was built with the following configuration (** --target-list=,x86_64-linux-user,arm-linux-user **):

./configure --cc=x86_64-pc-linux-gnu-gcc --host-cc=x86_64-pc-linux-gnu-gcc --prefix=/usr --sysconfdir=/etc --libdir=/usr/lib64 --docdir=/usr/share/doc/qemu-1.6.1/html --disable-bsd-user --disable-guest-agent --disable-strip --disable-werror --python=/usr/bin/python2.7 --enable-linux-user --disable-system --target-list=,x86_64-linux-user,arm-linux-user --disable-blobs --disable-bluez --disable-curses --disable-kvm --disable-libiscsi --disable-glusterfs --enable-seccomp --disable-sdl --disable-smartcard-nss --disable-tools --disable-vde --disable-libssh2 --disable-libusb --disable-debug-info --disable-debug-tcg --enable-docs --enable-tcg-interpreter

When libvirtd starts, it hangs while determining the QEMU capabilities. You can see a zombie process (qemu-system-arm, PID 2375) and its child (the original child, PID 2378), which has not terminated. Because qemu-system-<arch> runs as a daemon, it forks itself and quits, but it is not wait()ed for by its parent (libvirtd). The child does not exit (and is not terminated) either.

While reviewing the source code and debugging the issue, I printed the value of nfds and found that it equals 1. I wonder whether only one fd (cmd->inpipe) is available.

I'll attach the libvirtd log shortly.
    for (;;) {
        size_t i;
        struct pollfd fds[3];
        int nfds = 0;

        if (cmd->inpipe != -1) {
            fds[nfds].fd = cmd->inpipe;
            fds[nfds].events = POLLOUT;
            fds[nfds].revents = 0;
            nfds++;
        }
        if (outfd != -1) {
            fds[nfds].fd = outfd;
            fds[nfds].events = POLLIN;
            fds[nfds].revents = 0;
            nfds++;
        }
        if (errfd != -1) {
            fds[nfds].fd = errfd;
            fds[nfds].events = POLLIN;
            fds[nfds].revents = 0;
            nfds++;
        }

        if (nfds == 0)
            break;

        if (poll(fds, nfds, -1) < 0) {

Workaround without any patch: kill 2378 (qemu-system-arm), after which libvirtd executes 2534 and 2536 (qemu-system-x86_64). Kill 2536, after which libvirtd executes 2612 and 2614 (qemu-kvm, which links to qemu-system-x86_64). Kill that one as well.

$ virsh list
##########** hangs *** #

$ ps -elfy ff | grep "libvirt\|qemu"
S root 2271 2987 0 80 0 3164 45251 poll_s 07:50 pts/1 0:00 | \_ sudo ./daemon/libvirtd -f /etc/libvirt/libvirtd.conf
S root 2273 2271 1 80 0 10196 114812 poll_s 07:50 pts/1 0:00 | \_ /home/sipingal/libvirt-1.1.4.orig/daemon/.libslibvirtd -f /etc/libvirt/libvirtd.conf
Z qemu 2375 2273 0 80 0 0 0 exit 07:50 pts/1 0:00 | \_ [qemu-system-arm] <defunct>
S sipingal 2429 32008 0 80 0 960 28093 pipe_w 07:50 pts/2 0:00 \_ grep --colour=auto libvirt\|qemu
S nobody 1884 1 0 80 0 952 31244 poll_s 07:19 ? 0:00 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf
S qemu 2378 1 0 80 0 5308 95085 poll_s 07:50 ? 0:00 /usr/bin/qemu-system-arm -S -no-user-config -nodefaults -nographic -M none -qmp unix:/var/lib/libvirt/qemu/capabilities.monitor.sock,server,nowait -pidfile /var/lib/libvirt/qemu/capabilities.pidfile -daemonize

$ sudo kill 2378
$ ps -elfy ff | grep "qemu"
Z qemu 2534 2273 0 80 0 0 0 exit 07:50 pts/1 0:00 | \_ [qemu-system-x86] <defunct>
S sipingal 2564 32008 0 80 0 972 28093 pipe_w 07:50 pts/2 0:00 \_ grep --colour=auto qemu
S qemu 2536 1 0 80 0 5068 95487 poll_s 07:50 ? 0:00 /usr/bin/qemu-system-x86_64 -S -no-user-config -nodefaults -nographic -M none -qmp unix:/var/lib/libvirt/qemu/capabilities.monitor.sock,server,nowait -pidfile /var/lib/libvirt/qemu/capabilities.pidfile -daemonize

$ sudo kill 2536
$ ps -elfy ff | grep "qemu"
Z qemu 2612 2273 1 80 0 0 0 exit 07:50 pts/1 0:00 | \_ [qemu-system-x86] <defunct>
S sipingal 2627 32008 0 80 0 968 28093 pipe_w 07:50 pts/2 0:00 \_ grep --colour=auto qemu
S qemu 2614 1 0 80 0 4744 79618 poll_s 07:50 ? 0:00 /usr/bin/qemu-system-x86_64 -machine accel=kvm -S -no-user-config -nodefaults -nographic -M none -qmp unix:/var/lib/libvirt/qemu/capabilities.monitor.sock,server,nowait -pidfile /var/lib/libvirt/qemu/capabilities.pidfile -daemonize

sipingal@spad ~ $ sudo kill 2614
sipingal@spad ~ $ ps -elfy ff | grep "qemu"
S sipingal 2675 32008 0 80 0 972 28093 pipe_w 07:51 pts/2 0:00 \_ grep --colour=auto qemu

FYI, my system is Gentoo:

$ eix -I qemu
[I] app-emulation/qemu
     Available versions: 1.4.2 (~)1.5.2-r1 (~)1.5.2-r2 1.5.3 (~)1.6.0 (~)1.6.0-r1 (~)1.6.1 **9999 {accessibility +aio alsa bluetooth +caps +curl debug (+)fdt +filecaps glusterfs gtk iscsi +jpeg mixemu ncurses opengl +png pulseaudio python rbd sasl sdl +seccomp selinux smartcard spice ssh static static-softmmu static-user systemtap tci test +threads tls usb usbredir +uuid vde +vhost-net virtfs +vnc xattr xen xfs KERNEL="FreeBSD linux" PYTHON_TARGETS="python2_6 python2_7" QEMU_SOFTMMU_TARGETS="alpha arm cris i386 lm32 m68k microblaze microblazeel mips mips64 mips64el mipsel moxie or32 ppc ppc64 ppcemb s390x sh4 sh4eb sparc sparc64 unicore32 x86_64 xtensa xtensaeb" QEMU_USER_TARGETS="alpha arm armeb cris i386 m68k microblaze microblazeel mips mips64 mips64el mipsel mipsn32 mipsn32el or32 ppc ppc64 ppc64abi32 s390x sh4 sh4eb sparc sparc32plus sparc64 unicore32 x86_64"}
     Installed versions: 1.6.1(06:22:15 PM 10/24/2013)(aio alsa bluetooth caps curl fdt filecaps gtk jpeg mixemu ncurses opengl png pulseaudio python rbd sasl sdl seccomp spice tci threads tls usbredir uuid vde vhost-net virtfs vnc xattr xfs -accessibility -debug -glusterfs -iscsi -selinux -smartcard -ssh -static -static-softmmu -static-user -systemtap -test -usb -xen KERNEL="linux -FreeBSD" PYTHON_TARGETS="python2_7 -python2_6" QEMU_SOFTMMU_TARGETS="arm x86_64 -alpha -cris -i386 -lm32 -m68k -microblaze -microblazeel -mips -mips64 -mips64el -mipsel -moxie -or32 -ppc -ppc64 -ppcemb -s390x -sh4 -sh4eb -sparc -sparc64 -unicore32 -xtensa -xtensaeb" QEMU_USER_TARGETS="arm x86_64 -alpha -armeb -cris -i386 -m68k -microblaze -microblazeel -mips -mips64 -mips64el -mipsel -mipsn32 -mipsn32el -or32 -ppc -ppc64 -ppc64abi32 -s390x -sh4 -sh4eb -sparc -sparc32plus -sparc64 -unicore32")
     Homepage: http://www.qemu.org http://www.linux-kvm.org
     Description: QEMU + Kernel-based Virtual Machine userland tools
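The nfds observation can be checked outside libvirt. The sketch below is not libvirt code: build_pollfds is a hypothetical helper that fills the array the same way as the loop quoted from vircommand.c, with -1 meaning "closed". It only shows that when a single descriptor is still open the count is 1, so the `nfds == 0` exit is not taken and the loop goes on to call poll() with no timeout.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only (not part of libvirt).
 * Fills fds[] in the same order as the quoted loop and returns the
 * number of entries used. A descriptor value of -1 means "closed". */
int build_pollfds(int inpipe, int outfd, int errfd, struct pollfd fds[3])
{
    int nfds = 0;

    assert(fds != NULL);

    if (inpipe != -1) {
        fds[nfds].fd = inpipe;
        fds[nfds].events = POLLOUT;   /* we want to write to the child's stdin */
        fds[nfds].revents = 0;
        nfds++;
    }
    if (outfd != -1) {
        fds[nfds].fd = outfd;
        fds[nfds].events = POLLIN;    /* we want to read the child's stdout */
        fds[nfds].revents = 0;
        nfds++;
    }
    if (errfd != -1) {
        fds[nfds].fd = errfd;
        fds[nfds].events = POLLIN;    /* we want to read the child's stderr */
        fds[nfds].revents = 0;
        nfds++;
    }
    return nfds;
}
```

With only one of the three descriptors open the helper returns 1, which matches the value printed during debugging. Which descriptor that is in the real hang is not established here.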
Created attachment 822695
libvirtd log
The second workaround avoids running the capability probe as a daemon, so virCommandRun does not execute it via virCommandRunAsync() but runs it as a foreground process. libvirtd can then capture its stdout/stderr, which changes nfds to 2.

    /*
     * We explicitly need to use -daemonize here, rather than
     * virCommandDaemonize, because we need to synchronize
     * with QEMU creating its monitor socket API. Using
     * daemonize guarantees control won't return to libvirt
     * until the socket is present.
     */
    cmd = virCommandNewArgList(qemuCaps->binary,
                               "-S",
                               "-no-user-config",
                               "-nodefaults",
                               "-nographic",
                               "-M", "none",
                               "-qmp", monarg,
                               "-pidfile", pidfile,
                               "-daemonize",
                               NULL);
    virCommandAddEnvPassCommon(cmd);
    virCommandClearCaps(cmd);
    virCommandSetGID(cmd, runGid);
    virCommandSetUID(cmd, runUid);

    if (virCommandRun(cmd, &status) < 0)
        goto cleanup;

src/util/vircommand.c

/**
 * virCommandRun:
 * @cmd: command to run
 * @exitstatus: optional status collection
 *
 * Run the command and wait for completion.
 * Returns -1 on any error executing the
 * command. Returns 0 if the command executed,
 * with the exit status set. If @exitstatus is NULL, then the
 * child must exit with status 0 for this to succeed.
 */
int
virCommandRun(virCommandPtr cmd, int *exitstatus)
{
-----------------8<-----------------------
    /* If caller hasn't requested capture of stdout/err, then capture
     * it ourselves so we can log it. But the intermediate child for
     * a daemon has no expected output, and we don't want our
     * capturing pipes passed on to the daemon grandchild.
     */
    if (!(cmd->flags & VIR_EXEC_DAEMON)) {
        if (!cmd->outfdptr) {
            cmd->outfdptr = &cmd->outfd;
            cmd->outbuf = &outbuf;
            string_io = true;
        }
        if (!cmd->errfdptr) {
            cmd->errfdptr = &cmd->errfd;
            cmd->errbuf = &errbuf;
            string_io = true;
        }
    }

    cmd->flags |= VIR_EXEC_RUN_SYNC;
    if (virCommandRunAsync(cmd, NULL) < 0) {
        cmd->has_error = -1;
        return -1;
    }
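The source comment quoted above mentions not wanting capturing pipes passed on to the daemon grandchild. As an illustration only, and as an assumption about one way a poll() loop with no timeout can block rather than a confirmed diagnosis of this bug, the sketch below shows the general effect on Linux: a process forks a child, the child forks a long-lived "daemon" and exits, and the read end of a pipe never reports EOF while the daemon still holds the write end. The function name pipe_held_by_grandchild is hypothetical and nothing here is libvirt or QEMU code.

```c
#include <assert.h>
#include <poll.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo (not libvirt code). Returns 1 if the pipe shows no
 * EOF within timeout_ms although the direct child has exited and been
 * reaped, and shows a hang-up only after the grandchild is killed.
 * Returns 0 if that pattern is not observed, -1 on error. */
int pipe_held_by_grandchild(int timeout_ms)
{
    int out[2], pidpipe[2];
    pid_t child, grandchild = -1;
    struct pollfd pfd;
    int before, after, hup;

    assert(timeout_ms >= 0);          /* a negative timeout would block forever */

    if (pipe(out) < 0)
        return -1;
    if (pipe(pidpipe) < 0) {
        close(out[0]);
        close(out[1]);
        return -1;
    }

    child = fork();
    if (child < 0) {
        close(out[0]); close(out[1]);
        close(pidpipe[0]); close(pidpipe[1]);
        return -1;
    }
    if (child == 0) {
        /* Intermediate child: start the "daemon" and exit at once. */
        pid_t gc;

        close(out[0]);
        close(pidpipe[0]);
        gc = fork();
        if (gc == 0) {
            close(pidpipe[1]);
            sleep(30);                /* daemon keeps out[1] open while alive */
            _exit(0);
        }
        /* Report the daemon's pid (or -1 on fork failure) to the parent. */
        if (write(pidpipe[1], &gc, sizeof(gc)) != (ssize_t)sizeof(gc))
            _exit(1);
        _exit(0);
    }

    /* Parent: drop our write ends, learn the daemon's pid, reap the child. */
    close(out[1]);
    close(pidpipe[1]);
    if (read(pidpipe[0], &grandchild, sizeof(grandchild))
        != (ssize_t)sizeof(grandchild))
        grandchild = -1;
    close(pidpipe[0]);
    waitpid(child, NULL, 0);
    if (grandchild <= 0) {
        close(out[0]);
        return -1;
    }

    pfd.fd = out[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    before = poll(&pfd, 1, timeout_ms);   /* no data and no EOF expected */

    kill(grandchild, SIGKILL);            /* last holder of the write end */
    pfd.revents = 0;
    after = poll(&pfd, 1, 5000);          /* hang-up expected now */
    hup = after > 0 && (pfd.revents & POLLHUP) != 0;

    close(out[0]);
    if (before < 0 || after < 0)
        return -1;
    return (before == 0 && hup) ? 1 : 0;
}
```

A caller would invoke it as pipe_held_by_grandchild(200). A finite timeout is used on purpose so the demo cannot hang the way an untimed poll() would.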
As Eric mentioned in comment 4, you really need to take this discussion to the libvirt list.
I'm pretty sure this is long since fixed.

*** This bug has been marked as a duplicate of bug 999765 ***