Red Hat Bugzilla – Bug 841486
[vdsm] super vdsm server leave child process defunct after create vm
Last modified: 2016-02-22 07:00:49 EST
Description of problem:
After creating a vm/localfs, I found a process in the defunct state.
It is a child process of the supervdsm server.
[lvroyce@localhost x86_64]$ ps -ef |grep qemu
qemu 20614 20305 0 14:45 ? 00:00:00 [python] <defunct>
qemu 20886 1 3 14:45 ? 00:00:17 /usr/bin/qemu-kvm -name vm6 -S -M pc-1.0 -cpu qemu64,-svm -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid f7c1b02f-b304-4e68-8565-3659b1214c40 -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=17-1,serial=0EA0A181-50B2-11CB-BBE1-DFC7A7500304_00:21:cc:62:a6:07,uuid=f7c1b02f-b304-4e68-8565-3659b1214c40 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm6.monitor,server,nowait -mon ......
[lvroyce@localhost x86_64]$ ps -ef |grep vdsm
vdsm 20244 1 0 14:44 ? 00:00:00 /bin/bash -e /usr/share/vdsm/respawn --minlifetime 10 --daemon --masterpid /var/run/vdsm/respawn.pid /usr/share/vdsm/vdsm
vdsm 20246 20244 0 14:44 ? 00:00:03 /usr/bin/python /usr/share/vdsm/vdsm
root 20304 20246 0 14:44 ? 00:00:00 /usr/bin/sudo -n /usr/bin/python /usr/share/vdsm/supervdsmServer.py d2b71523-ce00-4495-bfa2-bd214577a32c 20246
root 20305 20304 0 14:44 ? 00:00:00 /usr/bin/python /usr/share/vdsm/supervdsmServer.py d2b71523-ce00-4495-bfa2-bd214577a32c 20246
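For readers unfamiliar with the "<defunct>" state shown above, here is a minimal sketch of how a child ends up that way: it exits and its parent never calls wait on it. This is only an illustration, not vdsm code. The helper names proc_state and make_zombie are hypothetical, and reading /proc assumes Linux.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state of pid from /proc (Linux only)."""
    with open("/proc/%d/stat" % pid) as f:
        stat = f.read()
    # The state field follows the parenthesised command name.
    return stat.rsplit(")", 1)[1].split()[0]


def make_zombie():
    """Fork a child that exits at once and is not waited for."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running parent cleanup code.
        os._exit(0)
    # Parent: poll until the kernel marks the exited child as a zombie ("Z").
    deadline = time.time() + 5
    while proc_state(pid) != "Z" and time.time() < deadline:
        time.sleep(0.01)
    return pid
```

The zombie entry stays in the process table until the parent calls os.waitpid(pid, 0) or the parent itself exits.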
Version-Release number of selected component (if applicable):
[lvroyce@localhost x86_64]$ rpm -q libvirt
[lvroyce@localhost x86_64]$ rpm -q vdsm
Steps to Reproduce:
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.
Please reopen, as I have a defunct supervdsm process in oVirt 3.3.3.
See discussion on devel:
Starting to use zombiereaper may solve this issue, but there's another one in supervdsm: it uses multiprocessing, which uses subprocess.Popen, which is known to be buggy on python2.
Please consider monkey-patching
multiprocessing.process.Process._Popen = CPopen
Any progress on this problem?
I still see defunct supervdsmServer processes popping up.
The currently installed version is:
rpm -q vdsm
If you need any additional logs, please tell me.
is this just about replacing subprocess.Popen with CPopen?
It would be nice if someone could explain exactly what the problem is and
what prevents it from being fixed. If it's just a matter of time to crawl
through the vdsm code, I'm happy to assist.
(In reply to Sven Kieske from comment #4)
> is this just about replacing subprocess.Popen with CPopen?
No. That's an unrelated issue that I've noticed while reading the code. Note that my suggestion for consideration is wrong, as multiprocessing.forking.Popen does not have the same API as CPopen and subprocess.Popen.
I suppose that a properly-placed zombiereaper.autoReapPID(proc.pid) would take care of your zombies.
We have to revert the patch from the 3.5 branch, as it makes the much more annoying Bug 1168217 more evident.
Adding a dependency on Bug 1180864, whose fix allows zombiereaper to be used safely in supervdsmServer. Once the multiprocessing fix that handles SIGCHLD interrupts is backported to Python 2.6, we'll be able to merge http://gerrit.ovirt.org/#/c/28915/ back.
Will merge when RHEL 6.7 is out (see Bug 1180864). Moving to 3.6.
The former patches were reverted when we realized that Python still had the EINTR bug. They must be re-posted.
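For context on the EINTR problem: on Python 2, a blocking system call interrupted by a signal such as SIGCHLD fails with errno EINTR instead of being retried (Python 3.5 and later retry automatically, per PEP 475). The usual workaround is a retry loop like the sketch below. The helper name retry_on_eintr is hypothetical, and this is not the actual multiprocessing fix referenced in Bug 1180864.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func, retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as e:
            # Any error other than an interrupted system call is real.
            if e.errno != errno.EINTR:
                raise
```

For example, retry_on_eintr(os.read, fd, 1024) would repeat the read if a signal handler interrupted it, while still propagating any other error.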
Bug tickets that are moved to testing must have the target release set, so that the tester knows what to test. Please set the correct target release before moving to ON_QA.
Verified in vdsm-4.17.15-0.el7ev.noarch
oVirt 3.6.0 has been released, closing current release