Bug 841486 - [vdsm] super vdsm server leave child process defunct after create vm
Status: CLOSED CURRENTRELEASE
Product: vdsm
Classification: oVirt
Component: General
Version: 4.14.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-3.6.0-rc
Target Release: 4.17.0
Assigned To: Oved Ourfali
QA Contact: Petr Kubica
Whiteboard: infra
Keywords: Reopened
Depends On: 1180864 1181624 1210347
Blocks:
Reported: 2012-07-19 02:59 EDT by Royce Lv
Modified: 2016-02-22 07:00 EST

Fixed In Version: 3.6.0-4
Doc Type: Bug Fix
Last Closed: 2016-01-13 09:40:24 EST
Type: Bug
oVirt Team: Infra
rule-engine: ovirt-3.6.0+
ylavi: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 39780 None None None Never

Description Royce Lv 2012-07-19 02:59:56 EDT
Description of problem:
After creating a vm/localfs, I found a process in the defunct state.
It is a child process of the supervdsm server.

[lvroyce@localhost x86_64]$ ps -ef |grep qemu
qemu     20614 20305  0 14:45 ?        00:00:00 [python] <defunct>

qemu     20886     1  3 14:45 ?        00:00:17 /usr/bin/qemu-kvm -name vm6 -S -M pc-1.0 -cpu qemu64,-svm -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid f7c1b02f-b304-4e68-8565-3659b1214c40 -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=17-1,serial=0EA0A181-50B2-11CB-BBE1-DFC7A7500304_00:21:cc:62:a6:07,uuid=f7c1b02f-b304-4e68-8565-3659b1214c40 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm6.monitor,server,nowait -mon ......


[lvroyce@localhost x86_64]$ ps -ef |grep vdsm
vdsm     20244     1  0 14:44 ?        00:00:00 /bin/bash -e /usr/share/vdsm/respawn --minlifetime 10 --daemon --masterpid /var/run/vdsm/respawn.pid /usr/share/vdsm/vdsm
vdsm     20246 20244  0 14:44 ?        00:00:03 /usr/bin/python /usr/share/vdsm/vdsm
root     20304 20246  0 14:44 ?        00:00:00 /usr/bin/sudo -n /usr/bin/python /usr/share/vdsm/supervdsmServer.py d2b71523-ce00-4495-bfa2-bd214577a32c 20246
root     20305 20304  0 14:44 ?        00:00:00 /usr/bin/python /usr/share/vdsm/supervdsmServer.py d2b71523-ce00-4495-bfa2-bd214577a32c 20246
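
A minimal sketch of how the listings above could be checked automatically: it parses `ps -ef`-style lines, picks out entries marked `<defunct>`, and reports each one's PID and parent PID. The function name `find_defunct` is hypothetical and not part of vdsm, and the code assumes the usual `ps -ef` column order (UID, PID, PPID, C, STIME, TTY, TIME, CMD).

```python
def find_defunct(ps_lines):
    """Return (pid, ppid) pairs for processes marked <defunct> in `ps -ef` output."""
    zombies = []
    for line in ps_lines:
        # Assumed column order: UID PID PPID C STIME TTY TIME CMD...
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue  # skip the header row and malformed lines
        if "<defunct>" in fields[7]:
            zombies.append((int(fields[1]), int(fields[2])))
    return zombies


# Two lines taken from the listings in this report.
sample = [
    "qemu     20614 20305  0 14:45 ?        00:00:00 [python] <defunct>",
    "root     20305 20304  0 14:44 ?        00:00:00 /usr/bin/python "
    "/usr/share/vdsm/supervdsmServer.py d2b71523-ce00-4495-bfa2-bd214577a32c 20246",
]
print(find_defunct(sample))
```

For the sample lines, the defunct process 20614 reports parent 20305, which is the supervdsmServer.py process in the second listing.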


Version-Release number of selected component (if applicable):
[lvroyce@localhost x86_64]$ rpm -q libvirt
libvirt-0.9.13-1.fc17.x86_64
[lvroyce@localhost x86_64]$ rpm -q vdsm
vdsm-4.10.0-0.185.gitb52165e.fc17.lvroyce1342680119.x86_64

Comment 1 Itamar Heim 2013-02-03 07:25:15 EST
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.
Comment 2 Sven Kieske 2014-05-09 11:54:22 EDT
Please reopen: I am seeing defunct supervdsm processes again in oVirt 3.3.3.

See discussion on devel:
http://lists.ovirt.org/pipermail/devel/2014-May/007289.html
Comment 3 Dan Kenigsberg 2014-05-14 08:25:44 EDT
Starting to use zombiereaper may solve this issue, but there is another problem in supervdsm: it uses multiprocessing, which uses subprocess.Popen, which is known to be buggy on Python 2.

Please consider monkey-patching

  multiprocessing.process.Process._Popen = CPopen

before use.
Comment 4 Sven Kieske 2014-06-18 04:21:30 EDT
Any progress on this problem?
I still see defunct supervdsmServer processes popping up.

The currently installed version is:
rpm -q vdsm
vdsm-4.13.3-3.el6.x86_64

If you need any additional logs, please tell me.

is this just about replacing subprocess.Popen with CPopen?

It would be nice if someone could explain exactly what prevents this from
getting fixed. If it just takes time to crawl through the vdsm code,
I'm happy to assist.
Comment 5 Dan Kenigsberg 2014-06-21 09:58:13 EDT
(In reply to Sven Kieske from comment #4)

> is this just about replacing subprocess.Popen with CPopen?

No. That's an unrelated issue that I've noticed while reading the code. Note that my suggestion for consideration is wrong, as multiprocessing.forking.Popen does not have the same API as CPopen and subprocess.Popen.

I suppose that a properly-placed zombiereaper.autoReapPID(proc.pid) would take care of your zombies.
Comment 6 Dan Kenigsberg 2014-12-02 10:25:57 EST
We have to revert the patch from the 3.5 branch, as it makes the much more annoying Bug 1168217 more evident.
Comment 7 Yaniv Bronhaim 2015-01-12 06:33:32 EST
Adding a dependency on Bug 1180864, whose fix allows zombiereaper to be used safely in supervdsmServer. Once the multiprocessing fix that handles SIGCHLD interrupts is backported to Python 2.6, we'll be able to merge http://gerrit.ovirt.org/#/c/28915/ back.
Comment 8 Yaniv Bronhaim 2015-01-13 08:03:25 EST
Will merge once RHEL 6.7 is out (see Bug 1180864). Moving to 3.6.
Comment 9 Dan Kenigsberg 2015-04-12 04:55:39 EDT
The earlier patches were reverted when we realized that Python still had the EINTR bug. They must be re-posted.
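
For context, here is a sketch of the common workaround for EINTR, assuming the bug refers to blocking calls that fail with EINTR when a signal handler (such as one for SIGCHLD) runs instead of being retried. The helper `retry_on_eintr` and the stand-in class `FlakyRead` are hypothetical and are not taken from vdsm or from the Python patch. Python 3.5 and later retry interrupted system calls automatically (PEP 475), so the interruption is simulated here.

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func, retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except (OSError, IOError) as e:
            if e.errno != errno.EINTR:
                raise  # any other error is a real failure


class FlakyRead(object):
    """Stand-in for a blocking call that is interrupted twice by a signal."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        if self.calls <= 2:
            raise OSError(errno.EINTR, "Interrupted system call")
        return "data"


read = FlakyRead()
print(retry_on_eintr(read), read.calls)
```

Without a retry of this kind, installing a SIGCHLD handler can make unrelated blocking calls fail, which is consistent with the reaper patches being held back until the EINTR handling was fixed.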
Comment 11 Red Hat Bugzilla Rules Engine 2015-10-18 04:34:05 EDT
Bug tickets that are moved to testing must have target release set to make sure tester knows what to test. Please set the correct target release before moving to ON_QA.
Comment 14 Petr Kubica 2016-01-07 06:42:08 EST
Verified in vdsm-4.17.15-0.el7ev.noarch
Comment 15 Sandro Bonazzola 2016-01-13 09:40:24 EST
oVirt 3.6.0 has been released; closing as CURRENTRELEASE.
