Bug 841486 - [vdsm] super vdsm server leave child process defunct after create vm
Summary: [vdsm] super vdsm server leave child process defunct after create vm
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.14.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-3.6.0-rc
Target Release: 4.17.0
Assignee: Oved Ourfali
QA Contact: Petr Kubica
URL:
Whiteboard: infra
Depends On: 1180864 1181624 1210347
Blocks:
Reported: 2012-07-19 06:59 UTC by Royce Lv
Modified: 2019-12-16 04:27 UTC (History)

Fixed In Version: 3.6.0-4
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-01-13 14:40:24 UTC
oVirt Team: Infra
Embargoed:
rule-engine: ovirt-3.6.0+
ylavi: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+



Links
System        ID     Private  Priority  Status  Summary  Last Updated
oVirt gerrit  39780  0        None      None    None     Never

Description Royce Lv 2012-07-19 06:59:56 UTC
Description of problem:
After creating a vm/localfs, a process was found in the defunct state.
It is a child process of the supervdsm server.

[lvroyce@localhost x86_64]$ ps -ef |grep qemu
qemu     20614 20305  0 14:45 ?        00:00:00 [python] <defunct>

qemu     20886     1  3 14:45 ?        00:00:17 /usr/bin/qemu-kvm -name vm6 -S -M pc-1.0 -cpu qemu64,-svm -enable-kvm -m 1024 -smp 1,sockets=1,cores=1,threads=1 -uuid f7c1b02f-b304-4e68-8565-3659b1214c40 -smbios type=1,manufacturer=Red Hat,product=RHEV Hypervisor,version=17-1,serial=0EA0A181-50B2-11CB-BBE1-DFC7A7500304_00:21:cc:62:a6:07,uuid=f7c1b02f-b304-4e68-8565-3659b1214c40 -nodefconfig -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm6.monitor,server,nowait -mon ......


[lvroyce@localhost x86_64]$ ps -ef |grep vdsm
vdsm     20244     1  0 14:44 ?        00:00:00 /bin/bash -e /usr/share/vdsm/respawn --minlifetime 10 --daemon --masterpid /var/run/vdsm/respawn.pid /usr/share/vdsm/vdsm
vdsm     20246 20244  0 14:44 ?        00:00:03 /usr/bin/python /usr/share/vdsm/vdsm
root     20304 20246  0 14:44 ?        00:00:00 /usr/bin/sudo -n /usr/bin/python /usr/share/vdsm/supervdsmServer.py d2b71523-ce00-4495-bfa2-bd214577a32c 20246
root     20305 20304  0 14:44 ?        00:00:00 /usr/bin/python /usr/share/vdsm/supervdsmServer.py d2b71523-ce00-4495-bfa2-bd214577a32c 20246
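
The parent of the defunct process can be read from the PPID column of the `ps -ef` output above. As a minimal sketch (the helper name `find_defunct` is hypothetical and not part of vdsm), the following pulls the PID and parent PID of every `<defunct>` entry out of such output, assuming the usual `UID PID PPID C STIME TTY TIME CMD` column layout:

```python
def find_defunct(ps_output):
    """Return (pid, ppid) for every <defunct> entry in `ps -ef` output.

    Assumes the standard column order: UID PID PPID C STIME TTY TIME CMD.
    """
    zombies = []
    for line in ps_output.splitlines():
        # Split into at most 8 fields so the command keeps its spaces.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue  # blank or truncated line
        if not (fields[1].isdigit() and fields[2].isdigit()):
            continue  # header line or unexpected layout
        if fields[7].rstrip().endswith("<defunct>"):
            zombies.append((int(fields[1]), int(fields[2])))
    return zombies
```

Applied to the listing above, the parent PID it reports for the defunct `[python]` entry is 20305, which is the supervdsmServer.py process.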


Version-Release number of selected component (if applicable):
[lvroyce@localhost x86_64]$ rpm -q libvirt
libvirt-0.9.13-1.fc17.x86_64
[lvroyce@localhost x86_64]$ rpm -q vdsm
vdsm-4.10.0-0.185.gitb52165e.fc17.lvroyce1342680119.x86_64

How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:

Comment 1 Itamar Heim 2013-02-03 12:25:15 UTC
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.

Comment 2 Sven Kieske 2014-05-09 15:54:22 UTC
Please reopen, as I am seeing defunct supervdsm processes in oVirt 3.3.3
again.

See discussion on devel:
http://lists.ovirt.org/pipermail/devel/2014-May/007289.html

Comment 3 Dan Kenigsberg 2014-05-14 12:25:44 UTC
Starting to use zombiereaper may solve this issue, but there's another one in supervdsm: it uses multiprocessing, which uses subprocess.Popen, which is known to be buggy on python2.

Please consider monkey-patching

  multiprocessing.process.Process._Popen = CPopen

before use.

Comment 4 Sven Kieske 2014-06-18 08:21:30 UTC
Any progress regarding this problem?
I still see defunct supervdsmServer processes popping up.

The installed version is currently:
rpm -q vdsm
vdsm-4.13.3-3.el6.x86_64

If you need any additional logs, please tell me.

Is this just about replacing subprocess.Popen with CPopen?

It would be nice if someone could explain exactly what the problem is that
prevents this from getting fixed. If it is just a matter of finding time to
crawl through the vdsm code, I'm happy to assist.

Comment 5 Dan Kenigsberg 2014-06-21 13:58:13 UTC
(In reply to Sven Kieske from comment #4)

> is this just about replacing subprocess.Popen with CPopen?

No. That's an unrelated issue that I've noticed while reading the code. Note that my suggestion for consideration is wrong, as multiprocessing.forking.Popen does not have the same API as CPopen and subprocess.Popen.

I suppose that a properly-placed zombiereaper.autoReapPID(proc.pid) would take care of your zombies.

Comment 6 Dan Kenigsberg 2014-12-02 15:25:57 UTC
We have to revert the patch from the 3.5 branch, as it makes the much more annoying Bug 1168217 more evident.

Comment 7 Yaniv Bronhaim 2015-01-12 11:33:32 UTC
Adding a dependency on Bug 1180864, whose fix makes it safe to use zombiereaper in supervdsmServer. Once the multiprocessing fix that handles SIGCHLD interrupts is backported to Python 2.6, we'll be able to merge http://gerrit.ovirt.org/#/c/28915/ again.

Comment 8 Yaniv Bronhaim 2015-01-13 13:03:25 UTC
Will merge when RHEL 6.7 is out (see Bug 1180864). Moving to 3.6.

Comment 9 Dan Kenigsberg 2015-04-12 08:55:39 UTC
The earlier patches were reverted when we realized that Python still had the EINTR bug. They must be re-posted.
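
As background on the EINTR issue: when a signal such as SIGCHLD arrives while a process is blocked in a system call, the call can fail with EINTR, and older Python versions surfaced that as an exception instead of retrying. A common workaround, sketched here with a hypothetical helper name `retry_on_eintr` (this is not the fix tracked in Bug 1180864), is to retry the call explicitly:

```python
import errno


def retry_on_eintr(func, *args, **kwargs):
    """Call func, retrying for as long as it fails with EINTR."""
    while True:
        try:
            return func(*args, **kwargs)
        except OSError as e:
            if e.errno != errno.EINTR:
                raise  # a real error: let the caller see it
            # Interrupted by a signal: simply try again.
```

For example, `retry_on_eintr(os.read, fd, 4096)` would retry a read interrupted by a signal. Since Python 3.5 (PEP 475) the standard library retries most such calls itself, so a wrapper like this mainly matters for the Python 2 versions discussed in this bug.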

Comment 11 Red Hat Bugzilla Rules Engine 2015-10-18 08:34:05 UTC
Bug tickets that are moved to testing must have target release set to make sure tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 14 Petr Kubica 2016-01-07 11:42:08 UTC
Verified in vdsm-4.17.15-0.el7ev.noarch

Comment 15 Sandro Bonazzola 2016-01-13 14:40:24 UTC
oVirt 3.6.0 has been released, closing current release

