Red Hat Bugzilla – Bug 1078590
use of tls with libvirt.so can leave zombie processes
Last modified: 2015-03-05 02:32:59 EST
Cloning to RHEL 7

+++ This bug was initially created as a clone of Bug #1078589 +++

Description of problem:
Libvirt commit 434de30 refactored the client-side TLS code, but accidentally changed a SIG_SETMASK to a SIG_BLOCK when attempting to restore signals after temporarily blocking them around a poll() call. As a result, the client can end up with SIGCHLD permanently blocked, at which point it leaks zombie child processes.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-29.el6_5.5, but present all the way back to RHEL 6.2

How reproducible:
https://www.redhat.com/archives/libvir-list/2014-March/msg00858.html

Steps to Reproduce:
1. See the upstream mail thread

Actual results:
Zombie processes are leaked because SIGCHLD is permanently blocked.

Expected results:
No zombies, correct signal handling.

Additional info:
Fixed with this patch upstream:

commit 3d4b4f5ac634c123af1981084add29d3a2ca6ab0
Author: Michal Privoznik <mprivozn@redhat.com>
Date:   Wed Mar 19 18:10:34 2014 +0100

    virNetClientSetTLSSession: Restore original signal mask

    Currently, we use pthread_sigmask(SIG_BLOCK, ...) prior to calling
    poll(). This is okay, as we don't want poll() to be interrupted.
    However, then - immediately as we fall out from the poll() - we try
    to restore the original sigmask - again using SIG_BLOCK. But as the
    man page says, SIG_BLOCK adds signals to the signal mask:

    SIG_BLOCK
          The set of blocked signals is the union of the current set
          and the set argument.

    Therefore, when restoring the original mask, we need to completely
    overwrite the one we set earlier and hence we should be using:

    SIG_SETMASK
          The set of blocked signals is set to the argument set.

    Signed-off-by: Michal Privoznik <mprivozn@redhat.com>

--- Additional comment from Eric Blake on 2014-03-19 17:07:47 MDT ---

Technically a regression from RHEL 6.1 behavior; but as it has been so long since the bug was introduced, I'm not sure if it deserves a z-stream fix to 6.5 or if it can just wait for 6.6.
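The difference between the two restore modes described in the commit message can be reproduced outside libvirt. The following is only a minimal sketch in Python, not libvirt's C code: it uses the standard `signal.pthread_sigmask` wrapper around the same POSIX call, blocks only SIGCHLD (the set of signals libvirt actually blocks is not stated in this report), and leaves the poll() call as a placeholder comment. The function name `sigchld_blocked_after` is hypothetical.

```python
import signal


def sigchld_blocked_after(restore_how):
    """Block SIGCHLD around a placeholder for poll(), "restore" the mask
    with restore_how, and report whether SIGCHLD is still blocked."""
    # Start from a known state (SIGCHLD unblocked) and remember the
    # caller's mask so this demo leaves no side effects behind.
    caller_mask = signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGCHLD})
    try:
        # Block SIGCHLD; the call returns the mask that was in effect before.
        old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGCHLD})

        # ... poll() would run here ...

        # Attempt to restore the saved mask.
        # SIG_BLOCK: new mask = current mask UNION old_mask (nothing is removed).
        # SIG_SETMASK: new mask = old_mask (overwrites the current mask).
        signal.pthread_sigmask(restore_how, old_mask)

        # SIG_BLOCK with an empty set only reads the current mask.
        current = signal.pthread_sigmask(signal.SIG_BLOCK, set())
        return signal.SIGCHLD in current
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, caller_mask)
```

Under these assumptions, `sigchld_blocked_after(signal.SIG_BLOCK)` returns `True` (the buggy restore leaves SIGCHLD blocked), while `sigchld_blocked_after(signal.SIG_SETMASK)` returns `False` (the fixed restore).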
Verified this issue on build:
libvirt-1.2.8-9.el7.x86_64
qemu-img-rhev-2.1.2-14.el7.x86_64

1. Prepare the TLS environment with 2 hosts (one is the client and the other is the server). Make sure the client can connect to the server over TLS:
# virsh -c qemu+tls://server/system

2. Install perl-Sys-Virt-1.2.8-3.el7.x86_64 on the libvirt client.

3. On the client, run libvirt-perl.pl as given in comment 4.

4. Output of the script:
[root@client ~]# perl libvirt-perl.pl
init... pid=12300
while... fork 1 end... pid=12301 receive chld fork 2 end... pid=12302 receive chld connection open fork 3 end... pid=12303 receive chld fork 4 end... pid=12304 receive chld go next...
while... fork 1 end... pid=12305 receive chld fork 2 end... pid=12306 receive chld connection open fork 3 end... pid=12307 receive chld fork 4 end... pid=12308 receive chld go next...
while... fork 1 end... pid=12309 receive chld fork 2 end... pid=12310 receive chld connection open fork 3 end... pid=12311 receive chld fork 4 end... pid=12312 receive chld go next...
while... fork 1 end... pid=12313 receive chld fork 2 end... pid=12314 receive chld connection open fork 3 end... pid=12315 receive chld fork 4 end... pid=12316 receive chld go next...
while... fork 1 end... pid=12317 receive chld fork 2 end... pid=12318 receive chld connection open fork 3 end... pid=12320 receive chld fork 4 end... pid=12321 receive chld go next...
while... fork 1 end... pid=12322 receive chld fork 2 end... pid=12323 receive chld connection open fork 3 end... pid=12324 receive chld
....

5. Check the process list; there are no zombie processes:
# ps -afx | grep perl
12300 pts/0 S+ 0:00 | \_ perl libvirt-perl.pl
12382 pts/2 S+ 0:00 \_ grep --color=auto perl

6. Repeat steps 1-5 with a libvirt TCP connection; got the same result.

Moving to VERIFIED.
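The zombie check in the verification steps above can also be scripted rather than read off `ps` output by eye. The sketch below is an illustration only and is not part of the original verification: it assumes a Linux host with `/proc` mounted, and the helper name `zombie_children` is hypothetical. It lists the children of a given PID that are in state `Z`.

```python
import os


def zombie_children(parent_pid):
    """Return the PIDs of zombie processes whose parent is parent_pid.

    Linux-only: relies on the /proc/<pid>/stat layout
    "pid (comm) state ppid ...".
    """
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(f"/proc/{entry}/stat", "rb") as f:
                stat = f.read().decode("ascii", "replace")
        except OSError:
            continue  # process went away between listdir() and open()
        # comm may itself contain spaces or ')', so split after the last ')'.
        fields = stat[stat.rfind(")") + 1:].split()
        state, ppid = fields[0], int(fields[1])
        if state == "Z" and ppid == parent_pid:
            zombies.append(int(entry))
    return zombies
```

For the run above, `zombie_children(12300)` would be expected to return an empty list on a fixed build, since 12300 is the PID of the `perl libvirt-perl.pl` process in the `ps` output.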
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-0323.html