Bug 1078589

Summary: use of tls with libvirt.so can leave zombie processes
Product: Red Hat Enterprise Linux 6 Reporter: Eric Blake <eblake>
Component: libvirt Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 6.2 CC: cpelland, dyuan, eblake, jdenemar, jherrman, mjenner, mzhan, rbalakri, ydu, zhwang
Target Milestone: rc Keywords: Upstream, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-0.10.2-30.el6 Doc Type: Bug Fix
Doc Text:
A previous update introduced an error where a SIG_BLOCK argument was used instead of SIG_SETMASK when restoring the signal mask after the poll() system call. Consequently, the signal mask was not returned to its original value, the SIGCHLD signal could remain permanently blocked, and defunct processes could be generated. With this update, the original signal mask is restored as intended after poll() returns.
Story Points: ---
Clone Of:
: 1078590 Environment:
Last Closed: 2014-10-14 04:20:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1078590, 1080501    

Description Eric Blake 2014-03-19 23:06:44 UTC
Description of problem:
Libvirt commit 434de30 refactored the client-side TLS code, but accidentally changed a SIG_SETMASK to a SIG_BLOCK when attempting to restore signals after temporarily blocking them around a poll() call. As a result, the client can end up with SIGCHLD permanently blocked, at which point the client leaks zombie processes.

Version-Release number of selected component (if applicable):
libvirt-0.10.2-29.el6_5.5
but present all the way back to RHEL 6.2

How reproducible:
https://www.redhat.com/archives/libvir-list/2014-March/msg00858.html

Steps to Reproduce:
1. See the upstream mail thread

Actual results:
zombie processes leaked because SIGCHLD permanently blocked

Expected results:
no zombies, correct signal handling

Additional info:
Fixed with this patch upstream:
commit 3d4b4f5ac634c123af1981084add29d3a2ca6ab0
Author: Michal Privoznik <mprivozn>
Date:   Wed Mar 19 18:10:34 2014 +0100

    virNetClientSetTLSSession: Restore original signal mask
    
    Currently, we use pthread_sigmask(SIG_BLOCK, ...) prior to calling
    poll(). This is okay, as we don't want poll() to be interrupted.
    However, then - immediately as we fall out from the poll() - we try to
    restore the original sigmask - again using SIG_BLOCK. But as the man
    page says, SIG_BLOCK adds signals to the signal mask:
    
    SIG_BLOCK
          The set of blocked signals is the union of the current set and the set argument.
    
    Therefore, when restoring the original mask, we need to completely
    overwrite the one we set earlier and hence we should be using:
    
    SIG_SETMASK
          The set of blocked signals is set to the argument set.
    
    Signed-off-by: Michal Privoznik <mprivozn>

Comment 1 Eric Blake 2014-03-19 23:07:47 UTC
Technically a regression from RHEL 6.1 behavior; but as it has been so long since the bug was introduced I'm not sure if it deserves a z-stream fix to 6.5 or if it can just wait for 6.6

Comment 6 zhenfeng wang 2014-04-14 09:13:50 UTC
Verified this bug with libvirt-0.10.2-32.el6.x86_64. The verification steps were the same as in comment 12 of bug 1080501. Since I get the same result as in bug 1080501, I am marking this bug as verified.

Comment 8 errata-xmlrpc 2014-10-14 04:20:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1374.html