Bug 606761

Summary:	Interruption of epoll_wait by SIGCHLD immediately leads to segv
Product:	Red Hat Enterprise MRG	Reporter:	Pete MacKinnon <pmackinn>
Component:	qpid-cpp	Assignee:	Andrew Stitcher <astitcher>
Status:	CLOSED CURRENTRELEASE	QA Contact:	MRG Quality Engineering <mrgqe-bugs>
Severity:	high	Docs Contact:
Priority:	urgent
Version:	beta	CC:	gsim, jross, matt
Target Milestone:	1.3
Target Release:	---
Hardware:	x86_64
OS:	Linux
URL:	https://issues.apache.org/jira/browse/QPID-2388
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2012-12-11 18:55:36 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	578137, 596210

Description Pete MacKinnon 2010-06-22 12:37:51 UTC

condor_schedd with QMF plug-in enabled can crash on exit. This daemon can fork/clone child processes (in this case the procd process family manager). 

This is more reproducible on a fast RHTS machine (Intel(R) Xeon(TM) CPU 3.20GHz
). Enabling D_PROCFAMILY adds some logging steps that can change the signal timing to avoid the EINTR on epoll_wait.

Test setup is to not set USE_PROCD to anything and allow the schedd to naturally spawn a procd (fork or clone doesn't matter).

STRACE>>>>

Bad shutdown strace:

futex(0x9891938, FUTEX_WAKE_PRIVATE, 1) = 1
epoll_ctl(17, EPOLL_CTL_MOD, 19, {EPOLLIN|EPOLLONESHOT, {u32=158216744, u64=158216744}}) = 0
rt_sigprocmask(SIG_SETMASK, [], ~[KILL STOP RTMIN RT_1], 8) = 0
epoll_wait(17, b6be51c8, 1, -1)         = -1 EINTR (Interrupted system call)
--- SIGCHLD (Child exited) @ 0 (0) ---
--- SIGSEGV (Segmentation fault) @ 0 (0) ---

Clean shutdown trace:

futex(0xa36ba10, FUTEX_WAKE_PRIVATE, 1) = 1
epoll_ctl(12, EPOLL_CTL_MOD, 13, {EPOLLIN|EPOLLONESHOT, {u32=171346464, u64=171346464}}) = 0
rt_sigprocmask(SIG_SETMASK, [], ~[KILL STOP RTMIN RT_1], 8) = 0
epoll_wait(12, {{EPOLLIN, {u32=2, u64=2}}}, 1, -1) = 1
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1], NULL, 8) = 0
epoll_ctl(12, EPOLL_CTL_MOD, 13, {EPOLLIN|EPOLLOUT|EPOLLONESHOT, {u32=171346464, u64=171346464}}) = 0
epoll_ctl(12, EPOLL_CTL_DEL, 13, NULL)  = 0
time(NULL)                              = 1277169584
stat64("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
write(2, "2010-06-21 21:19:44 warning Conn"..., 46) = 46
futex(0xa3680f4, FUTEX_CMP_REQUEUE_PRIVATE, 1, 2147483647, 0xa3680d8, 2) = 1
futex(0xa3680d8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0xa3680f4, FUTEX_WAIT_PRIVATE, 3, NULL) = 0
futex(0xa3680d8, FUTEX_WAKE_PRIVATE, 1) = 0
close(13)                               = 0
_exit(0)                                = ?
Process 19655 detached

CODE>>>>

void Poller::run() {
    // Ensure that we exit thread responsibly under all circumstances
    try {
        // Make sure we can't be interrupted by signals at a bad time
        ::sigset_t ss;
        ::sigfillset(&ss);
        ::pthread_sigmask(SIG_SETMASK, &ss, 0);

        ++(impl->threadCount);
        do {
            Event event = wait();


Poller::Event Poller::wait(Duration timeout) {
    static __thread PollerHandlePrivate* lastReturnedHandle = 0;
    epoll_event epe;
    int timeoutMs = (timeout == TIME_INFINITE) ? -1 : timeout / TIME_MSEC;
    AbsTime targetTimeout = 
        (timeout == TIME_INFINITE) ?
            FAR_FUTURE :
            AbsTime(now(), timeout); 

    if (lastReturnedHandle) {
        impl->resetMode(*lastReturnedHandle);
        lastReturnedHandle = 0;
    }

    // Repeat until we weren't interrupted by signal
    do {
        PollerHandleDeletionManager.markAllUnusedInThisThread();
        // Need to run on kernels without epoll_pwait()
        // - fortunately in this case we don't really need the atomicity of epoll_pwait()
#if 1
        sigset_t os;
        pthread_sigmask(SIG_SETMASK, &impl->sigMask, &os);
        int rc = ::epoll_wait(impl->epollFd, &epe, 1, timeoutMs);
        pthread_sigmask(SIG_SETMASK, &os, 0);
#else
        int rc = ::epoll_pwait(impl->epollFd, &epe, 1, timeoutMs, &impl->sigMask);
#endif

        if (rc ==-1 && errno != EINTR) {
            QPID_POSIX_CHECK(rc);
        } else if (rc > 0) {
            assert(rc == 1);
            void* dataPtr = epe.data.ptr;

            // Check if this is an interrupt
            PollerPrivate::InterruptHandle& interruptHandle = impl->interruptHandle;

Comment 1 Andrew Stitcher 2010-06-22 15:21:41 UTC

What the qpid library does around the epoll_wait() may be to blame here.

Currently it unconditionally unblocks ALL signals (on this thread) before entering epoll_wait.

Then it enters epoll_wait - at this point ANY signal (even one that the application has blocked for the entire process) may be serviced.

Then it unconditionally blocks all signals.

The intention here is that sigals may not be processed at all on the I/O thread except when it is waiting for something to happen.

So it the application has blocked SIGCHLD (or any other signal) and never expects to have to process it this logic defeats this.

However this is not so straightforward to fix...

Comment 2 Andrew Stitcher 2010-06-22 16:55:24 UTC

I think the simplest solution is to block all signal delivery completely on the client IO threads.

The library internally uses no signals at all and sets no handlers.

Hoever doing this may impact somewhat on the broker as this application does use signals (at least a little). So some consideration will have to be given to the broker to make sure that there are still threads avaible in the broker to handle signals.

Comment 3 Pete MacKinnon 2010-06-22 17:11:01 UTC

Looks like the broker signal handlers are such:

void SignalHandler::setBroker(const boost::intrusive_ptr<Broker>& b) {
    broker = b;

    signal(SIGINT,shutdownHandler); 
    signal(SIGTERM, shutdownHandler);

    signal(SIGHUP,SIG_IGN); // TODO aconway 2007-07-18: reload config.

    signal(SIGCHLD,SIG_IGN); 
}

Comment 4 Pete MacKinnon 2010-06-22 21:11:36 UTC

More data...

Basically, (I think Matt already postualed this) the condor parent thread went through the motions of shutting down the daemon module, including the deletion of the daemonCore object. However, the process has registered signal handlers for SIGCHLD et al which propagates the signal using the DC object ptr. This handler was blocked from receiving SIGCHLD when the sub-process (procd) terminated by the qpid epoll code. So, condor daemon core gets the "late" SIGCHLD and boom!

qpid::sys::Poller::wait (this=0xb6a00ce8, timeout=...)
    at qpid/sys/epoll/EpollPoller.cpp:571
571	        pthread_sigmask(SIG_SETMASK, &os, 0);
(gdb) n
570	        int rc = ::epoll_wait(impl->epollFd, &epe, 1, timeoutMs);
(gdb) n
571	        pthread_sigmask(SIG_SETMASK, &os, 0);
(gdb) n
576	        if (rc ==-1 && errno != EINTR) {
(gdb) n
667	        if (timeoutMs == -1) {
(gdb) n
564	        PollerHandleDeletionManager.markAllUnusedInThisThread();
(gdb) n
569	        pthread_sigmask(SIG_SETMASK, &impl->sigMask, &os);
(gdb) n
564	        PollerHandleDeletionManager.markAllUnusedInThisThread();
(gdb) n
569	        pthread_sigmask(SIG_SETMASK, &impl->sigMask, &os);
(gdb) n
570	        int rc = ::epoll_wait(impl->epollFd, &epe, 1, timeoutMs);
(gdb) n

Program received signal SIGSEGV, Segmentation fault.
0x08175f4d in unix_sigchld () at daemon_core_main.cpp:1288
1288		daemonCore->Send_Signal( daemonCore->getpid(), SIGCHLD );
(gdb) p daemonCore
$1 = (DaemonCore *) 0x0

Comment 5 Andrew Stitcher 2010-06-23 03:51:22 UTC

So we have a good explanation for the bug.

I think however that condor need fixing as well as qpid. The pattern of just blocking a signal that can never now be delivered due to the handler effectively being deleted is a bad idea and at the very least a bug waiting to happen in some other circumstance.

So instead of just blocking the SIGCHLD signal when the daemonCore is deleted the code should be unregistering the handler and setting it back to the default handling (in this case ignore).

I think that either fixing qpid or condor would be sufficient, but doing both is long term better.

Comment 6 Andrew Stitcher 2010-06-23 21:43:28 UTC

mrg 1.3 qpid code updated with the upstream fix from r957109

Comment 11 Justin Ross 2012-12-11 18:55:36 UTC

Please reopen if this is apparently still an issue.