Bug 592377 - Qpid C++ IOThread destructor blocks grid program exit on Linux and Windows
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: qpid-cpp
Version: Development
Hardware: All
OS: All
Priority: high
Severity: urgent
Target Milestone: 1.3
Assigned To: Gordon Sim
QA Contact: MRG Quality Engineering
Reported: 2010-05-14 13:24 EDT by Pete MacKinnon
Modified: 2012-12-11 14:01 EST (History)
Doc Type: Bug Fix
Last Closed: 2012-12-11 14:01:03 EST
Description Pete MacKinnon 2010-05-14 13:24:26 EDT
    ~IOThread() {
        ScopedLock<Mutex> l(threadLock);
        while (connections > 0) {
            noConnections.wait(threadLock);  // <<< daemon hangs here at exit
        }
        if (poller_)
            poller_->shutdown();
        for (int i=0; i<ioThreads; ++i) {
            t[i].join();
        }
    }

Linux issue:
If a Condor daemon raises an exception (which it can at various points in the code), it tries to log minimally and exit quickly. Condor process management accounts for this: a failed process is restarted. However, when QMF is enabled, the failed daemon hangs in the dtor above.

Windows issue:
<may need more info from Tim St Clair here>
Condor QMF-enabled daemons on Windows hang at exit at the same line of code in the dtor.

Suggestion:
Change the wait to the overloaded version that takes a time argument (wraps pthread_cond_timedwait on Linux, boost condition.timed_wait on Windows). Use a reasonable, configurable timeout for shutdown (10 sec?).
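
A minimal sketch of the bounded wait suggested above. It uses the C++ standard library (std::condition_variable) rather than Qpid's own Mutex/Condition wrappers, and the class name ConnectionTracker and its methods are hypothetical, introduced only for illustration:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the connection bookkeeping in IOThread.
// Uses std::condition_variable, not Qpid's Mutex/Condition classes.
class ConnectionTracker {
public:
    void add() {
        std::lock_guard<std::mutex> l(lock_);
        ++connections_;
    }

    void remove() {
        std::lock_guard<std::mutex> l(lock_);
        if (connections_ > 0 && --connections_ == 0)
            noConnections_.notify_all();
    }

    // Wait until every connection is gone or the timeout expires.
    // Returns true if the count reached zero, false on timeout.
    bool waitForNone(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> l(lock_);
        return noConnections_.wait_for(l, timeout,
                                       [this] { return connections_ == 0; });
    }

private:
    std::mutex lock_;
    std::condition_variable noConnections_;
    int connections_ = 0;
};
```

A destructor built on this could call waitForNone(std::chrono::seconds(10)) and then continue with shutdown whether or not the wait succeeded, so a stuck connection delays exit by at most the timeout.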
Comment 1 Gordon Sim 2010-05-15 11:43:40 EDT
Your program calls exit() rather than returning from main? That means that the destructors will not be called so you will need to delete any QMF/Qpid objects prior to the exit in your exception handling logic.
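
One way to read this advice, sketched under assumptions: register cleanup actions as the objects are created and run them just before exit(). The names ExitCleanup and cleanupAndExit are hypothetical and are not part of Condor, QMF, or Qpid:

```cpp
#include <cstdlib>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical registry of cleanup actions to run before exit().
class ExitCleanup {
public:
    void add(std::function<void()> fn) { actions_.push_back(std::move(fn)); }

    // Run actions in reverse registration order; each runs at most once.
    void run() {
        while (!actions_.empty()) {
            std::function<void()> fn = std::move(actions_.back());
            actions_.pop_back();
            fn();
        }
    }

private:
    std::vector<std::function<void()>> actions_;
};

// Hypothetical replacement for a bare exit() call in error-handling code.
[[noreturn]] inline void cleanupAndExit(ExitCleanup& cleanup, int status) {
    cleanup.run();
    std::exit(status);
}
```

Each action would delete or close one QMF/Qpid object; running them in reverse order mirrors the order in which C++ destroys objects.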
Comment 2 Matthew Farrellee 2010-05-15 18:17:05 EDT
It absolutely calls exit() rather than returning from main. The "exception" is not a true C++ exception, but an EXCEPT (exit now) macro. EXCEPT cannot be easily instrumented to clean up QMF/Qpid objects introduced via dlopen, which is how we include QMF functionality.

From the description, it looks like a dtor is being called during exit.

Needing to call the destructors in a particular order raises the bar for entry, especially for existing software. It is also a new requirement since 1.2.

Pete, you might investigate __attribute__((destructor)). Gordon, will you see whether this dtor requirement is necessary?
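
For reference, a rough sketch of what an __attribute__((destructor)) hook could look like. This is a GCC/Clang extension; MessagingHandle, g_handle, releaseMessaging, and onProcessExit are hypothetical names standing in for whatever the dlopen'ed QMF code actually holds:

```cpp
#include <cstdio>

// Hypothetical stand-in for an object obtained from the QMF plugin.
struct MessagingHandle { bool open = true; };

static MessagingHandle* g_handle = nullptr;

// Idempotent cleanup: safe to call explicitly and again from the exit hook.
// Returns true if something was released, false if there was nothing to do.
static bool releaseMessaging() {
    if (!g_handle) return false;
    delete g_handle;
    g_handle = nullptr;
    return true;
}

// GCC/Clang extension: called automatically after main() has completed
// or exit() has been called. It does not run on _exit() or abort().
__attribute__((destructor))
static void onProcessExit() {
    if (releaseMessaging())
        std::fputs("messaging objects released at exit\n", stderr);
}
```

Because the hook lives in the library that owns the objects, EXCEPT itself would not need to change. Whether the hook runs before or after other exit-time destructors is not something this sketch controls.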
Comment 3 Gordon Sim 2010-05-16 06:01:58 EDT
I don't think it's the order in which destructors are called in this case; I think it is that some are not called at all. The exit() is the key difference between this case and the normal cases that I assume work OK.

Can you describe in more detail how the executable is made up (what libs are loaded, what QMF/Qpid variables are held, are they global, static, etc)?
Comment 4 Gordon Sim 2010-06-01 03:55:40 EDT
Will no longer be a problem as of qpid-cpp-client-0.7.946106-2 as the IOThread destructor no longer waits on the condition.
