Bug 729121
Summary: | Investigate Shutdown Semantics of Condor daemons | ||
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Timothy St. Clair <tstclair> |
Component: | condor | Assignee: | Timothy St. Clair <tstclair> |
Status: | CLOSED WONTFIX | QA Contact: | MRG Quality Engineering <mrgqe-bugs> |
Severity: | low | Docs Contact: | |
Priority: | medium | ||
Version: | 2.0 | CC: | matt, mkudlej, tstclair |
Target Milestone: | 2.2 | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2012-03-27 19:10:03 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
Description
Timothy St. Clair
2011-08-08 18:44:02 UTC
How can we reproduce this? It's easy to shut down the collector, but how can we recognize that the behaviour is wrong? Please be more specific.

This is more of a work-item BZ than a bug; its purpose is to help define behavior. Here is the logic behind it:

/* On Unix, we define our own exit() call. The reason is messy:
 * Basically, daemonCore Create_Thread calls fork() on Unix.
 * When the forked child calls exit, however, all the class
 * destructors are called. However, the code was never written in
 * a way that expects the daemons to be forked. For instance, some
 * global constructor in the schedd tells the gridmanager to shutdown...
 * certainly we do not want this happening in our forked child! Also,
 * we've seen problems where the forked child gets stuck in libc realloc
 * on Linux trying to free up space in the gsi libraries after being
 * called by some global destructor. So.... for now, if we are
 * forked via Create_Thread, we have our child exit _without_ calling
 * any c++ destructors. How do we accomplish that magic feat? By
 * exiting via a call to exec()! So here it is... we overload exit()
 * inside of daemonCore -- we do it this way so we catch all calls to
 * exit, including ones buried in dprintf etc. Note we don't want to
 * do this via a macro setting, because some .C files that call exit
 * do not include condor_daemon_core.h, and we don't want to put it
 * into condor_common.h (because we only want to overload exit for
 * daemonCore daemons). So doing it this way works for all cases.
 */

There is no real way to "fix this." It appears that this was a backstop fix given the architecture. I will take cleanup into account when refactoring pieces, namely by using shared_ptrs where possible so individual destructors don't cause too much damage.
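For anyone trying to observe the behaviour the quoted comment describes, here is a minimal stand-alone sketch. It is not Condor code: GlobalResource, g_report_fd, ExitStyle and destructor_runs_in_child are hypothetical names made up for illustration, and the use of /bin/true as the exec() target is an assumption about the host system. The sketch forks a child, lets it leave either through a normal exit() or through exec(), and counts how many times a global destructor ran in that child.

```cpp
#include <cstdio>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for a daemon-wide global whose destructor has
// side effects. It reports each run by writing one byte to g_report_fd.
static int g_report_fd = -1;

struct GlobalResource {
    ~GlobalResource() {
        if (g_report_fd >= 0) {
            const char mark = 'D';
            ssize_t rc = write(g_report_fd, &mark, 1);
            (void)rc;
        }
    }
};
static GlobalResource g_resource;

enum class ExitStyle { NormalExit, ExecExit };

// Forks a child, has it leave in the requested style, and returns how many
// times the global destructor ran in that child (-1 on setup failure).
int destructor_runs_in_child(ExitStyle style) {
    int fds[2];
    if (pipe(fds) != 0) return -1;

    std::fflush(nullptr);  // avoid duplicating buffered stdio output in the child
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        // Child: only this process reports destructor runs.
        close(fds[0]);
        g_report_fd = fds[1];
        if (style == ExitStyle::ExecExit) {
            // Replacing the process image means no C++ destructors and no
            // atexit handlers of this program run. /bin/true is an assumption.
            execl("/bin/true", "true", static_cast<char*>(nullptr));
            _exit(0);  // fallback if exec fails; also skips destructors
        }
        std::exit(0);  // normal exit: static destructors run in the child
    }

    // Parent: count the marks until the child's end of the pipe closes.
    close(fds[1]);
    char buf[8];
    int count = 0;
    ssize_t n;
    while ((n = read(fds[0], buf, sizeof buf)) > 0) {
        count += static_cast<int>(n);
    }
    close(fds[0]);

    int status = 0;
    waitpid(pid, &status, 0);
    return count;
}
```

Calling destructor_runs_in_child(ExitStyle::NormalExit) shows the global destructor firing inside the forked child, which is the unwanted side effect; with ExitStyle::ExecExit the child goes away without touching it. On POSIX systems _exit() also skips static destructors, but the quoted comment says daemonCore exits via exec(), so the sketch follows that.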