Red Hat Bugzilla – Bug 471901
rewrite init script to use condor_off
Last modified: 2009-02-04 11:05:37 EST
Description of problem:
The Condor init script uses killproc, which does take down the condor_master. However, it does not do so gracefully.
The condor_master waits for all of the daemons it manages to exit before it exits itself. killproc starts this process with a kill, but then escalates to a kill -9. The result is that the master does not get to do its normal cleanup, which includes removing SCHEDD.lock files in an HA Schedd setup.
It is possible to skip the init script for everyday stopping of Condor, but many people will use it anyway.
Consider re-writing the init script to use condor_off -master, followed by a condor_off -master -fast, and only as a last resort (drastic!) actually kill -9 the master.
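The escalation proposed above could look roughly like the sketch below. It is only an illustration, not the shipped init script: the function names, the pidfile argument, and the GRACEFUL_TIMEOUT and FAST_TIMEOUT variables with their defaults are all assumptions.

```shell
# Return 0 once PID $1 has exited, 1 if it is still alive after $2 seconds.
wait_for_exit() {
    wait_pid=$1
    wait_left=$2
    while kill -0 "$wait_pid" 2>/dev/null; do
        if [ "$wait_left" -le 0 ]; then
            return 1
        fi
        sleep 1
        wait_left=$((wait_left - 1))
    done
    return 0
}

# Hypothetical stop routine: graceful first, then fast, then kill -9.
# $1 is the path to the master's pidfile (the location is an assumption).
stop_master() {
    pidfile=$1
    # Assumption: no readable pidfile means there is nothing to stop.
    [ -r "$pidfile" ] || return 0
    master_pid=$(cat "$pidfile")

    # Step 1: ask the master to shut down gracefully.
    condor_off -master
    wait_for_exit "$master_pid" "${GRACEFUL_TIMEOUT:-300}" && return 0

    # Step 2: ask for a fast shutdown.
    condor_off -master -fast
    wait_for_exit "$master_pid" "${FAST_TIMEOUT:-60}" && return 0

    # Step 3: last resort (drastic!); the master gets no chance to clean up.
    kill -9 "$master_pid"
}
```

The pidfile is used here only to watch for the master's exit. The shutdown requests themselves go through condor_off.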
Version-Release number of selected component (if applicable):
7.2.0-0.1 and before
Using condor_off is a bad idea in an init script for a few reasons, such as 1) it ignores the pidfile and 2) it may be denied by local policy.
A better solution is to send SIGQUIT to the master, which initiates a fast shutdown.
This is fixed in 7.2.0-0.3
NOTE: This change stops sending -KILL to the master, which means that if it hangs during shutdown, it will remain hung.
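For illustration, a pidfile-based stop that sends SIGQUIT and never escalates to -KILL might be sketched as follows. This is not the script shipped in 7.2.0-0.3: the function name, its arguments, and the default timeout are hypothetical.

```shell
# Hypothetical stop routine: signal the PID recorded in the pidfile and wait.
# $1 = pidfile path, $2 = seconds to wait (default 60), $3 = signal (default QUIT).
stop_master_fast() {
    pidfile=$1
    timeout=${2:-60}
    signal=${3:-QUIT}

    # Assumption: no readable pidfile means the master is not running.
    [ -r "$pidfile" ] || return 0
    pid=$(cat "$pidfile")

    # SIGQUIT asks the master for a fast shutdown. If the signal cannot be
    # delivered, assume the process is already gone.
    kill -s "$signal" "$pid" 2>/dev/null || return 0

    # Wait for the master to exit. Deliberately no kill -9 here: a master
    # that hangs during shutdown is reported and left for the administrator.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$timeout" -le 0 ]; then
            echo "process $pid still running after timeout" >&2
            return 1
        fi
        sleep 1
        timeout=$((timeout - 1))
    done
    return 0
}
```

A non-zero return value tells the caller that the master did not exit in time, which is the hanging case described in the note above.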
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.