Description of problem:
I think the stop function in condor_schedd.init should return 0 when the condor_schedd executable is missing.

-  if [ "$1" != "status" ]; then
+  if [ "$1" != "status" -a "$1" != "stop" ]; then
       # Report that $prog does not exist, or is not executable
       if [ ! -x /usr/sbin/$prog ]; then
           exit 5
       fi
       [ $running -eq 4 ] && exit 7
   fi

Version-Release number of selected component (if applicable):
condor-7.6.5-0.8.el6

How reproducible:
100%

Steps to Reproduce:
1. Set up RH HA with an HA scheduler and start it so that condor_schedd runs on one node (X) of the cluster.
2. Rename /usr/sbin/condor_schedd on X.
3. Run killall condor_schedd on X.

Actual results:
clustat reports that the HA schedd service failed.

Expected results:
condor_schedd.init returns 0 so that RH HA can move the service to another node.
Removed the binary check on stop. The stop method will only return failure if there was a problem stopping the daemon. Tracking on upstream branch: V7_6-branch
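The effect of skipping the executable check for stop can be illustrated with a small standalone function. This is only a sketch under stated assumptions, not the code that was committed: the name `precheck` and its two parameters are hypothetical, and the only behavior borrowed from the report is that a missing binary yields exit status 5 for actions other than status and stop.

```shell
#!/bin/sh
# Hypothetical helper (not part of condor_schedd.init):
#   precheck ACTION BINARY_PATH
# The executable check is skipped for "status" and "stop", so a stop
# request can still proceed after the binary has been removed or renamed.
precheck() {
    action=$1
    binary=$2
    if [ "$action" != "status" ] && [ "$action" != "stop" ]; then
        # LSB init script convention: 5 means the program is not installed
        if [ ! -x "$binary" ]; then
            return 5
        fi
    fi
    return 0
}
```

With this shape, `precheck stop /usr/sbin/condor_schedd` returns 0 even after the binary is renamed, while `precheck start` on the same path returns 5.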
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
C: Removing the binary on a node running the schedd in a High Availability configuration with Red Hat High Availability
C: Red Hat High Availability would not be able to stop the schedd because the binary was missing and the service wouldn't failover
C: The init script used to control the schedd does not check for the existence of the binary before stopping
R: A stop operation will be attempted even if the schedd binary isn't on the system
/usr/libexec/condor/condor_schedd.init:

stop() {
    killproc -p $pidfile $prog -QUIT
    RETVAL=$?
    wait_pid $pidfile 15
    if [ $? -ne 0 ]; then
        RETVAL=1
    fi
    return $RETVAL
}

The cluster init script stop function was changed as requested. Now the function returns 0 only when the service was stopped correctly.

>>> VERIFIED
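The body of `wait_pid` is not shown in this report. As a rough illustration of what a helper called as `wait_pid $pidfile 15` might do, the sketch below polls the process named in a pidfile until it disappears or a timeout in seconds elapses. The name `wait_pid_sketch`, its return values, and the one-second polling interval are all assumptions, not the shipped implementation.

```shell
#!/bin/sh
# Hypothetical sketch of a wait_pid-style helper (not the shipped one):
#   wait_pid_sketch PIDFILE TIMEOUT_SECONDS
# Returns 0 once the process named in PIDFILE is gone, or 1 if it is still
# alive after TIMEOUT_SECONDS. Assumes the caller is allowed to signal the
# process, as an init script running as root would be.
wait_pid_sketch() {
    pidfile=$1
    timeout=$2
    # A missing or empty pidfile is treated as nothing left to wait for
    [ -f "$pidfile" ] || return 0
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 0
    elapsed=0
    # kill -0 sends no signal; it only tests whether the process exists
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

A non-zero return from such a helper is what would cause the stop function quoted above to set RETVAL to 1.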
/usr/libexec/condor/condor_schedd.init:

stop() {
    killproc -p $pidfile $prog -QUIT
    RETVAL=$?
    wait_pid $pidfile 15
    if [ $? -ne 0 ]; then
        RETVAL=1
    fi
    return $RETVAL
}

Checked on all supported platforms: RHEL5/6 on i686/x86_64.

RH-7.8.7-0.6

>>> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2013-0564.html