Bug 766612 - condor_schedd.init - stop should return 0 if there is no service executable
Summary: condor_schedd.init - stop should return 0 if there is no service executable
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: 2.1
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: 2.3
Target Release: ---
Assignee: Robert Rati
QA Contact: Tomas Rusnak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-12-12 12:56 UTC by Martin Kudlej
Modified: 2013-03-06 18:40 UTC (History)

Fixed In Version: condor-7.8.2-0.3
Doc Type: Bug Fix
Doc Text:
C: Removing the binary on a node running the schedd in a High Availability configuration with Red Hat High Availability
C: Red Hat High Availability would not be able to stop the schedd because the binary was missing and the service wouldn't failover
C: The init script used to control the schedd does not check for the existence of the binary before stopping
R: A stop operation will be attempted even if the schedd binary isn't on the system
Clone Of:
Environment:
Last Closed: 2013-03-06 18:40:07 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2013:0564
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Low: Red Hat Enterprise MRG Grid 2.3 security update
Last Updated: 2013-03-06 23:37:09 UTC

Description Martin Kudlej 2011-12-12 12:56:06 UTC
Description of problem:
I think the stop function in condor_schedd.init should return 0 if the condor_schedd executable does not exist.

- if [ "$1" != "status" ]; then
+ if [ "$1" != "status" -a "$1" != "stop" ]; then
    # Report that $prog does not exist, or is not executable
    if [ ! -x /usr/sbin/$prog ]; then
        exit 5
    fi

    [ $running -eq 4 ] && exit 7
fi
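
To make the effect of the proposed guard easier to see, here is a minimal sketch that wraps the same logic in a function and returns the exit code instead of exiting. The function name `precheck` and its three arguments are hypothetical and are not part of condor_schedd.init; codes 5 (program not installed) and 7 (program not running) follow the LSB init script conventions, and the comparison against 4 is copied from the snippet above without assuming more about what `$running` holds.

```shell
#!/bin/sh
# Hypothetical helper, not taken from condor_schedd.init.
# $1 = requested action, $2 = path to the daemon binary,
# $3 = value the init script would hold in $running.
precheck() {
    action=$1
    binary=$2
    running=$3
    # With the proposed change, both "status" and "stop" skip the checks.
    if [ "$action" != "status" ] && [ "$action" != "stop" ]; then
        # LSB exit code 5: program is not installed (or not executable).
        [ -x "$binary" ] || return 5
        # LSB exit code 7: program is not running (same test as the original).
        [ "$running" -eq 4 ] && return 7
    fi
    return 0
}
```

Under this sketch, `precheck stop /usr/sbin/condor_schedd 0` returns 0 even when the binary has been renamed, while `precheck start` with the same arguments returns 5.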


Version-Release number of selected component (if applicable):
condor-7.6.5-0.8.el6

How reproducible:
100%

Steps to Reproduce:
1. Set up RH HA with an HA scheduler and start it so that condor_schedd runs on one node (X) of the cluster.
2. Rename /usr/sbin/condor_schedd on X.
3. Run killall condor_schedd on X.
  
Actual results:
clustat reports that the HA schedd service has failed.

Expected results:
condor_schedd.init returns 0 so that RH HA can move the service to another node.

Comment 1 Robert Rati 2012-03-30 14:05:30 UTC
Removed the binary check on stop.  The stop method will only return failure if there was a problem stopping the daemon.

Tracking on upstream branch:
V7_6-branch

Comment 2 Robert Rati 2012-04-02 14:32:25 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
C: Removing the binary on a node running the schedd in a High Availability configuration with Red Hat High Availability
C: Red Hat High Availability would not be able to stop the schedd because the binary was missing and the service wouldn't failover
C: The init script used to control the schedd does not check for the existence of the binary before stopping
R: A stop operation will be attempted even if the schedd binary isn't on the system

Comment 4 Tomas Rusnak 2012-06-20 13:58:50 UTC
/usr/libexec/condor/condor_schedd.init:

stop() {
    killproc -p $pidfile $prog -QUIT
    RETVAL=$?
    wait_pid $pidfile 15
    if [ $? -ne 0 ]; then
        RETVAL=1
    fi
    return $RETVAL
}

The cluster init script's stop function was changed as requested. The function now returns 0 only when the service was stopped correctly.
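
The body of `wait_pid` is not shown in this report. As a rough illustration of what such a helper might do, here is a minimal sketch, assuming it polls the PID recorded in the pidfile once per second and gives up after the timeout. The implementation below is a hypothetical stand-in, and the real helper in condor_schedd.init may behave differently.

```shell
#!/bin/sh
# Hypothetical stand-in for wait_pid; the real helper is not shown in this bug.
# $1 = pidfile, $2 = timeout in seconds.
# Returns 0 once the process is gone, 1 if it is still alive after the timeout.
wait_pid() {
    pidfile=$1
    timeout=$2
    # Assumption: a missing pidfile means there is nothing left to wait for.
    [ -f "$pidfile" ] || return 0
    pid=$(cat "$pidfile")
    while [ "$timeout" -gt 0 ]; do
        # kill -0 delivers no signal; it only tests whether the PID exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    # Final check after the last sleep.
    kill -0 "$pid" 2>/dev/null || return 0
    return 1
}
```

With a helper like this, the `stop()` function above reports failure (RETVAL=1) whenever the daemon is still alive 15 seconds after receiving SIGQUIT, regardless of what `killproc` returned.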

>>> VERIFIED

Comment 9 Tomas Rusnak 2013-01-03 10:06:53 UTC
/usr/libexec/condor/condor_schedd.init:

stop() {
    killproc -p $pidfile $prog -QUIT
    RETVAL=$?
    wait_pid $pidfile 15
    if [ $? -ne 0 ]; then
        RETVAL=1
    fi
    return $RETVAL
}

Checked on all supported platforms: RHEL5/6 on i686/x86_64.

RH-7.8.7-0.6

>>> VERIFIED

Comment 11 errata-xmlrpc 2013-03-06 18:40:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0564.html

