Bug 471901 - rewrite init script to use condor_off
Summary: rewrite init script to use condor_off
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: grid
Version: 1.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: 1.1
Target Release: ---
Assignee: Matthew Farrellee
QA Contact: Kim van der Riet
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-11-17 16:24 UTC by Matthew Farrellee
Modified: 2009-02-04 16:05 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-02-04 16:05:37 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:0036 0 normal SHIPPED_LIVE Red Hat Enterprise MRG Grid 1.1 Release 2009-02-04 16:03:49 UTC

Description Matthew Farrellee 2008-11-17 16:24:49 UTC
Description of problem:

The Condor init script uses killproc, which does take down the condor_master. However, it does not do so gracefully.

The condor_master waits for all the daemons it manages to exit before exiting itself. killproc starts this shutdown with a plain kill, but escalates to kill -9 if the master has not exited in time. As a result the master never gets to do its normal cleanup, which includes removing SCHEDD.lock files in an HA Schedd setup.

It is possible to avoid the init script for everyday stopping of Condor, but many people will use it anyway.

Consider re-writing the init script to use condor_off -master, followed by a condor_off -master -fast, and only as a last resort (drastic!) actually kill -9 the master.
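
A minimal sketch of that escalation, for discussion only. The function names (wait_for_exit, stop_master), the 30 second timeouts, and passing the master's PID as an argument are all assumptions, not anything from the shipped init script:

```shell
#!/bin/sh
# Hypothetical helper: wait up to $2 seconds for PID $1 to exit.
# Returns 0 if the process is gone, non-zero if it is still running.
wait_for_exit() {
    pid=$1
    timeout=$2
    while [ "$timeout" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    ! kill -0 "$pid" 2>/dev/null
}

# Hypothetical stop routine: graceful first, then fast, then kill -9.
# $1 is the PID of the condor_master (assumed to be known to the caller).
stop_master() {
    pid=$1

    # Step 1: ask the master to shut down gracefully.
    condor_off -master
    wait_for_exit "$pid" 30 && return 0

    # Step 2: ask for a fast shutdown.
    condor_off -fast -master
    wait_for_exit "$pid" 30 && return 0

    # Step 3: last resort. The master gets no chance to clean up.
    kill -9 "$pid"
}

# Example call (not run here): stop_master "$(cat /path/to/master.pid)"
```

Only step 3 skips the master's cleanup, so lock files should survive only when both condor_off attempts have timed out.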


Version-Release number of selected component (if applicable):

7.2.0-0.1 and before

Comment 1 Matthew Farrellee 2008-11-21 03:36:53 UTC
Using condor_off in an init script is a bad idea for a few reasons: 1) it ignores the pidfile, and 2) it may be denied by local policy.

A better solution is to send SIGQUIT to the master, which initiates a fast shutdown.

This is fixed in 7.2.0-0.3

NOTE: This change stops sending -KILL to the master, which means that if it hangs during shutdown, it will remain hung.
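
For illustration, a rough sketch of a pidfile-based SIGQUIT stop. This is not the code that shipped in 7.2.0-0.3; the pidfile path, the timeout, and the function name stop_master_fast are assumptions:

```shell
#!/bin/sh
# Assumption: the master writes its PID here; the real path may differ.
PIDFILE=${PIDFILE:-/var/run/condor/condor_master.pid}
# Assumption: how long to wait for the master to exit, in seconds.
STOP_TIMEOUT=${STOP_TIMEOUT:-60}

stop_master_fast() {
    # No pidfile: treat the master as not running.
    [ -r "$PIDFILE" ] || return 0
    pid=$(cat "$PIDFILE")

    # SIGQUIT asks the master for a fast shutdown. If the signal cannot be
    # delivered (typically because no such process exists), the pidfile is
    # treated as stale and removed.
    if ! kill -QUIT "$pid" 2>/dev/null; then
        rm -f "$PIDFILE"
        return 0
    fi

    # Poll until the master exits. Deliberately no SIGKILL: a master that
    # hangs during shutdown is left alone and reported as a failure.
    remaining=$STOP_TIMEOUT
    while [ "$remaining" -gt 0 ]; do
        if ! kill -0 "$pid" 2>/dev/null; then
            rm -f "$PIDFILE"
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    echo "condor_master (pid $pid) still running after ${STOP_TIMEOUT}s" >&2
    return 1
}
```

Because the signal goes to the PID from the pidfile, this needs no condor_off and is not subject to Condor's own authorization policy. The non-zero return on timeout is the visible side of the trade-off in the NOTE above.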

Comment 4 errata-xmlrpc 2009-02-04 16:05:37 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0036.html

