Description of problem:
The tools in the following list return a wrong exit code (0) when they are run after the related daemon (such as condor_master, condor_schedd or condor_startd) has stopped unexpectedly:

condor_reschedule
condor_vacate
condor_off
condor_on
condor_reconfig
condor_restart

Version-Release number of selected component (if applicable):
condor-7.4.4-0.16

How reproducible:
100%

Steps to Reproduce:
1. Start condor and kill condor_master with '-9', or simply run 'condor_master -n'.
2. Run 'condor_off; echo $?'.

# condor_off; echo $?
Can't connect to local master
0

Actual results:
Wrong return code (0).

Expected results:
A correct return code (other than 0) when an error occurs.

Additional info:
Or...

$ echo "<1.2.3.4:1234>" > dummy_master_address
$ _CONDOR_MASTER_ADDRESS_FILE=$PWD/dummy_master_address condor_off
Can't connect to local master
$ echo $?
0

The 'condor_master -n' step appears to be just a way to get a LOG/.master_address file written and not cleaned up when the master exits.
Bad...

$ for t in condor_on condor_reconfig condor_restart condor_off; do $t; echo $?; done
Can't connect to local master
0
Can't connect to local master
0
Can't connect to local master
0
Can't connect to local master
0

Good...

$ for t in condor_on condor_reconfig condor_restart condor_off; do $t; echo $?; done
Can't connect to local master
1
Can't connect to local master
1
Can't connect to local master
1
Can't connect to local master
1
Upstream at https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1895

Fixed for 7.5.6.

commit 6bac22ec1623d39e72e0310ef9485586d1d2e112
Author: Matthew Farrellee <matt@redhat>
Date:   Thu Feb 3 15:56:08 2011 -0500

    Report errors via tool exit codes when talking to local master, #1895

diff --git a/src/condor_tools/tool.cpp b/src/condor_tools/tool.cpp
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
C: Some command-line tools would not consistently report errors via their return code.
C: Scripting the tools can be more complicated than is necessary.
F: A number of tools (mentioned in BZ) were updated to report errors when connection problems occurred.
R: The tools are now easier to script as they report errors when they have connection problems.
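For illustration only, here is a minimal sketch of the kind of scripting the fix makes possible. It assumes nothing beyond the behavior described in this report (a non-zero exit code on connection problems). The helper name run_tool is hypothetical and not part of Condor.

```shell
#!/bin/sh
# run_tool is a hypothetical helper, not part of Condor.
# It runs the given command and relies only on its exit status:
# a non-zero status is reported on stderr and passed back to the caller.
run_tool() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "failed: $1 (exit $status)" >&2
    fi
    return "$status"
}
```

A script could then stop on the first failure with, for example, 'run_tool condor_off || exit 1'. With the affected builds this check never triggers, because the tools return 0 even when they cannot connect.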
Reproduced on x86_64/RHEL5 with:

$CondorVersion: 7.4.5 Feb 4 2011 BuildID: RH-7.4.5-0.8.el5 PRE-RELEASE $
$CondorPlatform: X86_64-LINUX_RHEL5 $

# condor_off; echo $?
Can't connect to local master
0
# condor_vacate; echo $?
Can't connect to local startd
0
# condor_off; echo $?
Can't connect to local master
0
# condor_on; echo $?
Can't connect to local master
0
# condor_reconfig; echo $?
Can't connect to local master
0
# condor_restart ; echo $?
Can't connect to local master
0
Retested on all supported platforms (x86, x86_64 / RHEL5, RHEL6) with:
condor-7.6.1-0.4

# condor_off; echo $?
Can't find address for local master
Perhaps you need to query another pool.
1
# condor_vacate; echo $?
Can't find address for local startd
Perhaps you need to query another pool.
1
# condor_off; echo $?
Can't find address for local master
Perhaps you need to query another pool.
1
# condor_on; echo $?
Can't find address for local master
Perhaps you need to query another pool.
1
# condor_reconfig; echo $?
Can't find address for local master
Perhaps you need to query another pool.
1
# condor_restart ; echo $?
Can't find address for local master
Perhaps you need to query another pool.
1

All tools now return the correct error code.

>>> VERIFIED
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0889.html