Bug 619404 - condor_configd restarted every hour
Summary: condor_configd restarted every hour
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: condor
Version: beta
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: 1.3
Target Release: ---
Assignee: Matthew Farrellee
QA Contact: MRG Quality Engineering
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-07-29 13:27 UTC by Tomas Rusnak
Modified: 2010-07-29 15:34 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-07-29 15:34:12 UTC
Target Upstream Version:
Embargoed:


Description Tomas Rusnak 2010-07-29 13:27:25 UTC
Description of problem:
condor_configd is restarted every hour for no apparent reason.

Version-Release number of selected component (if applicable):


How reproducible:
Start condor and take a look at the /var/log/condor/MasterLog.

Steps to Reproduce:
1. service condor start
2. tail -f /var/log/condor/MasterLog
3. wait a couple of hours
  
Actual results:
condor_configd restarted every hour

Expected results:
no restart


Additional info:

07/28 17:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 8215
07/28 17:20:50 The QMF_CONFIGD (pid 8215) exited with status 1
07/28 17:20:50 restarting /usr/sbin/condor_configd in 3600 seconds
07/28 18:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 9017
07/28 18:20:50 The QMF_CONFIGD (pid 9017) exited with status 1
07/28 18:20:50 restarting /usr/sbin/condor_configd in 3600 seconds
07/28 19:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 9783
07/28 19:20:50 The QMF_CONFIGD (pid 9783) exited with status 1
07/28 19:20:50 restarting /usr/sbin/condor_configd in 3600 seconds
07/28 20:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 10549
07/28 20:20:50 The QMF_CONFIGD (pid 10549) exited with status 1
07/28 20:20:50 restarting /usr/sbin/condor_configd in 3600 seconds
07/28 21:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 11315
07/28 21:20:50 The QMF_CONFIGD (pid 11315) exited with status 1
07/28 21:20:50 restarting /usr/sbin/condor_configd in 3600 seconds
07/28 22:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 12081
07/28 22:20:50 The QMF_CONFIGD (pid 12081) exited with status 1
07/28 22:20:50 restarting /usr/sbin/condor_configd in 3600 seconds
07/28 23:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 12847
07/28 23:20:50 The QMF_CONFIGD (pid 12847) exited with status 1
07/28 23:20:50 restarting /usr/sbin/condor_configd in 3600 seconds
07/29 00:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 13613
07/29 00:20:50 The QMF_CONFIGD (pid 13613) exited with status 1
07/29 00:20:50 restarting /usr/sbin/condor_configd in 3600 seconds

Comment 1 Tomas Rusnak 2010-07-29 13:28:46 UTC
Affected version:

$CondorVersion: 7.4.4 Jul 23 2010 BuildID: RH-7.4.4-0.6.el5 PRE-RELEASE $
$CondorPlatform: I386-LINUX_RHEL5 $

Comment 2 Matthew Farrellee 2010-07-29 15:34:12 UTC
The QMF_CONFIGD is failing as soon as it starts and exiting with status 1. After each exit, condor_master reports that it will restart the configd in 3600 seconds (1 hour), which it does. The configd then fails again and the cycle repeats.

http://www.cs.wisc.edu/condor/manual/v7.4/3_3Configuration.html#15663

MASTER_BACKOFF_CEILING defaults to 1 hour (3600 seconds).
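
A rough sketch of why the delay settles at exactly one hour. It assumes the exponential backoff described in the Condor manual, delay = MASTER_BACKOFF_CONSTANT + MASTER_BACKOFF_FACTOR^n, capped by MASTER_BACKOFF_CEILING. The defaults used for the constant (9) and factor (2.0) are assumptions not stated in this report, and the function names are hypothetical, so check them against your own configuration.

```python
def restart_delay(restarts, constant=9.0, factor=2.0, ceiling=3600.0):
    """Seconds the master waits before the next restart attempt.

    `restarts` is the number of restarts already attempted. The default
    constant and factor are assumed values, not taken from this report.
    """
    return min(constant + factor ** restarts, ceiling)


def restarts_until_ceiling(constant=9.0, factor=2.0, ceiling=3600.0):
    """Number of restarts after which the delay first reaches the ceiling."""
    n = 0
    while restart_delay(n, constant, factor, ceiling) < ceiling:
        n += 1
    return n


if __name__ == "__main__":
    # Print the delay sequence for a daemon that fails on every start.
    for n in range(restarts_until_ceiling() + 1):
        print(n, restart_delay(n))
```

Under those assumed defaults the delay grows from about ten seconds to the 3600 second ceiling within a dozen or so failed starts, after which every restart is one hour apart.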

-

If you look earlier in the file you'll see the configd being restarted more frequently, before the restart delay backed off to that ceiling.
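
One way to check this is to pull the restart delays out of the MasterLog. The following is a minimal sketch that assumes the log lines have the same format as those quoted in the description; the function name and regular expression are illustrative, not part of Condor.

```python
import re

# Matches lines such as:
# 07/28 17:20:50 restarting /usr/sbin/condor_configd in 3600 seconds
RESTART_RE = re.compile(
    r"^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}) restarting (\S+) in (\d+) seconds"
)


def restart_delays(lines, daemon="/usr/sbin/condor_configd"):
    """Return (timestamp, delay_seconds) for each restart notice of `daemon`."""
    delays = []
    for line in lines:
        match = RESTART_RE.match(line)
        if match and match.group(2) == daemon:
            delays.append((match.group(1), int(match.group(3))))
    return delays


# Sample lines copied from the log excerpt in this report.
sample = [
    '07/28 17:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 8215',
    "07/28 17:20:50 The QMF_CONFIGD (pid 8215) exited with status 1",
    "07/28 17:20:50 restarting /usr/sbin/condor_configd in 3600 seconds",
    '07/28 18:20:50 Started process "/usr/sbin/condor_configd", pid and pgroup = 9017',
    "07/28 18:20:50 The QMF_CONFIGD (pid 9017) exited with status 1",
    "07/28 18:20:50 restarting /usr/sbin/condor_configd in 3600 seconds",
]

if __name__ == "__main__":
    for timestamp, delay in restart_delays(sample):
        print(timestamp, delay)
```

To run it against the real log, pass an open file instead of the sample, for example `restart_delays(open("/var/log/condor/MasterLog"))`. Delays shorter than 3600 seconds near the top of the file would show the backoff ramping up.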

