Created attachment 430582
Description of problem:
When the condor service is stopped on a RHEL4 machine and someone tries to configure that machine using remote configuration (the wallaby service), condor_configd stops running after condor is started. The master daemon restarts condor_configd after this exit, but it does not work properly: the machine does not accept any further configuration from the configuration store.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Stop condor with 'service condor stop' on a RHEL4 machine
2. Try to configure the node using remote configuration
3. Start condor with 'service condor start'
4. Try to configure the node using remote configuration again
Actual results:
Only the configuration from step 2 is accepted by the RHEL4 machine.
MasterLog records condor_configd exiting:
07/09 04:37:33 The QMF_CONFIGD (pid 31552) exited with status 1
07/09 04:37:33 Sending obituary for "/usr/sbin/condor_configd"
Expected results:
There is no such exit of condor_configd in MasterLog.
The configuration from step 4 is present on the RHEL4 machine.
Additional info:
MasterLog and ConfigLog, as they were after step 3, are in the attachment.
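For anyone checking an attached MasterLog for this symptom, here is a minimal sketch that picks out daemon exit records. It assumes the line layout seen in the excerpt above ("MM/DD HH:MM:SS The NAME (pid N) exited with status S"). The function name `find_daemon_exits` is hypothetical and not part of condor.

```python
import re

# Pattern based only on the MasterLog excerpt in this report; other
# condor versions may format the line differently (assumption).
EXIT_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"The (?P<daemon>\S+) \(pid (?P<pid>\d+)\) "
    r"exited with status (?P<status>-?\d+)"
)


def find_daemon_exits(lines, daemon="QMF_CONFIGD"):
    """Return (timestamp, pid, status) for each exit record of `daemon`."""
    exits = []
    for line in lines:
        match = EXIT_RE.match(line.strip())
        if match and match.group("daemon") == daemon:
            exits.append(
                (match.group("stamp"),
                 int(match.group("pid")),
                 int(match.group("status")))
            )
    return exits


sample = [
    "07/09 04:37:33 The QMF_CONFIGD (pid 31552) exited with status 1",
    '07/09 04:37:33 Sending obituary for "/usr/sbin/condor_configd"',
]
exits = find_daemon_exits(sample)
```

An empty result for a log captured after step 3 would correspond to the expected behaviour described above.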
There was a problem handling the output of the query of the master daemon for the DAEMON_LIST. As a result, the main thread could crash, which could cause the periodic thread to deadlock or the configd to exit altogether.
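To illustrate the kind of defensive handling this implies, here is a rough sketch, not the actual fix. The report does not show the query command or its raw output, so the input format is an assumption: a comma- and/or whitespace-separated list of daemon names, with empty output or a "Not defined" reply treated as unusable. `parse_daemon_list` is a hypothetical helper name.

```python
import re


def parse_daemon_list(raw):
    """Return daemon names parsed from raw query output.

    Returns an empty list instead of raising when the output is missing,
    empty, or reports that the value is not defined (assumed reply text).
    """
    if raw is None:
        return []
    text = raw.strip()
    if not text or text.lower().startswith("not defined"):
        return []
    # Split on any run of commas and/or whitespace.
    return [name for name in re.split(r"[,\s]+", text) if name]


daemons = parse_daemon_list("MASTER, STARTD, QMF_CONFIGD\n")
```

Returning an empty list rather than raising keeps the caller's thread alive, so the caller can decide whether to retry the query or skip the update.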
Tested with (version):
I repeated the test and it still doesn't work correctly.
This uncovered an issue in a logging error message that was causing the configd to exit. This wasn't seen before because the message is only logged when there is an issue with the store, and it seems there is one. The log message that caused the configd to exit is fixed, but the cause of the store issue is still under investigation.
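The faulty logging statement itself is not shown in this report. As a hedged illustration of how an error-path log message can fail, and one way to keep it from doing so, the sketch below formats a store error so that an unexpected argument type cannot raise. `format_store_error` is a hypothetical name, and the message layout simply mirrors the "Store error:" line that appears in ConfigLog.

```python
def format_store_error(code, message):
    """Build a 'Store error' log line without raising on odd arguments."""
    try:
        # Normal case: numeric error code plus message text.
        return "Store error: %d, %s" % (int(code), message)
    except (TypeError, ValueError):
        # Fallback: show the raw values rather than failing in the error path.
        return "Store error: %r, %r" % (code, message)


line = format_store_error(1, 'ERROR: near "-": syntax error')
```

The point of the fallback is that code running only when something has already gone wrong is rarely exercised, so it should avoid introducing a second failure.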
The issue with the store is logged as 623220.
Cannot be validated until BZ623220 is fixed.
08/18 05:12:50 DEBUG: Retrieved node object from store
08/18 05:12:58 DEBUG: Checking version of condor configuration
08/18 05:12:58 INFO: Retrieving configuration version "1282122748834749" from the store
08/18 05:12:59 DEBUG: Retrieved configuration from the store
08/18 05:12:59 ERROR: Store error: 1, ERROR: near "-": syntax error
08/18 05:12:59 ERROR: Failed to retrive differences between versions "1282039388665491" and "1282122748834749". No update performed
08/18 05:12:59 DEBUG: Performing a checkin with the store
08/18 05:12:59 DEBUG: Checked in with the store
Tested with (version):
RHEL5 x86_64,i386 - passed
RHEL4 x86_64,i386 - passed