Bug 612869

Summary: problem with wallaby client on RHEL4
Product: Red Hat Enterprise MRG Reporter: Lubos Trilety <ltrilety>
Component: wallaby Assignee: Robert Rati <rrati>
Status: CLOSED ERRATA QA Contact: Lubos Trilety <ltrilety>
Severity: medium Docs Contact:
Priority: medium    
Version: Development CC: rrati
Target Milestone: 1.3   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 623220    
Bug Blocks: 614414, 620511    
Attachments:
Description Flags
Logs none

Description Lubos Trilety 2010-07-09 09:30:42 UTC
Created attachment 430582
Logs

Description of problem:
When the condor service is stopped on a RHEL4 machine and someone tries to configure that machine using remote configuration (the wallaby service), condor_configd exits after condor is started. The master daemon restarts condor_configd after this exit, but it does not work properly: the machine no longer accepts any further configuration from the configuration store.

Version-Release number of selected component (if applicable):
condor-wallaby-client-3.0-1

How reproducible:
100%

Steps to Reproduce:
1. Stop condor ('service condor stop') on a RHEL4 machine
2. Try to configure the node using remote configuration
3. Start condor ('service condor start')
4. Try to configure the node using remote configuration again
  
Actual results:
Only the configuration from step 2 is accepted by the RHEL4 machine.
MasterLog contains an entry about condor_configd exiting.
For example:
07/09 04:37:33 The QMF_CONFIGD (pid 31552) exited with status 1
07/09 04:37:33 Sending obituary for "/usr/sbin/condor_configd"

Expected results:
MasterLog shows no such exit of condor_configd.
The configuration from step 4 is present on the RHEL4 machine.

Additional info:
MasterLog and ConfigLog, as they were after step 3, are attached.
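
To check a MasterLog for the exit described under "Actual results", a small script along these lines could help. It is only a sketch: the line format is inferred from the two example lines quoted above, and the names `EXIT_RE` and `find_daemon_exits` are hypothetical, not part of condor or wallaby.

```python
import re

# Pattern inferred from the MasterLog example in this report, e.g.
#   07/09 04:37:33 The QMF_CONFIGD (pid 31552) exited with status 1
# Other condor versions may format this line differently (assumption).
EXIT_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"The (?P<daemon>\S+) \(pid (?P<pid>\d+)\) "
    r"exited with status (?P<status>\d+)"
)


def find_daemon_exits(lines, daemon="QMF_CONFIGD"):
    """Return (timestamp, pid, status) for every exit of `daemon` in `lines`."""
    exits = []
    for line in lines:
        match = EXIT_RE.match(line.strip())
        if match and match.group("daemon") == daemon:
            exits.append(
                (
                    match.group("stamp"),
                    int(match.group("pid")),
                    int(match.group("status")),
                )
            )
    return exits


if __name__ == "__main__":
    sample = [
        "07/09 04:37:33 The QMF_CONFIGD (pid 31552) exited with status 1",
        '07/09 04:37:33 Sending obituary for "/usr/sbin/condor_configd"',
    ]
    for stamp, pid, status in find_daemon_exits(sample):
        print(stamp, pid, status)
```

An empty result after step 3 would correspond to the expected behaviour; any entry with a non-zero status reproduces the problem.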

Comment 1 Robert Rati 2010-07-09 20:18:02 UTC
There was a problem handling the output of the query of the master daemon for the DAEMON_LIST.  As a result, the main thread could crash, which could cause the periodic thread to deadlock or the configd to exit altogether.

Fixed in:
condor-wallaby-client-3.1
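
For illustration only, defensive handling of the DAEMON_LIST query output could look like the sketch below. This is not the actual fix in condor-wallaby-client; the function name `parse_daemon_list` is hypothetical, and the "Not defined" reply for an unset parameter is an assumption about what the query may print when the master cannot answer.

```python
import re


def parse_daemon_list(raw):
    """Turn raw DAEMON_LIST query output into a list of daemon names.

    Returns an empty list, rather than raising, when the output is
    missing, empty, or reports the parameter as undefined (assumed to
    start with "Not defined"), so the caller never crashes on bad input.
    """
    if raw is None:
        return []
    text = raw.strip()
    if not text or text.lower().startswith("not defined"):
        return []
    # DAEMON_LIST entries may be separated by commas and/or whitespace.
    return [name for name in re.split(r"[,\s]+", text) if name]


if __name__ == "__main__":
    print(parse_daemon_list("MASTER, STARTD, QMF_CONFIGD\n"))
    print(parse_daemon_list(""))
```

Returning an empty list keeps the caller's control flow simple: the main thread can treat "no daemons known" as a normal state and retry later instead of dying on unexpected output.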

Comment 2 Robert Rati 2010-07-09 20:18:22 UTC
Fixed in:
condor-wallaby-client-3.1-1

Comment 4 Lubos Trilety 2010-08-11 12:49:14 UTC
Tested with (version):
condor-wallaby-client-3.3-1

I repeated the test and it still doesn't work correctly.

Comment 5 Robert Rati 2010-08-11 15:10:50 UTC
This uncovered an issue in an error log message that was causing the configd to exit.  This wasn't seen before because the message would only be logged if there was an issue with the store, and it seems there is an issue with the store.  The log message that caused the configd to exit is fixed, but I am still investigating the cause of the store issue.
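
The comment does not say what exactly was wrong with the log message. As one possible illustration, assuming the failure was an exception raised while building the message string, a guard like the following keeps a bad format string from taking the daemon down. The helper `safe_log` is hypothetical and is not taken from condor-wallaby-client.

```python
import logging


def safe_log(logger, level, fmt, *args):
    """Log `fmt % args`, falling back to the raw pieces if formatting fails.

    Returns the message that was logged so callers and tests can inspect it.
    """
    try:
        message = fmt % args if args else fmt
    except (TypeError, ValueError) as exc:
        # A mismatched format string must not abort the caller.
        message = "%s (unformattable args %r: %s)" % (fmt, args, exc)
    logger.log(level, message)
    return message


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    log = logging.getLogger("configd-sketch")
    # Correct call: the arguments match the format string.
    safe_log(log, logging.ERROR, "Store error: %d, %s", 1, "syntax error")
    # Broken call: %d receives a string, which would normally raise TypeError.
    safe_log(log, logging.ERROR, "Store error: %d, %s", "1")
```

Error paths like this are rarely exercised, which matches the observation above that the problem only showed up once the store started returning errors.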

Comment 6 Robert Rati 2010-08-11 16:44:21 UTC
The issue with the store is logged as 623220.

Comment 7 Robert Rati 2010-08-11 20:38:49 UTC
Fixed in:
condor-wallaby-client-3.4-1

Comment 8 Lubos Trilety 2010-08-18 09:16:49 UTC
Cannot be validated until BZ623220 is fixed.

08/18 05:12:50 DEBUG: Retrieved node object from store
08/18 05:12:58 DEBUG: Checking version of condor configuration
08/18 05:12:58 INFO: Retrieving configuration version "1282122748834749" from the store
08/18 05:12:59 DEBUG: Retrieved configuration from the store
08/18 05:12:59 ERROR: Store error: 1, ERROR: near "-": syntax error
08/18 05:12:59 ERROR: Failed to retrive differences between versions "1282039388665491" and "1282122748834749".  No update performed
08/18 05:12:59 DEBUG: Performing a checkin with the store
08/18 05:12:59 DEBUG: Checked in with the store
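
When re-running this check, the ERROR entries can be pulled out of the configd log with a short script. This is a sketch that assumes every line follows the "MM/DD HH:MM:SS LEVEL: message" layout seen in the excerpt above; `LOG_RE` and `errors_in_config_log` are hypothetical names.

```python
import re

# Layout assumed from the excerpt: "08/18 05:12:59 ERROR: <message>"
LOG_RE = re.compile(
    r"^(?P<stamp>\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+): (?P<message>.*)$"
)


def errors_in_config_log(lines):
    """Return (timestamp, message) for every ERROR line in `lines`."""
    errors = []
    for line in lines:
        match = LOG_RE.match(line.rstrip("\n"))
        if match and match.group("level") == "ERROR":
            errors.append((match.group("stamp"), match.group("message")))
    return errors


if __name__ == "__main__":
    excerpt = [
        "08/18 05:12:59 DEBUG: Retrieved configuration from the store",
        '08/18 05:12:59 ERROR: Store error: 1, ERROR: near "-": syntax error',
        "08/18 05:12:59 DEBUG: Performing a checkin with the store",
    ]
    for stamp, message in errors_in_config_log(excerpt):
        print(stamp, message)
```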

Comment 9 Lubos Trilety 2010-08-20 08:20:00 UTC
Tested with (version):
wallaby-0.9.4-1
condor-wallaby-client-3.4-1

Tested on:
RHEL5 x86_64,i386  - passed
RHEL4 x86_64,i386  - passed

>>> VERIFIED