Bug 748507
Summary: | Wallaby provides DAEMON_LIST = >=MASTER -> condor_master failed to startup | ||
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Matthew Farrellee <matt> |
Component: | wallaby | Assignee: | Will Benton <willb> |
Status: | CLOSED ERRATA | QA Contact: | Tomas Rusnak <trusnak> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 2.1 | CC: | ltoscano, matt, mkudlej, trusnak |
Target Milestone: | 2.2 | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | wallaby-0.14.3-1 (backported to wallaby-0.12.5-10) | Doc Type: | Bug Fix |
Doc Text: |
C: In some scenarios in which the default group configuration used string append operators, Wallaby could generate spurious node configurations.
C: When such spurious node configurations were deployed to Condor nodes, the Condor master could fail to start.
F: Wallaby no longer generates such configurations and should work around archived spurious configurations.
R: This problem should no longer be present.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2012-09-19 17:41:55 UTC | Type: | --- |
Bug Blocks: | 828434 |
Description
Matthew Farrellee
2011-10-24 16:08:05 UTC
I've:
1) set up wallaby store
2) added Master,NodeAccess,ExecuteNode to default group
3) installed wallaby client on nodes
4) restarted condor on clients
It hasn't started because:
$ condor_config_val DAEMON_LIST
DAEMON_LIST = >=>=STARTD, MASTER, CONFIGD
$ tail /var/log/condor/Master
>=>=STARTD is in the DAEMON_LIST parameter, but there is no executable path for it defined in the config files!
05/28/12 06:30:44 ERROR "Must have the path to >=>=STARTD defined." at line 1388 in file /builddir/build/BUILD/condor-7.6.7/src/condor_master.V6/masterDaemon.cpp
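The `>=>=STARTD` entry in the output above suggests that an append marker reached the node configuration instead of being resolved. The following is a minimal sketch of how such resolution would be expected to behave, assuming that a value prefixed with `>=` means "append to the inherited value". It is not Wallaby's implementation; `resolve_param`, `APPEND_MARKER`, and the example layers are hypothetical names chosen for illustration.

```python
# Illustrative only: not Wallaby's code. Assumes a ">=" prefix means
# "append this value to whatever lower-priority layers already provide".
APPEND_MARKER = ">="


def resolve_param(layers):
    """Fold parameter values from lowest to highest priority.

    A value starting with APPEND_MARKER is appended (comma-separated) to the
    value accumulated so far; any other value replaces it. When there is
    nothing to append to, the marker is dropped rather than emitted.
    """
    result = ""
    for value in layers:
        if value.startswith(APPEND_MARKER):
            addition = value[len(APPEND_MARKER):].strip()
            if not addition:
                continue
            result = f"{result}, {addition}" if result else addition
        else:
            result = value.strip()
    return result


# Hypothetical layering: each feature on the default group appends one daemon.
print(resolve_param([">= MASTER", ">= STARTD", ">= CONFIGD"]))
```

In this sketch the marker never survives into the resolved value, even when the first layer is itself an append, which is the property the broken node configuration lacked.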
"rm /var/lib/condor/wallaby_node.config" has helped.
Uh, sorry. It helped only until a new Wallaby configuration was loaded for the clients.
I've reproduced this now.
Could you please write here how to reproduce it?
Martin, the procedure you describe in comment 1 does the trick. The key is having a feature on the default group that appends to DAEMON_LIST (the ones you added will do), then activating the Wallaby configuration before having a new node check in. You can see an example of this in the following test case: http://git.fedorahosted.org/git/?p=grid/wallaby.git;a=blob;f=spec/bz748507_spec.rb;hb=HEAD
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
C: In some scenarios in which the default group configuration used string append operators, Wallaby could generate spurious node configurations.
C: When such spurious node configurations were deployed to Condor nodes, the Condor master could fail to start.
F: Wallaby no longer generates such configurations and should work around archived spurious configurations.
R: This problem should no longer be present.
Retested on RHEL5/RHEL6 with:
wallaby-0.12.5-1.el6.noarch
# /usr/bin/wallaby load /var/lib/condor-wallaby-base-db/condor-base-db.snapshot
# wallaby add-features-to-group +++DEFAULT Master NodeAccess ExecuteNode
Console Connection Established...
# wallaby add-params-to-feature NodeAccess ALLOW_READ=* ALLOW_WRITE=*
Console Connection Established...
# wallaby add-params-to-group +++DEFAULT CONDOR_HOST=hostname
Console Connection Established...
# wallaby activate
Console Connection Established...
On clients:
# condor_config_val DAEMON_LIST
STARTD, MASTER, CONFIGD
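The manual check of DAEMON_LIST on clients could be scripted along these lines. This is a minimal sketch that only inspects a string such as the one printed by `condor_config_val DAEMON_LIST`; the function name `unresolved_entries` is hypothetical, and treating `>=` as the append marker is an assumption based on the values reported in this bug.

```python
def unresolved_entries(daemon_list, marker=">="):
    """Return DAEMON_LIST entries that still contain an append marker."""
    entries = [entry.strip() for entry in daemon_list.split(",")]
    return [entry for entry in entries if marker in entry]


# Value reported in the original description (broken configuration).
print(unresolved_entries(">=>=STARTD, MASTER, CONFIGD"))  # ['>=>=STARTD']

# Value seen on clients after the fix.
print(unresolved_entries("STARTD, MASTER, CONFIGD"))  # []
```

An empty result means no leftover markers were found in the list; it does not by itself confirm that condor_master started.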
No error found in MasterLog anymore:
# tail /var/log/condor/MasterLog
08/21/12 11:39:47 Using local config sources:
08/21/12 11:39:47 /etc/condor/config.d/00personal_condor.config
08/21/12 11:39:47 /etc/condor/config.d/99configd.config
08/21/12 11:39:47 /etc/condor/config.d/zzz_condor_config.test
08/21/12 11:39:47 /var/lib/condor/wallaby_node.config
08/21/12 11:39:47 DaemonCore: command socket at <IP:35639>
08/21/12 11:39:47 DaemonCore: private command socket at <IP:35639>
08/21/12 11:39:47 Setting maximum accepts per cycle 8.
08/21/12 11:39:47 Started DaemonCore process "/usr/sbin/condor_startd", pid and pgroup = 19031
08/21/12 11:39:47 Started process "/usr/sbin/condor_configd", pid and pgroup = 19032
>>> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2012-1278.html