Bug 495685 - HOSTALLOW_WRITE denied between HA Schedulers
Status: CLOSED ERRATA
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: grid
Version: 1.1.1
Hardware: All Linux
Priority: medium
Severity: medium
Target Milestone: 1.2
Target Release: ---
Assigned To: Robert Rati
QA Contact: Martin Kudlej
Depends On:
Blocks: 527551
Reported: 2009-04-14 08:18 EDT by Matthew Farrellee
Modified: 2009-12-03 04:19 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Grid bug fix

C: When the schedd and startd were being configured on the same system
C: Jobs to be run can match but will not ever start
F: A change was made to the way the schedd and startd are configured
R: Jobs will be matched and start as expected

The schedd and startd configuration files were being configured incorrectly when installed on the same system. This was causing jobs to match, but refuse to start. A change was made to the way the daemons were configured, and jobs now run as expected.
Last Closed: 2009-12-03 04:19:10 EST


Description Matthew Farrellee 2009-04-14 08:18:16 EDT
# rpm -q condor-remote-configuration-server
condor-remote-configuration-server-1.0-14.el5

In the pool right now we have sched0 and sched1 as HA Schedulers and as Startds. However, HOSTALLOW_WRITE on one does not include the other. This means jobs to be run on one can match but cannot ever start...

From the StartLog on sched1:

4/14 08:09:05 PERMISSION DENIED to condor from host sched0 for command 442 (REQUEST_CLAIM), access level DAEMON: reason: cached result for DAEMON; see first case for the full reason
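When scanning a StartLog for denials like the one above, it can help to pull out the host, command, and access level automatically. The following is only a minimal sketch: the regular expression is modelled on the single log line quoted in this report, and the function name `parse_denial` is made up for illustration. Other Condor versions may word the message differently.

```python
import re

# Pattern modelled on the one StartLog line quoted in this bug report.
# Assumption: other Condor releases may format the message differently.
DENIED_RE = re.compile(
    r"PERMISSION DENIED to (?P<user>\S+) from host (?P<host>\S+) "
    r"for command (?P<cmd_num>\d+) \((?P<cmd_name>\w+)\), "
    r"access level (?P<level>\w+)"
)


def parse_denial(line):
    """Return a dict of the denial fields, or None if the line does not match."""
    match = DENIED_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "4/14 08:09:05 PERMISSION DENIED to condor from host sched0 "
    "for command 442 (REQUEST_CLAIM), access level DAEMON: reason: "
    "cached result for DAEMON; see first case for the full reason"
)
denial = parse_denial(sample)
print(denial)
```

For the sample line, the parsed fields identify sched0 as the rejected host and REQUEST_CLAIM as the rejected command, which is the claim request a schedd sends to a startd.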

Please include a simple example to reproduce this.
Comment 1 Matthew Farrellee 2009-04-14 08:23:36 EDT
Maybe there's a simple typo in the code:

sched1# condor_config_val -dump | grep HOSTALLOW_WRITE_STARTD
HOSTALLOW_WRITE_STARTD = startd, $(HOSTALLOW_WRITE), $(FLOCK_FROM)

"startd" in this context has no meaning
Comment 2 Matthew Farrellee 2009-04-14 09:23:30 EDT
A workaround is to add config to the condor_config.overrides file:

 On sched0:
   HOSTALLOW_WRITE_STARTD = sched1, $(HOSTALLOW_WRITE_STARTD)

 On sched1:
   HOSTALLOW_WRITE_STARTD = sched0, $(HOSTALLOW_WRITE_STARTD)
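To see why the override helps, here is a rough sketch of how the self-referencing assignment prepends the peer scheduler to the existing list. It is a simplified model, not Condor's actual config parser: it applies assignments in order and replaces each $(NAME) with the value known so far. The HOSTALLOW_WRITE value `cm.example.com` and the empty FLOCK_FROM are assumptions, since the report does not show them, and the helper names are made up.

```python
import re

MACRO_RE = re.compile(r"\$\((\w+)\)")


def apply_assignments(assignments):
    """Apply (name, value) pairs in order, expanding $(NAME) eagerly.

    Simplified model (assumption): each reference, including a
    self-reference, is replaced by the value defined so far; undefined
    names expand to the empty string.
    """
    config = {}
    for name, value in assignments:
        config[name] = MACRO_RE.sub(
            lambda m: config.get(m.group(1), ""), value
        )
    return config


def host_list(value):
    """Split a comma-separated host list, dropping empty entries."""
    return [h.strip() for h in value.split(",") if h.strip()]


# Values on sched1. HOSTALLOW_WRITE and FLOCK_FROM are hypothetical.
base = [
    ("HOSTALLOW_WRITE", "cm.example.com"),
    ("FLOCK_FROM", ""),
    ("HOSTALLOW_WRITE_STARTD", "startd, $(HOSTALLOW_WRITE), $(FLOCK_FROM)"),
]
# The line added to condor_config.overrides on sched1.
override = [("HOSTALLOW_WRITE_STARTD", "sched0, $(HOSTALLOW_WRITE_STARTD)")]

before = host_list(apply_assignments(base)["HOSTALLOW_WRITE_STARTD"])
after = host_list(apply_assignments(base + override)["HOSTALLOW_WRITE_STARTD"])
print("before:", before)
print("after: ", after)
```

In this model sched0 is absent from the list before the override and present after it, while the literal "startd" entry is carried along unchanged.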
Comment 3 Robert Rati 2009-09-17 16:14:51 EDT
This was actually an issue with the schedd and startd being configured on the same system.  The schedulers will now be prompted for when a startd is configured on a system that also has a schedd on it.  As always, nodes that are part of an HA scheduler do not need to be entered, as they will be automatically determined from the HA schedd name.

Fixed in:
condor-remote-configuration-1.0-17
Comment 4 Robert Rati 2009-09-17 16:15:25 EDT
All schedulers will need to be re-configured with this package for the changes to take effect.
Comment 6 Martin Kudlej 2009-10-23 08:21:35 EDT
I've tried it with condor-remote-configuration-server-1.0-14 on RHEL 5.4 and
condor-remote-configuration-1.0-14 on RHEL 4.8 (i386 x x86_64), and it doesn't work.
I've tried it with condor-remote-configuration(-server)-1.0-22 and it works --> VERIFIED
Comment 7 Irina Boverman 2009-10-29 10:29:46 EDT
Release note added. If any revisions are required, please set the 
"requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

New Contents:
please see bug summary.
Comment 8 Lana Brindley 2009-11-04 22:35:42 EST
Release note updated. If any revisions are required, please set the 
"requires_release_notes"  flag to "?" and edit the "Release Notes" field accordingly.
All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1,8 @@
-please see bug summary.
+Grid bug fix
+
+C: When the schedd and startd were being configured on the same system
+C: Jobs to be run can match but will not ever start
+F: A change was made to the way the schedd and startd are configured
+R: Jobs will be matched and start as expected
+
+The schedd and startd configuration files were being configured incorrectly when installed on the same system. This was causing jobs to match, but refuse to start. A change was made to the way the daemons were configured, and jobs now run as expected.
Comment 10 errata-xmlrpc 2009-12-03 04:19:10 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1633.html
