Bug 474725
Summary: | Security session cache and account priv-switching | ||
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Matthew Farrellee <matt> |
Component: | grid | Assignee: | Matthew Farrellee <matt> |
Status: | CLOSED ERRATA | QA Contact: | Jeff Needle <jneedle> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 1.0 | CC: | dan |
Target Milestone: | 1.1 | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2009-02-04 16:04:12 UTC | Type: | --- |
Description
Matthew Farrellee
2008-12-05 00:22:15 UTC
Here is an example configuration snippet that I used to produce the problem with security sessions and JobRouter:

################## BEGIN #######################
SEC_DEFAULT_AUTHENTICATION = REQUIRED

# These settings become the default settings for all routes
JOB_ROUTER_DEFAULTS = \
  [ \
    requirements=target.WantJobRouter is True; \
    MaxIdleJobs = 10; \
    MaxJobs = 200; \
    \
    delete_WantJobRouter = true; \
    set_requirements = true; \
    TargetUniverse = 12; \
  ]

# Now we define each of the routes to send jobs on
JOB_ROUTER_ENTRIES = \
  [ \
    name = "Site 1"; \
  ]

# Reminder: you must restart Condor for changes to DAEMON_LIST to take effect.
DAEMON_LIST = $(DAEMON_LIST) JOB_ROUTER

# For testing, set this to a small value to speed things up.
JOB_ROUTER_POLLING_PERIOD = 10
########################## END ############################

I ran a personal condor (master, collector, negotiator, schedd, job_router) as root. I submitted jobs to it from two different accounts. The submit file was just this:

############ BEGIN ##############
universe = vanilla
requirements = false
+WantJobRouter = true
notification = never
should_transfer_files = yes
when_to_transfer_output = on_exit
executable = /usr/bin/env
output = stdout
error = stderr
queue
############ END ##############

If you wait for the first user's job to be routed, then the second user's job will fail to be routed and will therefore hang around in the queue forever. The problem is visible in the SchedLog:

12/8 15:18:24 OwnerCheck(user1) failed in SetAttribute for job 8.0

In this case, it is failing when trying to mark the submitted job as being managed by the job router. It is failing because it is using a security session that is mapped to user1 when trying to operate on a job owned by user2.

The problem does not happen unless read-access requires authentication. The reason is that the QMGMT command is registered as a read-level command.
So an authenticated security session is only created if read-level access requires authentication.

commit ff4f517a3e134ef0168f45f604e1e3b8a3d1e6be
Author: Dan Bradley <dan@>
Date: Tue Dec 9 15:58:11 2008 -0600

    Changed FS authentication to authenticate as condor when possible. This is now consistent with other authentication methods.

    Also made the queue super user(s) able to set the owner attribute to any value from the list of users who have ever owned jobs in the current instance of the schedd. This, in combination with the FS change, allows JobRouter to submit jobs as condor for all queue management operations rather than as individual users, thus avoiding the issue of correct session cache use when a daemon uses multiple identities to talk to the same service.

This will be part of 7.2.0-0.10

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0036.html
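The failure described in this report, and the approach taken in the commit, can be illustrated with a toy model. This is only a minimal Python sketch, not Condor code: the `Schedd` and `Client` classes, the session ids, the job ids, and the cache keyed only by server are hypothetical stand-ins chosen to mirror the description in this report.

```python
class Schedd:
    """Toy stand-in for a schedd (hypothetical, not Condor's implementation)."""

    def __init__(self, superusers=()):
        self.jobs = {}                      # job id -> owner
        self.sessions = {}                  # session id -> authenticated user
        self.superusers = set(superusers)   # queue super users
        self._next = 0

    def authenticate(self, user):
        """Create a security session mapped to `user` and return its id."""
        self._next += 1
        sid = f"s{self._next}"
        self.sessions[sid] = user
        return sid

    def set_attribute(self, sid, job_id):
        """Allow the change only for the job's owner or a queue super user."""
        user = self.sessions[sid]
        if user != self.jobs[job_id] and user not in self.superusers:
            raise PermissionError(
                f"OwnerCheck({user}) failed in SetAttribute for job {job_id}")
        return True


class Client:
    """Toy daemon whose session cache is keyed by server only (an assumption)."""

    def __init__(self, schedd):
        self.schedd = schedd
        self.cache = {}  # server key -> session id; the identity is NOT in the key

    def set_attribute_as(self, user, job_id):
        key = id(self.schedd)
        if key not in self.cache:
            # Only the first identity ever authenticates; later calls reuse it.
            self.cache[key] = self.schedd.authenticate(user)
        return self.schedd.set_attribute(self.cache[key], job_id)


def demo():
    jobs = {"7.0": "user1", "8.0": "user2"}

    # Problem: the cached session for user1 is reused for user2's job.
    schedd = Schedd()
    schedd.jobs = dict(jobs)
    router = Client(schedd)
    router.set_attribute_as("user1", "7.0")
    failure = None
    try:
        router.set_attribute_as("user2", "8.0")
    except PermissionError as exc:
        failure = str(exc)

    # Approach in the commit: always act as condor, a queue super user.
    schedd = Schedd(superusers={"condor"})
    schedd.jobs = dict(jobs)
    router = Client(schedd)
    ok = (router.set_attribute_as("condor", "7.0")
          and router.set_attribute_as("condor", "8.0"))
    return failure, ok
```

Calling `demo()` returns the failure message from the first scenario and a flag showing that both operations succeed in the second. Using a single identity sidesteps the cache problem rather than fixing the cache key, which matches the reasoning given in the commit message.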