Bug 458889
Summary: | Job Hooks leave starter in wrong privstate | | |
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Robert Rati <rrati> |
Component: | grid | Assignee: | Robert Rati <rrati> |
Status: | CLOSED ERRATA | QA Contact: | Kim van der Riet <kim.vdriet> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 1.0 | CC: | matt |
Target Milestone: | 1.1 | | |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2009-02-04 16:06:06 UTC | Type: | --- |
The problem is that before condor forked to create a process to run the hooks, it performed checks on the executable/command in the priv mode specified. For the condor_final priv, it changed the ruid to condor instead of changing only the euid to condor for the checks, so the starter ended up permanently as the condor user and thus wasn't able to access files that are not world readable.

A job like the one below should produce errors in StarterLog about being unable to access stdout and stderr:

Cmd = "/bin/date"
Out = "/home/rsquared/date.output"
Err = "/home/rsquared/date.err"
Iwd = "/home/rsquared"
Owner = "rsquared"

Bug supposedly present in condor 7.0.4-4

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0036.html
Description of problem:
It looks like after the job hooks are run, the starter somehow ends up in a final condor uid state and can't correctly access files created by a job.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
8/8 09:03:57 in VanillaProc::StartJob()
8/8 09:03:57 in OsProc::StartJob()
8/8 09:03:57 IWD: /autohome/u100/rherban/condortest
8/8 09:03:57 passwd_cache: setgroups( rherban ) failed.
8/8 09:03:57 set_user_egid - ERROR: initgroups(rherban, 6751) failed, errno: Operation not permitted
8/8 09:03:57 Input file: /autohome/u100/rherban/condortest/in.blast
8/8 09:03:57 Failed to open '/autohome/u100/rherban/condortest/out.blast' as standard output: Permission denied (errno 13)
8/8 09:03:57 Failed to open '/autohome/u100/rherban/condortest/err.blast' as standard error: Permission denied (errno 13)
8/8 09:03:57 Failed to open some/all of the std files...
8/8 09:03:57 Aborting OsProc::StartJob.
8/8 09:03:57 Failed to start job, exiting

Expected results:
The job should execute under the correct permissions, and its files should be accessible after completion.

Additional info: