Bug 458889 - Job Hooks leave starter in wrong privstate
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: grid
Hardware: All Linux
Priority: medium  Severity: medium
Target Milestone: 1.1
Target Release: ---
Assigned To: Robert Rati
QA Contact: Kim van der Riet
Reported: 2008-08-12 18:16 EDT by Robert Rati
Modified: 2009-02-04 11:06 EST (History)

Doc Type: Bug Fix
Last Closed: 2009-02-04 11:06:06 EST
External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2009:0036 | normal | SHIPPED_LIVE | Red Hat Enterprise MRG Grid 1.1 Release | 2009-02-04 11:03:49 EST

Description Robert Rati 2008-08-12 18:16:56 EDT
Description of problem:
It looks like after the job hooks run, the starter ends up in a final
condor uid state and cannot correctly access the files created by a job.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
Actual results:
>>> 8/8 09:03:57 in VanillaProc::StartJob()
>>> 8/8 09:03:57 in OsProc::StartJob()
>>> 8/8 09:03:57 IWD: /autohome/u100/rherban/condortest
>>> 8/8 09:03:57 passwd_cache: setgroups( rherban ) failed.
>>> 8/8 09:03:57 set_user_egid - ERROR: initgroups(rherban, 6751) failed,
>>> errno: Operation not permitted
>>> 8/8 09:03:57 Input file: /autohome/u100/rherban/condortest/in.blast
>>> 8/8 09:03:57 Failed to open
>>> '/autohome/u100/rherban/condortest/out.blast' as standard output:
>>> Permission denied (errno 13)
>>> 8/8 09:03:57 Failed to open
>>> '/autohome/u100/rherban/condortest/err.blast' as standard error:
>>> Permission denied (errno 13)
>>> 8/8 09:03:57 Failed to open some/all of the std files…
>>> 8/8 09:03:57 Aborting OsProc::StartJob.
>>> 8/8 09:03:57 Failed to start job, exiting

Expected results:
The job should execute under the correct permissions, and its files should be accessible after completion.

Additional info:
Comment 1 Robert Rati 2008-08-12 18:27:49 EDT
The problem is that before condor forks a process to run the hooks, it checks the executable/command under the specified priv mode. For the condor_final priv, it was changing the real uid (ruid) to condor instead of changing the effective uid (euid) for the checks, so the starter ended up permanently as the condor user and thus could not access files that are not world-readable.
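
The difference between a temporary euid switch and a permanent uid change can be illustrated with a minimal sketch. This is not condor's code: temporary_euid and check_executable_as are hypothetical names, and the sketch only assumes a POSIX system where os.seteuid is available.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_euid(uid):
    """Switch only the effective uid, then restore it on exit."""
    saved_euid = os.geteuid()
    os.seteuid(uid)  # real and saved uids are left untouched
    try:
        yield
    finally:
        # Switching back works because the real/saved uid still permits it.
        os.seteuid(saved_euid)


def check_executable_as(uid, path):
    """Hypothetical pre-fork check: is `path` executable for `uid`?"""
    with temporary_euid(uid):
        if os.access in os.supports_effective_ids:
            # Evaluate permissions against the effective uid we just set.
            return os.access(path, os.X_OK, effective_ids=True)
        # Fallback on platforms without effective_ids (checks the real uid).
        return os.access(path, os.X_OK)
```

Calling os.setuid(uid) as root instead of os.seteuid(uid) would also replace the real and saved uids, so the process could not switch back afterwards. That is consistent with the starter being stuck as the condor user.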

A job like the one below should produce errors in StarterLog about being unable to access stdout and stderr:

Cmd = "/bin/date"
Out = "/home/rsquared/date.output"
Err = "/home/rsquared/date.err"
Iwd = "/home/rsquared"
Owner = "rsquared"
Comment 2 Matthew Farrellee 2008-08-12 18:34:38 EDT
Bug supposedly present in condor 7.0.4-4
Comment 5 errata-xmlrpc 2009-02-04 11:06:06 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.