Bug 476021

Summary: Permissions issue when retrieving data from S3 in EC2E
Product: Red Hat Enterprise MRG
Reporter: Robert Rati <rrati>
Component: grid
Assignee: Robert Rati <rrati>
Status: CLOSED ERRATA
QA Contact: Jeff Needle <jneedle>
Severity: medium
Docs Contact:
Priority: low
Version: 1.0
CC: matt
Target Milestone: 1.1
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-02-04 16:06:27 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Robert Rati 2008-12-11 15:57:58 UTC
Description of problem:
An EC2E job completes, but the finalize hook cannot place the contents of the results tarball in the routed job's spool dir, because the spool dir is no longer owned by the job owner but by user condor.

Version-Release number of selected component (if applicable):
condor-7.2.0-0.11
ec2e-hooks-1.0-6

How reproducible:
Run an EC2E job and have the AMI exit before condor is able to read the results from SQS.

Steps to Reproduce:
  
Actual results:
SchedLog:
12/11 10:18:31 (pid:13672) Job 3273.0 is finished
...
12/11 10:18:31 (pid:13672) Job cleanup for 3273.0 will not block, calling jobIsFinished() directly
12/11 10:18:31 (pid:13672) jobIsFinished() completed, calling DestroyProc(3273.0)

JobRouterLog:
12/11 10:19:00 JobRouter (src=3266.7,dest=3273.0,route=Amazon Small): updated job status
12/11 10:19:00 JobRouter (src=3266.7,dest=3273.0,route=Amazon Small): found target job finished

Expected results:


Additional info:
The routed job is completing BEFORE the source job. That means the EC2 job is completing and shutting down before the status message makes its way back.
When the EC2 job finishes, condor chowns its spool back to condor.condor.
The status indicating that the routed job has completed has not yet been read, and the JR just fires off the cleanup hook, as it should, when the routed job completes.
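
To illustrate the failure mode described above, here is a minimal Python sketch of the ownership condition that the finalize hook runs into. It is not taken from the ec2e hooks; the function name `spool_owned_by` and its arguments are hypothetical, and it only assumes that the hook runs as the job owner and therefore needs the spool dir to still belong to that uid.

```python
import os
import tempfile


def spool_owned_by(spool_dir, job_owner_uid):
    """Return True if spool_dir is still owned by the job owner's uid.

    Hypothetical helper: once condor has chowned the spool back to
    condor.condor, this returns False for the job owner.
    """
    return os.stat(spool_dir).st_uid == job_owner_uid


if __name__ == "__main__":
    # A temporary directory stands in for the routed job's spool dir.
    with tempfile.TemporaryDirectory() as spool:
        print(spool_owned_by(spool, os.getuid()))
```

A check like this only detects the condition; it does not remove the race between the routed job finishing and the status message being read.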

Comment 1 Robert Rati 2008-12-17 15:14:24 UTC
The finalize hook no longer attempts to write to the routed job's spool directory, and instead writes to the source job's IWD.

Fixed in:
condor-ec2-enhanced-hooks-1.0-8
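
The change described in this comment can be sketched roughly as follows. This is a hedged illustration in Python, not the code shipped in condor-ec2-enhanced-hooks: the function name `unpack_results` and both parameters are hypothetical, and the source job's IWD is simply passed in as a path rather than read from the job ad.

```python
import os
import tarfile


def unpack_results(tarball_path, source_iwd):
    """Extract a results tarball into the source job's IWD.

    Hypothetical sketch: the routed job's spool dir is never touched,
    so its ownership no longer matters.
    """
    root = os.path.realpath(source_iwd)
    with tarfile.open(tarball_path) as tar:
        # Refuse members that would land outside the target directory.
        for member in tar.getmembers():
            target = os.path.realpath(os.path.join(root, member.name))
            if target != root and not target.startswith(root + os.sep):
                raise ValueError("unsafe path in tarball: " + member.name)
        tar.extractall(root)
```

Writing into the IWD works because that directory stays owned by the job owner for the lifetime of the source job, whereas the routed job's spool may already have been handed back to condor.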

Comment 4 errata-xmlrpc 2009-02-04 16:06:27 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0036.html