Red Hat Bugzilla – Bug 474842
EC2E job never completes
Last modified: 2009-02-04 11:04:54 EST
Description of problem:
Seemingly at random, an EC2E job fails to complete. The job is created and the AMI is started, but the job never appears to run.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
When the AMI starts up, caroniad first tries to access AWS using the information provided in the user_data. If it could not access AWS, either because the information was wrong or because AWS was having problems at that moment, caroniad would exit, so the job would never run and the AMI would never be shut down.
The caronia daemon now tries to access AWS 5 times, waiting between attempts, and shuts down the AMI if it still cannot access AWS. The hooks on the schedd will notice that the job has not been run and force condor to re-route the job.
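For illustration only, the retry-then-shutdown behaviour described above might look roughly like the Python sketch below. This is not the actual caroniad code. The names run_with_retries, check_access and shutdown are hypothetical, and the default wait of 5 seconds is an assumption, since the comment above does not give a unit for the delay.

```python
import time
from typing import Callable


def run_with_retries(check_access: Callable[[], bool],
                     shutdown: Callable[[], None],
                     attempts: int = 5,
                     wait_seconds: float = 5.0,  # assumed interval, not from the report
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True as soon as check_access() succeeds.

    If every attempt fails, call shutdown() and return False, so the
    instance is not left running with a job that will never start.
    """
    for attempt in range(1, attempts + 1):
        try:
            if check_access():
                return True
        except Exception:
            # Treat any error (bad credentials, service outage) as a failed attempt.
            pass
        if attempt < attempts:
            # Wait before the next attempt, but not after the last one.
            sleep(wait_seconds)
    shutdown()
    return False
```

Passing the access check and the shutdown action in as callables keeps the sketch independent of any particular AWS client library, and makes the give-up path easy to exercise without a real instance.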
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.