Bug 451798
Summary: | ec2 status for terminated and removed job not handled properly | ||
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Matthew Farrellee <matt> |
Component: | condor | Assignee: | grid-maint-list <grid-maint-list> |
Status: | CLOSED DEFERRED | QA Contact: | MRG Quality Engineering <mrgqe-bugs> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 1.0 | CC: | jfrey, matt |
Target Milestone: | 2.0 | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2011-01-07 17:58:27 UTC | Type: | --- |
Description
Matthew Farrellee
2008-06-17 13:49:52 UTC
This behavior is intentional. The amazon_gahp will turn InvalidInstanceID.NotFound errors into successful operations for AMAZON_VM_STATUS and AMAZON_VM_STOP commands. This was done to simplify the error-handling in the gridmanager. If we don't want the gahp to eat these errors, we'll have to modify the gridmanager to recognize them and react appropriately.

I think my description may have been poorly worded. The issue was that instances that are no longer reported by EC2 as even existing were not being handled properly. This is the case of EC2 removing all knowledge of an instance before AMAZON_VM_STATUS can be sent to query the instance. The semantics should be that in such a case the instance is considered terminated, but that wasn't happening.

Testing this means withholding AMAZON_VM_STATUS commands until the instance has moved into the terminated state and been flushed from the output of ec2-describe-instances.

FYI: amazon-gahp was renamed to amazon_gahp in version 7.2.0-0.8 (7.2.0 pre-release).

If this appears again or more frequently it can be re-opened.
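
For illustration only, the intended semantics could be sketched as below in Python. This is not the gridmanager or amazon_gahp code. The `StatusReply` type, the `effective_state` function, and the state strings are all hypothetical names made up for this sketch. Only the `InvalidInstanceID.NotFound` error code comes from the discussion above. The idea is that a status query failing because EC2 no longer knows the instance is mapped to "terminated", while any other error still surfaces as a failure.

```python
from dataclasses import dataclass
from typing import Optional

# Error code named in this bug; EC2 returns it once it has forgotten an instance.
NOT_FOUND = "InvalidInstanceID.NotFound"


@dataclass
class StatusReply:
    """Hypothetical result of a status query (not the real GAHP wire format)."""
    error_code: Optional[str] = None  # EC2 error code if the query failed
    state: Optional[str] = None       # e.g. "pending", "running", "terminated"


def effective_state(reply: StatusReply) -> str:
    """Return the instance state the job should be treated as being in."""
    if reply.error_code is None:
        if reply.state is None:
            raise ValueError("successful reply carries no state")
        return reply.state
    if reply.error_code == NOT_FOUND:
        # EC2 has removed all knowledge of the instance: consider it terminated.
        return "terminated"
    # Any other error is a real failure and is not swallowed.
    raise RuntimeError(f"status query failed: {reply.error_code}")
```

With this mapping, a job whose instance was flushed from EC2 before the first status query would move to a terminated state instead of being left in limbo.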