| Summary: | Instance fails to launch with message "The requested Availability Zone is no longer supported..." | | |
|---|---|---|---|
| Product: | [Retired] CloudForms Cloud Engine | Reporter: | Shveta <ssachdev> |
| Component: | aeolus-conductor | Assignee: | Jan Provaznik <jprovazn> |
| Status: | CLOSED ERRATA | QA Contact: | wes hayutin <whayutin> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 1.0.0 | CC: | akarol, cpelland, dajohnso, deltacloud-maint, dgao, hbrock, jclift, mandreou, mtaylor, ssachdev |
| Target Milestone: | beta | Keywords: | Reopened |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-05-15 21:41:49 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |

Description
Shveta
2011-06-30 15:36:35 UTC
Have ec2-east and ec2-west configured, each with a realm.
Launching deployments in east works.
Launching the same deployment in west fails with:
>> Maximum connections set to 1024
>> Listening on localhost:3004, CTRL+C to stop
I, [2011-07-06T09:38:19.260506 #6769] INFO -- : New Aws::Ec2 using per_thread-connection mode
I, [2011-07-06T09:38:19.261234 #6769] INFO -- : Opening new HTTPS connection to ec2.us-west-1.amazonaws.com:443
I, [2011-07-06T09:38:19.856986 #6769] INFO -- : New Aws::S3Interface using per_thread-connection mode
I, [2011-07-06T09:38:19.857526 #6769] INFO -- : Opening new HTTPS connection to s3-us-west-1.amazonaws.com:443
W, [2011-07-06T09:38:20.503995 #6769] WARN -- : ##### Aws::S3Interface returned an error: 405 Method Not Allowed
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>MethodNotAllowed</Code><Message>The specified method is not allowed against this resource.</Message><ResourceType>SERVICE</ResourceType><Method>PUT</Method><RequestId>954CED60B796A1CA</RequestId><HostId>qmx4oYFyZqZA1czrG4382ePtSUmZhednMzRgGkvvuil250jWX+CxXcRTfNDeUiU4</HostId></Error> #####
W, [2011-07-06T09:38:20.504080 #6769] WARN -- : ##### Aws::S3Interface request: s3-us-west-1.amazonaws.com:443/ ####
!! Unexpected error while processing request: undefined method `details' for #<Deltacloud::ExceptionHandler::ProviderError:0x7fc5e44d9a70>
undefined method `details' for #<Deltacloud::ExceptionHandler::ProviderError:0x7fc5e44d9a70>
././views/errors/502.xml.haml:5:in `__tilt_d98a69ed6cee34ea553d44c4882dd338'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/tilt.rb:195:in `send'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/tilt.rb:195:in `evaluate'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/tilt.rb:449:in `evaluate'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/tilt.rb:131:in `render'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:343:in `render_without_format'
././lib/sinatra/respond_to.rb:129:in `render'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:311:in `haml'
././lib/deltacloud/helpers/application_helper.rb:112:in `report_error'
././lib/sinatra/respond_to.rb:242:in `call'
././lib/sinatra/respond_to.rb:242:in `respond_to'
././lib/deltacloud/helpers/application_helper.rb:111:in `report_error'
././server.rb:61
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:641:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:641:in `error_block!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:636:in `each'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:636:in `error_block!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:630:in `handle_exception!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:605:in `dispatch!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:411:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:411:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:399:in `call'
././lib/sinatra/rack_driver_select.rb:45:in `call'
././lib/sinatra/rack_matrix_params.rb:85:in `call'
././lib/sinatra/rack_runtime.rb:36:in `call'
././lib/sinatra/rack_etag.rb:42:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-accept-0.4.3/lib/rack/accept/context.rb:22:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/methodoverride.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/commonlogger.rb:18:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:1005:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:76:i
127.0.0.1 - - [06/Jul/2011 09:38:20] "GET /api HTTP/1.1" 200 1439 0.0110
127.0.0.1 - - [06/Jul/2011 09:38:22] "POST /api/keys HTTP/1.1" 201 2156 2.2933
127.0.0.1 - - [06/Jul/2011 09:38:22] "GET /api HTTP/1.1" 200 1439 0.0111
127.0.0.1 - - [06/Jul/2011 09:38:23] "GET /api/realms HTTP/1.1" 200 494 0.5363
ugh.. think I confused the issue here... :( disregard my comments in 1 & 2

Make only a provider account for west, with realms west-1-a, west-1-b, and west-1-c. Image launching fails for zone west-1-a and succeeds for b and c.

Unfortunately, the EC2 error code descriptions shed little light on why this error is thrown. Check out Error Codes in the Dev guide: http://docs.amazonwebservices.com/AWSEC2/latest/APIReference/

However, I've found a thread on the dev forum that seems to explain the error: https://forums.aws.amazon.com/thread.jspa?threadID=67778. From reading the thread, it seems that instance starts may sometimes be rejected, given some state in EC2 (probably being close to capacity in the selected zone). It also seems that the list retrieved via ec2-availability-zones will return only the zones that a particular user CAN deploy into at that point in time.

I'm going to add a dc call to list realms in the matching code to make sure the selected realm is available, and if not, return an appropriate response. There will still be a small possibility that this error will be thrown, given the time delay between doing the matching check and when the actual API request is sent through Condor.

We are currently treating realms as a static list in conductor. Given this error, we may need to change the model to support a dynamic list, although I imagine this will be a considerable amount of work.

nice bug!!

Admin -> realms -> create realm (realmTest)
realmTest -> add mapping to realm -> there you will find all the zones :)

Sorry.. just documenting how to get to the zones

After rereading that thread post, I realized that the word "might" was used in the sentence: "We also might remove the constrained zone entirely from the list of options for new customers". Before I make any changes, I need to know for sure whether the describe availability zones request actually omits unavailable zones.
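
The pre-launch check proposed earlier in this thread (list the realms in the matching code, confirm the selected realm is available, otherwise return an appropriate response) could look roughly like the sketch below. It is written in Python purely for illustration and is not Conductor code; `check_realm_before_launch`, `list_available_realms`, and `fake_listing` are hypothetical names, and the stand-in listing simply mirrors the behaviour reported above (west-1-a fails, b and c succeed).

```python
from typing import Callable, Iterable, Optional


def check_realm_before_launch(
    selected_realm: str,
    list_available_realms: Callable[[], Iterable[str]],
) -> Optional[str]:
    """Return None if the realm is usable, otherwise an error message.

    list_available_realms is a hypothetical callable standing in for the
    "dc call to list realms"; it is not a real Deltacloud client method.
    """
    available = set(list_available_realms())
    if selected_realm in available:
        return None
    listing = ", ".join(sorted(available)) or "none"
    return (
        f"Realm '{selected_realm}' is not currently available; "
        f"available realms: {listing}"
    )


def fake_listing():
    # Assumed data: mirrors the report that only zones b and c accept launches.
    return ["west-1-b", "west-1-c"]


print(check_realm_before_launch("west-1-a", fake_listing))
print(check_realm_before_launch("west-1-b", fake_listing))
```

As noted above, a check like this only narrows the window: the zone can still become unavailable between the check and the actual launch request.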
I've replied to the thread with this question and am going to put this BZ on hold until I get a response.

So it seems that the old thread wasn't getting much attention, so I've opened a new one here: https://forums.aws.amazon.com/thread.jspa?threadID=71600

Two other potential solutions to this problem:

Firstly, we could return an appropriate message back to the user; this would allow them to adjust their realms appropriately in order to avoid this error again. The issue with this solution is that Deltacloud Core does not return specific error codes that wrap the EC2 errors. A Deltacloud exception is thrown and the EC2 error code is added to the error body. In order to handle this error we would have to add a hack for catching this EC2 error. I've been informed by core people that adding specific error codes to dc core is on the backlog.

Secondly, we could get Condor to retry the instance start on another potential zone (if there is one available). Again, though, we would need to recognise that the DC exception was caused by the chosen realm, and would have to add a hack for handling the EC2-specific error. I think this solution is the most appropriate, but I would imagine it would amount to a considerable amount of work.

making sure all the bugs are at the right version for future queries

As a note, we have a bug filed with Deltacloud asking for the Deltacloud library to pass better error information: https://issues.apache.org/jira/browse/DTACLOUD-62 Please vote for it, so it gets attention from the Deltacloud devs.

Since this is a Conductor -> Deltacloud issue, I am going to assign this over to Jan in the Conductor Team.

adding to sprint tracker

related to this bug: https://bugzilla.redhat.com/show_bug.cgi?id=717987 (point 6 discussed in comments)

There is a wrong link in previous comment.
Correct link to the related bug is: https://bugzilla.redhat.com/show_bug.cgi?id=772644

*** This bug has been marked as a duplicate of bug 772644 ***

Reopening this, 2 things are needed:
1) re-scanning (periodically or before instance launch) - this is needed both for this BZ and for 772644
2) adding availability status to Realm model - this status should be returned by dc api. Unavailable realms will be skipped when launching an instance - this task is not done and is not part of any other task.

EC2 ===========================> Deltacloud:
Regions =======================> Providers
Availability Zones ============> Realms

The deltacloud server can return information for both, by querying EC2:

******************************************************
1. Get a list of Providers/EC2 Regions:

curl --user "KEY:PASS" "http://localhost:3001/api/drivers/ec2?format=xml"

<driver href='http://localhost:3001/api/drivers/ec2' id='ec2'>
  <name>EC2</name>
  <provider id='eu-west-1'>
    <entrypoint kind='eu-west-1'>localhost:3001/api;provider=eu-west-1</entrypoint>
  </provider>
  <provider id='sa-east-1'>
    <entrypoint kind='sa-east-1'>localhost:3001/api;provider=sa-east-1</entrypoint>
  </provider>
  <provider id='us-east-1'>
    <entrypoint kind='us-east-1'>localhost:3001/api;provider=us-east-1</entrypoint>
  </provider>
  <provider id='ap-northeast-1'>
    <entrypoint kind='ap-northeast-1'>localhost:3001/api;provider=ap-northeast-1</entrypoint>
  </provider>
  <provider id='us-west-2'>
    <entrypoint kind='us-west-2'>localhost:3001/api;provider=us-west-2</entrypoint>
  </provider>
  <provider id='us-west-1'>
    <entrypoint kind='us-west-1'>localhost:3001/api;provider=us-west-1</entrypoint>
  </provider>
  <provider id='ap-southeast-1'>
    <entrypoint kind='ap-southeast-1'>localhost:3001/api;provider=ap-southeast-1</entrypoint>
  </provider>
</driver>

******************************************************
2. Query the list of (currently available) Realms/Availability zones for a given provider/region:

curl --user "KEY:PASS" "http://localhost:3001/api;provider=us-west-1/realms?format=xml"

<?xml version='1.0' encoding='utf-8' ?>
<realms>
  <realm href='http://localhost:3001/api;provider=us-west-1/realms/us-west-1b' id='us-west-1b'>
    <name>us-west-1b</name>
    <state>available</state>
  </realm>
  <realm href='http://localhost:3001/api;provider=us-west-1/realms/us-west-1c' id='us-west-1c'>
    <name>us-west-1c</name>
    <state>available</state>
  </realm>
</realms>

As you can see from the output - EC2 tells us that for the us-west-1 region, only us-west-1b and us-west-1c availability zones are currently available for launching stuff.

Does this help? Can we do something else on Deltacloud side to help resolve this?

marios pushed in 3 commits:
fd15193d5cf6970601c98fd041306234f9842b6a
27d4b1d80ab7b9bea8028849739ec0e738458c8b
c64db92b51c3daf19225ce0cc28b5117216f6960

Cloud Resource Cluster Name
us-west-1a
us-west-1b
us-west-1c

<?xml version='1.0' encoding='utf-8' ?>
<realms>
  <realm href='http://localhost:3001/api;provider=us-west-1/realms/us-west-1a' id='us-west-1a'>
    <name>us-west-1a</name>
    <state>available</state>
  </realm>
  <realm href='http://localhost:3001/api;provider=us-west-1/realms/us-west-1b' id='us-west-1b'>
    <name>us-west-1b</name>
    <state>available</state>
  </realm>
  <realm href='http://localhost:3001/api;provider=us-west-1/realms/us-west-1c' id='us-west-1c'>
    <name>us-west-1c</name>
    <state>available</state>
  </realm>
</realms>

Cloud Resource Cluster Name
us-east-1a
us-east-1b
us-east-1c
us-east-1d
us-east-1e

<?xml version='1.0' encoding='utf-8' ?>
<realms>
  <realm href='http://localhost:3001/api;provider=us-east-1/realms/us-east-1a' id='us-east-1a'>
    <name>us-east-1a</name>
    <state>available</state>
  </realm>
  <realm href='http://localhost:3001/api;provider=us-east-1/realms/us-east-1b' id='us-east-1b'>
    <name>us-east-1b</name>
    <state>available</state>
  </realm>
  <realm href='http://localhost:3001/api;provider=us-east-1/realms/us-east-1c' id='us-east-1c'>
    <name>us-east-1c</name>
    <state>available</state>
  </realm>
  <realm href='http://localhost:3001/api;provider=us-east-1/realms/us-east-1d' id='us-east-1d'>
    <name>us-east-1d</name>
    <state>available</state>
  </realm>
  <realm href='http://localhost:3001/api;provider=us-east-1/realms/us-east-1e' id='us-east-1e'>
    <name>us-east-1e</name>
    <state>available</state>
  </realm>
</realms>

back end realms in conductor match what dcloud is pulling from ec2

[root@qeblade30 ~]# rpm -qa | grep deltacloud
deltacloud-core-rhevm-0.5.0-5.el6.noarch
deltacloud-core-ec2-0.5.0-5.el6.noarch
rubygem-deltacloud-client-0.5.0-2.el6.noarch
deltacloud-core-0.5.0-5.el6.noarch
deltacloud-core-vsphere-0.5.0-5.el6.noarch

[root@qeblade30 ~]# rpm -qa | grep aeolus
aeolus-conductor-doc-0.8.0-24.el6.noarch
rubygem-aeolus-cli-0.3.0-8.el6.noarch
aeolus-all-0.8.0-24.el6.noarch
rubygem-aeolus-image-0.3.0-7.el6.noarch
aeolus-configure-2.5.0-12.el6.noarch
aeolus-conductor-0.8.0-24.el6.noarch
aeolus-conductor-daemons-0.8.0-24.el6.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0583.html
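
For anyone scripting against the realm listing shown in the comments above, here is a minimal sketch of reading the `<state>` element and keeping only realms reported as available, which is the "skip unavailable realms" behaviour requested when this bug was reopened. It is Python for illustration only, not the Conductor fix. The first two realms are copied from the us-west-1 output in this thread; the third entry and its `unavailable` state string are assumptions added to exercise the filter, and `available_realm_ids` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from typing import List

REALMS_XML = """<?xml version='1.0' encoding='utf-8' ?>
<realms>
  <realm href='http://localhost:3001/api;provider=us-west-1/realms/us-west-1b' id='us-west-1b'>
    <name>us-west-1b</name>
    <state>available</state>
  </realm>
  <realm href='http://localhost:3001/api;provider=us-west-1/realms/us-west-1c' id='us-west-1c'>
    <name>us-west-1c</name>
    <state>available</state>
  </realm>
  <!-- Hypothetical entry, not from the thread; the state string is assumed. -->
  <realm href='http://localhost:3001/api;provider=us-west-1/realms/us-west-1a' id='us-west-1a'>
    <name>us-west-1a</name>
    <state>unavailable</state>
  </realm>
</realms>
"""


def available_realm_ids(xml_text: str) -> List[str]:
    """Return the ids of realms whose <state> is exactly 'available'."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    ids = []
    for realm in root.findall("realm"):
        state = (realm.findtext("state") or "").strip()
        if state == "available":
            ids.append(realm.get("id"))
    return ids


print(available_realm_ids(REALMS_XML))  # ['us-west-1b', 'us-west-1c']
```

In practice the XML would come from the `/realms?format=xml` request shown earlier rather than from a literal string.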