When dbomatic checks instance status for an account (method check_one_account), it iterates through all of the account's instances and issues a dc-api instance GET request for each one (api_instance = connection.instance(instance.external_key)). This request can raise an exception for some instances. This is the core log when the exception is raised:

I, [2012-02-16T04:17:20.858803 #10873] INFO -- : New Aws::Ec2 using per_thread-connection mode
D, [2012-02-16T04:17:20.869549 #10873] DEBUG -- : Rightscale::HttpConnection : server ec2.us-east-1.amazonaws.com closed connection
I, [2012-02-16T04:17:21.370164 #10873] INFO -- : Opening new HTTPS connection to ec2.us-east-1.amazonaws.com:443
thin server (localhost:3002) [deltacloud-mock][10873]: NoMethodError:undefined method `[]' for nil:NilClass
/usr/share/deltacloud-core/lib/deltacloud/drivers/ec2/ec2_driver.rb:797:in `convert_instance'
/usr/share/deltacloud-core/lib/deltacloud/drivers/ec2/ec2_driver.rb:186:in `instance'
/usr/share/deltacloud-core/lib/deltacloud/base_driver/exceptions.rb:151:in `call'
/usr/share/deltacloud-core/lib/deltacloud/base_driver/exceptions.rb:151:in `safely'
/usr/share/deltacloud-core/lib/deltacloud/drivers/ec2/ec2_driver.rb:184:in `instance'
/usr/share/deltacloud-core/lib/deltacloud/helpers/application_helper.rb:93:in `send'
/usr/share/deltacloud-core/lib/deltacloud/helpers/application_helper.rb:93:in `show'
/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
/usr/share/deltacloud-core/lib/deltacloud/helpers/application_helper.rb:92:in `show'
/usr/share/deltacloud-core/lib/deltacloud/server.rb:470
/usr/share/deltacloud-core/lib/sinatra/rabbit.rb:125:in `instance_eval'

If an exception occurs, the rest of the instances are not checked, so we should handle the exception and continue checking the remaining instances.
michal: can you check pls if the exception above is common for ec2?
Jan, what version of DC are you running? This exception is definitely DC-related.
commit f147e9f360ae276544081d2d1419188e64423d89
Author: Michal Fojtik <mfojtik>
Date: Thu Feb 16 13:31:51 2012 +0100

    EC2: Raise an 404 exception when trying to access non-existing instance

diff --git a/server/lib/deltacloud/drivers/ec2/ec2_driver.rb b/server/lib/deltacloud/drivers/ec2/ec2_driver.rb
index 4569056..74e110f 100644
--- a/server/lib/deltacloud/drivers/ec2/ec2_driver.rb
+++ b/server/lib/deltacloud/drivers/ec2/ec2_driver.rb
@@ -184,6 +184,7 @@ module Deltacloud
         inst_arr = []
         safely do
           ec2_inst = ec2.describe_instances([opts[:id]]).first
+          raise "Instance #{opts[:id]} NotFound" if ec2_inst.nil?
           instance = convert_instance(ec2_inst)
           return nil unless instance
           if ec2_inst[:aws_platform] == 'windows'
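To illustrate what the one-line guard in the diff changes, here is a minimal standalone sketch. It does not use the real driver: convert_instance below is a simplified stand-in, the :aws_instance_id key is assumed for illustration, and lookup_without_guard / lookup_with_guard are hypothetical names. The first_record argument plays the role of ec2.describe_instances([opts[:id]]).first.

```ruby
# Simplified stand-in for the driver's convert_instance (assumption: like the
# real method, it indexes into the record, so a nil record cannot be converted).
def convert_instance(ec2_inst)
  { :id => ec2_inst[:aws_instance_id] }
end

# Behaviour before the patch: a nil record reaches convert_instance and fails
# with "undefined method `[]' for nil", as in the original backtrace.
def lookup_without_guard(first_record)
  convert_instance(first_record)
end

# Behaviour with the guard from the diff: a nil record is reported as a
# "NotFound" error before any conversion is attempted.
def lookup_with_guard(id, first_record)
  raise "Instance #{id} NotFound" if first_record.nil?
  convert_instance(first_record)
end
```

With the guard, the failure message names the missing instance instead of surfacing as a NoMethodError from deep inside the conversion code. Either way the caller still sees an exception.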
Even with Michal's fix on dc-core side, an exception is raised -> we need to handle this exception in dbomatic.
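A minimal sketch of the kind of handling proposed here, assuming the check loop can rescue per instance. This is not the actual dbomatic patch: FakeBackendError, FakeConnection, FakeInstance and check_instances are hypothetical stand-ins, and only connection.instance(instance.external_key) mirrors the call quoted in the report.

```ruby
# Hypothetical stand-in for the error the client raises for a missing instance.
class FakeBackendError < StandardError; end

# Hypothetical stand-in for a conductor instance record.
FakeInstance = Struct.new(:external_key, :state)

# Hypothetical stand-in for the dc-api connection: returns the remote state
# for known keys and raises for keys the provider no longer knows about.
class FakeConnection
  def initialize(remote_states)
    @remote_states = remote_states
  end

  def instance(key)
    state = @remote_states[key]
    raise FakeBackendError, "InvalidInstanceID.NotFound: #{key}" if state.nil?
    state
  end
end

# Check every instance; a failure on one instance is recorded and the loop
# moves on, so the remaining instances are still updated.
def check_instances(connection, instances, log = [])
  instances.each do |instance|
    begin
      instance.state = connection.instance(instance.external_key)
    rescue StandardError => e
      log << "failed to check #{instance.external_key}: #{e.message}"
    end
  end
  log
end
```

The rescue sits inside the loop body rather than around the whole loop, which is what keeps one missing instance from blocking status updates for the others. What state to assign to the failed instance is left open in this sketch.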
Moving this back to ON_DEV from MODIFIED since it seems a dbomatic patch is required to proceed with building.
assigning to angus to do reassignment - patch is needed in dbomatic (conductor side) too. I would suggest to assign higher priority - this breaks instance status checking if a user deletes instances on provider side (ec2).
How do you reproduce this issue? E.g., under what circumstances is the exception raised for the instance? Under a fresh Aeolus install with the components from the F16 repo, I get the following in the deltacloud log if I manually query for an instance which does not exist:

Deltacloud::ExceptionHandler::ObjectNotFound - InvalidInstanceID.NotFound: The instance ID 'i-01234567' does not exist
REQUEST=ec2.us-east-1.amazonaws.com:443/<snip>
/usr/lib/ruby/gems/1.8/gems/aws-2.5.5/lib/ses/../awsbase/awsbase.rb:572:in `request_info_impl'
/usr/lib/ruby/gems/1.8/gems/aws-2.5.5/lib/ec2/ec2.rb:177:in `request_info'
/usr/lib/ruby/gems/1.8/gems/aws-2.5.5/lib/ses/../awsbase/awsbase.rb:586:in `request_cache_or_info'
/usr/lib/ruby/gems/1.8/gems/aws-2.5.5/lib/ec2/ec2.rb:432:in `describe_instances'
/usr/share/deltacloud-core/lib/deltacloud/drivers/ec2/ec2_driver.rb:185:in `instance'
/usr/share/deltacloud-core/lib/deltacloud/base_driver/exceptions.rb:151:in `call'
/usr/share/deltacloud-core/lib/deltacloud/base_driver/exceptions.rb:151:in `safely'
/usr/share/deltacloud-core/lib/deltacloud/drivers/ec2/ec2_driver.rb:184:in `instance'
/usr/share/deltacloud-core/lib/deltacloud/helpers/application_helper.rb:93:in `send'
/usr/share/deltacloud-core/lib/deltacloud/helpers/application_helper.rb:93:in `show'
/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
<snip>

Note that the aws client itself is throwing an exception even before we get to the one added in the deltacloud patch. How do I trigger the aforementioned error in dbomatic / deltacloud?
Patch sent to list https://fedorahosted.org/pipermail/aeolus-devel/2012-March/009410.html
Patch pushed to repo
Marking needinfo for verification steps...
The problem is that dc core can raise an exception when requesting a non-existing instance. This happens only for some providers, e.g. ec2. Example:

irb(main):005:0> ec2.connect.instance('i-1234435')
DeltaCloud::API::BackendError: 502 : InvalidInstanceID.NotFound: The instance ID 'i-01234435' does not exist
REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJDNGKF2QAIF5HCSQ&Action=DescribeInstances&InstanceId.1=i-1234435&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-03-19T07%3A49%3A13.000Z&Version=2010-08-31&Signature=FLPHYh1GN20JRRENN338%2BGBumTUjLba1uKXVO20OPVQ%3D
REQUEST ID=ba03b46a-8dc8-4b59-9379-b16aba4e3489
from /usr/lib/ruby/gems/1.8/gems/deltacloud-client-0.5.0/lib/deltacloud.rb:432:in `handle_backend_error'
from /usr/lib/ruby/gems/1.8/gems/deltacloud-client-0.5.0/lib/deltacloud.rb:390:in `request'
from /usr/lib/ruby/gems/1.8/gems/rest-client-1.6.1/lib/restclient/request.rb:218:in `call'

So reproduction steps would be:
1. start 3 ec2 instances
2. when all 3 instances are running, stop dbomatic and terminate the 1st instance (the one created first in conductor) through the aws console
3. when the terminated instance completely disappears from ec2, start dbomatic again
4. w/o this patch, you should see an error message + backtrace in the dbomatic log similar to the error I pasted above
5. if you stop the 2nd and 3rd instances from conductor, their status should not be updated in conductor even if they are really stopped
assigning to Pushpesh
Tried the steps specified by Jan Provaznik; no errors seen in the dbomatic logs. The dbomatic service was stopped and started while instances were being stopped from both sides, i.e. the ec2 console and conductor.
verified on

[root@qe-blade-01 ~]# rpm -qa|grep aeolus
aeolus-conductor-doc-0.8.7-1.el6.noarch
aeolus-configure-2.5.2-1.el6.noarch
aeolus-conductor-0.8.7-1.el6.noarch
rubygem-aeolus-cli-0.3.1-1.el6.noarch
rubygem-aeolus-image-0.3.0-12.el6.noarch
aeolus-conductor-daemons-0.8.7-1.el6.noarch
aeolus-all-0.8.7-1.el6.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2012-0583.html