Bug 786437

Summary: unable to delete a deployment, shutdown instance due to ec2 key
Product: [Retired] CloudForms Cloud Engine
Reporter: wes hayutin <whayutin>
Component: deltacloud-core
Assignee: Michal Fojtik <mfojtik>
Status: CLOSED ERRATA
QA Contact: Ronelle Landy <rlandy>
Severity: high
Docs Contact:
Priority: unspecified
Version: 1.0.0
CC: hbrock, jrd, lutter, rananda, whayutin
Target Milestone: beta6
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 3e46de11d6f51e279ca2293565c8f2eef83a20ec
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-05-15 20:32:54 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description wes hayutin 2012-02-01 12:58:46 UTC
Description of problem:

Steps to recreate (most likely not repeatable):
1. start deployment
2. aeolus-conductor crashed
3. aeolus-restart-services
4. try to stop deployment 

deltacloud log

I, [2012-02-01T07:55:34.914948 #25641]  INFO -- : Opening new HTTPS connection to ec2.us-east-1.amazonaws.com:443
W, [2012-02-01T07:55:35.240665 #25641]  WARN -- : ##### Aws::Ec2 returned an error: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidKeyPair.NotFound</Code><Message>The key pair 'Deployment-ConfigSrv-g2mlq_fedoraEC2Instance_1328099217_key_69886688001040' does not exist</Message></Error></Errors><RequestID>d33b817e-5d43-425b-8ff3-bad7fa5628b8</RequestID></Response> #####
W, [2012-02-01T07:55:35.240766 #25641]  WARN -- : ##### Aws::Ec2 request: ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJ557U7P7OIHRV2EQ&Action=DescribeKeyPairs&KeyName.1=Deployment-ConfigSrv-g2mlq_fedoraEC2Instance_1328099217_key_69886688001040&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-02-01T12%3A55%3A34.000Z&Version=2010-08-31&Signature=GGfDVrX5anJLVaB61graof30%2FSMx%2BM%2BF1bmvlqZgWrM%3D ####
thin server (localhost:3002) [deltacloud-mock][25641]: Aws::AwsError:InvalidKeyPair.NotFound: The key pair 'Deployment-ConfigSrv-g2mlq_fedoraEC2Instance_1328099217_key_69886688001040' does not exist
REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJ557U7P7OIHRV2EQ&Action=DescribeKeyPairs&KeyName.1=Deployment-ConfigSrv-g2mlq_fedoraEC2Instance_1328099217_key_69886688001040&SignatureMethod=HmacSHA256&SignatureVersion=2&Timestamp=2012-02-01T12%3A55%3A34.000Z&Version=2010-08-31&Signature=GGfDVrX5anJLVaB61graof30%2FSMx%2BM%2BF1bmvlqZgWrM%3D 
REQUEST ID=d33b817e-5d43-425b-8ff3-bad7fa5628b8 
/usr/lib/ruby/gems/1.8/gems/aws-2.5.5/lib/ses/../awsbase/awsbase.rb:572:in `request_info_impl'
/usr/lib/ruby/gems/1.8/gems/aws-2.5.5/lib/ec2/ec2.rb:177:in `request_info'
/usr/lib/ruby/gems/1.8/gems/aws-2.5.5/lib/ses/../awsbase/awsbase.rb:586:in `request_cache_or_info'
/usr/lib/ruby/gems/1.8/gems/aws-2.5.5/lib/ec2/ec2.rb:998:in `describe_key_pairs'
/usr/share/deltacloud-core/bin/../lib/deltacloud/drivers/ec2/ec2_driver.rb:277:in `keys'
/usr/share/deltacloud-core/bin/../lib/deltacloud/base_driver/exceptions.rb:151:in `call'
/usr/share/deltacloud-core/bin/../lib/deltacloud/base_driver/exceptions.rb:151:in `safely'
/usr/share/deltacloud-core/bin/../lib/deltacloud/drivers/ec2/ec2_driver.rb:276:in `keys'
/usr/share/deltacloud-core/bin/../lib/deltacloud/base_driver/base_driver.rb:193:in `key'
/usr/share/deltacloud-core/bin/../lib/deltacloud/helpers/application_helper.rb:93:in `send'
/usr/share/deltacloud-core/bin/../lib/deltacloud/helpers/application_helper.rb:93:in `show'
/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
/usr/share/deltacloud-core/bin/../lib/deltacloud/helpers/application_helper.rb:92:in `show'
/usr/share/deltacloud-core/bin/../lib/deltacloud/server.rb:766
/usr/share/deltacloud-core/bin/../lib/sinatra/rabbit.rb:125:in `instance_eval'
/usr/share/deltacloud-core/bin/../lib/sinatra/rabbit.rb:125:in `GET /api/keys/:id'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `compile!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `route_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:708:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:758:in `process_route'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `process_route'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:707:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `each'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:843:in `dispatch!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:629:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_syslog.rb:48:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_date.rb:31:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_accept.rb:149:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.3.0/lib/rack/head.rb:9:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_driver_select.rb:45:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_matrix_params.rb:106:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_runtime.rb:36:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_etag.rb:41:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-accept-0.4.3/lib/rack/accept/context.rb:22:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.3.0/lib/rack/head.rb:9:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.3.0/lib/rack/methodoverride.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1303:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1272:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:84:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:82:in `catch'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:82:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `call'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `initialize'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `new'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1049:in `defer'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:54:in `process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/connection.rb:42:in `receive_data'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/backends/base.rb:61:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/server.rb:159:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/controllers/controller.rb:86:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/runner.rb:185:in `send'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/runner.rb:185:in `run_command'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.11/lib/thin/runner.rb:151:in `run!'
/usr/share/deltacloud-core/bin/deltacloudd:235
/usr/bin/deltacloudd:5:in `load'
/usr/bin/deltacloudd:5
thin server (localhost:3002) [deltacloud-mock][25641]: 127.0.0.1 - - [01/Feb/2012 07:55:35] "GET /api/keys/Deployment-ConfigSrv-g2mlq_fedoraEC2Instance_1328099217_key_69886688001040 HTTP/1.1" 502 778 0.3424

Comment 1 David Lutterkort 2012-02-01 20:16:45 UTC
Just to make sure I understand: the DC bug here is that DC doesn't return a 404. Whether the key should be there or not depends on what happened before aeolus crashed.

Comment 2 Hugh Brock 2012-02-02 14:37:27 UTC
Wes, answers to the above?

Comment 3 wes hayutin 2012-02-02 15:37:26 UTC
So, I *think* the key is actually in EC2, or was shortly before this happened. It may be a race condition.

Recreate as follows:
1. create an EC2 deployable
2. ssh in
3. destroy the deployable (application)
If by chance Ruby segfaults, deltacloud will return the key-not-found error as well.

Not sure whether the call to delete the key happens before the rest of the code runs, or what.

This bug may be a result of the segfault and thus a dupe, but I'm not 100% sure.

Comment 4 David Lutterkort 2012-02-02 20:39:47 UTC
It's not really related to the segfault. Besides the misleading status code from DC (502 instead of 404), there seems to be an issue with the transactionality of deleting deployables in Aeolus: if the Aeolus server fails (e.g., through an unexpected shutdown) at the wrong point in time, it leaves a partially finished task behind.

Depending on how the Aeolus code in question is written, switching to a 404 might be enough for Aeolus to work properly. I am assuming here that Aeolus' goal is ultimately to delete the key, and that it should be satisfied if the key it wants to delete has already been deleted.
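
The assumption in this comment, that a delete is satisfied once the key is gone, can be sketched roughly as follows. This is only an illustration, not Aeolus code: `ensure_key_deleted`, `FakeKeyClient`, and a `delete_key` method that returns the HTTP status as an integer are all hypothetical names introduced here.

```ruby
# Hypothetical sketch: none of these names come from Aeolus or Deltacloud.

# Returns true when the key is gone after the call, whether this call
# deleted it or it had already been deleted. Raises on any other status.
def ensure_key_deleted(client, key_id)
  status = client.delete_key(key_id) # assumed to return the HTTP status (Integer)
  case status
  when 200..299 then true            # deleted by this call
  when 404      then true            # already gone, which is the desired end state
  else
    raise "unexpected status #{status} while deleting key #{key_id}"
  end
end

# Stand-in for a real client, used only to exercise the sketch.
class FakeKeyClient
  def initialize(existing_keys)
    @keys = existing_keys.dup
  end

  # Answers 204 if the key existed and was removed, 404 otherwise.
  def delete_key(key_id)
    @keys.delete(key_id) ? 204 : 404
  end
end
```

With a client like this, repeating the delete after an interrupted run succeeds both times, whereas a 502 would still surface as an error.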

Comment 5 jrd 2012-02-03 19:12:40 UTC
I talked this over with morazi.  He asserts that this is sufficiently high priority that we should get a new rpm with a fix before 2/8 if at all possible.  DL and MF, please see about nailing this one down first thing Monday, eh?

Comment 6 Hugh Brock 2012-02-06 16:48:46 UTC
Setting dev-ack+ and blocker+ for beta

Comment 7 Michal Fojtik 2012-02-07 11:39:09 UTC
commit 3e46de11d6f51e279ca2293565c8f2eef83a20ec
Author: Michal Fojtik <mfojtik>
Date:   Wed Feb 1 14:38:38 2012 +0100

    Core: Return 404 instead of exception when accessing non-existing driver in drivers collection

commit f66b594398bf92b142b1a35dd0754ef48ff22762
Author: Michal Fojtik <mfojtik>
Date:   Wed Feb 1 14:37:45 2012 +0100

    EC2: We should return 404 instead of 502 or 500 in case when resource is not available
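
As a rough illustration of the behaviour the EC2 commit message describes (the patch itself is not shown in this report), a driver could map provider error codes to HTTP statuses along these lines. The function name and the rule of matching on a `.NotFound` suffix are assumptions made for this sketch; only the `InvalidKeyPair.NotFound` code comes from the log above.

```ruby
# Hypothetical mapping, not the code from the commits listed here.
# Picks the HTTP status an API front end could return for a provider error code.
def status_for_provider_error(error_code)
  if error_code.end_with?(".NotFound")
    404 # the requested resource does not exist at the provider
  else
    502 # the provider failed in some other way
  end
end
```

Under these assumptions, `status_for_provider_error("InvalidKeyPair.NotFound")` yields 404, so a client asking for a deleted key would see "not found" rather than a gateway error.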

===========================================================


Package: deltacloud-core-0.5.0-2.el6
Tag: ce-rhel-6-candidate
Status: complete
Built by: mfojtik
ID: 197630
Started: Tue, 07 Feb 2012 06:15:14 EST
Finished: Tue, 07 Feb 2012 06:18:17 EST
Changelog:
* Tue Feb 07 2012 Michal Fojtik <mfojtik> - 0.5.0-2
- Applied patches for BZ #786437

Comment 8 Ronelle Landy 2012-02-07 16:42:33 UTC
Verified that Deltacloud now returns 404 when a resource is not available.

Accessing available drivers:

10.11.9.71 - - [07/Feb/2012 10:32:46] "GET /api/drivers/ec2?format=xml HTTP/1.1" 200 2229 0.0186
10.11.9.71 - - [07/Feb/2012 10:32:56] "GET /api/drivers/rhevm?format=xml HTTP/1.1" 200 126 0.0147

Accessing a driver that does not exist:

10.11.9.71 - - [07/Feb/2012 10:33:42] "GET /api/drivers/google?format=xml HTTP/1.1" 404 - 0.0064

rpm -qa | grep deltacloud
deltacloud-core-vsphere-0.5.0-4.el6.noarch
deltacloud-core-rhevm-0.5.0-4.el6.noarch
deltacloud-core-ec2-0.5.0-4.el6.noarch
deltacloud-core-0.5.0-4.el6.noarch
rubygem-deltacloud-client-0.5.0-2.el6.noarch

rpm -qa | grep aeolus
aeolus-all-0.8.0-21.el6.noarch
aeolus-conductor-0.8.0-21.el6.noarch
rubygem-aeolus-cli-0.3.0-7.el6.noarch
aeolus-configure-2.5.0-11.el6.noarch
aeolus-conductor-daemons-0.8.0-21.el6.noarch
aeolus-conductor-doc-0.8.0-21.el6.noarch
rubygem-aeolus-image-0.3.0-7.el6.noarch

Comment 9 errata-xmlrpc 2012-05-15 20:32:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-0587.html