Created attachment 493890 [details]
dbomatic log

Description of problem:
Just documenting the issue; after three days of trying to find a clear recreate for why dbomatic appears to crash periodically, I am giving up.

To summarize, from what I can tell: after moderate use of Cloud Engine (bringing up 10-20 instances, shutting them down, etc.), dbomatic *sometimes* crashes without any clear traces in the dbomatic logs. The longer dbomatic runs, the more likely (IMHO) it is to crash. I was also able to get it to crash by bringing up 10 instances and running condor_rm / condor_release on the jobs as they started and stopped. I'll attach the only trace I was able to get out of dbomatic.
I think the recreate is:
1. confirm dbomatic is up
2. start instances until you have one with an error
3. confirm dbomatic is up
4. in the webui, under Instances, check the instance in error and select "remove failed"

At this point I think dbomatic will be down.
So I have been attempting to chase this down and here is what I am seeing: dbomatic dies when an unhandled EC2 exception occurs in deltacloud, and the dbomatic logs give no indication as to why. So a couple of things need to happen here, possibly split out into separate bugs: harden dbomatic to prevent it crashing when this occurs, and get that exception handled properly (a sketch of one possible retry approach follows the log excerpt below). Additionally, some improvements to the dbomatic logging would be a nice-to-have.

Seen with the RPM versions of the released AMI:

[root@ip-10-82-250-97 log]# rpm -qa | grep deltacloud
rubygem-deltacloud-client-0.1.0-2.el6_0.noarch
libdeltacloud-0.7-2.el6_0.x86_64
rubygem-deltacloud-core-0.2.0-5.el6.noarch
[root@ip-10-82-250-97 log]# rpm -qa | grep aeolus
aeolus-conductor-0.2.0-4.el6.x86_64
aeolus-configure-2.0.0-10.el6.noarch
aeolus-conductor-doc-0.2.0-4.el6.x86_64
aeolus-conductor-daemons-0.2.0-4.el6.x86_64

==> deltacloud-ec2-us-east-1/ec2.log <==
Opening new HTTPS connection to ec2.us-east-1.amazonaws.com:443
Opening new HTTPS connection to ec2.us-east-1.amazonaws.com:443
Opening new HTTPS connection to ec2.us-east-1.amazonaws.com:443
##### Aws::Ec2 returned an error: 503 Service Unavailable
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>RequestLimitExceeded</Code><Message>Request limit exceeded.</Message></Error></Errors><RequestID>5b1e3fbd-ecb8-4e9d-be74-55b43adeb57f</RequestID></Response>
#####
##### Aws::Ec2 request: ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=DeleteTags&ResourceId.1=i-74c9cb1b&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Timestamp=2011-05-06T15%3A06%3A35.000Z&Version=2010-08-31&Signature=DAuIeGBGABBLe5xYTity0lALDt7DlYnIBel9ZhM5ZEI%3D ####
Opening new HTTPS connection to ec2.us-east-1.amazonaws.com:443
##### Aws::Ec2 returned an error: 503 Service Unavailable
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>RequestLimitExceeded</Code><Message>Request limit exceeded.</Message></Error></Errors><RequestID>05af87bf-e247-433c-9806-df64e70d6073</RequestID></Response>
#####
======= UNHANDLED EXCEPTION ============
#<Deltacloud::BackendError: RequestLimitExceeded: Request limit exceeded.
REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=DeleteTags&ResourceId.1=i-74c9cb1b&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Timestamp=2011-05-06T15%3A06%3A35.000Z&Version=2010-08-31&Signature=DAuIeGBGABBLe5xYTity0lALDt7DlYnIBel9ZhM5ZEI%3D
REQUEST ID=5b1e3fbd-ecb8-4e9d-be74-55b43adeb57f >
========================================
##### Aws::Ec2 request: ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=DeleteTags&ResourceId.1=i-12dfdd7d&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Timestamp=2011-05-06T15%3A06%3A35.000Z&Version=2010-08-31&Signature=3I81LmRtftJPIzXNxYFEEguXlDblGsEmdW%2BWAvEvbrQ%3D ####
======= UNHANDLED EXCEPTION ============
#<Deltacloud::BackendError: RequestLimitExceeded: Request limit exceeded.
REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=DeleteTags&ResourceId.1=i-12dfdd7d&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Timestamp=2011-05-06T15%3A06%3A35.000Z&Version=2010-08-31&Signature=3I81LmRtftJPIzXNxYFEEguXlDblGsEmdW%2BWAvEvbrQ%3D
REQUEST ID=05af87bf-e247-433c-9806-df64e70d6073 >
========================================
Opening new HTTPS connection to ec2.us-east-1.amazonaws.com:443
Opening new HTTPS connection to ec2
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
Deltacloud::BackendError - RequestLimitExceeded: Request limit exceeded.
REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=DeleteTags&ResourceId.1=i-84cac8eb&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Timestamp=2011-05-06T15%3A06%3A36.000Z&Version=2010-08-31&Signature=K2aaWkVYRBK1hNAkcUjd7dIMAptVw4Frfw74i%2FhDWaU%3D
REQUEST ID=5cd99415-5a2e-4d81-a07f-804fb84e266d :
././lib/deltacloud/base_driver/base_driver.rb:241:in `safely'
././lib/deltacloud/drivers/ec2/ec2_driver.rb:539:in `delete_unused_tags'
././lib/deltacloud/drivers/ec2/ec2_driver.rb:173:in `instances'
././lib/deltacloud/base_driver/base_driver.rb:239:in `call'
././lib/deltacloud/base_driver/base_driver.rb:239:in `safely'

[root@ip-10-82-250-97 log]# cat deltacloud-ec2-us-east-1/ec2.log | tail -100
/usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/methodoverride.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/commonlogger.rb:18:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:1005:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:33:in `log'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:76:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:74:in `catch'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:74:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `call'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `initialize'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `new'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1049:in `defer'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:54:in `process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:42:in `receive_data'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/backends/base.rb:57:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/server.rb:156:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/controllers/controller.rb:80:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:177:in `send'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:177:in `run_command'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:143:in `run!'
/usr/lib/ruby/gems/1.8/gems/deltacloud-core-0.2.0/bin/deltacloudd:123
/usr/bin/deltacloudd:19:in `load'
/usr/bin/deltacloudd:19
127.0.0.1 - - [06/May/2011 11:06:36] "DELETE /api/instances/i-b0e8eadf HTTP/1.1" 500 5889 1.6293
Deltacloud::BackendError - RequestLimitExceeded: Request limit exceeded.
REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=DeleteTags&ResourceId.1=i-b4d1d3db&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Timestamp=2011-05-06T15%3A06%3A36.000Z&Version=2010-08-31&Signature=XseoXEB8bUdDbtccDUYswi4howqmqWZOSX%2FZS3mSrJg%3D
REQUEST ID=dce4bd65-26f9-4a00-a1bb-209ef9a87533 :
././lib/deltacloud/base_driver/base_driver.rb:241:in `safely'
././lib/deltacloud/drivers/ec2/ec2_driver.rb:539:in `delete_unused_tags'
././lib/deltacloud/drivers/ec2/ec2_driver.rb:173:in `instances'
././lib/deltacloud/base_driver/base_driver.rb:239:in `call'
././lib/deltacloud/base_driver/base_driver.rb:239:in `safely'
././lib/deltacloud/drivers/ec2/ec2_driver.rb:160:in `instances'
././lib/deltacloud/base_driver/base_driver.rb:178:in `instance'
././lib/deltacloud/helpers/application_helper.rb:95:in `send'
././lib/deltacloud/helpers/application_helper.rb:95:in `show'
/usr/lib/ruby/1.8/benchmark.rb:293:in `measure'
././lib/deltacloud/helpers/application_helper.rb:94:in `show'
././server.rb:314
././lib/sinatra/rabbit.rb:67:in `instance_eval'
././lib/sinatra/rabbit.rb:67:in `GET /api/instances/:id'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:863:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:863:in `route'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:521:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:521:in `route_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:500:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:497:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:497:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:476:in `each'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:476:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:601:in `dispatch!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:411:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:411:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:399:in `call'
././lib/sinatra/rack_runtime.rb:17:in `call'
././lib/sinatra/rack_etag.rb:23:in `call'
././lib/sinatra/rack_driver_select.rb:21:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-accept-0.4.3/lib/rack/accept/context.rb:22:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/methodoverride.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/commonlogger.rb:18:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:1005:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:33:in `log'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:76:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:74:in `catch'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:74:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `call'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `initialize'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `new'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1049:in `defer'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:54:in `process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:42:in `receive_data'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/backends/base.rb:57:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/server.rb:156:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/controllers/controller.rb:80:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:177:in `send'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:177:in `run_command'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:143:in `run!'
/usr/lib/ruby/gems/1.8/gems/deltacloud-core-0.2.0/bin/deltacloudd:123
/usr/bin/deltacloudd:19:in `load'
/usr/bin/deltacloudd:19
127.0.0.1 - - [06/May/2011 11:06:36] "GET /api/instances/i%2D14e8ea7b HTTP/1.1" 500 6060 3.1192
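On the "get that exception handled properly" point above: a transient RequestLimitExceeded is the kind of error that could be retried with backoff instead of bubbling up as an UNHANDLED EXCEPTION. A minimal sketch, not the actual deltacloud code: Deltacloud::BackendError and its message text come from the log above, while with_ec2_backoff and ec2.delete_tags are hypothetical names standing in for whatever raised.

def with_ec2_backoff(max_attempts = 5)
  attempts = 0
  begin
    yield
  rescue Deltacloud::BackendError => e
    attempts += 1
    # only retry the EC2 throttling error, and only a bounded number of times
    raise unless e.message =~ /RequestLimitExceeded/ && attempts < max_attempts
    sleep(2 ** attempts)  # back off 2, 4, 8, ... seconds
    retry
  end
end

# hypothetical usage:
# with_ec2_backoff { ec2.delete_tags(instance_id, 'name') }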
(In reply to comment #2)
> So I have been attempting to chase this down and here is what I am seeing:
> dbomatic dies when an unhandled EC2 exception occurs in deltacloud, and the
> dbomatic logs give no indication as to why. So a couple of things need to
> happen here, possibly split out into separate bugs: harden dbomatic to
> prevent it crashing when this occurs, and get that exception handled
> properly. Additionally, some improvements to the dbomatic logging would be
> a nice-to-have.

Good work so far. The thing is that dbomatic is only indirectly connected to what happens in deltacloud. That is, dbomatic is merely looking for changes in /var/log/condor/EventLog. That file gets updated when "something" happens to a job in condor (start, stop, crash, etc.). So what must be happening is that one of these transitions is causing something to be written to the EventLog, which in turn causes dbomatic to crash.

Dave, next time you see this, can you please also collect /var/log/condor/EventLog* and /var/log/condor/GridmanagerLog*? That should help us identify what is causing the dbomatic crash.

Chris Lalancette
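For anyone following along, here is a stripped-down sketch of the watch loop described above. The rb-inotify calls and the event.name/event.flags checks mirror the dbomatic snippets later in this bug; the directory path and the puts body are placeholders, not the actual dbomatic code.

require 'rb-inotify'

notifier = INotify::Notifier.new
notifier.watch("/var/log/condor", :all_events) do |event|
  # dbomatic only acts on modifications to EventLog itself
  if event.name == "EventLog" && event.flags.include?(:modify)
    puts "EventLog changed -- this is where dbomatic re-parses it"
  end
end
notifier.run  # blocks, dispatching filesystem events to the block above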
Created attachment 497426 [details]
logs.tgz

Attached are the requested log files (condor/EventLog, condor/Gridmanager*, ec2.log, dbomatic.log), zipped up. Because some of the logs don't have timestamps, I attempted to tail the logs while printing out the dbomatic process from ps; it may or may not help. In this case it was a different unhandled exception from EC2 that seemed to trigger the failure.

# When the dbomatic process isn't found, it just prints out a date stamp like so: 'Fri May 6 14:00:50 EDT 2011'
# You will see it dies at 'Fri May 6 14:01:09 EDT 2011'

Fri May 6 14:00:44 EDT 2011
aeolus 20230 1 1 13:58 ? 00:00:01 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:00:46 EDT 2011
aeolus 20230 1 1 13:58 ? 00:00:01 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:00:48 EDT 2011
aeolus 20230 1 1 13:58 ? 00:00:01 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic

==> condor/GridmanagerLog.aeolus <==
05/06/11 14:00:50 [21636] GAHP[21797] -> 'R'
05/06/11 14:00:50 [21636] GAHP[21797] -> 'S' '1'
05/06/11 14:00:50 [21636] GAHP[21797] -> '2' 'NULL' 'i-fa350995' 'STOPPED' 'i-f8350997' 'STOPPED' 'i-f6350999' 'STOPPED' 'i-cc585ea3' 'RUNNING' 'i-a00b09cf' 'RUNNING' 'i-840f0deb' 'RUNNING' 'i-d2dfddbd' 'RUNNING' 'i-76606d19' 'RUNNING' 'i-dc606db3' 'STOPPED' 'i-e27b768d' 'STOPPED' 'i-d00b06bf' 'STOPPED' 'i-ae0b06c1' 'RUNNING'
05/06/11 14:00:50 [21636] resource http://localhost:3003/api is now up
05/06/11 14:00:50 [21636] (32.0) doEvaluateState called: gmState GM_CREATE_VM, condorState 1
05/06/11 14:00:50 [21636] GAHP[21797] <- 'DELTACLOUD_VM_SUBMIT 4 http://localhost:3003/api AKIAJI5RB627E2VSWRYQ yUp2VHJQZ3xYkz5LRxxg4OzPHjgnAJxvZrvKxBFC ami-a0d926c9 Condor_localhost_ip-10-82-250-97.ec2.internal#32.0#1304704825 NULL c1.xlarge 7168.0 20.0 1690.0 ec2-east-dj_1304690302_key_70222408432860 NULL'
05/06/11 14:00:50 [21636] GAHP[21797] -> 'S'

==> deltacloud-ec2-us-east-1/ec2.log <==
127.0.0.1 - - [06/May/2011 14:00:50] "GET /api HTTP/1.1" 200 1143 0.0741

Fri May 6 14:00:50 EDT 2011
aeolus 20230 1 1 13:58 ? 00:00:01 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic

Deltacloud::BackendError - InvalidInstanceID.NotFound: The instance ID 'i-001d106f' does not exist
REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=CreateTags&ResourceId.1=i-001d106f&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Tag.1.Value=Condor_localhost_ip-10-82-250-97.ec2.internal%2332.0%231304704825&Timestamp=2011-05-06T18%3A00%3A51.000Z&Version=2010-08-31&Signature=qrQUWa6utrhbEUjHEa11QcQu9GVDmhiOOAfML17pRcc%3D
REQUEST ID=1150d87f-0afb-436e-b542-dd5c65c59c0f :
././lib/deltacloud/base_driver/base_driver.rb:241:in `safely'
././lib/deltacloud/drivers/ec2/ec2_driver.rb:524:in `tag_instance'
././lib/deltacloud/drivers/ec2/ec2_driver.rb:192:in `create_instance'
././lib/deltacloud/base_driver/base_driver.rb:239:in `call'
././lib/deltacloud/base_driver/base_driver.rb:239:in `safely'
././lib/deltacloud/drivers/ec2/ec2_driver.rb:187:in `create_instance'
././server.rb:325
././lib/sinatra/rabbit.rb:67:in `instance_eval'
././lib/sinatra/rabbit.rb:67:in `POST /api/instances'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:863:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:863:in `route'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:521:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:521:in `route_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:500:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:497:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:497:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:476:in `each'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:476:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:601:in `dispatch!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:411:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:566:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:411:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:399:in `call'
././lib/sinatra/rack_runtime.rb:17:in `call'
././lib/sinatra/rack_etag.rb:23:in `call'
././lib/sinatra/rack_driver_select.rb:21:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-accept-0.4.3/lib/rack/accept/context.rb:22:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/methodoverride.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.1.0/lib/rack/commonlogger.rb:18:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:1005:in `synchronize'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.0/lib/sinatra/base.rb:979:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:33:in `log'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/stats.rb:24:in `call'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:76:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:74:in `catch'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:74:in `pre_process'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `call'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1060:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `initialize'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `new'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1057:in `spawn_threadpool'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:1049:in `defer'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:54:in `process'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/connection.rb:42:in `receive_data'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
/usr/lib/ruby/gems/1.8/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/backends/base.rb:57:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/server.rb:156:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/controllers/controller.rb:80:in `start'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:177:in `send'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:177:in `run_command'
/usr/lib/ruby/gems/1.8/gems/thin-1.2.5/lib/thin/runner.rb:143:in `run!'
/usr/lib/ruby/gems/1.8/gems/deltacloud-core-0.2.0/bin/deltacloudd:123
/usr/bin/deltacloudd:19:in `load'
/usr/bin/deltacloudd:19
127.0.0.1 - - [06/May/2011 14:00:51] "POST /api/instances HTTP/1.1" 500 5688 0.8233

==> condor/GridmanagerLog.aeolus <==
05/06/11 14:00:51 [21636] GAHP[21797] -> 'R'
05/06/11 14:00:51 [21636] GAHP[21797] -> 'S' '1'
05/06/11 14:00:51 [21636] GAHP[21797] -> '4' 'Create_Instance_Failure: InvalidInstanceID.NotFound: The instance ID 'i-001d106f' does not exist'
05/06/11 14:00:51 [21636] GAHP[21797] -> ' REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=CreateTags&ResourceId.1=i-001d106f&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Tag.1.Value=Condor_localhost_ip-10-82-250-97.ec2.internal%2332.0%231304704825&Timestamp=2011-05-06T18%3A00%3A51.000Z&Version=2010-08-31&Signature=qrQUWa6utrhbEUjHEa11QcQu9GVDmhiOOAfML17pRcc%3D '
05/06/11 14:00:51 [21636] (32.0) doEvaluateState called: gmState GM_CREATE_VM, condorState 1
05/06/11 14:00:51 [21636] (32.0) job submit failed: (null): Create_Instance_Failure: InvalidInstanceID.NotFound: The instance ID 'i-001d106f' does not exist
05/06/11 14:00:51 [21636] (32.0) gm state change: GM_CREATE_VM -> GM_HOLD
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13/56
05/06/11 14:00:51 [21636] FileLock object is updating timestamp on: /tmp/condorLocks/13/56/9182961121613650.lockc
05/06/11 14:00:51 [21636] FileLock object is updating timestamp on: /tmp/condorLocks/13/56/9182961121613650.lockc
05/06/11 14:00:51 [21636] FileLock::obtain(1) - @1304704851.236664 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:00:51 [21636] FileLock::obtain(2) - @1304704851.236769 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:00:51 [21636] (32.0) Writing hold record to user logfile
05/06/11 14:00:51 [21636] FileLock::obtain(1) - @1304704851.236961 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:00:51 [21636] directory_util::rec_clean_up: file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:00:51 [21636] Lock file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:00:51 [21636] FileLock::obtain(2) - @1304704851.237173 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13/56
05/06/11 14:00:51 [21636] FileLock object is updating timestamp on: /tmp/condorLocks/13/56/9182961121613650.lockc
05/06/11 14:00:51 [21636] FileLock::obtain(1) - @1304704851.237793 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:00:51 [21636] FileLock::obtain(2) - @1304704851.237861 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:00:51 [21636] FileLock::obtain(1) - @1304704851.237919 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:00:51 [21636] FileLock::obtain(2) - @1304704851.238157 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:00:51 [21636] FileLock::obtain(1) - @1304704851.238398 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:00:51 [21636] FileLock::obtain(2) - @1304704851.238607 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:00:51 [21636] FileLock::obtain(1) - @1304704851.238729 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:00:51 [21636] directory_util::rec_clean_up: file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:00:51 [21636] Lock file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:00:51 [21636] FileLock::obtain(2) - @1304704851.238985 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:00:51 [21636] FileLock::obtain(2) - @1304704851.239150 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13
05/06/11 14:00:51 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13/56
05/06/11 14:00:51 [21636] FileLock::obtain(1) - @1304704851.239740 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:00:51 [21636] directory_util::rec_clean_up: file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:00:51 [21636] Lock file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:00:51 [21636] FileLock::obtain(2) - @1304704851.240004 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:00:51 [21636] (32.0) gm state change: GM_HOLD -> GM_DELETE

==> condor/EventLog <==
<c>
    <a n="MyType"><s>JobHeldEvent</s></a>
    <a n="Cluster"><i>32</i></a>
    <a n="EventTypeNumber"><i>12</i></a>
    <a n="HoldReasonSubCode"><i>0</i></a>
    <a n="EventTime"><s>2011-05-06T14:00:51</s></a>
    <a n="HoldReason"><s>Create_Instance_Failure: InvalidInstanceID.NotFound: The instance ID 'i-001d106f' does not exist</s></a>
    <a n="Proc"><i>0</i></a>
    <a n="Subproc"><i>0</i></a>
    <a n="HoldReasonCode"><i>0</i></a>
    <a n="CurrentTime"><e>time()</e></a>
</c>
<c>
    <a n="MyType"><s>JobAdInformationEvent</s></a>
    <a n="TriggerEventTypeNumber"><i>12</i></a>
    <a n="DeltacloudUsername"><s>AKIAJI5RB627E2VSWRYQ</s></a>
    <a n="Cmd"><s>job_31_32</s></a>
    <a n="Cluster"><i>32</i></a>
    <a n="EventTypeNumber"><i>28</i></a>
    <a n="JobStatus"><i>5</i></a>
    <a n="GlobalJobId"><s>ip-10-82-250-97.ec2.internal#32.0#1304704825</s></a>
    <a n="HoldReasonSubCode"><i>0</i></a>
    <a n="EventTime"><s>2011-05-06T14:00:51</s></a>
    <a n="TriggerEventTypeName"><s>ULOG_JOB_HELD</s></a>
    <a n="HoldReason"><s>Create_Instance_Failure: InvalidInstanceID.NotFound: The instance ID 'i-001d106f' does not exist</s></a>
    <a n="Proc"><i>0</i></a>
    <a n="Subproc"><i>0</i></a>
    <a n="HoldReasonCode"><i>0</i></a>
    <a n="CurrentTime"><e>time()</e></a>
    <a n="Owner"><s>aeolus</s></a>
</c>

==> condor/GridmanagerLog.aeolus <==

==> condor/EventLog <==

==> condor/GridmanagerLog.aeolus <==
05/06/11 14:00:51 [21636] GAHP[21797] -> ' REQUEST ID=1150d87f-0afb-436e-b542-dd5c65c59c0f'

Fri May 6 14:00:52 EDT 2011
aeolus 20230 1 1 13:58 ? 00:00:02 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic

05/06/11 14:00:52 [21636] Received ADD_JOBS signal

Fri May 6 14:00:53 EDT 2011
aeolus 20230 1 1 13:58 ? 00:00:03 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic

05/06/11 14:00:54 [21636] in doContactSchedd()
05/06/11 14:00:54 [21636] querying for new jobs
05/06/11 14:00:54 [21636] Using constraint ((Owner=?="aeolus"&&JobUniverse==9)) && (Managed =!= "ScheddDone") && (Matched =!= FALSE) && (JobStatus != 5) && (Managed =!= "External")
05/06/11 14:00:54 [21636] Using job type Deltacloud for job 33.0
05/06/11 14:00:54 [21636] (33.0) SetJobLeaseTimers()
05/06/11 14:00:54 [21636] (33.0) deltacloud job: hwp_id c1.xlarge, hwp_cpu 20.0, hwp_mem 7168.0, hwp_storage 1690.0

Fri May 6 14:00:54 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:03 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:00:55 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:00:56 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:00:57 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:00:58 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:00:59 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:01:00 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:01:01 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:01:02 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:01:03 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic

05/06/11 14:01:04 [21636] Found job 33.0 --- inserting
05/06/11 14:01:04 [21636] Fetched 1 new job ads from schedd
05/06/11 14:01:04 [21636] querying for removed/held jobs
05/06/11 14:01:04 [21636] Using constraint ((Owner=?="aeolus"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
05/06/11 14:01:04 [21636] Fetched 0 job ads from schedd
05/06/11 14:01:04 [21636] Updating classad values for 32.0:
05/06/11 14:01:04 [21636] EnteredCurrentStatus = 1304704851
05/06/11 14:01:04 [21636] HoldReason = "Create_Instance_Failure: InvalidInstanceID.NotFound: The instance ID 'i-001d106f' does not exist"
05/06/11 14:01:04 [21636] HoldReasonCode = 0
05/06/11 14:01:04 [21636] HoldReasonSubCode = 0
05/06/11 14:01:04 [21636] JobStatus = 5
05/06/11 14:01:04 [21636] Managed = "Schedd"
05/06/11 14:01:04 [21636] NumSystemHolds = 1
05/06/11 14:01:04 [21636] ReleaseReason = undefined
05/06/11 14:01:04 [21636] leaving doContactSchedd()
05/06/11 14:01:04 [21636] (33.0) doEvaluateState called: gmState GM_INIT, condorState 1
05/06/11 14:01:04 [21636] (33.0) gm state change: GM_INIT -> GM_START
05/06/11 14:01:04 [21636] (33.0) gm state change: GM_START -> GM_CLEAR_REQUEST
05/06/11 14:01:04 [21636] (33.0) gm state change: GM_CLEAR_REQUEST -> GM_UNSUBMITTED
05/06/11 14:01:04 [21636] (33.0) gm state change: GM_UNSUBMITTED -> GM_SAVE_INSTANCE_NAME

==> condor/EventLog <==
<c>
    <a n="MyType"><s>SubmitEvent</s></a>
    <a n="SubmitHost"><s><10.82.250.97:54702></s></a>
    <a n="Cluster"><i>34</i></a>
    <a n="EventTypeNumber"><i>0</i></a>
    <a n="EventTime"><s>2011-05-06T14:01:04</s></a>
    <a n="Proc"><i>0</i></a>
    <a n="Subproc"><i>0</i></a>
    <a n="CurrentTime"><e>time()</e></a>
</c>
<c>
    <a n="MyType"><s>JobAdInformationEvent</s></a>
    <a n="TriggerEventTypeNumber"><i>0</i></a>
    <a n="DeltacloudUsername"><s>$$(username)</s></a>
    <a n="Cmd"><s>job_33_34</s></a>
    <a n="SubmitHost"><s><10.82.250.97:54702></s></a>
    <a n="Cluster"><i>34</i></a>
    <a n="EventTypeNumber"><i>28</i></a>
    <a n="JobStatus"><i>1</i></a>
    <a n="GlobalJobId"><s>ip-10-82-250-97.ec2.internal#34.0#1304704864</s></a>
    <a n="EventTime"><s>2011-05-06T14:01:04</s></a>
    <a n="TriggerEventTypeName"><s>ULOG_SUBMIT</s></a>
    <a n="Proc"><i>0</i></a>
    <a n="Subproc"><i>0</i></a>
    <a n="CurrentTime"><e>time()</e></a>
    <a n="Owner"><s>aeolus</s></a>
</c>

==> condor/GridmanagerLog.aeolus <==

Fri May 6 14:01:04 EDT 2011
aeolus 20230 1 2 13:58 ? 00:00:04 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:01:05 EDT 2011
aeolus 20230 1 3 13:58 ? 00:00:05 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:01:06 EDT 2011
aeolus 20230 1 3 13:58 ? 00:00:06 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic
Fri May 6 14:01:08 EDT 2011
aeolus 20230 1 3 13:58 ? 00:00:07 /usr/bin/ruby /usr/share/aeolus-conductor/dbomatic/dbomatic

05/06/11 14:01:09 [21636] in doContactSchedd()
05/06/11 14:01:09 [21636] querying for removed/held jobs
05/06/11 14:01:09 [21636] Using constraint ((Owner=?="aeolus"&&JobUniverse==9)) && ((Managed =!= "ScheddDone")) && (JobStatus == 3 || JobStatus == 4 || (JobStatus == 5 && Managed =?= "External"))
05/06/11 14:01:09 [21636] Fetched 0 job ads from schedd
05/06/11 14:01:09 [21636] Updating classad values for 33.0:
05/06/11 14:01:09 [21636] GridJobId = "deltacloud Condor_localhost_ip-10-82-250-97.ec2.internal#33.0#1304704844"
05/06/11 14:01:09 [21636] LastRemoteStatusUpdate = 1304704864
05/06/11 14:01:09 [21636] leaving doContactSchedd()
05/06/11 14:01:09 [21636] (33.0) doEvaluateState called: gmState GM_SAVE_INSTANCE_NAME, condorState 1
05/06/11 14:01:09 [21636] (33.0) gm state change: GM_SAVE_INSTANCE_NAME -> GM_CREATE_VM
05/06/11 14:01:09 [21636] GAHP[21797] <- 'DELTACLOUD_VM_SUBMIT 5 http://localhost:3003/api AKIAJI5RB627E2VSWRYQ yUp2VHJQZ3xYkz5LRxxg4OzPHjgnAJxvZrvKxBFC ami-a0d926c9 Condor_localhost_ip-10-82-250-97.ec2.internal#33.0#1304704844 NULL c1.xlarge 7168.0 20.0 1690.0 ec2-east-dj_1304690302_key_70222408432860 NULL'
05/06/11 14:01:09 [21636] GAHP[21797] -> 'S'

==> deltacloud-ec2-us-east-1/ec2.log <==
127.0.0.1 - - [06/May/2011 14:01:09] "GET /api HTTP/1.1" 200 1143 0.0059

Fri May 6 14:01:09 EDT 2011

Aws::Ec2 using per_request-connection mode
New Aws::Ec2 using per_request-connection mode
New Aws::Ec2 using per_request-connection mode
New Aws::Ec2 using per_request-connection mode
New Aws::Ec2 using per_request-connection mode
New Aws::Ec2 using per_request-connection mode
New Aws::Ec2 using per_request-connection mode
New Aws::Ec2 using per_request-connection mode
Launching instance of image ami-a0d926c9 for AKIAJI5RB627E2VSWRYQ, key: ec2-east-dj_1304690302_key_70222408432860, groups:
New Aws::Ec2 using per_request-connection mode
##### Aws::Ec2 returned an error: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidInstanceID.NotFound</Code><Message>The instance ID 'i-001d106f' does not exist</Message></Error></Errors><RequestID>1150d87f-0afb-436e-b542-dd5c65c59c0f</RequestID></Response>
#####
##### Aws::Ec2 request: ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=CreateTags&ResourceId.1=i-001d106f&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Tag.1.Value=Condor_localhost_ip-10-82-250-97.ec2.internal%2332.0%231304704825&Timestamp=2011-05-06T18%3A00%3A51.000Z&Version=2010-08-31&Signature=qrQUWa6utrhbEUjHEa11QcQu9GVDmhiOOAfML17pRcc%3D ####
======= UNHANDLED EXCEPTION ============
#<Deltacloud::BackendError: InvalidInstanceID.NotFound: The instance ID 'i-001d106f' does not exist
REQUEST=ec2.us-east-1.amazonaws.com:443/?AWSAccessKeyId=AKIAJI5RB627E2VSWRYQ&Action=CreateTags&ResourceId.1=i-001d106f&SignatureMethod=HmacSHA256&SignatureVersion=2&Tag.1.Key=name&Tag.1.Value=Condor_localhost_ip-10-82-250-97.ec2.internal%2332.0%231304704825&Timestamp=2011-05-06T18%3A00%3A51.000Z&Version=2010-08-31&Signature=qrQUWa6utrhbEUjHEa11QcQu9GVDmhiOOAfML17pRcc%3D
REQUEST ID=1150d87f-0afb-436e-b542-dd5c65c59c0f >
========================================
New Aws::Ec2 using per_request-connection mode
New Aws::Ec2 using per_request-connection mode
Launching instance of image ami-a0d926c9 for AKIAJI5RB627E2VSWRYQ, key: ec2-east-dj_1304690302_key_70222408432860, groups:

Fri May 6 14:01:10 EDT 2011

127.0.0.1 - - [06/May/2011 14:01:10] "POST /api/instances HTTP/1.1" 201 874 1.2510

==> condor/GridmanagerLog.aeolus <==
05/06/11 14:01:10 [21636] GAHP[21797] -> 'R'
05/06/11 14:01:10 [21636] GAHP[21797] -> 'S' '1'
05/06/11 14:01:10 [21636] GAHP[21797] -> '5' 'NULL' 'id=i-ac1d10c3' 'state=PENDING' 'actions=stop' 'public_addresses=' 'private_addresses='
05/06/11 14:01:10 [21636] (33.0) doEvaluateState called: gmState GM_CREATE_VM, condorState 1
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13/56
05/06/11 14:01:10 [21636] FileLock object is updating timestamp on: /tmp/condorLocks/13/56/9182961121613650.lockc
05/06/11 14:01:10 [21636] FileLock object is updating timestamp on: /tmp/condorLocks/13/56/9182961121613650.lockc
05/06/11 14:01:10 [21636] FileLock::obtain(1) - @1304704870.278845 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:01:10 [21636] FileLock::obtain(2) - @1304704870.278918 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:01:10 [21636] (33.0) Writing grid submit record to user logfile
05/06/11 14:01:10 [21636] FileLock::obtain(1) - @1304704870.279086 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:01:10 [21636] directory_util::rec_clean_up: file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:01:10 [21636] Lock file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:01:10 [21636] FileLock::obtain(2) - @1304704870.279300 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13/56
05/06/11 14:01:10 [21636] FileLock object is updating timestamp on: /tmp/condorLocks/13/56/9182961121613650.lockc
05/06/11 14:01:10 [21636] FileLock::obtain(1) - @1304704870.279875 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:01:10 [21636] FileLock::obtain(2) - @1304704870.279942 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:01:10 [21636] FileLock::obtain(1) - @1304704870.280003 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE

==> condor/EventLog <==
<c>
    <a n="MyType"><s>GridSubmitEvent</s></a>
    <a n="Cluster"><i>33</i></a>
    <a n="EventTypeNumber"><i>27</i></a>
    <a n="GridResource"><s>deltacloud http://localhost:3003/api</s></a>
    <a n="EventTime"><s>2011-05-06T14:01:10</s></a>
    <a n="GridJobId"><s>deltacloud Condor_localhost_ip-10-82-250-97.ec2.internal#33.0#1304704844 i-ac1d10c3</s></a>
    <a n="Proc"><i>0</i></a>
    <a n="Subproc"><i>0</i></a>
    <a n="CurrentTime"><e>time()</e></a>
</c>

==> condor/GridmanagerLog.aeolus <==
05/06/11 14:01:10 [21636] FileLock::obtain(2) - @1304704870.280222 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:01:10 [21636] FileLock::obtain(1) - @1304704870.280469 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE

==> condor/EventLog <==
<c>
    <a n="MyType"><s>JobAdInformationEvent</s></a>
    <a n="DeltacloudPublicNetworkAddresses"><s></s></a>
    <a n="TriggerEventTypeNumber"><i>27</i></a>
    <a n="DeltacloudUsername"><s>AKIAJI5RB627E2VSWRYQ</s></a>
    <a n="Cmd"><s>job_32_33</s></a>
    <a n="Cluster"><i>33</i></a>
    <a n="EventTypeNumber"><i>28</i></a>
    <a n="JobStatus"><i>1</i></a>
    <a n="GridResource"><s>deltacloud http://localhost:3003/api</s></a>
    <a n="DeltacloudProviderId"><s>i-ac1d10c3</s></a>
    <a n="GlobalJobId"><s>ip-10-82-250-97.ec2.internal#33.0#1304704844</s></a>
    <a n="EventTime"><s>2011-05-06T14:01:10</s></a>
    <a n="DeltacloudPrivateNetworkAddresses"><s></s></a>
    <a n="TriggerEventTypeName"><s>ULOG_GRID_SUBMIT</s></a>
    <a n="GridJobId"><s>deltacloud Condor_localhost_ip-10-82-250-97.ec2.internal#33.0#1304704844 i-ac1d10c3</s></a>
    <a n="Proc"><i>0</i></a>
    <a n="Subproc"><i>0</i></a>
    <a n="DeltacloudAvailableActions"><s>stop</s></a>
    <a n="CurrentTime"><e>time()</e></a>
    <a n="Owner"><s>aeolus</s></a>
</c>

==> condor/GridmanagerLog.aeolus <==
05/06/11 14:01:10 [21636] FileLock::obtain(2) - @1304704870.280741 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:01:10 [21636] FileLock::obtain(1) - @1304704870.280846 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:01:10 [21636] directory_util::rec_clean_up: file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:01:10 [21636] Lock file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:01:10 [21636] FileLock::obtain(2) - @1304704870.281201 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:01:10 [21636] FileLock::obtain(2) - @1304704870.281397 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13
05/06/11 14:01:10 [21636] directory_util::rec_touch_file: Creating directory /tmp/condorLocks/13/56
05/06/11 14:01:10 [21636] FileLock::obtain(1) - @1304704870.282158 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now WRITE
05/06/11 14:01:10 [21636] directory_util::rec_clean_up: file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:01:10 [21636] Lock file /tmp/condorLocks/13/56/9182961121613650.lockc has been deleted.
05/06/11 14:01:10 [21636] FileLock::obtain(2) - @1304704870.282513 lock on /tmp/condorLocks/13/56/9182961121613650.lockc now UNLOCKED
05/06/11 14:01:10 [21636] (33.0) gm state change: GM_CREATE_VM -> GM_SAVE_INSTANCE_ID
05/06/11 14:01:10 [21636] (33.0) doEvaluateState called: gmState GM_SAVE_INSTANCE_ID, condorState 1

Fri May 6 14:01:11 EDT 2011

05/06/11 14:01:12 [21636] Received ADD_JOBS signal

Fri May 6 14:01:12 EDT 2011
Created a JIRA ticket for Deltacloud API. (DTACLOUD-39)
So as far as reproduction steps, it's pretty inconsistent, but this is what I did this morning:
1. Launch the Cloud Engine AMI, RHEL-6.1-CloudEngine-x86_64-8-Access (ami-a0d926c9)
2. Add a provider account
3. Import an AMI into Cloud Engine
4. Launch 5-10 instances
5. If no dbomatic crash occurs, stop all instances
6. If no dbomatic crash occurs, launch 5-10 instances
7. Repeat steps 5 and 6 as needed

Please note that the dbomatic crash can occur both while creating instances and while stopping them; it appears to be purely random. Both cases are logged above, in comment 2 (delete) and comment 4 (create).
Created attachment 499962 [details]
conductor & condor logs

So I added the following to the dbomatic script to print out the events that it is parsing:

def characters(string)
  unless string.strip == ""
    @logger.info "character string - #{string}"   # <----------------
    if @tag == "MyType"
      @event_type = string

At this point I restarted dbomatic and started 10 images. If dbomatic crashed, I marked it in the dbomatic.log similar to the following and restarted. Then I would stop the images, and if dbomatic crashed I would update the log again and do it all over again.

******************************************************************************************
******* dbomatic died here on stopping multiple images (seemed to crash on the first), manually restarting to try again... *********
******************************************************************************************

This was the last message I added to the log. I restarted dbomatic and it finished printing out a bunch of events about the request limit being exceeded; all of the instances went to error on the restart, but dbomatic was still running at the end.
So I did more debugging around the dbomatic crash and here is what I got... First I had to add a ton of log messages, and I isolated it down to this code block from the dbomatic script:

notifier.watch(condor_event_log_dir, :all_events){ |event|
  logger.info("inside notifier.watch, checking #{event.name}...")
  if event.name == "EventLog" && event.flags.include?(:modify)
    parse_log_file parser
  end
  logger.info("notifier.watch, finished checking #{event.name}...")
}

I surrounded this code with a begin/rescue block, but that revealed nothing; the dbomatic log file still showed the process silently ending with a bunch of these messages:

inside notifier.watch, checking GridmanagerLog.aeolus...
notifier.watch, finished checking GridmanagerLog.aeolus...

So it seems that this code block runs out of events and dies? It never reaches the next lines in the script, which are:

while true
  begin
    logger.info("starting notifier.run...")   # dajo added
    notifier.run
    logger.info("finishing notifier.run...")  # dajo added
  rescue => e

So I am at a loss as to what the next step is, but I believe it is related to notifier.watch. Looking at its rdoc, it's supposed to raise a SystemCallError if something bad happens, but my begin/rescue didn't catch anything.
notifier.watch registers a callback that gets invoked whenever there is a change to the condor log directory. A begin/rescue around the call to notifier.watch won't necessarily catch the error, because you would be wrapping the registration of the notifier handler itself, not the invocation of that handler. If instead you surround the code inside the notifier.watch handler with a begin/rescue-all, you might have more luck (be sure to print out the backtrace).
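To make the registration-vs-invocation distinction concrete, a minimal sketch (handle, dir, and logger are placeholder names, not dbomatic's actual ones):

# Wrapping the registration catches almost nothing useful:
begin
  notifier.watch(dir, :all_events) { |event| handle(event) }  # only registers the block
rescue => e
  # reached only if registration itself fails (e.g. SystemCallError on a bad path)
end

# Wrapping the body of the handler catches errors at invocation time,
# i.e. when notifier.run later dispatches an event to the block:
notifier.watch(dir, :all_events) do |event|
  begin
    handle(event)
  rescue => e
    logger.error "#{e.class}: #{e.message}"
    e.backtrace.each { |step| logger.error "\tfrom #{step}" }
  end
end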
Comment 9 suggests surrounding the lines within notifier.watch with a begin/rescue, but that didn't reveal anything, which is consistent with the previous log output:

inside notifier.watch, checking GridmanagerLog.aeolus...
notifier.watch, finished checking GridmanagerLog.aeolus...

It starts checking an event and comes back to print the finished log message. It's as if notifier.watch dries up (under heavy use, perhaps?). Here is what I tried this time:

notifier.watch(condor_event_log_dir, :all_events){ |event|
  begin
    logger.info("inside notifier.watch, checking #{event.name}...")
    if event.name == "EventLog" && event.flags.include?(:modify)
      logger.info("starting parse_log_file-parser inside notifier.watch...")
      parse_log_file parser
      logger.info("finished parse_log_file-parser inside notifier.watch...")
    end
    logger.info("notifier.watch, finished checking #{event.name}...")
  rescue => e
    logger.error "#{e.backtrace.shift}: #{e.message}"
    e.backtrace.each do |step|
      logger.error "\tfrom #{step}"
    end
  end
}
What is the result of this last round of testing with this output?

I just sent a patch for extended dbomatic logging to the list. I would try building a conductor RPM with that and running the tests; hopefully this will give us a more detailed look at what is failing where.

https://fedorahosted.org/pipermail/aeolus-devel/2011-May/001635.html
The same as comment 8; the log ended with a bunch of the following:

inside notifier.watch, checking GridmanagerLog.aeolus...
notifier.watch, finished checking GridmanagerLog.aeolus...
I pulled the patch down from the mailing list, moved it into place, and gave it a whirl. There are several lines that throw exceptions on a bad logger object; I commented out several and gave up on it. Below is a snippet of the end of the log after the process disappears. Feel free to use this server and attempt to bring the issue to a resolution: 10.16.120.131

EventLog modification event triggered, parsing log
EventLog modification event trigger completed, parsing finished - current position 301180
EventLog modification event triggered, parsing log
Unexpected nil instance, skipping...
EventLog modification event trigger completed, parsing finished - current position 301860
EventLog modification event triggered, parsing log
EventLog modification event trigger completed, parsing finished - current position 302350
EventLog modification event triggered, parsing log
update_instance_state_event for #<Instance:0x7fd9a448b6f8>
Unexpected nil GridResource/ExecuteHost field, skipping cloud id update
EventLog modification event trigger completed, parsing finished - current position 303183
Mo passed me some new RPMs to test, which include a bunch of additional logging. With it, these two similar stack traces are printed multiple times in the dbomatic-parser.log. Feel free to log into 10.16.120.131 to take a peek...

Instance #<Instance:0x7f4b0de78330> found, running update events
update_instance_state_event for #<Instance:0x7f4b0de78330>
update_instance_state_event saving instance #<Instance:0x7f4b0de78330>
/usr/lib/ruby/gems/1.8/gems/activerecord-2.3.8/lib/active_record/attribute_methods.rb:264:in `method_missing': undefined method `stat' for #<Instance:0x7f4b0de78330>
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:173:in `update_instance_state_event'
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:253:in `end_element'
        from /usr/lib/ruby/gems/1.8/gems/nokogiri-1.4.3.1/lib/nokogiri/xml/sax/document.rb:125:in `end_element_namespace'
        from /usr/lib/ruby/gems/1.8/gems/nokogiri-1.4.3.1/lib/nokogiri/xml/sax/push_parser.rb:47:in `native_write'
        from /usr/lib/ruby/gems/1.8/gems/nokogiri-1.4.3.1/lib/nokogiri/xml/sax/push_parser.rb:47:in `<<'
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:313:in `parse_log_file'
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:337
update_instance_state_event completed fo #<Instance:0x7f4b0de78330>
update_instance_cloud_id for #<Instance:0x7f4b0de78330>
Instance #<Instance:0x7f4b0de5c108> found, running update events
update_instance_state_event for #<Instance:0x7f4b0de5c108>
update_instance_state_event saving instance #<Instance:0x7f4b0de5c108>
/usr/lib/ruby/gems/1.8/gems/activerecord-2.3.8/lib/active_record/attribute_methods.rb:264:in `method_missing': undefined method `stat' for #<Instance:0x7f4b0de5c108>
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:173:in `update_instance_state_event'
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:253:in `end_element'
        from /usr/lib/ruby/gems/1.8/gems/nokogiri-1.4.3.1/lib/nokogiri/xml/sax/document.rb:125:in `end_element_namespace'
        from /usr/lib/ruby/gems/1.8/gems/nokogiri-1.4.3.1/lib/nokogiri/xml/sax/push_parser.rb:47:in `native_write'
        from /usr/lib/ruby/gems/1.8/gems/nokogiri-1.4.3.1/lib/nokogiri/xml/sax/push_parser.rb:47:in `<<'
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:313:in `parse_log_file'
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:351
        from /usr/lib/ruby/gems/1.8/gems/rb-inotify-0.8.1/lib/rb-inotify/watcher.rb:41:in `[]'
        from /usr/lib/ruby/gems/1.8/gems/rb-inotify-0.8.1/lib/rb-inotify/watcher.rb:41:in `callback!'
        from /usr/lib/ruby/gems/1.8/gems/rb-inotify-0.8.1/lib/rb-inotify/event.rb:128:in `callback!'
        from /usr/lib/ruby/gems/1.8/gems/rb-inotify-0.8.1/lib/rb-inotify/notifier.rb:215:in `process'
        from /usr/lib/ruby/gems/1.8/gems/rb-inotify-0.8.1/lib/rb-inotify/notifier.rb:215:in `each'
        from /usr/lib/ruby/gems/1.8/gems/rb-inotify-0.8.1/lib/rb-inotify/notifier.rb:215:in `process'
        from /usr/lib/ruby/gems/1.8/gems/rb-inotify-0.8.1/lib/rb-inotify/notifier.rb:198:in `run'
        from /usr/share/aeolus-conductor/dbomatic/dbomatic:357
update_instance_state_event completed fo #<Instance:0x7f4b0de5c108>
update_instance_cloud_id for #<Instance:0x7f4b0de5c108>
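Given that backtrace, one way to harden dbomatic would be to rescue inside the SAX end_element handler itself, so a single bad EventLog record can't unwind all the way through nokogiri and rb-inotify and take the process down. A hedged sketch, not the actual dbomatic code: the method name update_instance_state_event comes from the trace above, and @logger plus the call signature are assumptions for illustration.

def end_element(name)
  update_instance_state_event(name)  # hypothetical stand-in for the existing dispatch
rescue => e
  @logger.error "error handling element #{name}: #{e.class}: #{e.message}"
  e.backtrace.each { |step| @logger.error "\tfrom #{step}" }
  # swallow the error so one bad EventLog entry doesn't kill the whole parser
end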
Instances in various states; I ran condor_rm on all jobs in condor, then tried to remove failed jobs in the Conductor UI:

-- Submitter: dhcp-2-205.brq.redhat.com : <10.34.2.205:41258> : dhcp-2-205.brq.redhat.com
 ID      OWNER   SUBMITTED    RUN_TIME   ST PRI SIZE CMD
 1.0     aeolus  5/27 19:11   0+00:00:00 X  0   0.0  job_ce12_3
 2.0     aeolus  5/27 14:18   0+00:00:00 X  0   0.0  job_ce20_4
 3.0     aeolus  5/27 15:04   0+00:00:00 X  0   0.0  job_ce30_5
 150.0   aeolus  5/27 23:50   3+16:10:37 X  0   0.0  job_instancep861l_
 151.0   aeolus  5/28 00:03   0+00:00:00 X  0   0.0  job_instancemanyp_
 152.0   aeolus  5/28 00:04   0+00:00:00 X  0   0.0  job_instancerfqjo_
 153.0   aeolus  5/28 00:10   0+00:00:00 X  0   0.0  job_instance9vd87_
 154.0   aeolus  5/28 00:21   0+00:00:00 X  0   0.0  job_instancelud5s_
 155.0   aeolus  5/28 00:21   0+00:00:00 X  0   0.0  job_instancejk78f_
 156.0   aeolus  5/28 00:22   0+00:00:00 X  0   0.0  job_instancedj05a_
 157.0   aeolus  5/28 00:23   0+00:00:00 X  0   0.0  job_instanceggjej_
 158.0   aeolus  5/28 00:23   0+00:00:00 X  0   0.0  job_instancewpr83_
 159.0   aeolus  5/28 00:24   0+00:00:00 X  0   0.0  job_instances1llc_
 160.0   aeolus  5/28 00:24   0+00:00:00 X  0   0.0  job_instancejjj60_
 161.0   aeolus  5/28 00:25   0+00:00:00 X  0   0.0  job_instanceu53jc_
 162.0   aeolus  5/28 00:25   0+00:00:00 X  0   0.0  job_instancee68i1_
 163.0   aeolus  5/28 00:26   0+00:00:00 X  0   0.0  job_instancef84kr_
 164.0   aeolus  5/28 00:26   0+00:00:00 X  0   0.0  job_instancexulkq_
 165.0   aeolus  5/28 00:27   0+00:00:00 X  0   0.0  job_instancejydh5_
 166.0   aeolus  5/28 00:27   0+00:00:00 X  0   0.0  job_instance9ixbh_
 167.0   aeolus  5/28 00:28   0+00:00:00 X  0   0.0  job_instanceg855o_
 168.0   aeolus  5/28 00:28   0+00:00:00 X  0   0.0  job_instancefmn7u_
 169.0   aeolus  5/31 16:30   0+00:00:00 X  0   0.0  job_instance4wmyl_
 170.0   aeolus  5/31 16:43   0+00:00:00 X  0   0.0  job_instancevcstw_

0 jobs; 0 idle, 0 running, 0 held

[root@dhcp-2-205 tmp]# tail -f /var/log/aeolus-conductor/dbomatic-parser.log
2011-05-31 17:05:26 INFO Instance instancee68i1 update events completed
2011-05-31 17:05:26 INFO Instance instancee68i1 found, running update events
2011-05-31 17:05:26 INFO update_instance_state_event for instancee68i1
2011-05-31 17:05:26 WARN Unexpected trigger type ULOG_GRID_RESOURCE_DOWN, not updating instance state
2011-05-31 17:05:26 INFO update_instance_cloud_id for instancee68i1
2011-05-31 17:05:26 INFO update_instance_cloud_id updating instance instancee68i1 to cloud provider ec2-us-east-1
2011-05-31 17:05:26 INFO update_instance_cloud_id completed for instancee68i1
2011-05-31 17:05:26 INFO update_instance_addresses for instancee68i1, setting public addresses: --- and private addresses
2011-05-31 17:05:26 INFO update_instance_addresses completed for instancee68i1
2011-05-31 17:05:26 INFO Instance instancee68i1 update events completed
2011-05-31 17:13:34 INFO Instance instancedj05a found, running update events
2011-05-31 17:13:34 INFO update_instance_state_event for instancedj05a
2011-05-31 17:13:34 INFO update_instance_state_event saving instance instancedj05a
2011-05-31 17:13:34 DEBUG updated_instance_state_event saved instance instancedj05a, creating event for state stopped@2011-05-31T17:13:34
2011-05-31 17:13:34 INFO update_instance_state_event completed fo instancedj05a
2011-05-31 17:13:34 INFO update_instance_cloud_id for instancedj05a
2011-05-31 17:13:34 WARN Unexpected nil GridResource/ExecuteHost field, skipping cloud id update
2011-05-31 17:13:34 INFO update_instance_addresses for instancedj05a, setting public addresses: --- and private addresses
2011-05-31 17:13:34 INFO update_instance_addresses completed for instancedj05a
2011-05-31 17:13:34 INFO Instance instancedj05a update events completed
2011-05-31 17:13:34 INFO Instance instancee68i1 found, running update events
2011-05-31 17:13:34 INFO update_instance_state_event for instancee68i1
2011-05-31 17:13:34 INFO update_instance_state_event saving instance instancee68i1
2011-05-31 17:13:34 DEBUG updated_instance_state_event saved instance instancee68i1, creating event for state stopped@2011-05-31T17:13:34
2011-05-31 17:13:34 INFO update_instance_state_event completed fo instancee68i1
2011-05-31 17:13:34 INFO update_instance_cloud_id for instancee68i1
2011-05-31 17:13:34 WARN Unexpected nil GridResource/ExecuteHost field, skipping cloud id update
2011-05-31 17:13:34 INFO update_instance_addresses for instancee68i1, setting public addresses: --- and private addresses
2011-05-31 17:13:34 INFO update_instance_addresses completed for instancee68i1
2011-05-31 17:13:34 INFO Instance instancee68i1 update events completed
Resolved: the issue in comment 15 turned out to be caused by a deltacloud EC2 driver that had crashed but was still running.
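For what it's worth, a crashed-but-still-running driver can be hard to spot because the process is alive. One way to detect it would be a crude liveness probe against the API entry point (the http://localhost:3003/api URL from the logs above); an illustrative sketch only, with deltacloud_alive? being a hypothetical helper:

require 'net/http'
require 'uri'

def deltacloud_alive?(url = "http://localhost:3003/api")
  res = Net::HTTP.get_response(URI.parse(url))
  res.is_a?(Net::HTTPSuccess)  # a wedged driver may hold the port open yet never answer 200
rescue StandardError, Timeout::Error
  false
end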
No commit messages. :( Moving to ON_QA because we know this is fixed.
Should be retested, because dbomatic currently does not start correctly after install.
Verified.

[root@ibm-x3950m2-01 noarch]# rpm -qa | grep aeolus
aeolus-conductor-0.3.0-0.el6.20110623205403git551632a.noarch
aeolus-configure-2.0.1-0.el6.20110622123902gitdf4ae05.noarch
aeolus-conductor-doc-0.3.0-0.el6.20110623205403git551632a.noarch
aeolus-all-0.3.0-0.el6.20110623205403git551632a.noarch
rubygem-aeolus-cli-0.0.1-1.el6.20110623205403git551632a.noarch
aeolus-conductor-daemons-0.3.0-0.el6.20110623205403git551632a.noarch
aeolus-conductor-devel-0.3.0-0.el6.20110623205403git551632a.noarch
release pending...
release pending 3
release pending.. 2
perm close
closing out old bugs