Bug 788819

Summary: No visibility when dbomatic has an issue starting up an instance when resources are low on the cloud provider
Product: [Retired] CloudForms Cloud Engine
Reporter: Richard Su <rwsu>
Component: aeolus-conductor
Assignee: Angus Thomas <athomas>
Status: CLOSED DUPLICATE
QA Contact: wes hayutin <whayutin>
Severity: unspecified
Priority: unspecified
Version: 1.0.0
CC: akarol, dajohnso, deltacloud-maint, ssachdev
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-02-09 14:33:18 UTC

Description Richard Su 2012-02-09 04:19:12 UTC
Description of problem:

A RHEV environment may run low on resources. For example, it may have enough disk space to create additional VMs while its hosts are close to capacity, so that only a few more instances can be started. In this situation, Conductor/dbomatic does not know how many resources remain available in RHEV. You can still create deployments, and the VMs are created in RHEV, but when dbomatic tries to start them, some instances come up and show as RUNNING while others remain stuck in the STOPPED state.

An error is written to /var/log/deltacloud-core/mock.log, but Conductor gives no visibility into what the problem is. It would be helpful if Conductor could show this error.

From mock.log:
thin server (localhost:3002) [deltacloud-mock][2742]: 127.0.0.1 - - [08/Feb/2012 23:00:49] "POST /api/instances/ae349b88-4cf6-464d-83a1-6af8e2b3224a/start HTTP/1.1" 500 5794 1.6570
thin server (localhost:3002) [deltacloud-mock][2742]: RHEVM::RHEVMBackendException:Cannot run VM. There are no available running Hosts with sufficient memory in VM's Cluster .
/usr/share/deltacloud-core/lib/deltacloud/drivers/rhevm/rhevm_client.rb:89:in `vm_action'
/usr/share/deltacloud-core/lib/deltacloud/drivers/rhevm/rhevm_driver.rb:157:in `start_instance'
/usr/share/deltacloud-core/lib/deltacloud/base_driver/exceptions.rb:151:in `call'
/usr/share/deltacloud-core/lib/deltacloud/base_driver/exceptions.rb:151:in `safely'
/usr/share/deltacloud-core/lib/deltacloud/drivers/rhevm/rhevm_driver.rb:156:in `start_instance'
/usr/share/deltacloud-core/lib/deltacloud/helpers/application_helper.rb:128:in `send'
/usr/share/deltacloud-core/lib/deltacloud/helpers/application_helper.rb:128:in `instance_action'
/usr/share/deltacloud-core/lib/deltacloud/server.rb:523
/usr/share/deltacloud-core/lib/sinatra/rabbit.rb:125:in `instance_eval'
/usr/share/deltacloud-core/lib/sinatra/rabbit.rb:125:in `POST /api/instances/:id/start'
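
To illustrate what information is already present in the log and could be surfaced to the user, here is a minimal Python sketch that pairs a failed start request with the exception line that follows it. It is not part of Conductor or Deltacloud. The function name `failed_starts` and both regular expressions are hypothetical, and the sketch assumes the exception line directly follows the request line, as it does in the excerpt above.

```python
import re

# Sample lines copied from the mock.log excerpt above.
LOG_LINES = [
    'thin server (localhost:3002) [deltacloud-mock][2742]: 127.0.0.1 - - '
    '[08/Feb/2012 23:00:49] "POST /api/instances/'
    'ae349b88-4cf6-464d-83a1-6af8e2b3224a/start HTTP/1.1" 500 5794 1.6570',
    "thin server (localhost:3002) [deltacloud-mock][2742]: "
    "RHEVM::RHEVMBackendException:Cannot run VM. There are no available "
    "running Hosts with sufficient memory in VM's Cluster .",
]

# Hypothetical patterns, written only against the two line shapes shown above.
REQUEST_RE = re.compile(
    r'"POST /api/instances/(?P<id>[0-9a-f-]+)/start HTTP/1\.[01]" (?P<status>\d{3}) '
)
EXCEPTION_RE = re.compile(r'\]: (?P<cls>[A-Za-z_:]+Exception):(?P<msg>.*)$')


def failed_starts(lines):
    """Pair each failed (5xx) start request with the next exception line."""
    failures = []
    pending_id = None  # instance id of the most recent failed start request
    for line in lines:
        request = REQUEST_RE.search(line)
        if request:
            failed = request.group("status").startswith("5")
            pending_id = request.group("id") if failed else None
            continue
        exception = EXCEPTION_RE.search(line)
        if exception and pending_id is not None:
            failures.append({
                "instance_id": pending_id,
                "error_class": exception.group("cls"),
                "message": exception.group("msg").strip(),
            })
            pending_id = None
    return failures


for failure in failed_starts(LOG_LINES):
    print(failure["instance_id"], failure["error_class"], failure["message"])
```

Scraping the log is only a way to show that the instance id and the backend message are both available. A real fix would more likely carry the error from the failed start call through to the instance record.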


Version-Release number of selected component (if applicable):
aeolus-conductor-daemons-0.8.0-24.el6.noarch
aeolus-conductor-0.8.0-24.el6.noarch

Steps to Reproduce:
1. Create a RHEV environment that has limited resources available.
2. Launch a deployment with enough instances to exhaust the resources in RHEV.
  
Actual results:
Some instances are stuck in the STOPPED state.


Expected results:
An error should be shown indicating the nature of the problem.
It would be even better if Conductor could determine how many more instances a given provider can support and prevent users from creating more than that, but this would be more complicated.
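
As a rough illustration of the capacity check suggested above, the following Python sketch estimates how many instances could still be started from per-host free memory. All names and numbers are hypothetical, and it assumes the free memory per host is somehow known, which this report does not establish for RHEV or Deltacloud. Memory is the only resource modeled because that is what the logged error mentions.

```python
def startable_instances(free_memory_mb_per_host, instance_memory_mb):
    """Count how many instances of one size fit into the hosts' free memory.

    An instance must fit on a single host, so free memory is not pooled
    across hosts.
    """
    if instance_memory_mb <= 0:
        raise ValueError("instance_memory_mb must be positive")
    return sum(
        free // instance_memory_mb
        for free in free_memory_mb_per_host
        if free > 0
    )


def check_deployment(requested, free_memory_mb_per_host, instance_memory_mb):
    """Return (ok, message); message explains why a deployment is refused."""
    capacity = startable_instances(free_memory_mb_per_host, instance_memory_mb)
    if requested > capacity:
        return False, (
            f"Requested {requested} instances, but only {capacity} "
            f"can be started with the memory currently free."
        )
    return True, ""


# Hypothetical cluster: three hosts with 2048, 1536 and 512 MB free.
ok, message = check_deployment(5, [2048, 1536, 512], 1024)
print(ok, message)
```

A check like this can only be advisory, since free memory may change between the check and the actual start, so the error from a failed start would still need to be shown.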

Comment 1 Dave Johnson 2012-02-09 14:33:18 UTC

*** This bug has been marked as a duplicate of bug 786535 ***