Description of problem:
============================================
Through conductor I deployed 10 images but only 9 vms actually started. Looking through the logs I found this backend exception in deltacloud-core. This type of error needs to be communicated to the user.

Version-Release number of selected component (if applicable):
=================================================================
deltacloud-core-0.5.0-4.rc1.el6.noarch
deltacloud-core-ec2-0.5.0-4.rc1.el6.noarch
deltacloud-core-rhevm-0.5.0-4.rc1.el6.noarch
deltacloud-core-vsphere-0.5.0-4.rc1.el6.noarch
rubygem-deltacloud-client-0.4.0-3.el6.noarch

Additional info:
==================================
thin server (localhost:3002) [deltacloud-mock][18292]: RHEVM::RHEVMBackendException:Cannot run VM. There are no available running Hosts with sufficient memory in VM's Cluster .
/usr/share/deltacloud-core/bin/../lib/deltacloud/drivers/rhevm/rhevm_client.rb:89:in `vm_action'
/usr/share/deltacloud-core/bin/../lib/deltacloud/drivers/rhevm/rhevm_driver.rb:153:in `start_instance'
/usr/share/deltacloud-core/bin/../lib/deltacloud/base_driver/exceptions.rb:151:in `call'
/usr/share/deltacloud-core/bin/../lib/deltacloud/base_driver/exceptions.rb:151:in `safely'
/usr/share/deltacloud-core/bin/../lib/deltacloud/drivers/rhevm/rhevm_driver.rb:152:in `start_instance'
/usr/share/deltacloud-core/bin/../lib/deltacloud/helpers/application_helper.rb:128:in `send'
/usr/share/deltacloud-core/bin/../lib/deltacloud/helpers/application_helper.rb:128:in `instance_action'
/usr/share/deltacloud-core/bin/../lib/deltacloud/server.rb:503
/usr/share/deltacloud-core/bin/../lib/sinatra/rabbit.rb:125:in `instance_eval'
/usr/share/deltacloud-core/bin/../lib/sinatra/rabbit.rb:125:in `POST /api/instances/:id/start'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `call'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:1151:in `compile!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:724:in `route_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:708:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:758:in `process_route'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:755:in `process_route'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:707:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `each'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:706:in `route!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:843:in `dispatch!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `instance_eval'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `catch'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:808:in `invoke'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:644:in `call!'
/usr/lib/ruby/gems/1.8/gems/sinatra-1.2.6/lib/sinatra/base.rb:629:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_syslog.rb:48:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_date.rb:31:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_accept.rb:149:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-1.3.0/lib/rack/head.rb:9:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_driver_select.rb:45:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_matrix_params.rb:106:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_runtime.rb:36:in `call'
/usr/share/deltacloud-core/bin/../lib/sinatra/rack_etag.rb:41:in `call'
/usr/lib/ruby/gems/1.8/gems/rack-accept-0.4.3/lib/rack/accept/context.rb:22:in `call'
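For readers unfamiliar with the trace: the driver call runs inside a `safely` wrapper, and the backend exception surfaces from there. The sketch below only illustrates that general pattern (wrap the driver call, keep the backend message, attach a status code). It is not the deltacloud-core source; `BackendError`, `safely_sketch`, `start_instance_response` and the status values are all hypothetical names chosen for illustration.

```ruby
# Hypothetical names throughout; this is NOT the deltacloud-core implementation.
class BackendError < StandardError
  attr_reader :code

  def initialize(code, message)
    super(message)
    @code = code
  end
end

# Run a driver call and translate any low-level failure into a BackendError
# that carries an HTTP-style status code plus the original backend message.
def safely_sketch
  yield
rescue BackendError
  raise # already translated, pass it through unchanged
rescue StandardError => e
  raise BackendError.new(500, e.message)
end

# Turn the outcome of a start request into a [status, body] pair.
# The 200 used for success is an assumption made for this sketch.
def start_instance_response
  safely_sketch { yield }
  [200, 'started']
rescue BackendError => e
  [e.code, e.message]
end

status, body = start_instance_response { raise 'Cannot run VM.' }
puts "#{status} : #{body}"
```

The point of the pattern is that the backend text is preserved all the way to the response, so whatever sits on top (here, Conductor) has something meaningful to show.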
So maybe this isn't deltacloud's issue, conductor needs to catch this? Please advise...
While cleaning up ON_QA bugs I came across this related issue: bug 744289, which points to https://issues.apache.org/jira/browse/DTACLOUD-88
This is a Conductor problem, not a Deltacloud problem.
https://issues.apache.org/jira/browse/DTACLOUD-88 is still open, mostly because I had not gotten to that record in the JIRA cleanup process (reverifying and closing old reports). According to BZ-744289, DTACLOUD-88 should actually be resolved. This report shows that Deltacloud is actually returning the backend error message as requested in DTACLOUD-88: "Cannot run VM. There are no available running Hosts with sufficient memory in VM's Cluster", not just the generic "Operation start failed." Since the problem reported is that "this type of error needs to be communicated to the user" and the user was interfacing with Conductor, I'm reassigning this BZ to that component. (Also see development's comments in Comment 3 above.)
Dave, is this error properly reported through the API? I mean, do you get a '50x' error? If so, then this is not a bug. DC just properly reports what it got from the RHEV-M server.
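To make the '50x' question concrete, here is a minimal sketch of how a caller could tell a backend failure apart from a successful start and pick the text to show the user. The helper names are hypothetical and this is not Conductor or deltacloud-client code; the only strings taken from this report are the backend message and the generic "Operation start failed." fallback.

```ruby
# Hypothetical helpers, not part of Conductor or deltacloud-client.

# Classify the HTTP status returned for a start request.
def classify_start_response(status)
  case status
  when 200..299 then :started
  when 400..499 then :client_error
  when 500..599 then :backend_error
  else :unexpected
  end
end

# Return the text to surface to the user, or nil when the start succeeded.
# Falls back to a generic message when the response body is empty.
def user_visible_error(status, body)
  return nil if classify_start_response(status) == :started

  text = body.to_s.strip
  text.empty? ? 'Operation start failed.' : text
end

puts classify_start_response(500)
puts user_visible_error(500, " Cannot run VM. \n")
```

If the status is in the 5xx range and the body carries the RHEV-M text, the API side is behaving as intended and the remaining work is in whatever consumes the response.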
Imre, This looks like an instance where we're not reacting correctly to an error. Can you please investigate? Angus
*** Bug 788819 has been marked as a duplicate of this bug. ***
Patch has been posted: https://fedorahosted.org/pipermail/aeolus-devel/2012-February/008839.html
Based on the review, pushing back to ON_DEV.
Patch resent: https://fedorahosted.org/pipermail/aeolus-devel/2012-March/009517.html
I tried to reproduce it several more times. Issues:
- wrong flash message (an exception from DC appeared in the UI when the user stopped an instance)
- while starting, the instance state flipped back and forth between pending and create_failed
- a failed instance (rhevm out of memory) flipped back and forth between create_failed and stopped
Created attachment 570415 [details] flash msg only after reload
As I wrote on the list, could you provide the steps necessary to reproduce these issues because I wasn't able to do that?
After more reproduction attempts I could not hit the state issues again. I think they appear only in special cases. Previously I had 4 of 5 instances running in the deployment; in today's attempts only 2-3 of 5 reached running. It was also strange to me that the number of running instances varied. In the previous reproduction I observed a case where, after I stopped the running instances, the create_failed instance started; that did not happen today. I'm not sure whether this was caused by rhevm or dbomatic. I still think you should fix the flash message so that, when Conductor gets the memory error, it is shown immediately rather than only after a page refresh.
Patch fixed: https://fedorahosted.org/pipermail/aeolus-devel/2012-April/009987.html
Rebased revision sent: https://fedorahosted.org/pipermail/aeolus-devel/2012-May/010523.html
Pushed to master:

commit f9d0e42701c5bf22e06363cfa9427fd16b206965
Author: Imre Farkas <ifarkas>
Date: Tue Feb 14 16:51:00 2012 +0100

    BZ786535: display failures for instances (rev. 4)
    https://bugzilla.redhat.com/show_bug.cgi?id=786535
    Rebased and autoupdate moved to mustache
Tested rpms:

>> rpm -qa |grep aeolus
aeolus-configure-2.8.6-1.el6cf.noarch
rubygem-aeolus-image-0.3.0-12.el6.noarch
rubygem-aeolus-cli-0.7.1-1.el6cf.noarch
aeolus-conductor-0.13.8-1.el6cf.noarch
aeolus-conductor-daemons-0.13.8-1.el6cf.noarch
aeolus-conductor-doc-0.13.8-1.el6cf.noarch
aeolus-all-0.13.8-1.el6cf.noarch

I launched a rhevm instance to a realm w/o available hosts. Conductor returned the following alert - no page refresh required:

***********
Alerts 1
ce-gqiig/testRHEVM Instance Failure 500 : Cannot run VM. There are no available running Hosts in the Host Cluster.
***********

See the attached screenshot for full page view. Marking this BZ as 'verified'
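For anyone scripting a check against this alert text, the row can be thought of as instance name, alert kind, status code and backend message. The sketch below rebuilds that text from those four fields. It is only an illustration of the observed format: `Alert` and `render_alerts` are hypothetical names and do not come from the Conductor code base.

```ruby
# Hypothetical model of one alert row; field names are assumptions.
Alert = Struct.new(:instance, :kind, :status, :message) do
  # Mirror the row text observed above: "<instance> <kind> <status> : <message>"
  def to_s
    "#{instance} #{kind} #{status} : #{message}"
  end
end

# Render a header with the alert count followed by one line per alert.
def render_alerts(alerts)
  (["Alerts #{alerts.size}"] + alerts.map(&:to_s)).join("\n")
end

alert = Alert.new(
  'ce-gqiig/testRHEVM',
  'Instance Failure',
  500,
  'Cannot run VM. There are no available running Hosts in the Host Cluster.'
)
puts render_alerts([alert])
```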
Created attachment 615589 [details] Alert shown w/o refresh
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHEA-2012-1516.html