Description of problem:
Say we set a quota of 10 instances (the default after installation) and start 10 instances (all good). One compute node then fails and its instances are evacuated to another hypervisor. The evacuated instances fail to start because the quota is exceeded. The issue appears to be that even when a compute node is marked down in nova, the instances on that node are still counted against the quota, even though they are dead, leaving no room to rebuild them elsewhere.

Version-Release number of selected component (if applicable):
openstack-nova-common-2015.1.0-4.el7ost.noarch
openstack-nova-console-2015.1.0-4.el7ost.noarch
openstack-nova-scheduler-2015.1.0-4.el7ost.noarch
openstack-nova-novncproxy-2015.1.0-4.el7ost.noarch
openstack-nova-conductor-2015.1.0-4.el7ost.noarch
openstack-nova-api-2015.1.0-4.el7ost.noarch
python-nova-2015.1.0-4.el7ost.noarch
python-novaclient-2.23.0-1.el7ost.noarch

How reproducible:
always

Steps to Reproduce:
1. Set the quota to 10 instances.
2. Start 10 instances.
3. Kill one compute node hard (crash the kernel or similar).
4. Wait for nova to recognize that the compute node is down.
5. Start the evacuation.

Actual results:
The evacuated instances fail to start.

Expected results:
The evacuated instances start.

Additional info:
I have collected sosreports from several failures here: http://mrg-01.mpc.lab.eng.bos.redhat.com/sosreports/
I can't pinpoint in the logs the exact time when the failure happened. Also, those sosreports will be wiped fairly soon, so please download them if necessary.
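As a possible stopgap while the evacuation runs, the tenant's instances quota can be raised temporarily and restored afterwards. Below is a minimal sketch using python-novaclient (the 2.23 client listed above); the credentials, auth URL, tenant ID, and the amount to raise the quota by are placeholders, not values taken from this report.

# Workaround sketch (not a fix): temporarily raise the tenant's instances
# quota so the evacuated instances can be rebuilt, then restore the old
# limit afterwards. All credentials/IDs below are placeholders.
from novaclient import client

nova = client.Client('2', 'admin', 'ADMIN_PASSWORD', 'admin',
                     auth_url='http://keystone.example.com:5000/v2.0')

tenant_id = 'TENANT_ID'  # tenant that owns the instances being evacuated

quota = nova.quotas.get(tenant_id)
original_limit = quota.instances
print('current instances quota: %s' % original_limit)

# Raise the limit by the number of instances stranded on the dead node
# (10 here, matching the reproducer above), run the evacuation, then put
# the original limit back once the rebuilds have finished.
nova.quotas.update(tenant_id, instances=original_limit + 10)

This only works around the immediate rebuild failures; the underlying problem of counting the stranded instances twice still needs a proper fix.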
Effectively, the problem appears to be that instances under evacuation are double-counted against the quota while their rebuilds are in flight, when they should not be. I think this can be reproduced and fixed independently of the instance HA setup.
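A quick way to see the double-counting, as a sketch assuming the python-novaclient build listed in the report (credentials and the auth URL are placeholders): list the instances still bound to a downed nova-compute host. Each of them still consumes a slot in its tenant's instances quota until the rebuild completes elsewhere.

# Illustration only: instances whose compute service is down are still
# listed against their original host and still charged to the tenant's
# 'instances' quota. Credentials and endpoint below are placeholders.
from novaclient import client

nova = client.Client('2', 'admin', 'ADMIN_PASSWORD', 'admin',
                     auth_url='http://keystone.example.com:5000/v2.0')

down_hosts = [svc.host for svc in nova.services.list(binary='nova-compute')
              if svc.state == 'down']

for host in down_hosts:
    stranded = nova.servers.list(search_opts={'host': host, 'all_tenants': 1})
    # No hypervisor is running these servers any more, yet each one still
    # occupies a slot in its tenant's instances quota.
    print('%s: %d instances still counted against quota'
          % (host, len(stranded)))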
Created attachment 1037323 [details] sos-report
Created attachment 1037325 [details] sos-report
Created attachment 1037326 [details] sos-report
I have tested this same condition with the .10 build and I was not able to reproduce it. It's probably been fixed in the rebase from .4 to .10, but still worth checking.

Lowering priority/severity
(In reply to Fabio Massimo Di Nitto from comment #9)
> I have tested this same condition with the .10 build and I was not able to
> reproduce it. It's probably been fixed in the rebase from .4 to .10, but
> still worth checking.
>
> Lowering priority/severity

Any update on this, Fabio - have you encountered this again? I am assuming no and moving to 8.0/7.0.z, but please keep me posted.
(In reply to Stephen Gordon from comment #10)
> (In reply to Fabio Massimo Di Nitto from comment #9)
> > I have tested this same condition with the .10 build and I was not able to
> > reproduce it. It's probably been fixed in the rebase from .4 to .10, but
> > still worth checking.
> >
> > Lowering priority/severity
>
> Any update on this, Fabio - have you encountered this again? I am assuming no
> and moving to 8.0/7.0.z, but please keep me posted.

I haven't tested this condition again. It's worth keeping this as a TestOnly bug, in my opinion, since it touches a specific boundary and is easy to test.
> I haven't tested this condition again. It's worth keeping this as a TestOnly
> bug, in my opinion, since it touches a specific boundary and is easy to test.

Since this bug hasn't seen any updates for a while, I'm assuming the issue hasn't been encountered again. I'm going to close this bug for now; if you feel it should remain open, don't hesitate to let us know. Cheers!