Created attachment 764292 [details]
logs

Description of problem:
I installed an AIO node plus one additional nova-compute host. On the host running only nova-compute, I stopped the openstack-nova-compute service and then booted 10 instances. 5 of the 10 instances got stuck in state BUILD with task 'scheduling'; even after I restarted the service, those instances did not start.

Version-Release number of selected component (if applicable):
openstack-nova-api-2013.1.2-2.el6ost.noarch
openstack-nova-scheduler-2013.1.2-2.el6ost.noarch
openstack-nova-compute-2013.1.2-2.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create an AIO node plus one additional host running only nova-compute.
2. Stop the openstack-nova-compute service on the compute-only host.
3. Boot multiple instances.

Actual results:
Some of the instances are stuck in status BUILD with task 'scheduling'. Even after the service is restarted, the instances never finish building.

Expected results:
1. If the instances cannot run, they should not be started at all (i.e. we should detect that the service is down and that instances cannot be run on that host).
2. If for any reason they are started, they should be moved to ERROR once we find that we cannot sustain them.
3. If an instance is stuck in the 'scheduling' task, it should be possible to start it once an additional resource becomes available.
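The steps to reproduce above can be sketched as a shell script. This is only an illustrative sketch: the host name, image, and flavor are placeholders, not taken from the report, and the DRY_RUN guard makes the script merely print the commands so it can be run safely outside a RHOS setup.

```shell
#!/bin/sh
# Reproduction sketch for the BUILD/'scheduling' hang.
# COMPUTE_HOST, the image, and the flavor below are assumptions.
# With DRY_RUN=echo the commands are only printed; clear it
# (DRY_RUN=) to actually run them against a RHOS 3.0 setup.
DRY_RUN=echo
run() { $DRY_RUN "$@"; }

COMPUTE_HOST=nott-vdsa    # the compute-only host (placeholder)

# Step 2: stop nova-compute on the compute-only host
run ssh root@"$COMPUTE_HOST" service openstack-nova-compute stop

# Step 3: boot multiple instances; per the report, some end up
# stuck in BUILD with task 'scheduling'
for i in $(seq 1 10); do
    run nova boot --flavor m1.tiny --image cirros "test-$i"
done

# Observe the stuck instances; restarting the service does not help
run nova list
run ssh root@"$COMPUTE_HOST" service openstack-nova-compute start
```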
Additional info: logs

[root@opens-vdsb tmp(keystone_admin)]# nova list
+--------------------------------------+--------------------------------------------+--------+--------------------------+
| ID                                   | Name                                       | Status | Networks                 |
+--------------------------------------+--------------------------------------------+--------+--------------------------+
| 009efbf6-de2b-451b-870f-fdf1c19414e8 | dafna-009efbf6-de2b-451b-870f-fdf1c19414e8 | BUILD  |                          |
| 116b0b55-a9bc-4dcd-b7d8-abe141510e38 | dafna-116b0b55-a9bc-4dcd-b7d8-abe141510e38 | BUILD  |                          |
| 896e10ef-0906-46ad-8001-99a35062a381 | dafna-896e10ef-0906-46ad-8001-99a35062a381 | BUILD  |                          |
| 9b2d2161-2ee0-4c66-99bb-73ce26759cc3 | dafna-9b2d2161-2ee0-4c66-99bb-73ce26759cc3 | ACTIVE | novanetwork=192.168.32.2 |
| a316b3a5-5b46-4cb4-aa24-9c8d328a0d67 | dafna-a316b3a5-5b46-4cb4-aa24-9c8d328a0d67 | ACTIVE | novanetwork=192.168.32.6 |
| ae19066f-70d3-4f40-a402-0dad2d2cabb4 | dafna-ae19066f-70d3-4f40-a402-0dad2d2cabb4 | ACTIVE | novanetwork=192.168.32.4 |
| bff72b2f-f7f1-48e7-9b29-2dc34499d318 | dafna-bff72b2f-f7f1-48e7-9b29-2dc34499d318 | BUILD  |                          |
| c09f7a34-5d79-4d7e-96f4-ae2ac29d270e | dafna-c09f7a34-5d79-4d7e-96f4-ae2ac29d270e | ACTIVE | novanetwork=192.168.32.3 |
| dc6c71b1-a630-46f8-b5c3-51cd215112f9 | dafna-dc6c71b1-a630-46f8-b5c3-51cd215112f9 | ACTIVE | novanetwork=192.168.32.5 |
| f8eed9d0-6c2e-4129-a073-dedfb8c5e0a6 | dafna-f8eed9d0-6c2e-4129-a073-dedfb8c5e0a6 | BUILD  |                          |
+--------------------------------------+--------------------------------------------+--------+--------------------------+

[root@opens-vdsb tmp(keystone_admin)]# virsh -r list
 Id    Name                 State
----------------------------------------------------
 3     instance-00000016    running
 4     instance-0000001a    running
 5     instance-00000014    running
 6     instance-00000012    running
 7     instance-00000018    running

[root@nott-vdsa ~(keystone_admin)]# virsh -r list
 Id    Name                 State
----------------------------------------------------
This bug seems to have been caught with RHOS 3.0. Now that we have 4.0 builds, it would be good to confirm whether this is still an issue.
It seems that in Havana, instances are moved to the ERROR state if they cannot run.
After a brief chat with Dafna, she believes the issue has been resolved, likely in the Havana release. Given the nature of the bug, she was keen to run a few more tests, so I am leaving the needinfo flag set until we can confirm that it is indeed fixed.
Closing as we are unable to reproduce it with 4.0; please reopen if it reappears.