Bug 977082 - nova [Negative]: instances are stuck on task 'scheduling' when running multiple instances and compute service is down on one of the hosts
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
x86_64 Linux
Priority: unspecified, Severity: high
: ---
: 4.0
Assigned To: Nikola Dipanov
Ami Jeain
Depends On:
Reported: 2013-06-23 07:53 EDT by Dafna Ron
Modified: 2016-04-22 01:02 EDT (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2014-01-02 11:20:04 EST
Type: Bug

Attachments
logs (1.78 MB, application/x-gzip)
2013-06-23 07:53 EDT, Dafna Ron

Description Dafna Ron 2013-06-23 07:53:27 EDT
Created attachment 764292

Description of problem:

I installed an AIO (all-in-one) setup plus one additional nova-compute host.

On the host that runs only nova-compute, I stopped the openstack-nova-compute service and then launched 10 instances.

5 of the 10 instances got stuck in state BUILD with task 'scheduling'.
Even after I started the service again, the instances did not start.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Create an AIO setup plus one more host with nova-compute on it
2. Stop the openstack-nova-compute service on the host that runs only nova-compute
3. Launch multiple instances

Actual results:

Some of the instances are stuck in status BUILD with task 'scheduling'.
Even after the service is started again, the instances do not finish building.

Expected results:

1. If we cannot run the instances, we should not start them at all (i.e., we should detect that the service is down and that we cannot run the instances on that host).
2. If for any reason we do start them, we should move them to ERROR once we find that we cannot sustain them.
3. If an instance is stuck in the 'scheduling' task, we should be able to start it once an additional resource becomes available.
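
To illustrate expected result 2, here is a minimal sketch of a timeout check that moves instances from BUILD/'scheduling' to ERROR. This is not nova's code: the `Instance` record, the `find_stuck` and `mark_error` helpers, and the 10-minute timeout are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Instance:
    # Hypothetical record for illustration; not nova's actual data model.
    uuid: str
    status: str        # e.g. "BUILD", "ACTIVE", "ERROR"
    task_state: str    # e.g. "scheduling", or "" when no task is running
    updated_at: datetime


def find_stuck(instances: List[Instance], now: datetime,
               timeout: timedelta = timedelta(minutes=10)) -> List[Instance]:
    """Return instances that have sat in BUILD/'scheduling' longer than timeout."""
    return [
        inst for inst in instances
        if inst.status == "BUILD"
        and inst.task_state == "scheduling"
        and now - inst.updated_at > timeout
    ]


def mark_error(stuck: List[Instance]) -> None:
    """Move stuck instances to ERROR and clear their task state."""
    for inst in stuck:
        inst.status = "ERROR"
        inst.task_state = ""


# Example: one instance stuck for 30 minutes, one that was just submitted.
now = datetime(2013, 6, 23, 12, 0, 0)
demo = [
    Instance("stuck-one", "BUILD", "scheduling", now - timedelta(minutes=30)),
    Instance("fresh-one", "BUILD", "scheduling", now - timedelta(minutes=1)),
]
mark_error(find_stuck(demo, now))
```

The timeout is the only signal used here; a real check would presumably also look at whether the target compute service is reporting as up.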

Additional info: logs

[root@opens-vdsb tmp(keystone_admin)]# nova list
| ID                                   | Name                                       | Status | Networks                 |
| 009efbf6-de2b-451b-870f-fdf1c19414e8 | dafna-009efbf6-de2b-451b-870f-fdf1c19414e8 | BUILD  |                          |
| 116b0b55-a9bc-4dcd-b7d8-abe141510e38 | dafna-116b0b55-a9bc-4dcd-b7d8-abe141510e38 | BUILD  |                          |
| 896e10ef-0906-46ad-8001-99a35062a381 | dafna-896e10ef-0906-46ad-8001-99a35062a381 | BUILD  |                          |
| 9b2d2161-2ee0-4c66-99bb-73ce26759cc3 | dafna-9b2d2161-2ee0-4c66-99bb-73ce26759cc3 | ACTIVE | novanetwork= |
| a316b3a5-5b46-4cb4-aa24-9c8d328a0d67 | dafna-a316b3a5-5b46-4cb4-aa24-9c8d328a0d67 | ACTIVE | novanetwork= |
| ae19066f-70d3-4f40-a402-0dad2d2cabb4 | dafna-ae19066f-70d3-4f40-a402-0dad2d2cabb4 | ACTIVE | novanetwork= |
| bff72b2f-f7f1-48e7-9b29-2dc34499d318 | dafna-bff72b2f-f7f1-48e7-9b29-2dc34499d318 | BUILD  |                          |
| c09f7a34-5d79-4d7e-96f4-ae2ac29d270e | dafna-c09f7a34-5d79-4d7e-96f4-ae2ac29d270e | ACTIVE | novanetwork= |
| dc6c71b1-a630-46f8-b5c3-51cd215112f9 | dafna-dc6c71b1-a630-46f8-b5c3-51cd215112f9 | ACTIVE | novanetwork= |
| f8eed9d0-6c2e-4129-a073-dedfb8c5e0a6 | dafna-f8eed9d0-6c2e-4129-a073-dedfb8c5e0a6 | BUILD  |                          |
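
For triage, output like the table above can be tallied by status with a short script. The following is a rough sketch that assumes the four-column pipe-separated layout shown here (ID, Name, Status, Networks); `parse_nova_list` is a hypothetical helper, not part of the nova client, and the sample rows are copied from this report.

```python
from collections import Counter


def parse_nova_list(text):
    """Parse pipe-separated `nova list` rows into dicts, skipping header and borders."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Data rows start and end with a pipe; border lines ("+---+") do not.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Assumes exactly four columns; the header row is skipped.
        if len(cells) != 4 or cells[0] == "ID":
            continue
        rows.append({"id": cells[0], "name": cells[1],
                     "status": cells[2], "networks": cells[3]})
    return rows


sample = """
| ID                                   | Name                                       | Status | Networks                 |
| 009efbf6-de2b-451b-870f-fdf1c19414e8 | dafna-009efbf6-de2b-451b-870f-fdf1c19414e8 | BUILD  |                          |
| 9b2d2161-2ee0-4c66-99bb-73ce26759cc3 | dafna-9b2d2161-2ee0-4c66-99bb-73ce26759cc3 | ACTIVE | novanetwork= |
| f8eed9d0-6c2e-4129-a073-dedfb8c5e0a6 | dafna-f8eed9d0-6c2e-4129-a073-dedfb8c5e0a6 | BUILD  |                          |
"""

rows = parse_nova_list(sample)
status_counts = Counter(row["status"] for row in rows)
stuck_ids = [row["id"] for row in rows if row["status"] == "BUILD"]
print(status_counts)
print(stuck_ids)
```

Applied to the full listing in this report, the same tally would give 5 instances in BUILD and 5 in ACTIVE, which matches the five running domains in the `virsh -r list` output from opens-vdsb.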

[root@opens-vdsb tmp(keystone_admin)]# virsh -r list
 Id    Name                           State
 3     instance-00000016              running
 4     instance-0000001a              running
 5     instance-00000014              running
 6     instance-00000012              running
 7     instance-00000018              running

[root@nott-vdsa ~(keystone_admin)]# virsh -r list
 Id    Name                           State
Comment 1 Nikola Dipanov 2013-11-01 09:06:31 EDT
This bug seems to have been caught with RHOS 3.0. Now that we have 4.0 builds, it would be good to confirm whether this is still an issue.
Comment 2 Dafna Ron 2013-11-01 10:15:37 EDT
It seems that in Havana, instances move to ERROR state if they cannot run.
Comment 3 Nikola Dipanov 2013-11-13 11:32:14 EST
After a brief chat with Dafna, she seems to think the issue is fixed, so the bug has likely been fixed in the Havana release. Given the nature of the bug, she was keen to run a few more tests, so I am leaving a needinfo on until we can confirm that it is indeed fixed.
Comment 5 Dave Allan 2014-01-02 11:20:04 EST
Closing as we are unable to reproduce it with 4.0; please reopen if it reappears.
