Bug 1309012 - Once in a while, scheduling will fail to select a node when deploying overcloud
Summary: Once in a while, scheduling will fail to select a node when deploying overcloud
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 10.0 (Newton)
Assignee: Angus Thomas
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-02-16 17:22 UTC by David Hill
Modified: 2016-11-14 15:36 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-10-14 19:00:26 UTC
Target Upstream Version:
Embargoed:



Description David Hill 2016-02-16 17:22:06 UTC
Description of problem:
Once in a while, scheduling fails to select a node when deploying the overcloud. The deployment then retries scheduling for the same node, and the retry succeeds.

| fault                                | {"message": "No valid host was found. There are not enough hosts available.", "code": 500, "details": "  File \"/usr/lib/python2.7/site-packages/nova/conductor/manager.py\", line 739, in build_instances |
|                                      |     request_spec, filter_properties)                                                                                                                                                                       |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/scheduler/utils.py\", line 343, in wrapped                                                                                                                  |
|                                      |     return func(*args, **kwargs)                                                                                                                                                                           |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py\", line 52, in select_destinations                                                                                             |
|                                      |     context, request_spec, filter_properties)                                                                                                                                                              |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/scheduler/client/__init__.py\", line 37, in __run_method                                                                                                    |
|                                      |     return getattr(self.instance, __name)(*args, **kwargs)                                                                                                                                                 |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/scheduler/client/query.py\", line 34, in select_destinations                                                                                                |
|                                      |     context, request_spec, filter_properties)                                                                                                                                                              |
|                                      |   File \"/usr/lib/python2.7/site-packages/nova/scheduler/rpcapi.py\", line 120, in select_destinations                                                                                                     |
|                                      |     request_spec=request_spec, filter_properties=filter_properties)                                                                                                                                        |
|                                      |   File \"/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py\", line 158, in call                                                                                                                |
|                                      |     retry=self.retry)                                                                                                                                                                                      |
|                                      |   File \"/usr/lib/python2.7/site-packages/oslo_messaging/transport.py\", line 90, in _send                                                                                                                 |
|                                      |     timeout=timeout, retry=retry)                                                                                                                                                                          |
|                                      |   File \"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py\", line 431, in send                                                                                                       |
|                                      |     retry=retry)                                                                                                                                                                                           |
|                                      |   File \"/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py\", line 422, in _send                                                                                                      |
|                                      |     raise result                                                                                                                                                                                           |
|                                      | ", "created": "2016-02-16T16:14:52Z"}                                                                                                                                                                      |
| flavor                               | ceph-storage (63e4d9db-7e98-4191-830a-f203d0f0685f)                                                          
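
For triage, the fault field above can be reduced to its message, code, and innermost traceback frame. The following is only a minimal sketch: it assumes the fault value has already been copied out of the table as a single JSON string, and the helper name summarize_fault is hypothetical, not part of any nova tooling. The sample keeps only the first and last frames of the traceback shown above.

```python
import json
import re


def summarize_fault(fault_json):
    """Return the message, code, and innermost frame of a nova fault JSON string."""
    fault = json.loads(fault_json)
    # Each traceback frame in "details" looks like: File "<path>", line <n>, in <func>
    frames = re.findall(r'File "([^"]+)", line (\d+), in (\w+)',
                        fault.get("details", ""))
    return {
        "message": fault["message"],
        "code": fault["code"],
        # The last frame listed is where the exception was finally raised.
        "innermost_frame": frames[-1] if frames else None,
    }


# Shortened sample built from the fault above (first and last frames only).
sample = json.dumps({
    "message": "No valid host was found. There are not enough hosts available.",
    "code": 500,
    "details": (
        '  File "/usr/lib/python2.7/site-packages/nova/conductor/manager.py", '
        'line 739, in build_instances\n'
        '    request_spec, filter_properties)\n'
        '  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", '
        'line 422, in _send\n'
        '    raise result\n'
    ),
    "created": "2016-02-16T16:14:52Z",
})

summary = summarize_fault(sample)
print(summary)
```

Here the innermost frame is the `raise result` line in amqpdriver.py, that is, the point where the conductor's RPC call to the scheduler raised the error.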



[stack@undercloud-rhosp8 cloud]$ nova list
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks            |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| 5ff3bffc-6e78-4109-a361-f388094dfbef | overcloud-cephstorage-0 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.20 |
| a920f140-49dd-4818-9f41-813444b73af2 | overcloud-cephstorage-1 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.18 |
| 5b7246a4-1ffe-4627-acea-f5f93dbcdfaf | overcloud-cephstorage-2 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.19 |
| 78f1047c-5ed4-46a6-81dc-556df5275a99 | overcloud-cephstorage-2 | ERROR  | -          | NOSTATE     |                     |
| 6087c1dd-b465-4029-898a-64c4ef68280a | overcloud-controller-0  | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.23 |
| 3140957b-7cf7-44e3-8c6d-6f739ba56af5 | overcloud-controller-1  | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.22 |
| 32441962-9ab4-4fd8-b45f-45b2a053fc92 | overcloud-controller-2  | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.24 |
| 73d2cf8d-2570-4a55-a075-af4cf433819c | overcloud-novacompute-0 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.21 |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
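
In the listing above, overcloud-cephstorage-2 appears twice: once in ERROR and once in BUILD. To spot this pattern quickly across deployments, the table can be parsed and grouped by name. This is a rough sketch that assumes the `nova list` output has been saved as text; the function names parse_nova_list and find_replaced_instances are hypothetical. It only reports names that have an ERROR instance alongside another instance with the same name, which is what I would expect to see when a failed server is created again.

```python
from collections import defaultdict


def parse_nova_list(text):
    """Parse the ASCII table printed by `nova list` into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ borders, prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited row is the column header
            continue
        rows.append(dict(zip(header, cells)))
    return rows


def find_replaced_instances(rows):
    """Map each duplicated name to the IDs of its ERROR instances."""
    by_name = defaultdict(list)
    for row in rows:
        by_name[row["Name"]].append(row)
    result = {}
    for name, group in by_name.items():
        failed = [r["ID"] for r in group if r["Status"] == "ERROR"]
        if failed and len(group) > 1:
            result[name] = failed
    return result


# A few rows copied from the listing above.
sample_output = """
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks            |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| a920f140-49dd-4818-9f41-813444b73af2 | overcloud-cephstorage-1 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.18 |
| 5b7246a4-1ffe-4627-acea-f5f93dbcdfaf | overcloud-cephstorage-2 | BUILD  | spawning   | NOSTATE     | ctlplane=192.0.2.19 |
| 78f1047c-5ed4-46a6-81dc-556df5275a99 | overcloud-cephstorage-2 | ERROR  | -          | NOSTATE     |                     |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
"""

replaced = find_replaced_instances(parse_nova_list(sample_output))
print(replaced)
```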


Version-Release number of selected component (if applicable):


How reproducible:
This failure occurred in almost every deployment I ran.

Steps to Reproduce:
1. Deploy an overcloud
2. Wait
3. If it doesn't happen, go back to step 1.

Actual results:
Node failed at scheduling

Expected results:
Always succeed

Additional info:

Comment 2 Mike Burns 2016-04-07 21:11:06 UTC
This bug did not make the OSP 8.0 release.  It is being deferred to OSP 10.

Comment 5 David Hill 2016-11-14 15:36:18 UTC
Will reopen if this happens again.  Haven't seen it in a while.

