Bug 1486605 - OSP11 -> OSP12 upgrade: major-upgrade-composable-steps-docker fails on composable roles deployment with: ERROR: Property error: : resources.Compute<nested_stack>.resources[0].properties: : Unknown Property NovaComputeSchedulerHints
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 12.0 (Pike)
Assignee: mathieu bultel
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-08-30 09:03 UTC by Marius Cornea
Modified: 2023-02-22 23:02 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-11 16:26:38 UTC
Target Upstream Version:
Embargoed:


Attachments
roles_data (8.10 KB, text/plain)
2017-08-30 09:03 UTC, Marius Cornea


Links
System ID: OpenStack gerrit 501600
Private: 0
Priority: None
Status: ABANDONED
Summary: Check the role data flags and failed is missing
Last Updated: 2021-02-12 05:14:05 UTC

Description Marius Cornea 2017-08-30 09:03:56 UTC
Created attachment 1319943
roles_data

Description of problem:
OSP11 -> OSP12 upgrade: major-upgrade-composable-steps-docker fails on composable roles deployment with: ERROR: Property error: : resources.Compute<nested_stack>.resources[0].properties: : Unknown Property NovaComputeSchedulerHints

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.0-0.20170821194253.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP11 with composable roles
2. Adjust roles_data file for OSP12
3. Upgrade to OSP12

Actual results:
major-upgrade-composable-steps-docker fails with:

 u'message': u"Failed to run action [action_ex_id=0022936d-54f4-4d5b-a7bf-f6a9b231f8dc, action_cls='<class 'mistral.actions.action_factory.DeployStackAction'>', attributes='{}', params='{u'skip_deploy_identifier': False, u'container': u'overcloud', u'timeout': 240}']\n ERROR: Property error: : resources.Compute<nested_stack>.resources[0].properties: : Unknown Property NovaComputeSchedulerHints",
 u'status': u'FAILED'}


Expected results:
major-upgrade-composable-steps-docker doesn't fail.

Additional info:
Attaching the roles_data.yaml

Comment 1 Marius Cornea 2017-08-30 09:32:29 UTC
I was able to get past this error by adding the following parameter to the Compute role in the custom roles data file:

deprecated_param_scheduler_hints: 'NovaComputeSchedulerHints'

but then it failed with a new error:


2017-08-30 09:27:44Z [overcloud]: UPDATE_FAILED  resources.Controller: resources[2]: BadRequest: resources.Controller: No valid host was found. No valid host found for resize (HTTP 400) (Request-ID: req-1e3a1b62-af8f-4efb-8e33-a136d536535e)

 Stack overcloud UPDATE_FAILED 

overcloud.Compute.1.Compute:
  resource_type: OS::TripleO::ComputeServer
  physical_resource_id: 862aab51-d3ab-46a1-a321-b2e6c11e5053
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Compute: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
overcloud.Compute.0.Compute:
  resource_type: OS::TripleO::ComputeServer
  physical_resource_id: 4b319bd6-701a-4269-a833-8834e018daa0
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Compute: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
overcloud.Controller.1.Controller:
  resource_type: OS::TripleO::Server
  physical_resource_id: 9953be5a-64a4-4b9d-b690-48857a6d628a
  status: UPDATE_FAILED
  status_reason: |
    BadRequest: resources.Controller: No valid host was found. No valid host found for resize (HTTP 400) (Request-ID: req-c8d74b27-bda8-46a8-9008-4af0545ecb39)
overcloud.Controller.0.Controller:
  resource_type: OS::TripleO::Server
  physical_resource_id: 4980e829-bdec-44f5-99c2-39962650bd14
  status: UPDATE_FAILED
  status_reason: |
    BadRequest: resources.Controller: No valid host was found. No valid host found for resize (HTTP 400) (Request-ID: req-6cdb6bf3-61e0-4a08-af8c-b1138a9c8c5f)
overcloud.Controller.2.Controller:
  resource_type: OS::TripleO::Server
  physical_resource_id: c4fc238f-40bd-431a-9a36-9f1e675b6254
  status: UPDATE_FAILED
  status_reason: |
    BadRequest: resources.Controller: No valid host was found. No valid host found for resize (HTTP 400) (Request-ID: req-1e3a1b62-af8f-4efb-8e33-a136d536535e)
Heat Stack update failed.
Heat Stack update failed.
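
For readers triaging a listing like the one above, here is a minimal sketch that pulls the resource path and status out of each entry. It assumes the listing keeps the layout shown in this comment (an unindented header line ending in a colon, followed by an indented `status:` line); the function name `summarize_failures` and the shortened sample are illustrative only and are not part of any TripleO tool.

```python
import re
from collections import Counter

# Shortened sample copied from the failure listing in this comment.
SAMPLE = """\
overcloud.Compute.1.Compute:
  resource_type: OS::TripleO::ComputeServer
  status: CREATE_FAILED
  status_reason: |
    ResourceInError: resources.Compute: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
overcloud.Controller.1.Controller:
  resource_type: OS::TripleO::Server
  status: UPDATE_FAILED
  status_reason: |
    BadRequest: resources.Controller: No valid host was found. No valid host found for resize (HTTP 400)
Heat Stack update failed.
"""


def summarize_failures(text):
    """Return a list of (resource_path, status) pairs (hypothetical helper)."""
    results = []
    current = None
    for line in text.splitlines():
        # A resource header is an unindented line that ends with a colon.
        header = re.match(r"^(\S+):\s*$", line)
        if header:
            current = header.group(1)
            continue
        # Match "status:" only; "status_reason:" does not match this pattern.
        status = re.match(r"^\s+status:\s*(\S+)\s*$", line)
        if status and current:
            results.append((current, status.group(1)))
    return results


failures = summarize_failures(SAMPLE)
by_status = Counter(status for _, status in failures)
for path, status in failures:
    print(f"{status:15} {path}")
```

Grouping by status makes it easier to see that the Compute resources failed on create while the Controller resources failed on update.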

Comment 2 Marius Cornea 2017-08-30 10:05:07 UTC
I tried assigning the deprecated_param_flavor option to the Controller and Compute roles, respectively, in the custom roles data file:

  deprecated_param_flavor: 'OvercloudControlFlavor'
  deprecated_param_flavor: 'OvercloudComputeFlavor'

but I ended up with duplicate compute instances in ERROR state:

(undercloud) [stack@undercloud-0 ~]$ nova list
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 20bdaf46-3960-45b9-9781-40f54777ce8f | compute-0    | ERROR  | -          | NOSTATE     |                        |
| 64b7ccc9-2090-4a0f-a647-7c783b68ee0f | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.18 |
| 9ed4466e-bf6c-4e24-87d8-9f29860c86f1 | compute-0    | ERROR  | -          | NOSTATE     |                        |
| 23ac5849-bd8c-4af2-9db3-145bc6e27f23 | compute-1    | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |
| e269062a-7239-4367-9d56-532849f6e18d | compute-1    | ERROR  | -          | NOSTATE     |                        |
| f8960e82-fd62-460a-88af-42193d7e7fe1 | compute-1    | ERROR  | -          | NOSTATE     |                        |
| 4980e829-bdec-44f5-99c2-39962650bd14 | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.23 |
| 9953be5a-64a4-4b9d-b690-48857a6d628a | controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.22 |
| c4fc238f-40bd-431a-9a36-9f1e675b6254 | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.24 |
| 0a368bdf-479c-4385-b825-460aa0e943a9 | database-0   | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
| f36656e8-b60b-4a34-b8c8-24f091fe3cc2 | database-1   | ACTIVE | -          | Running     | ctlplane=192.168.24.20 |
| a00f242e-7fc7-42de-ab3b-80a43418a3d7 | database-2   | ACTIVE | -          | Running     | ctlplane=192.168.24.8  |
| e8a234ad-15c1-48ee-8eb6-61fc97aa4fc4 | messaging-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.14 |
| 22cad3e7-d066-4336-9693-f83dbb0c091e | messaging-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| 384061cf-a195-452b-859a-3276651de8a6 | messaging-2  | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
| 594d2ea4-5e26-4d6e-9952-35f375e70f87 | networker-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.16 |
| 97204f57-396d-4388-8a2c-a2d116aaef0f | networker-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.13 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
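
A listing like this can be checked for duplicate server names with a short script. The following is only a sketch: it parses the ASCII table as printed by `nova list` in this comment (a few rows are copied below), and the helper names `parse_table` and `duplicate_names` are made up for illustration.

```python
from collections import defaultdict

# A few rows copied from the `nova list` output in this comment.
SAMPLE = """\
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 20bdaf46-3960-45b9-9781-40f54777ce8f | compute-0    | ERROR  | -          | NOSTATE     |                        |
| 64b7ccc9-2090-4a0f-a647-7c783b68ee0f | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.18 |
| 9ed4466e-bf6c-4e24-87d8-9f29860c86f1 | compute-0    | ERROR  | -          | NOSTATE     |                        |
| 23ac5849-bd8c-4af2-9db3-145bc6e27f23 | compute-1    | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |
| e269062a-7239-4367-9d56-532849f6e18d | compute-1    | ERROR  | -          | NOSTATE     |                        |
| 4980e829-bdec-44f5-99c2-39962650bd14 | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.23 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
"""


def parse_table(text):
    """Turn the ASCII table into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if not line.startswith("|"):
            continue  # skip separator lines and warnings
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if cells[0] == "ID":
            continue  # skip the header row
        rows.append({"id": cells[0], "name": cells[1], "status": cells[2]})
    return rows


def duplicate_names(rows):
    """Return {name: [rows]} for every name used by more than one server."""
    by_name = defaultdict(list)
    for row in rows:
        by_name[row["name"]].append(row)
    return {name: group for name, group in by_name.items() if len(group) > 1}


dups = duplicate_names(parse_table(SAMPLE))
for name, group in sorted(dups.items()):
    print(name, [(r["id"][:8], r["status"]) for r in group])
```

Each duplicated name here pairs one ACTIVE instance with one or more ERROR instances, which is the situation described in this comment.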

Comment 3 Marius Cornea 2017-08-30 10:15:07 UTC
Trying to delete one of the nova instances in ERROR state by UUID ends up deleting all the instances with the same name (including the active one running workloads):

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud node delete --stack overcloud 20bdaf46-3960-45b9-9781-40f54777ce8f
Deleting the following nodes from stack overcloud:
- 20bdaf46-3960-45b9-9781-40f54777ce8f
Started Mistral Workflow tripleo.scale.v1.delete_node. Execution ID: bb5518fe-4f00-491a-9420-8d40731aae0c
('The read operation timed out',)
(undercloud) [stack@undercloud-0 ~]$ nova list
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
/usr/lib/python2.7/site-packages/requests/packages/urllib3/connection.py:303: SubjectAltNameWarning: Certificate for 192.168.24.2 has no `subjectAltName`, falling back to check for a `commonName` for now. This feature is being removed by major browsers and deprecated by RFC 2818. (See https://github.com/shazow/urllib3/issues/497 for details.)
  SubjectAltNameWarning
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| ID                                   | Name         | Status | Task State | Power State | Networks               |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
| 23ac5849-bd8c-4af2-9db3-145bc6e27f23 | compute-1    | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |
| 44b3774b-9542-49d6-b780-1f8a170536d0 | compute-1    | BUILD  | scheduling | NOSTATE     |                        |
| f8960e82-fd62-460a-88af-42193d7e7fe1 | compute-1    | ERROR  | -          | NOSTATE     |                        |
| 4980e829-bdec-44f5-99c2-39962650bd14 | controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.23 |
| 9953be5a-64a4-4b9d-b690-48857a6d628a | controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.22 |
| c4fc238f-40bd-431a-9a36-9f1e675b6254 | controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.24 |
| 0a368bdf-479c-4385-b825-460aa0e943a9 | database-0   | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
| f36656e8-b60b-4a34-b8c8-24f091fe3cc2 | database-1   | ACTIVE | -          | Running     | ctlplane=192.168.24.20 |
| a00f242e-7fc7-42de-ab3b-80a43418a3d7 | database-2   | ACTIVE | -          | Running     | ctlplane=192.168.24.8  |
| e8a234ad-15c1-48ee-8eb6-61fc97aa4fc4 | messaging-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.14 |
| 22cad3e7-d066-4336-9693-f83dbb0c091e | messaging-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| 384061cf-a195-452b-859a-3276651de8a6 | messaging-2  | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
| 594d2ea4-5e26-4d6e-9952-35f375e70f87 | networker-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.16 |
| 97204f57-396d-4388-8a2c-a2d116aaef0f | networker-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.13 |
+--------------------------------------+--------------+--------+------------+-------------+------------------------+
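
To make the effect of the delete explicit, the two listings can be compared by instance ID. This is a minimal sketch using only the compute rows from the `nova list` output in comments 2 and 3; the dictionaries are typed in by hand here rather than read from any API.

```python
# Compute rows before the delete (from comment 2): id -> (name, status).
before = {
    "20bdaf46-3960-45b9-9781-40f54777ce8f": ("compute-0", "ERROR"),
    "64b7ccc9-2090-4a0f-a647-7c783b68ee0f": ("compute-0", "ACTIVE"),
    "9ed4466e-bf6c-4e24-87d8-9f29860c86f1": ("compute-0", "ERROR"),
    "23ac5849-bd8c-4af2-9db3-145bc6e27f23": ("compute-1", "ACTIVE"),
    "e269062a-7239-4367-9d56-532849f6e18d": ("compute-1", "ERROR"),
    "f8960e82-fd62-460a-88af-42193d7e7fe1": ("compute-1", "ERROR"),
}

# Compute rows after the delete (from this comment).
after = {
    "23ac5849-bd8c-4af2-9db3-145bc6e27f23": ("compute-1", "ACTIVE"),
    "44b3774b-9542-49d6-b780-1f8a170536d0": ("compute-1", "BUILD"),
    "f8960e82-fd62-460a-88af-42193d7e7fe1": ("compute-1", "ERROR"),
}

# Set difference on the IDs shows what disappeared and what is new.
removed = {i: before[i] for i in sorted(before.keys() - after.keys())}
added = {i: after[i] for i in sorted(after.keys() - before.keys())}

for server_id, (name, status) in removed.items():
    print("removed", server_id, name, status)
for server_id, (name, status) in added.items():
    print("added  ", server_id, name, status)
```

Only one UUID was passed to the delete command, yet all three compute-0 instances, including the ACTIVE one, are absent from the second listing.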

Comment 4 Marios Andreou 2017-08-30 13:35:24 UTC
Can we try again and run the upgrade with all the 'new' flags in roles_data? (This seems at least related to BZ 1486311.) If it passes, great; otherwise we may need to reach out to someone from DFG:DF (or whoever added the Mistral support and those deprecated param flags in roles_data.yaml, https://github.com/openstack/tripleo-heat-templates/blob/master/roles_data.yaml#L152-L168 ) to help out too.

Comment 5 Marius Cornea 2017-09-04 12:16:00 UTC
(In reply to marios from comment #4)
> can we try again and run the upgrade with all the 'new' flags in roles_data
> (this seems at least related to BZ 1486311 ). If it passes great, otherwise
> we need to reach out to someone from DFG:DF possibly (or whoever it is that
> added the mistral support for and those
> https://github.com/openstack/tripleo-heat-templates/blob/master/roles_data.yaml#L152-L168
> deprecated param flags in the roles_data.yaml ) to help out
> too.

After adding the 'new' flags before starting the upgrade, the major upgrade composable step completed fine. We still need to see what these flags represent and how they could impact environments using custom values for them.

FWIW, these are the flags that I applied for each role:
https://review.gerrithub.io/#/c/376753/1/tasks/convert_roles_data.yaml
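
A pre-upgrade check for missing flags could look roughly like the sketch below. It is an assumption-laden illustration: `EXPECTED_FLAGS` contains only the three flag values quoted in comments 1 and 2 of this report, not the full set from the upstream roles_data.yaml or the linked review, and the roles are given as already-parsed Python dicts rather than loaded from the YAML file.

```python
# Flags quoted in this bug report only. The complete per-role set is defined
# in the upstream roles_data.yaml and is NOT reproduced here.
EXPECTED_FLAGS = {
    "Controller": {"deprecated_param_flavor": "OvercloudControlFlavor"},
    "Compute": {
        "deprecated_param_flavor": "OvercloudComputeFlavor",
        "deprecated_param_scheduler_hints": "NovaComputeSchedulerHints",
    },
}


def missing_flags(roles, expected=EXPECTED_FLAGS):
    """Return {role_name: {flag: expected_value}} for absent or wrong flags.

    `roles` is a list of dicts shaped like the entries in roles_data.yaml
    (each entry has a `name` key). Hypothetical helper, for illustration.
    """
    problems = {}
    for role in roles:
        wanted = expected.get(role.get("name"), {})
        missing = {k: v for k, v in wanted.items() if role.get(k) != v}
        if missing:
            problems[role["name"]] = missing
    return problems


# Example input: Controller already has its flag, Compute has none, and
# Database is a custom role with no expectations in this sketch.
roles = [
    {"name": "Controller", "deprecated_param_flavor": "OvercloudControlFlavor"},
    {"name": "Compute"},
    {"name": "Database"},
]

report = missing_flags(roles)
for name, flags in report.items():
    for flag, value in flags.items():
        print(f"{name}: add {flag}: '{value}'")
```

Comments 1 and 2 suggest that adding only some of the flags moves the failure elsewhere, so a check like this is most useful when fed the complete flag set for each role.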

