Bug 1570132 - [RFE] Validate roles-data.yaml is included when doing openstack overcloud deploy
Keywords:
Status: NEW
Alias: None
Product: RDO
Classification: Community
Component: openstack-tripleo
Version: Ocata
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: trunk
Assignee: James Slagle
QA Contact: Shai Revivo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-20 16:58 UTC by David Manchado
Modified: 2018-04-20 16:58 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1569293 | 0 | unspecified | NEW | Need to add deleted compute nodes back to the overcloud stack in the undercloud | 2021-02-22 00:41:40 UTC

Description David Manchado 2018-04-20 16:58:57 UTC
Description of problem:
When a config file line is accidentally commented out in the deploy script, it is not just that file that gets dropped: every line after it is dropped from the deploy command as well [1].
With composable roles, this can mean that the nodes belonging to those roles are deleted from the undercloud and shut down, causing an outage and some panic for the OpenStack operator.
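
To illustrate the mechanism, here is a minimal Python sketch of how the shell ends up treating such a script. It assumes the deploy script is a single command continued with trailing backslashes (the pastebin in [1] is not reproduced here), and the script contents and file names are hypothetical. In bash, a trailing backslash inside a comment does not continue the line, so the command ends at the commented-out line.

```python
# Hypothetical deploy script; the file names are made up for illustration.
DEPLOY_SCRIPT = r"""openstack overcloud deploy --templates \
  -e a.yaml \
  # -e b.yaml \
  -e c.yaml \
  -r my_roles.yaml
"""


def effective_command(script: str) -> str:
    """Return the first command the shell would run (simplified model).

    Only whole-line comments and trailing-backslash continuations are
    modelled; quoting and inline comments are ignored.
    """
    parts = []
    for line in script.splitlines():
        stripped = line.strip()
        if not parts and (not stripped or stripped.startswith("#")):
            # Blank or comment lines before the command starts are skipped.
            continue
        if stripped.startswith("#"):
            # A comment swallows its own trailing backslash, so the
            # continued command ends here.
            break
        if stripped.endswith("\\"):
            parts.append(stripped[:-1].strip())
            continue
        parts.append(stripped)
        break
    return " ".join(p for p in parts if p)
```

With the sample above, `effective_command(DEPLOY_SCRIPT)` returns the command only up to `-e a.yaml`; the `-e c.yaml` and `-r` lines are no longer part of the deploy command.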


Related to BZ 1569293 [2]

[1] http://pastebin.test.redhat.com/578960
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1569293


Version-Release number of selected component (if applicable):
Tested on Ocata, but it likely applies to other releases as well.

How reproducible:
100%

Steps to Reproduce:
1. Deploy OpenStack using a deploy script similar to [1], without commenting out any config file.
2. Re-run the deploy with one config file in the middle section of the script commented out.

Actual results:
All nodes belonging to non-standard roles (in our case, several composable roles defined by hardware specs, mainly NIC ordering) are deleted from the undercloud and shut down.
Luckily, the OS on those nodes is not wiped, so they can be booted back up without data loss.
From that point on, the overcloud can no longer be managed by the undercloud until the undercloud is restored from a database or disk backup.

Expected results:
Assuming that a roles-data.yaml file is always needed and has to be the last config file added, it would be great to implement a validation so that, if no roles file is passed, the deploy either does not proceed or at least warns the operator.

Additional info:
I would suggest not basing the check on the file name roles-data.yaml, since the file can be customized, and checking for the -r parameter instead.
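
As a rough sketch of what such a check could look like (not how the client actually implements argument handling), the following Python snippet looks for the -r option in the deploy arguments. The function name is hypothetical, and --roles-file is included on the assumption that it is the long spelling of -r.

```python
import argparse


def roles_file_from_args(argv):
    """Return the value passed with -r/--roles-file, or None if absent.

    All other arguments are ignored; this is only an illustration of the
    proposed validation, not the real deploy argument parser.
    """
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("-r", "--roles-file")
    known, _ignored = parser.parse_known_args(argv)
    return known.roles_file


def validate_roles_file(argv):
    """Return a warning message if no roles file was given, else None."""
    if roles_file_from_args(argv) is None:
        return "No roles file passed with -r; refusing to deploy."
    return None
```

A deploy wrapper could call `validate_roles_file` on the arguments it is about to pass and abort, or just print the warning, when the result is not `None`.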

