Description of problem:
The customer is trying to perform a minor update of OSP13 and receives the following error:

InvalidConfiguration: Missing networks from environment configuration. Ensure the following networks are properly configured in the provided environment files [set([u'OS::TripleO::Network::StorageNFS', u'OS::TripleO::Network::StorageMgmt', u'OS::TripleO::Network::BaseManagement'])]

It looks like there are networks defined in the environment that are not in the network_data.yaml file. These networks from the overcloud environment don't exist in Neutron or the Heat DB. It looks like an update was performed at some stage using the wrong network_data.yaml file, and now we are unable to proceed because the validation here fails:
https://github.com/openstack/python-tripleoclient/blob/stable/queens/tripleoclient/utils.py#L603-L616

Version-Release number of selected component (if applicable):
OSP13

How reproducible:
100%

Steps to Reproduce:
1. openstack overcloud update prepare
2. observe failure

Actual results:
InvalidConfiguration: Missing networks from environment configuration. Ensure the following networks are properly configured in the provided environment files [set([u'OS::TripleO::Network::StorageNFS', u'OS::TripleO::Network::StorageMgmt', u'OS::TripleO::Network::BaseManagement'])]

Expected results:
I should be able to at least define these networks as:

    resource_registry:
      OS::TripleO::Network::BaseManagement: OS::Heat::None
      OS::TripleO::Network::StorageMgmt: OS::Heat::None
      OS::TripleO::Network::StorageNFS: OS::Heat::None

to get past this issue.

Additional info:
Since the validation runs before the templates are uploaded, I can't fix this issue by updating the templates. The only solution I can think of is to add these unwanted networks to network_data.yaml so that the environment files match what the existing stack environment expects. But that will leave the customer with 3 extra networks that don't currently exist in the overcloud.

My request here is to know whether I can comment out the validation check in tripleoclient to get the deployment moving. I would like to avoid having to define the unwanted networks if at all possible. Since we know they don't currently exist in Neutron or Heat, I don't want to risk having the deployment try to add these networks to the production overcloud.

As far as I can see, the environment information that forms stack_nets (https://github.com/openstack/python-tripleoclient/blob/stable/queens/tripleoclient/utils.py#L606) is basically what this returns:

    openstack stack environment show overcloud | grep OS::TripleO::Network:: | grep -v OS::TripleO::Network::Ports | grep -v OS::Heat::None

It is stored in mysql under heat.raw_template. This is a massive field and I couldn't possibly update it without causing more issues. So I'm looking for a way to progress without creating a mess of unused networks, which could cause other issues if we just define them in network_data.yaml.

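For anyone who wants to apply the same filter to a saved copy of the environment output, here is a minimal Python sketch of the grep pipeline above. The function name and the sample text (including the template paths) are hypothetical and only for illustration; it filters plain text line by line, exactly as grep would, and does not talk to Heat.

```python
PREFIX = "OS::TripleO::Network::"


def active_network_lines(env_text):
    """Mimic: grep PREFIX | grep -v PREFIX + 'Ports' | grep -v OS::Heat::None."""
    return [
        line.strip()
        for line in env_text.splitlines()
        if PREFIX in line
        and PREFIX + "Ports" not in line
        and "OS::Heat::None" not in line
    ]


# Hypothetical excerpt of `openstack stack environment show overcloud` output.
sample = """\
resource_registry:
  OS::TripleO::Network::External: network/external.yaml
  OS::TripleO::Network::Ports::ExternalVipPort: network/ports/external.yaml
  OS::TripleO::Network::StorageNFS: OS::Heat::None
  OS::TripleO::Network::StorageMgmt: network/storage_mgmt.yaml
"""

for line in active_network_lines(sample):
    print(line)
```

Only the External and StorageMgmt lines survive the filter: the Ports entry and the entry mapped to OS::Heat::None are dropped, as in the shell pipeline.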
Reproducer steps are:
1. Deploy the overcloud and enable the management network.
2. Delete the management ports, subnets and network.
3. For full consistency with the environment in question, delete the ManagementNetwork and ManagementSubnet from heat.resources in mysql.
4. Change the name of the management network in network_data.yaml to something like Manage. For consistency, I also changed the lower-case version to manage.
5. Re-run the overcloud deploy.

Now we observe the error:

    (undercloud) [stack@undercloud-0 ~]$ ./overcloud_deploy.sh
    Started Mistral Workflow tripleo.validations.v1.check_pre_deployment_validations. Execution ID: 3b2125fb-e077-479c-87ae-6784f295e672
    Waiting for messages on queue 'tripleo' with no timeout.
    Removing the current plan files
    Uploading new plan files
    Started Mistral Workflow tripleo.plan_management.v1.update_deployment_plan. Execution ID: c6879bff-6d22-4c3d-9136-9e220c7c35ed
    Plan updated.
    Processing templates in the directory /tmp/tripleoclient-EcaykL/tripleo-heat-templates
    Missing networks from environment configuration. Ensure the following networks are properly configured in the provided environment files [set([u'OS::TripleO::Network::Management'])]

I am testing two solutions. The first is simply to change the name back to Management and management and then re-run the deployment. This works, but it creates the network again, which is not desirable in the specific use case on this BZ.

The second solution I'm testing is commenting out the validation in tripleoclient/utils.py:

    def _get_networks(registry):
        nets = set()
        for k, v in six.iteritems(registry):
            if (k.startswith('OS::TripleO::Network::')
                    and not k.startswith('OS::TripleO::Network::Port')
                    and v != 'OS::Heat::None'):
                nets.add(k)
        return nets

    stack_registry = stack.environment().get('resource_registry', {})
    env_registry = environment.get('resource_registry', {})
    stack_nets = _get_networks(stack_registry)
    env_nets = _get_networks(env_registry)
    env_diff = set(stack_nets) - set(env_nets)
    #if env_diff:
    #    raise exceptions.InvalidConfiguration('Missing networks from '
    #                                          'environment configuration. '
    #                                          'Ensure the following networks '
    #                                          'are properly configured in '
    #                                          'the provided environment files '
    #                                          '[{}]'.format(env_diff))

So far, this seems to be progressing, although I haven't dug much deeper into the potential consequences of doing this. I think it should be fine provided that the network really isn't there, but I will soon find out if something interesting happens as a result.

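To make the behaviour of the check easier to reason about, here is a standalone Python 3 sketch of the same set difference, based only on the snippet quoted above. The function names and the sample registries (including the template paths) are hypothetical. Going by that snippet, a network that the stack maps to a real template but the new environment maps to OS::Heat::None is filtered out of env_nets, so it would still be reported as missing.

```python
def get_networks(registry):
    """Return network keys mapped to a real template (not ports, not OS::Heat::None)."""
    return {
        key
        for key, value in registry.items()
        if key.startswith("OS::TripleO::Network::")
        and not key.startswith("OS::TripleO::Network::Port")
        and value != "OS::Heat::None"
    }


def missing_networks(stack_registry, env_registry):
    """Networks active in the deployed stack but not active in the new environment."""
    return get_networks(stack_registry) - get_networks(env_registry)


# Hypothetical registries: the stack still references StorageNFS,
# while the new environment files disable it with OS::Heat::None.
stack_registry = {
    "OS::TripleO::Network::Storage": "network/storage.yaml",
    "OS::TripleO::Network::StorageNFS": "network/storage_nfs.yaml",
}
env_registry = {
    "OS::TripleO::Network::Storage": "network/storage.yaml",
    "OS::TripleO::Network::StorageNFS": "OS::Heat::None",
}

print(sorted(missing_networks(stack_registry, env_registry)))
# prints ['OS::TripleO::Network::StorageNFS']
```

An empty result means the check would pass; any entry in the result corresponds to a network named in the InvalidConfiguration message.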
Commenting out the validation works, and it just created the network with the wrong name in addition to the "Manage" one. I assume it will also work if I simply remove the network as well.

I think the best solution here is just to include a network_data.yaml file with all of the networks that are expected according to:

    openstack stack environment show overcloud | grep OS::TripleO::Network:: | grep -v OS::TripleO::Network::Ports | grep -v OS::Heat::None

Commenting out the validation doesn't remove the bad networks anyway; it just creates the new one as well. So the issue is still going to exist even if we comment it out:

    (undercloud) [stack@undercloud-0 ~]$ openstack stack environment show overcloud | grep OS::TripleO::Network:: | grep -v OS::TripleO::Network::Ports | grep -v OS::Heat::None
    OS::TripleO::Network::External: http://192.168.24.1:8080/v1/AUTH_e71af90cb44b416a813969996e17ecb4/overcloud/network/external.yaml
    OS::TripleO::Network::InternalApi: http://192.168.24.1:8080/v1/AUTH_e71af90cb44b416a813969996e17ecb4/overcloud/network/internal_api.yaml
    OS::TripleO::Network::Manage: http://192.168.24.1:8080/v1/AUTH_e71af90cb44b416a813969996e17ecb4/overcloud/network/manage.yaml <<<---- New one
    OS::TripleO::Network::Management: http://192.168.24.1:8080/v1/AUTH_e71af90cb44b416a813969996e17ecb4/overcloud/network/management.yaml <<<---- Original
    OS::TripleO::Network::Storage: http://192.168.24.1:8080/v1/AUTH_e71af90cb44b416a813969996e17ecb4/overcloud/network/storage.yaml
    OS::TripleO::Network::StorageMgmt: http://192.168.24.1:8080/v1/AUTH_e71af90cb44b416a813969996e17ecb4/overcloud/network/storage_mgmt.yaml
    OS::TripleO::Network::Tenant: http://192.168.24.1:8080/v1/AUTH_e71af90cb44b416a813969996e17ecb4/overcloud/network/tenant.yaml

Let's go with adding the networks. I created this solution article for it:
https://access.redhat.com/solutions/4526651

We might want to consider a code solution for how to work around this.
But I think at this stage, it's a fairly specific issue and we can address it as an RFE rather than this Urgent BZ. I'll close it off accordingly.
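
For reference, a rough Python sketch of how the list of networks to add to network_data.yaml could be worked out. It assumes that the suffix of each OS::TripleO::Network::<Name> registry key matches the name field of a network_data.yaml entry, which is consistent with the Manage/Management behaviour seen in the reproducer but is not verified against the templates here. The function name and sample data are hypothetical, and network_data is passed in as an already-parsed list of dicts.

```python
PREFIX = "OS::TripleO::Network::"


def names_missing_from_network_data(stack_registry, network_data):
    """List network names active in the stack registry but absent from network_data."""
    defined = {net["name"] for net in network_data}
    expected = {
        key[len(PREFIX):]
        for key, value in stack_registry.items()
        if key.startswith(PREFIX)
        and not key.startswith(PREFIX + "Port")
        and value != "OS::Heat::None"
    }
    return sorted(expected - defined)


# Hypothetical data modelled on the reproducer: the stack still knows
# Management, but network_data.yaml only defines the renamed Manage network.
stack_registry = {
    "OS::TripleO::Network::Manage": "network/manage.yaml",
    "OS::TripleO::Network::Management": "network/management.yaml",
    "OS::TripleO::Network::Storage": "network/storage.yaml",
}
network_data = [
    {"name": "Manage", "name_lower": "manage"},
    {"name": "Storage", "name_lower": "storage"},
]

print(names_missing_from_network_data(stack_registry, network_data))
# prints ['Management']
```

Each name in the result is a network that would need an entry in network_data.yaml for the validation to pass.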