Description of problem:
When replacing a controller node with one that has different NIC names, running overcloud deploy with NetworkDeploymentActions ['CREATE'] applies the os-net-config configuration to all 3 controller nodes, not only to the new one.

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 16.1.1 GA (Train)

How reproducible:
Execute the procedure for Controller Node replacement, changing the NIC names in the controller.yaml NIC template file, set NetworkDeploymentActions ['CREATE'] in the network-environment.yaml file, and deploy.

Steps to Reproduce:
1. Update controller.yaml with the new NIC names
2. Set NetworkDeploymentActions ['CREATE'] in the network-environment.yaml file
3. Execute the Controller Replacement exactly as in the documentation
4. Run the deployment command

Actual results:
os-net-config updates the network configuration on controller-0, controller-1 and controller-2 (breaking the control plane due to invalid NIC names).

Expected results:
os-net-config should update the network configuration only on the replaced node.

Additional info:
Update: the overcloud is up and running, and setting the parameter to CREATE applied changes to my NICs again. This is not the expected behavior.
I'm tagging in Harald for some more context. However, it sounds like it would be more appropriate to handle different NIC names with a per-controller /etc/os-net-config/mapping.yaml file [1]. Finding a single set of NIC names that works on all controllers would be difficult.

[1] https://docs.openstack.org/os-net-config/latest/usage.html#interface-mapping
Indeed. However, the bug I am pointing at is that NetworkDeploymentActions=CREATE should not update the NIC configuration or trigger os-net-config on controller nodes that are already up and running; it should do so only on the NEW controller node. You would expect running nodes to receive a NIC config update only if NetworkDeploymentActions is set to CREATE,UPDATE instead.

The scenario I am working on is: 2 controllers in OK state + 1 new controller, which is being brought in on a new server as a replacement. NetworkDeploymentActions=CREATE should only set up the network on the 1 new controller node and leave the other 2 untouched.
I tried the same NetworkDeploymentActions ['CREATE'] setting when adding a new Compute node to the stack, and it also updates the NIC config on the controller nodes when something in the NIC template has changed.
https://review.opendev.org/c/openstack/tripleo-ansible/+/779649
I've initiated upstream backports of what I believe is the fix: https://review.opendev.org/c/openstack/tripleo-ansible/+/779649

@jpateteg you may want to test the patch manually?

diff --git a/tripleo_ansible/roles/tripleo-network-config/tasks/main.yml b/tripleo_ansible/roles/tripleo-network-config/tasks/main.yml
index da04000..568e3fa 100644
--- a/tripleo_ansible/roles/tripleo-network-config/tasks/main.yml
+++ b/tripleo_ansible/roles/tripleo-network-config/tasks/main.yml
@@ -106,5 +106,5 @@
 - (tripleo_network_config_action == "CREATE") or
   ("UPDATE" in tripleo_network_config_network_deployment_actions) or
   (os_net_config_returncode_stat.stat.exists and
-   ((os_net_config_returncode_slurp.content | b64decode) != 0)) or
+   ((os_net_config_returncode_slurp.content | b64decode | int) != 0)) or
   (not os_net_config_returncode_stat.stat.exists)
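To illustrate why the one-word change in the diff matters, here is a rough Python model of that condition. It is only a sketch under assumptions: the function name and its arguments are hypothetical, it assumes the stack action is "UPDATE" during a node replacement, and it assumes the return-code file holds "0" on nodes where os-net-config previously succeeded. The point is that the decoded file content is a string, and a string never equals the integer 0, so without the int conversion the "previous run failed" branch is true on every node that has the file.

```python
import base64


def should_run_os_net_config(action, deployment_actions, returncode_b64, fixed):
    """Hypothetical model of the condition shown in the diff.

    returncode_b64: base64 content of the return-code file, or None when
                    the file does not exist (e.g. on a brand-new node).
    fixed:          True models the patched condition (with the int cast).
    """
    exists = returncode_b64 is not None
    previous_run_failed = False
    if exists:
        decoded = base64.b64decode(returncode_b64).decode()
        # Unpatched: the string "0" is compared to the integer 0, which is
        # always unequal. Patched: the value is cast to int first.
        value = int(decoded) if fixed else decoded
        previous_run_failed = value != 0
    return (
        action == "CREATE"
        or "UPDATE" in deployment_actions
        or (exists and previous_run_failed)
        or not exists
    )


ok = base64.b64encode(b"0")  # assumed content after a successful earlier run

# Existing controller, NetworkDeploymentActions ['CREATE'], stack UPDATE:
before_fix = should_run_os_net_config("UPDATE", ["CREATE"], ok, fixed=False)
after_fix = should_run_os_net_config("UPDATE", ["CREATE"], ok, fixed=True)

# New controller without a return-code file:
new_node = should_run_os_net_config("UPDATE", ["CREATE"], None, fixed=True)
```

Under these assumptions, before_fix is True (os-net-config is re-applied on a healthy node), after_fix is False (the healthy node is left alone), and new_node is True (the replacement node is still configured), which matches the expected results in the description.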
*** Bug 2024013 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat OpenStack Platform 16.1.7 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3762