Created attachment 1455551 [details] nic configs pre and post conversion Description of problem: Followed docs - https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/fast_forward_upgrades/assembly-preparing_for_overcloud_upgrade#converting-network-interface-templates-to-the-new-structure to convert osp10 form nic configs into osp13 format. Script completed without an error but after reviewing it the conversion missed the {get_input: bridge_name} and instead injected 'null' Unfortunately I have not noticed it in time and later - "openstack overcloud upgrade run --roles Controller --skip-tags validation" injected wrong network configuration and destroyed connectivity to controllers Version-Release number of selected component (if applicable): OSP13 FFU How reproducible: Every time Steps to Reproduce: 1. build osp10 environment 2. convert network files with bridge name variale with /usr/share/openstack-tripleo-heat-templates/tools/yaml-nic-config-2-script.py Actual results: - type: ovs_bridge name: bridge_name: null mtu: 1500 dns_servers: get_param: DnsServers use_dhcp: false addresses: - ip_netmask: get_param: ExternalIpSubnet routes: - default: true next_hop: get_param: ExternalInterfaceDefaultRoute members: - type: interface name: nic6 mtu: 1500 primary: true Expected results: - type: ovs_bridge name: bridge_name mtu: 1500 dns_servers: get_param: DnsServers use_dhcp: false routes: - default: true next_hop: get_param: ExternalInterfaceDefaultRoute addresses: - ip_netmask: get_param: ExternalIpSubnet members: - type: interface name: nic6 mtu: 1500 use_dhcp: false primary: true Additional info: including original and converted file (please disregard compute-hci).
I checked the attached nic templates and I can see the lines being converted: name: {bridge_name} => name: bridge_name: null We can notice that the original nic template doesn't contain {get_input: bridge_name} as reported but only name: {bridge_name} When the original nic template contains name: {get_input: bridge_name} the conversion seems to work fine as it results in: name: bridge_name I'm not sure if the name: {bridge_name} format was supported in OSP10 but according to the docs I couldn't find it documented: https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html-single/advanced_overcloud_customization/
This is odd .. I have looked inside my undercloud snapshot from osp10 .. at the beginning of the FFWD journey and it clearly shows: - type: ovs_bridge name: {get_input: bridge_name} mtu: 1500 dns_servers: {get_param: DnsServers} use_dhcp: false addresses: - ip_netmask: {get_param: ExternalIpSubnet} routes: - default: true next_hop: {get_param: ExternalInterfaceDefaultRoute} members: The renamed old file indeed changed the name to name: {bridge_name} .. which is not something I have done manually but this must have occurred during the conversion process.
Ok, I was able to reproduce this issue. If you try the conversion for the file that has at least single comment line (that you would miss to remove based on docs) it fails with: (undercloud) [stack@test1-undercloud test-nics]$ /usr/share/openstack-tripleo-heat-templates/tools/yaml-nic-config-2-script.py --script-dir /usr/share/openstack-tripleo-heat-templates/network/scripts controller.yaml Using script at /usr/share/openstack-tripleo-heat-templates/network/scripts/run-os-net-config.sh The yaml file will be overwritten and the original saved as /home/stack/test-nics/controller.yaml.20180703145330 Overwrite controller.yaml? [y/n] y Converting controller.yaml Traceback (most recent call last): File "/usr/share/openstack-tripleo-heat-templates/tools/yaml-nic-config-2-script.py", line 155, in convert tpl = yaml.load(open(filename).read(), Loader=TemplateLoader) File "/usr/lib64/python2.7/site-packages/yaml/__init__.py", line 71, in load return loader.get_single_data() File "/usr/lib64/python2.7/site-packages/yaml/constructor.py", line 37, in get_single_data node = self.get_single_node() File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 36, in get_single_node document = self.compose_document() File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 55, in compose_document node = self.compose_node(None, None) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 84, in compose_node node = self.compose_mapping_node(anchor) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 133, in compose_mapping_node item_value = self.compose_node(node, item_key) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 84, in compose_node node = self.compose_mapping_node(anchor) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 133, in compose_mapping_node item_value = self.compose_node(node, item_key) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 84, in compose_node node = self.compose_mapping_node(anchor) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 133, in compose_mapping_node item_value = self.compose_node(node, item_key) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 84, in compose_node node = self.compose_mapping_node(anchor) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 133, in compose_mapping_node item_value = self.compose_node(node, item_key) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 84, in compose_node node = self.compose_mapping_node(anchor) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 133, in compose_mapping_node item_value = self.compose_node(node, item_key) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 84, in compose_node node = self.compose_mapping_node(anchor) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 133, in compose_mapping_node item_value = self.compose_node(node, item_key) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 82, in compose_node node = self.compose_sequence_node(anchor) File "/usr/lib64/python2.7/site-packages/yaml/composer.py", line 110, in compose_sequence_node while not self.check_event(SequenceEndEvent): File "/usr/lib64/python2.7/site-packages/yaml/parser.py", line 98, in check_event self.current_event = self.state() File "/usr/lib64/python2.7/site-packages/yaml/parser.py", line 386, in parse_block_sequence_entry return self.parse_block_node() File "/usr/lib64/python2.7/site-packages/yaml/parser.py", line 265, in parse_block_node return self.parse_node(block=True) File "/usr/lib64/python2.7/site-packages/yaml/parser.py", line 371, in parse_node token.start_mark) ParserError: while parsing a block node expected the node content, but found '?' in "<string>", line 139, column 13: comment7_14: ' Create a bridge w ... ^ unfortunately it also converts part of the original file and changes: name: {get_input: bridge_name} -> name: {bridge_name} If you remove extra comment line and re-run it will succeed but set the bridge name to null. That is how I got into a bad state. I think we should just remove the comment lines ourselves within the python script. I am attaching my test files.
Created attachment 1456319 [details] failed conversion and used yaml files
Chris - the issue with the comments only occurs if there is a blank line followed by a comment. Comments are handled fine if there is no blank line proceeding it. See https://bugzilla.redhat.com/show_bug.cgi?id=1576572. This will be fixed in a OSP-13 release post-GA. The failure still does cause the template to be overwritten. The original template would have to be used and edited to remove any blank lines before the comments in order to workaround 1576572. As Marius noted in Comment 1, if the line is: name: {bridge_name} it won't be handled correctly, as that isn't a valid script format (old or new)
Bob, Thanks for clarifying. It seems that we will get a fix later in the release cycle. Feel free to mark this either as duplicate for the other one or use it as the opportunity to maybe add a validation step before this gets applied (it was rather hard to recover after I was deep in a weeds and all my controllers got updated with bad networking). Also, to simplify the process we could integrate this step with something else (remove the extra step)?
Thanks Chris. It does seems like this is a dup of the "comment after blank line" bug 1576572 so I will mark it as such. We have a validation planned (https://bugzilla.redhat.com/show_bug.cgi?id=1565843) for a follow-on OSP-13 release that will detect old style nic config files and note that the script conversion must be run. It should be fairly easy to check for: name: {bridge_name} as part of that validation. I've added a comment to the validation bug to do this - https://bugzilla.redhat.com/show_bug.cgi?id=1565843#c6. *** This bug has been marked as a duplicate of bug 1576572 ***