Bug 2222257

Summary: [FFU 16.2 to 17.1] OpenStack Undercloud upgrade fails when network isolation isn't used
Product: Red Hat OpenStack
Reporter: Pedro Navarro <pnavarro>
Component: tripleo-ansible
Assignee: Juan Badia Payno <jbadiapa>
Status: CLOSED ERRATA
QA Contact: Khomesh Thakre <kthakre>
Severity: high
Priority: medium
Version: 17.1 (Wallaby)
CC: jbadiapa, jpretori, kgilliga, kthakre, lsvaty, mariel, pgrist
Target Milestone: z1
Keywords: Triaged
Target Release: 17.1
Hardware: Unspecified
OS: Unspecified
Fixed In Version: tripleo-ansible-3.3.1-1.20230518201537.el9ost
Doc Type: Known Issue
Last Closed: 2023-09-20 00:29:46 UTC
Type: Bug

Description Pedro Navarro 2023-07-12 12:32:48 UTC
Description of problem:

OpenStack Undercloud upgrade fails when network isolation isn't used. 

This part of the undercloud upgrade process is failing:

openstack overcloud network extract --stack overcloud --output /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-network-data.yaml --yes
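
Judging from the traceback under "Actual results", the upgrade script appears to run this command through subprocess.check_call with OS_CLOUD set in the environment. The following is a minimal Python sketch of that call shape, not the script's actual source. The names build_extract_cmd and export_networks_sketch, the injectable runner argument, and the output file name pattern (inferred from the path above) are assumptions made for illustration.

```python
import os
import subprocess


def build_extract_cmd(stack, stack_dir):
    """Return the argv list for the network extract command.

    The output file name pattern is inferred from the path in this
    report and is an assumption.
    """
    output = os.path.join(
        stack_dir, "tripleo-{}-network-data.yaml".format(stack))
    return ["openstack", "overcloud", "network", "extract",
            "--stack", stack, "--output", output, "--yes"]


def export_networks_sketch(stack, stack_dir, cloud,
                           runner=subprocess.check_call):
    """Run the extract command; return True on success, False on failure.

    `runner` is injectable (hypothetical) so the call can be exercised
    without a real openstack CLI.
    """
    cmd = build_extract_cmd(stack, stack_dir)
    try:
        runner(cmd, env={"OS_CLOUD": cloud})
    except subprocess.CalledProcessError as exc:
        print("network extract failed with exit status", exc.returncode)
        return False
    return True
```

Catching CalledProcessError here only illustrates where the non-zero exit status surfaces; whether the upgrade should continue when the extract fails is a separate decision that this sketch does not make.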


Version-Release number of selected component (if applicable): RHEL 8.4 and RHOSP 17.1


How reproducible:


Steps to Reproduce:
1. Install a fresh 16.2 overcloud without network isolation:
time openstack overcloud deploy --templates \
-e /home/stack/templates/node-info.yaml \
-e /home/stack/containers-prepare-parameter.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml \
--ntp-server 172.25.250.1 --log-file /tmp/install_overcloud.log --libvirt-type qemu --timeout 120 --debug

(undercloud) [stack@director ~]$ cat /home/stack/templates/node-info.yaml
parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  ControllerCount: 3
  ComputeCount: 2

(undercloud) [stack@director ~]$ cat /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml
# A Heat environment that can be used to deploy OVN services with non-DVR and HA OVN DB servers.
resource_registry:
  OS::TripleO::Services::NeutronMl2PluginBase: ../../deployment/neutron/neutron-plugin-ml2-ovn.yaml
  OS::TripleO::Services::OVNController: ../../deployment/ovn/ovn-controller-container-puppet.yaml
  OS::TripleO::Services::OVNMetadataAgent: ../../deployment/ovn/ovn-metadata-container-puppet.yaml
# Disabling Neutron services that overlap with OVN
  OS::TripleO::Services::NeutronOvsAgent: OS::Heat::None
  OS::TripleO::Services::ComputeNeutronOvsAgent: OS::Heat::None
  OS::TripleO::Services::NeutronL3Agent: OS::Heat::None
  OS::TripleO::Services::NeutronMetadataAgent: OS::Heat::None
  OS::TripleO::Services::NeutronDhcpAgent: OS::Heat::None
  OS::TripleO::Services::ComputeNeutronCorePlugin: OS::Heat::None


parameter_defaults:
  NeutronMechanismDrivers: ovn
  OVNNeutronSyncMode: log
  OVNQosDriver: ovn-qos
  NeutronEnableDVR: False
  NeutronTypeDrivers: 'geneve,vxlan,vlan,flat'
  NeutronNetworkType: ['geneve' , 'vxlan', 'vlan', 'flat']
  NeutronServicePlugins: 'qos,ovn-router,trunk,segments,port_forwarding,log'
  NeutronVniRanges: ['1:65536', ]
  NeutronPluginExtensions: "qos,port_security,dns_domain_ports"
  NeutronRpcWorkers: 1
  ComputeParameters:
    NeutronBridgeMappings: ""
  ControllerParameters:
    OVNCMSOptions: "enable-chassis-as-gw"
  NetworkerParameters:
    OVNCMSOptions: "enable-chassis-as-gw"
  OVNDnsServers: []
  KernelIpNonLocalBind: 1

2. Use rhos-release to enable the latest 17.1 compose.
3. Run openstack undercloud upgrade.

Actual results:
2023-07-12 05:33:27.032730 | 2cc26024-0e20-9aaa-1b57-000000000711 |       TASK | Run undercloud-upgrade-ephemeral-heat.py
2023-07-12 05:33:35.704449 | 2cc26024-0e20-9aaa-1b57-000000000711 |      FATAL | Run undercloud-upgrade-ephemeral-heat.py | director | error={"changed": true, "cmd": "/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py", "delta": "0:00:08.501815", "end": "2023-07-12 05:33:35.684553", "msg": "non-zero return code", "rc": 1, "start": "2023-07-12 05:33:27.182738", "stderr": "INFO:undercloud:Exporting network from stack overcloud to /home/stack/overcloud-d
eploy/overcloud/tripleo-overcloud-network-data.yaml\nAnsible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-network-extract.yaml, Run Status: failed, Return Code: 2\nException occured while running the command\nTraceback (most recent call last):\n  File \"/usr/lib/python3.6/site-packages/tripleoclient/command.py\", line 32, in run\n    super(Command, self).run(parsed_args)\n  File \"/usr/lib/python3.6/site-packages/osc_lib/command/comman
d.py\", line 39, in run\n    return super(Command, self).run(parsed_args)\n  File \"/usr/lib/python3.6/site-packages/cliff/command.py\", line 186, in run\n    return_code = self.take_action(parsed_args) or 0\n  File \"/usr/lib/python3.6/site-packages/tripleoclient/v2/overcloud_network.py\", line 77, in take_action\n    extra_vars=extra_vars,\n  File \"/usr/lib/python3.6/site-packages/tripleoclient/utils.py\", line 775, in run_ansible_playbook\n    raise RuntimeError(er
r_msg)\nRuntimeError: Ansible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-network-extract.yaml, Run Status: failed, Return Code: 2\nAnsible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-network-extract.yaml, Run Status: failed, Return Code: 2\nTraceback (most recent call last):\n  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 443, in <module>\n    main()\n  File \"
/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 381, in main\n    export_networks(stack, stack_dir, args.cloud)\n  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 278, in export_networks\n    '--yes'], env={'OS_CLOUD': cloud})\n  File \"/usr/lib64/python3.6/subprocess.py\", line 311, in check_call\n    raise CalledProcessError(retcode, cmd)\nsubprocess.CalledProcessError: Command '['openstack', 'overcloud
', 'network', 'extract', '--stack', 'overcloud', '--output', '/home/stack/overcloud-deploy/overcloud/tripleo-overcloud-network-data.yaml', '--yes']' returned non-zero exit status 1.", "stderr_lines": ["INFO:undercloud:Exporting network from stack overcloud to /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-network-data.yaml", "Ansible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-network-extract.yaml, Run Status: failed, Return
 Code: 2", "Exception occured while running the command", "Traceback (most recent call last):", "  File \"/usr/lib/python3.6/site-packages/tripleoclient/command.py\", line 32, in run", "    super(Command, self).run(parsed_args)", "  File \"/usr/lib/python3.6/site-packages/osc_lib/command/command.py\", line 39, in run", "    return super(Command, self).run(parsed_args)", "  File \"/usr/lib/python3.6/site-packages/cliff/command.py\", line 186, in run", "    return_code =
 self.take_action(parsed_args) or 0", "  File \"/usr/lib/python3.6/site-packages/tripleoclient/v2/overcloud_network.py\", line 77, in take_action", "    extra_vars=extra_vars,", "  File \"/usr/lib/python3.6/site-packages/tripleoclient/utils.py\", line 775, in run_ansible_playbook", "    raise RuntimeError(err_msg)", "RuntimeError: Ansible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-network-extract.yaml, Run Status: failed, Return Code
: 2", "Ansible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-network-extract.yaml, Run Status: failed, Return Code: 2", "Traceback (most recent call last):", "  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 443, in <module>", "    main()", "  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 381, in main", "    export_networks(stack, stack_dir, args.cloud)", "  File
 \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 278, in export_networks", "    '--yes'], env={'OS_CLOUD': cloud})", "  File \"/usr/lib64/python3.6/subprocess.py\", line 311, in check_call", "    raise CalledProcessError(retcode, cmd)", "subprocess.CalledProcessError: Command '['openstack', 'overcloud', 'network', 'extract', '--stack', 'overcloud', '--output', '/home/stack/overcloud-deploy/overcloud/tripleo-overcloud-network-data.yaml', '
--yes']' returned non-zero exit status 1."], "stdout": "\r\nPLAY [Overcloud Network Extract Networks] **************************************\n2023-07-12 05:33:31.321416 | 2cc26024-0e20-d3a6-f974-000000000008 |    SKIPPED | fail | localhost\n2023-07-12 05:33:31.322236 | 2cc26024-0e20-d3a6-f974-000000000008 |     TIMING | fail | localhost | 0:00:00.066662 | 0.02s\n2023-07-12 05:33:31.348439 | 2cc26024-0e20-d3a6-f974-000000000009 |    SKIPPED | fail | localhost\n2023-07-1
2 05:33:31.349244 | 2cc26024-0e20-d3a6-f974-000000000009 |     TIMING | fail | localhost | 0:00:00.093672 | 0.02s\n2023-07-12 05:33:31.352687 | 2cc26024-0e20-d3a6-f974-00000000000a |       TASK | Check if output file exists\n2023-07-12 05:33:31.581888 | 2cc26024-0e20-d3a6-f974-00000000000a |         OK | Check if output file exists | localhost\n2023-07-12 05:33:31.582843 | 2cc26024-0e20-d3a6-f974-00000000000a |     TIMING | Check if output file exists | localhost | 0:0
0:00.327269 | 0.23s\n2023-07-12 05:33:31.609323 | 2cc26024-0e20-d3a6-f974-00000000000b |    SKIPPED | fail | localhost\n2023-07-12 05:33:31.610168 | 2cc26024-0e20-d3a6-f974-00000000000b |     TIMING | fail | localhost | 0:00:00.354596 | 0.02s\n2023-07-12 05:33:31.614013 | 2cc26024-0e20-d3a6-f974-00000000000d |       TASK | Get network data from overcloud stack\n2023-07-12 05:33:35.238917 | 2cc26024-0e20-d3a6-f974-00000000000d |      FATAL | Get network data from overcl
oud stack | localhost | error={\"changed\": false, \"error\": \"No Stack found for e1e57091-0283-4c1e-a04a-450b1a3cbc0b\", \"msg\": \"Error getting network data from overcloud stack overcloud: %No Stack found for e1e57091-0283-4c1e-a04a-450b1a3cbc0b\", \"network_data\": [], \"success\": false}\n2023-07-12 05:33:35.239914 | 2cc26024-0e20-d3a6-f974-00000000000d |     TIMING | Get network data from overcloud stack | localhost | 0:00:03.984340 | 3.63s\n\r\nNO MORE HOSTS LE
FT *************************************************************\n\r\nPLAY RECAP *********************************************************************\nlocalhost                  : ok=1    changed=0    unreachable=0    failed=1    skipped=3    rescued=0    ignored=0   \n2023-07-12 05:33:35.242341 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n2023-07-12 05:33:35.242582 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 5
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n2023-07-12 05:33:35.242819 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:00:03.987253 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n2023-07-12 05:33:35.243056 |                                 UUID |       Info |       Host |   Task Name |   Run Time\n2023-07-12 05:33:35.243275 | 2cc26024-0e20-d3a6-f974-00000000000d |    SUMMARY |  localhost | Get network data from overcloud stack | 3.63s\n2023-07-12 05:33:35.243490 | 2cc26024-0e20-d3a6-f974-0
0000000000a |    SUMMARY |  localhost | Check if output file exists | 0.23s\n2023-07-12 05:33:35.243700 | 2cc26024-0e20-d3a6-f974-000000000008 |    SUMMARY |  localhost | fail | 0.02s\n2023-07-12 05:33:35.243926 | 2cc26024-0e20-d3a6-f974-00000000000b |    SUMMARY |  localhost | fail | 0.02s\n2023-07-12 05:33:35.244153 | 2cc26024-0e20-d3a6-f974-000000000009 |    SUMMARY |  localhost | fail | 0.02s\n2023-07-12 05:33:35.244360 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summar
y Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n2023-07-12 05:33:35.244575 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n2023-07-12 05:33:35.244789 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~\n2023-07-12 05:33:35.245034 |  The following node(s) had failures: localhost\n2023-07-12 05:33:35.245252 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~", "stdout_lines": ["", "PLAY [Overcloud Network Extract Networks] **************************************", "2023-07-12 05:33:31.321416 | 2cc26024-0e20-d3a6-f974-000000000008 |    SKIPPED | fail | localhost", "2023-07-12 05:33:31.322236 | 2cc26024-0e20-d3a6-f974-000000000008 |     TIMING | fail | localhost | 0:00:00.066662 | 0.02s", "2023-07-12 05:33:31.348439 | 2cc26024-0e20-d3a6-f974-000000000009 |    SKIPPED | fail | localhost", "2023-07-12 05:33:31.349244
 | 2cc26024-0e20-d3a6-f974-000000000009 |     TIMING | fail | localhost | 0:00:00.093672 | 0.02s", "2023-07-12 05:33:31.352687 | 2cc26024-0e20-d3a6-f974-00000000000a |       TASK | Check if output file exists", "2023-07-12 05:33:31.581888 | 2cc26024-0e20-d3a6-f974-00000000000a |         OK | Check if output file exists | localhost", "2023-07-12 05:33:31.582843 | 2cc26024-0e20-d3a6-f974-00000000000a |     TIMING | Check if output file exists | localhost | 0:00:00.327269
 | 0.23s", "2023-07-12 05:33:31.609323 | 2cc26024-0e20-d3a6-f974-00000000000b |    SKIPPED | fail | localhost", "2023-07-12 05:33:31.610168 | 2cc26024-0e20-d3a6-f974-00000000000b |     TIMING | fail | localhost | 0:00:00.354596 | 0.02s", "2023-07-12 05:33:31.614013 | 2cc26024-0e20-d3a6-f974-00000000000d |       TASK | Get network data from overcloud stack", "2023-07-12 05:33:35.238917 | 2cc26024-0e20-d3a6-f974-00000000000d |      FATAL | Get network data from overcloud
 stack | localhost | error={\"changed\": false, \"error\": \"No Stack found for e1e57091-0283-4c1e-a04a-450b1a3cbc0b\", \"msg\": \"Error getting network data from overcloud stack overcloud: %No Stack found for e1e57091-0283-4c1e-a04a-450b1a3cbc0b\", \"network_data\": [], \"success\": false}", "2023-07-12 05:33:35.239914 | 2cc26024-0e20-d3a6-f974-00000000000d |     TIMING | Get network data from overcloud stack | localhost | 0:00:03.984340 | 3.63s", "", "NO MORE HOSTS L
EFT *************************************************************", "", "PLAY RECAP *********************************************************************", "localhost                  : ok=1    changed=0    unreachable=0    failed=1    skipped=3    rescued=0    ignored=0   ", "2023-07-12 05:33:35.242341 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~", "2023-07-12 05:33:35.242582 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks
: 5          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~", "2023-07-12 05:33:35.242819 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:00:03.987253 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~", "2023-07-12 05:33:35.243056 |                                 UUID |       Info |       Host |   Task Name |   Run Time", "2023-07-12 05:33:35.243275 | 2cc26024-0e20-d3a6-f974-00000000000d |    SUMMARY |  localhost | Get network data from overcloud stack | 3.63s", "2023-07-12 05:33:35.243490 | 2cc26024
-0e20-d3a6-f974-00000000000a |    SUMMARY |  localhost | Check if output file exists | 0.23s", "2023-07-12 05:33:35.243700 | 2cc26024-0e20-d3a6-f974-000000000008 |    SUMMARY |  localhost | fail | 0.02s", "2023-07-12 05:33:35.243926 | 2cc26024-0e20-d3a6-f974-00000000000b |    SUMMARY |  localhost | fail | 0.02s", "2023-07-12 05:33:35.244153 | 2cc26024-0e20-d3a6-f974-000000000009 |    SUMMARY |  localhost | fail | 0.02s", "2023-07-12 05:33:35.244360 | ~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~", "2023-07-12 05:33:35.244575 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~", "2023-07-12 05:33:35.244789 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~", "2023-07-12 05:33:35.245034 |  The following node(s) had failures: localhost", "2023-07-12 05:33:35.245252 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~"]}
2023-07-12 05:33:35.707338 | 2cc26024-0e20-9aaa-1b57-000000000711 |     TIMING | Run undercloud-upgrade-ephemeral-heat.py | director | 0:01:41.888600 | 8.67s
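
The root error is buried in the nested output above. The following is a minimal Python sketch for pulling the error= payload out of a single FATAL line. It assumes the pipe-separated layout seen in this log (timestamp, task UUID, status, task name, host, payload) and a task name that contains no pipe character. The function name parse_fatal_line is hypothetical, and SAMPLE is the inner FATAL line from the log with the JSON escaping removed.

```python
import json


def parse_fatal_line(line):
    """Decode the error= JSON payload of a pipe-separated FATAL line.

    Returns None when the line has no error= payload or is not FATAL.
    """
    head, sep, payload = line.partition("error=")
    if not sep:
        return None
    fields = [f.strip() for f in head.split("|")]
    # Assumed layout: timestamp | task UUID | status | task name | host |
    if len(fields) < 5 or fields[2] != "FATAL":
        return None
    return {"task": fields[3], "host": fields[4],
            "error": json.loads(payload)}


SAMPLE = (
    '2023-07-12 05:33:35.238917 | 2cc26024-0e20-d3a6-f974-00000000000d '
    '|      FATAL | Get network data from overcloud stack | localhost '
    '| error={"changed": false, '
    '"error": "No Stack found for e1e57091-0283-4c1e-a04a-450b1a3cbc0b", '
    '"msg": "Error getting network data from overcloud stack overcloud: '
    '%No Stack found for e1e57091-0283-4c1e-a04a-450b1a3cbc0b", '
    '"network_data": [], "success": false}'
)

result = parse_fatal_line(SAMPLE)
print(result["task"], "->", result["error"]["error"])
```

Applied to the sample, this isolates the "No Stack found" message reported by the "Get network data from overcloud stack" task, which is the failure that propagates up to the undercloud upgrade.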


Expected results:
The undercloud upgrade completes successfully.

Additional info:

(undercloud) [stack@director ~]$ openstack network list
+--------------------------------------+------------------+--------------------------------------+
| ID                                   | Name             | Subnets                              |
+--------------------------------------+------------------+--------------------------------------+
| 9f2ac279-8bc8-414d-879e-222b426cd9fa | ovn_mac_addr_net |                                      |
| c2be8a04-d679-4552-ab3a-cf7cb3a8a99d | ctlplane         | cc5fa076-9c26-474b-a25c-7fff265135a7 |
+--------------------------------------+------------------+--------------------------------------+
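
The listing contains only ctlplane and ovn_mac_addr_net, which is consistent with a deployment that does not use network isolation. The following is a minimal Python sketch of that check. It rests on an assumption made for illustration only (it is not TripleO's own logic): any network other than these two is treated as a sign of isolated networks. The names BASE_NETWORKS, parse_network_names, and uses_network_isolation are hypothetical.

```python
# Networks assumed to exist even without network isolation (assumption).
BASE_NETWORKS = {"ctlplane", "ovn_mac_addr_net"}

TABLE = """\
+--------------------------------------+------------------+--------------------------------------+
| ID                                   | Name             | Subnets                              |
+--------------------------------------+------------------+--------------------------------------+
| 9f2ac279-8bc8-414d-879e-222b426cd9fa | ovn_mac_addr_net |                                      |
| c2be8a04-d679-4552-ab3a-cf7cb3a8a99d | ctlplane         | cc5fa076-9c26-474b-a25c-7fff265135a7 |
+--------------------------------------+------------------+--------------------------------------+
"""


def parse_network_names(table_text):
    """Return the Name column from 'openstack network list' table output."""
    names = []
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip the +----+ border lines
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) >= 2 and cells[1] != "Name":
            names.append(cells[1])
    return names


def uses_network_isolation(names):
    """True if any network exists beyond the assumed base set."""
    return bool(set(names) - BASE_NETWORKS)


names = parse_network_names(TABLE)
print(names, uses_network_isolation(names))
```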

Comment 17 errata-xmlrpc 2023-09-20 00:29:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:5138