Bug 2237255

Summary: [FFU 16.2 to 17.1] overcloud node export fails when network isolation isn't used during upgrade
Product: Red Hat OpenStack
Reporter: Khomesh Thakre <kthakre>
Component: python-tripleoclient
Assignee: Harald Jensås <hjensas>
Status: CLOSED ERRATA
QA Contact: David Rosenfeld <drosenfe>
Severity: high
Priority: high
Version: 17.1 (Wallaby)
CC: hbrock, hjensas, jbadiapa, jlabarre, jpretori, jschluet, jslagle, mariel, mburns, pnavarro
Target Milestone: z2
Keywords: Triaged
Target Release: 17.1
Hardware: Unspecified
OS: Unspecified
Fixed In Version: python-tripleoclient-16.5.1-17.1.20230927000827.f3599d0.el9ost
Doc Type: No Doc Update
Last Closed: 2024-01-16 14:30:51 UTC
Type: Bug

Description Khomesh Thakre 2023-09-04 12:55:20 UTC
Description of problem:

The OpenStack undercloud upgrade (16.2 to 17.1) fails when the overcloud was deployed without network isolation.

The following step of the undercloud upgrade process fails:

openstack overcloud node extract provisioned --stack overcloud --roles-file /tmp/tmpctxjqepx --output /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-baremetal-deployment.yaml --working-dir /home/stack/overcloud-deploy/overcloud --yes 

Version-Release number of selected component (if applicable):


How reproducible:
100% 

Steps to Reproduce:

1. Deploy a fresh 16.2 overcloud without network isolation:
time openstack overcloud deploy --templates \
-e /home/stack/templates/node-info.yaml \
-e /home/stack/containers-prepare-parameter.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml \
--ntp-server 172.25.250.1 --log-file /tmp/install_overcloud.log --libvirt-type qemu --timeout 120 --debug

(undercloud) [stack@director ~]$ cat /home/stack/templates/node-info.yaml
parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  ControllerCount: 3
  ComputeCount: 2

(undercloud) [stack@director ~]$ cat /usr/share/openstack-tripleo-heat-templates/environments/services/neutron-ovn-ha.yaml
# A Heat environment that can be used to deploy OVN services with non-DVR and HA OVN DB servers.
resource_registry:
  OS::TripleO::Services::NeutronMl2PluginBase: ../../deployment/neutron/neutron-plugin-ml2-ovn.yaml
  OS::TripleO::Services::OVNController: ../../deployment/ovn/ovn-controller-container-puppet.yaml
  OS::TripleO::Services::OVNMetadataAgent: ../../deployment/ovn/ovn-metadata-container-puppet.yaml
# Disabling Neutron services that overlap with OVN
  OS::TripleO::Services::NeutronOvsAgent: OS::Heat::None
  OS::TripleO::Services::ComputeNeutronOvsAgent: OS::Heat::None
  OS::TripleO::Services::NeutronL3Agent: OS::Heat::None
  OS::TripleO::Services::NeutronMetadataAgent: OS::Heat::None
  OS::TripleO::Services::NeutronDhcpAgent: OS::Heat::None
  OS::TripleO::Services::ComputeNeutronCorePlugin: OS::Heat::None


parameter_defaults:
  NeutronMechanismDrivers: ovn
  OVNNeutronSyncMode: log
  OVNQosDriver: ovn-qos
  NeutronEnableDVR: False
  NeutronTypeDrivers: 'geneve,vxlan,vlan,flat'
  NeutronNetworkType: ['geneve' , 'vxlan', 'vlan', 'flat']
  NeutronServicePlugins: 'qos,ovn-router,trunk,segments,port_forwarding,log'
  NeutronVniRanges: ['1:65536', ]
  NeutronPluginExtensions: "qos,port_security,dns_domain_ports"
  NeutronRpcWorkers: 1
  ComputeParameters:
    NeutronBridgeMappings: ""
  ControllerParameters:
    OVNCMSOptions: "enable-chassis-as-gw"
  NetworkerParameters:
    OVNCMSOptions: "enable-chassis-as-gw"
  OVNDnsServers: []
  KernelIpNonLocalBind: 1

2. Use rhos-release to enable the latest 17.1 compose.
3. Run: openstack undercloud upgrade

Actual results:


    "stderr": "INFO:undercloud:Exporting network from stack overcloud to /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-network-data.yaml\nINFO:undercloud:Exporting network virtual IPs from stack overcloud to /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-virtual-ips.yaml\nINFO:undercloud:Exporting provisioned nodes from stack overcloud to /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-baremetal-deployment.yaml\nUnable to extract role networks. Network storage not found.\nTraceback (most recent call last):\n  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 443, in <module>\n    main()\n  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 383, in main\n    export_provisioned_nodes(heat, stack, stack_dir, args.cloud)\n  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 332, in export_provisioned_nodes\n    '--working-dir', stack_dir, '--yes'], env={'OS_CLOUD': cloud})\n  File \"/usr/lib64/python3.6/subprocess.py\", line 311, in check_call\n    raise CalledProcessError(retcode, cmd)\nsubprocess.CalledProcessError: Command '['openstack', 'overcloud', 'node', 'extract', 'provisioned', '--stack', 'overcloud', '--roles-file', '/tmp/tmpctxjqepx', '--output', '/home/stack/overcloud-deploy/overcloud/tripleo-overcloud-baremetal-deployment.yaml', '--working-dir', '/home/stack/overcloud-deploy/overcloud', '--yes']' returned non-zero exit status 1.",
    "stderr_lines": [
        "INFO:undercloud:Exporting network from stack overcloud to /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-network-data.yaml",
        "INFO:undercloud:Exporting network virtual IPs from stack overcloud to /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-virtual-ips.yaml",
        "INFO:undercloud:Exporting provisioned nodes from stack overcloud to /home/stack/overcloud-deploy/overcloud/tripleo-overcloud-baremetal-deployment.yaml",
        "Unable to extract role networks. Network storage not found.",
        "Traceback (most recent call last):",
        "  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 443, in <module>",
        "    main()",
        "  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 383, in main",
        "    export_provisioned_nodes(heat, stack, stack_dir, args.cloud)",
        "  File \"/var/lib/tripleo-config/scripts/undercloud-upgrade-ephemeral-heat.py\", line 332, in export_provisioned_nodes",
        "    '--working-dir', stack_dir, '--yes'], env={'OS_CLOUD': cloud})",
        "  File \"/usr/lib64/python3.6/subprocess.py\", line 311, in check_call",
        "    raise CalledProcessError(retcode, cmd)",
        "subprocess.CalledProcessError: Command '['openstack', 'overcloud', 'node', 'extract', 'provisioned', '--stack', 'overcloud', '--roles-file', '/tmp/tmpctxjqepx', '--output', '/home/stack/overcloud-deploy/overcloud/tripleo-overcloud-baremetal-deployment.yaml', '--working-dir', '/home/stack/overcloud-deploy/overcloud', '--yes']' returned non-zero exit status 1."
    ],
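
The traceback above shows the upgrade script shelling out through subprocess.check_call, which raises CalledProcessError on any non-zero exit status; that is why a single failed export aborts the whole undercloud upgrade. The following is only a stand-alone illustration of that mechanism, not the upgrade script itself: run_export is a hypothetical helper, and a child Python process that exits with status 1 stands in for the openstack CLI call.

```python
import subprocess
import sys


def run_export(cmd):
    """Run cmd and report (ok, exit_status) instead of propagating the error.

    Hypothetical helper for illustration; the real script calls
    subprocess.check_call directly and lets the exception propagate.
    """
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as exc:
        # check_call raises as soon as the child exits with a non-zero status.
        return False, exc.returncode
    return True, 0


# Stand-in for "openstack overcloud node extract provisioned ...":
# a child process that simply exits with status 1.
failing_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]

ok, status = run_export(failing_cmd)
print(ok, status)
```

Whether to catch the error or let it abort the run is a design choice; the upgrade script, as the traceback shows, lets it propagate.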

Expected results:

The undercloud upgrade completes successfully.

Comment 3 Harald Jensås 2023-09-04 18:15:51 UTC
I've created a patch: https://review.opendev.org/c/openstack/python-tripleoclient/+/893684

    Fix node extract when no network-isolation
    
    When network isolation is not enabled by
    environments/network-isolation.yaml the networks in the default
    networks file is not created. But the RoleNetIpMap still have
    entries for each network all with the ctlplane IP address.
    
    This change re-factors the code to ignore RoleNetIpMap entries
    when the network is not found/does not exist.
    
    Change-Id: If35ff8a1b3d9b4e93e6151c9c28d8e9c707bc64a
    Closes-Bug: #2027580


That is, without network isolation the RoleNetIpMap looks like this, with every network entry aliasing the role's ctlplane address list:

Compute:
  ctlplane: &id001
  - 192.168.26.11
  internal_api: *id001
Controller:
  ctlplane: &id002
  - 192.168.25.21
  - 192.168.25.25
  - 192.168.25.28
  external: *id002
  internal_api: *id002
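
The approach described in the commit message (ignore RoleNetIpMap entries whose network does not exist) can be sketched roughly as below. This is a simplified illustration and not the code from the patch: filter_role_net_ip_map and existing_networks are hypothetical names, and the sketch assumes that ctlplane always exists and that the set of created networks is already known to the caller.

```python
def filter_role_net_ip_map(role_net_ip_map, existing_networks):
    """Drop per-role entries that refer to networks which were never created.

    role_net_ip_map: {role_name: {network_name: [ip, ...]}}
    existing_networks: iterable of network names that actually exist.

    Hypothetical helper; assumes "ctlplane" is always present.
    """
    known = set(existing_networks) | {"ctlplane"}
    filtered = {}
    for role, net_ips in role_net_ip_map.items():
        # Keep only the networks that exist; copy the lists so shared
        # (aliased) lists in the input are not shared in the output.
        filtered[role] = {
            net: list(ips) for net, ips in net_ips.items() if net in known
        }
    return filtered


# Data shaped like the RoleNetIpMap above: every network entry of a role
# points at the same ctlplane address list.
compute_ips = ["192.168.26.11"]
controller_ips = ["192.168.25.21", "192.168.25.25", "192.168.25.28"]
role_net_ip_map = {
    "Compute": {"ctlplane": compute_ips, "internal_api": compute_ips},
    "Controller": {
        "ctlplane": controller_ips,
        "external": controller_ips,
        "internal_api": controller_ips,
    },
}

# Without network isolation, none of the isolated networks were created.
result = filter_role_net_ip_map(role_net_ip_map, existing_networks=[])
print(result)
```

With no isolated networks present, only the ctlplane entry survives for each role, so a later lookup of a network such as storage is never attempted.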

Comment 10 James E. LaBarre 2023-12-04 22:32:22 UTC
Confirmed that the edits are in place in python3-tripleoclient-16.5.1-17.1.20230927000827.f3599d0.el9ost.noarch.rpm from the latest compose, RHOS-17.1-RHEL-9-20231122.n.1.

This compose ran phases 1, 2 & 3 with no errors in the package.

Comment 21 errata-xmlrpc 2024-01-16 14:30:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 17.1.2 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:0209