Bug 1886248

Summary: OSP16.1 TripleO Heat stack will not generate IP addresses when NetworkDeploymentActions has ['UPDATE'] enabled
Product: Red Hat OpenStack Reporter: John Apple II <jappleii>
Component: openstack-tripleo Assignee: James Slagle <jslagle>
Status: CLOSED NOTABUG QA Contact: Arik Chernetsky <achernet>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 16.1 (Train) CC: mburns
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-08 06:03:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description: Collection of attachments explained in the Description
Flags: none

Description John Apple II 2020-10-08 03:40:56 UTC
Created attachment 1719880
Collection of attachments explained in the Description

Description of problem:

I have deployed an OSP16.1 overcloud with OVN+DVR+HA. The Compute nodes were deployed with the br-ex bond, and our NetworkIsolation configuration specified that an External IP should be enabled for the overcloud. However, when I redeploy to update the addresses field in the compute.yaml network-config, the stack update fails.




Version-Release number of selected component (if applicable):

$ cat /etc/rhosp-release 
Red Hat OpenStack Platform release 16.1.1 GA (Train)

[stack@director sc-tripleo-templates]$ rpm -qa | grep tripleo
openstack-tripleo-image-elements-10.6.2-0.20200528043425.7dc0fa1.el8ost.noarch
python3-tripleoclient-12.3.2-0.20200615103427.6f877f6.el8ost.noarch
python3-tripleo-common-11.4.1-2.20200826191216.f542f91.el8ost.noarch
openstack-tripleo-puppet-elements-11.2.2-0.20200527003426.226ce95.el8ost.noarch
ansible-tripleo-ipsec-9.2.1-0.20200311073016.0c8693c.el8ost.noarch
python3-tripleoclient-heat-installer-12.3.2-0.20200615103427.6f877f6.el8ost.noarch
openstack-tripleo-heat-templates-11.3.2-1.20200828163406.94ba270.el8ost.noarch
tripleo-ansible-0.5.1-1.20200826223430.0628bbc.el8ost.noarch
openstack-tripleo-common-11.4.1-2.20200826191216.f542f91.el8ost.noarch
puppet-tripleo-11.5.0-0.20200616033428.8ff1c6a.el8ost.noarch
openstack-tripleo-validations-11.3.2-0.20200611115253.08f469d.el8ost.noarch
openstack-tripleo-common-containers-11.4.1-2.20200826191216.f542f91.el8ost.noarch
openstack-tripleo-common-container-base-11.4.1-2.20200826191216.f542f91.el8ost.noarch
ansible-tripleo-ipa-0.2.1-0.20200611104546.c22fc8d.el8ost.noarch
ansible-role-tripleo-modify-image-1.2.1-0.20200527233426.bc21900.el8ost.noarch
openstack-tripleo-common-devtools-11.4.1-2.20200826191216.f542f91.el8ost.noarch
[stack@director sc-tripleo-templates]$






How reproducible:

1. Create a Network Isolation File as per network-isolation.yaml

2. Configure the OVN HA DVR for networking as per ovn-ha-dvr.yaml

3. Configure the br-ex bond for the Compute Nodes, but leave out the ExternalIP on br-ex's bond - see compute.yaml

4. Deploy the overcloud (succeeds)

5. Check that br-ex exists but has no IP (so DVR cannot function properly on the overcloud as a gateway chassis)

6. Update the templates so that the network configuration is re-applied on stack update

$ grep UPDATE environments/00-node-info.yaml 
  NetworkDeploymentActions: ['CREATE','UPDATE']

7. Configure the External interface to use the external IP (see compute-fixed.yaml).
The updated section is as follows:
                addresses:
                - ip_netmask:
                    get_param: ExternalIpSubnet
                dns_servers:
                  get_param: DnsServers
                routes:
                  list_concat_unique:
                    - get_param: ExternalInterfaceRoutes
                    - - default: true
                        next_hop:
                          get_param: ExternalInterfaceDefaultRoute

8. Run a stack update.

Stack update fails (see Actual Results)



Actual results:

See deploymentactualresults.txt
The network port listing is the same both before and after the update. ComputeIpListMap in the stack shows no updates for the new IPs in the network-config (even though NetworkIsolation had ports configured at the initial deploy).

os-net-config in the Computes shows the updates, but the ip_address field in br-ex is "" in the /etc/os-net-config/config.json file.

The port list both before and after the deploy (up to the failure) shows:
---------
(undercloud) [stack@director environments]$ openstack port list --network external
+--------------------------------------+------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
| ID                                   | Name                         | MAC Address       | Fixed IP Addresses                                                             | Status |
+--------------------------------------+------------------------------+-------------------+--------------------------------------------------------------------------------+--------+
| 32ea2bf4-1a0a-427d-9679-9489a8a16ebf | orange-pcmk-0_External       | fa:16:3e:4c:74:98 | ip_address='150.238.245.144', subnet_id='47652431-30d1-4e7b-9fe9-b2bf8c978726' | DOWN   |
| 49bd9953-d812-4741-bd8e-2c03aa587985 | orange-controller-0_External | fa:16:3e:29:e0:07 | ip_address='150.238.245.162', subnet_id='47652431-30d1-4e7b-9fe9-b2bf8c978726' | DOWN   |
| 57bb7b4e-63c9-4538-a3c4-34a502fe00aa | public_virtual_ip            | fa:16:3e:24:ba:93 | ip_address='150.238.245.130', subnet_id='47652431-30d1-4e7b-9fe9-b2bf8c978726' | DOWN   |
| 6951b2c3-1334-42e0-bc7c-0e6271d41f69 | orange-controller-1_External | fa:16:3e:6d:b1:30 | ip_address='150.238.245.135', subnet_id='47652431-30d1-4e7b-9fe9-b2bf8c978726' | DOWN   |
| 95d23967-c1a8-4718-b23e-583aafee0d45 | orange-pcmk-2_External       | fa:16:3e:04:36:2c | ip_address='150.238.245.149', subnet_id='47652431-30d1-4e7b-9fe9-b2bf8c978726' | DOWN   |
| a6bd1ad7-e3ad-4065-963e-614f41708516 | orange-controller-2_External | fa:16:3e:8f:1c:8c | ip_address='150.238.245.159', subnet_id='47652431-30d1-4e7b-9fe9-b2bf8c978726' | DOWN   |
| bd01f202-bfb6-460e-a1aa-6e7eb21be211 | orange-pcmk-1_External       | fa:16:3e:84:c7:0e | ip_address='150.238.245.167', subnet_id='47652431-30d1-4e7b-9fe9-b2bf8c978726' | DOWN   |
+--------------------------------------+------------------------------+-------------------+--------------------------------------------------------------------------------+--------+




Expected results:


The Heat stack would update the ComputeIpListMap with the external IPs, and then the ports would be created and show up in openstack port list --network external.

os-net-config on the Computes would show the updates, and the ip_address field for br-ex would hold an appropriate address from the external network in the /etc/os-net-config/config.json file.

br-ex would have its IP address updated.




Additional info:

$ cat /etc/rhosp-release 
Red Hat OpenStack Platform release 16.1.1 GA (Train)

Comment 1 John Apple II 2020-10-08 06:02:26 UTC
Some CEE engineers contacted me on internal IRC about this BZ after I raised it.  The missing part was the "External" network in the Compute role's network list in roles_data.yaml.

Once I updated this entry, the Network updated as expected, and the additional network ports showed up and the cloud was deployed correctly with the updated network ports.
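
For anyone hitting the same symptom, the change described above can be sketched roughly as follows. This is an illustrative fragment only, not the reporter's actual roles_data.yaml: the other networks listed and the subnet names (internal_api_subnet, external_subnet, and so on) are assumptions based on the default names in the stock Train templates and may differ in a customized network_data.yaml.

```yaml
# roles_data.yaml (illustrative fragment; names below are assumptions)
- name: Compute
  networks:
    # Networks assumed to be present already on the Compute role.
    InternalApi:
      subnet: internal_api_subnet
    Tenant:
      subnet: tenant_subnet
    Storage:
      subnet: storage_subnet
    # Added entry. Without it, the role presumably gets no External port,
    # which would leave ExternalIpSubnet empty in the rendered network-config.
    External:
      subnet: external_subnet   # assumed default subnet name
```

After a change like this, a stack update that includes the modified roles file should create the <node>_External ports, which can be confirmed with the same openstack port list --network external command used in the description.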

Comment 2 John Apple II 2020-10-08 06:03:45 UTC
I'll self-close this one as the solution - while not obvious to me coming from 13 - is there.  I'll pursue seeing how it may be possible to clarify this in documentation for others.