Description of problem:

Deployment fails when custom network names are used, for example:

  StorageMgmtNetName: customstgmgmtname
  InternalApiNetName: custominternalapiname

Version-Release number of selected component (if applicable):
2018-05-04.1

How reproducible:
100%

Steps to Reproduce:
1. Create a yaml file with the following lines:

   parameter_defaults:
     StorageMgmtNetName: customstgmgmtname
     InternalApiNetName: custominternalapiname

2. Use this yaml with the overcloud_deploy.sh script.

Actual results:
Deployment fails with a very long list of errors:

  overcloud.AllNodesDeploySteps.ControllerDeployment_Step1.1:
    resource_type: OS::Heat::StructuredDeployment
    physical_resource_id: 812f55a2-4e62-4330-b444-03f8babb83b1
    status: CREATE_FAILED
    status_reason: |
      Error: resources[1]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 2
    deploy_stdout: |
      <...>
      "2018-05-07 19:03:53,339 ERROR: 22970 -- ERROR configuring haproxy"

Expected results:
CREATE_COMPLETE

Additional info:
The initial bug was opened for RHOS10.
*** This bug has been marked as a duplicate of bug 1564654 ***
Sorry, it's not a dupe.

  "Error: /Stage[main]/Haproxy/Haproxy::Instance[haproxy]/Haproxy::Config[haproxy]/Concat[/etc/haproxy/haproxy.cfg]/File[/etc/haproxy/haproxy.cfg]/content:
  change from {md5}1f337186b0e1ba5ee82760cb437fb810 to {md5}685eb1bcc74004d40c46ef85d025553e failed:
  Execution of '/usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg20180507-12-r8yetb -c' returned 1:
  [ALERT] 126/190309 (349) : parsing [/etc/haproxy/haproxy.cfg20180507-12-r8yetb:37] : 'server 172.17.1.17:8042' : invalid address: 'check' in 'check'"
Alex and I have dug into this a bit. Before composable networks, we could rename a network by changing these two parameters, but in OSP13 it requires more steps that need to be documented, since we are now using jinja templates for network configuration:

https://github.com/openstack/tripleo-heat-templates/blob/e64c10b9c13188f37e6f122475fe02280eaa6686/puppet/all-nodes-config.j2.yaml#L180

This can be considered a backward-incompatible change in the way network names are managed, but as long as it is well documented and tested by QE, I think it is acceptable. Example of a file that also needs to be updated:

https://github.com/openstack/tripleo-heat-templates/blob/e64c10b9c13188f37e6f122475fe02280eaa6686/network_data.yaml#L67

In any case, we decided to assign this bug to DFG:HardProv and start a documentation change describing how we expect our users to rename a network (listing the files that need to be updated).
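For reference, a network entry in network_data.yaml looks roughly like this (a sketch of the template format; the subnet and pool values are illustrative, not taken from this deployment):

```yaml
# One entry per composable network; renaming touches name_lower
- name: InternalApi
  name_lower: internal_api          # the value the jinja templates key off
  vip: true
  ip_subnet: '172.16.2.0/24'
  allocation_pools: [{'start': '172.16.2.4', 'end': '172.16.2.250'}]
```

The jinja-rendered templates derive hieradata keys, VIP names, and ServiceNetMap entries from name_lower, which is why changing only the *NetName parameters is no longer sufficient.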
There may be a couple of issues. If it's the setting of hieradata in all_nodes_config.j2.yaml per comment 6, Harald has a patch upstream for that: https://review.openstack.org/#/c/569544/. However, those hardcoded hieradata values existed even in OSP-10 (https://github.com/openstack/tripleo-heat-templates/blob/stable/newton/puppet/all-nodes-config.yaml#L176), so I fail to see how this test would have worked then or since. It's better with the jinja changes, as all the hardcoded hieradata can be removed except the InternalApi entry needed for Contrail. If it's that the names in network_data.yaml need to be updated, that can be handled via docs. Note that the original fix was for https://bugs.launchpad.net/tripleo/+bug/1651541, which was prior to the jinja changes for composable networks. Attempting to duplicate this.
Alex - can you put the sosreports in a different place that is accessible? As before, I'm having issues pulling these down from Google Drive; I get multiple redirect errors. Thanks.
OK, so I discussed this with Bob, and I think there are several problems:

1. Since implementing composable networks, there are two ways to rename a network: network_data.yaml and the *NetName parameters. This presents a problem because it's not always easy to know which one to use (I guess the *NetName parameters should always take precedence, but it's fairly confusing, and IMHO we should consider deprecating/removing these parameters).

2. The ServiceNetMap and NetIpMap were updated in an attempt to provide backwards compatibility for the *NetName parameters (ref https://review.openstack.org/#/q/topic:bug/1651541), but https://review.openstack.org/#/c/531036/ added the hieradata for the per-network VIPs using only the network_data names. I suspect this is why the haproxy config is failing.

I think we have a workaround, which is to update network_data.yaml instead of *NetName (one disadvantage is that I don't think this will work via the UI). In terms of a fix, if we can prove it's the VIP hieradata that is the problem, we could update those key names using an approach similar to my previous patches for ServiceNetMap and NetIpMap. But I do think we should consider deprecating *NetName for Rocky, as the logic required to maintain them is pretty ugly, and it's confusing to have two interfaces that do the same thing.
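To illustrate the two overlapping interfaces described above (the custom name is the one from the bug reproducer; the file placement is a sketch, not a prescribed layout):

```yaml
# Interface 1: the *NetName parameter, in an environment file passed with -e.
# This is the pre-composable-networks mechanism that now breaks the VIP hieradata.
parameter_defaults:
  InternalApiNetName: custominternalapiname

# Interface 2: network_data.yaml, passed with -n, changing name_lower
# on the network entry. This is what the jinja templates actually consume.
- name: InternalApi
  name_lower: custominternalapiname
```

Interface 2 is the workaround verified later in this bug.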
Thanks, Steven. I was able to verify the workaround, at least for the internal_api custom naming, using network_data.yaml as follows:

1. I copied network_data.yaml locally and updated name_lower for InternalApi:

     name_lower: custominternalapiname

   then used that network_data.yaml in the deployment.

2. I made a local parameter for ServiceNetMap in network_environment.yaml and changed all uses of "internal_api" to "custominternalapiname".

Note - I believe ServiceNetMap had to be manually updated in this case because name_lower is the new custom name, and this replacement code no longer matches it: https://github.com/openstack/tripleo-heat-templates/blob/master/network/service_net_map.j2.yaml#L130

This workaround worked fine, and the deployment completed with the custom InternalApi name.

There seems to be an issue using a custom StorageMgmt name, because the VipPort name is hardcoded for StorageMgmt and can't be substituted like the other networks: https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.j2.yaml#L803. I will add a patch for this.

Currently I'm trying to get the deployment to work with the *NetName substitution by changing access to "name_lower" to instead access "{get_param: {{network.name}}NetName}]}". However, the jinja templating makes extensive use of "network_lower", so I'm not sure supporting both methods in OSP-13z will be possible. If not, I'd recommend we document the name_lower substitution in network_data.yaml, fix the StorageMgmt VipPort issue as above, and also fix the ServiceNetMap substitution to handle an updated name_lower so we don't need a local copy of ServiceNetMap.
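A minimal sketch of the ServiceNetMap override described in step 2. The two service keys shown are a hypothetical subset for illustration; a real override must list every *Network key in ServiceNetMap that previously mapped to internal_api:

```yaml
parameter_defaults:
  ServiceNetMap:
    KeystoneAdminApiNetwork: custominternalapiname   # was internal_api
    NovaApiNetwork: custominternalapiname            # was internal_api
    # ...and so on for each remaining service key that used internal_api
```

The follow-on fix to the service_net_map.j2.yaml substitution (mentioned above) is intended to make this manual override unnecessary.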
Created attachment 1449234 [details] network_data.yaml file for workaround This must be included in deployment using "-n network_data.yaml"
Created attachment 1449235 [details] network-environment.yaml with changes to ServiceNetMap
I've verified the workaround using network_data.yaml and included the files that can be used. The network_data.yaml has the custom names in name_lower for the networks used in the bug description - StorageMgmt and InternalApi. This is the recommended way of implementing custom network names going forward. The network-environment.yaml sets up ServiceNetMap to use these custom names for the particular services. A follow-on fix in 13z release will change it so that modifications to ServiceNetMap are not needed.
Inadvertently closed; reopening to leave open for additional fixes in OSP-13z.
Thanks, Bob. Adding doc-text / release-note flags to address the following concern from Dan's side: "We should probably include a note in the upgrade instructions regardless, just in case anyone does have these parameters in place. I suspect that any attempt to upgrade without making changes to network_data.yaml would result in an error, rather than blowing up the whole stack, but it would still require manual effort to fix."
With this fix, in order to change the network name being used, network_data.yaml must be edited and included in the deployment. For example, to change the name of the InternalApi network, make these changes in network_data.yaml:

  - name: InternalApi                       <- no change to this line
    name_lower: internal_custom             <- new name of network
    service_net_map_replace: internal_api   <- new line; must match the old name_lower

After deployment, this would result in:

(undercloud) [stack@host01 ~]$ openstack network list -c Name
+------------------+
| Name             |
+------------------+
| storage          |
| management       |
| tenant           |
| internal_custom  |   <- new name
| ctlplane         |
| storage_mgmt     |
| external         |
+------------------+

Note also, the workaround as described in comment 14 will have the same effect. It can be used prior to this fix being available (this fix will be in OSP-13z2).
Deployed OSP13 puddle: 2018-08-03.3

This is verification of the ability to change to a custom network name prior to deploying the overcloud. This is not a change on an active deployment.

Environment:
openstack-tripleo-heat-templates-8.0.4-10.el7ost.noarch

1) Updated network_data.yaml (there is other data on all the networks; this shows one change). Example from /usr/share/openstack-tripleo-heat-templates/network_data.yaml:

  - name: StorageMgmt
    name_lower: custom_storage_mgmt
    service_net_map_replace: storage_mgmt

2) Add a "-n /home/stack/network_data.yaml" line to your deployment configuration. Example below:

(undercloud) [stack@undercloud-0 ~]$ cat overcloud_deploy.sh
#!/bin/bash
openstack overcloud deploy \
  --timeout 100 \
  --templates /usr/share/openstack-tripleo-heat-templates \
  --stack overcloud \
  --libvirt-type kvm \
  --ntp-server clock.redhat.com \
  -e /home/stack/virt/internal.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -n /home/stack/network_data.yaml \
  -e /home/stack/virt/network/network-environment.yaml \
  -e /home/stack/virt/hostnames.yml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
  -e /home/stack/virt/nodes_data.yaml \
  -e /home/stack/virt/extra_templates.yaml \
  -e /home/stack/virt/docker-images.yaml \
  --log-file overcloud_deployment_68.log

2018-08-09 21:46:27Z [overcloud]: CREATE_COMPLETE  Stack CREATE completed successfully

 Stack overcloud CREATE_COMPLETE

Host 10.0.0.107 not found in /home/stack/.ssh/known_hosts
Started Mistral Workflow tripleo.deployment.v1.get_horizon_url.
Execution ID: 88add242-5bae-4210-834d-1dba3b52fe2c
Overcloud Endpoint: http://10.0.0.107:5000/
Overcloud Horizon Dashboard URL: http://10.0.0.107:80/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed

(undercloud) [stack@undercloud-0 ~]$ cat network_data.yaml

(undercloud) [stack@undercloud-0 ~]$ openstack server list
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| ID                                   | Name         | Status | Networks               | Image          | Flavor     |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+
| 0ade3f2c-e7ca-4f9d-ac3a-6c7c81c56804 | controller-0 | ACTIVE | ctlplane=192.168.24.9  | overcloud-full | controller |
| 6fa7a118-136d-4c5d-99b6-01d9b9e2b1b9 | controller-2 | ACTIVE | ctlplane=192.168.24.14 | overcloud-full | controller |
| c91bb0a9-06b7-4e40-9bfa-cf960d852e94 | ceph-0       | ACTIVE | ctlplane=192.168.24.10 | overcloud-full | ceph       |
| 23561ba4-7cb3-4479-bf95-3d888db05ad7 | controller-1 | ACTIVE | ctlplane=192.168.24.8  | overcloud-full | controller |
| 9a4fa664-f1cf-4435-9df1-3a2ca2291502 | compute-2    | ACTIVE | ctlplane=192.168.24.17 | overcloud-full | compute    |
| c61e94b1-2a0a-4562-ac58-098d3d430db1 | ceph-2       | ACTIVE | ctlplane=192.168.24.15 | overcloud-full | ceph       |
| 403ade27-428e-4564-b07a-fdafe59a29e6 | ceph-1       | ACTIVE | ctlplane=192.168.24.18 | overcloud-full | ceph       |
| 9cdee3bd-2c62-485c-94b4-c06c8ee3a3c7 | compute-1    | ACTIVE | ctlplane=192.168.24.12 | overcloud-full | compute    |
| 8756510a-df95-47ac-a662-b63fed394e99 | compute-0    | ACTIVE | ctlplane=192.168.24.6  | overcloud-full | compute    |
+--------------------------------------+--------------+--------+------------------------+----------------+------------+

(undercloud) [stack@undercloud-0 ~]$ openstack network list -c Name
+---------------------+
| Name                |
+---------------------+
| custom_storage_mgmt |   <----- successfully changed
| external            |
| management          |
| internal_api        |
| ctlplane            |
| storage             |
| tenant              |
+---------------------+
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:2574