Bug 1488837

Summary: /usr/share/openstack-tripleo-heat-templates/network/management.yaml file doesn't exist
Product: Red Hat OpenStack
Component: openstack-tripleo-heat-templates
Version: 12.0 (Pike)
Status: CLOSED ERRATA
Severity: urgent
Priority: medium
Reporter: Marius Cornea <mcornea>
Assignee: Steven Hardy <shardy>
QA Contact: Marius Cornea <mcornea>
CC: agurenko, aschultz, bnemec, dbecker, dyasny, mbracho, mbultel, mburns, mcornea, morazi, rhel-osp-director-maint, sasha, sclewis, shardy, tvignaud
Keywords: Triaged
Target Milestone: rc
Target Release: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-tripleo-heat-templates-7.0.3-0.20171023134947.8da5e1f.el7ost
Last Closed: 2017-12-13 22:05:43 UTC
Type: Bug
Bug Depends On: 1491452
Attachments: roles_data

Description Marius Cornea 2017-09-06 10:06:34 UTC
Description of problem:

In /usr/share/openstack-tripleo-heat-templates/environments/network-management.yaml we have:

resource_registry:
  OS::TripleO::Network::Management: ../network/management.yaml

  # Port assignments for the controller role
  OS::TripleO::Controller::Ports::ManagementPort: ../network/ports/management.yaml

  # Port assignments for the compute role
  OS::TripleO::Compute::Ports::ManagementPort: ../network/ports/management.yaml

  # Port assignments for the ceph storage role
  OS::TripleO::CephStorage::Ports::ManagementPort: ../network/ports/management.yaml

  # Port assignments for the swift storage role
  OS::TripleO::SwiftStorage::Ports::ManagementPort: ../network/ports/management.yaml

  # Port assignments for the block storage role
  OS::TripleO::BlockStorage::Ports::ManagementPort: ../network/ports/management.yaml

../network/management.yaml and ../network/ports/management.yaml files do not exist:

(undercloud) [stack@undercloud-0 ~]$ ls /usr/share/openstack-tripleo-heat-templates/network/ports/management.yaml
ls: cannot access /usr/share/openstack-tripleo-heat-templates/network/ports/management.yaml: No such file or directory
(undercloud) [stack@undercloud-0 ~]$ ls /usr/share/openstack-tripleo-heat-templates/network/management.yaml
ls: cannot access /usr/share/openstack-tripleo-heat-templates/network/management.yaml: No such file or directory
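
The same check as the `ls` commands above can be sketched in Python for every mapping in the environment file at once. This is only an illustration: the function name `missing_templates` and the regex-based line matching are assumptions made for this example (a real tool would use a YAML parser), and the sample text is a two-entry excerpt of the file quoted above.

```python
import os
import re

# Two-entry excerpt of environments/network-management.yaml as quoted above.
ENV_TEXT = """\
resource_registry:
  OS::TripleO::Network::Management: ../network/management.yaml

  # Port assignments for the controller role
  OS::TripleO::Controller::Ports::ManagementPort: ../network/ports/management.yaml
"""

# Simplified matcher for indented "<resource type>: <path>.yaml" lines.
# Assumption: good enough for this flat file, not a general YAML parser.
_MAPPING = re.compile(r"^\s+(OS::[\w:]+):\s+(\S+\.yaml)\s*$")


def missing_templates(env_text, env_dir):
    """Return {resource_type: resolved_path} for mappings whose file is absent.

    Relative paths are resolved against env_dir, the directory that holds
    the environment file.
    """
    missing = {}
    for line in env_text.splitlines():
        match = _MAPPING.match(line)
        if not match:
            continue  # skip comments, blank lines and the section header
        resource, rel_path = match.groups()
        resolved = os.path.normpath(os.path.join(env_dir, rel_path))
        if not os.path.exists(resolved):
            missing[resource] = resolved
    return missing


if __name__ == "__main__":
    env_dir = "/usr/share/openstack-tripleo-heat-templates/environments"
    for resource, path in missing_templates(ENV_TEXT, env_dir).items():
        print(f"{resource}: {path} not found")
```

Note that this only inspects the installed files on disk; it says nothing about templates that are rendered at deployment time.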


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-7.0.0-0.20170901051303.0rc1.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy environment with management network enabled

Actual results:
There are missing files related to the management network activation.

Expected results:
The files should be present to enable the management network during deployment and upgrade.

Additional info:

Comment 1 Alex Schultz 2017-09-06 21:08:00 UTC
It looks as if this file is deprecated and the management.yaml was removed as part of the conversion to composable networks. https://review.openstack.org/#/c/492218/  

Since this appears to have been converted to a j2.yaml file, it may be automatically generated at deployment time so the lack of existence on the file system may not actually be a problem.  Steve can you confirm?

Comment 2 Steven Hardy 2017-09-07 12:52:26 UTC
Alex is correct, the file is now rendered on deployment, but tripleoclient downloads the rendered management.yaml to a tempdir, so using the environment file should still work as before.

Can you please confirm if using it for overcloud deploy is actually breaking, and if so please can you share the full deploy command and any error, thanks!

Comment 3 Marius Cornea 2017-09-07 13:04:03 UTC
(In reply to Steven Hardy from comment #2)
> Alex is correct, the file is now rendered on deployment, but tripleoclient
> downloads the rendered management.yaml to a tempdir, so using the
> environment file should still work as before.
> 
> Can you please confirm if using it for overcloud deploy is actually
> breaking, and if so please can you share the full deploy command and any
> error, thanks!

I've actually hit this error during the upgrade and the deploy command failed quickly complaining that it couldn't find the management related files. I don't have the env anymore but this is the original deploy command used for OSP11:

openstack overcloud deploy --templates $THT \
-r ~/openstack_deployment/roles/roles_data.yaml \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e ~/openstack_deployment/environments/nodes.yaml \
-e ~/openstack_deployment/environments/network-environment.yaml \
-e ~/openstack_deployment/environments/disk-layout.yaml \
-e ~/openstack_deployment/environments/scheduler_hints_env.yaml \
-e ~/openstack_deployment/environments/ips-from-pool-all.yaml \
-e ~/openstack_deployment/environments/neutron-settings.yaml \
-e ~/openstack_deployment/environments/nfs-storage.yaml \
--log-file overcloud_deployment.log &> overcloud_install.log

For the upgrade, the additional environment files used for the upgrade and docker transition are appended to this command.

How should the management network be activated for standard deployments (using the default roles_data provided in the rpm) and for composable roles deployments that use a custom roles_data?

Comment 4 Ben Nemec 2017-09-07 15:27:20 UTC
What does the roles_data.yaml file look like?  If the management network isn't enabled there I don't believe the template will be generated.  This may be something we need to release note for upgrades since I believe it's new for this release.

Comment 5 Marius Cornea 2017-09-07 20:34:54 UTC
Created attachment 1323453 [details]
roles_data

I'm attaching the custom roles data file used during my test. It doesn't include a reference to the management network.

Nevertheless I see this as a more general issue. How does it work when the user uses the default roles_data? $THT/environments/network-management.yaml which is the current way of activating the management network points to missing files. Should it still be used during deployment/removed during upgrade?

Comment 6 Ben Nemec 2017-09-07 21:49:04 UTC
Yeah, the Management network isn't assigned to any roles so it won't be generated.

Although come to think of it I don't think network-management.yaml is needed anymore.  It will be added to network-isolation.yaml automatically if the Management network is enabled.  For example, the controller role has:

networks:
    - External
    - InternalApi
    - Storage
    - StorageMgmt
    - Tenant

It needs to also have:
    - Management

for the management network to be enabled.  Same for any roles you want connected to the management network.
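
A minimal sketch of that change, assuming the roles data has already been loaded into a list of dicts with `name` and `networks` keys. The helper `add_network_to_roles` is hypothetical, and the Compute role's network list below is a placeholder rather than the shipped default; only the Controller list comes from this comment.

```python
import copy


def add_network_to_roles(roles, network="Management", role_names=None):
    """Return a copy of roles with `network` appended to each role's networks.

    If role_names is given, only roles with those names are changed.
    Roles that already list the network are left as they are.
    """
    updated = copy.deepcopy(roles)
    for role in updated:
        if role_names is not None and role.get("name") not in role_names:
            continue
        networks = role.setdefault("networks", [])
        if network not in networks:
            networks.append(network)
    return updated


# Controller networks as listed above; the Compute entry is a placeholder.
roles = [
    {"name": "Controller",
     "networks": ["External", "InternalApi", "Storage", "StorageMgmt", "Tenant"]},
    {"name": "Compute", "networks": ["InternalApi", "Tenant", "Storage"]},
]

if __name__ == "__main__":
    for role in add_network_to_roles(roles, role_names=["Controller"]):
        print(role["name"], role["networks"])
```

Returning a modified copy keeps the original roles data intact, which makes it easy to compare before and after.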

It looks like we do have a release note for this: https://docs.openstack.org/releasenotes/tripleo-heat-templates/pike.html#deprecation-notes but deprecation is actually the wrong section for it since the file no longer exists.  It should probably be in upgrades since it's a mandatory change in this cycle.

Comment 7 Marius Cornea 2017-09-08 10:35:27 UTC
I had a chat with shardy about this and I think we need to cover a couple of aspects related to the management network and the new composable networks feature:

Deployment
==========
1. How to enable the management network on deployments using default roles_data.yaml provided in /usr/share/openstack-tripleo-heat-templates/roles_data.yaml:

make a copy of network_data.yaml, and change enabled: false to enabled: true, then pass it to the deploy command via -n network_data.yaml

2. How to enable the management network on deployments using custom roles_data.yaml:

add Management network to the networks attribute assigned to each role. 

How does this work with the role_data generated from the preset roles in tht? In this case is the custom network_data.yaml with enabled management network still required? 
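
Step 1 (copy network_data.yaml and flip `enabled`) could be scripted roughly as follows. This is a sketch under assumptions: the sample text is a hypothetical, heavily trimmed excerpt (the real file has more keys per network), the function name `enable_network` is made up for this example, and the line-based rewrite stands in for a proper YAML round trip.

```python
import re

# Hypothetical, trimmed excerpt of a network_data.yaml copy.
SAMPLE = """\
- name: Storage
  enabled: true
  name_lower: storage
- name: Management
  enabled: false
  name_lower: management
"""


def enable_network(text, network="Management"):
    """Return text with 'enabled: false' set to true for one network entry.

    Entries are assumed to start with a top-level '- name: <Name>' line;
    only lines belonging to the matching entry are touched.
    """
    out = []
    in_target = False
    for line in text.splitlines(keepends=True):
        entry = re.match(r"^- name:\s*(\S+)", line)
        if entry:
            in_target = entry.group(1) == network
        elif in_target and re.match(r"^\s+enabled:\s*false\b", line):
            line = re.sub(r"false", "true", line, count=1)
        out.append(line)
    return "".join(out)


if __name__ == "__main__":
    print(enable_network(SAMPLE), end="")
```

The result would be written to a private copy and passed to the deploy command with -n, leaving the packaged file untouched.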


Upgrade
=======
The same changes are required for the upgrade, and in addition we might need to remove the $THT/environments/network-management.yaml environment file. I need to do further testing to confirm that it's failing because of the missing files.

Comment 8 Steven Hardy 2017-09-12 15:50:58 UTC
> Yeah, the Management network isn't assigned to any roles so it won't be generated.

OK, so I did some testing and that isn't quite the case. For networks that exist in network_data.yaml but are disabled, we do render the resources (e.g. for the network and ports), but they're mapped to a noop implementation, which is basically the same behaviour as before we enabled composable networks. For example:

(undercloud) [stack@undercloud ~]$ openstack overcloud deploy --templates tripleo-heat-templates -e tripleo-heat-templates/environments/network-isolation.yaml -e tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e network-environment.yaml ^C

(undercloud) [stack@undercloud ~]$ openstack stack environment show overcloud | grep Management
  OS::TripleO::BlockStorage::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/noop.yaml
  OS::TripleO::CephStorage::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/noop.yaml
  OS::TripleO::Compute::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/noop.yaml
  OS::TripleO::Controller::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/noop.yaml
  OS::TripleO::Network::Management: OS::Heat::None
  OS::TripleO::ObjectStorage::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/noop.yaml
  OS::TripleO::SwiftStorage::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/noop.yaml

To deploy with the management network enabled, you can still use the -e tripleo-heat-templates/environments/network-management.yaml option, but you must also copy network_data.yaml and set enabled: true for the management network (otherwise, as reported, the files aren't rendered, so the hard-coded paths in the environment file fail to resolve).

So if you do that and deploy like:

(undercloud) [stack@undercloud ~]$ openstack overcloud deploy --templates tripleo-heat-templates -e tripleo-heat-templates/environments/network-isolation.yaml -e tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e network-environment.yaml -e tripleo-heat-templates/environments/network-management.yaml -n network_data.yaml

$ openstack stack environment show overcloud | grep Management
  OS::TripleO::BlockStorage::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/management.yaml
  OS::TripleO::CephStorage::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/management.yaml
  OS::TripleO::Compute::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/management.yaml
  OS::TripleO::Controller::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/management.yaml
  OS::TripleO::Network::Management: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/management.yaml
  OS::TripleO::ObjectStorage::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/noop.yaml
  OS::TripleO::SwiftStorage::Ports::ManagementPort: http://192.168.24.1:8080/v1/AUTH_3691e8e1bb6f462fa6053f0caa326614/overcloud/network/ports/management.yaml

We can see that the resources aren't noop'd anymore, and the network is created as expected:

(undercloud) [stack@undercloud ~]$ openstack network list
+--------------------------------------+--------------+--------------------------------------+
| ID                                   | Name         | Subnets                              |
+--------------------------------------+--------------+--------------------------------------+
| 1fd52aa6-87c9-457c-9643-880f38789386 | ctlplane     | 3ceb06af-6844-4095-ae0d-71d3355ddeca |
| 48c0ad6d-810d-417e-ab84-0d50525208bc | internal_api | aef33da0-d0c3-43c5-bfc0-48c0f9ed615f |
| 62ffb962-08d0-49ce-b7d5-7ef4a8b41f5a | storage      | 3a55264f-8972-42ac-bb83-909a4e51a2e1 |
| 8df9e0a1-abb2-432f-8a41-4d1bd1e5617d | tenant       | 2d45aec5-ccc9-4773-9ac2-56dd048670ad |
| a32505ee-36aa-43f7-a23d-10ed3ed6d96c | storage_mgmt | 59fc0ebf-4c8f-4ff9-acb4-f2129d250bce |
| e2ba2bfc-e4a3-48b1-8881-1491e928cf09 | management   | bdb889af-5359-4efa-a757-bb8fb25b7fcb |
| e79cfdd8-53cc-410d-af70-aae611d20252 | external     | d62b8232-0e3c-4f4d-ae20-bdd90909d962 |
+--------------------------------------+--------------+--------------------------------------+
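
To compare the two `openstack stack environment show overcloud | grep Management` listings above programmatically, something like the following could be used. It is a sketch only: `classify_management_resources` is a hypothetical helper, and it treats a mapping as disabled when it points at `OS::Heat::None` or at a template named noop.yaml, which matches the output shown here but is an assumption beyond that.

```python
def classify_management_resources(lines):
    """Map each Management resource type to True (real template) or False (noop).

    Each input line is expected to look like '<resource type>: <target>',
    as printed by 'openstack stack environment show'.
    """
    status = {}
    for raw in lines:
        line = raw.strip()
        # The '::' separators in resource types are never followed by a
        # space, so the first ': ' splits the type from its target.
        resource, sep, target = line.partition(": ")
        if not sep or "Management" not in resource:
            continue
        is_noop = target == "OS::Heat::None" or target.endswith("/noop.yaml")
        status[resource] = not is_noop
    return status


if __name__ == "__main__":
    import sys

    # Usage: openstack stack environment show overcloud | python this_script.py
    for resource, enabled in sorted(classify_management_resources(sys.stdin).items()):
        print(f"{resource}: {'enabled' if enabled else 'noop'}")
```

Run against the second listing, this would still flag the ObjectStorage port as noop, since that mapping points at noop.yaml in the output above.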

So we have two options:

1. Document copying the network_data.yaml to enable the management network

2. Add some compatibility code to tripleoclient which automatically enables the management network in network_data.yaml when the network-management.yaml environment is selected (this would be removed after we remove the now deprecated environment file).

I think the second option is probably best, so I'll look into it, but IMHO this isn't a blocker, provided Marius can confirm the findings I outline above.
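
For illustration, option 2 might look roughly like this. This is not tripleoclient code: the function `apply_management_compat`, its arguments, and the in-memory shape of the network data (a list of dicts with `name` and `enabled` keys) are all assumptions made for this sketch.

```python
import copy
import os

# File name of the deprecated environment that triggers the shim.
LEGACY_ENV = "network-management.yaml"


def apply_management_compat(env_files, networks):
    """Return a copy of networks with Management enabled if the legacy
    network-management.yaml environment is among env_files.

    When the legacy environment is not selected, the data is returned
    unchanged (as a copy).
    """
    selected = any(os.path.basename(path) == LEGACY_ENV for path in env_files)
    updated = copy.deepcopy(networks)
    if selected:
        for net in updated:
            if net.get("name") == "Management":
                net["enabled"] = True
    return updated


if __name__ == "__main__":
    envs = [
        "tripleo-heat-templates/environments/network-isolation.yaml",
        "tripleo-heat-templates/environments/network-management.yaml",
    ]
    nets = [{"name": "Storage", "enabled": True},
            {"name": "Management", "enabled": False}]
    print(apply_management_compat(envs, nets))
```

Keying the behaviour on the environment file name keeps the shim easy to delete once the deprecated file is removed.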

Comment 9 Steven Hardy 2017-09-12 15:55:38 UTC
Note to clarify Ben's point that we need to update the roles_data.yaml - that isn't required because the network-management.yaml environment overrides these resource_registry mappings, where the ports would normally be noop for disabled networks, or those not included in the networks list in roles_data.yaml:

https://github.com/openstack/tripleo-heat-templates/blob/master/environments/network-isolation.j2.yaml#L34
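
The override works because environments are merged in the order they are passed, with later -e files taking precedence for the same resource_registry key. A simplified model of that merge is sketched below; `merge_resource_registries` is a hypothetical helper, and the two dicts are illustrative one-entry stand-ins for the rendered network-isolation environment and for network-management.yaml.

```python
def merge_resource_registries(*registries):
    """Merge resource_registry mappings in order; later entries win."""
    merged = {}
    for registry in registries:
        merged.update(registry)
    return merged


# Illustrative one-entry stand-ins for the two environments.
isolation = {
    "OS::TripleO::Controller::Ports::ManagementPort": "../network/ports/noop.yaml",
}
management = {
    "OS::TripleO::Controller::Ports::ManagementPort": "../network/ports/management.yaml",
}

if __name__ == "__main__":
    # Same order as on the deploy command line: isolation first, management second.
    print(merge_resource_registries(isolation, management))
```

Swapping the argument order would leave the port mapped to the noop template, which is why network-management.yaml is passed after network-isolation.yaml.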

So while it is true that we'll need to figure out the pattern for adding an optional network in the roles_data.yaml model, I think it's not a blocker, since the old environment can still be made to work. The exception is when custom roles are in use, which is another issue, and one that means we should move away from the network-management environment file. We could render it, but I'm not sure we should, since it's now deprecated?

Comment 10 Marius Cornea 2017-09-14 15:13:34 UTC
(In reply to Steven Hardy from comment #8)
> So we have two options:
> 
> 1. Document copying the network_data.yaml to enable the management network
> 
> 2. Add some compatibility code to tripleoclient which automatically enables
> the management network in network_data.yaml when the network-management.yaml
> environment is selected (this would be removed after we remove the now
> deprecated environment file).
> 
> I think the second option is probably best, so I'll look into it, but IMHO
> this isn't a blocker, provided Marius can confirm the findings I outline
> above.

I tested a basic deployment: copying network_data.yaml, setting the management network to enabled: true, and passing environments/network-management.yaml worked:

openstack overcloud deploy --templates $THT \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e ~/openstack_deployment/environments/nodes.yaml \
-e ~/openstack_deployment/environments/network-environment.yaml \
-e ~/openstack_deployment/environments/neutron-settings.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
-e ~/docker-osp12.yaml \
-n ~/network_data.yaml

Nevertheless I think option 2 is much better because we'd avoid any manual adjustments that are going to be needed for network_data.yaml when upgrading to the next release.

Next I'm going to test the same approach for a composable roles deployment and then move on to the upgrade.

Comment 11 Marius Cornea 2017-09-19 16:45:54 UTC
I tried upgrading the same basic scenario with the same instructions, including the custom network_data.yaml, but the upgrade failed with an error reported by the Neutron server (bug 1493234).

Comment 18 errata-xmlrpc 2017-12-13 22:05:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462

Comment 20 Red Hat Bugzilla 2023-09-18 00:12:40 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days