Bug 1875508

Summary: THT fails to deploy Spine and Leaf architecture if TLS everywhere is enabled
Product: Red Hat OpenStack Reporter: Alex Stupnikov <astupnik>
Component: openstack-tripleo-heat-templates Assignee: Harald Jensås <hjensas>
Status: CLOSED ERRATA QA Contact: David Rosenfeld <drosenfe>
Severity: low Docs Contact:
Priority: high    
Version: 13.0 (Queens) CC: alee, amcleod, gregraka, hjensas, mburns, pbabbar, pweeks, rheslop, sputhenp
Target Milestone: z16 Keywords: Triaged, ZStream
Target Release: 13.0 (Queens) Flags: pweeks: needinfo-
pweeks: needinfo-
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-8.4.1-82.el7ost Doc Type: Enhancement
Doc Text:
This enhancement enables you to override the Orchestration service (heat) parameter, `ServiceNetMap`, for a role when you deploy the overcloud.
On spine-and-leaf (edge) deployments that use TLS-everywhere, hiera interpolation has been problematic when used to map networks on roles. Overriding the `ServiceNetMap` per role fixes the issues seen in some TLS-everywhere deployments, provides an easier interface, and replaces the need for the more complex hiera interpolation.
Story Points: ---
Clone Of:
: 1918585 (view as bug list) Environment:
Last Closed: 2021-06-16 10:58:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1918585    

Comment 2 Alex Stupnikov 2020-09-03 16:09:29 UTC
Description of problem:

My troubleshooting may be wrong, so please correct me if I have missed something.

A customer reported that they are unable to deploy a spine-and-leaf RHOSP 13 environment with TLS everywhere:

Aug 31 10:04:24 compute-0 os-collect-config: "Warning: Could not get certificate: Execution of '/usr/bin/getcert request -I libvirt-client-cert -f /etc/pki/libvirt/clientcert.pem -c IPA -N CN= -K libvirt/ -D  -C systemctl reload libvirtd -w -k /etc/pki/libvirt/private/clientkey.pem' returned 2: New signing request \"libvirt-client-cert\" added.",
Aug 31 10:04:24 compute-0 os-collect-config: "Error: /Stage[main]/Tripleo::Profile::Base::Certmonger_user/Tripleo::Certmonger::Libvirt[libvirt-client-cert]/Certmonger_certificate[libvirt-client-cert]: Could not evaluate: Could not get certificate: Server at https://idm.example.com/ipa/xml denied our request, giving up: 3007 (RPC failed at server.  'fqdn' is required).",

From what I see, there are three problems with the getcert command in the log above: the subject name (-N CN=), the principal name (-K libvirt/) and the DNS name (-D) are all passed without a hostname.
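
To make the three gaps easier to spot in other log lines, here is a minimal Python sketch that tokenizes a getcert request command and reports which of -N, -K and -D carry no hostname. The function name and the "looks empty" rules are my own assumptions for illustration, not part of certmonger or puppet-tripleo; the command string is copied from the log above.

```python
import shlex

# Command copied from the os-collect-config log line above.
LOGGED_COMMAND = (
    "/usr/bin/getcert request -I libvirt-client-cert "
    "-f /etc/pki/libvirt/clientcert.pem -c IPA -N CN= -K libvirt/ -D  "
    "-C systemctl reload libvirtd -w -k /etc/pki/libvirt/private/clientkey.pem"
)


def empty_getcert_fields(command):
    """Return the host-related getcert options that have no hostname.

    Hypothetical helper: an option counts as empty when its value is
    missing, is another option, or is only the fixed prefix
    ("CN=" for -N, a trailing "/" for -K).
    """
    tokens = shlex.split(command)
    problems = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        missing = nxt == "" or nxt.startswith("-")
        if tok == "-N" and (missing or nxt == "CN="):
            problems.append("subject name (-N)")
        elif tok == "-K" and (missing or nxt.endswith("/")):
            problems.append("principal name (-K)")
        elif tok == "-D" and missing:
            problems.append("DNS name (-D)")
    return problems


if __name__ == "__main__":
    for problem in empty_getcert_fields(LOGGED_COMMAND):
        print("empty:", problem)
```

Run against the logged command, this flags all three options, which matches the 'fqdn' is required error returned by the IPA server.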

From THT and puppet-tripleo [1] I can see that NovaLibvirtNetwork is used to define those parameters. Obviously, this wouldn't work if there are a number of compute types with different internal API networks.

Multiple compute settings are generated from the NovaLibvirtNetwork entry of the service net map, and our documentation recommends overriding the corresponding hieradata parameters per role [2]. However, the documentation does not cover libvirt_certificates_specs.

I kindly ask you to:

1. Help with a workaround (defining role-specific libvirt_certificates_specs)
2. Figure out a permanent solution (adjust the documentation, modify THT, etc.)

P.S. I tried to understand whether other *certificates_specs definitions are broken. It looks like apache and haproxy use dynamic lists of networks and are not affected by this issue. Please double-check this anyway.

Thanks in advance.

[1]
https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/puppet/services/nova-libvirt.yaml#L270
https://github.com/openstack/puppet-tripleo/blob/stable/queens/manifests/certmonger/libvirt.pp#L58

[2]
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/spine_leaf_networking/index#assigning-routes-for-roles

Comment 8 Harald Jensås 2020-09-21 17:24:04 UTC
@Ade, it seems we don't document libvirt_certificates_specs and libvirt_vnc_certificates_specs, since the spine_leaf_networking book in the docs [1] doesn't include a TLS example.

We should enhance the docs.



[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/spine_leaf_networking/index

Comment 47 Harald Jensås 2021-03-16 15:58:46 UTC
I think one way to verify would be to use the role-based parameter to map some service to a non-default network. If we deploy Compute and Controller roles, we can, for example, re-map some Nova services that default to 'internal_api' to the 'storage' network.

For example:

parameter_defaults:
  ControllerServiceNetMap:
    NovaMetadataNetwork: storage
    NovaVncProxyNetwork: storage
    NovaLibvirtNetwork: storage
  ComputeServiceNetMap:
    NovaMetadataNetwork: storage
    NovaVncProxyNetwork: storage
    NovaLibvirtNetwork: storage


Then check that ``[vnc]/server_listen`` in etc/nova.conf is set to the node's IP address on the storage network.
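
A minimal Python sketch of that check, assuming the nova.conf content has already been read into a string. The helper names and the sample address 172.16.1.10 are hypothetical; substitute the node's real storage network IP.

```python
import configparser


def vnc_server_listen(conf_text):
    """Return the [vnc]/server_listen value, or None when it is not set."""
    # interpolation=None keeps any '%' characters in nova.conf literal;
    # strict=False tolerates duplicated sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("vnc", "server_listen", fallback=None)


def listens_on(conf_text, expected_ip):
    """True when VNC listens on the expected (storage network) address."""
    return vnc_server_listen(conf_text) == expected_ip


# Hypothetical nova.conf excerpt, for illustration only.
SAMPLE_CONF = """
[DEFAULT]
debug = false

[vnc]
enabled = true
server_listen = 172.16.1.10
"""

if __name__ == "__main__":
    print(listens_on(SAMPLE_CONF, "172.16.1.10"))
```

The sketch takes the file content rather than a path, because where nova.conf lives on the node depends on how the service is deployed.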



Or do we have a spine-and-leaf OSP13 job with TLSe enabled?

If so, we should switch that job to use the per-role ServiceNetMap functionality this fix adds, and remove the hiera interpolation overrides.
I.e., replace the hiera interpolation overrides shown in the doc [1]:


This
----

  Compute1ServiceNetMap:
    NeutronTenantNetwork: tenant1
    NovaApiNetwork: internal_api1
    NovaLibvirtNetwork: internal_api1
    NovaVncProxyNetwork: internal_api1

Not-this
--------

  Compute1ExtraConfig:
    nova::compute::libvirt::vncserver_listen: "%{hiera('internal_api1')}"
    nova::compute::vncserver_proxyclient_address: "%{hiera('internal_api1')}"
    neutron::agents::ml2::ovs::local_ip: "%{hiera('tenant1')}"
    cold_migration_ssh_inbound_addr: "%{hiera('internal_api1')}"
    live_migration_ssh_inbound_addr: "%{hiera('internal_api1')}"
    nova::migration::libvirt::live_migration_inbound_addr: "%{hiera('internal_api1')}"
    nova::my_ip: "%{hiera('internal_api1')}"
    tripleo::profile::base::database::mysql::client::mysql_client_bind_address: "%{hiera('internal_api1')}"
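
For deployments with several leaves, the per-role override under "This" could be generated rather than written by hand. The Python sketch below is only an illustration: it assumes every leaf follows the Compute1 / internal_api1 / tenant1 naming pattern from the example above, and the function names are hypothetical.

```python
def role_service_net_map(leaf):
    """Build the <role>ServiceNetMap override for one leaf.

    Assumes the naming pattern from the example above
    (Compute<N>, internal_api<N>, tenant<N>).
    """
    return {
        f"Compute{leaf}ServiceNetMap": {
            "NeutronTenantNetwork": f"tenant{leaf}",
            "NovaApiNetwork": f"internal_api{leaf}",
            "NovaLibvirtNetwork": f"internal_api{leaf}",
            "NovaVncProxyNetwork": f"internal_api{leaf}",
        }
    }


def render_environment(leaves):
    """Render a parameter_defaults block for the given leaf numbers."""
    lines = ["parameter_defaults:"]
    for leaf in leaves:
        for role_param, mapping in role_service_net_map(leaf).items():
            lines.append(f"  {role_param}:")
            for service, network in mapping.items():
                lines.append(f"    {service}: {network}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_environment([1, 2]), end="")
```

For leaf 1 this reproduces the Compute1ServiceNetMap block shown above; any additional leaf only needs its number added to the list.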




[1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html-single/spine_leaf_networking/index#assigning-routes-for-roles (step 4)

Comment 70 errata-xmlrpc 2021-06-16 10:58:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 13.0 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:2385