Bug 1804079 - TLS Everywhere fails with DCN: cinder active/active with etcd fails during certificate creation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 16.0 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ga
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Alan Bishop
QA Contact: Tzach Shefi
URL:
Whiteboard:
Duplicates: 1792477
Depends On:
Blocks:
Reported: 2020-02-18 07:57 UTC by Sadique Puthen
Modified: 2020-07-29 07:50 UTC (History)

Fixed In Version: puppet-tripleo-11.5.0-0.20200611115535.e86dd81.el8ost openstack-tripleo-heat-templates-11.3.2-0.20200616081529.396affd.el8ost
Doc Type: Bug Fix
Doc Text:
Before this update, the etcd service was not configured properly to run in a container. As a result, an error occurred when the service tried to create the TLS certificate. With this update, the etcd service runs in a container and can create the TLS certificate.
Clone Of:
Environment:
Last Closed: 2020-07-29 07:50:34 UTC
Target Upstream Version:
Embargoed:


Attachments
ansible.log (55.23 KB, application/x-bzip)
2020-02-18 07:59 UTC, Sadique Puthen


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1869955 0 None None None 2020-03-31 20:57:49 UTC
OpenStack gerrit 716432 0 None MERGED Workaround for cinder A/A and etcd with TLS-everywhere 2020-11-11 21:58:18 UTC
OpenStack gerrit 716661 0 None MERGED Fix etcd's support for internal TLS 2020-11-11 21:58:18 UTC
OpenStack gerrit 717295 0 None MERGED Create DNS entries in IPA for openstack services 2020-11-11 21:58:18 UTC
Red Hat Product Errata RHBA-2020:3148 0 None None None 2020-07-29 07:50:47 UTC

Description Sadique Puthen 2020-02-18 07:57:14 UTC
Description of problem:

Deploying an HCI edge location as a separate stack, with cinder active/active, etcd, and TLS everywhere, fails with the following error:

07:28:21 puppet-user: Error: Could not find user etcd
<13>Feb 18 07:28:21 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Etcd/File[/etc/pki/tls/certs/etcd.crt]/owner: change from 'root' to 'etcd' failed: Could not find user etcd
<13>Feb 18 07:28:21 puppet-user: Error: Could not find group etcd
<13>Feb 18 07:28:21 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Etcd/File[/etc/pki/tls/certs/etcd.crt]/group: change from 'root' to 'etcd' failed: Could not find group etcd
<13>Feb 18 07:28:21 puppet-user: Error: Could not find user etcd
<13>Feb 18 07:28:21 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Etcd/File[/etc/pki/tls/private/etcd.key]/owner: change from 'root' to 'etcd' failed: Could not find user etcd
<13>Feb 18 07:28:21 puppet-user: Error: Could not find group etcd
<13>Feb 18 07:28:21 puppet-user: Error: /Stage[main]/Tripleo::Certmonger::Etcd/File[/etc/pki/tls/private/etcd.key]/group: change from 'root' to 'etcd' failed: Could not find group etcd
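
All of these errors reduce to two missing accounts on the host: the etcd user and the etcd group. As a rough triage aid, the sketch below pulls the distinct "Could not find" messages out of a puppet log read from stdin. The function name missing_accounts is hypothetical and not part of any TripleO tooling.

```shell
# Hypothetical helper: list the distinct "Could not find user/group" errors
# found in a puppet log supplied on stdin.
missing_accounts() {
    grep -oE 'Could not find (user|group) [A-Za-z0-9_-]+' | LC_ALL=C sort -u
}
```

For example, `bzcat ansible.log | missing_accounts` would summarize the attached log, assuming the attachment is bzip2-compressed as its MIME type suggests.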

The templates used for the edge stack/location can be found here: https://gitlab.cee.redhat.com/sputhenp/openstack/blob/master/basic/templates/osp-16/edge-1/overcloud-deploy-edge-1-tls-everywhere.sh

The templates used for the central stack can be found here: https://gitlab.cee.redhat.com/sputhenp/openstack/blob/master/basic/templates/osp-16/overcloud-deploy-tls-everywhere.sh

Attaching ansible.log

Comment 1 Sadique Puthen 2020-02-18 07:59:19 UTC
Created attachment 1663665
ansible.log

Comment 2 Alan Bishop 2020-02-18 16:59:39 UTC
This is one of several issues identified with cinder using etcd for its DLM when running active/active. See bug #1792477 (this BZ is item 1. in that BZ's description).

While I know how to fix the ownership problem, there are several more layers to the overall problem, and it's not clear whether cinder will be able to use etcd for its DLM. I'll take this BZ for now.

Comment 9 Ade Lee 2020-04-14 19:58:16 UTC
The security DFG side of this, that is, adding the ability to add DNS entries to the IPA server, is being tracked here:

https://bugzilla.redhat.com/show_bug.cgi?id=1823932

Comment 10 Alan Bishop 2020-06-17 20:33:12 UTC
The original problem was an issue with the puppet-tripleo code responsible for creating the etcd cert. That code was fixed a while ago, but several fixes and enhancements in other areas were required for full tls-e support. To keep the focus on this BZ, I'm stating the puppet-tripleo code is now working correctly.

Bear in mind that testing the fix requires the following:
- Deploy cinder in A/A mode
- Deploy tls-e using tripleo-ipa (see bug #1823932), or wait until bug #1843701 is fixed and use novajoin
- Deploy with EnableEtcdInternalTLS set to True
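
The last requirement can be met with a small Heat environment file. The sketch below is only an illustration, not the environment used for verification: the function name and the output path are hypothetical, and only the EnableEtcdInternalTLS parameter comes from this BZ.

```shell
# Hypothetical helper: write a minimal Heat environment file to the path
# given as $1. Only EnableEtcdInternalTLS is taken from this bug report.
write_etcd_tls_env() {
    cat > "$1" <<'EOF'
parameter_defaults:
  EnableEtcdInternalTLS: true
EOF
}
```

After running, for example, `write_etcd_tls_env /home/stack/etcd-internal-tls.yaml`, the file would be passed to `openstack overcloud deploy` with an additional `-e` option alongside the other TLS environment files.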

Comment 18 Paul Grist 2020-07-08 22:38:26 UTC
*** Bug 1792477 has been marked as a duplicate of this bug. ***

Comment 20 Tzach Shefi 2020-07-13 07:21:34 UTC
Verified on:
puppet-tripleo-11.5.0-0.20200616033427.8ff1c6a.el8ost.noarch

After a TLS-everywhere DCN deployment with Cinder A/A, all services use TLS, including Cinder A/A.

Deployment details, proving TLS is enabled:


overcloud_deploy.sh, showing only the TLS-related bits; other lines were removed for simplicity:
[stack@site-undercloud-0 ~]$ cat overcloud_deploy.sh
#!/bin/bash
openstack overcloud deploy \
-e /home/stack/central/enable-tls.yaml \
-e /home/stack/central/inject-trust-anchor.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-endpoints-public-ip.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/tls-everywhere-endpoints-dns.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/haproxy-public-tls-certmonger.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ssl/enable-internal-tls.yaml \

Same for DCN site:
[stack@site-undercloud-0 ~]$ cat overcloud_dcn1.sh
source /home/stack/stackrc
sudo cp overcloud_deploy.sh overcloud_deploy_dcn1.sh
sudo cp /home/stack/central/enable-tls.yaml /home/stack/dcn1/enable-tls.yaml   # TLS enabled on DCN
..


Confirm Cinder A/A:

(dcn1) [stack@site-undercloud-0 ~]$ cinder service-list
+------------------+------------------------------------+---------+---------+-------+
| Binary           | Host                               | Zone    | Status  | State |
+------------------+------------------------------------+---------+---------+-------+
| cinder-scheduler | central-controller0-0.redhat.local | nova    | enabled | up    |
| cinder-scheduler | central-controller0-1.redhat.local | nova    | enabled | up    |
| cinder-scheduler | central-controller0-2.redhat.local | nova    | enabled | up    |
| cinder-volume    | dcn1-computehci1-0@tripleo_ceph    | az-dcn1 | enabled | up    |
| cinder-volume    | dcn1-computehci1-1@tripleo_ceph    | az-dcn1 | enabled | up    |
| cinder-volume    | dcn1-computehci1-2@tripleo_ceph    | az-dcn1 | enabled | up    |
| cinder-volume    | dcn2-computehci2-0@tripleo_ceph    | az-dcn2 | enabled | up    |
| cinder-volume    | dcn2-computehci2-1@tripleo_ceph    | az-dcn2 | enabled | up    |
| cinder-volume    | dcn2-computehci2-2@tripleo_ceph    | az-dcn2 | enabled | up    |
| cinder-volume    | hostgroup@tripleo_iscsi            | nova    | enabled | up    |
+------------------+------------------------------------+---------+---------+-------+


Cinder endpoints of both central and DCN are https:
(central) [stack@site-undercloud-0 ~]$ openstack endpoint list | grep cinder
| 6bb513a8310d4b32aebe51a75a421d00 | regionOne | cinderv3     | volumev3       | True    | public    | https://overcloud.redhat.local:13776/v3/%(tenant_id)s             |
| dd6137b413b74a4dbc25c5bec4ef3c8f | regionOne | cinderv3     | volumev3       | True    | admin     | https://overcloud.internalapi.redhat.local:8776/v3/%(tenant_id)s  |
| e343a6b884d447698159a346a0ddd1d4 | regionOne | cinderv2     | volumev2       | True    | admin     | https://overcloud.internalapi.redhat.local:8776/v2/%(tenant_id)s  |
| e62845df63844a998edfd83b292f5c98 | regionOne | cinderv3     | volumev3       | True    | internal  | https://overcloud.internalapi.redhat.local:8776/v3/%(tenant_id)s  |
| e6a592920ecd4f9393d90849b5b65095 | regionOne | cinderv2     | volumev2       | True    | public    | https://overcloud.redhat.local:13776/v2/%(tenant_id)s             |
| ed44a6da498c42d4b1fd2c7b513a788b | regionOne | cinderv2     | volumev2       | True    | internal  | https://overcloud.internalapi.redhat.local:8776/v2/%(tenant_id)s  |


(dcn1) [stack@site-undercloud-0 ~]$ openstack endpoint list | grep cinder
| 6bb513a8310d4b32aebe51a75a421d00 | regionOne | cinderv3     | volumev3       | True    | public    | https://overcloud.redhat.local:13776/v3/%(tenant_id)s             |
| dd6137b413b74a4dbc25c5bec4ef3c8f | regionOne | cinderv3     | volumev3       | True    | admin     | https://overcloud.internalapi.redhat.local:8776/v3/%(tenant_id)s  |
| e343a6b884d447698159a346a0ddd1d4 | regionOne | cinderv2     | volumev2       | True    | admin     | https://overcloud.internalapi.redhat.local:8776/v2/%(tenant_id)s  |
| e62845df63844a998edfd83b292f5c98 | regionOne | cinderv3     | volumev3       | True    | internal  | https://overcloud.internalapi.redhat.local:8776/v3/%(tenant_id)s  |
| e6a592920ecd4f9393d90849b5b65095 | regionOne | cinderv2     | volumev2       | True    | public    | https://overcloud.redhat.local:13776/v2/%(tenant_id)s             |
| ed44a6da498c42d4b1fd2c7b513a788b | regionOne | cinderv2     | volumev2       | True    | internal  | https://overcloud.internalapi.redhat.local:8776/v2/%(tenant_id)s  |

Same for dcn2 site:
(dcn2) [stack@site-undercloud-0 ~]$ openstack endpoint list | grep cinder
| 6bb513a8310d4b32aebe51a75a421d00 | regionOne | cinderv3     | volumev3       | True    | public    | https://overcloud.redhat.local:13776/v3/%(tenant_id)s             |
| dd6137b413b74a4dbc25c5bec4ef3c8f | regionOne | cinderv3     | volumev3       | True    | admin     | https://overcloud.internalapi.redhat.local:8776/v3/%(tenant_id)s  |
| e343a6b884d447698159a346a0ddd1d4 | regionOne | cinderv2     | volumev2       | True    | admin     | https://overcloud.internalapi.redhat.local:8776/v2/%(tenant_id)s  |
| e62845df63844a998edfd83b292f5c98 | regionOne | cinderv3     | volumev3       | True    | internal  | https://overcloud.internalapi.redhat.local:8776/v3/%(tenant_id)s  |
| e6a592920ecd4f9393d90849b5b65095 | regionOne | cinderv2     | volumev2       | True    | public    | https://overcloud.redhat.local:13776/v2/%(tenant_id)s             |
| ed44a6da498c42d4b1fd2c7b513a788b | regionOne | cinderv2     | volumev2       | True    | internal  | https://overcloud.internalapi.redhat.local:8776/v2/%(tenant_id)s  |
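
The same check can be scripted rather than read by eye. The sketch below assumes the default table output of `openstack endpoint list`, with the URL in the last column as shown above; the function name all_cinder_endpoints_https is hypothetical.

```shell
# Hypothetical helper: read "openstack endpoint list" table rows on stdin and
# succeed only if at least one cinder row exists and every cinder URL is https.
all_cinder_endpoints_https() {
    awk -F'|' '
        /cinder/ {
            url = $(NF-1)                       # last column before the closing "|"
            gsub(/^[ \t]+|[ \t]+$/, "", url)    # trim the table padding
            seen++
            if (url !~ /^https:\/\//) { bad++; print "not https: " url }
        }
        END { exit (seen == 0 || bad > 0) ? 1 : 0 }
    '
}
```

Usage would look like `openstack endpoint list | all_cinder_endpoints_https && echo "cinder endpoints are https"`, run once per sourced rc file (central, dcn1, dcn2).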


A basic cinder create works on DCN2:
(dcn2) [stack@site-undercloud-0 ~]$ cinder list
+--------------------------------------+-----------+-------+------+-------------+----------+-------------+
| ID                                   | Status    | Name  | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+-------+------+-------------+----------+-------------+
| 25f0d715-0c49-4f64-b13a-3482ca4fa104 | available | test1 | 1    | tripleo     | false    |             |
+--------------------------------------+-----------+-------+------+-------------+----------+-------------+

We have an automation job for TLS-everywhere DCN Cinder A/A.
In fact, I used that very same job to deploy the above system.
Tempest volume tests are passing.

Confirmed that we can deploy TLS-everywhere DCN Cinder A/A.

Comment 23 errata-xmlrpc 2020-07-29 07:50:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3148