Bug 1962304 - cinder volume at DCN unable to read central cephx keyring
Summary: cinder volume at DCN unable to read central cephx keyring
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z2
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Alan Bishop
QA Contact: Tzach Shefi
URL:
Whiteboard:
Depends On:
Blocks: 1894668
 
Reported: 2021-05-19 17:29 UTC by John Fulton
Modified: 2022-03-23 22:29 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20220116004909.64b2e88.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-23 22:28:29 UTC
Target Upstream Version:
Embargoed:




Links
System ID | Private | Priority | Status | Summary | Last Updated
Launchpad 1930620 | 0 | None | None | None | 2021-06-02 21:11:12 UTC
OpenStack gerrit 794341 | 0 | None | MERGED | Fix cinder's cephx keyring file permissions | 2022-02-16 14:23:52 UTC
OpenStack gerrit 796679 | 0 | None | MERGED | Fix cinder's cephx keyring file permissions | 2022-02-16 14:23:54 UTC
Red Hat Issue Tracker OSP-4003 | 0 | None | None | None | 2022-01-06 15:25:09 UTC
Red Hat Product Errata RHSA-2022:0995 | 0 | None | None | None | 2022-03-23 22:29:04 UTC

Description John Fulton 2021-05-19 17:29:18 UTC
As discovered in BZ 1894668, a dcn site was deployed with the cinder_volume service, and glance was configured to use both the central and dcn site ceph clusters.

Within the glance container it was possible to use the cephx keyring to make an RBD connection to central Ceph, but within the cinder_volume container it was not. However, the cinder_volume container was able to make an RBD connection to the dcn ceph cluster.

Comment 2 John Fulton 2021-05-19 17:31:29 UTC
Though we can set the permissions to 644 to work around it, I think the correct fix is to ensure the cinder user/group own the central cephx keyring file the same way that user owns the dcn1 cephx keyring file. The glance user owns both in the glance container.

[root@dcn1-computehci1-2 ~]# podman exec -ti glance_api ls -l  /etc/ceph/*.openstack.keyring
-rw-------. 1 glance glance 227 May 14 09:40 /etc/ceph/central.client.openstack.keyring
-rw-------. 1 glance glance 201 May 14 09:34 /etc/ceph/dcn1.client.openstack.keyring
[root@dcn1-computehci1-2 ~]# 

[root@dcn1-computehci1-2 ~]# podman exec -ti cinder_volume ls -l  /etc/ceph/*.openstack.keyring
-rw-r--r--. 1    167    167 227 May 14 09:40 /etc/ceph/central.client.openstack.keyring
-rw-------. 1 cinder cinder 201 May 14 09:34 /etc/ceph/dcn1.client.openstack.keyring
[root@dcn1-computehci1-2 ~]#
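
The comparison above can be automated. The following is a minimal sketch, not part of any OpenStack tooling: it parses `ls -l` lines like those in the two listings and reports keyrings whose owner differs from the expected service user. The names KeyringEntry, parse_ls_line and wrong_owner are hypothetical, and the parser assumes paths without spaces.

```python
from typing import List, NamedTuple


class KeyringEntry(NamedTuple):
    mode: str
    owner: str
    group: str
    path: str


def parse_ls_line(line: str) -> KeyringEntry:
    # `ls -l` fields: mode, links, owner, group, size, month, day, time, path.
    # The trailing "." on the mode marks an SELinux context and is dropped.
    fields = line.split()
    return KeyringEntry(
        mode=fields[0].rstrip("."),
        owner=fields[2],
        group=fields[3],
        path=fields[-1],
    )


def wrong_owner(lines: List[str], expected_owner: str) -> List[str]:
    # Return the paths whose owner is not the expected service user.
    entries = (parse_ls_line(line) for line in lines)
    return [e.path for e in entries if e.owner != expected_owner]


# Listing copied from the cinder_volume container output above.
CINDER_VOLUME_LISTING = [
    "-rw-r--r--. 1    167    167 227 May 14 09:40 /etc/ceph/central.client.openstack.keyring",
    "-rw-------. 1 cinder cinder 201 May 14 09:34 /etc/ceph/dcn1.client.openstack.keyring",
]

# Listing copied from the glance_api container output above.
GLANCE_API_LISTING = [
    "-rw-------. 1 glance glance 227 May 14 09:40 /etc/ceph/central.client.openstack.keyring",
    "-rw-------. 1 glance glance 201 May 14 09:34 /etc/ceph/dcn1.client.openstack.keyring",
]

print(wrong_owner(CINDER_VOLUME_LISTING, "cinder"))
print(wrong_owner(GLANCE_API_LISTING, "glance"))
```

Only the central keyring in the cinder_volume container is flagged; both keyrings in the glance_api container pass.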

Comment 4 Alan Bishop 2021-05-19 18:37:49 UTC
As John noted in comment #3, the issue is that permissions for cinder to access the ceph keyrings are only being applied to each site's primary cluster, and not the other sites. This means a dcn site can access its own keyring, but it's unable to access the central site's keyring (insufficient permission), and that causes cinder to not be able to migrate an edge volume to the central site. I'm very puzzled why I didn't encounter the issue when I tested offline volume migration.
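
The "insufficient permission" failure follows from the ordinary POSIX owner/group/other check. The sketch below illustrates it under stated assumptions: the central keyring is taken to be mode 600 before the 644 workaround (the listing only shows the state after it), uid/gid 167 is the numeric owner shown in the cinder_volume listing, and CINDER_ID is a made-up placeholder for the cinder uid/gid inside the container. ACLs, capabilities and root are ignored.

```python
import stat


def can_read(mode: int, file_uid: int, file_gid: int, uid: int, gids: set) -> bool:
    # Classic POSIX check: the owner bits apply if the uid matches,
    # otherwise the group bits if any gid matches, otherwise the "other" bits.
    if uid == file_uid:
        return bool(mode & stat.S_IRUSR)
    if file_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


KEYRING_OWNER_ID = 167  # numeric owner/group from the cinder_volume listing
CINDER_ID = 42407       # hypothetical uid/gid for the cinder user (assumption)

# Assumed original state: mode 600 owned by 167:167, so cinder is denied.
print(can_read(0o600, KEYRING_OWNER_ID, KEYRING_OWNER_ID, CINDER_ID, {CINDER_ID}))

# Workaround: mode 644 lets any user, including cinder, read the key.
print(can_read(0o644, KEYRING_OWNER_ID, KEYRING_OWNER_ID, CINDER_ID, {CINDER_ID}))

# Proposed fix: keep mode 600 but make cinder the owner.
print(can_read(0o600, CINDER_ID, CINDER_ID, CINDER_ID, {CINDER_ID}))
```

The third case is the one the dcn1 keyring already satisfies, which is why only the central keyring fails.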

Glance handles things by adding access to all ceph clusters associated with GlanceMultistoreConfig [1]. 
Starting in upstream Wallaby, nova does something similar [2] so that it can access glance images directly via the associated ceph cluster.
Cinder does something similar in Wallaby when support was added for CinderRbdMultiConfig [3].

[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/glance/glance-api-container-puppet.yaml#L579-L590
[2] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/nova/nova-compute-container-puppet.yaml#L1301-L1312
[3] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/cinder/cinder-common-container-puppet.yaml#L195-L206
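
The general idea behind these references, granting the service user access to the keyring of every associated cluster rather than only the primary one, can be sketched as follows. This is an illustration only and not the template code from the links above: keyring_permissions is a hypothetical helper, the dictionary keys (path, owner, perm) are assumed for the example, and the path pattern is taken from the keyring file names shown in comment 2.

```python
from typing import Dict, List


def keyring_permissions(
    clusters: List[str], service_user: str, ceph_client: str = "openstack"
) -> List[Dict[str, str]]:
    # Build one ownership entry per ceph cluster, so the service user
    # owns every keyring it needs, not just the local site's keyring.
    return [
        {
            "path": f"/etc/ceph/{cluster}.client.{ceph_client}.keyring",
            "owner": f"{service_user}:{service_user}",
            "perm": "0600",
        }
        for cluster in clusters
    ]


# Cluster names as they appear in the keyring file names in comment 2.
for entry in keyring_permissions(["dcn1", "central"], "cinder"):
    print(entry)
```

With only "dcn1" in the list, the central keyring gets no entry, which matches the behaviour reported in this bug.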

This is a cinder THT issue, so I'm updating the BZ and grabbing it for me to solve. I'm not 100% sure of the best approach, because Train does not support CinderRbdMultiConfig, and it doesn't feel right for cinder to use data in GlanceMultistoreConfig.

Comment 9 Alan Bishop 2021-06-18 14:00:15 UTC
The fix has merged on upstream stable/train.

Comment 19 Luigi Toscano 2022-03-01 10:37:06 UTC
After making sure the deployment follows the documentation (with `openstack overcloud export `, whose output uses CephExternalMultiConfig as expected), the cinder_volume container on the dcn site can in the end access the 'openstack' keyring of the central site.


openstack-tripleo-heat-templates-11.6.1-2.20220116004910.el8ost.noarch

Comment 25 errata-xmlrpc 2022-03-23 22:28:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenStack Platform 16.2 (openstack-tripleo-heat-templates) security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0995

