Summary: | manila-share fails initialization ceph-ansible > 3.1.0-0.1.beta4 | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Tom Barron <tbarron> |
Component: | openstack-tripleo-heat-templates | Assignee: | Giulio Fidente <gfidente> |
Status: | CLOSED ERRATA | QA Contact: | Yogev Rabl <yrabl> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 13.0 (Queens) | CC: | adeza, aherr, aschoen, ceph-eng-bugs, dmacpher, gfidente, gmeno, jamsmith, jschluet, m.andre, mburns, nthomas, pgrist, sankarshan, scohen |
Target Milestone: | rc | Keywords: | Triaged |
Target Release: | 13.0 (Queens) | Flags: | scohen:
needinfo+
|
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-tripleo-heat-templates-8.0.2-15.el7ost | Doc Type: | Known Issue |
Doc Text: |
The manila-share service fails to initialize because changes to ceph-ansible's complex ceph-keys processing generate incorrect content in the /etc/ceph/ceph.client.manila.keyring file.
To allow the manila-share service to initialize:
1) Make a copy of /usr/share/openstack-tripleo-heat-templates to use for the overcloud deploy.
2) Edit the .../tripleo-heat-templates/docker/services/ceph-ansible/ceph-base.yaml file to change all triple backslashes in line 295 to single backslashes.
Before:
mon_cap: 'allow r, allow command \\\"auth del\\\", allow command \\\"auth caps\\\", allow command \\\"auth get\\\", allow command \\\"auth get-or-create\\\"'
After:
mon_cap: 'allow r, allow command \"auth del\", allow command \"auth caps\", allow command \"auth get\", allow command \"auth get-or-create\"'
3) Deploy the overcloud, substituting the path to the copy of tripleo-heat-templates wherever /usr/share/openstack-tripleo-heat-templates occurs in your original overcloud-deploy command.
The /etc/ceph/ceph.client.manila.keyring file will then have the proper contents and the manila-share service will initialize properly.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-06-27 13:55:29 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Bug Depends On: | |||
Bug Blocks: | 1469208, 1489934, 1489938 |
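The edit described in step 2 of the Doc Text can also be scripted. The following is a minimal Python sketch, not part of the original report: the helper name `unescape_mon_cap` is hypothetical, and it works on the text of the line only, so the template path and line number given in the Doc Text still have to be checked against your own copy of tripleo-heat-templates.

```python
# Hypothetical helper (not from the bug report): collapse the over-escaped
# quotes in the mon_cap line of ceph-base.yaml.
TRIPLE = '\\\\\\"'  # three backslashes followed by a double quote
SINGLE = '\\"'      # one backslash followed by a double quote


def unescape_mon_cap(line: str) -> str:
    """Reduce each triple-backslash quote to a single-backslash quote."""
    return line.replace(TRIPLE, SINGLE)


# The "Before" line quoted in the Doc Text.
before = r"""mon_cap: 'allow r, allow command \\\"auth del\\\", allow command \\\"auth caps\\\", allow command \\\"auth get\\\", allow command \\\"auth get-or-create\\\"'"""

after = unescape_mon_cap(before)
print(after)
```

Applying the helper a second time leaves the line unchanged, so it should be safe to run against a template that has already been corrected.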
Description
Tom Barron
2018-05-04 14:49:59 UTC
Permissions for the ceph keyfiles in the manila-share container may be the issue.

On ceph-daemon based containers the keys look like this:

[root controller-1 ~]# docker exec ceph-nfs-pacemaker ls -ld /etc/ceph
drwxr-xr-x. 2 ceph ceph 200 May 2 19:22 /etc/ceph
[root controller-1 ~]# docker exec ceph-nfs-pacemaker ls -l /etc/ceph
total 28
-rw-------. 1 root root 159 May 2 19:20 ceph.client.admin.keyring
-rw-------. 1 ceph ceph 292 May 2 19:22 ceph.client.manila.keyring
-rw-------. 1 ceph ceph 299 May 2 19:22 ceph.client.openstack.keyring
-rw-------. 1 ceph ceph 149 May 2 19:22 ceph.client.radosgw.keyring
-rw-r--r--. 1 root root 931 May 2 19:19 ceph.conf
-rw-------. 1 ceph ceph 688 May 2 19:20 ceph.mon.keyring
-rw-r--r--. 1 root root 92 Apr 6 04:17 rbdmap

whereas in the manila-share container:

[root controller-1 ~]# docker exec openstack-manila-share-docker-0 ls -ld /etc/ceph
drwxr-xr-x. 2 167 167 200 May 2 19:22 /etc/ceph
[root controller-1 ~]# docker exec openstack-manila-share-docker-0 ls -l /etc/ceph
total 28
-rw-------. 1 root root 159 May 2 19:20 ceph.client.admin.keyring
-rw-------. 1 167 167 292 May 2 19:22 ceph.client.manila.keyring
-rw-------. 1 167 167 299 May 2 19:22 ceph.client.openstack.keyring
-rw-------. 1 167 167 149 May 2 19:22 ceph.client.radosgw.keyring
-rw-r--r--. 1 root root 931 May 2 19:19 ceph.conf
-rw-------. 1 167 167 688 May 2 19:20 ceph.mon.keyring
-rw-r--r--. 1 root root 92 Apr 6 04:17 rbdmap

There is no user with uid/gid 167 inside the manila-share container, and the ceph user there has uid/gid 64045:

[root controller-1 ~]# docker exec openstack-manila-share-docker-0 grep ':167:' /etc/passwd
[root controller-1 ~]# docker exec openstack-manila-share-docker-0 grep ':167:' /etc/group
[root controller-1 ~]# docker exec openstack-manila-share-docker-0 grep ceph /etc/passwd
ceph:x:64045:64045::/home/ceph:/usr/sbin/nologin
[root controller-1 ~]# docker exec openstack-manila-share-docker-0 grep ceph /etc/group
ceph:x:64045:

whereas in the ceph-daemon image and on the host itself the ceph user has uid/gid 167:

[root controller-1 ~]# docker exec ceph-nfs-pacemaker grep ceph /etc/passwd
ceph:x:167:167:Ceph daemons:/var/lib/ceph:/sbin/nologin
[root controller-1 ~]# docker exec ceph-nfs-pacemaker grep ceph /etc/group
ceph:x:167:
[root controller-1 ~]# grep ceph /etc/passwd
ceph:x:167:167:Ceph daemons:/var/lib/ceph:/sbin/nologin
[root controller-1 ~]# grep ceph /etc/group
ceph:x:167:

The file ownership and permissions differences between the two puddles actually don't matter since the manila-share container runs as root and can read all the keyfiles.
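Although ownership turned out not to be the cause, the numeric owner 167 in the listings above has a simple explanation: `ls -l` prints the raw uid when no passwd entry maps it to a name. A minimal Python sketch of that lookup follows, using the two passwd entries quoted above as inline sample data. The function names are hypothetical and nothing here inspects a real container.

```python
def parse_passwd(text):
    """Build a uid -> user name map from passwd-formatted text."""
    users = {}
    for line in text.splitlines():
        fields = line.split(":")
        # passwd fields: name:password:uid:gid:gecos:home:shell
        if len(fields) >= 3 and fields[2].isdigit():
            users[int(fields[2])] = fields[0]
    return users


def owner_label(uid, users):
    """Mimic ls -l: show the name if the uid is known, else the number."""
    return users.get(uid, str(uid))


# passwd entries for the ceph user as quoted in the report
manila_share_passwd = "ceph:x:64045:64045::/home/ceph:/usr/sbin/nologin"
ceph_daemon_passwd = "ceph:x:167:167:Ceph daemons:/var/lib/ceph:/sbin/nologin"

# uid 167 has no entry in the manila-share container, so it stays numeric.
print(owner_label(167, parse_passwd(manila_share_passwd)))
# In the ceph-daemon image the same uid resolves to the ceph user.
print(owner_label(167, parse_passwd(ceph_daemon_passwd)))
```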
The issue is reproducible on the host, outside the container:

[root@controller-0 ~]# ceph -n client.manila --keyring=/etc/ceph/ceph.client.manila.keyring mds dump
Error EACCES: access denied

and is due to a bad manila client keyring:

[root@controller-0 ~]# cat /etc/ceph/ceph.client.manila.keyring
[client.manila]
key = AQDSe+daAAAAABAAQ+8L/490ZS8AQefbKWwYdg==
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow r, allow command \\\"auth del\\\", allow command \\\"auth caps\\\", allow command \\\"auth get\\\", allow command \\\"auth get-or-create\\\""
caps osd = "allow rw"
[root@controller-0 ~]#

I rebuilt the keyring by hand:

[root@controller-1 ceph]# ceph-authtool /etc/ceph/ceph.client.manila.keyring -n client.manila --cap mds 'allow *' --cap osd 'allow *' --cap mgr 'allow *' --cap mon "allow r, allow command 'auth del', allow command 'auth caps', allow command 'auth get', allow command 'auth get-or-create'"
[root@controller-1 ceph]# ceph auth import -i /etc/ceph/ceph.client.manila.keyring
[root@controller-1 ceph]# cat /etc/ceph/ceph.client.manila.keyring
[client.manila]
key = AQDSe+daAAAAABAAQ+8L/490ZS8AQefbKWwYdg==
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow r, allow command 'auth del', allow command 'auth caps', allow command 'auth get', allow command 'auth get-or-create'"
caps osd = "allow *"

And now 'ceph -n client.manila --keyring=/etc/ceph/ceph.client.manila.keyring mds dump' works and the manila-share log is showing successful eviction and driver initialization.

Moving this to ceph-ansible since that is the component that builds the keyrings.

There are over 80 changes in ceph-ansible between the two builds (the 2018-04-11 build uses 3.1.0-0.1.beta4 and the 2018-04-26.3 build uses 3.1.0-0.1.beta8). Could this one be the culprit?
tbarron@tbarron ceph-ansible (beta-3.1)$ git show 42481550
commit 424815501a0c6072234a8e1311a0fefeb5bcc222
Author: Sébastien Han <seb>
Date: Wed Apr 18 15:11:55 2018 +0200

    client: add quotes to the dict values

    ceph-authtool does not support raw arguements so we have to quote caps
    declaration like this allow 'bla bla' instead of allow bla bla

    Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1568157
    Signed-off-by: Sébastien Han <seb>

diff --git a/roles/ceph-client/tasks/create_users_keys.yml b/roles/ceph-client/tasks/create_users_keys.yml
index b5b012e1..36bbcc8e 100644
--- a/roles/ceph-client/tasks/create_users_keys.yml
+++ b/roles/ceph-client/tasks/create_users_keys.yml
@@ -1,7 +1,7 @@
 ---
 - name: set_fact keys_tmp - preserve backward compatibility after the introduction of the ceph_keys module
   set_fact:
-    keys_tmp: "{{ keys_tmp|default([]) + [ { 'key': item.key, 'name': item.name, 'caps': { 'mon': item.mon_cap, 'osd': item.osd_cap|default(''), 'mds': item.mds_cap|default(''), 'mgr': item.mgr_cap|default('') } , 'mode': item.mode } ] }}"
+    keys_tmp: "{{ keys_tmp|default([]) + [ { 'key': item.key, 'name': item.name, 'caps': { 'mon': item.mon_cap|quote, 'osd': item.osd_cap|default('')|quote, 'mds': item.mds_cap|default('')|quote, 'mgr': item.mgr_cap|default('')|quote } , 'mode': item.mode } ] }}"
   when:
     - item.get('mon_cap', None) # it's enough to assume we are running an old-fashionned syntax simply by checking the presence of mon_cap since every key needs this cap
   with_items: "{{ keys }}"

I will try to address this in the templates, which seems to be the place where the fix should really go.

The TL;DR is that we need this fix for proper deployment of Manila CephFS-NFS. Without it there is a workaround, but it requires a THT replacement and an OC rebuild, so ideally we would like to get this into beta.

Tested with the 2018-05-07.2 puddle: manila-share initializes fine and ceph client eviction has no issues.
2018-05-11 14:52:16.266 44 DEBUG ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] CephFS initializing... connect /usr/lib/python2.7/site-packages/ceph_volume_client.py:462
2018-05-11 14:52:16.268 44 DEBUG ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] Premount eviction of manila starting connect /usr/lib/python2.7/site-packages/ceph_volume_client.py:465
2018-05-11 14:52:16.269 44 INFO ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] evict clients with auth_name=manila
2018-05-11 14:52:16.274 44 DEBUG ceph_volume_client [-] _ready_to_evict: state=up:active _ready_to_evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:125
2018-05-11 14:52:16.275 44 DEBUG ceph_volume_client [-] mds_command: 4260, ['session', 'evict', 'auth_name=manila'] _evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:157
2018-05-11 14:52:16.705 44 DEBUG ceph_volume_client [-] mds_command: complete 0 _evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:165
2018-05-11 14:52:16.706 44 DEBUG ceph_volume_client [-] _ready_to_evict: state=up:active _ready_to_evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:125
2018-05-11 14:52:16.707 44 DEBUG ceph_volume_client [-] mds_command: 4271, ['session', 'evict', 'auth_name=manila'] _evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:157
2018-05-11 14:52:16.709 44 DEBUG ceph_volume_client [-] mds_command: complete 0 _evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:165
2018-05-11 14:52:16.712 44 DEBUG ceph_volume_client [-] _ready_to_evict: state=up:active _ready_to_evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:125
2018-05-11 14:52:16.712 44 DEBUG ceph_volume_client [-] mds_command: 4263, ['session', 'evict', 'auth_name=manila'] _evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:157
2018-05-11 14:52:16.719 44 DEBUG ceph_volume_client [-] mds_command: complete 0 _evict /usr/lib/python2.7/site-packages/ceph_volume_client.py:165
2018-05-11 14:52:16.719 44 INFO ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] evict: joined all
2018-05-11 14:52:16.720 44 DEBUG ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] Premount eviction of manila completes connect /usr/lib/python2.7/site-packages/ceph_volume_client.py:467
2018-05-11 14:52:16.720 44 DEBUG ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] CephFS mounting... connect /usr/lib/python2.7/site-packages/ceph_volume_client.py:468
2018-05-11 14:52:16.731 44 DEBUG ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] Connection to cephfs complete connect /usr/lib/python2.7/site-packages/ceph_volume_client.py:470
2018-05-11 14:52:16.732 44 DEBUG ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] Recovering from partial auth updates (if any)... recover /usr/lib/python2.7/site-packages/ceph_volume_client.py:265
2018-05-11 14:52:16.732 44 DEBUG ceph_volume_client [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] Nothing to recover. No auth meta files. recover /usr/lib/python2.7/site-packages/ceph_volume_client.py:270
2018-05-11 14:52:16.733 44 INFO manila.share.drivers.cephfs.driver [req-e906cb0f-2d83-4490-8bf7-fe284516a3f2 - - - - -] [cephfs] Ceph client connection complete.

manila client keyring is not over-escaped:

()[root@controller-1 /]# cat /etc/ceph/ceph.client.manila.keyring
[client.manila]
key = AQCtf/RaAAAAABAA2pG5oT2Kv93P/0hu9z105g==
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow r, allow command \"auth del\", allow command \"auth caps\", allow command \"auth get\", allow command \"auth get-or-create\""
caps osd = "allow rw"
()[root@controller-1 /]#

If QE is satisfied this one can go to VERIFIED.

verified

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086
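Why the same template value could yield a good keyring before the `|quote` change in ceph-ansible and an over-escaped one afterwards can be illustrated with Python's `shlex` module. This is a sketch under stated assumptions, not a trace of the actual ceph-ansible code path: it assumes the caps string previously reached a POSIX shell inside double quotes, and it uses `shlex.quote` as a stand-in for the `quote` filter that appears in the commit quoted in this report. The caps value is shortened to a single command for readability.

```python
import shlex

# Template value before the workaround: three backslashes before each quote.
triple_cap = r'allow r, allow command \\\"auth del\\\"'
# Template value after the workaround: one backslash before each quote.
single_cap = r'allow r, allow command \"auth del\"'

# Assumption: without the quote filter the value sits inside double quotes
# on a shell command line. POSIX double-quote parsing collapses \\ to \ and
# \" to ", so the triple escaping shrinks by one level.
old_path = shlex.split('"' + triple_cap + '"')[0]

# With shell quoting applied first, the value is wrapped in single quotes
# and reaches the program unchanged, backslashes included.
new_path_triple = shlex.split(shlex.quote(triple_cap))[0]
new_path_single = shlex.split(shlex.quote(single_cap))[0]

print(old_path)
print(new_path_triple)
print(new_path_single)
```

Under these assumptions the single-backslash template value combined with quoting gives the same string that the triple-backslash value gave without quoting, which is consistent with the corrected keyring shown in the verification comment.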