Bug 1569258 - osp13 ceph-ansible cephx keys missing
Keywords:
Status: CLOSED DUPLICATE of bug 1568157
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: ceph-ansible
Version: 13.0 (Queens)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Sébastien Han
QA Contact: Yogev Rabl
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-18 22:39 UTC by Pavel Sedlák
Modified: 2019-05-01 00:35 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-18 23:44:02 UTC
Target Upstream Version:


Attachments:
ceph-install-workflow.log (119.45 KB, application/x-gzip)
2018-04-18 22:41 UTC, Pavel Sedlák
docker-inspect-ceph-create-keys.log (3.44 KB, application/x-gzip)
2018-04-18 22:42 UTC, Pavel Sedlák

Description Pavel Sedlák 2018-04-18 22:39:55 UTC
It seems that some of the "create cephx key(s)" commands in the client setup hang; ansible then fails because the container disappears,
though for one of the key attempts it reports that the container was not found (as if it ran too early?).

Trying the container command manually (assuming I picked the correct command to try) fails with:
> sh-4.2# /usr/bin/python2.7 /usr/bin/ceph --cluster ceph auth get client.openstack -f json
> 2018-04-18 22:13:44.275048 7fe53bda1700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
Only two files exist in /etc/ceph:
> -rw-r--r--. 1 root root 881 Apr 18 19:18 ceph.conf
> -rw-r--r--. 1 root root  92 Mar  8 21:41 rbdmap

The same can be observed on other nodes, e.g. ceph-0.
The keyrings seem to have been created at some point,
and they are present on the undercloud, based on:

> [root@undercloud-0 ceph-ansible]# ls -l /tmp/file-mistral-action0iXxwp/cf43be60-4337-11e8-892f-525400dba766/etc/ceph/
> total 24
> -rw-r--r--. 1 root root 159 Apr 18 16:59 ceph.client.admin.keyring
> -rw-r--r--. 1 root root 292 Apr 18 16:59 ceph.client.manila.keyring
> -rw-r--r--. 1 root root 299 Apr 18 16:59 ceph.client.openstack.keyring
> -rw-r--r--. 1 root root 149 Apr 18 16:59 ceph.client.radosgw.keyring
> -rw-r--r--. 1 root root  67 Apr 18 16:59 ceph.mgr.controller-0.keyring
> -rw-r--r--. 1 root root 688 Apr 18 16:59 ceph.mon.keyring

but they are not present on either the compute-0 or the ceph-0 node.
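
The error above lists the locations the ceph client searched for a keyring. As a rough sketch (not part of ceph or ceph-ansible; the helper name `find_keyrings` and its `root` parameter are hypothetical), a node or a copied tree can be checked against that list like this:

```python
import os

# Search list copied from the "unable to find a keyring on ..." error above.
KEYRING_SEARCH = (
    "/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,"
    "/etc/ceph/keyring,/etc/ceph/keyring.bin,"
)


def find_keyrings(search=KEYRING_SEARCH, root="/"):
    """Return the candidate paths from `search` that exist under `root`.

    `root` is an assumption added for illustration: it lets the same check
    run against a copied tree, such as the mistral temp directory.
    """
    # The error message ends with a trailing comma, so drop empty entries.
    candidates = [p for p in search.split(",") if p]
    return [
        p for p in candidates
        if os.path.isfile(os.path.join(root, p.lstrip("/")))
    ]


if __name__ == "__main__":
    found = find_keyrings()
    print("keyrings found:", found if found else "none")
```

On compute-0 in the state described above, this would be expected to find nothing, since /etc/ceph holds only ceph.conf and rbdmap.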



---
The following information is about the ceph-ansible client cephx keys failure itself,
which is likely not closely related to the reason why the keys are missing.

ceph-ansible fails with output:
> 2018-04-18 15:18:59,427 p=8377 u=mistral |  TASK [ceph-client : create cephx key(s)] ***************************************
> 2018-04-18 15:18:59,427 p=8377 u=mistral |  task path: /usr/share/ceph-ansible/roles/ceph-client/tasks/create_users_keys.yml:34
> 2018-04-18 15:18:59,427 p=8377 u=mistral |  Wednesday 18 April 2018  15:18:59 -0400 (0:00:00.047)       0:02:34.000 ******* 
> 2018-04-18 15:23:59,488 p=8377 u=mistral |  failed: [192.168.24.15] (item={'caps': {'mds': u'', 'osd': u'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics', 'mon': u'allow r', 'mgr': u'allow *'}, 'mode': u'0600', 'key': u'AQDZkNdaAAAAABAApMVyKwrJZ4MJHA6ca9Q7Ig==', 'name': u'client.openstack'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.openstack.keyring", "--name", "client.openstack", "--add-key", "AQDZkNdaAAAAABAApMVyKwrJZ4MJHA6ca9Q7Ig==", "--cap", "mds", "", "--cap", "osd", "allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics", "--cap", "mon", "allow r", "--cap", "mgr", "allow *"], "delta": "0:04:59.846104", "end": "2018-04-18 19:23:59.470272", "item": {"caps": {"mds": "", "mgr": "allow *", "mon": "allow r", "osd": "allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics"}, "key": "AQDZkNdaAAAAABAApMVyKwrJZ4MJHA6ca9Q7Ig==", "mode": "0600", "name": "client.openstack"}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-18 19:18:59.624168", "stderr": "Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running: Exited (0) Less than a second ago", "stderr_lines": ["Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running: Exited (0) Less than a second ago"], "stdout": "rpc error: code = 2 desc = containerd: container not found", "stdout_lines": ["rpc error: code = 2 desc = containerd: container not found"]}
> 2018-04-18 15:23:59,731 p=8377 u=mistral |  failed: [192.168.24.15] (item={'caps': {'mds': u'allow *', 'osd': u'allow rw', 'mon': u'allow r, allow command \\\\\\"auth del\\\\\\", allow command \\\\\\"auth caps\\\\\\", allow command \\\\\\"auth get\\\\\\", allow command \\\\\\"auth get-or-create\\\\\\"', 'mgr': u'allow *'}, 'name': u'client.manila', 'key': u'AQDZkNdaAAAAABAAOLlLTuUzlvWd5zBF3mOk3g==', 'mode': u'0600'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.manila.keyring", "--name", "client.manila", "--add-key", "AQDZkNdaAAAAABAAOLlLTuUzlvWd5zBF3mOk3g==", "--cap", "mds", "allow *", "--cap", "osd", "allow rw", "--cap", "mon", "allow r, allow command \\\\\\\"auth del\\\\\\\", allow command \\\\\\\"auth caps\\\\\\\", allow command \\\\\\\"auth get\\\\\\\", allow command \\\\\\\"auth get-or-create\\\\\\\"", "--cap", "mgr", "allow *"], "delta": "0:00:00.043788", "end": "2018-04-18 19:23:59.718230", "item": {"caps": {"mds": "allow *", "mgr": "allow *", "mon": "allow r, allow command \\\\\\\"auth del\\\\\\\", allow command \\\\\\\"auth caps\\\\\\\", allow command \\\\\\\"auth get\\\\\\\", allow command \\\\\\\"auth get-or-create\\\\\\\"", "osd": "allow rw"}, "key": "AQDZkNdaAAAAABAAOLlLTuUzlvWd5zBF3mOk3g==", "mode": "0600", "name": "client.manila"}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-18 19:23:59.674442", "stderr": "Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running", "stderr_lines": ["Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running"], "stdout": "", "stdout_lines": []}
> 2018-04-18 15:23:59,974 p=8377 u=mistral |  failed: [192.168.24.15] (item={'caps': {'mds': u'', 'osd': u'allow rwx', 'mon': u'allow rw', 'mgr': u'allow *'}, 'mode': u'0600', 'key': u'AQDZkNdaAAAAABAAtXTnibZ9qbqJMYFRRzQVNw==', 'name': u'client.radosgw'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.radosgw.keyring", "--name", "client.radosgw", "--add-key", "AQDZkNdaAAAAABAAtXTnibZ9qbqJMYFRRzQVNw==", "--cap", "mds", "", "--cap", "osd", "allow rwx", "--cap", "mon", "allow rw", "--cap", "mgr", "allow *"], "delta": "0:00:00.045295", "end": "2018-04-18 19:23:59.962056", "item": {"caps": {"mds": "", "mgr": "allow *", "mon": "allow rw", "osd": "allow rwx"}, "key": "AQDZkNdaAAAAABAAtXTnibZ9qbqJMYFRRzQVNw==", "mode": "0600", "name": "client.radosgw"}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-18 19:23:59.916761", "stderr": "Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running", "stderr_lines": ["Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running"], "stdout": "", "stdout_lines": []}
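
The `start`, `end`, and `delta` fields in the three failures above give a timeline worth spelling out. The sketch below only redoes that arithmetic with values copied from the log; reading the first duration as "about 300 seconds" is an assumption based on the `sleep 300` that appears as PID 1 of the container in the `ps` listing further down.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# (name, start, end) copied from the three failed items in the log above.
ITEMS = [
    ("client.openstack", "2018-04-18 19:18:59.624168", "2018-04-18 19:23:59.470272"),
    ("client.manila", "2018-04-18 19:23:59.674442", "2018-04-18 19:23:59.718230"),
    ("client.radosgw", "2018-04-18 19:23:59.916761", "2018-04-18 19:23:59.962056"),
]


def duration_seconds(start, end):
    """Elapsed time between two log timestamps, in seconds."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


for name, start, end in ITEMS:
    print("%-18s %10.6f s" % (name, duration_seconds(start, end)))
```

The first command runs for just under 300 seconds and fails at the moment the container exits; the other two fail within about 45 ms because the container is no longer running.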

When later rerunning /tmp/ansible-mistral-actionM8MqNe/ansible-playbook-command.sh manually,
I observed that the container exists on compute-0 (192.168.24.15, seen above),
and noticed that only `/usr/bin/ceph --cluster ceph auth get client.openstack -f json` was being run (checked over a 5 sec period).
It looked as follows:

> [root@compute-0 heat-admin]# docker exec ceph-create-keys ps -ax
>     PID TTY      STAT   TIME COMMAND
>       1 ?        Ss     0:00 sleep 300
>       6 ?        Ssl    0:00 /usr/bin/python2.7 /usr/bin/ceph --cluster ceph auth get client.openstack -f json
>      53 ?        Rs     0:00 ps -ax
> [root@compute-0 heat-admin]# ps -ef|grep -i ceph
> root       53973   53972  0 21:05 ?        00:00:00 /usr/bin/python /tmp/ansible_wlRcSA/ansible_module_ceph_key.py
> root       53974   53973  0 21:05 ?        00:00:00 /usr/bin/docker-current exec ceph-create-keys ceph --cluster ceph auth get client.openstack -f json
> root       53997   53980  0 21:05 ?        00:00:00 /usr/bin/python2.7 /usr/bin/ceph --cluster ceph auth get client.openstack -f json
> root       54483   51570  0 21:10 pts/0    00:00:00 grep --color=auto -i ceph
> [root@compute-0 heat-admin]# /tmp/strace -p 53997
> /tmp/strace: Process 53997 attached
> select(0, NULL, NULL, NULL, {0, 12388}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
> ...
> select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 34013}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000})  = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 2000})  = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 4000})  = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 8000})  = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 50000}^C/tmp/strace: Process 53997 detached
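
The timeouts in the strace output double from 1000 µs up to a 50000 µs cap. That matches the polling loop Python 2.7 uses in `threading.Condition.wait()` when a timeout is given, which suggests (as an assumption, not something confirmed in this report) that the ceph CLI is idling in a thread wait for a cluster connection that never completes. A minimal sketch of that delay sequence, with the function name and parameters chosen here for illustration:

```python
def poll_delays_us(total_wait_us, start_us=500, cap_us=50000):
    """Yield sleep intervals in microseconds.

    Each interval is double the previous one, capped at `cap_us`, and never
    longer than the time remaining in the overall wait.
    """
    delay = start_us
    remaining = total_wait_us
    while remaining > 0:
        delay = min(delay * 2, remaining, cap_us)
        yield delay
        remaining -= delay


if __name__ == "__main__":
    # A hypothetical 200 ms wait, just to show the shape of the sequence.
    print(list(poll_delays_us(200000)))
```

For a 200 ms wait this yields 1000, 2000, 4000, 8000, 16000, 32000, 50000, 50000 and a final 37000. The odd values in the trace (12388, 34013) are consistent with that last, trimmed sleep before a new wait starts again at 1000.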

docker restart ceph-create-keys
docker exec -t -i ceph-create-keys /bin/sh
> sh-4.2# /usr/bin/python2.7 /usr/bin/ceph --cluster ceph auth get client.openstack -f json
> 2018-04-18 22:13:44.275048 7fe53bda1700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
> 
> ^CCluster connection interrupted or timed out


undercloud:
> ansible-2.4.3.0-1.el7ae.noarch
> ansible-role-redhat-subscription-1.0.1-1.el7ost.noarch
> ceph-ansible-3.1.0-0.1.beta6.el7cp.noarch
> ansible-tripleo-ipsec-8.1.1-0.20180308133440.8f5369a.el7ost.noarch
Docker image used on the compute node:
> 192.168.24.1:8787/rhceph/rhceph-3-rhel7                  latest              e3c91d93a251        6 weeks ago         589 MB
ceph packages inside that image:
> python-cephfs-12.2.1-45.el7cp.x86_64
> ceph-common-12.2.1-45.el7cp.x86_64
> ceph-selinux-12.2.1-45.el7cp.x86_64
> ceph-radosgw-12.2.1-45.el7cp.x86_64
> ceph-mgr-12.2.1-45.el7cp.x86_64
> libcephfs2-12.2.1-45.el7cp.x86_64
> ceph-base-12.2.1-45.el7cp.x86_64
> ceph-mon-12.2.1-45.el7cp.x86_64
> ceph-mds-12.2.1-45.el7cp.x86_64
> ceph-osd-12.2.1-45.el7cp.x86_64

Comment 1 Pavel Sedlák 2018-04-18 22:41:16 UTC
Created attachment 1423800 [details]
ceph-install-workflow.log

Comment 2 Pavel Sedlák 2018-04-18 22:42:04 UTC
Created attachment 1423801 [details]
docker-inspect-ceph-create-keys.log

Comment 4 John Fulton 2018-04-18 23:44:02 UTC
- This is ceph-ansible bug 1568157
- Marking as duplicate

*** This bug has been marked as a duplicate of bug 1568157 ***

