It seems that some of the client setup "create cephx key(s)" commands hang. Ansible fails when the container disappears, though for one of the key attempts it reports that the container was not there (as if it ran too early?).

Trying the container command manually (assuming I picked the correct command to try) fails with:
> sh-4.2# /usr/bin/python2.7 /usr/bin/ceph --cluster ceph auth get client.openstack -f json
> 2018-04-18 22:13:44.275048 7fe53bda1700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory

Only two files exist in /etc/ceph:
> -rw-r--r--. 1 root root 881 Apr 18 19:18 ceph.conf
> -rw-r--r--. 1 root root 92 Mar 8 21:41 rbdmap

The same can be observed on e.g. the ceph-0 node. It seems the keyrings were created at some point, and they are present on the undercloud:
> [root@undercloud-0 ceph-ansible]# ls -l /tmp/file-mistral-action0iXxwp/cf43be60-4337-11e8-892f-525400dba766/etc/ceph/
> total 24
> -rw-r--r--. 1 root root 159 Apr 18 16:59 ceph.client.admin.keyring
> -rw-r--r--. 1 root root 292 Apr 18 16:59 ceph.client.manila.keyring
> -rw-r--r--. 1 root root 299 Apr 18 16:59 ceph.client.openstack.keyring
> -rw-r--r--. 1 root root 149 Apr 18 16:59 ceph.client.radosgw.keyring
> -rw-r--r--. 1 root root 67 Apr 18 16:59 ceph.mgr.controller-0.keyring
> -rw-r--r--. 1 root root 688 Apr 18 16:59 ceph.mon.keyring

but they are not present on the compute-0 or ceph-0 nodes.
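
For anyone repeating the check on another node, here is a minimal sketch that looks for the keyring files named in the "unable to find a keyring" error. The function name `find_keyrings` and the constant `KEYRING_CANDIDATES` are made up for this illustration; the file names are simply copied from the search list in the error message, for cluster "ceph" and the client.admin identity.

```python
import os

# File names copied from the search list in the "unable to find a keyring"
# error message (cluster "ceph", identity client.admin). Hypothetical constant.
KEYRING_CANDIDATES = (
    "ceph.client.admin.keyring",
    "ceph.keyring",
    "keyring",
    "keyring.bin",
)


def find_keyrings(conf_dir="/etc/ceph", candidates=KEYRING_CANDIDATES):
    """Return the candidate keyring paths that exist as files under conf_dir."""
    found = []
    for name in candidates:
        path = os.path.join(conf_dir, name)
        if os.path.isfile(path):
            found.append(path)
    return found


if __name__ == "__main__":
    hits = find_keyrings()
    if hits:
        print("keyring(s) found: " + ", ".join(hits))
    else:
        print("no keyring found in /etc/ceph")
```

On a node that has only ceph.conf and rbdmap in /etc/ceph, as described above, this would report that no keyring was found.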
---
The following information is about the ceph-ansible client cephx keys failure itself, and is likely not closely related to the reason why the keys are missing.

ceph-ansible fails with this output:
> 018-04-18 15:18:59,427 p=8377 u=mistral | TASK [ceph-client : create cephx key(s)] ***************************************
> 2018-04-18 15:18:59,427 p=8377 u=mistral | task path: /usr/share/ceph-ansible/roles/ceph-client/tasks/create_users_keys.yml:34
> 2018-04-18 15:18:59,427 p=8377 u=mistral | Wednesday 18 April 2018 15:18:59 -0400 (0:00:00.047) 0:02:34.000 *******
> 2018-04-18 15:23:59,488 p=8377 u=mistral | failed: [192.168.24.15] (item={'caps': {'mds': u'', 'osd': u'allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics', 'mon': u'allow r', 'mgr': u'allow *'}, 'mode': u'0600', 'key': u'AQDZkNdaAAAAABAApMVyKwrJZ4MJHA6ca9Q7Ig==', 'name': u'client.openstack'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.openstack.keyring", "--name", "client.openstack", "--add-key", "AQDZkNdaAAAAABAApMVyKwrJZ4MJHA6ca9Q7Ig==", "--cap", "mds", "", "--cap", "osd", "allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics", "--cap", "mon", "allow r", "--cap", "mgr", "allow *"], "delta": "0:04:59.846104", "end": "2018-04-18 19:23:59.470272", "item": {"caps": {"mds": "", "mgr": "allow *", "mon": "allow r", "osd": "allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics"}, "key": "AQDZkNdaAAAAABAApMVyKwrJZ4MJHA6ca9Q7Ig==", "mode": "0600", "name": "client.openstack"}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-18 19:18:59.624168", "stderr": "Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running: Exited (0) Less than a second ago", "stderr_lines": ["Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running: Exited (0) Less than a second ago"], "stdout": "rpc error: code = 2 desc = containerd: container not found", "stdout_lines": ["rpc error: code = 2 desc = containerd: container not found"]}
> 2018-04-18 15:23:59,731 p=8377 u=mistral | failed: [192.168.24.15] (item={'caps': {'mds': u'allow *', 'osd': u'allow rw', 'mon': u'allow r, allow command \\\\\\"auth del\\\\\\", allow command \\\\\\"auth caps\\\\\\", allow command \\\\\\"auth get\\\\\\", allow command \\\\\\"auth get-or-create\\\\\\"', 'mgr': u'allow *'}, 'name': u'client.manila', 'key': u'AQDZkNdaAAAAABAAOLlLTuUzlvWd5zBF3mOk3g==', 'mode': u'0600'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.manila.keyring", "--name", "client.manila", "--add-key", "AQDZkNdaAAAAABAAOLlLTuUzlvWd5zBF3mOk3g==", "--cap", "mds", "allow *", "--cap", "osd", "allow rw", "--cap", "mon", "allow r, allow command \\\\\\\"auth del\\\\\\\", allow command \\\\\\\"auth caps\\\\\\\", allow command \\\\\\\"auth get\\\\\\\", allow command \\\\\\\"auth get-or-create\\\\\\\"", "--cap", "mgr", "allow *"], "delta": "0:00:00.043788", "end": "2018-04-18 19:23:59.718230", "item": {"caps": {"mds": "allow *", "mgr": "allow *", "mon": "allow r, allow command \\\\\\\"auth del\\\\\\\", allow command \\\\\\\"auth caps\\\\\\\", allow command \\\\\\\"auth get\\\\\\\", allow command \\\\\\\"auth get-or-create\\\\\\\"", "osd": "allow rw"}, "key": "AQDZkNdaAAAAABAAOLlLTuUzlvWd5zBF3mOk3g==", "mode": "0600", "name": "client.manila"}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-18 19:23:59.674442", "stderr": "Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running", "stderr_lines": ["Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running"], "stdout": "", "stdout_lines": []}
> 2018-04-18 15:23:59,974 p=8377 u=mistral | failed: [192.168.24.15] (item={'caps': {'mds': u'', 'osd': u'allow rwx', 'mon': u'allow rw', 'mgr': u'allow *'}, 'mode': u'0600', 'key': u'AQDZkNdaAAAAABAAtXTnibZ9qbqJMYFRRzQVNw==', 'name': u'client.radosgw'}) => {"changed": true, "cmd": ["docker", "exec", "ceph-create-keys", "ceph-authtool", "--create-keyring", "/etc/ceph/ceph.client.radosgw.keyring", "--name", "client.radosgw", "--add-key", "AQDZkNdaAAAAABAAtXTnibZ9qbqJMYFRRzQVNw==", "--cap", "mds", "", "--cap", "osd", "allow rwx", "--cap", "mon", "allow rw", "--cap", "mgr", "allow *"], "delta": "0:00:00.045295", "end": "2018-04-18 19:23:59.962056", "item": {"caps": {"mds": "", "mgr": "allow *", "mon": "allow rw", "osd": "allow rwx"}, "key": "AQDZkNdaAAAAABAAtXTnibZ9qbqJMYFRRzQVNw==", "mode": "0600", "name": "client.radosgw"}, "msg": "non-zero return code", "rc": 1, "start": "2018-04-18 19:23:59.916761", "stderr": "Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running", "stderr_lines": ["Error response from daemon: Container fbd8a8b44ceeba4ae86e895183692339f23585f83b69da597e2ea4e3a081921c is not running"], "stdout": "", "stdout_lines": []}

When later retrying /tmp/ansible-mistral-actionM8MqNe/ansible-playbook-command.sh manually, I observed that the container exists on compute-0 (192.168.24.15, seen above), and noticed that only `/usr/bin/ceph --cluster ceph auth get client.openstack -f json` was being run (checked at 5-second intervals). It looked as follows:
> [root@compute-0 heat-admin]# docker exec ceph-create-keys ps -ax
> PID TTY STAT TIME COMMAND
> 1 ? Ss 0:00 sleep 300
> 6 ? Ssl 0:00 /usr/bin/python2.7 /usr/bin/ceph --cluster ceph auth get client.openstack -f json
> 53 ? Rs 0:00 ps -ax
> [root@compute-0 heat-admin]# ps -ef|grep -i ceph
> root 53973 53972 0 21:05 ? 00:00:00 /usr/bin/python /tmp/ansible_wlRcSA/ansible_module_ceph_key.py
> root 53974 53973 0 21:05 ? 00:00:00 /usr/bin/docker-current exec ceph-create-keys ceph --cluster ceph auth get client.openstack -f json
> root 53997 53980 0 21:05 ? 00:00:00 /usr/bin/python2.7 /usr/bin/ceph --cluster ceph auth get client.openstack -f json
> root 54483 51570 0 21:10 pts/0 00:00:00 grep --color=auto -i ceph
> [root@compute-0 heat-admin]# /tmp/strace -p 53997
> /tmp/strace: Process 53997 attached
> select(0, NULL, NULL, NULL, {0, 12388}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
> ...
> select(0, NULL, NULL, NULL, {0, 50000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 34013}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 1000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 2000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 4000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 8000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 16000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 32000}) = 0 (Timeout)
> select(0, NULL, NULL, NULL, {0, 50000}^C/tmp/strace: Process 53997 detached

docker restart ceph-create-keys
docker exec -t -i ceph-create-keys /bin/sh
> sh-4.2# /usr/bin/python2.7 /usr/bin/ceph --cluster ceph auth get client.openstack -f json
> 2018-04-18 22:13:44.275048 7fe53bda1700 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
>
> ^CCluster connection interrupted or timed out

undercloud:
> ansible-2.4.3.0-1.el7ae.noarch
> ansible-role-redhat-subscription-1.0.1-1.el7ost.noarch
> ceph-ansible-3.1.0-0.1.beta6.el7cp.noarch
> ansible-tripleo-ipsec-8.1.1-0.20180308133440.8f5369a.el7ost.noarch

compute's docker image used:
> 192.168.24.1:8787/rhceph/rhceph-3-rhel7 latest e3c91d93a251 6 weeks ago 589 MB

ceph packages inside that image:
> python-cephfs-12.2.1-45.el7cp.x86_64
> ceph-common-12.2.1-45.el7cp.x86_64
> ceph-selinux-12.2.1-45.el7cp.x86_64
> ceph-radosgw-12.2.1-45.el7cp.x86_64
> ceph-mgr-12.2.1-45.el7cp.x86_64
> libcephfs2-12.2.1-45.el7cp.x86_64
> ceph-base-12.2.1-45.el7cp.x86_64
> ceph-mon-12.2.1-45.el7cp.x86_64
> ceph-mds-12.2.1-45.el7cp.x86_64
> ceph-osd-12.2.1-45.el7cp.x86_64
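
A note on the select() timeouts in the strace output: the 1000, 2000, 4000, ... 50000 microsecond sequence looks like a doubling sleep capped at 50 ms. As far as I recall, that matches the polling loop Python 2.7's threading module uses for a timed wait, which would suggest the ceph CLI process is just waiting on a lock with a timeout (i.e. for a cluster connection that never completes) rather than doing any work. This is an interpretation, not something confirmed from the ceph source. The sketch below only reproduces the delay pattern; `backoff_delays_us` and its default values are hypothetical, chosen to fit the trace, and it simulates elapsed time instead of reading a clock.

```python
def backoff_delays_us(timeout_us, initial_us=500, cap_us=50000):
    """Yield sleep durations in microseconds for one timed wait.

    Each delay doubles the previous one, limited by cap_us and by the
    time remaining before timeout_us is used up (simulated, no real clock).
    """
    remaining = timeout_us
    delay = initial_us
    while remaining > 0:
        # Double the delay, but never exceed the cap or the remaining time.
        delay = min(delay * 2, remaining, cap_us)
        yield delay
        remaining -= delay


if __name__ == "__main__":
    # Print the delays for a hypothetical 200 ms wait.
    print(list(backoff_delays_us(200000)))
```

Under this reading, an odd value such as 34013 followed by 1000 would be the shortened last sleep of one timed wait followed by the start of the next one.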
Created attachment 1423800 [details] ceph-install-workflow.log
Created attachment 1423801 [details] docker-inspect-ceph-create-keys.log
- This is ceph-ansible bug 1568157
- Marking as duplicate

*** This bug has been marked as a duplicate of bug 1568157 ***