Description of problem:
Deployment fails during ceph-ansible deployment. Only Controller/Compute roles are used in deployment to 5 nodes (3 control, 2 compute). All nodes have ceph osd docker service on them. The ceph-ansible deployment fails with:

    The error was: error while evaluating conditional (hostvars[item]['ceph_osd_container_stat'].get('rc') == 0): 'dict object' has no attribute 'ceph_osd_container_stat'

Version-Release number of selected component (if applicable):
ceph-ansible-3.1.9-1.el7.noarch

How reproducible:
Seems to happen about 50% of the time.

Additional info:
The corresponding code that is failing is here: https://github.com/ceph/ceph-ansible/blob/60bc1e38db0e797ad6553584927f86486ae09c19/roles/ceph-handler/handlers/main.yml#L109
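To illustrate what the error message describes, here is a minimal plain-Python sketch of the failing conditional. It is not ceph-ansible code and does not claim to be the upstream fix: `hostvars` is modeled as an ordinary dict, the host names are hypothetical, and the helper functions are invented for illustration. The only thing taken from the report is the expression itself and the fact that `ceph_osd_container_stat` was missing for at least one host; why it was missing is not established here.

```python
# Plain-Python model of the conditional from the error message.
# Assumption: hostvars behaves like a nested dict; host names are hypothetical.
hostvars = {
    "controller-0": {"ceph_osd_container_stat": {"rc": 0}},
    "controller-1": {"ceph_osd_container_stat": {"rc": 1}},
    "compute-0": {},  # the result was never registered for this host
}


def osd_container_running_unguarded(host):
    """Mirror of the reported expression: fails if the key is absent."""
    return hostvars[host]["ceph_osd_container_stat"].get("rc") == 0


def osd_container_running_guarded(host):
    """Same check, but a missing result is treated as 'not running'."""
    stat = hostvars.get(host, {}).get("ceph_osd_container_stat", {})
    return stat.get("rc") == 0


if __name__ == "__main__":
    for name in hostvars:
        print(name, osd_container_running_guarded(name))
```

In this model the unguarded version raises `KeyError` for `compute-0`, which is the plain-Python analogue of the "has no attribute" failure above, while the guarded version simply evaluates to false for that host. Whether treating a missing result as "not running" is the right behavior for the handler is a design decision for the ceph-ansible maintainers.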
Created attachment 1499379: ceph log
Created attachment 1499383: ceph hieradata
Preliminary testing shows that the error does not happen with ceph-ansible-3.1.6. Running more tests to confirm it.
Confirmed: the problem does not happen with ceph-ansible-3.1.6.