Description of problem:
OSP11 -> OSP12 upgrade: unable to rerun a failed major-upgrade-composable-steps-docker with ceph-ansible enabled because the 'collect running osds' task fails.

The command systemctl list-units | grep "loaded active" | grep -Eo 'ceph-osd@[0-9]{1,2}.service' fails on the rerun because it was already executed in the previous run and the service was already disabled, so grep matches nothing and exits with rc 1. The task should be idempotent so that the playbook can be run multiple times, allowing the user to rerun the OSP upgrade process if failures occur.

2017-09-07 04:59:31,753 p=24012 u=mistral | PLAY [switching from non-containerized to containerized ceph mgr] **************
2017-09-07 04:59:31,753 p=24012 u=mistral | skipping: no hosts matched
2017-09-07 04:59:31,764 p=24012 u=mistral | PLAY [switching from non-containerized to containerized ceph osd] **************
2017-09-07 04:59:31,821 p=24012 u=mistral | TASK [Gathering Facts] *********************************************************
2017-09-07 04:59:35,159 p=24012 u=mistral | ok: [192.168.24.20]
2017-09-07 04:59:35,165 p=24012 u=mistral | TASK [collect running osds] ****************************************************
2017-09-07 04:59:35,511 p=24012 u=mistral | fatal: [192.168.24.20]: FAILED! => {"changed": false, "cmd": "systemctl list-units | grep \"loaded active\" | grep -Eo 'ceph-osd@[0-9]{1,2}.service'", "delta": "0:00:00.013085", "end": "2017-09-07 08:59:35.029177", "failed": true, "rc": 1, "start": "2017-09-07 08:59:35.016092", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2017-09-07 04:59:35,512 p=24012 u=mistral | PLAY RECAP *********************************************************************
2017-09-07 04:59:35,512 p=24012 u=mistral | 192.168.24.10 : ok=50 changed=5 unreachable=0 failed=0
2017-09-07 04:59:35,512 p=24012 u=mistral | 192.168.24.11 : ok=52 changed=7 unreachable=0 failed=0
2017-09-07 04:59:35,512 p=24012 u=mistral | 192.168.24.13 : ok=4 changed=0 unreachable=0 failed=0
2017-09-07 04:59:35,512 p=24012 u=mistral | 192.168.24.20 : ok=5 changed=0 unreachable=0 failed=1
2017-09-07 04:59:35,513 p=24012 u=mistral | 192.168.24.8 : ok=4 changed=0 unreachable=0 failed=0
2017-09-07 04:59:35,513 p=24012 u=mistral | 192.168.24.9 : ok=57 changed=6 unreachable=0 failed=0
2017-09-07 04:59:35,513 p=24012 u=mistral | localhost : ok=0 changed=0 unreachable=0 failed=0

Version-Release number of selected component (if applicable):
ceph-ansible-3.0.0-0.rc6.4.g0d9489f.el7.noarch
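For illustration only, here is a minimal shell sketch of why the 'collect running osds' command returns rc 1 on a rerun and one possible way to make it tolerate an empty result. This is not necessarily what the upstream patch does. The helper name collect_running_osds and the sample unit lines are hypothetical; the helper reads systemctl list-units style text on stdin so it can be tried without a running ceph-osd unit.

```shell
#!/bin/sh
# Hypothetical helper (not from ceph-ansible): same grep pipeline as the
# failing task, but "|| true" turns grep's "no match" exit status (1) into 0,
# so an empty result is no longer treated as a failure.
collect_running_osds() {
    grep "loaded active" | grep -Eo 'ceph-osd@[0-9]{1,2}.service' || true
}

# Assumed sample input: an OSD unit is still active (first run).
first_run='ceph-osd@3.service loaded active running Ceph object storage daemon osd.3'
# Assumed sample input: no ceph-osd unit is active any more (rerun).
rerun='sshd.service loaded active running OpenSSH server daemon'

printf '%s\n' "$first_run" | collect_running_osds   # prints: ceph-osd@3.service
printf '%s\n' "$rerun" | collect_running_osds       # prints nothing
echo "rc=$?"                                        # prints: rc=0
```

On a real host the function would be fed from systemctl list-units, e.g. systemctl list-units | collect_running_osds. In the playbook itself a similar effect could presumably be achieved by not treating a non-zero return code of this task as fatal, as long as later tasks cope with an empty OSD list.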
The upstream patch has been merged.
@Marius, if you are verifying this bug fix, could you please provide the qa_ack?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:3387