Bug 1489353

Summary: OSP11 -> OSP12 upgrade: unable to rerun a failed major-upgrade-composable-steps-docker with ceph-ansible enabled because 'collect running osds' task fails
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Marius Cornea <mcornea>
Component: Ceph-Ansible
Assignee: Sébastien Han <shan>
Status: CLOSED ERRATA
QA Contact: Marius Cornea <mcornea>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 3.0
CC: adeza, aschoen, ceph-eng-bugs, dbecker, gmeno, hnallurv, mburns, mcornea, morazi, nthomas, rhel-osp-director-maint, sankarshan, seb, yrabl
Target Milestone: rc
Keywords: Triaged
Target Release: 3.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.0.0-0.1.rc7.el7cp Ubuntu: ceph-ansible_3.0.0~rc7-2redhat1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-12-05 23:42:05 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marius Cornea 2017-09-07 09:13:03 UTC
Description of problem:

OSP11 -> OSP12 upgrade: unable to rerun a failed major-upgrade-composable-steps-docker with ceph-ansible enabled because 'collect running osds' task fails.

We can see that the command systemctl list-units | grep "loaded active" | grep -Eo 'ceph-osd@[0-9]{1,2}.service' fails with rc 1 and empty output: the task had already run during the previous attempt and the service was already disabled, so grep finds no matching unit.

We should make this task idempotent so that the playbook can be run multiple times, allowing the user to rerun the OSP upgrade process when failures occur.
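
As an illustration of the idempotency concern, here is a minimal shell sketch, assuming the only requirement is that an empty result must not count as a failure. The function name collect_running_osds is hypothetical, and this is not necessarily how the upstream patch addresses the problem. It reads `systemctl list-units` output on stdin and reuses the pipeline from the failing task unchanged.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not taken from ceph-ansible).
# Reads `systemctl list-units` output on stdin and prints the ceph-osd units
# that are loaded and active, one per line.
collect_running_osds() {
    # grep exits with status 1 when nothing matches, which is what made the
    # task fail on a rerun. `|| true` turns "no match" into success, so an
    # empty list is reported instead of an error.
    grep "loaded active" | grep -Eo 'ceph-osd@[0-9]{1,2}.service' || true
}

# Intended usage on an OSD node:
#   systemctl list-units | collect_running_osds
```

With this shape, a rerun on a node whose OSD services were already stopped yields an empty list and exit status 0, so later steps can simply loop over zero units.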

2017-09-07 04:59:31,753 p=24012 u=mistral |  PLAY [switching from non-containerized to containerized ceph mgr] **************
2017-09-07 04:59:31,753 p=24012 u=mistral |  skipping: no hosts matched
2017-09-07 04:59:31,764 p=24012 u=mistral |  PLAY [switching from non-containerized to containerized ceph osd] **************
2017-09-07 04:59:31,821 p=24012 u=mistral |  TASK [Gathering Facts] *********************************************************
2017-09-07 04:59:35,159 p=24012 u=mistral |  ok: [192.168.24.20]
2017-09-07 04:59:35,165 p=24012 u=mistral |  TASK [collect running osds] ****************************************************
2017-09-07 04:59:35,511 p=24012 u=mistral |  fatal: [192.168.24.20]: FAILED! => {"changed": false, "cmd": "systemctl list-units | grep \"loaded active\" | grep -Eo 'ceph-osd@[0-9]{1,2}.service'", "delta": "0:00:00.013085", "end": "2017-09-07 08:59:35.029177", "failed": true, "rc": 1, "start": "2017-09-07 08:59:35.016092", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
2017-09-07 04:59:35,512 p=24012 u=mistral |  PLAY RECAP *********************************************************************
2017-09-07 04:59:35,512 p=24012 u=mistral |  192.168.24.10              : ok=50   changed=5    unreachable=0    failed=0   
2017-09-07 04:59:35,512 p=24012 u=mistral |  192.168.24.11              : ok=52   changed=7    unreachable=0    failed=0   
2017-09-07 04:59:35,512 p=24012 u=mistral |  192.168.24.13              : ok=4    changed=0    unreachable=0    failed=0   
2017-09-07 04:59:35,512 p=24012 u=mistral |  192.168.24.20              : ok=5    changed=0    unreachable=0    failed=1   
2017-09-07 04:59:35,513 p=24012 u=mistral |  192.168.24.8               : ok=4    changed=0    unreachable=0    failed=0   
2017-09-07 04:59:35,513 p=24012 u=mistral |  192.168.24.9               : ok=57   changed=6    unreachable=0    failed=0   
2017-09-07 04:59:35,513 p=24012 u=mistral |  localhost                  : ok=0    changed=0    unreachable=0    failed=0   


Version-Release number of selected component (if applicable):
ceph-ansible-3.0.0-0.rc6.4.g0d9489f.el7.noarch

Comment 2 seb 2017-09-08 11:36:15 UTC
Patch upstream has merged.

Comment 5 Harish NV Rao 2017-09-12 07:03:36 UTC
@Marius, if you are verifying this bug fix, could you please provide the qa_ack?

Comment 9 errata-xmlrpc 2017-12-05 23:42:05 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387