Bug 1663026

Summary: [FFU] [Rhos-10->13] Ceph upgrade fails on ffu-converge step
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Archit Modi <amodi>
Component: Ceph-Ansible Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA QA Contact: Yogev Rabl <yrabl>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.2 CC: aadil, anharris, aschoen, ceph-eng-bugs, edonnell, elicohen, gabrioux, gfidente, gmeno, johfulto, kdreyer, lbezdick, msufiyan, nthomas, sankarshan, shan, tchandra, tserlin, yrabl
Target Milestone: z1   
Target Release: 3.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-ansible-3.2.3-1.el7cp Ubuntu: ceph-ansible_3.2.3-2redhat1 Doc Type: Bug Fix
Doc Text:
Cause: ceph-ansible has tasks that parse JSON output from the ceph CLI. In the playbook, that output is a string containing JSON, so it must be converted with the `from_json` filter before it can be handled as a dict. When the output is empty, a default value still has to be passed to `from_json`, and that default value had an incorrect type.
Consequence: The `from_json` filter expects a string but was given an already-parsed JSON value, which caused the filter to throw an error.
Fix: The default value passed to `from_json` is now of the correct type (a string).
Result: Tasks that use the `from_json` filter no longer fail.
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-01-31 10:36:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1578730    
Attachments:
Description Flags
ceph_ansible_playbook.log none
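
The type mismatch described in the Doc Text can be reproduced outside Ansible. The sketch below is an illustration only, not the ceph-ansible code: it assumes that `from_json` behaves like Python's `json.loads` and that the playbook fell back to a default when the command output was empty. The helper names `from_json` and `default` are hypothetical stand-ins for the Ansible/Jinja2 filters.

```python
import json


def from_json(value):
    # Assumption: Ansible's from_json filter behaves like json.loads,
    # which accepts only str, bytes or bytearray.
    return json.loads(value)


def default(value, fallback):
    # Hypothetical stand-in for a Jinja2-style default that falls back
    # when the value is empty.
    return value if value else fallback


empty_stdout = ""  # the ceph CLI returned no output

# Before the fix: the default is already a dict, so parsing fails.
try:
    from_json(default(empty_stdout, {}))
    bad_default_failed = False
except TypeError:
    bad_default_failed = True

# After the fix: the default is a JSON string, so parsing succeeds.
parsed = from_json(default(empty_stdout, "{}"))
```

With a dict as the fallback the parse step raises a `TypeError`; with the string `"{}"` it yields an empty dict that later tasks can treat like any other parsed result.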

Comment 1 Giulio Fidente 2019-01-03 09:27:26 UTC
Created attachment 1518101
ceph_ansible_playbook.log

Comment 2 Giulio Fidente 2019-01-03 09:28:46 UTC
In job #46, the undercloud logs show that this is happening with ceph-ansible 3.2:

  Dec 19 14:28:57 Updated: ceph-ansible-3.2.0-1.el7cp.noarch

There was a recent change in ceph-ansible meant to fix this issue [1], and the fix appears to be included in the build tested by the job. The fix is probably not sufficient when there is some output but it is not in JSON format (perhaps spurious error messages).

Attaching the playbook logs and moving the bug to ceph/ceph-ansible.

1. https://github.com/ceph/ceph-ansible/commit/2cea33f7fc4bf59eaa249ca26ba326105e392402

Comment 6 Ken Dreyer (Red Hat) 2019-01-07 21:54:43 UTC
https://github.com/ceph/ceph-ansible/pull/3474 is now available in ceph-ansible v3.2.1 upstream.

Comment 13 Giulio Fidente 2019-01-11 16:31:21 UTC
Sorry guys, moving back to ASSIGNED because there is another error affecting FFU visible in the logs.

The switch-from-non-containerized-to-containerized-ceph-daemons playbook hits the same issue fixed by BZ #1650572; maybe the condition added with https://github.com/ceph/ceph-ansible/pull/3389 needs to be applied to the "switch" playbook too?

Comment 14 Sébastien Han 2019-01-14 15:36:06 UTC
Do you mind testing? https://github.com/ceph/ceph-ansible/pull/3500
Thanks.

Comment 16 Giulio Fidente 2019-01-20 21:14:28 UTC
*** Bug 1665664 has been marked as a duplicate of this bug. ***

Comment 19 Lukas Bezdicka 2019-01-22 09:53:58 UTC
Hi,
FFWD worked with https://github.com/ceph/ceph-ansible/pull/3512
If this patch is present in the latest build, then it's working.

Lukas

Comment 23 Eliad Cohen 2019-01-23 20:05:36 UTC
Verified, converge step completes successfully.

Comment 34 errata-xmlrpc 2019-01-31 10:36:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0223