Bug 1863036

Summary: [OSP13->OSP16.1] Keep ceph-ansible 3 in the Undercloud during the whole FFU procedure.
Product: Red Hat OpenStack
Reporter: Jose Luis Franco <jfrancoa>
Component: openstack-tripleo-heat-templates
Assignee: Francesco Pantano <fpantano>
Status: CLOSED ERRATA
QA Contact: Jose Luis Franco <jfrancoa>
Severity: high
Docs Contact:
Priority: urgent
Version: 16.1 (Train)
CC: fpantano, mburns, spower, yrabl
Target Milestone: z1
Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-11.3.2-0.20200616081537.396affd.el8ost
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-08-27 15:19:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1853457
Bug Blocks:

Description Jose Luis Franco 2020-08-03 14:07:44 UTC
Description of problem:

To solve a Ceph upgrade issue, the docker2podman Ansible playbook was backported to ceph-ansible 3 (https://bugzilla.redhat.com/show_bug.cgi?id=1853457). This allows us to adapt the FFU procedure for Ceph-enabled environments by keeping ceph-ansible 3 in the Undercloud until the whole FFU procedure has finished. Only once the converge step has run do we start the Ceph upgrade step (which includes updating ceph-ansible to ceph-ansible 4).

For this to work, a set of changes is required in tripleo-heat-templates. Otherwise, the ceph_systemd step (which belongs to the FFU procedure) will fail during the ceph-ansible repo validations.

Currently, there is no way to work around this, because the value of the fail_without_ceph_ansible parameter is hardcoded (https://github.com/openstack/tripleo-heat-templates/blob/stable/train/deployment/ceph-ansible/ceph-base.yaml#L634) and cannot be overridden during the FFU process. An alternative would be to skip the validations during the ceph_systemd step, but that could mask other issues that these validations catch.
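
To illustrate why the hardcoded value blocks the procedure, here is a rough Python model of the check as described above. It is only a sketch and not the actual validation code: the function name, the exception, the repo names, and the warn-instead-of-fail behavior when the flag is false are all assumptions made for illustration.

```python
from typing import Optional


class CephAnsibleRepoError(Exception):
    """Raised when the (modelled) ceph-ansible repo check is fatal."""


def check_ceph_ansible_repo(
    installed_from_repo: Optional[str],
    expected_repo: str,
    fail_without_ceph_ansible: bool = True,
) -> str:
    """Hypothetical model of the ceph-ansible repo validation.

    installed_from_repo: repo the installed ceph-ansible package came
        from, or None if the package is not installed.
    expected_repo: repo the deployment expects (what CephAnsibleRepo
        would point at).
    fail_without_ceph_ansible: assumed to turn a mismatch into a hard
        failure when True and into a warning when False.
    """
    if installed_from_repo == expected_repo:
        return "ok"

    if installed_from_repo is None:
        problem = "ceph-ansible is not installed"
    else:
        problem = (
            f"ceph-ansible came from {installed_from_repo!r}, "
            f"expected {expected_repo!r}"
        )

    if fail_without_ceph_ansible:
        # With the flag fixed to True there is no way to reach the
        # warning branch below, which is the situation reported here.
        raise CephAnsibleRepoError(problem)
    return f"warning: {problem}"
```

In this model, keeping ceph-ansible 3 from an older repo (for example a placeholder "ceph-3-repo") while a "ceph-4-repo" is expected always raises unless the flag can be set to False, which is what an overridable parameter would allow during FFU.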

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Run the Undercloud upgrade, adding ceph-ansible to to_keep and making sure no ceph repos are configured after leapp.
2. Run overcloud upgrade prepare without passing any value for CephAnsibleRepo.
3. Run the containers prepare step.
4. Run the ceph_systemd step for the first controller. It fails in "TASK [Fail if ceph-ansible doesn't belong to the specified repo] ***************".




Actual results:

ceph_systemd step fails to run

Expected results:

ceph_systemd step works.


Additional info:

Comment 5 Yogev Rabl 2020-08-07 16:13:34 UTC
verified

Comment 8 errata-xmlrpc 2020-08-27 15:19:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1 director bug fix advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3542