Bug 1984880

Summary: rolling upgrade to rhcs 4.2 z2 failing due to wrong mon host name
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Avi Avraham <aavraham>
Component: Ceph-Ansible Assignee: Dimitri Savineau <dsavinea>
Status: CLOSED ERRATA QA Contact: Vasishta <vashastr>
Severity: urgent Docs Contact: Aron Gunn <agunn>
Priority: unspecified    
Version: 4.2 CC: agunn, aschoen, ceph-eng-bugs, ceph-qe-bugs, dsavinea, gmeno, jbiao, kdreyer, nthomas, tserlin, vashastr, vereddy, ykaul
Target Milestone: --- Keywords: Regression, UpgradeBlocker
Target Release: 4.2z3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-ansible-4.0.62-1.el8cp, ceph-ansible-4.0.62-1.el7cp Doc Type: Bug Fix
Doc Text:
.Rolling upgrade fails when Ceph containers are collocated
The `rolling_update.yml` Ansible playbook fails when the Ceph Monitor and Ceph Object Gateway daemons are collocated in containers and the multi-site Ceph Object Gateway is enabled. The failure occurred because the `radosgw-admin` commands could not execute while the Ceph Monitor container was stopped during the upgrade process. With this release, the multi-site Ceph Object Gateway code within the `ceph-handler` role is skipped during the upgrade process. As a result, the `rolling_update.yml` Ansible playbook runs successfully.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-09-27 18:26:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1890121    

Description Avi Avraham 2021-07-22 11:50:18 UTC
Description of problem:

We are attempting an upgrade from RHCS 4.2z1 to RHCS 4.2z2 in a multi-site environment, and we are facing an issue where the playbook fails.

The issue appears to be in the container_exec_cmd variable. 
In a rolling update, the container_exec_cmd gets the value of mon_host.

However, mon_host gets the value of a different mon than the one we are running on, so container_exec_cmd gets a wrong value.

If my inventory is

[mons] 
Mon1 
Mon2 
Mon3

Then Ansible delegates to Mon1, but mon_host is equal to Mon3. The playbook fails when it tries to run Ceph commands.
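
The mismatch can be modelled with a small sketch. This is only an illustration of the reasoning above, not the ceph-ansible code: the function names are hypothetical, and the `ceph-mon-<hostname>` container name and the `podman` binary are assumptions made for the example.

```python
# Hypothetical model of the mismatch; not ceph-ansible's actual templates.

def container_exec_cmd(mon_hostname, container_binary="podman"):
    """Build the exec prefix for the monitor container named after mon_hostname.

    Assumes containers are named ceph-mon-<hostname> (an assumption for this sketch).
    """
    return f"{container_binary} exec ceph-mon-{mon_hostname}"


def command_can_run(delegate_host, mon_host):
    """In this model, the ceph-mon-<name> container exists only on host <name>."""
    return delegate_host == mon_host


mons = ["Mon1", "Mon2", "Mon3"]
delegate_host = mons[0]  # host the task is delegated to
mon_host = mons[2]       # value mon_host holds in the failing run

print(container_exec_cmd(mon_host))
print(command_can_run(delegate_host, mon_host))
```

In this model the command is built for Mon3's container but executed on Mon1, so the check fails; it passes only when mon_host matches the host the task is delegated to.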

Version-Release number of selected component (if applicable):
RHCS 4.2z1 

How reproducible:
Run a rolling upgrade.

Steps to Reproduce:
1.
2.
3.

Actual results:
Upgrade aborts with an error in the task "add endpoints to their zone groups (s)."
 

Expected results:
The rolling update ends successfully.

Additional info:

Comment 22 errata-xmlrpc 2021-09-27 18:26:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 4.2 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3670