Bug 2295143 - Ceph upgrade fails when running FFU 16.2 to 17.1
Summary: Ceph upgrade fails when running FFU 16.2 to 17.1
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Ansible
Version: 5.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 5.3z8
Assignee: Teoman ONAY
QA Contact: Manisha Saini
URL:
Whiteboard:
Depends On:
Blocks: 2160009
 
Reported: 2024-07-02 08:54 UTC by Itzik Brown
Modified: 2025-02-18 14:43 UTC
CC List: 20 users

Fixed In Version: ceph-ansible-6.0.28.17-1.el8cp
Doc Type: Bug Fix
Doc Text:
.The "Update the placement of radosgw hosts" task no longer fails during upgrade Previously, the "Update the placement of radosgw hosts" task would fail during an upgrade from Red Hat Ceph Storage 4 to Red Hat Ceph Storage 5. With this fix, the "Update the placement of radosgw hosts" task completes as expected.
Clone Of:
Environment:
Last Closed: 2025-02-13 19:22:53 UTC
Embargoed:




Links
Github ceph/ceph-ansible pull 7575 (Merged): cephadm-adopt: fix "Update the placement of radosgw hosts" task (last updated 2024-08-08 07:18:27 UTC)
Red Hat Issue Tracker RHCEPH-9274 (last updated 2024-07-02 08:59:52 UTC)
Red Hat Knowledge Base (Solution) 7083494 (last updated 2024-08-21 00:08:12 UTC)
Red Hat Product Errata RHBA-2025:1478 (last updated 2025-02-13 19:23:00 UTC)

Description Itzik Brown 2024-07-02 08:54:37 UTC
Description of problem:
FFU fails in the Ceph upgrade stage.
From the ceph-upgrade-run.log

2024-07-02 07:58:20,105 p=554280 u=root n=ansible | TASK [Update the placement of radosgw hosts] ***********************************
2024-07-02 07:58:20,105 p=554280 u=root n=ansible | Tuesday 02 July 2024  07:58:20 +0000 (0:00:00.231)       0:05:29.024 ********** 
2024-07-02 07:58:20,198 p=554280 u=root n=ansible | fatal: [controller-0 -> {{ groups[mon_group_name][0] }}]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: the inline if-expression on line 10 evaluated to false and no else section was defined.\n\nThe error appears to be in '/usr/share/ceph-ansible/infrastructure-playbooks/cephadm-adopt.yml': line 1016, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n    - name: Update the placement of radosgw hosts\n      ^ here\n"}
2024-07-02 07:58:20,289 p=554280 u=root n=ansible | fatal: [controller-1 -> {{ groups[mon_group_name][0] }}]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: the inline if-expression on line 10 evaluated to false and no else section was defined.\n\nThe error appears to be in '/usr/share/ceph-ansible/infrastructure-playbooks/cephadm-adopt.yml': line 1016, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n    - name: Update the placement of radosgw hosts\n      ^ here\n"}
2024-07-02 07:58:20,316 p=554280 u=root n=ansible | fatal: [controller-2 -> {{ groups[mon_group_name][0] }}]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: the inline if-expression on line 10 evaluated to false and no else section was defined.\n\nThe error appears to be in '/usr/share/ceph-ansible/infrastructure-playbooks/cephadm-adopt.yml': line 1016, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n    - name: Update the placement of radosgw hosts\n      ^ here\n"}
2024-07-02 07:58:20,316 p=554280 u=root n=ansible | NO MORE HOSTS LEFT 
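
This error class comes from a Jinja2 inline if-expression that has no else branch: when the condition evaluates to false, the expression is undefined and the task fails with exactly the message shown above. A minimal sketch of the pattern and a defensive variant, using a hypothetical variable name rather than the actual cephadm-adopt.yml code:

```
# Hypothetical illustration; rgw_zone is an assumed variable, not from cephadm-adopt.yml.
- name: Inline if without an else branch (fails when the condition is false)
  ansible.builtin.debug:
    msg: "{{ rgw_zone if rgw_zone is defined }}"

# Supplying an else branch keeps the expression defined in both cases.
- name: Inline if with an else branch (succeeds either way)
  ansible.builtin.debug:
    msg: "{{ rgw_zone if rgw_zone is defined else 'default' }}"
```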

Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20240612.n.1
RHOS-17.1-RHEL-8-20240701.n.1

How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 8 John Fulton 2024-07-16 13:50:57 UTC
Workaround:

Downgrade ceph-ansible to a version older than ceph-ansible-6.0.28.8-1.el8cp

And use the workaround described in comment #0 of https://bugzilla.redhat.com/show_bug.cgi?id=2262133:

```
extra_container_args:
- -v
- /etc/pki/ca-trust:/etc/pki/ca-trust:ro
```
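
For context, extra_container_args is a field of a cephadm service specification. A sketch of how the snippet above might sit in a complete RGW spec; the service_id and placement values are placeholders, not taken from this environment:

```
# Hypothetical RGW service spec; only extra_container_args comes from the workaround above.
service_type: rgw
service_id: default
placement:
  hosts:
    - controller-0
    - controller-1
    - controller-2
extra_container_args:
  - -v
  - /etc/pki/ca-trust:/etc/pki/ca-trust:ro
```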

Comment 19 John Fulton 2024-08-18 16:02:53 UTC
Manny, 

ceph-ansible-6.0.28.8-1.el8cp has bug 2295143, so as a workaround use the previous version to avoid hitting that bug.

The required version per the table (https://access.redhat.com/solutions/2045583) may be the latest, but it seems to have this bug. When we ship an update to the latest version (which will then become the required version), that is what will ultimately solve this problem. Until then, the "required" version has this bug.

Comment 22 Manny 2024-08-21 00:09:12 UTC
Please see the KCS article (https://access.redhat.com/solutions/7083494) for this issue.

BR
Manny

Comment 45 errata-xmlrpc 2025-02-13 19:22:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.3 security and bug fix updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2025:1478

Comment 46 John Fulton 2025-02-18 14:43:16 UTC
(In reply to John Fulton from comment #8)
> Workaround:
> 
> Downgrade ceph-ansible to a version older than ceph-ansible-6.0.28.8-1.el8cp
> 
> And use workaround described in comment #0 of
> https://bugzilla.redhat.com/show_bug.cgi?id=2262133
> 
> ```
> extra_container_args:
> - -v
> - /etc/pki/ca-trust:/etc/pki/ca-trust:ro
> ```

It is no longer necessary to downgrade now that ceph-ansible 6.0.28.20-1 has been released, which contains a fix for this bug and others. So the recommendation is the usual one: use the latest available version.

