
Bug 2267114

Summary: /etc/ceph content gets deleted during OSP FFU 17.1 at the control plane system upgrade step
Product: [Red Hat Storage] Red Hat Ceph Storage
Component: Cephadm
Version: 5.3
Target Release: 5.3z9
Hardware: Unspecified
OS: Unspecified
Reporter: alisci <alisci>
Assignee: Adam King <adking>
QA Contact: Mohit Bisht <mobisht>
Status: CLOSED UPSTREAM
Severity: medium
Priority: high
Keywords: Reopened
Target Milestone: ---
Doc Type: If docs needed, set a value
Type: Bug
Last Closed: 2026-03-04 08:52:20 UTC
CC: adking, akane, cephqe-warriors, dhill, fpantano, fpiccion, gfidente, jelle.hoylaerts.ext, johfulto, ktordeur, lonavarr, madgupta, rkachach, sabose, saraut, tserlin, vdas

Description alisci 2024-02-29 19:47:39 UTC
Description of problem:
During the FFU to OSP 17.1.2 with director-deployed Ceph, at the control plane system upgrade step (step 10.1.8 of the FFU docs), the customer found the contents of /etc/ceph deleted and the deployment failing due to missing keyrings. The customer moved the FFU forward by recovering the keyrings from the other controller nodes; however, the cluster health status is still unhealthy.
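A rough sketch of that recovery path (the healthy-controller hostname and the exact file list are assumptions, not from the report):
~~~
# On the affected controller; controller-1 is a placeholder for a healthy peer.
mkdir -p /etc/ceph
scp controller-1:/etc/ceph/ceph.conf /etc/ceph/
scp 'controller-1:/etc/ceph/*.keyring' /etc/ceph/
~~~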

Version-Release number of selected component (if applicable):
OSP 17.1.2
Ceph 5.3.6


Detailed data will follow in subsequent private comments.

Comment 11 Kenny Tordeurs 2024-04-30 19:29:19 UTC
What's the next step to get this working properly? Maybe we need to back up and restore this folder before and after the leapp upgrade?
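A minimal sketch of that idea, assuming the archive path and the surrounding leapp run (both are placeholders):
~~~
# Before the leapp upgrade:
tar czf /root/etc-ceph-backup.tar.gz -C / etc/ceph
# ... run the leapp upgrade and reboot ...
# Afterwards, restore if the directory was wiped:
tar xzf /root/etc-ceph-backup.tar.gz -C /
~~~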

Comment 12 Adam King 2024-05-01 17:21:33 UTC
(In reply to Kenny Tordeurs from comment #11)
> What's the next step to get this working properly? Maybe we need to back up
> and restore this folder before and after the leapp upgrade?

That's something that would work, assuming the label isn't getting the contents added back on its own. If you have that label and then do a `ceph mgr fail` (assuming you have multiple mgr daemons) just to get it to retry everything, does it not repopulate that directory?

Comment 13 Kenny Tordeurs 2024-05-02 08:52:41 UTC
@Adam the `ceph mgr fail` run from another node doesn't fix anything on the node where /etc/ceph is missing.

Comment 14 Madhur Gupta 2024-05-16 13:50:12 UTC
@adking could you please respond to Kenny's query? Our customer tried the `ceph mgr fail` but it doesn't fix it. It is annoying for them to be doing it manually.

Comment 15 Adam King 2024-05-16 14:45:30 UTC
(In reply to Kenny Tordeurs from comment #13)
> @Adam the `ceph mgr fail` run from another node doesn't fix anything on the
> node where /etc/ceph is missing.

`ceph mgr fail` should re-trigger the client keyring distribution. I'm wondering, though, whether the /etc/ceph directory must exist for that to work. If the output of `ceph orch host ls` and `ceph orch client-keyring ls` confirms the client keyring is set up and the hosts have the labels, I'd try recreating that directory on all the hosts.
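A sketch of that sequence (the label and host names come from your own `ceph orch host ls` output; nothing here goes beyond the commands named above):
~~~
ceph orch client-keyring ls   # confirm a client-keyring spec exists for the expected label
ceph orch host ls             # confirm the affected host still carries that label
mkdir -p /etc/ceph            # on the affected host, in case cephadm needs the directory
ceph mgr fail                 # fail over the active mgr so cephadm retries distribution
ls -l /etc/ceph               # on the affected host, check whether the keyring reappeared
~~~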

Comment 20 David Hill 2025-05-14 16:31:03 UTC
Is it leapp, or the fact that we removed ceph-common, that broke it?
~~~
[dhill@supportshell-1 log]$ grep ceph dnf.rpm.log
2025-01-06T13:10:00-0500 SUBDEBUG Upgrade: libcephfs2-2:14.2.22-128.el8cp.x86_64
2025-01-06T13:10:00-0500 SUBDEBUG Upgrade: python3-ceph-argparse-2:14.2.22-128.el8cp.x86_64
2025-01-06T13:10:01-0500 SUBDEBUG Upgrade: python3-cephfs-2:14.2.22-128.el8cp.x86_64
2025-01-06T13:10:05-0500 SUBDEBUG Upgrade: puppet-ceph-5.0.1-2.20230811230058.30f9f59.el8ost.noarch
2025-01-06T13:11:02-0500 SUBDEBUG Upgrade: ceph-common-2:14.2.22-128.el8cp.x86_64
2025-01-06T13:12:04-0500 SUBDEBUG Upgraded: ceph-common-2:14.2.11-208.el8cp.x86_64
2025-01-06T13:12:09-0500 SUBDEBUG Upgraded: python3-cephfs-2:14.2.11-208.el8cp.x86_64
2025-01-06T13:12:09-0500 SUBDEBUG Upgraded: libcephfs2-2:14.2.11-208.el8cp.x86_64
2025-01-06T13:12:10-0500 SUBDEBUG Upgraded: puppet-ceph-3.1.2-2.20211230004837.201e8b1.el8ost.noarch
2025-01-06T13:12:13-0500 SUBDEBUG Upgraded: python3-ceph-argparse-2:14.2.11-208.el8cp.x86_64
2025-05-07T10:05:36-0400 SUBDEBUG Installed: cephadm-2:16.2.10-266.el8cp.noarch
2025-05-07T10:07:22-0400 SUBDEBUG Upgrade: libcephfs2-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:07:23-0400 SUBDEBUG Upgrade: python3-ceph-argparse-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:07:25-0400 SUBDEBUG Upgrade: python3-cephfs-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:07:25-0400 SUBDEBUG Installed: python3-ceph-common-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:07:40-0400 SUBDEBUG Upgrade: ceph-common-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:08:24-0400 SUBDEBUG Upgraded: ceph-common-2:14.2.22-128.el8cp.x86_64
2025-05-07T10:08:24-0400 SUBDEBUG Upgraded: python3-cephfs-2:14.2.22-128.el8cp.x86_64
2025-05-07T10:08:24-0400 SUBDEBUG Upgraded: libcephfs2-2:14.2.22-128.el8cp.x86_64
2025-05-07T10:08:24-0400 SUBDEBUG Upgraded: python3-ceph-argparse-2:14.2.22-128.el8cp.x86_64
~~~
vs installed RPMs:
~~~
[dhill@supportshell-1 log]$ cat ../../etc/redhat-release 
Red Hat Enterprise Linux release 9.2 (Plow)

[dhill@supportshell-1 log]$ cat ../../installed-rpms  | grep ceph
cephadm-16.2.10-266.el8cp.noarch                            Wed May  7 10:05:36 2025
libcephfs2-16.2.10-266.el8cp.x86_64                         Wed May  7 10:07:22 2025
puppet-ceph-5.0.1-2.20230811230058.30f9f59.el8ost.noarch    Mon Jan  6 13:10:05 2025
~~~

Comment 21 David Hill 2025-05-14 16:32:35 UTC
~~~
2025-05-13 14:03:57.906 DEBUG    PID: 819771 leapp.workflow.Download.dnf_package_download: Removing dependent packages:
2025-05-13 14:03:57.906 DEBUG    PID: 819771 leapp.workflow.Download.dnf_package_download:  ceph-common                                x86_64  2:16.2.10-266.el8cp                         @System                      77 M
~~~

Comment 22 David Hill 2025-05-14 16:36:17 UTC
If you remove ceph-common without upgrading it, /etc/ceph gets deleted.

Comment 23 David Hill 2025-05-14 16:47:09 UTC
The issue is right here in the ceph.spec file:
~~~
%dir %{_sysconfdir}/ceph/
%config %{_sysconfdir}/bash_completion.d/ceph
%config %{_sysconfdir}/bash_completion.d/rados
%config %{_sysconfdir}/bash_completion.d/rbd
%config %{_sysconfdir}/bash_completion.d/radosgw-admin
%config(noreplace) %{_sysconfdir}/ceph/rbdmap
~~~

I could see this as an RPM bug: even when a %dir contains %config files, the %dir is still deleted when the RPM is uninstalled, regardless of the %config files under that directory.
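A quick way to inspect that ownership with standard rpm queries (run on a node where ceph-common is installed):
~~~
rpm -qf /etc/ceph     # which package owns the directory
rpm -qc ceph-common   # list the package's %config files
# Per-file flags (bit 0x1 marks %config entries):
rpm -q --queryformat '[%{FILEFLAGS} %{FILENAMES}\n]' ceph-common | grep /etc/ceph
~~~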

Comment 24 David Hill 2025-05-14 17:03:11 UTC
There's also this possibility https://access.redhat.com/solutions/7054004

Comment 25 John Fulton 2025-05-14 19:16:17 UTC
I opened a doc bug, https://issues.redhat.com/browse/OSPRH-16691, so that we can add a warning in the docs before leapp is run.

Comment 29 Sahina Bose 2025-08-29 07:56:53 UTC
Closing this bug as part of a bulk closing of bugs that have been open for more than a year without any significant updates. Please reopen with justification if you think this bug is still relevant and needs to be addressed in an upcoming release.

Comment 31 John Fulton 2025-09-08 16:39:20 UTC
We need whoever maintains the latest RHCSv5 ceph-common RPM to ensure that /etc/ceph/ is not deleted when the RPM is removed.

@tserlin can you update the ceph spec file and publish a new latest RHCSv5 RPM to help with this?
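A minimal verification sketch for such a fixed package, on a disposable node (the dummy keyring name is an assumption):
~~~
# With the candidate ceph-common installed:
touch /etc/ceph/test.client.keyring   # stand-in for a cephadm-managed, unowned file
dnf -y remove ceph-common
ls -la /etc/ceph                      # with a correct spec, directory and keyring survive
~~~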

Comment 36 Red Hat Bugzilla 2026-03-04 08:52:20 UTC
This product has been discontinued or is no longer tracked in Red Hat Bugzilla.

Comment 37 Red Hat Bugzilla 2026-03-05 04:26:38 UTC
The needinfo request[s] on this closed bug have been removed, as they have been unresolved for 120 days or the product is inactive and locked.