Bug 2267114
| Summary: | /etc/ceph content gets deleted during OSP FFU 17.1 at the control plane system upgrade step | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | alisci <alisci> |
| Component: | Cephadm | Assignee: | Adam King <adking> |
| Status: | CLOSED UPSTREAM | QA Contact: | Mohit Bisht <mobisht> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | | |
| Version: | 5.3 | CC: | adking, akane, cephqe-warriors, dhill, fpantano, fpiccion, gfidente, jelle.hoylaerts.ext, johfulto, ktordeur, lonavarr, madgupta, rkachach, sabose, saraut, tserlin, vdas |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 5.3z9 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2026-03-04 08:52:20 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
alisci
2024-02-29 19:47:39 UTC
What's the next step to try and get this working properly? Maybe we need to back up and restore this folder before and after the leapp upgrade?

(In reply to Kenny Tordeurs from comment #11)
> What's the next step to try and get this working properly, maybe we need to
> backup and restore this folder before and after leapp upgrade ?

That's something that would work, assuming the label isn't getting the contents added back on its own. If you have that label and then do a `ceph mgr fail` (assuming you have multiple mgr daemons) just to get it to retry everything, does it not repopulate that directory?

@Adam the `ceph mgr fail` run from another node doesn't fix anything on the node where /etc/ceph is missing.

@adking could you please respond to Kenny's query? Our customer tried the `ceph mgr fail` but it doesn't fix it. It is annoying for them to have to do it manually.

(In reply to Kenny Tordeurs from comment #13)
> @Adam the `ceph mgr fail` run from another node doesn't fix anything on the
> node where /etc/ceph is missing.

`ceph mgr fail` should re-trigger the client keyring distribution. I'm wondering whether the /etc/ceph directory must exist for that to work, though. If the output of `ceph orch host ls` and `ceph orch client-keyring ls` confirms the client keyring is set up and the hosts have the labels, I'd guess you have to try recreating that directory on all the hosts.
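In concrete terms, the checks and recovery steps described above would be along these lines (illustrative only; the `mkdir` has to run on every host where /etc/ceph is missing):

~~~
# Confirm the hosts still carry the label the client-keyring spec targets.
ceph orch host ls

# Confirm a client keyring is actually managed by cephadm.
ceph orch client-keyring ls

# Recreate the directory cephadm distributes the conf/keyring into,
# in case it must exist before distribution can succeed.
mkdir -p /etc/ceph

# Fail over the active mgr so cephadm retries everything, including
# client keyring distribution.
ceph mgr fail
~~~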
Is it leapp or the fact that we removed ceph-common that broke it?

~~~
[dhill@supportshell-1 log]$ grep ceph dnf.rpm.log
2025-01-06T13:10:00-0500 SUBDEBUG Upgrade: libcephfs2-2:14.2.22-128.el8cp.x86_64
2025-01-06T13:10:00-0500 SUBDEBUG Upgrade: python3-ceph-argparse-2:14.2.22-128.el8cp.x86_64
2025-01-06T13:10:01-0500 SUBDEBUG Upgrade: python3-cephfs-2:14.2.22-128.el8cp.x86_64
2025-01-06T13:10:05-0500 SUBDEBUG Upgrade: puppet-ceph-5.0.1-2.20230811230058.30f9f59.el8ost.noarch
2025-01-06T13:11:02-0500 SUBDEBUG Upgrade: ceph-common-2:14.2.22-128.el8cp.x86_64
2025-01-06T13:12:04-0500 SUBDEBUG Upgraded: ceph-common-2:14.2.11-208.el8cp.x86_64
2025-01-06T13:12:09-0500 SUBDEBUG Upgraded: python3-cephfs-2:14.2.11-208.el8cp.x86_64
2025-01-06T13:12:09-0500 SUBDEBUG Upgraded: libcephfs2-2:14.2.11-208.el8cp.x86_64
2025-01-06T13:12:10-0500 SUBDEBUG Upgraded: puppet-ceph-3.1.2-2.20211230004837.201e8b1.el8ost.noarch
2025-01-06T13:12:13-0500 SUBDEBUG Upgraded: python3-ceph-argparse-2:14.2.11-208.el8cp.x86_64
2025-05-07T10:05:36-0400 SUBDEBUG Installed: cephadm-2:16.2.10-266.el8cp.noarch
2025-05-07T10:07:22-0400 SUBDEBUG Upgrade: libcephfs2-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:07:23-0400 SUBDEBUG Upgrade: python3-ceph-argparse-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:07:25-0400 SUBDEBUG Upgrade: python3-cephfs-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:07:25-0400 SUBDEBUG Installed: python3-ceph-common-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:07:40-0400 SUBDEBUG Upgrade: ceph-common-2:16.2.10-266.el8cp.x86_64
2025-05-07T10:08:24-0400 SUBDEBUG Upgraded: ceph-common-2:14.2.22-128.el8cp.x86_64
2025-05-07T10:08:24-0400 SUBDEBUG Upgraded: python3-cephfs-2:14.2.22-128.el8cp.x86_64
2025-05-07T10:08:24-0400 SUBDEBUG Upgraded: libcephfs2-2:14.2.22-128.el8cp.x86_64
2025-05-07T10:08:24-0400 SUBDEBUG Upgraded: python3-ceph-argparse-2:14.2.22-128.el8cp.x86_64
~~~

vs installed RPMs:

~~~
[dhill@supportshell-1 log]$ cat ../../etc/redhat-release
Red Hat Enterprise Linux release 9.2 (Plow)
[dhill@supportshell-1 log]$ cat ../../installed-rpms | grep ceph
cephadm-16.2.10-266.el8cp.noarch                            Wed May  7 10:05:36 2025
libcephfs2-16.2.10-266.el8cp.x86_64                         Wed May  7 10:07:22 2025
puppet-ceph-5.0.1-2.20230811230058.30f9f59.el8ost.noarch    Mon Jan  6 13:10:05 2025
~~~

~~~
2025-05-13 14:03:57.906 DEBUG PID: 819771 leapp.workflow.Download.dnf_package_download: Removing dependent packages:
2025-05-13 14:03:57.906 DEBUG PID: 819771 leapp.workflow.Download.dnf_package_download:  ceph-common    x86_64    2:16.2.10-266.el8cp    @System    77 M
~~~

If you remove, rather than upgrade, ceph-common, it deletes /etc/ceph. The issue is right here in the ceph.spec file:
~~~
%dir %{_sysconfdir}/ceph/
%config %{_sysconfdir}/bash_completion.d/ceph
%config %{_sysconfdir}/bash_completion.d/rados
%config %{_sysconfdir}/bash_completion.d/rbd
%config %{_sysconfdir}/bash_completion.d/radosgw-admin
%config(noreplace) %{_sysconfdir}/ceph/rbdmap
~~~
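For background, the semantics of those directives (general RPM spec behavior, not specific to this build) are roughly:

~~~
# %dir claims ownership of the directory itself, not of its contents;
# on package erase, RPM removes an owned directory once it is empty.
%dir %{_sysconfdir}/ceph/

# %config marks a file as configuration; a locally modified copy is
# saved as .rpmsave when the package is erased.
%config %{_sysconfdir}/bash_completion.d/ceph

# %config(noreplace) additionally keeps the on-disk file on upgrade,
# writing the packaged version as .rpmnew instead of overwriting it.
%config(noreplace) %{_sysconfdir}/ceph/rbdmap
~~~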
I could see this as an RPM bug: even when there are %config files under a %dir, the %dir is still deleted when the RPM is uninstalled, regardless of whether %config files remain under that directory.
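The ownership theory is easy to check on an affected node with stock rpm queries (a sketch; exact output varies per build):

~~~
# Which package owns /etc/ceph? If ceph-common is the only owner,
# erasing it without a replacement can take the directory with it.
rpm -qf /etc/ceph

# Confirm the directory and rbdmap are packaged by ceph-common...
rpm -ql ceph-common | grep '/etc/ceph'

# ...and that rbdmap is flagged as a config file.
rpm -qc ceph-common | grep rbdmap

# Dry-run the erase without changing anything on disk.
rpm -e --test ceph-common
~~~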
There's also this possibility: https://access.redhat.com/solutions/7054004

I opened a doc bug, https://issues.redhat.com/browse/OSPRH-16691, so that we can add a warning in the docs before leapp is run.

Closing this bug as part of bulk closing of bugs that have been open for more than a year without any significant updates. Please reopen with justification if you think this bug is still relevant and needs to be addressed in an upcoming release.

We need whoever maintains the latest RHCSv5 ceph-common RPM to ensure that /etc/ceph/ is not deleted when the RPM is removed. @tserlin can you update the ceph spec file in the RPM and publish a new latest RHCSv5 RPM to help with this?

This product has been discontinued or is no longer tracked in Red Hat Bugzilla.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days or the product is inactive and locked.