Created attachment 1996428 [details]
Logs of one of the failing osd prepare pods

Description of problem (please be detailed as possible and provide log snippets):

Recently I noticed that the e2e tests in the upstream ocs-operator CI started failing. Upon investigation, I found that during installation the osd-prepare pods go into CLBO state, which is why the CI was failing. I then reproduced the same issue on my own cluster. I am attaching the logs of the failing osd-prepare pod and the OCS must-gather I collected.

Considering nothing else changed, I suspect something has changed on the Ceph side, as we use the ceph image quay.io/ceph/ceph:v17 in upstream ocs-operator. This is blocking all PRs in the ocs-operator repo because the e2e tests are not working.

Is there any workaround available, to the best of your knowledge?
No

Steps to Reproduce:
1. Install the upstream ocs-operator
2. Create a StorageCluster
3. Observe that the osd-prepare pods go into CLBO
Created attachment 1996429 [details]
ocs-mustgather

Adding the must-gather.
The tests are using the latest Ceph Quincy image, quay.io/ceph/ceph:v17. As of four days ago, v17.2.7 was released, and it introduced this issue.

From the Rook operator log:

2023-10-31T14:38:45.593344167Z 2023-10-31 14:38:45.593319 I | ceph-spec: detecting the ceph image version for image quay.io/ceph/ceph:v17...
2023-10-31T14:38:47.307103937Z 2023-10-31 14:38:47.307067 I | ceph-spec: detected ceph image version: "17.2.7-0 quincy"

This issue is affecting all OSDs on PVCs upstream, as described here:
https://github.com/rook/rook/issues/13136

Until a fix is found, you can instead use the previous good version of the Ceph image (instead of the v17 tag, which picks up the latest image that is currently broken):

quay.io/ceph/ceph:v17.2.6
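As a sketch of where the pinned tag would go, assuming the OSDs are managed by a Rook CephCluster CR that you can edit directly (the metadata values below are hypothetical; the spec.cephVersion.image field is from the Rook CephCluster spec):

```yaml
# Hypothetical excerpt of a Rook CephCluster CR.
# Pin the exact known-good tag instead of the floating v17 tag,
# so the broken v17.2.7 image is not pulled.
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: my-cluster        # hypothetical name
  namespace: rook-ceph    # hypothetical namespace
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v17.2.6   # previous good version, per this report
```

Note that in an ocs-operator deployment the Ceph image is normally injected by the operator rather than set by hand, so this fragment is only illustrative of where the pinned tag ends up in the resulting CephCluster.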
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383