Description of problem:

After an upgrade from ODF 4.12 to ODF 4.14, the ceph osd containers are not running because ceph-volume appears to be unable to activate the OSD devices that are based on LVM from old OCS deployments (4.2/4.3 up to 4.4). More information can be found in BZ https://bugzilla.redhat.com/show_bug.cgi?id=2273398. Unfortunately, we don't have any usable ceph-volume logs, but this seems like a very strong contender.

Workaround to bring the osd back up (a shell sketch of these steps is included below):
~~~
- Created a backup of the osd deployment; we're going to remove the liveness probe
- Scaled down the rook-ceph and ocs operators
- oc edit the osd deployment, searched for the expand-bluefs section and removed that container
- oc get pods to check whether the osd came up (still 1/2) and rshed into the container
- ceph-volume lvm list
- ceph-volume lvm activate --no-systemd <osd id> <osd fsid>   # osd fsid from ceph-volume lvm list
- The osd was activated and, when we viewed the osd data dir, the block device was listed:
- ls -l /var/lib/ceph/osd/ceph-{id}
~~~

Ask:
- What changed in ceph-volume from 4.13 to 4.14 that would cause issues with LVM-based OSDs from earlier versions of OCS?
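A rough shell sketch of the workaround steps, for reference. The namespace (openshift-storage), the OSD deployment name pattern (rook-ceph-osd-<id>), and the operator deployment names are assumptions based on a typical ODF install and may differ in a given cluster.

~~~
# Hedged sketch of the workaround above; names and namespace are assumptions, adjust for your cluster.
NS=openshift-storage
OSD_ID=0                                   # id of the affected OSD
OSD_DEPLOY="rook-ceph-osd-${OSD_ID}"

# Back up the OSD deployment before editing it
oc -n "$NS" get deployment "$OSD_DEPLOY" -o yaml > "${OSD_DEPLOY}-backup.yaml"

# Scale down the operators so they do not revert the manual edit
oc -n "$NS" scale deployment rook-ceph-operator --replicas=0
oc -n "$NS" scale deployment ocs-operator --replicas=0

# Manually remove the expand-bluefs container (and the liveness probe) from the OSD deployment
oc -n "$NS" edit deployment "$OSD_DEPLOY"

# When the OSD pod is running again (still 1/2), rsh into it and activate the OSD by hand
oc -n "$NS" rsh "deployment/${OSD_DEPLOY}"
# inside the pod:
#   ceph-volume lvm list
#   ceph-volume lvm activate --no-systemd <osd id> <osd fsid>   # fsid from 'ceph-volume lvm list'
#   ls -l /var/lib/ceph/osd/ceph-<osd id>                       # the block device should now be listed
~~~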
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Critical: Red Hat Ceph Storage 6.1 security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:2631