Bug 2273936 - ceph-volume raw list and activate fail
Summary: ceph-volume raw list and activate fail
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: Ceph-Volume
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 7.1
Assignee: Guillaume Abrioux
QA Contact: Aditya Ramteke
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On:
Blocks: 2267614 2298578 2298579
 
Reported: 2024-04-08 08:53 UTC by Guillaume Abrioux
Modified: 2024-08-26 11:53 UTC
CC List: 8 users

Fixed In Version: ceph-18.2.1-130.el9cp
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-06-13 14:31:26 UTC
Embargoed:




Links
System                             ID               Last Updated
Red Hat Issue Tracker              RHCEPH-8773      2024-04-08 08:53:40 UTC
Red Hat Knowledge Base (Solution)  7063703          2024-04-23 16:31:53 UTC
Red Hat Product Errata             RHSA-2024:3925   2024-06-13 14:31:30 UTC

Description Guillaume Abrioux 2024-04-08 08:53:17 UTC
This bug was initially created as a copy of Bug #2273724

Description of problem:
After an upgrade from ODF 4.12 to ODF 4.14, the Ceph OSD containers are not running because ceph-volume seems to be unable to activate the OSD devices that are based on LVM from old OCS deployments (4.2/4.3 up to 4.4). More information can be found in BZ https://bugzilla.redhat.com/show_bug.cgi?id=2273398.
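
For reference, these are the two ceph-volume subcommand families involved, run from inside the OSD container (a minimal sketch; the raw variant is the one reported to fail here, while the lvm variant is used in the workaround below):

~~~
# Inside the OSD container. On affected builds, the raw subcommands fail:
ceph-volume raw list          # enumerate OSDs on raw devices
# The lvm subcommands are used in the workaround for LVM-based OSDs:
ceph-volume lvm list          # enumerate LVM-based OSDs
~~~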

Sadly, we don't have any usable ceph-volume logs, but this seems like a very strong contender. Workaround to bring the OSD back up (a consolidated shell sketch follows the steps below):

~~~
- Create a backup of the OSD deployment (we're going to remove the liveness probe)
- Scale down the rook-ceph and ocs operators
- oc edit the OSD deployment, search for the expand-bluefs section, and remove that container
- oc get pods to see if the OSD came up (still 1/2), then rsh into the container
   - ceph-volume lvm list
   - ceph-volume lvm activate --no-systemd <osd id> <osd fsid>   // osd fsid from ceph-volume lvm list
   - The OSD was activated, and when we viewed the OSD data dir, the block device was listed:
      - ls -l /var/lib/ceph/osd/ceph-<id>
~~~
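
As a consolidated sketch of the steps above (the namespace, deployment name, and OSD id are assumptions typical of an ODF/OCS cluster; adjust them to yours):

~~~
NS=openshift-storage            # assumed namespace
OSD_DEPLOY=rook-ceph-osd-0      # assumed OSD deployment name

# Back up the OSD deployment before editing it
oc -n "$NS" get deployment "$OSD_DEPLOY" -o yaml > "${OSD_DEPLOY}-backup.yaml"

# Scale down the operators so they don't revert the manual edit
oc -n "$NS" scale deployment rook-ceph-operator --replicas=0
oc -n "$NS" scale deployment ocs-operator --replicas=0

# Remove the expand-bluefs container (and the liveness probe) by hand
oc -n "$NS" edit deployment "$OSD_DEPLOY"

# Once the pod is up (even 1/2), rsh in and activate the OSD
oc -n "$NS" rsh deploy/"$OSD_DEPLOY" ceph-volume lvm list
oc -n "$NS" rsh deploy/"$OSD_DEPLOY" ceph-volume lvm activate --no-systemd <osd-id> <osd-fsid>

# Verify the block device shows up in the OSD data dir
oc -n "$NS" rsh deploy/"$OSD_DEPLOY" ls -l /var/lib/ceph/osd/ceph-<osd-id>
~~~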

Ask:
- What changed in ceph-volume from 4.13 to 4.14 that would cause any issues with LVM-based OSDs from earlier versions of OCS?

Comment 8 errata-xmlrpc 2024-06-13 14:31:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925

