
Bug 2273934

Summary: ceph-volume raw list and activate fail
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Guillaume Abrioux <gabrioux>
Component: Ceph-Volume
Assignee: Guillaume Abrioux <gabrioux>
Status: CLOSED ERRATA
QA Contact: Aditya Ramteke <aramteke>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 7.0
CC: ceph-eng-bugs, cephqe-warriors, kdreyer, mcaldeir, msaini, rpollack, tserlin, vereddy
Target Milestone: ---
Keywords: Regression
Target Release: 7.0z2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-18.2.0-191.el9cp
Doc Type: Bug Fix
Doc Text:
Cause: Since the introduction of bluestore-rdr, `ceph-volume raw list` tries to get the objectstore type from the BlueStore labels. This objectstore type label was not set before bluestore-rdr was introduced.
Consequence: `ceph-volume raw activate` fails for any OSD created before the Ceph version that introduced bluestore-rdr, because it calls `ceph-volume raw list`.
Fix: ceph-volume tries to get the label and defaults to "bluestore" when it cannot retrieve it.
Result: `ceph-volume raw list` and `ceph-volume raw activate` no longer fail.
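The fallback described in the Doc Text can be illustrated with a minimal Python sketch. This is not the actual ceph-volume patch: the function name `objectstore_from_labels`, the label key `objectstore_type`, and the sample label dictionaries are all hypothetical, chosen only to show the "try the label, default to bluestore" pattern.

```python
from typing import Mapping

# Default used when a device's labels carry no objectstore type,
# as is the case for OSDs created before bluestore-rdr existed.
DEFAULT_OBJECTSTORE = "bluestore"


def objectstore_from_labels(labels: Mapping[str, str],
                            key: str = "objectstore_type") -> str:
    """Return the objectstore type from the labels, or the default.

    `key` is a hypothetical label name; the real label name may differ.
    """
    value = labels.get(key)
    # A missing or empty label falls back to the default instead of raising.
    return value if value else DEFAULT_OBJECTSTORE


# Hypothetical label sets for an old and a new OSD.
old_osd_labels = {"osd_uuid": "hypothetical-uuid"}
new_osd_labels = {"osd_uuid": "hypothetical-uuid",
                  "objectstore_type": "bluestore-rdr"}

print(objectstore_from_labels(old_osd_labels))
print(objectstore_from_labels(new_osd_labels))
```

The point of the pattern is that a lookup which previously assumed the label was always present now tolerates its absence, so listing (and therefore activation) can proceed for older OSDs.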
Story Points: ---
Clone Of:
Environment:
Last Closed: 2024-05-07 12:11:18 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Guillaume Abrioux 2024-04-08 08:50:28 UTC
This bug was initially created as a copy of Bug #2273724

I am copying this bug because: 



Description of problem:
After an upgrade from ODF 4.12 to ODF 4.14, the Ceph OSD containers are not running because ceph-volume seems to be unable to activate the LVM-based OSD devices from old OCS deployments (4.3/2 up to 4.4). More information can be found in https://bugzilla.redhat.com/show_bug.cgi?id=2273398.

Sadly, we don't have any usable ceph-volume logs, but ceph-volume seems like a very strong contender for the cause. Workaround used to bring the OSD back up:

~~~
- Created a backup of the OSD deployment, since we were going to remove the liveness probe
- Scaled down the rook-ceph and ocs operators
- Ran oc edit on the OSD deployment, searched for the expand-bluefs section, and removed that container
- Ran oc get pods to see if the OSD came up (still 1/2) and rsh'ed into the container
   - ceph-volume lvm list
   - ceph-volume lvm activate --no-systemd -- <osd.id> <osd fsid> // osd fsid from ceph-volume lvm list
   - The OSD was activated, and when we viewed the OSD data dir, the block device was listed:
      - ls -l /var/lib/ceph/osd/ceph-{id}
~~~
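The manual lookup of the OSD id and fsid in the workaround above could be scripted roughly as follows. This is a sketch under stated assumptions, not a supported procedure: it assumes the listing was captured with `ceph-volume lvm list --format json` and that the JSON maps each OSD id to a list of volumes carrying a `type` field and a `ceph.osd_fsid` tag. The helper name `activate_commands` and the sample data are hypothetical. The helper only builds the command strings; it does not run them.

```python
import json
import shlex
from typing import List


def activate_commands(lvm_list_json: str) -> List[str]:
    """Build one `ceph-volume lvm activate` command line per OSD.

    Assumes the JSON shape {osd_id: [{"type": ..., "tags": {...}}, ...]}.
    """
    listing = json.loads(lvm_list_json)
    commands = []
    for osd_id, volumes in sorted(listing.items()):
        for volume in volumes:
            fsid = volume.get("tags", {}).get("ceph.osd_fsid")
            # Use the block volume's fsid; skip entries without one.
            if volume.get("type") == "block" and fsid:
                commands.append(shlex.join([
                    "ceph-volume", "lvm", "activate",
                    "--no-systemd", "--", osd_id, fsid,
                ]))
                break  # one command per OSD is enough
    return commands


# Hypothetical listing with a single OSD.
sample = json.dumps({
    "0": [{"type": "block",
           "tags": {"ceph.osd_id": "0",
                    "ceph.osd_fsid": "hypothetical-fsid-0"}}],
})

for command in activate_commands(sample):
    print(command)
```

Printing the commands rather than executing them keeps the operator in control, which matches the manual, step-by-step nature of the workaround.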

Ask:
- What changed in ceph-volume from 4.13 to 4.14 that would cause any issues with LVM-based OSDs from earlier versions of OCS?

Comment 4 Manisha Saini 2024-04-12 05:01:01 UTC
Hi Guillaume Abrioux,

Could you please provide the steps to verify this BZ?

Thanks

Comment 12 errata-xmlrpc 2024-05-07 12:11:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:2743