Description of problem:
For OSDs deployed on user-created LVM logical volumes, ceph-bluestore-tool bluefs commands return errors, apparently because the bdev labels cannot be read. The issue is observed in 8.0, but the same commands work fine in the latest 7.1.
Notably, the commands do not fail in 8.0 when the OSDs are deployed directly on physical disks.

Version-Release number of selected component (if applicable):
ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)

How reproducible:
2/2

Steps to Reproduce:
1. Create one LVM logical volume of the desired size on any spare disk
2. Add an OSD on the newly created logical volume
# cephadm shell -- ceph orch daemon add osd ceph-hakumar-p7s07z-node12:data_devices=/dev/data_vg5/lv0
3. Once the OSD has been deployed, log in to the OSD host and stop the OSD service so that ceph-bluestore-tool commands can be executed
4. Use the OSD container to execute the ceph-bluestore-tool (CBT) commands
# cephadm shell --name osd.20 -- ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-20
# cephadm shell --name osd.20 -- ceph-bluestore-tool bluefs-stats --path /var/lib/ceph/osd/ceph-20

Actual results:
[ceph: root@ceph-hakumar-p7s07z-node12 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-20 bluefs-bdev-expand
inferring bluefs devices from bluestore path
2024-10-03T15:09:27.116+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
2024-10-03T15:09:27.378+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
2024-10-03T15:09:27.378+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20) _check_main_bdev_label not all labels read properly
/builddir/build/BUILD/ceph-19.1.1/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::expand_devices(std::ostream&)' thread 7f1d128da980 time 2024-10-03T15:09:27.640695+0000
/builddir/build/BUILD/ceph-19.1.1/src/os/bluestore/BlueStore.cc: 8824: FAILED ceph_assert(r == 0)
 ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12e) [0x7f1d13985da8]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x18af66) [0x7f1d13985f66]
 3: (BlueStore::expand_devices(std::ostream&)+0x65a) [0x559404d2335a]
 4: main()
 5: /lib64/libc.so.6(+0x29590) [0x7f1d132f0590]
 6: __libc_start_main()
 7: _start()
*** Caught signal (Aborted) **
 in thread 7f1d128da980 thread_name:ceph-bluestore-
2024-10-03T15:09:27.640+0000 7f1d128da980 -1 /builddir/build/BUILD/ceph-19.1.1/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::expand_devices(std::ostream&)' thread 7f1d128da980 time 2024-10-03T15:09:27.640695+0000
/builddir/build/BUILD/ceph-19.1.1/src/os/bluestore/BlueStore.cc: 8824: FAILED ceph_assert(r == 0)
 ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12e) [0x7f1d13985da8]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x18af66) [0x7f1d13985f66]
 3: (BlueStore::expand_devices(std::ostream&)+0x65a) [0x559404d2335a]
 4: main()
 5: /lib64/libc.so.6(+0x29590) [0x7f1d132f0590]
 6: __libc_start_main()
 7: _start()
 ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)
 1: /lib64/libc.so.6(+0x3e6f0) [0x7f1d133056f0]
 2: /lib64/libc.so.6(+0x8b94c) [0x7f1d1335294c]
 3: raise()
 4: abort()
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x188) [0x7f1d13985e02]
 6: /usr/lib64/ceph/libceph-common.so.2(+0x18af66) [0x7f1d13985f66]
 7: (BlueStore::expand_devices(std::ostream&)+0x65a) [0x559404d2335a]
 8: main()
 9: /lib64/libc.so.6(+0x29590) [0x7f1d132f0590]
 10: __libc_start_main()
 11: _start()
2024-10-03T15:09:27.642+0000 7f1d128da980 -1 *** Caught signal (Aborted) **
 in thread 7f1d128da980 thread_name:ceph-bluestore-
 ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)
 1: /lib64/libc.so.6(+0x3e6f0) [0x7f1d133056f0]
 2: /lib64/libc.so.6(+0x8b94c) [0x7f1d1335294c]
 3: raise()
 4: abort()
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x188) [0x7f1d13985e02]
 6: /usr/lib64/ceph/libceph-common.so.2(+0x18af66) [0x7f1d13985f66]
 7: (BlueStore::expand_devices(std::ostream&)+0x65a) [0x559404d2335a]
 8: main()
 9: /lib64/libc.so.6(+0x29590) [0x7f1d132f0590]
 10: __libc_start_main()
 11: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
-39> 2024-10-03T15:09:27.116+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
-38> 2024-10-03T15:09:27.378+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
-37> 2024-10-03T15:09:27.378+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20) _check_main_bdev_label not all labels read properly
-36> 2024-10-03T15:09:27.640+0000 7f1d128da980 -1 /builddir/build/BUILD/ceph-19.1.1/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::expand_devices(std::ostream&)' thread 7f1d128da980 time 2024-10-03T15:09:27.640695+0000
/builddir/build/BUILD/ceph-19.1.1/src/os/bluestore/BlueStore.cc: 8824: FAILED ceph_assert(r == 0)
 ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12e) [0x7f1d13985da8]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x18af66) [0x7f1d13985f66]
 3: (BlueStore::expand_devices(std::ostream&)+0x65a) [0x559404d2335a]
 4: main()
 5: /lib64/libc.so.6(+0x29590) [0x7f1d132f0590]
 6: __libc_start_main()
 7: _start()
-35> 2024-10-03T15:09:27.642+0000 7f1d128da980 -1 *** Caught signal (Aborted) **
 in thread 7f1d128da980 thread_name:ceph-bluestore-
 ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)
 1: /lib64/libc.so.6(+0x3e6f0) [0x7f1d133056f0]
 2: /lib64/libc.so.6(+0x8b94c) [0x7f1d1335294c]
 3: raise()
 4: abort()
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x188) [0x7f1d13985e02]
 6: /usr/lib64/ceph/libceph-common.so.2(+0x18af66) [0x7f1d13985f66]
 7: (BlueStore::expand_devices(std::ostream&)+0x65a) [0x559404d2335a]
 8: main()
 9: /lib64/libc.so.6(+0x29590) [0x7f1d132f0590]
 10: __libc_start_main()
 11: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
-9> 2024-10-03T15:09:27.116+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
-4> 2024-10-03T15:09:27.378+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
-3> 2024-10-03T15:09:27.378+0000 7f1d128da980 -1 bluestore(/var/lib/ceph/osd/ceph-20) _check_main_bdev_label not all labels read properly
-1> 2024-10-03T15:09:27.640+0000 7f1d128da980 -1 /builddir/build/BUILD/ceph-19.1.1/src/os/bluestore/BlueStore.cc: In function 'int BlueStore::expand_devices(std::ostream&)' thread 7f1d128da980 time 2024-10-03T15:09:27.640695+0000
/builddir/build/BUILD/ceph-19.1.1/src/os/bluestore/BlueStore.cc: 8824: FAILED ceph_assert(r == 0)
 ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x12e) [0x7f1d13985da8]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x18af66) [0x7f1d13985f66]
 3: (BlueStore::expand_devices(std::ostream&)+0x65a) [0x559404d2335a]
 4: main()
 5: /lib64/libc.so.6(+0x29590) [0x7f1d132f0590]
 6: __libc_start_main()
 7: _start()
0> 2024-10-03T15:09:27.642+0000 7f1d128da980 -1 *** Caught signal (Aborted) **
 in thread 7f1d128da980 thread_name:ceph-bluestore-
 ceph version 19.1.1-102.el9cp (fef2e0002bc2e93b76655e5f4d7a9e89f844b0ae) squid (rc)
 1: /lib64/libc.so.6(+0x3e6f0) [0x7f1d133056f0]
 2: /lib64/libc.so.6(+0x8b94c) [0x7f1d1335294c]
 3: raise()
 4: abort()
 5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x188) [0x7f1d13985e02]
 6: /usr/lib64/ceph/libceph-common.so.2(+0x18af66) [0x7f1d13985f66]
 7: (BlueStore::expand_devices(std::ostream&)+0x65a) [0x559404d2335a]
 8: main()
 9: /lib64/libc.so.6(+0x29590) [0x7f1d132f0590]
 10: __libc_start_main()
 11: _start()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Aborted (core dumped)

[ceph: root@ceph-hakumar-p7s07z-node12 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-20 bluefs-stats
2024-10-03T15:20:00.548+0000 7fe02d704980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
2024-10-03T15:20:00.812+0000 7fe02d704980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
2024-10-03T15:20:01.080+0000 7fe02d704980 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label unable to decode label /var/lib/ceph/osd/ceph-20/block at offset 102: void bluestore_bdev_label_t::decode(ceph::buffer::v15_2_0::list::const_iterator&) decode past end of struct encoding: Malformed input [buffer:3]
2024-10-03T15:20:01.081+0000 7fe02d704980 -1 bluestore(/var/lib/ceph/osd/ceph-20) _check_main_bdev_label not all labels read properly
error from cold_open: (5) Input/output error

Expected results:
[ceph: root@ceph-hakumar-lkl1la-node12 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-16 bluefs-bdev-expand
inferring bluefs devices from bluestore path
1 : device size 0x280000000 : using 0x43a1000(68 MiB)
Expanding DB/WAL...
[ceph: root@ceph-hakumar-lkl1la-node12 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-16 bluefs-stats
1 : device size 0x280000000 : using 0x43a1000(68 MiB)
RocksDBBlueFSVolumeSelector Usage Matrix:
DEV/LEV     WAL    DB        SLOW   *      *      REAL      FILES
LOG         0 B    4 MiB     0 B    0 B    0 B    1.1 MiB   1
WAL         0 B    53 MiB    0 B    0 B    0 B    21 MiB    2
DB          0 B    4.6 MiB   0 B    0 B    0 B    114 KiB   11
SLOW        0 B    0 B       0 B    0 B    0 B    0 B       0
TOTAL       0 B    61 MiB    0 B    0 B    0 B    0 B       14
MAXIMUMS:
LOG         0 B    4 MiB     0 B    0 B    0 B    1.1 MiB
WAL         0 B    53 MiB    0 B    0 B    0 B    21 MiB
DB          0 B    8.6 MiB   0 B    0 B    0 B    169 KiB
SLOW        0 B    0 B       0 B    0 B    0 B    0 B
TOTAL       0 B    61 MiB    0 B    0 B    0 B    0 B
>> SIZE <<  0 B    9.5 GiB   0 B

Additional info:
RHCS 7.1 cluster where execution of commands on OSD deployed on LVM was successful

[root@ceph-hakumar-lkl1la-node12 ~]# pvcreate -ff /dev/vdb
Physical volume "/dev/vdb" successfully created.
[root@ceph-hakumar-lkl1la-node12 ~]# vgcreate data_vg5 /dev/vdb
Volume group "data_vg5" successfully created
[root@ceph-hakumar-lkl1la-node12 ~]# lvcreate -y -L 10G -n lv0 data_vg5
Logical volume "lv0" created.
[root@ceph-hakumar-lkl1la-node12 ~]# lsblk
NAME             MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sr0               11:0    1  522K  0 rom
vda              252:0    0   40G  0 disk
├─vda1           252:1    0    1M  0 part
├─vda2           252:2    0  200M  0 part /boot/efi
├─vda3           252:3    0    1G  0 part /boot
└─vda4           252:4    0 38.8G  0 part /var/lib/containers/storage/overlay /
vdb              252:16   0   25G  0 disk
└─data_vg5-lv0   253:1    0   10G  0 lvm
vdc              252:32   0   25G  0 disk
vdd              252:48   0   25G  0 disk
└─ceph--b5b559d7--1a72--459d--a617--725ed40f90ab-osd--block--4def7f51--eaaf--458c--ab97--139069849b5d 253:0 0 25G 0 lvm
vde              252:64   0   25G  0 disk

[root@ceph-hakumar-lkl1la-node7 ~]# ceph orch daemon add osd ceph-hakumar-lkl1la-node12:data_devices=/dev/data_vg5/lv0
Created osd(s) 16 on host 'ceph-hakumar-lkl1la-node12'

[root@ceph-hakumar-lkl1la-node7 ~]# ceph osd df tree ceph-hakumar-lkl1la-node12
ID   CLASS  WEIGHT   REWEIGHT  SIZE    RAW USE  DATA     OMAP     META     AVAIL    %USE  VAR   PGS  STATUS  TYPE NAME
-13         0.03419         -  35 GiB  135 MiB   12 MiB    2 KiB  123 MiB   35 GiB  0.38  1.00    -          host ceph-hakumar-lkl1la-node12
 15    hdd  0.02440   1.00000  25 GiB   67 MiB  6.0 MiB    1 KiB   61 MiB   25 GiB  0.26  0.70    0    down      osd.15
 16    hdd  0.00980   1.00000  10 GiB   68 MiB  6.2 MiB    1 KiB   61 MiB  9.9 GiB  0.66  1.75    0    down      osd.16
                        TOTAL  35 GiB  135 MiB   12 MiB  3.1 KiB  123 MiB   35 GiB  0.38
MIN/MAX VAR: 0.70/1.75  STDDEV: 0.22

[root@ceph-hakumar-lkl1la-node7 ~]# ceph version
ceph version 18.2.1-251.el9cp (81688db791aa982863476facb32440cb7210c828) reef (stable)

[root@ceph-hakumar-lkl1la-node12 ~]# systemctl list-units | grep osd
ceph-5c23b4bc-81a6-11ef-a35d-fa163e2124fb.service loaded active running Ceph osd.16 for 5c23b4bc-81a6-11ef-a35d-fa163e2124fb
[root@ceph-hakumar-lkl1la-node12 ~]# systemctl stop ceph-5c23b4bc-81a6-11ef-a35d-fa163e2124fb.service

[root@ceph-hakumar-lkl1la-node12 ~]# cephadm shell --name osd.16
Inferring fsid 5c23b4bc-81a6-11ef-a35d-fa163e2124fb
Inferring config /var/lib/ceph/5c23b4bc-81a6-11ef-a35d-fa163e2124fb/osd.16/config
Using ceph image with id 'c2596c9113e6' and tag '<none>' created on 2024-10-03 00:49:54 +0000 UTC
registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:53d25b0264148bb78913cecdda84d56208b15f0e6e9c28aa13354d0f8164ce60

[ceph: root@ceph-hakumar-lkl1la-node12 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-16 bluefs-bdev-expand
inferring bluefs devices from bluestore path
1 : device size 0x280000000 : using 0x43a1000(68 MiB)
Expanding DB/WAL...

[ceph: root@ceph-hakumar-lkl1la-node12 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-16 bluefs-stats
1 : device size 0x280000000 : using 0x43a1000(68 MiB)
RocksDBBlueFSVolumeSelector Usage Matrix:
DEV/LEV     WAL    DB        SLOW   *      *      REAL      FILES
LOG         0 B    4 MiB     0 B    0 B    0 B    1.1 MiB   1
WAL         0 B    53 MiB    0 B    0 B    0 B    21 MiB    2
DB          0 B    4.6 MiB   0 B    0 B    0 B    114 KiB   11
SLOW        0 B    0 B       0 B    0 B    0 B    0 B       0
TOTAL       0 B    61 MiB    0 B    0 B    0 B    0 B       14
MAXIMUMS:
LOG         0 B    4 MiB     0 B    0 B    0 B    1.1 MiB
WAL         0 B    53 MiB    0 B    0 B    0 B    21 MiB
DB          0 B    8.6 MiB   0 B    0 B    0 B    169 KiB
SLOW        0 B    0 B       0 B    0 B    0 B    0 B
TOTAL       0 B    61 MiB    0 B    0 B    0 B    0 B
>> SIZE <<  0 B    9.5 GiB   0 B

[root@ceph-hakumar-lkl1la-node12 ~]# cephadm ceph-volume lvm list
Inferring fsid 5c23b4bc-81a6-11ef-a35d-fa163e2124fb
Using ceph image with id 'c2596c9113e6' and tag '<none>' created on 2024-10-03 00:49:54 +0000 UTC
registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:53d25b0264148bb78913cecdda84d56208b15f0e6e9c28aa13354d0f8164ce60

====== osd.15 ======

  [block]       /dev/ceph-b5b559d7-1a72-459d-a617-725ed40f90ab/osd-block-4def7f51-eaaf-458c-ab97-139069849b5d

      block device              /dev/ceph-b5b559d7-1a72-459d-a617-725ed40f90ab/osd-block-4def7f51-eaaf-458c-ab97-139069849b5d
      block uuid                fyiUZ0-sCYu-Dv48-hAL6-Flkc-U4Mc-AKex0O
      cephx lockbox secret
      cluster fsid              5c23b4bc-81a6-11ef-a35d-fa163e2124fb
      cluster name              ceph
      crush device class
      encrypted                 0
      osd fsid                  4def7f51-eaaf-458c-ab97-139069849b5d
      osd id                    15
      osdspec affinity          None
      type                      block
      vdo                       0
      devices                   /dev/vdd

====== osd.16 ======

  [block]       /dev/data_vg5/lv0

      block device              /dev/data_vg5/lv0
      block uuid                1eFJWU-HKI0-7LNO-obMY-XhgA-lnSa-Dsg4y6
      cephx lockbox secret
      cluster fsid              5c23b4bc-81a6-11ef-a35d-fa163e2124fb
      cluster name              ceph
      crush device class
      encrypted                 0
      osd fsid                  6b6ef654-8a26-4077-a0b8-2e5730fded88
      osd id                    16
      osdspec affinity          None
      type                      block
      vdo                       0
      devices                   /dev/vdb
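
For triage, the failure signature shown under Actual results can be checked mechanically. The following is only a sketch: check_cbt_log is a hypothetical helper name (not part of Ceph), and it assumes the ceph-bluestore-tool output was saved to a file with one log entry per line, as the tool prints it.

```shell
#!/bin/sh
# Hypothetical triage helper, not part of Ceph.
# Usage: check_cbt_log <file containing saved ceph-bluestore-tool output>
# Assumes one log entry per line; flattened output would undercount.
check_cbt_log() {
    log=$1
    # Count the label decode failures reported by _read_bdev_label.
    decode_errs=$(grep -c '_read_bdev_label unable to decode label' "$log")
    # The summary message from _check_main_bdev_label marks the failure seen in this report.
    if grep -q '_check_main_bdev_label not all labels read properly' "$log"; then
        echo "label-check-failed decode_errors=$decode_errs"
        return 1
    fi
    echo "ok decode_errors=$decode_errs"
    return 0
}
```

Run against the saved bluefs-stats output from the 8.0 cluster, this should report the label check failure; against the 7.1 output it should report ok. It only matches the two message strings quoted in this report and does not diagnose the cause.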
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fixes, and enhancement updates), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2025:2457