Description of problem:
Following the upstream documentation for ceph-bluestore-tool bluefs-export:

    bluefs-export
        Export the contents of BlueFS (i.e., RocksDB files) to an output directory.

    ceph-bluestore-tool bluefs-export --path osd path --out-dir dir

Refer: https://docs.ceph.com/en/latest/man/8/ceph-bluestore-tool/

The bluefs-export command currently fails on the latest nightly build of RHCS 5.3 (16.2.10-264.el8cp).

Version-Release number of selected component (if applicable):
ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)

How reproducible:
3/3

Steps to Reproduce:
1. Deploy a RHCS 5.3 cluster.
2. On an OSD node, stop any OSD service at random:
   # systemctl stop ceph-3d3ab846-2951-11ef-b4fa-fa163e72f4bd.service
3. From inside the OSD container, run the ceph-bluestore-tool bluefs-export command:
   # cephadm shell --name osd.4 -- ceph-bluestore-tool bluefs-export --out-dir /tmp/ --path /var/lib/ceph/osd/ceph-4

Actual results:
[root@ceph-hakumar-pl12lg-node4 ~]# systemctl stop ceph-3d3ab846-2951-11ef-b4fa-fa163e72f4bd.service
[root@ceph-hakumar-pl12lg-node4 ~]# cephadm shell --name osd.4
Inferring fsid 3d3ab846-2951-11ef-b4fa-fa163e72f4bd
Inferring config /var/lib/ceph/3d3ab846-2951-11ef-b4fa-fa163e72f4bd/osd.4/config
Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:64dd6a2d61230837791b0bcf23b79b2b022f41cab332670593502ee9458fc4bc
[ceph: root@ceph-hakumar-pl12lg-node4 /]# ceph-bluestore-tool bluefs-export --out-dir /tmp/ --path /var/lib/ceph/osd/ceph-4
inferring bluefs devices from bluestore path
 slot 1 /var/lib/ceph/osd/ceph-4/block -> /dev/dm-1
unable to mount bluefs: (14) Bad address
2024-06-13T10:44:58.778+0000 7f9351dd2540 -1 bluefs _verify_alloc_granularity OP_FILE_UPDATE of 1:0x4c2000~50000 does not align to alloc_size 0x10000
2024-06-13T10:44:58.778+0000 7f9351dd2540 -1 bluefs mount failed to replay log: (14) Bad address

Expected results:
[ceph: root@ceph-sumar-regression-1pz1zj-node4 /]# ceph-bluestore-tool bluefs-export --out-dir /tmp/ --path /var/lib/ceph/osd/ceph-6
inferring bluefs devices from bluestore path
 slot 1 /var/lib/ceph/osd/ceph-6/block -> /dev/dm-2
db/
db/000030.sst
db/000035.sst
db/000036.sst
db/000037.sst
db/CURRENT
db/IDENTITY
db/LOCK
db/MANIFEST-000040
db/OPTIONS-000034
db/OPTIONS-000042
db.slow/
db.wal/
db.wal/000039.log
db.wal/000043.log
db.wal/000044.log
db.wal/000045.log
db.wal/000046.log
db.wal/000047.log
db.wal/000048.log
db.wal/000049.log
db.wal/000050.log
db.wal/000051.log
db.wal/000052.log
db.wal/000053.log
db.wal/000054.log
db.wal/000055.log
sharding/
sharding/def
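Note on the failure above: the rejected extent in the actual results (OP_FILE_UPDATE of 1:0x4c2000~50000, i.e. device 1, offset 0x4c2000, length 0x50000) is checked against alloc_size 0x10000 (64 KiB). A quick shell calculation, purely illustrative and not part of the tool, suggests the offset is the component that is not a multiple of the allocation unit:

    # Extent from the failure log: 1:0x4c2000~50000, alloc_size 0x10000 (64 KiB)
    offset=0x4c2000
    length=0x50000
    alloc_size=0x10000

    printf 'offset %% alloc_size = 0x%x\n' $(( offset % alloc_size ))   # prints 0x2000 -> misaligned
    printf 'length %% alloc_size = 0x%x\n' $(( length % alloc_size ))   # prints 0x0    -> aligned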
Additional info:

Failure on RHCS 5.3
=======================================
[root@ceph-hakumar-pl12lg-node1-installer ~]# cephadm shell -- ceph versions
Inferring fsid 3d3ab846-2951-11ef-b4fa-fa163e72f4bd
Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:f5ec1577dc3deeb5add748f08ecb54f55b1ebecc2d2d5a1d470c390083de9428
{
    "mon": {
        "ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)": 3
    },
    "osd": {
        "ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)": 7
    },
    "mds": {
        "ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)": 2
    },
    "overall": {
        "ceph version 16.2.10-264.el8cp (8f5f7a32a6ad0fa100fdf9e8823564d26e554e9d) pacific (stable)": 17
    }
}

[ceph: root@ceph-hakumar-pl12lg-node4 /]# ceph-bluestore-tool bluefs-export --out-dir /tmp/ --path /var/lib/ceph/osd/ceph-4
inferring bluefs devices from bluestore path
 slot 1 /var/lib/ceph/osd/ceph-4/block -> /dev/dm-1
unable to mount bluefs: (14) Bad address
2024-06-13T10:44:58.778+0000 7f9351dd2540 -1 bluefs _verify_alloc_granularity OP_FILE_UPDATE of 1:0x4c2000~50000 does not align to alloc_size 0x10000
2024-06-13T10:44:58.778+0000 7f9351dd2540 -1 bluefs mount failed to replay log: (14) Bad address

RHCS 7.1
===========================================
[root@ceph-sumar-regression-1pz1zj-node1-installer ~]# cephadm shell -- ceph versions
Inferring fsid 0ebe00a6-2945-11ef-acbc-fa163e3305ce
Inferring config /var/lib/ceph/0ebe00a6-2945-11ef-acbc-fa163e3305ce/mon.ceph-sumar-regression-1pz1zj-node1-installer/config
Using ceph image with id '5412073bd769' and tag '7-385' created on 2024-05-31 19:37:19 +0000 UTC
registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:579e5358418e176194812eeab523289a0c65e366250688be3f465f1a633b026d
{
    "mon": {
        "ceph version 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)": 3
    },
    "mgr": {
        "ceph version 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)": 2
    },
    "osd": {
        "ceph version 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)": 11
    },
    "mds": {
        "ceph version 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)": 5
    },
    "overall": {
        "ceph version 18.2.1-194.el9cp (04a992766839cd3207877e518a1238cdbac3787e) reef (stable)": 21
    }
}

[ceph: root@ceph-sumar-regression-1pz1zj-node4 /]# ceph-bluestore-tool bluefs-export --out-dir /tmp/ --path /var/lib/ceph/osd/ceph-6
inferring bluefs devices from bluestore path
 slot 1 /var/lib/ceph/osd/ceph-6/block -> /dev/dm-2
db/
db/000030.sst
db/000035.sst
db/000036.sst
db/000037.sst
db/CURRENT
db/IDENTITY
db/LOCK
db/MANIFEST-000040
db/OPTIONS-000034
db/OPTIONS-000042
db.slow/
db.wal/
db.wal/000039.log
db.wal/000043.log
db.wal/000044.log
db.wal/000045.log
db.wal/000046.log
db.wal/000047.log
db.wal/000048.log
db.wal/000049.log
db.wal/000050.log
db.wal/000051.log
db.wal/000052.log
db.wal/000053.log
db.wal/000054.log
db.wal/000055.log
sharding/
sharding/def
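For re-verification once a fix lands, the reproduction steps above can be strung together into a small shell sketch. The fsid, OSD id, and paths are the ones from this environment and are illustrative only:

    #!/usr/bin/env bash
    # Reproduction sketch; fsid and OSD id are from this report and will differ elsewhere.
    set -euo pipefail

    FSID="3d3ab846-2951-11ef-b4fa-fa163e72f4bd"
    OSD_ID="4"

    # Stop the OSD's systemd unit (unit name as captured in this report)
    systemctl stop "ceph-${FSID}.service"

    # Run bluefs-export from inside the stopped OSD's container
    cephadm shell --name "osd.${OSD_ID}" -- \
        ceph-bluestore-tool bluefs-export \
            --out-dir /tmp/ \
            --path "/var/lib/ceph/osd/ceph-${OSD_ID}"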
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:4118
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.