Description of problem:
Following the upstream documentation at https://docs.ceph.com/en/latest/man/8/ceph-bluestore-tool/, qfsck runs a consistency check on BlueStore metadata, comparing the allocator data (taken from the RocksDB CFB when it exists, otherwise from the allocation file) against the ONodes state. On execution of the command, ceph-bluestore-tool reports an allocator mismatch between the allocation file and the metadata:
bluestore::NCB::read_allocation_from_drive_for_bluestore_tool::FAILURE. Allocator from file and allocator from metadata differ::ret=-1

>>> NOTE: The command passed with the downstream Squid build 19.1.0-60.el9cp as of 29-Aug-2024.
Automation pass logs - http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-BD1Z0Q/ceph-bluestore-tool_utility_0.log
>>> It also passes on Reef: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-X5LVH7/ceph-bluestore-tool_utility_0.log

Version-Release number of selected component (if applicable):
ceph version 19.1.0-71.el9cp (783f8688bece0437eed13f61111c3a4f633f4e38) squid (rc)

How reproducible:
1/1

Steps to Reproduce:
1. Choose an OSD at random and stop its service
   # systemctl stop ceph-cdb7f7c4-6b34-11ef-a888-fa163efb0216.service
2. Enter the OSD container to run ceph-bluestore-tool commands
   # cephadm shell --name osd.4
3. Execute the 'qfsck' command
   # ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ qfsck

Actual results:
# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ qfsck
qfsck bluestore.quick-fsck
2024-09-06T01:41:40.863+0000 7fae27a65980 -1 bluestore::NCB::operator()::(2)compare_allocators:: spillover
2024-09-06T01:41:40.863+0000 7fae27a65980 -1 bluestore::NCB::compare_allocators::mismatch:: idx1=491 idx2=492
2024-09-06T01:41:40.863+0000 7fae27a65980 -1 bluestore::NCB::read_allocation_from_drive_for_bluestore_tool::FAILURE. Allocator from file and allocator from metadata differ::ret=-1
qfsck failed: (1) Operation not permitted

[ceph: root@ceph-hakumar-pzd763-node10 /]# ceph --version
ceph version 19.1.0-71.el9cp (783f8688bece0437eed13f61111c3a4f633f4e38) squid (rc)

Expected results:
No error message.
The consistency check should return success, for example:
# cephadm shell --name osd.12 -- ceph-bluestore-tool qfsck --path /var/lib/ceph/osd/ceph-12
qfsck bluestore.quick-fsck
qfsck success

Additional info:
[root@ceph-hakumar-pzd763-node10 ~]# systemctl stop ceph-cdb7f7c4-6b34-11ef-a888-fa163efb0216.service
[root@ceph-hakumar-pzd763-node10 ~]# cephadm shell --name osd.4
Inferring fsid cdb7f7c4-6b34-11ef-a888-fa163efb0216
Inferring config /var/lib/ceph/cdb7f7c4-6b34-11ef-a888-fa163efb0216/osd.4/config
Using ceph image with id '37e3ade1853a' and tag '<none>' created on 2024-09-04 14:23:38 +0000 UTC
registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:6293da836007c2453cbf8ce5b948dfa7814b69ba8e7347295e77039a98af98c9
Creating an OSD daemon form without an OSD FSID value
[ceph: root@ceph-hakumar-pzd763-node10 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ fsck
fsck success
[ceph: root@ceph-hakumar-pzd763-node10 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ fsck --deep true
fsck success
[ceph: root@ceph-hakumar-pzd763-node10 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ show-label
inferring bluefs devices from bluestore path
{
    "/var/lib/ceph/osd/ceph-4/block": {
        "osd_uuid": "123a547c-4371-40af-9bb1-d859bb1b93bf",
        "size": 26839351296,
        "btime": "2024-09-05T03:27:51.529609+0000",
        "description": "main",
        "bfm_blocks": "6552576",
        "bfm_blocks_per_key": "128",
        "bfm_bytes_per_block": "4096",
        "bfm_size": "26839351296",
        "bluefs": "1",
        "ceph_fsid": "cdb7f7c4-6b34-11ef-a888-fa163efb0216",
        "ceph_version_when_created": "ceph version 19.1.0-71.el9cp (783f8688bece0437eed13f61111c3a4f633f4e38) squid (rc)",
        "created_at": "2024-09-05T03:27:53.170078Z",
        "elastic_shared_blobs": "1",
        "epoch": "23",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "multi": "yes",
        "osd_key": "AQA2Jdlm6bgLABAARTWsEAWSacK6pdOjm0rklA==",
        "osdspec_affinity": "osd_spec_collocated",
        "ready": "ready",
        "require_osd_release": "19",
        "type": "bluestore",
        "whoami": "4"
    }
}
[ceph: root@ceph-hakumar-pzd763-node10 /]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-4/ qfsck
qfsck bluestore.quick-fsck
2024-09-06T01:41:40.863+0000 7fae27a65980 -1 bluestore::NCB::operator()::(2)compare_allocators:: spillover
2024-09-06T01:41:40.863+0000 7fae27a65980 -1 bluestore::NCB::compare_allocators::mismatch:: idx1=491 idx2=492
2024-09-06T01:41:40.863+0000 7fae27a65980 -1 bluestore::NCB::read_allocation_from_drive_for_bluestore_tool::FAILURE. Allocator from file and allocator from metadata differ::ret=-1
qfsck failed: (1) Operation not permitted
[ceph: root@ceph-hakumar-pzd763-node10 /]# ceph --version
ceph version 19.1.0-71.el9cp (783f8688bece0437eed13f61111c3a4f633f4e38) squid (rc)
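For broader verification, the same check can be scripted across several OSDs. The following is a minimal sketch, not a supported procedure: it assumes a cephadm-managed cluster whose OSD units follow the common ceph-<fsid>@osd.<id>.service naming, and the FSID and OSD ids are placeholders taken from this report's environment. It stops each OSD, runs qfsck inside the matching container (the same command form shown under Expected results), and restarts the daemon afterwards.

#!/usr/bin/env bash
# Sketch only: run 'ceph-bluestore-tool qfsck' against a list of stopped OSDs.
# Assumes a cephadm deployment; the FSID, OSD ids, and systemd unit naming
# below are assumptions/placeholders and must be adjusted to the target cluster.
set -euo pipefail

FSID="cdb7f7c4-6b34-11ef-a888-fa163efb0216"   # cluster FSID (placeholder)
OSDS="4 12"                                   # OSD ids to check (placeholders)

for id in ${OSDS}; do
    unit="ceph-${FSID}@osd.${id}.service"

    # Stop the OSD so ceph-bluestore-tool gets exclusive access to its devices.
    systemctl stop "${unit}"

    # Run the quick consistency check inside the OSD's container.
    if cephadm shell --name "osd.${id}" -- \
        ceph-bluestore-tool qfsck --path "/var/lib/ceph/osd/ceph-${id}"; then
        echo "osd.${id}: qfsck success"
    else
        echo "osd.${id}: qfsck FAILED" >&2
    fi

    # Bring the OSD back up regardless of the result.
    systemctl start "${unit}"
done

On an affected build, an OSD hitting this bug would print the same compare_allocators mismatch shown under Actual results, while an unaffected OSD prints "qfsck success" as under Expected results.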
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2024:10216
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days