Description of problem:

Because the mirror destination is read-write, snapshot creation is allowed in the .snap directory of dir_root on the destination. Upon snapshot creation there, the mirror state changes to "failed" and no further snapshot syncing happens from the mirror source to the destination. To resolve the state, the snapshot created in the .snap dir of the destination dir_root path must be deleted and the mirror service restarted. After a few minutes, syncing resumes and the healthy state is restored.

Failure:

[root@ceph2-hk-m-r1yigb-node6 .snap]# mkdir snap8
[root@ceph2-hk-m-r1yigb-node6 .snap]# pwd
/mnt/cephfs/volumes/subvolgroup_1/subvol_1/.snap

[root@ceph1-hk-m-r1yigb-node6 e1fee678-eb4d-11ee-a63c-fa163e2b47ff]# ceph --admin-daemon ceph-client.cephfs-mirror.ceph1-hk-m-r1yigb-node6.ficcbn.2.94067026573544.asok fs mirror peer status cephfs@1 9c50644d-915b-4ad2-b609-fd1b0a16201f
{
    "/volumes/subvolgroup_1/subvol_1": {
        "state": "failed",
        "last_synced_snap": {
            "id": 14,
            "name": "snap7",
            "sync_duration": 33.130708503000001,
            "sync_time_stamp": "12501.118085s"
        },
        "snaps_synced": 5,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}

Version-Release number of selected component (if applicable):
ceph version 18.2.1-86.el9cp

How reproducible:

Steps to Reproduce:
1. Set up CephFS mirroring to a subvolume of a non-default subvolume group.
2. Verify snapshots are synced to the destination.
3. Create a snapshot in the .snap dir of the dir_root/subvolume path on the destination.
4. Verify mirror status.
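The recovery steps described above (delete the conflicting destination snapshot, restart the mirror daemon, re-check peer status) can be sketched as follows. This is a sketch using the paths, daemon name, and peer UUID from this report; the exact mount point, admin-socket path, and orchestrator service name will differ per deployment:

```shell
# On the DESTINATION cluster's client mount: remove the manually created
# entry from the mirrored path's .snap directory.
rmdir /mnt/cephfs/volumes/subvolgroup_1/subvol_1/.snap/snap8

# Restart the cephfs-mirror daemon (with cephadm-managed deployments the
# orchestrator service is typically named "cephfs-mirror"):
ceph orch restart cephfs-mirror

# After a few minutes, re-check the peer status via the daemon's admin
# socket and confirm the dir_root state is no longer "failed":
ceph --admin-daemon /var/run/ceph/ceph-client.cephfs-mirror.ceph1-hk-m-r1yigb-node6.ficcbn.2.94067026573544.asok \
    fs mirror peer status cephfs@1 9c50644d-915b-4ad2-b609-fd1b0a16201f
```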
Actual results:

Mirror state is "failed", with the following errors in the log:

2024-03-26T12:11:16.688+0000 7f19a28bb640 -1 cephfs::mirror::PeerReplayer(9c50644d-915b-4ad2-b609-fd1b0a16201f) build_snap_map: snap_path=/volumes/subvolgroup_1/subvol_1/.snap/snap8 has invalid metadata in remote snapshot
2024-03-26T12:11:16.688+0000 7f19a28bb640 10 cephfs::mirror::PeerReplayer(9c50644d-915b-4ad2-b609-fd1b0a16201f) build_snap_map: remote snap_map={5=snap2,7=snap3,9=snap4,10=snap5,12=snap6,14=snap7}
2024-03-26T12:11:16.688+0000 7f19a28bb640 -1 cephfs::mirror::PeerReplayer(9c50644d-915b-4ad2-b609-fd1b0a16201f) do_sync_snaps: failed to build remote snap map
2024-03-26T12:11:16.688+0000 7f19a28bb640 -1 cephfs::mirror::PeerReplayer(9c50644d-915b-4ad2-b609-fd1b0a16201f) sync_snaps: failed to sync snapshots for dir_root=/volumes/subvolgroup_1/subvol_1

Expected results:
1. Snapshot creation in the destination's .snap dir of a mirrored path should not be allowed, or the user should be notified of the consequences in some way.
2. A user-level message explaining how to resolve the failed mirror state, i.e. that the conflicting snapshot must be deleted because the path is a mirror destination.

Additional info:
I think this may be "expected" behavior, though it's definitely a good RFE. Venky, Jos?
This is just a misunderstanding of how a snapshot is meant to be created. Step 3 in Comment 1 doesn't create a snapshot; it just creates a plain directory in the target cluster, and that action corrupts the .snap directory on the target. Clear the manually created entries from the target's .snap directory and both the queued and new snapshot creations succeed. This is not a bug.
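For contrast, a snapshot intended for mirroring has to be created on the source cluster, where mkdir under .snap creates a real CephFS snapshot that the mirror daemon then syncs. A minimal sketch, assuming the same subvolume path as in this report and a source-side mount at /mnt/cephfs:

```shell
# On the SOURCE cluster's client mount: mkdir inside .snap creates an
# actual CephFS snapshot, which cephfs-mirror will pick up and sync.
mkdir /mnt/cephfs/volumes/subvolgroup_1/subvol_1/.snap/snap8

# The same mkdir performed on the DESTINATION mount (as in Comment 1,
# step 3) only creates a stray directory and corrupts the target's
# .snap metadata, putting the mirror into the "failed" state.
```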
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2024:10216