Bug 2271767 - [CephFS - Mirror] RFE Snapshot creation on Mirror destination subvolume should report an error or warning about the consequence, which is mirror failure
Summary: [CephFS - Mirror] RFE Snapshot creation on Mirror destination subvolume should...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 8.0
Assignee: Jos Collin
QA Contact: sumr
URL:
Whiteboard:
: 2272017 (view as bug list)
Depends On:
Blocks:
 
Reported: 2024-03-27 09:33 UTC by sumr
Modified: 2024-11-25 09:01 UTC
CC List: 10 users

Fixed In Version: ceph-19.1.1-15.el9cp
Doc Type: Enhancement
Doc Text:
.Enhanced peer status output with remote metadata information
With this enhancement, the peer status output shows the `state` as `failed` and a `failure_reason` when there is invalid metadata in a remote snapshot.
Clone Of:
Environment:
Last Closed: 2024-11-25 09:00:59 UTC
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 65226 0 None None None 2024-07-31 05:35:32 UTC
Ceph Project Bug Tracker 65317 0 None None None 2024-07-31 05:35:47 UTC
Github ceph ceph pull 56816 0 None open cephfs_mirror: update peer status for invalid metadata in remote snapshot 2024-07-31 05:46:25 UTC
Red Hat Issue Tracker RHCEPH-8656 0 None None None 2024-03-27 09:33:46 UTC
Red Hat Product Errata RHBA-2024:10216 0 None None None 2024-11-25 09:01:02 UTC

Description sumr 2024-03-27 09:33:06 UTC
Description of problem:
As the Mirror destination is mounted read-write, snapshot creation is allowed in the .snap directory of dir_root.
Upon snapshot creation, the Mirror state changes to failed, and no further snapshot syncing happens from the Mirror source to the destination.

To resolve the state, the snapshot created in the .snap directory of the destination dir_root path must be deleted and the Mirror service restarted. After a few minutes, syncing resumes and the healthy state is restored.
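
For reference, a minimal sketch of that workaround, using the paths from this report and assuming the mirror daemon is deployed through cephadm under the default cephfs-mirror service name:

# On the destination cluster: remove the manually created entry from the
# mirrored path's .snap directory (path and snapshot name taken from this report).
rmdir /mnt/cephfs/volumes/subvolgroup_1/subvol_1/.snap/snap8

# Restart the cephfs-mirror daemon so syncing can resume (assumes a
# cephadm-managed deployment with the default "cephfs-mirror" service name).
ceph orch restart cephfs-mirror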

Failure:
[root@ceph2-hk-m-r1yigb-node6 .snap]# mkdir snap8
[root@ceph2-hk-m-r1yigb-node6 .snap]# pwd
/mnt/cephfs/volumes/subvolgroup_1/subvol_1/.snap

[root@ceph1-hk-m-r1yigb-node6 e1fee678-eb4d-11ee-a63c-fa163e2b47ff]# ceph --admin-daemon ceph-client.cephfs-mirror.ceph1-hk-m-r1yigb-node6.ficcbn.2.94067026573544.asok fs mirror peer status cephfs@1 9c50644d-915b-4ad2-b609-fd1b0a16201f
{
    "/volumes/subvolgroup_1/subvol_1": {
        "state": "failed",
        "last_synced_snap": {
            "id": 14,
            "name": "snap7",
            "sync_duration": 33.130708503000001,
            "sync_time_stamp": "12501.118085s"
        },
        "snaps_synced": 5,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}

Version-Release number of selected component (if applicable): ceph version 18.2.1-86.el9cp


How reproducible: 


Steps to Reproduce:
1. Set up CephFS-Mirror for a subvolume of a non-default subvolume group.
2. Verify snaps are synced to the destination.
3. Create a snapshot in the .snap dir of the dir_root/subvolume path on the destination.
4. Verify the Mirror status (a command-level sketch of these steps follows this list).
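
A command-level sketch of these steps, assuming the filesystem, subvolume group, subvolume, and mount-point names from this report and that peering between the two clusters is already bootstrapped; the admin-socket path uses placeholders, and the peer UUID is the one from the failure output above:

# Step 1 (source cluster): enable mirroring on the filesystem and add the
# subvolume path from the non-default subvolume group.
ceph fs snapshot mirror enable cephfs
ceph fs snapshot mirror add cephfs /volumes/subvolgroup_1/subvol_1

# Step 2 (source cluster): create a snapshot and confirm it syncs to the peer.
ceph fs subvolume snapshot create cephfs subvol_1 snap7 --group_name subvolgroup_1

# Step 3 (destination cluster): create a snapshot directly in the mirrored
# path's .snap directory on the read-write mount.
mkdir /mnt/cephfs/volumes/subvolgroup_1/subvol_1/.snap/snap8

# Step 4 (source cluster): query the mirror daemon's peer status through its
# admin socket (socket name shortened to placeholders).
ceph --admin-daemon /var/run/ceph/<cluster-fsid>/ceph-client.cephfs-mirror.<host>.<id>.asok \
    fs mirror peer status cephfs@1 9c50644d-915b-4ad2-b609-fd1b0a16201f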

Actual results: Mirror state is failed, with the below errors in the log:

2024-03-26T12:11:16.688+0000 7f19a28bb640 -1 cephfs::mirror::PeerReplayer(9c50644d-915b-4ad2-b609-fd1b0a16201f) build_snap_map: snap_path=/volumes/subvolgroup_1/subvol_1/.snap/snap8 has invalid metadata in remote snapshot
2024-03-26T12:11:16.688+0000 7f19a28bb640 10 cephfs::mirror::PeerReplayer(9c50644d-915b-4ad2-b609-fd1b0a16201f) build_snap_map: remote snap_map={5=snap2,7=snap3,9=snap4,10=snap5,12=snap6,14=snap7}
2024-03-26T12:11:16.688+0000 7f19a28bb640 -1 cephfs::mirror::PeerReplayer(9c50644d-915b-4ad2-b609-fd1b0a16201f) do_sync_snaps: failed to build remote snap map
2024-03-26T12:11:16.688+0000 7f19a28bb640 -1 cephfs::mirror::PeerReplayer(9c50644d-915b-4ad2-b609-fd1b0a16201f) sync_snaps: failed to sync snapshots for dir_root=/volumes/subvolgroup_1/subvol_1


Expected results: 
1. Snapshot creation in the destination's .snap dir of the mirrored path should not be allowed, or there should be some means of notifying the user about the consequences.
2. Provide a user-level message on how to resolve the failed mirror state, which is to delete the conflicting snapshot because the respective path is part of the mirror destination (an illustrative sketch of such output follows).
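
For illustration, based on the Doc Text above and the linked upstream fix ("cephfs_mirror: update peer status for invalid metadata in remote snapshot"), a hedged sketch of what the enhanced peer status output could look like for this failure; the exact field placement and failure_reason wording are assumptions derived from the log message above:

# Hypothetical output; failure_reason wording and placement are assumed.
{
    "/volumes/subvolgroup_1/subvol_1": {
        "state": "failed",
        "failure_reason": "invalid metadata in remote snapshot",
        "last_synced_snap": {
            "id": 14,
            "name": "snap7"
        },
        "snaps_synced": 5,
        "snaps_deleted": 0,
        "snaps_renamed": 0
    }
}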


Additional info:

Comment 3 Greg Farnum 2024-03-27 14:46:53 UTC
I think this may be "expected" behavior, though it's definitely a good RFE. Venky, Jos?

Comment 4 Jos Collin 2024-03-28 04:36:03 UTC
This is just a misunderstanding of how a snapshot is to be created. Step 3 mentioned in Comment 1 doesn't create a snapshot; instead, it just creates a directory in the target cluster, and this action would corrupt the .snap directory in the target cluster. Just clear the manually created entries in the target .snap directory, and the queued as well as the new snapshot creations succeed.

This is not a bug.

Comment 26 errata-xmlrpc 2024-11-25 09:00:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 8.0 security, bug fix, and enhancement updates), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:10216

