Bug 2096443

Summary: [cephfs][snap_schedule] Even though it is failed to mkdir, snap_schedule says it is created without restarting mgrs
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: julpark
Component: Documentation
Assignee: Akash Raj <akraj>
Documentation sub component: File System Guide
QA Contact: Hemanth Kumar <hyelloji>
Status: ASSIGNED --- Docs Contact:
Severity: medium
Priority: unspecified
CC: asriram, ceph-eng-bugs, gfarnum, hyelloji, mchangir, rmandyam, vshankar
Version: 5.2   
Target Milestone: ---   
Target Release: 6.1z2   
Hardware: Unspecified   
OS: Unspecified   
Type: Bug

Description julpark 2022-06-13 20:26:10 UTC
Description of problem:

Even though the mkdir for the scheduled snapshot fails, snap_schedule reports it as created.

Version-Release number of selected component (if applicable):

ceph version 16.2.8-42.el8c

How reproducible:

Enable snap_schedule, add a retention policy to it, and check the mgr log.

Steps to Reproduce:
1. Enable snap_schedule
2. Add retention to the dir
3. Check status
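
The report does not record the exact commands behind these steps. A minimal sketch of what they might look like, assuming the stock `ceph fs snap-schedule` CLI and taking the path, the `1h` schedule and the `h 5` retention from the status output later in this report. The helper name `repro_commands` is hypothetical, and it only builds the command lines; it does not run them:

```python
import shlex


def repro_commands(path, schedule="1h", retention_unit="h", retention_count=5):
    """Build (but do not run) CLI calls for the three reproduction steps.

    Assumption: the reporter used the standard snap-schedule commands;
    the report itself does not list them.
    """
    return [
        # Step 1: enable the mgr module and schedule snapshots on the path.
        ["ceph", "mgr", "module", "enable", "snap_schedule"],
        ["ceph", "fs", "snap-schedule", "add", path, schedule],
        # Step 2: add a retention policy to the directory.
        ["ceph", "fs", "snap-schedule", "retention", "add", path,
         retention_unit, str(retention_count)],
        # Step 3: query the schedule status.
        ["ceph", "fs", "snap-schedule", "status", path],
    ]


if __name__ == "__main__":
    for argv in repro_commands("/mnt/cephfs_kernellbswu5zrxp/yh2eozhm4l/"):
        print(shlex.join(argv))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` on a test cluster.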

Actual results:

The status shows it as created.

Expected results:

The status should not show it as created.


Additional info:

[root@ceph-julpark-vztrk6-node1-installer cephuser]# cat /var/log/ceph/e8fc1f3e-eb38-11ec-951f-fa163e86391a/ceph-mgr.ceph-julpark-vztrk6-node1-installer.krotji.snap_schedule.log | grep Traceback -A 5 -B 5
2022-06-13 19:59:00,927 [Thread-6] [DEBUG] [mgr_util] CephFS initializing...
2022-06-13 19:59:00,931 [Thread-6] [DEBUG] [mgr_util] CephFS mounting...
2022-06-13 19:59:00,947 [Thread-6] [DEBUG] [mgr_util] Connection to cephfs 'cephfs' complete
2022-06-13 19:59:00,964 [Thread-6] [DEBUG] [mgr_util] [put] connection: <mgr_util.CephfsConnectionPool.Connection object at 0x7fb4a13113c8> usage: 1
2022-06-13 19:59:00,964 [Thread-6] [ERROR] [snap_schedule.fs.schedule_client] create_scheduled_snapshot raised an exception:
2022-06-13 19:59:00,965 [Thread-6] [ERROR] [snap_schedule.fs.schedule_client] Traceback (most recent call last):
  File "/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py", line 285, in create_scheduled_snapshot
    fs_handle.mkdir(snap_name, 0o755)
  File "cephfs.pyx", line 1023, in cephfs.LibCephFS.mkdir
cephfs.ObjectNotFound: error in mkdir /mnt/cephfs_kernellbswu5zrxp/yh2eozhm4l//.snap/scheduled-2022-06-13-19_59_00_UTC: No such file or directory [Errno 2]

--
2022-06-13 19:59:00,966 [Thread-6] [DEBUG] [mgr_util] [get] connection: <mgr_util.CephfsConnectionPool.Connection object at 0x7fb4a13113c8> usage: 0
2022-06-13 19:59:00,966 [Thread-6] [DEBUG] [mgr_util] self.fs_id=1, fs_id=1
2022-06-13 19:59:00,966 [Thread-6] [DEBUG] [mgr_util] [get] connection (<mgr_util.CephfsConnectionPool.Connection object at 0x7fb4a13113c8>) can be reused
2022-06-13 19:59:00,967 [Thread-6] [DEBUG] [mgr_util] [put] connection: <mgr_util.CephfsConnectionPool.Connection object at 0x7fb4a13113c8> usage: 1
2022-06-13 19:59:00,967 [Thread-6] [ERROR] [snap_schedule.fs.schedule_client] prune_snapshots raised an exception:
2022-06-13 19:59:00,967 [Thread-6] [ERROR] [snap_schedule.fs.schedule_client] Traceback (most recent call last):
  File "/usr/share/ceph/mgr/snap_schedule/fs/schedule_client.py", line 310, in prune_snapshots
    with fs_handle.opendir(f'{path}/.snap') as d_handle:
  File "cephfs.pyx", line 942, in cephfs.LibCephFS.opendir
cephfs.ObjectNotFound: opendir failed: No such file or directory [Errno 2]
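
The same log can be scanned programmatically instead of with grep. A rough sketch, assuming the log line layout shown above (timestamp, thread, level, logger, message); the names `ERROR_RE` and `failed_operations` are hypothetical, and the sample lines are copied from the excerpt:

```python
import re

# Matches the ERROR line that precedes each traceback in the excerpt above.
ERROR_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \[[^\]]+\] \[ERROR\] "
    r"\[(?P<logger>[^\]]+)\] (?P<func>\w+) raised an exception:"
)


def failed_operations(log_lines):
    """Return (timestamp, function) pairs for each 'raised an exception' line."""
    hits = []
    for line in log_lines:
        match = ERROR_RE.match(line)
        if match:
            hits.append((match.group("ts"), match.group("func")))
    return hits


SAMPLE = [
    "2022-06-13 19:59:00,964 [Thread-6] [ERROR] [snap_schedule.fs.schedule_client] "
    "create_scheduled_snapshot raised an exception:",
    "2022-06-13 19:59:00,965 [Thread-6] [ERROR] [snap_schedule.fs.schedule_client] "
    "Traceback (most recent call last):",
    "2022-06-13 19:59:00,967 [Thread-6] [ERROR] [snap_schedule.fs.schedule_client] "
    "prune_snapshots raised an exception:",
]

if __name__ == "__main__":
    for ts, func in failed_operations(SAMPLE):
        print(ts, func)
```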



{
    "mon": {
        "ceph version 16.2.8-42.el8cp (c15e56a8d2decae9230567653130d1e31a36fe0a) pacific (stable)": 3
    },
    "mgr": {
        "ceph version 16.2.8-42.el8cp (c15e56a8d2decae9230567653130d1e31a36fe0a) pacific (stable)": 2
    },
    "osd": {
        "ceph version 16.2.8-42.el8cp (c15e56a8d2decae9230567653130d1e31a36fe0a) pacific (stable)": 12
    },
    "mds": {
        "ceph version 16.2.8-42.el8cp (c15e56a8d2decae9230567653130d1e31a36fe0a) pacific (stable)": 3
    },
    "overall": {
        "ceph version 16.2.8-42.el8cp (c15e56a8d2decae9230567653130d1e31a36fe0a) pacific (stable)": 20
    }
}

[{"fs": "cephfs", "subvol": null, "path": "/mnt/cephfs_kernellbswu5zrxp/yh2eozhm4l/", "rel_path": "/mnt/cephfs_kernellbswu5zrxp/yh2eozhm4l/", "schedule": "1h", "retention": {"h": 5}, "start": "2022-06-13T15:59:00", "created": "2022-06-13T19:57:38", "first": null, "last": null, "last_pruned": null, "created_count": 0, "pruned_count": 0, "active": false}]
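
The status output above can also be checked mechanically, for example in a QA script. A minimal sketch, assuming the field names shown in that output and assuming that null `first`/`last` together with a `created_count` of 0 means no scheduled snapshot was ever taken for the path. The function name `schedules_without_snapshots` is hypothetical:

```python
import json


def schedules_without_snapshots(status_json):
    """Return paths of schedules that never produced a snapshot.

    Assumption: 'first' and 'last' being null with created_count == 0
    indicates that no scheduled snapshot exists for that path.
    """
    flagged = []
    for entry in json.loads(status_json):
        never_ran = entry["first"] is None and entry["last"] is None
        if never_ran and entry["created_count"] == 0:
            flagged.append(entry["path"])
    return flagged


# Status output copied from this report.
STATUS = (
    '[{"fs": "cephfs", "subvol": null, '
    '"path": "/mnt/cephfs_kernellbswu5zrxp/yh2eozhm4l/", '
    '"rel_path": "/mnt/cephfs_kernellbswu5zrxp/yh2eozhm4l/", '
    '"schedule": "1h", "retention": {"h": 5}, '
    '"start": "2022-06-13T15:59:00", "created": "2022-06-13T19:57:38", '
    '"first": null, "last": null, "last_pruned": null, '
    '"created_count": 0, "pruned_count": 0, "active": false}]'
)

if __name__ == "__main__":
    print(schedules_without_snapshots(STATUS))
```

Note that the `created` field in this output is a timestamp on the schedule entry, whereas `created_count` counts snapshots; the check above only looks at the latter.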

Comment 1 Venky Shankar 2022-06-15 10:38:48 UTC
Milind, please take this one. It's the stats that are messed up.