After file system recovery, files with missing backtraces are recovered into the lost+found directory. Users may want to copy or back up entries from lost+found (the file names being inode numbers). Right now, the MDS gates unlinking from the lost+found directory with -EROFS, which disallows users from cleaning up the lost+found directory.
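For context, the backtrace is an xattr stored on each file's first data-pool object; recovery relies on it to place files back at their original paths. A minimal sketch of inspecting it (the pool name and inode number below are illustrative examples, not taken from a real cluster; requires a live Ceph cluster):

```shell
# Sketch, assuming a data pool named cephfs.cephfs.data and a file whose
# inode is 0x100000060dd; both names are hypothetical examples.
# A file's first object is named <inode-in-hex>.00000000.
rados -p cephfs.cephfs.data ls
rados -p cephfs.cephfs.data listxattr 100000060dd.00000000
# Dump the raw backtrace xattr that recovery tools consume:
rados -p cephfs.cephfs.data getxattr 100000060dd.00000000 backtrace > backtrace.bin
```

When this xattr is absent (as in the reproduction steps below, which remove it deliberately), the recovery tooling cannot reconstruct the path and parks the file in lost+found instead.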
Please specify the severity of this bug. Severity is defined here: https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.
Hi Venky,

As part of creating the lost+found directory we have tried the below steps:

[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados ls -p cephfs.cephfs.data
10000000201.00000000
100000001fe.00000000
[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados rmxattr 10000000201.00000000 backtrace -p cephfs.cephfs.data
[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados rmxattr 100000001fe.00000000 backtrace -p cephfs.cephfs.data
[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# ceph fs fail cephfs
cephfs marked not joinable; MDS cannot join the cluster. All MDS ranks marked failed.
[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# ceph fs reset cephfs --yes-i-really-mean-it

Even after the reset we are not able to create the lost+found directory. Could you help me with the steps for the same?

Regards,
Amarnath
(In reply to Amarnath from comment #8)
> Hi Venky,
>
> As part of creating the lost+found directory we have tried the below steps:
>
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados ls -p cephfs.cephfs.data
> 10000000201.00000000
> 100000001fe.00000000
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados rmxattr 10000000201.00000000 backtrace -p cephfs.cephfs.data
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados rmxattr 100000001fe.00000000 backtrace -p cephfs.cephfs.data
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# ceph fs fail cephfs
> cephfs marked not joinable; MDS cannot join the cluster. All MDS ranks marked failed.
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# ceph fs reset cephfs --yes-i-really-mean-it

Just resetting the file system will not automagically create the lost+found directory. You need to run through the metadata recovery steps as detailed here:

https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects

Since the backtrace xattr is missing, the data scan tool will dump the file in the lost+found directory (under /) with the file name as the inode number.
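For reference, the recovery-from-missing-metadata flow in the linked document looks roughly like the sketch below. The exact sequence and flags should be taken from the documentation for the cluster's Ceph version; these commands are destructive, require a live cluster, and the file system and pool names here follow this report:

```shell
# Sketch of the metadata recovery sequence from the linked Ceph
# disaster-recovery document; verify each step against the docs first.
ceph fs fail cephfs                                          # take the file system offline
cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
cephfs-journal-tool --rank=cephfs:0 journal reset            # discard the damaged journal
cephfs-table-tool cephfs:0 reset session                     # note: a rank, not "all"
ceph fs reset cephfs --yes-i-really-mean-it
cephfs-data-scan init                                        # rebuild root/mds dir inodes
cephfs-data-scan scan_extents cephfs.cephfs.data             # pass 1: file sizes/layouts
cephfs-data-scan scan_inodes cephfs.cephfs.data              # pass 2: inject inodes
cephfs-data-scan scan_links                                  # fix link counts / dentries
```

It is during the scan_inodes/scan_links phase that objects lacking a backtrace xattr end up as entries in lost+found, named by inode number.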
Hi All,

Thanks Venky. We are able to create the lost+found folder and are able to delete the contents of it:

[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ls -lrt
total 1
drwxr-xr-x. 2 root root  0 Jul 18 02:31 test
-rw-r--r--. 1 root root 25 Jul 18 02:31 test_lost.txt
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# rados ls -p cephfs.cephfs.data
100000060dd.00000000
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# rados rmxattr 100000060dd.00000000 backtrace -p cephfs.cephfs.data
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# cephfs-table-tool all reset session
Error ((22) Invalid argument)
2023-07-18T02:40:07.954-0400 7f9899e0cfc0 -1 main: Bad rank selection: all'
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# cephfs-table-tool cephfs:0 reset session
{
    "0": {
        "data": {},
        "result": 0
    }
}
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs reset cephfs --yes-i-really-mean-it
Error EINVAL: all MDS daemons must be inactive before resetting filesystem: set the cluster_down flag and use `ceph mds fail` to make this so
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph mds fail
Invalid command: missing required parameter role_or_gid(<string>)
mds fail <role_or_gid> :  Mark MDS failed: trigger a failover if a standby is available
Error EINVAL: invalid command
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs fail
Invalid command: missing required parameter fs_name(<string>)
fs fail <fs_name> :  bring the file system down and all of its ranks
Error EINVAL: invalid command
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs fail caphfs
Error ENOENT: Filesystem not found: 'caphfs'
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs fail cephfs
cephfs marked not joinable; MDS cannot join the cluster. All MDS ranks marked failed.
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs reset cephfs --yes-i-really-mean-it
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ls -lrt
total 1
drwxr-xr-x. 2 root root 0 Dec 31  1969 lost+found
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# cd lost+found/
[root@ceph-amk-fs-tier2-js5wzs-node8 lost+found]# ls -lrt
total 1
-r-x------. 1 root root 25 Jul 18 02:32 100000060dd
[root@ceph-amk-fs-tier2-js5wzs-node8 lost+found]# rm -rf 100000060dd
[root@ceph-amk-fs-tier2-js5wzs-node8 lost+found]# ls -lrt
total 0

Verified on:

[root@ceph-amk-fs-tier2-js5wzs-node8 lost+found]# ceph versions
{
    "mon": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 3
    },
    "mgr": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 2
    },
    "osd": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 12
    },
    "mds": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 5
    },
    "overall": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 22
    }
}

Regards,
Amarnath
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 6.1 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:4473