After file system recovery, files with missing backtraces are recovered into the lost+found directory. Users may want to copy or back up entries from lost+found (the file names being inode numbers). Right now, the MDS gates unlinking from the lost+found directory with -EROFS, which disallows users from cleaning up the lost+found directory.
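For context, the backtrace is an xattr stored on each file's first data-pool object; recovery relies on it to place files back at their original paths. A minimal sketch of inspecting it (the pool name and inode number below are illustrative examples, not taken from a real cluster; requires a live Ceph cluster):

```shell
# Sketch, assuming a data pool named cephfs.cephfs.data and a file whose
# inode is 0x100000060dd; both names are hypothetical examples.
# A file's first object is named <inode-in-hex>.00000000.
rados -p cephfs.cephfs.data ls
rados -p cephfs.cephfs.data listxattr 100000060dd.00000000
# Dump the raw backtrace xattr that recovery tools consume:
rados -p cephfs.cephfs.data getxattr 100000060dd.00000000 backtrace > backtrace.bin
```

When this xattr is absent (as in the reproduction steps below, which remove it deliberately), the recovery tooling cannot reconstruct the path and parks the file in lost+found instead.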
Please specify the severity of this bug. Severity is defined here: https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.
Hi Venky,

As part of creating the lost+found directory we have tried the below steps:

[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados ls -p cephfs.cephfs.data
10000000201.00000000
100000001fe.00000000
[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados rmxattr 10000000201.00000000 backtrace -p cephfs.cephfs.data
[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados rmxattr 100000001fe.00000000 backtrace -p cephfs.cephfs.data
[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# ceph fs fail cephfs
cephfs marked not joinable; MDS cannot join the cluster. All MDS ranks marked failed.
[root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# ceph fs reset cephfs --yes-i-really-mean-it

Even after the reset we are not able to create the lost+found directory. Could you help me with the steps for the same?

Regards,
Amarnath
(In reply to Amarnath from comment #8)
> Hi Venky,
>
> As part of creating the lost+found directory we have tried the below steps:
>
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados ls -p cephfs.cephfs.data
> 10000000201.00000000
> 100000001fe.00000000
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados rmxattr 10000000201.00000000 backtrace -p cephfs.cephfs.data
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# rados rmxattr 100000001fe.00000000 backtrace -p cephfs.cephfs.data
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# ceph fs fail cephfs
> cephfs marked not joinable; MDS cannot join the cluster. All MDS ranks marked failed.
> [root@ceph-amk-61-test-xtknkr-node7 ceph-fuse]# ceph fs reset cephfs --yes-i-really-mean-it

Just resetting the file system will not automagically create the lost+found directory. You need to run through the metadata recovery steps as detailed here:

https://docs.ceph.com/en/latest/cephfs/disaster-recovery-experts/#recovery-from-missing-metadata-objects

Since the backtrace xattr is missing, the data scan tool will dump the file in the lost+found directory (under /) with the file name as the inode number.
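For reference, the recovery-from-missing-metadata flow in the linked document looks roughly like the sketch below. The exact sequence and flags should be taken from the documentation for the cluster's Ceph version; these commands are destructive, require a live cluster, and the file system and pool names here follow this report:

```shell
# Sketch of the metadata recovery sequence from the linked Ceph
# disaster-recovery document; verify each step against the docs first.
ceph fs fail cephfs                                          # take the file system offline
cephfs-journal-tool --rank=cephfs:0 event recover_dentries summary
cephfs-journal-tool --rank=cephfs:0 journal reset            # discard the damaged journal
cephfs-table-tool cephfs:0 reset session                     # note: a rank, not "all"
ceph fs reset cephfs --yes-i-really-mean-it
cephfs-data-scan init                                        # rebuild root/mds dir inodes
cephfs-data-scan scan_extents cephfs.cephfs.data             # pass 1: file sizes/layouts
cephfs-data-scan scan_inodes cephfs.cephfs.data              # pass 2: inject inodes
cephfs-data-scan scan_links                                  # fix link counts / dentries
```

It is during the scan_inodes/scan_links phase that objects lacking a backtrace xattr end up as entries in lost+found, named by inode number.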
Hi All,

Thanks Venky. We are able to create the lost+found folder and are able to delete the contents of it:

[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ls -lrt
total 1
drwxr-xr-x. 2 root root  0 Jul 18 02:31 test
-rw-r--r--. 1 root root 25 Jul 18 02:31 test_lost.txt
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# rados ls -p cephfs.cephfs.data
100000060dd.00000000
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# rados rmxattr 100000060dd.00000000 backtrace -p cephfs.cephfs.data
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# cephfs-table-tool all reset session
Error ((22) Invalid argument)
2023-07-18T02:40:07.954-0400 7f9899e0cfc0 -1 main: Bad rank selection: all'
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# cephfs-table-tool cephfs:0 reset session
{
    "0": {
        "data": {},
        "result": 0
    }
}
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs reset cephfs --yes-i-really-mean-it
Error EINVAL: all MDS daemons must be inactive before resetting filesystem: set the cluster_down flag and use `ceph mds fail` to make this so
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph mds fail
Invalid command: missing required parameter role_or_gid(<string>)
mds fail <role_or_gid> :  Mark MDS failed: trigger a failover if a standby is available
Error EINVAL: invalid command
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs fail
Invalid command: missing required parameter fs_name(<string>)
fs fail <fs_name> :  bring the file system down and all of its ranks
Error EINVAL: invalid command
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs fail caphfs
Error ENOENT: Filesystem not found: 'caphfs'
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs fail cephfs
cephfs marked not joinable; MDS cannot join the cluster. All MDS ranks marked failed.
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ceph fs reset cephfs --yes-i-really-mean-it
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# ls -lrt
total 1
drwxr-xr-x. 2 root root 0 Dec 31  1969 lost+found
[root@ceph-amk-fs-tier2-js5wzs-node8 ceph-fuse]# cd lost+found/
[root@ceph-amk-fs-tier2-js5wzs-node8 lost+found]# ls -lrt
total 1
-r-x------. 1 root root 25 Jul 18 02:32 100000060dd
[root@ceph-amk-fs-tier2-js5wzs-node8 lost+found]# rm -rf 100000060dd
[root@ceph-amk-fs-tier2-js5wzs-node8 lost+found]# ls -lrt
total 0

Verified on:

[root@ceph-amk-fs-tier2-js5wzs-node8 lost+found]# ceph versions
{
    "mon": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 3
    },
    "mgr": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 2
    },
    "osd": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 12
    },
    "mds": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 5
    },
    "overall": {
        "ceph version 17.2.6-98.el9cp (b53d9dfff6b1021fbab8e28f2b873d7d49cf5665) quincy (stable)": 22
    }
}

Regards,
Amarnath
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 6.1 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:4473