Bug 2108656
Summary: | mds: FAILED ceph_assert(dir->get_projected_version() == dir->get_version()) | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Patrick Donnelly <pdonnell>
Component: | CephFS | Assignee: | Xiubo Li <xiubli>
Status: | CLOSED ERRATA | QA Contact: | Amarnath <amk>
Severity: | high | Docs Contact: |
Priority: | urgent | |
Version: | 5.1 | CC: | akraj, amk, ceph-eng-bugs, cephqe-warriors, tserlin, vereddy, xiubli
Target Milestone: | --- | |
Target Release: | 5.2 | |
Hardware: | All | |
OS: | All | |
Whiteboard: | | |
Fixed In Version: | ceph-16.2.8-80.el8cp | Doc Type: | Bug Fix
Doc Text: | .MDSs no longer crash when fetching unlinked directories. Previously, when fetching unlinked directories, the projected version would be incorrectly initialized, causing MDSs to crash when performing sanity checks. With this fix, the projected version and the inode version are initialized when fetching an unlinked directory, allowing the MDSs to perform sanity checks without crashing. | |
Story Points: | --- | |
Clone Of: | | Environment: |
Last Closed: | 2022-08-09 17:39:24 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 2071085, 2102272 | |
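As a reading aid for the Doc Text above, here is a minimal, purely illustrative Python model of the invariant behind the failed assertion. It is not Ceph source code; the class, field, and method names are invented for the example:

```python
# Illustrative model only, not the Ceph MDS implementation. It mimics the
# invariant behind ceph_assert(dir->get_projected_version() == dir->get_version()):
# after a directory is fetched from the metadata pool, its projected version
# must equal the version that was just loaded.


class DirFrag:
    def __init__(self):
        self.version = 0            # version loaded from the on-disk metadata
        self.projected_version = 0  # in-memory projected version

    def fetch(self, ondisk_version, apply_fix=True):
        """Model of fetching an (unlinked) directory from the metadata pool."""
        self.version = ondisk_version
        if apply_fix:
            # The behavior described in the Doc Text: initialize the projected
            # version together with the version when the directory is fetched.
            self.projected_version = ondisk_version

    def sanity_check(self):
        # Without the fix, projected_version keeps its stale default and this
        # check fails, which corresponds to the reported MDS crash.
        assert self.projected_version == self.version


if __name__ == "__main__":
    d = DirFrag()
    d.fetch(ondisk_version=42)   # pass apply_fix=False to model the old crash
    d.sanity_check()
```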
Description
Patrick Donnelly
2022-07-19 16:12:30 UTC
Comment 4
Amarnath

Hi Xiubo,

As part of QE verification for this, I am following the test_migrate_unlinked_dir test method that was pushed as part of this MR.

Steps followed:

1. Create the FS and set max_mds to 2:

    [cephuser@ceph-fix-amk-qqjwnv-node7 ~]$ sudo -i
    [root@ceph-fix-amk-qqjwnv-node7 ~]# ceph -s
      cluster:
        id:     8382e3e2-07ec-11ed-ac3e-fa163e005463
        health: HEALTH_OK

      services:
        mon: 3 daemons, quorum ceph-fix-amk-qqjwnv-node1-installer,ceph-fix-amk-qqjwnv-node2,ceph-fix-amk-qqjwnv-node3 (age 30h)
        mgr: ceph-fix-amk-qqjwnv-node1-installer.yuneqm(active, since 30h), standbys: ceph-fix-amk-qqjwnv-node2.ncahej
        mds: 2/2 daemons up, 1 standby
        osd: 12 osds: 12 up (since 30h), 12 in (since 30h)

      data:
        volumes: 1/1 healthy
        pools:   3 pools, 65 pgs
        objects: 42 objects, 108 KiB
        usage:   152 MiB used, 180 GiB / 180 GiB avail
        pgs:     65 active+clean

    [root@ceph-fix-amk-qqjwnv-node7 ~]# ceph fs status
    cephfs - 0 clients
    ======
    RANK  STATE                    MDS                      ACTIVITY     DNS    INOS   DIRS   CAPS
     0    active  cephfs.ceph-fix-amk-qqjwnv-node4.xrhqhd  Reqs:    0 /s    14     13     12      0
     1    active  cephfs.ceph-fix-amk-qqjwnv-node5.lfboep  Reqs:    0 /s    11     15     13      0
           POOL           TYPE     USED  AVAIL
    cephfs.cephfs.meta  metadata   480k  56.9G
    cephfs.cephfs.data    data       0   56.9G
              STANDBY MDS
    cephfs.ceph-fix-amk-qqjwnv-node6.xftvxh
    MDS version: ceph version 16.2.8-77.el8cp (bf5436ca8dca124b6f4b3ddd729d112a54f70e29) pacific (stable)

2. Create a directory under the ceph-fuse mount at /mnt/cephfs_fuse:

    [root@ceph-fix-amk-qqjwnv-node7 ~]# cd /mnt/cephfs_fuse/
    [root@ceph-fix-amk-qqjwnv-node7 cephfs_fuse]#
    [root@ceph-fix-amk-qqjwnv-node7 cephfs_fuse]# ls -lrt
    total 0
    [root@ceph-fix-amk-qqjwnv-node7 cephfs_fuse]# mkdir test
    [root@ceph-fix-amk-qqjwnv-node7 cephfs_fuse]# ls -lrt
    total 1
    drwxr-xr-x. 2 root root 0 Jul 21 08:18 test
    [root@ceph-fix-amk-qqjwnv-node7 cephfs_fuse]# touch test/placeholder

3. Set the ceph.dir.pin extended attribute to rank 1:

    [root@ceph-fix-amk-qqjwnv-node7 cephfs_fuse]# setfattr -n ceph.dir.pin -v 1 /mnt/cephfs_fuse/test/

4. Create a directory /mnt/cephfs_fuse/test/to-be-unlinked inside /mnt/cephfs_fuse/test.

5. Open the directory using the Python program:

    import time
    import os
    os.mkdir("/mnt/cephfs_fuse/test/to-be-unlinked")
    fd = os.open("/mnt/cephfs_fuse/test/to-be-unlinked", os.O_RDONLY)
    while True:
        time.sleep(1)

6. rmdir /mnt/cephfs_fuse/test/to-be-unlinked

7. Check the MDS stray count on rank 1 (before the deletion of the directory it was 0):

    [ceph: root@ceph-fix-amk-qqjwnv-node5 /]# ceph daemon mds.cephfs.ceph-fix-amk-qqjwnv-node5.lfboep perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 1
        }
    }

8. Set max_mds to 1 and check that the strays have been moved to rank 0:

    [ceph: root@ceph-fix-amk-qqjwnv-node4 /]# ceph daemon mds.cephfs.ceph-fix-amk-qqjwnv-node4.xrhqhd perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 0
        }
    }
    [ceph: root@ceph-fix-amk-qqjwnv-node4 /]# ceph daemon mds.cephfs.ceph-fix-amk-qqjwnv-node4.xrhqhd perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 1
        }
    }

Can you please review these steps? Or do we need to perform an upgrade with stray entries present and check for the crash in the MDS?

Regards,
Amarnath
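For reference, the client-side portion of the steps above can be scripted along the following lines. This is an illustrative sketch only, not the upstream test_migrate_unlinked_dir test: the mount path and directory names are taken from the transcript above, the pin is set by shelling out to setfattr, and the num_strays check is still done on the MDS host exactly as in steps 7 and 8.

```python
# Sketch of the client-side reproduction from comment 4: pin a directory to
# MDS rank 1, hold an open fd on a subdirectory, then rmdir it so it becomes
# a stray. Paths below are the examples used in the transcript above.
import os
import subprocess
import time

MOUNT = "/mnt/cephfs_fuse"                     # ceph-fuse mount used above
PINNED = os.path.join(MOUNT, "test")
VICTIM = os.path.join(PINNED, "to-be-unlinked")

os.makedirs(PINNED, exist_ok=True)
open(os.path.join(PINNED, "placeholder"), "w").close()

# Pin the parent directory to MDS rank 1 (same as the setfattr step above).
subprocess.check_call(["setfattr", "-n", "ceph.dir.pin", "-v", "1", PINNED])

os.mkdir(VICTIM)
fd = os.open(VICTIM, os.O_RDONLY)   # keep the directory open ...
os.rmdir(VICTIM)                    # ... so the rmdir turns it into a stray

# On the rank-1 MDS host (inside `cephadm shell`), num_strays should now be 1:
#   ceph daemon mds.<rank1-daemon-name> perf dump mds_cache num_strays
# Then reducing max_mds to 1 should migrate the stray to rank 0.
while True:
    time.sleep(1)                   # hold the fd open, like the original script
```

Holding the fd open is what turns the removed directory into a stray entry on rank 1, which is the state the stray migration and the fix are meant to exercise.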
(In reply to Amarnath from comment #4)

Hi Amarnath,

Yeah, please go ahead, this looks good to me. Thanks!
Xiubo
Verified on ceph version 16.2.8-80.el8cp.

I see strays getting migrated to the remaining active MDS when MDS rank 1 is stopped (max_mds reduced to 1).

Client node commands:

    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# ceph fs status
    cephfs - 1 clients
    ======
    RANK  STATE                      MDS                           ACTIVITY     DNS    INOS   DIRS   CAPS
     0    active  cephfs.ceph-fs-dashboard-zl2us3-node3.mtyixc  Reqs:    0 /s    27     22     17      9
     1    active  cephfs.ceph-fs-dashboard-zl2us3-node5.svpkjw  Reqs:    0 /s    25     22     19      7
           POOL           TYPE     USED  AVAIL
    cephfs.cephfs.meta  metadata  2252k  56.8G
    cephfs.cephfs.data    data    48.0k  56.8G
              STANDBY MDS
    cephfs.ceph-fs-dashboard-zl2us3-node4.vhsgud
    MDS version: ceph version 16.2.8-80.el8cp (22ecb3f5fcb872bb1d7739004f01a8eded7b397c) pacific (stable)
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# ceph orch host ls
    [root@ceph-fs-dashboard-zl2us3-node9 /]# cd /mnt/ceph-fuse
    [root@ceph-fs-dashboard-zl2us3-node9 ceph-fuse]#
    [root@ceph-fs-dashboard-zl2us3-node9 ceph-fuse]# ls -lrt
    total 1
    drwxr-xr-x. 3 root root 182 Jul 27 12:49 volumes
    [root@ceph-fs-dashboard-zl2us3-node9 ceph-fuse]# cd volumes/
    [root@ceph-fs-dashboard-zl2us3-node9 volumes]# cd subvolgroup_1/
    [root@ceph-fs-dashboard-zl2us3-node9 subvolgroup_1]# cd subvol_1/
    [root@ceph-fs-dashboard-zl2us3-node9 subvol_1]# cd 24e4b482-3307-46b1-8d31-0d10a01e7ed7/
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# ls -lrt
    total 1
    -rw-r--r--. 1 root root 14 Jul 27 12:53 test.txt
    -rw-r--r--. 1 root root 36 Jul 27 13:03 test_2.txt
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# mkdir pin_test
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# touch pin_test/placeholder
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# pwd
    /mnt/ceph-fuse/volumes/subvolgroup_1/subvol_1/24e4b482-3307-46b1-8d31-0d10a01e7ed7

Setting the pin on the directory to MDS rank 1:

    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# setfattr -n ceph.dir.pin -v 1 /mnt/ceph-fuse/volumes/subvolgroup_1/subvol_1/24e4b482-3307-46b1-8d31-0d10a01e7ed7/pin_test/
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# mkdir /mnt/ceph-fuse/volumes/subvolgroup_1/subvol_1/24e4b482-3307-46b1-8d31-0d10a01e7ed7/pin_test/to-be-unlinked
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# cd /mnt/ceph-fuse/open_dir

Run the open-directory script, then remove the directory from another shell:

    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# cat /mnt/ceph-fuse/open_dir.py
    import time
    import os
    os.mkdir("/mnt/ceph-fuse/volumes/subvolgroup_1/subvol_1/24e4b482-3307-46b1-8d31-0d10a01e7ed7/pin_test/to-be-unlinked/1")
    fd = os.open("/mnt/ceph-fuse/volumes/subvolgroup_1/subvol_1/24e4b482-3307-46b1-8d31-0d10a01e7ed7/pin_test/to-be-unlinked/1", os.O_RDONLY)
    while True:
        time.sleep(1)
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# rmdir /mnt/ceph-fuse/volumes/subvolgroup_1/subvol_1/24e4b482-3307-46b1-8d31-0d10a01e7ed7/pin_test/to-be-unlinked/1

Set max_mds to 1:

    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# ceph fs set max_mds 1

MDS.0:

    [root@ceph-fs-dashboard-zl2us3-node3 ~]# cephadm shell
    Inferring fsid b8423362-0d82-11ed-af08-fa163e7db892
    Inferring config /var/lib/ceph/b8423362-0d82-11ed-af08-fa163e7db892/mon.ceph-fs-dashboard-zl2us3-node3/config
    Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:22d12ad3f3fe4dff25ec3c81348ea0018247dc654957c1dcf5e8da88fca43ed2
    [ceph: root@ceph-fs-dashboard-zl2us3-node3 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node3.mtyixc perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 0
        }
    }
    [ceph: root@ceph-fs-dashboard-zl2us3-node3 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node3.mtyixc perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 0
        }
    }

After running rmdir on the open directory:

    [ceph: root@ceph-fs-dashboard-zl2us3-node3 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node3.mtyixc perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 1
        }
    }

MDS.1:

    [root@ceph-fs-dashboard-zl2us3-node5 ~]# cephadm shell
    Inferring fsid b8423362-0d82-11ed-af08-fa163e7db892
    Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:22d12ad3f3fe4dff25ec3c81348ea0018247dc654957c1dcf5e8da88fca43ed2
    [ceph: root@ceph-fs-dashboard-zl2us3-node5 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node5.svpkjw perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 0
        }
    }
    [ceph: root@ceph-fs-dashboard-zl2us3-node5 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node5.svpkjw perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 1
        }
    }
    [ceph: root@ceph-fs-dashboard-zl2us3-node5 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node5.svpkjw perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 1
        }
    }
    [ceph: root@ceph-fs-dashboard-zl2us3-node5 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node5.svpkjw perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 1
        }
    }

Stray got migrated to the MDS.0 node:

    [ceph: root@ceph-fs-dashboard-zl2us3-node5 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node5.svpkjw perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 0
        }
    }
    [ceph: root@ceph-fs-dashboard-zl2us3-node5 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node5.svpkjw perf dump mds_cache num_strays
    {
        "mds_cache": {
            "num_strays": 0
        }
    }

MDS stopped:

    [ceph: root@ceph-fs-dashboard-zl2us3-node5 /]# ceph daemon mds.cephfs.ceph-fs-dashboard-zl2us3-node5.svpkjw perf dump mds_cache num_strays
    {}
    [ceph: root@ceph-fs-dashboard-zl2us3-node5 /]#

No crash observed:

    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]# ceph crash ls
    [root@ceph-fs-dashboard-zl2us3-node9 24e4b482-3307-46b1-8d31-0d10a01e7ed7]#

Regards,
Amarnath

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage Security, Bug Fix, and Enhancement Update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5997
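For completeness, the repeated num_strays checks in the verification above can be wrapped in a small polling helper like the one below. This is an illustrative sketch, not part of the verification run: it shells out to the same `ceph daemon ... perf dump mds_cache num_strays` command shown in the transcripts, so it has to run on the host of the MDS being queried (for example inside `cephadm shell`), and the default daemon name is just the example from this cluster.

```python
# Poll num_strays from a local MDS admin socket until it reaches an expected
# value, mirroring the repeated `perf dump mds_cache num_strays` checks above.
import json
import subprocess
import sys
import time


def num_strays(daemon):
    out = subprocess.check_output(
        ["ceph", "daemon", daemon, "perf", "dump", "mds_cache", "num_strays"])
    data = json.loads(out)
    # An empty dump ({}) means the daemon is stopping, as seen in the transcript.
    return data.get("mds_cache", {}).get("num_strays")


def wait_for_strays(daemon, expected, timeout=120):
    """Return True once the local MDS reports the expected stray count."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        count = num_strays(daemon)
        print(f"{daemon}: num_strays={count}")
        if count == expected:
            return True
        time.sleep(5)
    return False


if __name__ == "__main__":
    # Example daemon name from the verification run; replace with your own.
    daemon = sys.argv[1] if len(sys.argv) > 1 else \
        "mds.cephfs.ceph-fs-dashboard-zl2us3-node3.mtyixc"
    ok = wait_for_strays(daemon, expected=1)
    sys.exit(0 if ok else 1)
```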