Bug 1974882
| Summary: | slow performance on parallel rm operations to the same PVC RWX based on CephFS | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | kelwhite |
| Component: | CephFS | Assignee: | Xiubo Li <xiubli> |
| Status: | CLOSED ERRATA | QA Contact: | Amarnath <amk> |
| Severity: | low | Docs Contact: | Ranjini M N <rmandyam> |
| Priority: | high | | |
| Version: | 5.0 | CC: | agunn, amk, bniver, ceph-eng-bugs, gfarnum, hchiramm, hyelloji, kdreyer, madam, mduasope, muagarwa, ocs-bugs, pdonnell, rmandyam, sostapov, tserlin, vereddy, xiubli, ykaul |
| Target Milestone: | --- | Keywords: | ABIAssurance, Performance |
| Target Release: | 5.1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ceph-16.2.7-14.el8cp | Doc Type: | Bug Fix |
| Doc Text: | The global `mds_lock` is now switched to a fair mutex for a better user experience<br>Previously, the Ceph Metadata Server (MDS) daemon used std::mutex for the global `mds_lock`, which could leave lock waiters stuck for several seconds under heavy load. This led to users experiencing slow `rmdir` or `mkdir` operations.<br>With this update, the MDS daemon's global `mds_lock` is switched to a fair mutex that guarantees lock waiters are woken up and scheduled in FIFO order, resulting in a better user experience and improved client performance under heavy load. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-04-04 10:21:12 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 2031073 | | |
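The Doc Text above describes the core of the fix: replacing the plain std::mutex behind the MDS global `mds_lock` with a fair mutex, so waiters acquire the lock in the order they arrived instead of racing (and possibly starving) every time the lock is released. As an illustration only, here is a minimal ticket-based sketch of a FIFO-fair mutex built from std::mutex and std::condition_variable; the class name and structure are made up for this example and are not the actual Ceph implementation.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Illustrative ticket-based fair mutex: lock() hands out tickets in arrival
// order and unlock() serves them strictly FIFO. A plain std::mutex makes no
// ordering guarantee, so under heavy contention a waiter can be stuck for a
// long time -- the behavior reported against mds_lock in this bug.
class fair_mutex_sketch {
public:
  void lock() {
    std::unique_lock<std::mutex> lk(mtx_);
    const uint64_t my_ticket = next_ticket_++;                // join the queue
    cv_.wait(lk, [&] { return my_ticket == now_serving_; });  // wait for my turn
  }

  void unlock() {
    {
      std::lock_guard<std::mutex> lk(mtx_);
      ++now_serving_;                                         // hand off to the next ticket
    }
    cv_.notify_all();                                         // only the matching waiter proceeds
  }

private:
  std::mutex mtx_;                 // protects the counters and backs the wait
  std::condition_variable cv_;
  uint64_t next_ticket_ = 0;       // next ticket to hand out
  uint64_t now_serving_ = 0;       // ticket currently allowed to hold the lock
};
```

With FIFO hand-off, each waiter's delay is bounded by the critical sections of the waiters ahead of it rather than by which thread happens to win the race when the lock drops, which is why the parallel `rm` timings in the verification runs below stay close together.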
Comment 2
Yaniv Kaul
2021-07-06 11:15:34 UTC
Not a 4.8 blocker, moving out. It looks like something went wrong with the assignment, as it is now assigned to an invalid user. Assigning to Humble so he can triage the performance degradation from OCS release to OCS release first.
Hi @Xiubo,
I tried the commands below on the 4.2 and 5.1 builds.
I see better performance on the 4.2 builds.
May I know if I am missing anything?
OS: RHEL 8.5
Commands:
mkdir /mnt/kcephfs.A/
mkdir /mnt/kcephfs.B
mkdir /mnt/kcephfs.C
ceph fs subvolume create cephfs subvol_A
ceph fs subvolume create cephfs subvol_B
ceph fs subvolume create cephfs subvol_C
ceph fs subvolume getpath cephfs subvol_A
ceph fs subvolume getpath cephfs subvol_B
ceph fs subvolume getpath cephfs subvol_C
mount -t ceph 10.0.209.255,10.0.210.106,10.0.211.95:/volumes/_nogroup/subvol_A/8c97cbcf-8806-451b-8796-afbd04d20b41 /mnt/kcephfs.A/ -o name=ceph-amk-bz-l93gok-node7,secretfile=/etc/ceph/ceph-amk-bz-l93gok-node7.secret
mount -t ceph 10.0.209.255,10.0.210.106,10.0.211.95:/volumes/_nogroup/subvol_B/e76cf51d-36f9-4f8a-beb7-f69b45bb74c8 /mnt/kcephfs.B/ -o name=ceph-amk-bz-l93gok-node7,secretfile=/etc/ceph/ceph-amk-bz-l93gok-node7.secret
mount -t ceph 10.0.209.255,10.0.210.106,10.0.211.95:/volumes/_nogroup/subvol_C/53e3ad08-1176-4d92-b5ef-917328a5b123 /mnt/kcephfs.C/ -o name=ceph-amk-bz-l93gok-node7,secretfile=/etc/ceph/ceph-amk-bz-l93gok-node7.secret
[root@ceph-upgrade-5-0-zcrq6x-node7 ~]# for d in A B C; do (cd /mnt/kcephfs.$d && for i in {0001..1000}; do mkdir -p removal$d.test$i; done ) & done; wait
[1] 74019
[2] 74020
[3] 74021
[1] Done ( cd /mnt/kcephfs.$d && for i in {0001..1000};
do
mkdir -p removal$d.test$i;
done )
[3]+ Done ( cd /mnt/kcephfs.$d && for i in {0001..1000};
do
mkdir -p removal$d.test$i;
done )
[2]+ Done ( cd /mnt/kcephfs.$d && for i in {0001..1000};
do
mkdir -p removal$d.test$i;
done )
[root@ceph-upgrade-5-0-zcrq6x-node7 ~]# for i in A B C; do (cd /mnt/kcephfs.$i && time strace -Tv -o ~/removal${i}.log -- rm -rf removal$i*) & done; wait
[1] 77022
[2] 77023
[3] 77024
real 0m1.512s
user 0m0.109s
sys 0m0.225s
real 0m1.516s
user 0m0.110s
sys 0m0.229s
[1] Done ( cd /mnt/kcephfs.$i && time strace -Tv -o ~/removal${i}.log -- rm -rf removal$i* )
[3]+ Done ( cd /mnt/kcephfs.$i && time strace -Tv -o ~/removal${i}.log -- rm -rf removal$i* )
real 0m1.566s
user 0m0.116s
sys 0m0.205s
[2]+ Done ( cd /mnt/kcephfs.$i && time strace -Tv -o ~/removal${i}.log -- rm -rf removal$i* )
[root@ceph-upgrade-5-0-zcrq6x-node7 ~]# ceph version
ceph version 16.2.7-69.el8cp (3eaf40c02886a02f9b172579ac6048bad587b63b) pacific (stable)
========================================================================================================================================================================================================================
[root@ceph-amk-bz-l93gok-node7 ~]# for d in A B C; do (cd /mnt/kcephfs.$d && for i in {0001..1000}; do mkdir -p removal$d.test$i; done ) & done; wait
[1] 83007
[2] 83008
[3] 83009
[1] Done ( cd /mnt/kcephfs.$d && for i in {0001..1000};
do
mkdir -p removal$d.test$i;
done )
[2]- Done ( cd /mnt/kcephfs.$d && for i in {0001..1000};
do
mkdir -p removal$d.test$i;
done )
[3]+ Done ( cd /mnt/kcephfs.$d && for i in {0001..1000};
do
mkdir -p removal$d.test$i;
done )
[root@ceph-amk-bz-l93gok-node7 ~]# for i in A B C; do (cd /mnt/kcephfs.$i && time strace -Tv -o ~/removal${i}.log -- rm -rf removal$i*) & done; wait
[1] 86010
[2] 86011
[3] 86012
real 0m1.661s
user 0m0.112s
sys 0m0.275s
real 0m1.796s
user 0m0.119s
sys 0m0.213s
real 0m1.816s
user 0m0.103s
sys 0m0.246s
[1] Done ( cd /mnt/kcephfs.$i && time strace -Tv -o ~/removal${i}.log -- rm -rf removal$i* )
[2]- Done ( cd /mnt/kcephfs.$i && time strace -Tv -o ~/removal${i}.log -- rm -rf removal$i* )
[3]+ Done ( cd /mnt/kcephfs.$i && time strace -Tv -o ~/removal${i}.log -- rm -rf removal$i* )
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage 5.1 Security, Enhancement, and Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:1174