Bug 1909949

Summary: nfs mounts are inaccessible due to hang
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Yogesh Mane <ymane>
Component: CephFS
Assignee: Patrick Donnelly <pdonnell>
Status: CLOSED ERRATA
QA Contact: Yogesh Mane <ymane>
Severity: high
Docs Contact: Amrita <asakthiv>
Priority: medium
Version: 5.0
CC: asakthiv, ceph-eng-bugs, dfuller, hyelloji, kdreyer, sweil, vereddy
Target Milestone: ---
Target Release: 5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: nfs-ganesha-3.3-3.el8cp, nfs-ganesha-3.3-3.el7cp
Doc Type: Known Issue
Doc Text:
.NFS mounts are now accessible with multiple exports
Previously, when multiple CephFS exports were created, reads and writes to the exports would hang. As a result, the NFS mounts were inaccessible. To work around this issue, only single exports are supported with Ganesha version 3.3-2 and below. With this release, multiple CephFS exports are supported when Ganesha version 3.3-3 or above is used.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-08-30 08:27:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1851102    
Bug Blocks: 1959686    
Attachments:
  Description      Flags
  nfs-server logs  none
  mgr_logs         none

Comment 5 Varsha 2020-12-23 07:22:01 UTC
Please share logs and steps to reproduce it.

Comment 6 Yogesh Mane 2020-12-23 16:41:52 UTC
Created attachment 1741590 [details]
nfs-server logs

Comment 7 Yogesh Mane 2020-12-23 16:42:12 UTC
1. Create multiple NFS exports (export creation and mount commands are sketched after this list).
2. Mount NFS on 4 clients, each with a different export.
3. Create 3 directories and run I/O on each directory and on the root directory of each mount (directories d1, d2, d3).
4. Create 4 new directories and run I/O on each directory on each mount (directories d4, d5, d6, d7).
5. Delete these 4 directories simultaneously. Deletion will take time; stop it part-way through, after about 1 minute.
6. Try to access the directories; a hang occurs.
7. Log in to the client again and try to access the mount; a hang occurs.
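
A minimal sketch of steps 1 and 2, assuming a Pacific-era mgr nfs module with positional arguments and the default NFS port 2049; the filesystem name, NFS cluster ID, pseudo paths, server hostname, and mount points below are illustrative, so check "ceph nfs export create cephfs --help" on your build for the exact syntax:

# On the Ceph cluster: create several CephFS exports in an existing NFS cluster "mynfs"
ceph nfs export create cephfs cephfs mynfs /export1
ceph nfs export create cephfs cephfs mynfs /export2
ceph nfs export create cephfs cephfs mynfs /export3
ceph nfs export create cephfs cephfs mynfs /export4

# On each client: mount a different export
mount -t nfs -o port=2049 nfs-server.example.com:/export1 /mnt/export1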

Step 3) Directory and command run in that directory:
root= for n in {1..7000}; do     dd if=/dev/urandom of=uile$( printf %03d "$n" ) bs=4k count=1; done
d1= for n in {1..1000}; do     dd if=/dev/urandom of=file$( printf %03d "$n" ) bs=10M count=100; done
d2= for n in {1..30}; do     dd if=/dev/urandom of=mile$( printf %03d "$n" ) bs=30M count=1000; done
d3= for n in {1..100000}; do     dd if=/dev/urandom of=tile$( printf %03d "$n" ) bs=10M count=10; done

Step 4) Commands used to run I/O on the different mounts:

d4= for n in {1..2000000}; do     dd if=/dev/urandom of=uile$( printf %03d "$n" ) bs=4k count=1; done
d5= for n in {1..1000}; do     dd if=/dev/urandom of=file$( printf %03d "$n" ) bs=10M count=100; done
d6= for n in {1..30}; do     dd if=/dev/urandom of=mile$( printf %03d "$n" ) bs=30M count=1000; done
d7= for n in {1..100000}; do     dd if=/dev/urandom of=tile$( printf %03d "$n" ) bs=10M count=10; done

Step 5) Start deleting the directories (d4, d5, d6, d7 simultaneously; a scripted sketch follows these commands)
rm -rf d4
rm -rf d5
rm -rf d6
rm -rf d7
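
One way to script step 5, running the four deletions in parallel and interrupting them after roughly a minute, assuming an illustrative mount point of /mnt/export1:

cd /mnt/export1                                  # illustrative mount point holding d4..d7
rm -rf d4 & rm -rf d5 & rm -rf d6 & rm -rf d7 &  # delete all four directories in parallel
sleep 60                                         # let the deletions run for about a minute
kill $(jobs -p)                                  # stop the deletions part-way through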

Comment 8 Yogesh Mane 2020-12-24 00:38:03 UTC
One correction in step 3:

root= for n in {1..2000000}; do     dd if=/dev/urandom of=uile$( printf %03d "$n" ) bs=4k count=1; done

And for step 3), I/O was stopped when the root directory contained around 7000 files.

And for step 4), I/O was stopped when the Ceph storage "size" was around 333G and "RAW USED" was around 1000G.
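
These figures presumably come from the cluster usage summary, i.e. something like:

ceph df    # check the SIZE and RAW USED columns of the RAW STORAGE section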

Comment 9 Varsha 2020-12-29 11:59:40 UTC
The Ganesha logs are no longer saved to syslog. Please share the mgr and nfs-ganesha container logs: podman logs <container_id>.
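
For example (the container ID below is a placeholder):

podman ps --format '{{.ID}} {{.Names}}' | grep -i ganesha   # find the nfs-ganesha container
podman logs <container_id> > nfs-ganesha.log 2>&1           # capture its logs to a file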

Comment 10 Yogesh Mane 2020-12-29 12:31:35 UTC
Created attachment 1742902 [details]
mgr_logs

These are the mgr logs. I attached the nfs-server logs in comment 6.

Comment 11 Varsha 2021-01-27 17:47:28 UTC
This is most likely due to bug https://bugzilla.redhat.com/show_bug.cgi?id=1851102. Using Ganesha version 3.3-3 or above should resolve it.
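
A quick way to confirm which nfs-ganesha build is actually in use (the container ID is a placeholder):

podman exec <container_id> rpm -q nfs-ganesha   # the fix needs nfs-ganesha-3.3-3.el8cp / .el7cp or newer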

Comment 17 Varsha 2021-05-25 13:32:31 UTC
Looks good

Comment 20 errata-xmlrpc 2021-08-30 08:27:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294