Bug 1403757

Summary: [Ganesha] : find hangs when coupled with new writes.
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Ambarish <asoman>
Component: nfs-ganesha
Assignee: Frank Filz <ffilz>
Status: CLOSED ERRATA
QA Contact: Manisha Saini <msaini>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.2
CC: amukherj, dang, ffilz, jthottan, kkeithle, msaini, pasik, rcyriac, rhinduja, rhs-bugs, sanandpa, sheggodu, skoduri, storage-qa-internal
Target Milestone: ---
Keywords: Triaged
Target Release: RHGS 3.5.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: nfs-ganesha-2.7.3-3
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1382912
Environment:
Last Closed: 2019-10-30 12:15:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1475699, 1581587, 1695078
Bug Blocks: 1696807

Comment 9 Manisha Saini 2018-08-06 18:03:18 UTC
Tested this use case against the nfs-ganesha-2.5.5-9.el7rhgs.x86_64 bits.

Steps:

1. Create a 6-node Ganesha cluster.
2. Create a 6*3 Distributed-Replicate volume.
3. Mount the volume on 4 different clients via 4 different VIPs (NFSv4).
4. Create 3 directories on the mount point.
5. Run dd in a loop from 3 clients, each in a different directory (creating 1 lakh, i.e. 100,000, files per client), and run find in a loop from client 4.


With the readdir feature, 3.4.0 shows a large improvement over previous Ganesha versions.

Hangs are still observed when find runs in parallel with new writes. However, when this issue was originally reported against older Ganesha bits, find was hung for ~36 hours (it never even started).

This has now improved from ~36 hours to ~2.5 hours, but it can still be optimized further.
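
For comparing such durations across builds, a single find pass can be timed from the client. The following is only a minimal sketch: the function name, the find options, and the example path are assumptions for illustration, not the commands used for the numbers above.

```shell
# Sketch: time one full find traversal of a directory tree.
# time_find_pass is a hypothetical helper, not part of the original test.
time_find_pass() {
    local dir=$1
    local start end count
    start=$(date +%s)
    # Count every regular file below the given directory.
    count=$(find "$dir" -mindepth 1 -type f | wc -l | tr -d '[:space:]')
    end=$(date +%s)
    echo "files=$count elapsed_seconds=$((end - start))"
}

# Example call (the path is an assumption):
#   time_find_pass /mnt/ganesha
```

Running it once per build while the dd writers are active gives a rough, comparable wall-clock figure for a complete traversal.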

Raised a separate BZ, 1612894, to track further improvements.

Moving this BZ to verified state.

Comment 10 Daniel Gryniewicz 2018-08-27 12:24:37 UTC
This should be moved out of 3.4, since dirent chunk is removed.

Comment 21 Manisha Saini 2019-06-07 17:50:13 UTC
Verified this BZ with:

# rpm -qa | grep ganesha
nfs-ganesha-2.7.3-3.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.7.3-3.el7rhgs.x86_64
glusterfs-ganesha-6.0-3.el7rhgs.x86_64
nfs-ganesha-gluster-2.7.3-3.el7rhgs.x86_64

Steps performed:
1. Create a 4-node Ganesha cluster.
2. Create a 4*3 Distributed-Replicate volume.
3. Export the volume via Ganesha.
4. Mount the volume on 4 different clients via the VIPs of 4 servers (NFSv4.1).
5. From 3 clients, run dd in a loop:

# Each iteration writes one ~1 MB file (10000 blocks of 100 bytes) and syncs its data to disk.
for i in {1..1000000}
do
        echo "$i"
        dd if=/dev/urandom of="/mnt/ganesha/stressc1$i" conv=fdatasync bs=100 count=10000
done

6. From client 4, run find in a loop while the I/O runs in parallel:

 while true; do find . -mindepth 1 -type f; done

7. Monitor the setup for any hangs.
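
The monitoring in step 7 can be partly automated on the client running find. This is a minimal sketch, assuming GNU coreutils timeout is available; the function name, the directory argument, and the time limit are illustrative and were not part of the verification run.

```shell
# Sketch: run one find pass under a time limit and report whether it finished.
# check_find_for_hang is a hypothetical helper; the limit is chosen by the caller.
check_find_for_hang() {
    local dir=$1 limit_seconds=$2
    local status=0
    # GNU timeout exits with status 124 when the command hits the time limit.
    timeout "$limit_seconds" find "$dir" -mindepth 1 -type f > /dev/null || status=$?
    if [ "$status" -eq 124 ]; then
        echo "HANG: find did not finish within ${limit_seconds}s"
        return 1
    fi
    echo "OK: find finished (exit status $status)"
}

# Example call (path and limit are assumptions):
#   check_find_for_hang /mnt/ganesha 3600
```

The limit should be set well above the expected duration of a full traversal, since a slow but progressing find would otherwise be reported as a hang.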

No hangs were observed. The find runs proceeded in parallel while I/O was in progress from the other clients.
Moving this BZ to verified state.

Comment 23 errata-xmlrpc 2019-10-30 12:15:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:3252