Description of problem:
=====================
Hit this issue on the same setup as https://bugzilla.redhat.com/show_bug.cgi?id=1730654

du -sh gives inconsistent output while ls -lRt and find (named/unnamed) are running.

Note: The Linux untars errored out, but lookups were still running from other clients. No new I/O was triggered while the du -sh output was captured.

Version-Release number of selected component (if applicable):
==============================
# rpm -qa | grep ganesha
nfs-ganesha-2.7.3-5.el7rhgs.x86_64
nfs-ganesha-debuginfo-2.7.3-5.el7rhgs.x86_64
nfs-ganesha-gluster-2.7.3-5.el7rhgs.x86_64
glusterfs-ganesha-6.0-7.el7rhgs.x86_64

How reproducible:
================
1/1

Steps to Reproduce:
=================
1. Create 8 node ganesha cluster
2. Create 8*3 Distributed-Replicate Volume
3. Export the volume via ganesha
4. Mount the volume on 5 clients via v4.1
5. Run the following workload
Client 1: Linux untars for large dirs
Client 2: du -sh in loop
Client 3: ls -lRt in loop
Client 4: find . -mindepth 1 -type f -name _04_* in loop
Client 5: find . -mindepth 1 -type f in loop

Actual results:
=============
Linux untar errored out - BZ 1730654

Took 3 iterations of du -sh from 2 clients on the same setup (no new I/O was triggered)

Client 1:
---------
[root@f12-h08-000-1029u ganesha]# du -sh
49G .
[root@f12-h08-000-1029u ganesha]# du -sh
85G .
[root@f12-h08-000-1029u ganesha]# du -sh
439G

Client 2:
--------
[root@f12-h12-000-1029u ganesha]# while true;do du -sh;done |
43G .
34G

Expected results:
===========
du -sh output should be consistent

Additional info:
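The consistency check described above (repeated du runs with no new I/O should all report the same size) can be sketched as a small shell helper. This is only an illustration and not part of the original report: the function names count_distinct_sizes and sample_du are hypothetical, and du -sk is used instead of du -sh on the assumption that kilobyte counts are easier to compare than rounded human-readable sizes.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the original report.

# Reads lines in "du -s" format (SIZE<whitespace>PATH) on stdin and
# prints how many distinct SIZE values were seen.
# A result of 1 means every sample agreed.
count_distinct_sizes() {
    awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Prints SAMPLES consecutive "du -sk" results for DIR, one per line.
# Assumption: no writer is modifying DIR while the samples are taken.
sample_du() {
    dir=$1
    samples=$2
    i=0
    while [ "$i" -lt "$samples" ]; do
        du -sk "$dir"
        i=$((i + 1))
    done
}
```

For example, running `sample_du . 3 | count_distinct_sizes` from the mount point takes three samples; any result other than 1 would correspond to the inconsistency reported here.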
Looking this over, I think there's enough debugging, as long as NFS_READDIR is at FULL_DEBUG.
Ran the test mentioned in comment 0 of the BZ with the test build for nfs and kernel provided in comment 17

# rpm -qa | grep ganesha
nfs-ganesha-gluster-2.7.3-6.el7rhgs.TESTFIX1.x86_64
nfs-ganesha-2.7.3-6.el7rhgs.TESTFIX1.x86_64
glusterfs-ganesha-6.0-9.el7rhgs.TESTFIX.bz1730654.x86_64
nfs-ganesha-debuginfo-2.7.3-6.el7rhgs.TESTFIX1.x86_64

# rpm -qa | grep kernel
kernel-3.10.0-1062.el7.bz1732427.x86_64
kernel-3.10.0-1058.el7.x86_64
kernel-3.10.0-1061.el7.x86_64
abrt-addon-kerneloops-2.1.11-55.el7.x86_64
kernel-tools-3.10.0-1062.el7.bz1732427.x86_64
kernel-tools-libs-3.10.0-1062.el7.bz1732427.x86_64

Ran the test over the weekend. While the Linux untar was in progress, a minor inconsistency was observed in du (screenshot attached). No files were deleted while the test was in progress. Let me know if this is expected.

Once the Linux untar completed, du -sh gave consistent output when run with parallel lookups.

Terminal output-
-------
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
-------
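Since no files were deleted during the run, one way to characterize the inconsistency seen while the untar was active is to look for samples where the reported size went down between consecutive du runs. The sketch below is only an illustration under that assumption: report_decreases is a hypothetical helper, it expects du -sk style input (numeric kilobytes), and it does not by itself say whether a transient decrease is expected behavior.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the original report.
# Reads "du -sk" lines (KILOBYTES<whitespace>PATH) on stdin and prints
# every sample whose size is smaller than the sample before it.
# Prints nothing when the sizes never decrease.
report_decreases() {
    awk '
        NR > 1 && $1 + 0 < prev {
            printf "sample %d: %d -> %d\n", NR, prev, $1
        }
        { prev = $1 + 0 }
    '
}
```

A possible use is to collect du -sk output in a loop on one client while the untar runs on another, then pipe the collected lines through report_decreases after the run.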
Created attachment 1594257 [details] Du output when linux untar was running in parallel
Created attachment 1594263 [details] Du output when linux untar was running in parallel
Verified this BZ with

# rpm -qa | grep ganesha
nfs-ganesha-2.7.3-7.el7rhgs.x86_64
glusterfs-ganesha-6.0-11.el7rhgs.x86_64
nfs-ganesha-gluster-2.7.3-7.el7rhgs.x86_64

Steps:
========
1. Create 4 node ganesha cluster
2. Create 4*3 Distributed-Replicate Volume
3. Export the volume via ganesha
4. Mount the volume on 3 clients via v4.1
5. Run the following workload
Client 1: Linux untars for large dirs
Client 2: du -sh in loop
Client 3: ls -lRt in loop

=======
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
11G .
========

du -sh output is consistent. Moving this BZ to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:3252