Description of problem:
The Ganesha process crashed while deleting the diskfill utility contents from a Windows NFS mount.

Version-Release number of selected component (if applicable):
# rpm -qa | grep ganesha
nfs-ganesha-2.5.5-3.el7rhgs.x86_64
nfs-ganesha-gluster-2.5.5-3.el7rhgs.x86_64
glusterfs-ganesha-3.12.2-6.el7rhgs.x86_64

How reproducible:
2/2

Steps to Reproduce:
1. Create a 4-node Ganesha cluster.
2. Create a 4*3 Distributed-Replicate volume and export the volume via Ganesha.
3. Mount the volume on a Windows client:
   #mount \\10.70.46.37\Ganesha1 N:
4. Mount the same volume on a Linux client and change the permissions of the NFS share with chmod 777.
5. Copy the diskfill utility zip file to the Windows NFS mount point.
6. Try extracting the contents of the zip folder. Ganesha crashed after this step; see https://bugzilla.redhat.com/show_bug.cgi?id=1562766
7. Restart the Ganesha service on the node that crashed. The pcs cluster is now healthy.
8. Delete the zip file and the extracted content from the Windows NFS mount.

Actual results:
At step 8, Ganesha crashed again.

(gdb) c
Continuing.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fccc1722700 (LWP 8247)]
0x00007fcd63c5abcc in free () from /lib64/libc.so.6
(gdb) bt
#0  0x00007fcd63c5abcc in free () from /lib64/libc.so.6
#1  0x00007fcd5f61a3f8 in glusterfs_close_my_fd () from /usr/lib64/ganesha/libfsalgluster.so
#2  0x00007fcd5f61a564 in glusterfs_close2 () from /usr/lib64/ganesha/libfsalgluster.so
#3  0x000055de9c01ea66 in mdcache_close2 ()
#4  0x000055de9bfd2b57 in dec_nlm_state_ref ()
#5  0x000055de9bfa1a17 in nlm4_Share ()
#6  0x000055de9bf5f3cb in nfs_rpc_execute ()
#7  0x000055de9bf60a2a in worker_run ()
#8  0x000055de9bfefc59 in fridgethr_start_routine ()
#9  0x00007fcd64607dd5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007fcd63cd3b3d in clone () from /lib64/libc.so.6
(gdb) generate-core-file

Expected results:
Ganesha should not crash while deleting files on a Windows NFS mount.

Additional info:
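Note on the crash signature: a SIGSEGV inside free() reached from a close path usually indicates heap misuse, for example a buffer freed twice or a pointer freed before it was initialized. The backtrace above does not establish which case applies here. The following is only a minimal C sketch of a close routine that is safe to call more than once. All names in it (demo_fd, demo_open, demo_close) are hypothetical; it is not the FSAL_GLUSTER code and not the fix that was eventually merged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-open-file descriptor that owns a heap buffer. */
struct demo_fd {
    bool    is_open;
    int    *groups;   /* heap-allocated, owned by this struct */
    size_t  ngroups;
};

/* Open: allocate the owned buffer. Returns 0 on success, -1 on allocation failure. */
static int demo_open(struct demo_fd *fd, size_t ngroups)
{
    assert(fd != NULL);
    fd->groups = calloc(ngroups ? ngroups : 1, sizeof(*fd->groups));
    if (fd->groups == NULL) {
        fd->ngroups = 0;
        fd->is_open = false;
        return -1;
    }
    fd->ngroups = ngroups;
    fd->is_open = true;
    return 0;
}

/*
 * Close: release the buffer exactly once.
 * Returns 1 if resources were released, 0 if the descriptor was already closed.
 * Resetting the pointer and the flag is what makes a second call harmless.
 */
static int demo_close(struct demo_fd *fd)
{
    assert(fd != NULL);
    if (!fd->is_open)
        return 0;
    free(fd->groups);
    fd->groups = NULL;
    fd->ngroups = 0;
    fd->is_open = false;
    return 1;
}
```

With this pattern, a second demo_close() on the same descriptor returns 0 and never reaches free(), so two code paths that both try to release the descriptor cannot free the same buffer twice.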
I think I can recreate this. I will work on a fix.
I have a set of 4 patches that I developed while working this out:

https://review.gerrithub.io/#/c/406501/
https://review.gerrithub.io/#/c/406502/
https://review.gerrithub.io/#/c/406503/
https://review.gerrithub.io/#/c/406504/

This specific issue is fixed by https://review.gerrithub.io/#/c/406502/
However, that patch also depends on the others (the first is just improved debug).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2610