Description of problem:

Ganesha crashes on one node during volume restart when performance.client-io-threads is off.

Version-Release number of selected component (if applicable):

[root@dhcp42-59 ~]# rpm -qa|grep ganesha
nfs-ganesha-2.4.0-2.el6rhs.x86_64
nfs-ganesha-gluster-2.4.0-2.el6rhs.x86_64
glusterfs-ganesha-3.8.4-2.el6rhs.x86_64

How reproducible:

Consistent

Steps to Reproduce:
1. Create a ganesha cluster, create a volume and enable ganesha on it.
2. Set performance.client-io-threads to off.
3. Stop the volume and then start the volume.
4. Observe that, most of the time, ganesha crashes on one of the nodes with the messages below in ganesha.log:

05/10/2016 20:13:12 : epoch dcf30000 : dhcp42-96.lab.eng.blr.redhat.com : ganesha.nfsd-23973[dbus_heartbeat] unregister_fsal :FSAL :CRIT :Unregister FSAL GLUSTER with non-zero refcount=1
05/10/2016 20:13:12 : epoch dcf30000 : dhcp42-96.lab.eng.blr.redhat.com : ganesha.nfsd-23973[dbus_heartbeat] glusterfs_unload :FSAL :CRIT :FSAL Gluster unable to unload. Dying ...

5. No backtrace is seen:

[Inferior 1 (process 6944) exited with code 02]
(gdb) bt
No stack.
(gdb)

6. Apart from this, there is another observation that is seen only on RHEL 6: every time ganesha crashes on one node, I see an unwanted entry created under the exports folder.

[root@dhcp42-59 exports]# pwd
/var/run/gluster/shared_storage/nfs-ganesha/exports
[root@dhcp42-59 exports]# ls -ltr
total 1
----------. 1 root root   0 Oct 5 19:45 sedbj8MSj
----------. 1 root root   0 Oct 5 19:53 sedT7DQPC
----------. 1 root root   0 Oct 5 19:59 sedCneQYU
----------. 1 root root   0 Oct 5 20:01 sed1lcMM7
----------. 1 root root   0 Oct 5 20:02 sedSpRC1X
----------. 1 root root   0 Oct 5 20:02 sedAFz6LR
----------. 1 root root   0 Oct 5 20:13 sedVI3QsZ
----------. 1 root root 509 Oct 5 20:26 sed0uVXuV
----------. 1 root root   0 Oct 5 20:30 sedcl5uS8
-rw-r--r--. 1 root root 509 Oct 5 20:33 export.ozone.conf
----------. 1 root root   0 Oct 5 20:33 sedSFcZDV

Actual results:

Ganesha crashes on one node during volume restart when performance.client-io-threads is off. On RHEL 6 only, unwanted entries also appear under the exports folder whenever this crash is seen.

Expected results:

There should not be any crashes.

Additional info:
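For convenience, steps 2 to 4 above can be sketched as a small shell helper. This is only a sketch under assumptions: the function names and the DRY_RUN switch are hypothetical, the volume name "ozone" is inferred from export.ozone.conf in the listing, and the location of ganesha.log is left to the caller because the report does not give it.

```shell
#!/bin/sh
# Hypothetical helpers sketching steps 2-4 of the report.
# Set DRY_RUN=1 to print the gluster commands instead of running them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 2 and 3: turn client-io-threads off, then stop and start the volume.
restart_with_io_threads_off() {
    vol="$1"
    run gluster volume set "$vol" performance.client-io-threads off
    # --mode=script skips the interactive confirmation that "volume stop" asks for
    run gluster --mode=script volume stop "$vol"
    run gluster volume start "$vol"
}

# Step 4: succeed (exit 0) if the given ganesha.log contains the fatal
# message quoted in the report.
log_shows_unload_crash() {
    grep -q 'FSAL Gluster unable to unload' "$1"
}
```

A possible invocation would be `restart_with_io_threads_off ozone` followed by `log_shows_unload_crash <path to ganesha.log>` on each node of the cluster.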
(In reply to Shashank Raj from comment #0)
> Description of problem:
>
> Ganesha crashes on one node during volume restart when
> performance.client-io-threads is off.
>
> Version-Release number of selected component (if applicable):
>
> [root@dhcp42-59 ~]# rpm -qa|grep ganesha
> nfs-ganesha-2.4.0-2.el6rhs.x86_64
> nfs-ganesha-gluster-2.4.0-2.el6rhs.x86_64
> glusterfs-ganesha-3.8.4-2.el6rhs.x86_64
>
>
> How reproducible:
>
> Consistent
>
> Steps to Reproduce:
> 1.Create a ganesha cluster, create a volume and enable ganesha on it.
> 2.Set performance.client-io-threads to off.
> 3.Stop the volume and then start the volume.
> 4.Observe that most of the times, ganesha crashes on one of the nodes with
> below messages in ganesha.log
>
> 05/10/2016 20:13:12 : epoch dcf30000 : dhcp42-96.lab.eng.blr.redhat.com :
> ganesha.nfsd-23973[dbus_heartbeat] unregister_fsal :FSAL :CRIT :Unregister
> FSAL GLUSTER with non-zero refcount=1
> 05/10/2016 20:13:12 : epoch dcf30000 : dhcp42-96.lab.eng.blr.redhat.com :
> ganesha.nfsd-23973[dbus_heartbeat] glusterfs_unload :FSAL :CRIT :FSAL
> Gluster unable to unload. Dying ...
>
> 5.No bt is seen:
>
> [Inferior 1 (process 6944) exited with code 02]
> (gdb) bt
> No stack.
> (gdb)
>
> 6. Also apart from this there is another observation which is seen only on
> RHEL 6:
>

I also hit a similar issue upstream on one of the nodes. AFAIK it was a packaging issue (conflicting/mismatched packages); once I cleaned everything up, it worked fine. I suspect this behavior is similar.

> everytime it crashes on one node, i see an unwanted entry getting created
> under exports folder.
>
> [root@dhcp42-59 exports]# pwd
> /var/run/gluster/shared_storage/nfs-ganesha/exports
> [root@dhcp42-59 exports]# ls -ltr
> total 1
> ----------. 1 root root 0 Oct 5 19:45 sedbj8MSj
> ----------. 1 root root 0 Oct 5 19:53 sedT7DQPC
> ----------. 1 root root 0 Oct 5 19:59 sedCneQYU
> ----------. 1 root root 0 Oct 5 20:01 sed1lcMM7
> ----------. 1 root root 0 Oct 5 20:02 sedSpRC1X
> ----------. 1 root root 0 Oct 5 20:02 sedAFz6LR
> ----------. 1 root root 0 Oct 5 20:13 sedVI3QsZ
> ----------. 1 root root 509 Oct 5 20:26 sed0uVXuV
> ----------. 1 root root 0 Oct 5 20:30 sedcl5uS8
> -rw-r--r--. 1 root root 509 Oct 5 20:33 export.ozone.conf
> ----------. 1 root root 0 Oct 5 20:33 sedSFcZDV
>

These are temporary files created by the "sed -i" operation in the script. If the operation fails for some reason (here, possibly because the ganesha process aborted), the temporary files are left behind.

> Actual results:
>
> Ganesha crashes on one node during volume restart when
> performance.client-io-threads is off and unwanted entries are seen under
> exports folder in case of only RHEL 6 whenever we this crash is seen.
>
> Expected results:
>
> There should not be any crashes.
>
> Additional info:
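To check a node for such leftovers, something like the following might help. It is a minimal sketch, not part of the ganesha scripts: the function name is hypothetical, and the pattern simply assumes the "sed" plus six characters naming seen in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: list regular files directly under the given directory
# whose names look like leftover "sed -i" temporaries (sed + 6 characters).
list_sed_leftovers() {
    find "$1" -maxdepth 1 -type f -name 'sed??????'
}
```

For example, `list_sed_leftovers /var/run/gluster/shared_storage/nfs-ganesha/exports` would print the stray entries. Note that one of the files in the listing (sed0uVXuV) is 509 bytes, the same size as export.ozone.conf, so it is probably worth inspecting the matches before removing anything.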
The patch has been merged downstream (https://code.engineering.redhat.com/gerrit/#/c/86103/) and is available in the latest gluster bits.
With the following steps, the issue is no longer seen:

1. Create a ganesha cluster, create a volume and enable ganesha on it.
2. Set performance.client-io-threads to off.
3. Stop the volume and then start the volume.
4. Observe the logs for any crashes or error messages.

No crash is seen across multiple starts and stops of the gluster volume with client-io-threads set to off.

With client-io-threads on, crashes are still seen on volume stop; these are tracked in another BZ.

Moving this BZ to verified.

nfs-ganesha-2.4.1-1.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-1.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-5.el7rhgs.x86_64
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2017-0493.html