Bug 1381940
Summary: | Ganesha crashes on one node during volume restart when performance.client-io-threads is off. | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Shashank Raj <sraj> |
Component: | nfs-ganesha | Assignee: | Jiffin <jthottan> |
Status: | CLOSED ERRATA | QA Contact: | surabhi <sbhaloth> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | rhgs-3.2 | CC: | jthottan, kkeithle, mzywusko, ndevos, rcyriac, rhinduja, rhs-bugs, sbhaloth, skoduri, storage-qa-internal |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | RHGS 3.2.0 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | glusterfs-3.8.4-3 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2017-03-23 06:24:05 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1351528 |
Description
Shashank Raj
2016-10-05 11:52:40 UTC
(In reply to Shashank Raj from comment #0)
> Description of problem:
>
> Ganesha crashes on one node during volume restart when
> performance.client-io-threads is off.
>
> Version-Release number of selected component (if applicable):
>
> [root@dhcp42-59 ~]# rpm -qa|grep ganesha
> nfs-ganesha-2.4.0-2.el6rhs.x86_64
> nfs-ganesha-gluster-2.4.0-2.el6rhs.x86_64
> glusterfs-ganesha-3.8.4-2.el6rhs.x86_64
>
> How reproducible:
>
> Consistent
>
> Steps to Reproduce:
> 1. Create a ganesha cluster, create a volume and enable ganesha on it.
> 2. Set performance.client-io-threads to off.
> 3. Stop the volume and then start the volume.
> 4. Observe that most of the time, ganesha crashes on one of the nodes with
> the messages below in ganesha.log:
>
> 05/10/2016 20:13:12 : epoch dcf30000 : dhcp42-96.lab.eng.blr.redhat.com : ganesha.nfsd-23973[dbus_heartbeat] unregister_fsal :FSAL :CRIT :Unregister FSAL GLUSTER with non-zero refcount=1
> 05/10/2016 20:13:12 : epoch dcf30000 : dhcp42-96.lab.eng.blr.redhat.com : ganesha.nfsd-23973[dbus_heartbeat] glusterfs_unload :FSAL :CRIT :FSAL Gluster unable to unload. Dying ...
>
> 5. No bt is seen:
>
> [Inferior 1 (process 6944) exited with code 02]
> (gdb) bt
> No stack.
> (gdb)
>
> 6. Apart from this, there is another observation which is seen only on
> RHEL 6:

I hit a similar issue upstream on one of the nodes. AFAIK it was a package issue (conflicting or mismatched packages); once I cleaned everything up, it worked fine. I suspect this behavior is similar.

> Every time it crashes on one node, I see an unwanted entry getting created
> under the exports folder.
>
> [root@dhcp42-59 exports]# pwd
> /var/run/gluster/shared_storage/nfs-ganesha/exports
> [root@dhcp42-59 exports]# ls -ltr
> total 1
> ----------. 1 root root 0 Oct 5 19:45 sedbj8MSj
> ----------. 1 root root 0 Oct 5 19:53 sedT7DQPC
> ----------. 1 root root 0 Oct 5 19:59 sedCneQYU
> ----------. 1 root root 0 Oct 5 20:01 sed1lcMM7
> ----------. 1 root root 0 Oct 5 20:02 sedSpRC1X
> ----------. 1 root root 0 Oct 5 20:02 sedAFz6LR
> ----------. 1 root root 0 Oct 5 20:13 sedVI3QsZ
> ----------. 1 root root 509 Oct 5 20:26 sed0uVXuV
> ----------. 1 root root 0 Oct 5 20:30 sedcl5uS8
> -rw-r--r--. 1 root root 509 Oct 5 20:33 export.ozone.conf
> ----------. 1 root root 0 Oct 5 20:33 sedSFcZDV

These are temporary files created by the "sed -i" operation in the script. If the operation fails for some reason (here, possibly because the ganesha process aborted), the temporary files are left behind.

> Actual results:
>
> Ganesha crashes on one node during volume restart when
> performance.client-io-threads is off, and unwanted entries are seen under
> the exports folder, only on RHEL 6, whenever this crash is seen.
>
> Expected results:
>
> There should not be any crashes.
>
> Additional info:

The patch got merged downstream (https://code.engineering.redhat.com/gerrit/#/c/86103/) and is available in the latest gluster bits.

With the following steps, the issue mentioned is not seen:
1. Create a ganesha cluster, create a volume and enable ganesha on it.
2. Set performance.client-io-threads to off.
3. Stop the volume and then start the volume.
4. Observe for any crashes or error messages in the logs.

No crash is seen across multiple starts and stops of the gluster volume with client-io-threads set to off.
With client-io-threads on, crashes are seen on volume stop; that is tracked in another BZ.
Moving this BZ to verified.

nfs-ganesha-2.4.1-1.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-1.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-5.el7rhgs.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2017-0493.html
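
For anyone checking whether a node was affected by the leftover "sed -i" temporary files described in this report, the following is a minimal sketch, not part of the fix. It assumes the leftovers follow the naming seen in the listing (the prefix "sed" followed by six alphanumeric characters); the function name find_stale_sed_temp_files and the pattern are illustrative assumptions, not anything shipped with gluster or nfs-ganesha.

```python
import re
from pathlib import Path

# Assumption: leftover temp files look like the ones in the report's listing,
# i.e. "sed" followed by exactly six alphanumeric characters (e.g. sedbj8MSj).
SED_TEMP_NAME = re.compile(r"^sed[A-Za-z0-9]{6}$")


def find_stale_sed_temp_files(directory):
    """Return the sorted names of regular files in `directory` whose names
    match the assumed "sed -i" temp-file pattern. Nothing is deleted."""
    return sorted(
        entry.name
        for entry in Path(directory).iterdir()
        if entry.is_file() and SED_TEMP_NAME.match(entry.name)
    )


if __name__ == "__main__":
    # Exports directory path taken from the report; adjust for your setup.
    exports_dir = "/var/run/gluster/shared_storage/nfs-ganesha/exports"
    if Path(exports_dir).is_dir():
        for name in find_stale_sed_temp_files(exports_dir):
            print(name)
```

The sketch only lists candidates so that they can be reviewed before any cleanup; real export files such as export.ozone.conf do not match the pattern and are never reported.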