Description of problem:
Ganesha process crashes on all the nodes with segfault error with latest libntirpc packages, while enabling nfs-ganesha on the cluster.

Version-Release number of selected component (if applicable):
[root@dhcp43-116 ~]# rpm -qa|grep ganesha
nfs-ganesha-next.20160813.2f47e8a-1.el7.centos.x86_64
glusterfs-ganesha-3.8.2-0.1.gitd33aa0b.el7rhgs.x86_64
nfs-ganesha-gluster-next.20160813.2f47e8a-1.el7.centos.x86_64

[root@dhcp43-116 ~]# rpm -qa|grep libntirpc
libntirpc-duplex13.20160726.d375195-1.el7.centos.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install the latest nfs-ganesha and libntirpc packages from the below locations:
   http://artifacts.ci.centos.org/nfs-ganesha/nightly/next/7/x86_64/x86_64/
   http://artifacts.ci.centos.org/nfs-ganesha/nightly/libntirpc/duplex13/7/x86_64/
2. Try to setup ganesha on the cluster
3. Observe that after enabling ganesha on the cluster, ganesha process on all the nodes crashes with segfault error

[root@dhcp43-116 ~]# gluster nfs-ganesha enable
Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted pool. Do you still want to continue?
 (y/n) y
This will take a few minutes to complete. Please wait ..
nfs-ganesha : success

[root@dhcp43-116 ~]# service nfs-ganesha status
Redirecting to /bin/systemctl status nfs-ganesha.service
● nfs-ganesha.service - NFS-Ganesha file server
   Loaded: loaded (/usr/lib/systemd/system/nfs-ganesha.service; disabled; vendor preset: disabled)
   Active: failed (Result: signal) since Wed 2016-08-24 17:14:34 IST; 4min 18s ago
     Docs: http://github.com/nfs-ganesha/nfs-ganesha/wiki
 Main PID: 9943 (code=killed, signal=SEGV)

Aug 24 17:14:34 dhcp43-116.lab.eng.blr.redhat.com systemd[1]: Starting NFS-Ganes...
Aug 24 17:14:34 dhcp43-116.lab.eng.blr.redhat.com systemd[1]: Started NFS-Ganesh...
Aug 24 17:14:34 dhcp43-116.lab.eng.blr.redhat.com systemd[1]: nfs-ganesha.servic...
Aug 24 17:14:34 dhcp43-116.lab.eng.blr.redhat.com systemd[1]: Unit nfs-ganesha.s...
Aug 24 17:14:34 dhcp43-116.lab.eng.blr.redhat.com systemd[1]: nfs-ganesha.servic...
Hint: Some lines were ellipsized, use -l to show in full.

Following segfault error messages are seen on all the nodes:

[592907.834912] ganesha.nfsd[9943]: segfault at 36 ip 00007f0e91926783 sp 00007ffcc871c810 error 6 in libntirpc.so.1.4.0[7f0e91913000+3f000]
[592900.525640] ganesha.nfsd[6192]: segfault at 36 ip 00007f723fbee783 sp 00007ffce0a17150 error 6 in libntirpc.so.1.4.0[7f723fbdb000+3f000]
[592894.520603] ganesha.nfsd[30173]: segfault at 36 ip 00007fd4302fc783 sp 00007ffeae644a70 error 6 in libntirpc.so.1.4.0[7fd4302e9000+3f000]
[582583.186948] ganesha.nfsd[31148]: segfault at 36 ip 00007fe01f583783 sp 00007fff8e71f350 error 6 in libntirpc.so.1.4.0[7fe01f570000+3f000]

pcs status:

[root@dhcp43-116 ~]# pcs status
Cluster name: G1471855029.47
Last updated: Wed Aug 24 17:21:11 2016
Last change: Wed Aug 24 17:16:28 2016 by root via cibadmin on dhcp43-116.lab.eng.blr.redhat.com
Stack: corosync
Current DC: dhcp43-88.lab.eng.blr.redhat.com (version 1.1.13-10.el7_2.4-44eb2dd) - partition with quorum
4 nodes and 16 resources configured

Online: [ dhcp42-237.lab.eng.blr.redhat.com dhcp42-47.lab.eng.blr.redhat.com dhcp43-116.lab.eng.blr.redhat.com dhcp43-88.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp42-237.lab.eng.blr.redhat.com dhcp42-47.lab.eng.blr.redhat.com dhcp43-116.lab.eng.blr.redhat.com dhcp43-88.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp42-237.lab.eng.blr.redhat.com dhcp42-47.lab.eng.blr.redhat.com dhcp43-116.lab.eng.blr.redhat.com dhcp43-88.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Stopped: [ dhcp42-237.lab.eng.blr.redhat.com dhcp42-47.lab.eng.blr.redhat.com dhcp43-116.lab.eng.blr.redhat.com dhcp43-88.lab.eng.blr.redhat.com ]
 dhcp43-116.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
 dhcp43-88.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
 dhcp42-47.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped
 dhcp42-237.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Stopped

Failed Actions:
* nfs-grace_monitor_0 on dhcp43-88.lab.eng.blr.redhat.com 'unknown error' (1): call=18, status=complete, exitreason='none',
    last-rc-change='Wed Aug 24 17:15:51 2016', queued=0ms, exec=336ms
* nfs-grace_monitor_0 on dhcp42-47.lab.eng.blr.redhat.com 'unknown error' (1): call=18, status=complete, exitreason='none',
    last-rc-change='Wed Aug 24 17:15:51 2016', queued=0ms, exec=354ms
* nfs-grace_monitor_0 on dhcp43-116.lab.eng.blr.redhat.com 'unknown error' (1): call=18, status=complete, exitreason='none',
    last-rc-change='Wed Aug 24 17:15:51 2016', queued=0ms, exec=322ms
* nfs-grace_monitor_0 on dhcp42-237.lab.eng.blr.redhat.com 'unknown error' (1): call=18, status=complete, exitreason='none',
    last-rc-change='Wed Aug 24 17:15:51 2016', queued=0ms, exec=365ms

PCSD Status:
  dhcp43-116.lab.eng.blr.redhat.com: Online
  dhcp43-88.lab.eng.blr.redhat.com: Online
  dhcp42-47.lab.eng.blr.redhat.com: Online
  dhcp42-237.lab.eng.blr.redhat.com: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/disabled
[root@dhcp43-116 ~]#

Actual results:
Ganesha process crashes on all the nodes with segfault error with latest libntirpc packages

Expected results:
There should not be any crashes

Additional info:
No core generated. sosreport and ganesha logs will be attached
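Since no core was generated, the kernel segfault lines are the only crash data in this report. A minimal sketch of how they could be reduced to an offset inside libntirpc.so.1.4.0 follows: it subtracts the mapping base (the value in square brackets) from the instruction pointer. The helper name parse_segfault and the regular expression are illustrative assumptions, not part of any nfs-ganesha or libntirpc tooling, and the sketch assumes the usual "segfault at ... ip ... sp ... error ... in lib[base+size]" layout shown above.

```python
import re

# Hypothetical helper: matches the kernel "segfault at" lines quoted in this report.
SEGFAULT_RE = re.compile(
    r"(?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<lib>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def parse_segfault(line):
    """Return pid, library, faulting address and in-library offset, or None."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "pid": int(m.group("pid")),
        "lib": m.group("lib"),
        "fault_addr": int(m.group("addr"), 16),
        # Offset of the faulting instruction from the start of the mapping.
        "offset": ip - base,
    }


# The four lines reported on the four nodes.
LINES = [
    "[592907.834912] ganesha.nfsd[9943]: segfault at 36 ip 00007f0e91926783 sp 00007ffcc871c810 error 6 in libntirpc.so.1.4.0[7f0e91913000+3f000]",
    "[592900.525640] ganesha.nfsd[6192]: segfault at 36 ip 00007f723fbee783 sp 00007ffce0a17150 error 6 in libntirpc.so.1.4.0[7f723fbdb000+3f000]",
    "[592894.520603] ganesha.nfsd[30173]: segfault at 36 ip 00007fd4302fc783 sp 00007ffeae644a70 error 6 in libntirpc.so.1.4.0[7fd4302e9000+3f000]",
    "[582583.186948] ganesha.nfsd[31148]: segfault at 36 ip 00007fe01f583783 sp 00007fff8e71f350 error 6 in libntirpc.so.1.4.0[7fe01f570000+3f000]",
]

for line in LINES:
    info = parse_segfault(line)
    print(info["pid"], info["lib"], hex(info["offset"]))
```

All four lines reduce to the same offset, 0x13783, with the same faulting address 0x36, which is consistent with every node failing at the same instruction in libntirpc. With matching debuginfo installed, that offset could in principle be passed to a symbolizer such as addr2line to locate the source line; whether it maps directly depends on how the library segment is laid out, so treat the result as a starting point.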
sosreport and logs can be found under http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1369674
This issue is not seen with the latest nfs-ganesha and libntirpc packages:

[root@dhcp43-116 ~]# rpm -qa|grep ganesha
nfs-ganesha-gluster-next.20160827.7641daf-1.el7.centos.x86_64
glusterfs-ganesha-3.8.3-0.6.git7956718.el7.centos.x86_64
nfs-ganesha-debuginfo-next.20160827.7641daf-1.el7.centos.x86_64
nfs-ganesha-next.20160827.7641daf-1.el7.centos.x86_64

[root@dhcp43-116 ~]# rpm -qa|grep libntirpc
libntirpc-duplex13.20160825.d375195-1.el7.centos.x86_64

Can be closed. Will reopen if I hit it again.
Thanks Shashank. Closing this bug.