Description of problem:

While enabling ganesha on 8 nodes, the "gluster nfs-ganesha enable" command sometimes reports failure even though the ganesha cluster is up and running in the backend. "Unlocking failed" messages are observed in glusterd.log.

This issue is intermittent, but I have encountered it 3-4 times.

# time gluster nfs-ganesha enable
Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted pool. Do you still want to continue?
 (y/n) y
This will take a few minutes to complete. Please wait ..
nfs-ganesha: failed

real    3m56.254s
user    0m0.105s
sys     0m0.180s

Glusterd logs
----------------
[2018-05-07 09:42:50.357894] I [MSGID: 106474] [glusterd-ganesha.c:433:check_host_list] 0-management: ganesha host found Hostname is dhcp47-193.lab.eng.blr.redhat.com
[2018-05-07 09:45:20.231419] I [glusterd-locks.c:730:gd_mgmt_v3_unlock_timer_cbk] 0-management: In gd_mgmt_v3_unlock_timer_cbk
[2018-05-07 09:46:14.626849] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on dhcp37-121.lab.eng.blr.redhat.com. Please check log file for details.
[2018-05-07 09:46:14.627387] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on dhcp37-103.lab.eng.blr.redhat.com. Please check log file for details.
[2018-05-07 09:46:14.627483] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on dhcp37-218.lab.eng.blr.redhat.com. Please check log file for details.
[2018-05-07 09:46:14.627583] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on dhcp37-136.lab.eng.blr.redhat.com. Please check log file for details.
[2018-05-07 09:46:14.627644] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on dhcp46-116.lab.eng.blr.redhat.com. Please check log file for details.
[2018-05-07 09:46:14.627703] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on dhcp46-184.lab.eng.blr.redhat.com. Please check log file for details.
[2018-05-07 09:46:14.627779] E [MSGID: 106116] [glusterd-mgmt.c:124:gd_mgmt_v3_collate_errors] 0-management: Unlocking failed on dhcp47-2.lab.eng.blr.redhat.com. Please check log file for details.
[2018-05-07 09:46:14.627952] E [MSGID: 106152] [glusterd-syncop.c:1641:gd_unlock_op_phase] 0-management: Failed to unlock on some peer(s)
[2018-05-07 09:46:14.628158] W [glusterd-locks.c:845:glusterd_mgmt_v3_unlock] (-->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe1379) [0x7fa08bdaa379] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe09ca) [0x7fa08bda99ca] -->/usr/lib64/glusterfs/3.12.2/xlator/mgmt/glusterd.so(+0xe8935) [0x7fa08bdb1935] ) 0-management: Lock for global All not held
[2018-05-07 09:46:14.628197] E [MSGID: 106118] [glusterd-syncop.c:1667:gd_unlock_op_phase] 0-management: Unable to release lock for All
------------------

Version-Release number of selected component (if applicable):

# rpm -qa | grep ganesha
glusterfs-ganesha-3.12.2-8.el7rhgs.x86_64
nfs-ganesha-gluster-2.5.5-6.el7rhgs.x86_64
nfs-ganesha-2.5.5-6.el7rhgs.x86_64

How reproducible:

Intermittent

Steps to Reproduce:

1. Create an 8-node ganesha cluster

Actual results:

The ganesha enable command fails with "Unlocking failed" messages in glusterd.log, but when checked in the backend, the cluster is up and running.

--------------------
# gluster nfs-ganesha disable
Disabling NFS-Ganesha will tear down the entire ganesha cluster across the trusted pool. Do you still want to continue?
 (y/n) y
This will take a few minutes to complete. Please wait ..
nfs-ganesha : success

[root@dhcp47-193 ganesha]# time gluster nfs-ganesha enable
Enabling NFS-Ganesha requires Gluster-NFS to be disabled across the trusted pool. Do you still want to continue?
 (y/n) y
This will take a few minutes to complete. Please wait ..
nfs-ganesha: failed

real    3m56.254s
user    0m0.105s
sys     0m0.180s

# pcs status
Cluster name: ganesha-ha
Stack: corosync
Current DC: dhcp47-193.lab.eng.blr.redhat.com (version 1.1.18-11.el7-2b07d5c5a9) - partition with quorum
Last updated: Mon May 7 15:16:31 2018
Last change: Mon May 7 15:15:57 2018 by root via cibadmin on dhcp47-193.lab.eng.blr.redhat.com

8 nodes configured
48 resources configured

Online: [ dhcp37-103.lab.eng.blr.redhat.com dhcp37-121.lab.eng.blr.redhat.com dhcp37-136.lab.eng.blr.redhat.com dhcp37-218.lab.eng.blr.redhat.com dhcp46-116.lab.eng.blr.redhat.com dhcp46-184.lab.eng.blr.redhat.com dhcp47-193.lab.eng.blr.redhat.com dhcp47-2.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp37-103.lab.eng.blr.redhat.com dhcp37-121.lab.eng.blr.redhat.com dhcp37-136.lab.eng.blr.redhat.com dhcp37-218.lab.eng.blr.redhat.com dhcp46-116.lab.eng.blr.redhat.com dhcp46-184.lab.eng.blr.redhat.com dhcp47-193.lab.eng.blr.redhat.com dhcp47-2.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp37-103.lab.eng.blr.redhat.com dhcp37-121.lab.eng.blr.redhat.com dhcp37-136.lab.eng.blr.redhat.com dhcp37-218.lab.eng.blr.redhat.com dhcp46-116.lab.eng.blr.redhat.com dhcp46-184.lab.eng.blr.redhat.com dhcp47-193.lab.eng.blr.redhat.com dhcp47-2.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp37-103.lab.eng.blr.redhat.com dhcp37-121.lab.eng.blr.redhat.com dhcp37-136.lab.eng.blr.redhat.com dhcp37-218.lab.eng.blr.redhat.com dhcp46-116.lab.eng.blr.redhat.com dhcp46-184.lab.eng.blr.redhat.com dhcp47-193.lab.eng.blr.redhat.com dhcp47-2.lab.eng.blr.redhat.com ]
 Resource Group: dhcp37-121.lab.eng.blr.redhat.com-group
     dhcp37-121.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp37-121.lab.eng.blr.redhat.com
     dhcp37-121.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp37-121.lab.eng.blr.redhat.com
     dhcp37-121.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp37-121.lab.eng.blr.redhat.com
 Resource Group: dhcp37-103.lab.eng.blr.redhat.com-group
     dhcp37-103.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp37-103.lab.eng.blr.redhat.com
     dhcp37-103.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp37-103.lab.eng.blr.redhat.com
     dhcp37-103.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp37-103.lab.eng.blr.redhat.com
 Resource Group: dhcp37-218.lab.eng.blr.redhat.com-group
     dhcp37-218.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp37-218.lab.eng.blr.redhat.com
     dhcp37-218.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp37-218.lab.eng.blr.redhat.com
     dhcp37-218.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp37-218.lab.eng.blr.redhat.com
 Resource Group: dhcp37-136.lab.eng.blr.redhat.com-group
     dhcp37-136.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp37-136.lab.eng.blr.redhat.com
     dhcp37-136.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp37-136.lab.eng.blr.redhat.com
     dhcp37-136.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp37-136.lab.eng.blr.redhat.com
 Resource Group: dhcp47-193.lab.eng.blr.redhat.com-group
     dhcp47-193.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp47-193.lab.eng.blr.redhat.com
     dhcp47-193.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp47-193.lab.eng.blr.redhat.com
     dhcp47-193.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp47-193.lab.eng.blr.redhat.com
 Resource Group: dhcp46-116.lab.eng.blr.redhat.com-group
     dhcp46-116.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp46-116.lab.eng.blr.redhat.com
     dhcp46-116.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-116.lab.eng.blr.redhat.com
     dhcp46-116.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp46-116.lab.eng.blr.redhat.com
 Resource Group: dhcp46-184.lab.eng.blr.redhat.com-group
     dhcp46-184.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp46-184.lab.eng.blr.redhat.com
     dhcp46-184.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp46-184.lab.eng.blr.redhat.com
     dhcp46-184.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp46-184.lab.eng.blr.redhat.com
 Resource Group: dhcp47-2.lab.eng.blr.redhat.com-group
     dhcp47-2.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp47-2.lab.eng.blr.redhat.com
     dhcp47-2.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp47-2.lab.eng.blr.redhat.com
     dhcp47-2.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp47-2.lab.eng.blr.redhat.com

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Expected results:

The ganesha enable command should report the correct result.

Additional info:

Raising it against component "ganesha". Change the component if required. Attaching sosreport shortly.
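
To tally which peers reported unlock failures, a small script along these lines could be run against glusterd.log. This is only a minimal sketch and not part of the original report: the function name `peers_with_unlock_failures` and the regular expression are assumptions based solely on the "Unlocking failed on <host>." message format in the log excerpt above, and the log file path is taken from the command line rather than assumed.

```python
import re
import sys
from collections import Counter

# Assumed pattern: matches only the "Unlocking failed on <host>." wording
# seen in the glusterd.log excerpt in this report.
UNLOCK_FAILED = re.compile(r"Unlocking failed on (?P<host>\S+?)\.(?:\s|$)")


def peers_with_unlock_failures(lines):
    """Count 'Unlocking failed' messages per peer hostname."""
    counts = Counter()
    for line in lines:
        match = UNLOCK_FAILED.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts


if __name__ == "__main__":
    # Usage: python unlock_failures.py <path-to-glusterd.log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for host, count in sorted(peers_with_unlock_failures(log).items()):
            print(f"{host}: {count}")
```

Running it on the node where the enable command was issued would list each peer named in an "Unlocking failed" message, which can then be compared against the nodes shown as Online in pcs status.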
Hi Manisha, I suspect this bug is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1568436. I raised that bug to fix the issue we discussed in the mail thread with subject "nfs-ganesha enable issue". Please let me know whether it is the same or something different.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:2607
This is the ganesha setup use case, which is covered as part of each ganesha test case. Hence setting the qe_test_coverage flag to + with no Testcase ID.