Description of problem:
Adding a node to an existing 4-node ganesha cluster fails.

Version-Release number of selected component (if applicable):
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.9 Beta (Santiago)

glusterfs-ganesha-3.8.4-14.el6rhs.x86_64

How reproducible:
Consistently

Steps to Reproduce:
1. Create a 4-node ganesha cluster.
2. Perform the prerequisites for adding a node to an existing cluster.
3. Run add node from one of the nodes in the existing cluster:
# /usr/libexec/ganesha/ganesha-ha.sh --add /var/run/gluster/shared_storage/nfs-ganesha/ dhcp42-191.lab.eng.blr.redhat.com 10.70.42.135

PCS status on the 5th node:
# pcs status
Cluster name: ganesha-ha-360
WARNING: no stonith devices and stonith-enabled is not false
Stack: cman
Current DC: dhcp42-191.lab.eng.blr.redhat.com (version 1.1.15-5.el6-e174ec8) - partition WITHOUT quorum
Last updated: Wed Feb 22 20:18:59 2017
Last change: Wed Feb 22 20:13:27 2017 by root via crmd on dhcp42-191.lab.eng.blr.redhat.com

5 nodes and 0 resources configured

Node dhcp42-237.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-151.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-171.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-235.lab.eng.blr.redhat.com: UNCLEAN (offline)
Online: [ dhcp42-191.lab.eng.blr.redhat.com ]

No resources

Daemon Status:
  cman: active/disabled
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled

Actual results:
Add node is not successful.

Expected results:
Add node should be successful.

Additional info:
While running add node, a warning is displayed that a cluster restart is required to apply the node addition.
============================
# /usr/libexec/ganesha/ganesha-ha.sh --add /var/run/gluster/shared_storage/nfs-ganesha/ dhcp42-191.lab.eng.blr.redhat.com 10.70.42.135
Starting ganesha.nfsd: [ OK ]
Disabling SBD service...
dhcp42-191.lab.eng.blr.redhat.com: sbd disabled
dhcp42-237.lab.eng.blr.redhat.com: Corosync updated
dhcp43-151.lab.eng.blr.redhat.com: Corosync updated
dhcp43-235.lab.eng.blr.redhat.com: Corosync updated
dhcp43-171.lab.eng.blr.redhat.com: Corosync updated
Setting up corosync...
dhcp42-191.lab.eng.blr.redhat.com: Updated cluster.conf...
dhcp42-191.lab.eng.blr.redhat.com: Starting Cluster...
Synchronizing pcsd certificates on nodes dhcp42-191.lab.eng.blr.redhat.com...
dhcp42-191.lab.eng.blr.redhat.com: Success

Restarting pcsd on the nodes in order to reload the certificates...
dhcp42-191.lab.eng.blr.redhat.com: Success
Warning: Using udpu transport on a RHEL 6 cluster, cluster restart is required to apply node addition
dhcp42-191.lab.eng.blr.redhat.com: Starting Cluster...
Removing group: dhcp42-237.lab.eng.blr.redhat.com-group (and all resources within group)
Stopping all resources in group: dhcp42-237.lab.eng.blr.redhat.com-group...
==========================

Running "pcs cluster stop --all" followed by "pcs cluster start --all" makes the 5th node report the correct status.
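
The broken state in the status output above shows up in two places: the "partition WITHOUT quorum" marker on the Current DC line and the per-node "UNCLEAN (offline)" entries. A minimal Python sketch of checking for both, assuming the `pcs status` output has been captured as text. `summarize_pcs_status` is a hypothetical helper written for this illustration; it is not part of pcs or ganesha-ha.sh, and the sample is trimmed from the output in this report.

```python
import re


def summarize_pcs_status(text):
    """Pull quorum state, UNCLEAN nodes and online nodes out of `pcs status` text."""
    # Case-sensitive on purpose: "partition WITHOUT quorum" must not match.
    has_quorum = "partition with quorum" in text
    unclean = re.findall(r"Node (\S+): UNCLEAN", text)
    online_match = re.search(r"Online: \[([^\]]*)\]", text)
    online = online_match.group(1).split() if online_match else []
    return {"quorum": has_quorum, "unclean": unclean, "online": online}


# Trimmed copy of the status seen on the 5th node in this report.
SAMPLE = """\
Current DC: dhcp42-191.lab.eng.blr.redhat.com (version 1.1.15-5.el6-e174ec8) - partition WITHOUT quorum
5 nodes and 0 resources configured
Node dhcp42-237.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-151.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-171.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-235.lab.eng.blr.redhat.com: UNCLEAN (offline)
Online: [ dhcp42-191.lab.eng.blr.redhat.com ]
"""

summary = summarize_pcs_status(SAMPLE)
print("quorum:", summary["quorum"])
print("unclean nodes:", len(summary["unclean"]))
print("online nodes:", summary["online"])
```

For the sample, the sketch reports no quorum, four UNCLEAN nodes, and only the newly added node online, which matches the failure described here.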
Upstream 3.10 patch: https://review.gluster.org/16721 (Note: the mainline source no longer has ganesha-ha.sh, as it has moved to the storhaug project.)
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/98581/
Verified this bug on glusterfs-ganesha-3.8.4-16.el6rhs.x86_64.

The node is added successfully to the existing ganesha cluster, with correct pcs status.

Before adding the node to the cluster:

[root@dhcp42-191 ~]# pcs status
Cluster name: ganesha-ha-360
Stack: cman
Current DC: dhcp43-235.lab.eng.blr.redhat.com (version 1.1.15-5.el6-e174ec8) - partition with quorum
Last updated: Fri Mar 3 05:30:53 2017
Last change: Fri Mar 3 05:27:46 2017 by root via crm_node on dhcp42-191.lab.eng.blr.redhat.com

4 nodes and 24 resources configured

Online: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Resource Group: dhcp42-191.lab.eng.blr.redhat.com-group
     dhcp42-191.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp42-191.lab.eng.blr.redhat.com
     dhcp42-191.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp42-191.lab.eng.blr.redhat.com
     dhcp42-191.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp42-191.lab.eng.blr.redhat.com
 Resource Group: dhcp42-237.lab.eng.blr.redhat.com-group
     dhcp42-237.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp42-237.lab.eng.blr.redhat.com
     dhcp42-237.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp42-237.lab.eng.blr.redhat.com
     dhcp42-237.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp42-237.lab.eng.blr.redhat.com
 Resource Group: dhcp43-151.lab.eng.blr.redhat.com-group
     dhcp43-151.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp43-151.lab.eng.blr.redhat.com
     dhcp43-151.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp43-151.lab.eng.blr.redhat.com
     dhcp43-151.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp43-151.lab.eng.blr.redhat.com
 Resource Group: dhcp43-235.lab.eng.blr.redhat.com-group
     dhcp43-235.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp43-235.lab.eng.blr.redhat.com
     dhcp43-235.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp43-235.lab.eng.blr.redhat.com
     dhcp43-235.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp43-235.lab.eng.blr.redhat.com

Daemon Status:
  cman: active/disabled
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

After adding the node to the cluster (new node: dhcp43-171.lab.eng.blr.redhat.com):

[root@dhcp43-171 ganesha]# pcs status
Cluster name: ganesha-ha-360
Stack: cman
Current DC: dhcp42-191.lab.eng.blr.redhat.com (version 1.1.15-5.el6-e174ec8) - partition with quorum
Last updated: Fri Mar 3 05:34:24 2017
Last change: Fri Mar 3 05:34:08 2017 by root via cibadmin on dhcp42-191.lab.eng.blr.redhat.com

5 nodes and 30 resources configured

Online: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-171.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-171.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-171.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-171.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Resource Group: dhcp42-191.lab.eng.blr.redhat.com-group
     dhcp42-191.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp42-191.lab.eng.blr.redhat.com
     dhcp42-191.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp42-191.lab.eng.blr.redhat.com
     dhcp42-191.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp42-191.lab.eng.blr.redhat.com
 Resource Group: dhcp42-237.lab.eng.blr.redhat.com-group
     dhcp42-237.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp42-237.lab.eng.blr.redhat.com
     dhcp42-237.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp42-237.lab.eng.blr.redhat.com
     dhcp42-237.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp42-237.lab.eng.blr.redhat.com
 Resource Group: dhcp43-151.lab.eng.blr.redhat.com-group
     dhcp43-151.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp43-151.lab.eng.blr.redhat.com
     dhcp43-151.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp43-151.lab.eng.blr.redhat.com
     dhcp43-151.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp43-151.lab.eng.blr.redhat.com
 Resource Group: dhcp43-235.lab.eng.blr.redhat.com-group
     dhcp43-235.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp43-235.lab.eng.blr.redhat.com
     dhcp43-235.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp43-235.lab.eng.blr.redhat.com
     dhcp43-235.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp43-235.lab.eng.blr.redhat.com
 Resource Group: dhcp43-171.lab.eng.blr.redhat.com-group
     dhcp43-171.lab.eng.blr.redhat.com-nfs_block (ocf::heartbeat:portblock): Started dhcp43-171.lab.eng.blr.redhat.com
     dhcp43-171.lab.eng.blr.redhat.com-cluster_ip-1 (ocf::heartbeat:IPaddr): Started dhcp43-171.lab.eng.blr.redhat.com
     dhcp43-171.lab.eng.blr.redhat.com-nfs_unblock (ocf::heartbeat:portblock): Started dhcp43-171.lab.eng.blr.redhat.com

Failed Actions:
* dhcp43-171.lab.eng.blr.redhat.com-nfs_block_monitor_10000 on dhcp43-171.lab.eng.blr.redhat.com 'not running' (7): call=77, status=complete, exitreason='none',
    last-rc-change='Fri Mar 3 05:34:21 2017', queued=0ms, exec=0ms

Daemon Status:
  cman: active/disabled
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled

As the issue is no longer observed with this build, moving this bug to verified.
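
The two status dumps above report "4 nodes and 24 resources configured" before the addition and "5 nodes and 30 resources configured" after it. That is six resources for the new node: one instance of each of the three clone sets plus the three resources in its group. A small Python sketch of that comparison, assuming the summary line keeps the wording shown in this report. `configured_counts` is a hypothetical helper for illustration, not a pcs interface.

```python
import re


def configured_counts(text):
    """Return (nodes, resources) from the 'N nodes and M resources configured' line."""
    match = re.search(r"(\d+) nodes and (\d+) resources configured", text)
    if match is None:
        raise ValueError("no 'N nodes and M resources configured' line found")
    return int(match.group(1)), int(match.group(2))


# Summary lines copied from the before/after output in this comment.
before = configured_counts("4 nodes and 24 resources configured")
after = configured_counts("5 nodes and 30 resources configured")

added_nodes = after[0] - before[0]
added_resources = after[1] - before[1]
print("nodes added:", added_nodes)
print("resources added:", added_resources)
```

The same ratio (six resources per node) holds on both sides of the change, which is a quick sanity check that the new node received its full set of resources.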
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0484.html