Bug 1425748 - [GANESHA] Adding a node to existing ganesha cluster is failing on rhel 6.9
Summary: [GANESHA] Adding a node to existing ganesha cluster is failing on rhel 6.9
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: common-ha
Version: rhgs-3.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.2.0
Assignee: Jiffin
QA Contact: Manisha Saini
URL:
Whiteboard:
Depends On:
Blocks: 1351528 1425919
Reported: 2017-02-22 10:07 UTC by Manisha Saini
Modified: 2017-03-23 05:11 UTC

Fixed In Version: glusterfs-3.8.4-16
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1425919
Environment:
Last Closed: 2017-03-23 05:11:36 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2017:0484 | 0 | normal | SHIPPED_LIVE | Moderate: Red Hat Gluster Storage 3.2.0 security, bug fix, and enhancement update | 2017-03-23 09:06:37 UTC

Description Manisha Saini 2017-02-22 10:07:18 UTC
Description of problem:
Adding a node to an existing 4-node ganesha cluster fails.

Version-Release number of selected component (if applicable):
# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 6.9 Beta (Santiago)

glusterfs-ganesha-3.8.4-14.el6rhs.x86_64


How reproducible:
Consistently

Steps to Reproduce:
1. Create a 4-node ganesha cluster.
2. Perform the prerequisites for adding a node to an existing cluster.
3. Run add-node from one of the nodes in the existing cluster:

# /usr/libexec/ganesha/ganesha-ha.sh --add /var/run/gluster/shared_storage/nfs-ganesha/ dhcp42-191.lab.eng.blr.redhat.com 10.70.42.135

pcs status on the 5th node:

# pcs status
Cluster name: ganesha-ha-360
WARNING: no stonith devices and stonith-enabled is not false
Stack: cman
Current DC: dhcp42-191.lab.eng.blr.redhat.com (version 1.1.15-5.el6-e174ec8) - partition WITHOUT quorum
Last updated: Wed Feb 22 20:18:59 2017		Last change: Wed Feb 22 20:13:27 2017 by root via crmd on dhcp42-191.lab.eng.blr.redhat.com

5 nodes and 0 resources configured

Node dhcp42-237.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-151.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-171.lab.eng.blr.redhat.com: UNCLEAN (offline)
Node dhcp43-235.lab.eng.blr.redhat.com: UNCLEAN (offline)
Online: [ dhcp42-191.lab.eng.blr.redhat.com ]

No resources


Daemon Status:
  cman: active/disabled
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled
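
The failure signature in the output above (every pre-existing node listed as UNCLEAN (offline), and the partition reported WITHOUT quorum) can be checked mechanically. A minimal sketch, assuming the `pcs status` text format quoted in this report; `unclean_nodes` and `has_quorum` are hypothetical helper names, not part of pcs or ganesha-ha.sh:

```shell
#!/bin/bash
# Hypothetical helpers. They only parse text in the `pcs status` format
# quoted in this report, read from stdin; they do not call pcs themselves.

# Print the name of every node reported as "UNCLEAN (offline)".
unclean_nodes() {
    awk '/^Node .*: UNCLEAN \(offline\)/ { sub(/:$/, "", $2); print $2 }'
}

# Succeed only if the input reports "partition with quorum".
# The match is case-sensitive, so "partition WITHOUT quorum" fails.
has_quorum() {
    grep -q 'partition with quorum'
}

# Usage (on a cluster node):
#   pcs status | unclean_nodes
#   pcs status | has_quorum || echo "no quorum"
```

Run against the status shown above, `unclean_nodes` would list the four pre-existing nodes and `has_quorum` would fail.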

Actual results:
Add node is not successful.

Expected results:
Add node should be successful.

Additional info:

While running add node, a warning is displayed saying that a cluster restart is required to apply the node addition:

============================
# /usr/libexec/ganesha/ganesha-ha.sh --add /var/run/gluster/shared_storage/nfs-ganesha/ dhcp42-191.lab.eng.blr.redhat.com 10.70.42.135
Starting ganesha.nfsd: [  OK  ]
Disabling SBD service...
dhcp42-191.lab.eng.blr.redhat.com: sbd disabled
dhcp42-237.lab.eng.blr.redhat.com: Corosync updated
dhcp43-151.lab.eng.blr.redhat.com: Corosync updated
dhcp43-235.lab.eng.blr.redhat.com: Corosync updated
dhcp43-171.lab.eng.blr.redhat.com: Corosync updated
Setting up corosync...
dhcp42-191.lab.eng.blr.redhat.com: Updated cluster.conf...
dhcp42-191.lab.eng.blr.redhat.com: Starting Cluster...
Synchronizing pcsd certificates on nodes dhcp42-191.lab.eng.blr.redhat.com...
dhcp42-191.lab.eng.blr.redhat.com: Success

Restarting pcsd on the nodes in order to reload the certificates...
dhcp42-191.lab.eng.blr.redhat.com: Success
Warning: Using udpu transport on a RHEL 6 cluster, cluster restart is required to apply node addition
dhcp42-191.lab.eng.blr.redhat.com: Starting Cluster...
Removing group: dhcp42-237.lab.eng.blr.redhat.com-group (and all resources within group)
Stopping all resources in group: dhcp42-237.lab.eng.blr.redhat.com-group...


==========================


Running pcs cluster stop --all followed by pcs cluster start --all makes the 5th node show the correct status.
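
The workaround above amounts to a full cluster restart, in line with the warning pcs prints for udpu transport on RHEL 6. A minimal sketch of that sequence follows. `restart_cluster`, `run`, and the `DRY_RUN` variable are hypothetical names introduced here; the pcs invocations are the ones used in this report. Stopping the whole cluster takes its resources down while it restarts, so this illustrates the workaround rather than a recommended procedure:

```shell
#!/bin/bash
# Hypothetical wrapper around the workaround described in this report.
# With DRY_RUN=1 the commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

restart_cluster() {
    # Stop cluster services on all nodes, start them again, then show
    # the resulting status. Each step runs only if the previous one
    # succeeded.
    run pcs cluster stop --all &&
    run pcs cluster start --all &&
    run pcs status
}

# Usage: DRY_RUN=1 restart_cluster   # preview the commands only
#        restart_cluster             # run from one node of the cluster
```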

Comment 4 Atin Mukherjee 2017-02-23 03:54:20 UTC
Upstream 3.10 patch: https://review.gluster.org/16721

(Note: the mainline source doesn't have ganesha-ha.sh, as it has now moved to the storhaug project.)

Comment 6 Atin Mukherjee 2017-02-24 12:10:16 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/98581/

Comment 8 Manisha Saini 2017-03-02 18:36:20 UTC
Verified this bug on glusterfs-ganesha-3.8.4-16.el6rhs.x86_64.
The node is added successfully to the existing ganesha cluster, with correct pcs status.

Before adding the node to the cluster:

[root@dhcp42-191 ~]# pcs status
Cluster name: ganesha-ha-360
Stack: cman
Current DC: dhcp43-235.lab.eng.blr.redhat.com (version 1.1.15-5.el6-e174ec8) - partition with quorum
Last updated: Fri Mar  3 05:30:53 2017		Last change: Fri Mar  3 05:27:46 2017 by root via crm_node on dhcp42-191.lab.eng.blr.redhat.com

4 nodes and 24 resources configured

Online: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Resource Group: dhcp42-191.lab.eng.blr.redhat.com-group
     dhcp42-191.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp42-191.lab.eng.blr.redhat.com
     dhcp42-191.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp42-191.lab.eng.blr.redhat.com
     dhcp42-191.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp42-191.lab.eng.blr.redhat.com
 Resource Group: dhcp42-237.lab.eng.blr.redhat.com-group
     dhcp42-237.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp42-237.lab.eng.blr.redhat.com
     dhcp42-237.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp42-237.lab.eng.blr.redhat.com
     dhcp42-237.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp42-237.lab.eng.blr.redhat.com
 Resource Group: dhcp43-151.lab.eng.blr.redhat.com-group
     dhcp43-151.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp43-151.lab.eng.blr.redhat.com
     dhcp43-151.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp43-151.lab.eng.blr.redhat.com
     dhcp43-151.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp43-151.lab.eng.blr.redhat.com
 Resource Group: dhcp43-235.lab.eng.blr.redhat.com-group
     dhcp43-235.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp43-235.lab.eng.blr.redhat.com
     dhcp43-235.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp43-235.lab.eng.blr.redhat.com
     dhcp43-235.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp43-235.lab.eng.blr.redhat.com

Daemon Status:
  cman: active/disabled
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled


After adding the node to the cluster (new node: dhcp43-171.lab.eng.blr.redhat.com):

[root@dhcp43-171 ganesha]# pcs status
Cluster name: ganesha-ha-360
Stack: cman
Current DC: dhcp42-191.lab.eng.blr.redhat.com (version 1.1.15-5.el6-e174ec8) - partition with quorum
Last updated: Fri Mar  3 05:34:24 2017		Last change: Fri Mar  3 05:34:08 2017 by root via cibadmin on dhcp42-191.lab.eng.blr.redhat.com

5 nodes and 30 resources configured

Online: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-171.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-171.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-171.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp42-191.lab.eng.blr.redhat.com dhcp42-237.lab.eng.blr.redhat.com dhcp43-151.lab.eng.blr.redhat.com dhcp43-171.lab.eng.blr.redhat.com dhcp43-235.lab.eng.blr.redhat.com ]
 Resource Group: dhcp42-191.lab.eng.blr.redhat.com-group
     dhcp42-191.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp42-191.lab.eng.blr.redhat.com
     dhcp42-191.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp42-191.lab.eng.blr.redhat.com
     dhcp42-191.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp42-191.lab.eng.blr.redhat.com
 Resource Group: dhcp42-237.lab.eng.blr.redhat.com-group
     dhcp42-237.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp42-237.lab.eng.blr.redhat.com
     dhcp42-237.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp42-237.lab.eng.blr.redhat.com
     dhcp42-237.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp42-237.lab.eng.blr.redhat.com
 Resource Group: dhcp43-151.lab.eng.blr.redhat.com-group
     dhcp43-151.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp43-151.lab.eng.blr.redhat.com
     dhcp43-151.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp43-151.lab.eng.blr.redhat.com
     dhcp43-151.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp43-151.lab.eng.blr.redhat.com
 Resource Group: dhcp43-235.lab.eng.blr.redhat.com-group
     dhcp43-235.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp43-235.lab.eng.blr.redhat.com
     dhcp43-235.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp43-235.lab.eng.blr.redhat.com
     dhcp43-235.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp43-235.lab.eng.blr.redhat.com
 Resource Group: dhcp43-171.lab.eng.blr.redhat.com-group
     dhcp43-171.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp43-171.lab.eng.blr.redhat.com
     dhcp43-171.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp43-171.lab.eng.blr.redhat.com
     dhcp43-171.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp43-171.lab.eng.blr.redhat.com

Failed Actions:
* dhcp43-171.lab.eng.blr.redhat.com-nfs_block_monitor_10000 on dhcp43-171.lab.eng.blr.redhat.com 'not running' (7): call=77, status=complete, exitreason='none',
    last-rc-change='Fri Mar  3 05:34:21 2017', queued=0ms, exec=0ms


Daemon Status:
  cman: active/disabled
  corosync: active/disabled
  pacemaker: active/enabled
  pcsd: active/enabled
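
The resource counts in the two listings above (24 for 4 nodes, 30 for 5) are consistent with six resources per node: one instance of each of the three clone sets (nfs_setup, nfs-mon, nfs-grace) plus the three members of the node's own resource group (nfs_block, cluster_ip-1, nfs_unblock). A small sketch of that arithmetic; the per-node figure is inferred from the output in this report, not taken from any documented rule, and `expected_resources` is a hypothetical helper:

```shell
#!/bin/bash
# Hypothetical helper: expected resource count for a cluster of N nodes,
# assuming the layout seen in this report (3 clone sets, 3 group members).
expected_resources() {
    nodes=$1
    clone_sets=3
    group_members=3
    echo $(( nodes * (clone_sets + group_members) ))
}

expected_resources 4   # prints 24, matching "4 nodes and 24 resources configured"
expected_resources 5   # prints 30, matching "5 nodes and 30 resources configured"
```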


As the issue is no longer observed with this build, moving this bug to verified.

Comment 10 errata-xmlrpc 2017-03-23 05:11:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0484.html

