Bug 1406410

Summary: [GANESHA] Adding node to ganesha cluster is not assigning the correct VIP to the new node
Product: [Community] GlusterFS
Reporter: Soumya Koduri <skoduri>
Component: common-ha
Assignee: Soumya Koduri <skoduri>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: high
Docs Contact:
Priority: urgent
Version: mainline
CC: bugs, kkeithle, msaini, rhs-bugs, storage-qa-internal
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: All
OS: All
Whiteboard:
Fixed In Version: glusterfs-3.10.0
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1406401
Clones: 1408110 (view as bug list)
Environment:
Last Closed: 2017-03-06 17:40:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1406401, 1408110

Description Soumya Koduri 2016-12-20 13:08:40 UTC
+++ This bug was initially created as a clone of Bug #1406401 +++

Description of problem:
When a new node is added to the ganesha cluster, it should get the VIP specified in the add-node command. Instead, the new node is assigned the VIP of one of the nodes already in the existing cluster.

Version-Release number of selected component (if applicable):
# rpm -qa | grep ganesha
nfs-ganesha-2.4.1-3.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-9.el7rhgs.x86_64
nfs-ganesha-gluster-2.4.1-3.el7rhgs.x86_64

How reproducible:
Consistently

Steps to Reproduce:
1. Create a 4-node ganesha cluster and enable ganesha on it.
2. Add a new node to the existing ganesha cluster:
# /usr/libexec/ganesha/ganesha-ha.sh --add /var/run/gluster/shared_storage/nfs-ganesha/ dhcp47-59.lab.eng.blr.redhat.com 10.70.44.157

Node 1:
[root@dhcp46-219 ganesha]# ip addr         VIP 10.70.44.156
Node 2:
[root@dhcp47-45 ~]# ip addr                VIP 10.70.44.154
Node 3:
[root@dhcp47-3 nfs-ganesha]# ip addr       VIP 10.70.44.155
Node 4:
[root@dhcp46-241 ~]# ip addr               VIP 10.70.44.153

New node being added to the ganesha cluster:

[root@dhcp47-59 nfs-ganesha]# ip addr      VIP 10.70.44.154

======

[root@dhcp47-59 nfs-ganesha]# pcs status
Cluster name: ganesha-ha-360
Stack: corosync
Current DC: dhcp46-241.lab.eng.blr.redhat.com (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum
Last updated: Tue Dec 20 18:00:33 2016		Last change: Tue Dec 20 17:36:01 2016 by root via crm_attribute on dhcp47-59.lab.eng.blr.redhat.com

5 nodes and 30 resources configured

Online: [ dhcp46-219.lab.eng.blr.redhat.com dhcp46-241.lab.eng.blr.redhat.com dhcp47-3.lab.eng.blr.redhat.com dhcp47-45.lab.eng.blr.redhat.com dhcp47-59.lab.eng.blr.redhat.com ]

Full list of resources:

 Clone Set: nfs_setup-clone [nfs_setup]
     Started: [ dhcp46-219.lab.eng.blr.redhat.com dhcp46-241.lab.eng.blr.redhat.com dhcp47-3.lab.eng.blr.redhat.com dhcp47-45.lab.eng.blr.redhat.com dhcp47-59.lab.eng.blr.redhat.com ]
 Clone Set: nfs-mon-clone [nfs-mon]
     Started: [ dhcp46-219.lab.eng.blr.redhat.com dhcp46-241.lab.eng.blr.redhat.com dhcp47-3.lab.eng.blr.redhat.com dhcp47-45.lab.eng.blr.redhat.com dhcp47-59.lab.eng.blr.redhat.com ]
 Clone Set: nfs-grace-clone [nfs-grace]
     Started: [ dhcp46-219.lab.eng.blr.redhat.com dhcp46-241.lab.eng.blr.redhat.com dhcp47-3.lab.eng.blr.redhat.com dhcp47-45.lab.eng.blr.redhat.com dhcp47-59.lab.eng.blr.redhat.com ]
 Resource Group: dhcp46-219.lab.eng.blr.redhat.com-group
     dhcp46-219.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp46-219.lab.eng.blr.redhat.com
     dhcp46-219.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp46-219.lab.eng.blr.redhat.com
     dhcp46-219.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp46-219.lab.eng.blr.redhat.com
 Resource Group: dhcp46-241.lab.eng.blr.redhat.com-group
     dhcp46-241.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp46-241.lab.eng.blr.redhat.com
     dhcp46-241.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp46-241.lab.eng.blr.redhat.com
     dhcp46-241.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp46-241.lab.eng.blr.redhat.com
 Resource Group: dhcp47-3.lab.eng.blr.redhat.com-group
     dhcp47-3.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp47-3.lab.eng.blr.redhat.com
     dhcp47-3.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp47-3.lab.eng.blr.redhat.com
     dhcp47-3.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp47-3.lab.eng.blr.redhat.com
 Resource Group: dhcp47-45.lab.eng.blr.redhat.com-group
     dhcp47-45.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp47-45.lab.eng.blr.redhat.com
     dhcp47-45.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp47-45.lab.eng.blr.redhat.com
     dhcp47-45.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp47-45.lab.eng.blr.redhat.com
 Resource Group: dhcp47-59.lab.eng.blr.redhat.com-group
     dhcp47-59.lab.eng.blr.redhat.com-nfs_block	(ocf::heartbeat:portblock):	Started dhcp47-59.lab.eng.blr.redhat.com
     dhcp47-59.lab.eng.blr.redhat.com-cluster_ip-1	(ocf::heartbeat:IPaddr):	Started dhcp47-59.lab.eng.blr.redhat.com
     dhcp47-59.lab.eng.blr.redhat.com-nfs_unblock	(ocf::heartbeat:portblock):	Started dhcp47-59.lab.eng.blr.redhat.com

Failed Actions:
* dhcp47-45.lab.eng.blr.redhat.com-cluster_ip-1_monitor_15000 on dhcp47-45.lab.eng.blr.redhat.com 'not running' (7): call=123, status=complete, exitreason='none',
    last-rc-change='Tue Dec 20 17:36:01 2016', queued=0ms, exec=0ms
* dhcp46-241.lab.eng.blr.redhat.com-nfs_block_monitor_10000 on dhcp46-241.lab.eng.blr.redhat.com 'not running' (7): call=36, status=complete, exitreason='none',
    last-rc-change='Tue Dec 20 14:41:24 2016', queued=0ms, exec=0ms
* nfs-grace_monitor_5000 on dhcp47-59.lab.eng.blr.redhat.com 'not running' (7): call=69, status=complete, exitreason='none',
    last-rc-change='Tue Dec 20 17:35:56 2016', queued=0ms, exec=0ms


Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

======

# cat ganesha-ha.conf
# Provide a unique name for the cluster.
HA_NAME="ganesha-ha-360"
# The subset of nodes of the Gluster Trusted Storage Pool that forms the ganesha
# HA cluster. Hostname should specified, IP addresses are not allowed.
# Maximum number of 16 nodes are supported.
HA_CLUSTER_NODES="dhcp46-219.lab.eng.blr.redhat.com,dhcp46-241.lab.eng.blr.redhat.com,dhcp47-3.lab.eng.blr.redhat.com,dhcp47-45.lab.eng.blr.redhat.com,dhcp47-59.lab.eng.blr.redhat.com"
# Virtual IPs of each of the nodes specified above.
VIP_dhcp46-241.lab.eng.blr.redhat.com="10.70.44.153"
VIP_dhcp47-45.lab.eng.blr.redhat.com="10.70.44.154"
VIP_dhcp47-3.lab.eng.blr.redhat.com="10.70.44.155"
VIP_dhcp46-219.lab.eng.blr.redhat.com="10.70.44.156"
VIP_dhcp47-59.lab.eng.blr.redhat.com="10.70.44.157"
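
Note that the VIP_ lines above are not listed in the same order as the hosts in HA_CLUSTER_NODES, which is exactly the situation where an ordering-sensitive lookup misassigns a VIP. The sketch below is purely illustrative (it is not code from ganesha-ha.sh or from the fix): it contrasts a fragile positional lookup, which can hand the new node another node's VIP, with a hostname-keyed lookup, which cannot. The sample conf and helper names are hypothetical stand-ins.

```shell
#!/bin/sh
# Illustrative contrast only -- not code from ganesha-ha.sh or the patch.
# A sample conf (line order differs from the node order, as in this
# report) stands in for the real ganesha-ha.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
VIP_dhcp47-45.lab.eng.blr.redhat.com="10.70.44.154"
VIP_dhcp47-59.lab.eng.blr.redhat.com="10.70.44.157"
EOF
new_node=dhcp47-59.lab.eng.blr.redhat.com

# Fragile: assume the new node's VIP is simply the next conf line.
positional=$(sed -n '1p' "$conf" | cut -d '"' -f 2)

# Robust: key the lookup on the node's own hostname.
by_name=$(grep "^VIP_$new_node=" "$conf" | cut -d '"' -f 2)

echo "positional=$positional by_name=$by_name"
```

With the ordering above, the positional lookup yields 10.70.44.154 (another node's VIP, matching the reported symptom), while the name-keyed lookup yields the intended 10.70.44.157.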

Actual results:
The new node has the VIP 10.70.44.154, which is already assigned to the 2nd node in the existing ganesha cluster.

Expected results:
The new node should have the VIP 10.70.44.157, as specified in the add-node command.
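
A quick way to check the outcome after an add is to confirm that the expected VIP is plumbed on the new node's interfaces. The sketch below is self-contained: a canned snippet (modeled on this report, where the node instead carries 10.70.44.154) stands in for live `ip addr` output; on a real node you would pipe `ip addr` directly into the grep.

```shell
#!/bin/sh
# Illustrative check: the VIP passed to --add should appear in the
# new node's `ip addr` output. A canned snippet replaces live output
# here so the check is self-contained.
expected_vip=10.70.44.157
ip_output='inet 10.70.47.59/22 brd 10.70.47.255 scope global eth0
inet 10.70.44.154/32 brd 10.70.44.154 scope global eth0'

if printf '%s\n' "$ip_output" | grep -q "inet $expected_vip/"; then
    echo "VIP $expected_vip present"
else
    echo "VIP $expected_vip missing"
fi
```

With the canned output above, this prints "VIP 10.70.44.157 missing", matching the reported misassignment.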

Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2016-12-20 07:56:50 EST ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.2.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

Comment 1 Worker Ant 2016-12-20 13:09:49 UTC
REVIEW: http://review.gluster.org/16213 (common-ha: Correct the VIP assigned to the new node added) posted (#2) for review on master by soumya k (skoduri)

Comment 2 Worker Ant 2016-12-22 07:29:21 UTC
COMMIT: http://review.gluster.org/16213 committed in master by Atin Mukherjee (amukherj) 
------
commit de576c08ef17706d25efecff7b57cc8c0294cf6f
Author: Soumya Koduri <skoduri>
Date:   Tue Dec 20 18:22:02 2016 +0530

    common-ha: Correct the VIP assigned to the new node added
    
    There is a regression introduced with patch#16115. An incorrect
    VIP gets assigned to the new node being added to the cluster.
    This patch fixes the same.
    
    Change-Id: I468c7d16bf7e4efa04692db83b1c5ee58fbb7d5f
    BUG: 1406410
    Signed-off-by: Soumya Koduri <skoduri>
    Reviewed-on: http://review.gluster.org/16213
    Smoke: Gluster Build System <jenkins.org>
    Reviewed-by: Kaleb KEITHLEY <kkeithle>
    Reviewed-by: jiffin tony Thottan <jthottan>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>

Comment 3 Shyamsundar 2017-03-06 17:40:20 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/