Bug 1394769 - Server Node not able to connect after stopping and starting the network port
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: rhgs-3.2
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Gaurav Yadav
QA Contact: Byreddy
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-11-14 12:26 UTC by Karan Sandha
Modified: 2017-08-03 03:53 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-08-03 03:53:48 UTC
Target Upstream Version:



Description Karan Sandha 2016-11-14 12:26:01 UTC
Description of problem:
glusterd is not able to reconnect to its peers after a network port on one server is brought down and then brought back up.

Version-Release number of selected component (if applicable):
[root@dhcp47-143 ~]# gluster --version
glusterfs 3.8.4 built on Oct 24 2016 11:13:47
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.


How reproducible:
2/2
Logs are placed at rhsqe-repo.lab.eng.blr.redhat.com:/var/www/html/sosreports/<bug>

Steps to Reproduce:
1. Create a Gluster cluster of three servers by peer probing them.
2. Bring one network port down on server 3: ifdown <port name>, e.g. eth0
3. Check gluster peer status on server 1:
[root@dhcp47-141 home]# gluster peer status
Number of Peers: 2

Hostname: dhcp47-143.lab.eng.blr.redhat.com
Uuid: 526679d1-7035-4cdd-8c9f-d27a060d7022
State: Peer in Cluster (Connected)

Hostname: dhcp47-144.lab.eng.blr.redhat.com
Uuid: bd889f19-a6a0-4487-8093-8e427c7297d5
State: Peer in Cluster (Disconnected)

4. Bring the port back up: ifup <port name>, e.g. eth0
5. Check peer status on all the servers.
Server 1:
[root@dhcp47-141 home]# gluster peer status
Number of Peers: 2

Hostname: dhcp47-143.lab.eng.blr.redhat.com
Uuid: 526679d1-7035-4cdd-8c9f-d27a060d7022
State: Peer in Cluster (Connected)

Hostname: dhcp47-144.lab.eng.blr.redhat.com
Uuid: bd889f19-a6a0-4487-8093-8e427c7297d5
State: Peer in Cluster (Disconnected)

Server 3:
[root@dhcp47-144 home]# gluster peer s
Number of Peers: 2

Hostname: dhcp47-141.lab.eng.blr.redhat.com
Uuid: f5cf48a8-02b0-49de-b881-21f91aeae829
State: Peer in Cluster (Connected)

Hostname: dhcp47-143.lab.eng.blr.redhat.com
Uuid: 526679d1-7035-4cdd-8c9f-d27a060d7022
State: Peer in Cluster (Connected)

Actual results:
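After the port is brought back up, two servers still report a peer as Disconnected (server 1 shows server 3 as Disconnected), while server 3 shows itself as Connected to the other two.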

Expected results:
All servers should be connected with each other.
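
The mismatch between the two views can be checked mechanically. The following is a minimal sketch, not part of glusterd or its tooling: it parses text in the `gluster peer status` format shown above and lists peers whose state is not Connected. The function names `parse_peer_status` and `disconnected_peers` are hypothetical, and the sample text is the server 1 output from this report.

```python
import re

# Output of `gluster peer status` on server 1, copied from this report.
SERVER1_OUTPUT = """\
Number of Peers: 2

Hostname: dhcp47-143.lab.eng.blr.redhat.com
Uuid: 526679d1-7035-4cdd-8c9f-d27a060d7022
State: Peer in Cluster (Connected)

Hostname: dhcp47-144.lab.eng.blr.redhat.com
Uuid: bd889f19-a6a0-4487-8093-8e427c7297d5
State: Peer in Cluster (Disconnected)
"""


def parse_peer_status(output):
    """Map each peer hostname to the word in parentheses on its State line."""
    peers = {}
    hostname = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            hostname = line.split(":", 1)[1].strip()
        elif line.startswith("State:") and hostname is not None:
            match = re.search(r"\((\w+)\)\s*$", line)
            peers[hostname] = match.group(1) if match else "Unknown"
            hostname = None
    return peers


def disconnected_peers(output):
    """Return hostnames whose connection state is anything but Connected."""
    return [h for h, s in parse_peer_status(output).items() if s != "Connected"]


if __name__ == "__main__":
    for host in disconnected_peers(SERVER1_OUTPUT):
        print("not connected:", host)
```

Running the same check against the output captured on each server makes it easy to see which nodes disagree about the cluster state.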

Additional info:

Comment 2 Atin Mukherjee 2016-11-14 15:55:47 UTC
Initially I suspected this to be a friend-sm issue, but the reason the faulty glusterd is not able to connect to the other glusterds is:

[2016-11-14 15:53:55.710314] E [MSGID: 101075] [common-utils.c:308:gf_resolve_ip6] 0-resolver: getaddrinfo failed (Name or service not known)

Not sure why getaddrinfo is failing here.
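
One way to narrow this down outside of glusterd is to run the same kind of lookup directly on the faulty node. The sketch below is only an illustration using Python's standard `socket.getaddrinfo`; it is not how glusterd's `gf_resolve_ip6` is implemented. The helper name `resolve` is hypothetical, and port 24007 is assumed to be the glusterd management port in this setup.

```python
import socket

# Assumption: glusterd listens on its usual management port in this setup.
GLUSTERD_PORT = 24007


def resolve(hostname, port=GLUSTERD_PORT):
    """Return (addresses, error) for a hostname lookup.

    On success, addresses is a sorted list of IP strings and error is None.
    On failure, addresses is empty and error holds the resolver message,
    for example "Name or service not known".
    """
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return [], str(exc)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos}), None


if __name__ == "__main__":
    # Peer hostnames taken from the peer status output in this report.
    for peer in ("dhcp47-141.lab.eng.blr.redhat.com",
                 "dhcp47-143.lab.eng.blr.redhat.com"):
        addresses, error = resolve(peer)
        print(peer, "->", addresses if error is None else "FAILED: " + error)
```

If the lookup fails here as well right after `ifup`, the cause is likely in the node's name resolution (for example DNS settings rewritten when the interface came back) rather than in glusterd itself.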

Comment 3 Byreddy 2016-11-15 04:19:44 UTC
I tried the same scenario 8 times in my setup and it did not reproduce even once; peer status was shown correctly every time, as expected.

Comment 4 Atin Mukherjee 2016-11-15 04:31:49 UTC
Given comment 3, it would be worth looking at how this setup has been configured and how it differs from Byreddy's setup.

Comment 10 Gaurav Yadav 2017-08-03 03:53:48 UTC
I tried reproducing the issue in my setup with the same scenario over multiple trials. Peer status is shown correctly every time.

As the issue is not reproducible, I am closing this bug.

