Bug 1457981 - client fails to connect to the brick due to an incorrect port reported back by glusterd
Summary: client fails to connect to the brick due to an incorrect port reported back by glusterd
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Atin Mukherjee
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1451756
 
Reported: 2017-06-01 16:43 UTC by Atin Mukherjee
Modified: 2017-09-05 17:32 UTC
CC: 8 users

Fixed In Version: glusterfs-3.12.0
Clone Of: 1451756
Environment:
Last Closed: 2017-09-05 17:32:34 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Atin Mukherjee 2017-06-01 16:43:52 UTC
+++ This bug was initially created as a clone of Bug #1451756 +++

Description of problem:
Had an HC (hyperconverged) installation with three nodes running the glusterfs 3.2 build. I upgraded one of the nodes to the glusterfs 3.3 build and saw that the heal info count never reaches zero. The file is getting healed, but the pending xattrs are set again. Another observation is that the private structure of the client xlator shows both connecting and connected as 1.

client 0:
===============
cat /rhev/data-center/mnt/glusterSD/10.70.36.78\:_data/.meta/graphs/active/data-client-1/private 
[xlator.protocol.client.data-client-1.priv]
fd.0.remote_fd = -1
fd.1.remote_fd = -1
fd.2.remote_fd = -1
fd.3.remote_fd = -1
connecting = 1
connected = 1
total_bytes_read = 3889072373
ping_timeout = 30
total_bytes_written = 12075388236
ping_msgs_sent = 0
msgs_sent = 152360

client 1:
====================
cat /rhev/data-center/mnt/glusterSD/10.70.36.78\:_data/.meta/graphs/active/data-client-1/private 
[xlator.protocol.client.data-client-1.priv]
fd.0.remote_fd = 1
connecting = 0
connected = 1
total_bytes_read = 1166694134
ping_timeout = 30
total_bytes_written = 37377716
ping_msgs_sent = 760
msgs_sent = 108352

cat /rhev/data-center/mnt/glusterSD/10.70.36.78\:_data/.meta/graphs/active/data-client-1/private 
[xlator.protocol.client.data-client-1.priv]
fd.0.remote_fd = -1
fd.1.remote_fd = -1
connecting = 1
connected = 1
total_bytes_read = 794759321
ping_timeout = 30
total_bytes_written = 13378418443
ping_msgs_sent = 0
msgs_sent = 306251

Version-Release number of selected component (if applicable):


How reproducible:
Hit it once

Steps to Reproduce:
1. Installed HC with the glusterfs 3.2 build.
2. Upgraded one of the nodes (client-1) to the glusterfs 3.3 build.
3. Once the upgrade is done, heal info never becomes zero.

Actual results:
The file gets healed, but the pending xattrs are set again, and the private structure of the client xlator shows both connecting and connected as 1.

Expected results:
The file should get healed successfully, and the private structure should not show both connecting and connected as 1.

Additional info:

--- Additional comment from Red Hat Bugzilla Rules Engine on 2017-05-17 09:02:58 EDT ---

This bug is automatically being proposed for the current release of Red Hat Gluster Storage 3 under active development, by setting the release flag 'rhgs-3.3.0' to '?'.

If this bug should be proposed for a different release, please manually change the proposed release flag.

--- Additional comment from RamaKasturi on 2017-05-17 09:13:50 EDT ---

sosreports can be found in the link below:
=================================================
http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/HC/1451756/

--- Additional comment from Raghavendra G on 2017-05-31 02:13:19 EDT ---

Logs are found in the sos-report:
sosreport-yarrow.lab.eng.blr.redhat.com-20170517173214.tar.xz

However, it looks like the entire log is not present. The log file is a symbolic link pointing to a tailed log:
[raghu@unused glusterfs]$ ls -l rhev-data-center-mnt-glusterSD-10.70.36.78\:_data.log 
lrwxrwxrwx. 1 raghu users 106 May 17 17:34 rhev-data-center-mnt-glusterSD-10.70.36.78:_data.log -> ../../../sos_strings/gluster/var.log.glusterfs.rhev-data-center-mnt-glusterSD-10.70.36.78:_data.log.tailed


The tailed log captures messages only after disconnect. Need full logs to analyze the issue.

--- Additional comment from Raghavendra G on 2017-06-01 05:03:58 EDT ---

glusterd says the brick is listening on port 49155:

# gluster volume status data
Brick 10.70.36.77:/gluster_bricks/data/data 49155     0          Y       22436

Client logs say the client is connecting to port 49155:
[2017-06-01 08:56:05.149017] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 0-data-client-1: changing port to 49155 (from 0)
[2017-06-01 08:56:05.152941] E [socket.c:2318:socket_connect_finish] 0-data-client-1: connection to 10.70.36.77:49155 failed (Connection refused); disconnecting socket
[2017-06-01 08:56:09.153591] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 0-data-client-1: changing port to 49155 (from 0)
[2017-06-01 08:56:09.157781] E [socket.c:2318:socket_connect_finish] 0-data-client-1: connection to 10.70.36.77:49155 failed (Connection refused); disconnecting socket
[2017-06-01 08:56:13.158146] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 0-data-client-1: changing port to 49155 (from 0)
[2017-06-01 08:56:13.162099] E [socket.c:2318:socket_connect_finish] 0-data-client-1: connection to 10.70.36.77:49155 failed (Connection refused); disconnecting socket
[2017-06-01 08:56:17.162637] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 0-data-client-1: changing port to 49155 (from 0)
[2017-06-01 08:56:17.166604] E [socket.c:2318:socket_connect_finish] 0-data-client-1: connection to 10.70.36.77:49155 failed (Connection refused); disconnecting socket
[2017-06-01 08:56:21.167186] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 0-data-client-1: changing port to 49155 (from 0)
[2017-06-01 08:56:21.171153] E [socket.c:2318:socket_connect_finish] 0-data-client-1: connection to 10.70.36.77:49155 failed (Connection refused); disconnecting socket
[2017-06-01 08:56:25.172089] I [rpc-clnt.c:2001:rpc_clnt_reconfig] 0-data-client-1: changing port to 49155 (from 0)
[2017-06-01 08:56:25.176367] E [socket.c:2318:socket_connect_finish] 0-data-client-1: connection to 10.70.36.77:49155 failed (Connection refused); disconnecting

But the brick is actually listening on port 49152:

# ps ax | grep -i brick | grep data
root     22436  0.5  0.0 1217564 26164 ?       Ssl  12:31   0:40 /usr/sbin/glusterfsd -s 10.70.36.77 --volfile-id data.10.70.36.77.gluster_bricks-data-data -p /var/lib/glusterd/vols/data/run/10.70.36.77-gluster_bricks-data-data.pid -S /var/run/gluster/77a68376a313c246358b3d10487b0e94.socket --brick-name /gluster_bricks/data/data -l /var/log/glusterfs/bricks/gluster_bricks-data-data.log --xlator-option *-posix.glusterd-uuid=f889a05f-648e-4890-b1e6-637e33b69803 --brick-port 49152 --xlator-option data-server.listen-port=49152

So this is an issue with the glusterd portmapper: it is giving back the wrong port for the brick.
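
To make the failure mode concrete, here is a minimal, self-contained C sketch of the idea. It is not glusterd's actual pmap code; the structures, function names, and lookup rule below are simplified assumptions. It only illustrates how a port-keyed registry that ends up with the same brick registered under two ports (as the statedump in the next comment confirms) can hand a stale port back to a PORTBYBRICK-style query, matching what the client logs above show: the client is told 49155 while the brick listens on 49152.

#include <stdio.h>
#include <string.h>

#define PMAP_BASE_PORT 49152
#define PMAP_MAX_PORTS 8

/* Hypothetical, simplified model of a port-to-brick registry:
 * one slot per port, each holding the registered brick path(s). */
struct pmap_entry {
    int  in_use;
    char brickname[256];
};

static struct pmap_entry registry[PMAP_MAX_PORTS];

/* Register a brick against a port; if the slot is already in use the
 * name is appended, so one port can end up listing two bricks. */
static void pmap_register(int port, const char *brick)
{
    struct pmap_entry *e = &registry[port - PMAP_BASE_PORT];

    if (e->in_use) {
        strncat(e->brickname, " ", sizeof(e->brickname) - strlen(e->brickname) - 1);
        strncat(e->brickname, brick, sizeof(e->brickname) - strlen(e->brickname) - 1);
    } else {
        snprintf(e->brickname, sizeof(e->brickname), "%s", brick);
        e->in_use = 1;
    }
}

/* Answer a PORTBYBRICK-style query: scan the ports and return the last
 * one whose entry mentions the brick.  With two registrations for the
 * same brick, the later (stale) port wins. */
static int pmap_port_by_brick(const char *brick)
{
    int found = -1;

    for (int i = 0; i < PMAP_MAX_PORTS; i++)
        if (registry[i].in_use && strstr(registry[i].brickname, brick))
            found = PMAP_BASE_PORT + i;
    return found;
}

int main(void)
{
    /* First start attempt registers the data brick on 49152 ... */
    pmap_register(49152, "/gluster_bricks/data/data");
    /* ... a second, racing start attempt registers it again on 49155,
     * even though the running brick process still listens on 49152. */
    pmap_register(49155, "/gluster_bricks/data/data");

    printf("port returned to client: %d\n",
           pmap_port_by_brick("/gluster_bricks/data/data"));   /* 49155 */
    /* The client then gets "Connection refused" on 49155. */
    return 0;
}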

--- Additional comment from Atin Mukherjee on 2017-06-01 06:18:56 EDT ---

from the statedump:

glusterd.peer2.rpc.msgs_sent=164
glusterd.pmap_port=49152
glusterd.pmap[49152].type=4
glusterd.pmap[49152].brickname=/gluster_bricks/engine/engine /gluster_bricks/data/data   <===== this is wrong; we can never have two bricks on the same port unless brick multiplexing is enabled, which is not the case here.
glusterd.pmap_port=49153
glusterd.pmap[49153].type=4
glusterd.pmap[49153].brickname=/gluster_bricks/engine/engine
glusterd.pmap_port=49154
glusterd.pmap[49154].type=4
glusterd.pmap[49154].brickname=/gluster_bricks/vmstore/vmstore /gluster_bricks/vmstore/vmstore <===== same here as above
glusterd.pmap_port=49155
glusterd.pmap[49155].type=4
glusterd.pmap[49155].brickname=/gluster_bricks/data/data <=== this brick has an entry for port 49152 as well
glusterd.pmap_port=49156
glusterd.pmap[49156].type=0
glusterd.pmap[49156].brickname=(null)

from the brick logfile gluster_bricks-data-data.log:

[2017-05-31 09:48:40.595526] I [MSGID: 100030] [glusterfsd.c:2412:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.8.4 (args: /usr/sbin/glusterfsd -s 10.70.36.77 --volfile-id data.10.70.36.77.gluster_bricks-data-data -p /var/lib/glusterd/vols/data/run/10.70.36.77-gluster_bricks-data-data.pid -S /var/run/gluster/77a68376a313c246358b3d10487b0e94.socket --brick-name /gluster_bricks/data/data -l /var/log/glusterfs/bricks/gluster_bricks-data-data.log --xlator-option *-posix.glusterd-uuid=f889a05f-648e-4890-b1e6-637e33b69803 --brick-port 49152 --xlator-option data-server.listen-port=49152)
[2017-05-31 09:48:40.630861] I [MSGID: 100030] [glusterfsd.c:2412:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.8.4 (args: /usr/sbin/glusterfsd -s 10.70.36.77 --volfile-id data.10.70.36.77.gluster_bricks-data-data -p /var/lib/glusterd/vols/data/run/10.70.36.77-gluster_bricks-data-data.pid -S /var/run/gluster/77a68376a313c246358b3d10487b0e94.socket --brick-name /gluster_bricks/data/data -l /var/log/glusterfs/bricks/gluster_bricks-data-data.log --xlator-option *-posix.glusterd-uuid=f889a05f-648e-4890-b1e6-637e33b69803 --brick-port 49155 --xlator-option data-server.listen-port=49155)

The above clearly indicates that we tried to start this brick twice, once with port 49152 and in the second attempt with port 49155, which is the actual problem here. This is why we see the same brick registered against two different ports. Additionally, one port has two different brick entries, which is also wrong.


If we map this time stamp against the glusterd log, here is the sequence:

[2017-05-31 09:48:40.563893] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-05-31 09:48:40.564975] I [MSGID: 106163] [glusterd-handshake.c:1274:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31001
[2017-05-31 09:48:40.574124] I [MSGID: 106490] [glusterd-handler.c:2610:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 75ab54f6-c570-4eda-8d8a-ed37f96d43ca
[2017-05-31 09:48:40.577823] I [MSGID: 106493] [glusterd-handler.c:3865:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.70.36.76 (0), ret: 0, op_ret: 0
[2017-05-31 09:48:40.580988] C [MSGID: 106003] [glusterd-server-quorum.c:341:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume data. Starting local bricks.
[2017-05-31 09:48:40.585199] I [rpc-clnt.c:1058:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-05-31 09:48:40.586056] C [MSGID: 106003] [glusterd-server-quorum.c:341:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume engine. Starting local bricks.
[2017-05-31 09:48:40.588661] I [rpc-clnt.c:1058:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-05-31 09:48:40.588818] C [MSGID: 106003] [glusterd-server-quorum.c:341:glusterd_do_volume_quorum_action] 0-management: Server quorum regained for volume vmstore. Starting local bricks.
[2017-05-31 09:48:40.591559] I [rpc-clnt.c:1058:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-05-31 09:48:40.592001] I [rpc-clnt.c:1058:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600

Further analysis to follow.
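
To illustrate the suspected sequence, here is a minimal C sketch of the race. The names and control flow are simplified assumptions, not the real glusterd code: the two real code paths (glusterd_restart_bricks () running in the spawn-daemons synctask, and glusterd_do_volume_quorum_action () on quorum regain) are modeled as two back-to-back calls to a single brick_start() helper.

#include <stdio.h>

enum brick_status { GF_BRICK_STOPPED, GF_BRICK_STARTED };

struct brickinfo {
    enum brick_status status;
    int               port;
};

static int next_free_port = 49152;

static void brick_start(struct brickinfo *b, const char *caller)
{
    if (b->status == GF_BRICK_STARTED) {
        printf("%s: brick already started on port %d, nothing to do\n",
               caller, b->port);
        return;
    }
    /* status is still GF_BRICK_STOPPED here: pick a port and spawn.
     * The flag only flips to GF_BRICK_STARTED later, when glusterd
     * processes the RPC_CLNT_CONNECT event from the brick process. */
    b->port = next_free_port++;
    printf("%s: spawning glusterfsd --brick-port %d\n", caller, b->port);
}

int main(void)
{
    struct brickinfo data_brick = { GF_BRICK_STOPPED, 0 };

    /* glusterd restart: restart_bricks runs in a synctask ... */
    brick_start(&data_brick, "glusterd_restart_bricks");

    /* ... and before RPC_CLNT_CONNECT is handled, quorum is regained and
     * the quorum action starts the local bricks again -> second spawn on a
     * second port.  (In the real logs the second attempt got 49155 because
     * the engine and vmstore bricks took the ports in between.) */
    brick_start(&data_brick, "glusterd_do_volume_quorum_action");
    return 0;
}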

Comment 1 Worker Ant 2017-06-01 16:45:00 UTC
REVIEW: https://review.gluster.org/17447 (glusterd: fix brick start race) posted (#1) for review on master by Atin Mukherjee (amukherj)

Comment 2 Worker Ant 2017-06-04 13:13:55 UTC
REVIEW: https://review.gluster.org/17447 (glusterd: fix brick start race) posted (#2) for review on master by Atin Mukherjee (amukherj)

Comment 3 Worker Ant 2017-06-05 13:01:47 UTC
REVIEW: https://review.gluster.org/17447 (glusterd: fix brick start race) posted (#3) for review on master by Atin Mukherjee (amukherj)

Comment 4 Worker Ant 2017-06-05 15:03:11 UTC
REVIEW: https://review.gluster.org/17447 (glusterd: fix brick start race) posted (#4) for review on master by Atin Mukherjee (amukherj)

Comment 5 Worker Ant 2017-06-06 02:24:31 UTC
COMMIT: https://review.gluster.org/17447 committed in master by Jeff Darcy (jeff.us) 
------
commit 7b58ec260152bdcf840ac622dbb883ce8b593f63
Author: Atin Mukherjee <amukherj>
Date:   Thu Jun 1 22:05:51 2017 +0530

    glusterd: fix brick start race
    
    This commit tries to handle a race where we might end up trying to spawn
    the brick process twice with two different ports, resulting in the
    glusterd portmapper holding the same brick entry on two different ports,
    which in turn causes clients to fail to connect to bricks because an
    incorrect port is communicated back by glusterd.
    
    In glusterd_brick_start (), checking the brickinfo->status flag to
    identify whether a brick has already been started by glusterd is not
    sufficient. While glusterd restarts, glusterd_restart_bricks () is called
    through glusterd_spawn_daemons () in a synctask, and immediately
    afterwards glusterd_do_volume_quorum_action (), with server-side quorum
    set to on, tries to start the brick again. If the RPC_CLNT_CONNECT event
    for the same brick has not been processed by glusterd by that time,
    brickinfo->status is still marked as GF_BRICK_STOPPED, resulting in a
    reattempt to start the brick with a different port; that sends the
    portmap for a toss and causes clients to fetch an incorrect port.
    
    The fix is to introduce another enum value, GF_BRICK_STARTING, for
    brickinfo->status. It is set when a brick start is attempted by glusterd
    and is moved to started through the RPC_CLNT_CONNECT event. For brick
    multiplexing, on an attach-brick request the brickinfo->status flag is
    marked as started directly, so this value has no effect there. This
    patch also removes the started_here flag, as it is redundant with
    brickinfo->status.
    
    Change-Id: I9dda1a9a531b67734a6e8c7619677867b520dcb2
    BUG: 1457981
    Signed-off-by: Atin Mukherjee <amukherj>
    Reviewed-on: https://review.gluster.org/17447
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Jeff Darcy <jeff.us>
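
Below is a minimal C sketch of the approach the commit message describes. It is not the actual patch; the function and field names are simplified assumptions. The point is that GF_BRICK_STARTING closes the window between "spawn attempted" and "RPC_CLNT_CONNECT processed", so a racing second caller no longer respawns the brick with a fresh port.

#include <stdio.h>

enum brick_status {
    GF_BRICK_STOPPED,
    GF_BRICK_STARTING,   /* new: start attempted, connect not yet seen */
    GF_BRICK_STARTED,
};

struct brickinfo {
    enum brick_status status;
    int               port;
};

static int brick_start(struct brickinfo *b, int port)
{
    /* A brick that is already starting or started must not be spawned
     * again with a different port. */
    if (b->status == GF_BRICK_STARTING || b->status == GF_BRICK_STARTED)
        return 0;

    b->status = GF_BRICK_STARTING;
    b->port   = port;
    /* ... spawn glusterfsd --brick-port <port> here ... */
    return 0;
}

/* Called when glusterd processes RPC_CLNT_CONNECT from the brick. */
static void brick_connect_notify(struct brickinfo *b)
{
    b->status = GF_BRICK_STARTED;
}

int main(void)
{
    struct brickinfo b = { GF_BRICK_STOPPED, 0 };

    brick_start(&b, 49152);       /* first caller wins the port       */
    brick_start(&b, 49155);       /* racing second caller is a no-op  */
    brick_connect_notify(&b);     /* RPC_CLNT_CONNECT -> STARTED      */

    printf("status=%d port=%d\n", b.status, b.port);   /* port stays 49152 */
    return 0;
}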

Comment 6 Shyamsundar 2017-09-05 17:32:34 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.0, please open a new bug report.

glusterfs-3.12.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2017-September/000082.html
[2] https://www.gluster.org/pipermail/gluster-users/

