+++ This bug was initially created as a clone of Bug #1537346 +++

Description of problem:

Sometimes, after rebooting one SN node, the output of the command "gluster v heal mstate info" shows:

[root@testsn-1:/var/log/glusterfs/bricks]
# gluster v heal mstate info
Brick testsn-0.local:/mnt/bricks/mstate/brick
/testas-0/var/lib/ntp/drift
/testas-2/var/lib/ntp/drift
/.install-done
/testas-0/var/lib/ntp
/testmn-1/var/lib/ntp
/testas-2/var/lib/ntp
/testmn-0/var/lib/ntp
/testmn-1/var/lib/ntp/drift
/testas-1/var/lib/ntp
/testas-1/var/lib/ntp/drift
/testmn-0/var/lib/ntp/drift
Status: Connected
Number of entries: 11

Brick testsn-1.local:/mnt/bricks/mstate/brick
Status: Transport endpoint is not connected
Number of entries: -

glustershd cannot connect to the local brick process! When I check the glustershd log, it always fails when trying to connect to the glusterfsd process on port 49155:

[2018-01-18 10:42:29.891811] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-mstate-client-1: changing port to 49155 (from 0)
[2018-01-18 10:42:29.892120] E [socket.c:2369:socket_connect_finish] 0-mstate-client-1: connection to 192.168.1.3:49155 failed (Connection refused); disconnecting

However, the local mstate glusterfsd process is actually listening on port 49153!

Version-Release number of selected component (if applicable):
glusterfs 3.12.3

How reproducible:
Reboot an SN node.

Steps to Reproduce:
1. Reboot an SN node.

Actual results:
glustershd cannot connect to one local glusterfsd brick process. This can be seen from the following ps and netstat output:

[root@testsn-1:/var/log/glusterfs/bricks]
# ps -ef | grep glustershd
root      1295     1  0 Jan18 ?  00:00:18 /usr/sbin/glusterfs -s testsn-1.local --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/178dba826edae38df4ba67f25beeb1e6.socket --xlator-option *replicate*.node-uuid=9ccea6b1-4d81-4020-a4ba-ee6821268ba8
root     19900 27911  0 04:10 pts/1  00:00:00 grep glustershd

[root@testsn-1:/var/log/glusterfs/bricks]
# netstat -p | grep 1295
tcp   0   0 testsn-1.local:49098   testsn-0.local:49154   ESTABLISHED 1295/glusterfs
tcp   0   0 testsn-1.local:49099   testsn-0.local:49152   ESTABLISHED 1295/glusterfs
tcp   0   0 testsn-1.local:49140   testsn-1.local:24007   ESTABLISHED 1295/glusterfs
tcp   0   0 testsn-1.local:49097   testsn-0.local:49153   ESTABLISHED 1295/glusterfs
tcp   0   0 testsn-1.local:49096   testsn-0.local:49155   ESTABLISHED 1295/glusterfs
tcp   0   0 testsn-1.local:49120   testsn-1.local:49156   ESTABLISHED 1295/glusterfs
tcp   0   0 testsn-1.local:49121   testsn-1.local:49152   ESTABLISHED 1295/glusterfs
tcp   0   0 testsn-1.local:49126   testsn-1.local:49154   ESTABLISHED 1295/glusterfs
unix  3   [ ] STREAM CONNECTED  36264  1295/glusterfs /var/run/gluster/178dba826edae38df4ba67f25beeb1e6.socket
unix  2   [ ] DGRAM             36258  1295/glusterfs

Expected results:
glustershd should be able to connect to the local brick process.

Additional info:

# gluster v status mstate
Status of volume: mstate
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick testsn-0.local:/mnt/bricks/mstate/bri
ck                                          49154     0          Y       1113
Brick testsn-1.local:/mnt/bricks/mstate/bri
ck                                          49155     0          Y       1117
Self-heal Daemon on localhost               N/A       N/A        Y       1295
Self-heal Daemon on testsn-2.local          N/A       N/A        Y       1813
Self-heal Daemon on testsn-0.local          N/A       N/A        Y       1135

Task Status of Volume mstate
------------------------------------------------------------------------------
There are no active volume tasks

It is quite strange that the mstate brick process's listen port is shown as 49155 in "gluster v status mstate" but as 49153 in the ps output:

[root@testsn-1:/var/log/glusterfs/bricks]
# ps -ef | grep -i glusterfsd | grep mstate
root      1117     1  0 Jan18 ?  00:00:05 /usr/sbin/glusterfsd -s testsn-1.local --volfile-id mstate.testsn-1.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/testsn-1.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/b520b934b415e6a68776cc4852901a77.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=9ccea6b1-4d81-4020-a4ba-ee6821268ba8 --brick-port 49153 --xlator-option mstate-server.listen-port=49153 --xlator-option transport.socket.bind-address=testsn-1.local

--- Additional comment from Worker Ant on 2018-01-22 21:29:06 EST ---

REVIEW: https://review.gluster.org/19263 (glusterd: process pmap sign in only when port is marked as free) posted (#3) for review on master by Atin Mukherjee
REVIEW: https://review.gluster.org/19263 (glusterd: process pmap sign in only when port is marked as free) posted (#4) for review on master by Atin Mukherjee
COMMIT: https://review.gluster.org/19263 committed in master by "Atin Mukherjee" <amukherj> with a commit message-

glusterd: process pmap sign in only when port is marked as free

Because of some crazy race in volume start code path because of friend handshaking with volumes with quorum enabled, we might end up in a situation where glusterd would start a brick, get a disconnect, and then immediately try to start the same brick instance based on another friend update request. And then, even if the process for the very first brick doesn't come up, at the end a sign in event gets sent and we end up having two duplicate portmap entries for the same brick. Since in brick start we mark the previous port as free, it's better to consider a sign in request a no-op if the corresponding port type is marked as free.

Change-Id: I995c348c7b6988956d24b06bf3f09ab64280fc32
BUG: 1537362
Signed-off-by: Atin Mukherjee <amukherj>
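The logic of the fix can be sketched as follows. This is a minimal illustrative model in Python, not the actual glusterd C code (which lives in glusterd-pmap.c); the class, state names, and methods are hypothetical stand-ins for glusterd's port registry.

```python
# Port states loosely modeled on glusterd's pmap port types; the
# names here are illustrative, not the actual enum values.
FREE, LEASED, BRICKSERVER = range(3)

class PortMap:
    """Toy model of glusterd's brick port registry."""

    def __init__(self):
        self.ports = {}  # port -> (state, brickname)

    def alloc(self, port, brickname):
        # glusterd leases a port to a brick it is about to start.
        self.ports[port] = (LEASED, brickname)

    def mark_free(self, port):
        # A subsequent brick (re)start marks the previous port as free.
        if port in self.ports:
            self.ports[port] = (FREE, None)

    def sign_in(self, port, brickname):
        # The fix: treat a sign in as a no-op unless the port is still
        # leased; a sign in arriving for a freed port is stale.
        state, _ = self.ports.get(port, (FREE, None))
        if state != LEASED:
            return False  # stale sign in, ignored
        self.ports[port] = (BRICKSERVER, brickname)
        return True

pm = PortMap()
pm.alloc(49155, "/mnt/bricks/mstate/brick")  # first (failed) start
pm.mark_free(49155)                          # restart frees the old port
pm.alloc(49158, "/mnt/bricks/mstate/brick")  # second start leases a new port
print(pm.sign_in(49155, "/mnt/bricks/mstate/brick"))  # False: stale, ignored
print(pm.sign_in(49158, "/mnt/bricks/mstate/brick"))  # True: accepted
```

Without the freed-port check, both sign ins would succeed and the registry would hold two entries for the same brick, only one of which corresponds to a live process.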
(In reply to Worker Ant from comment #2)
> COMMIT: https://review.gluster.org/19263 committed in master by "Atin
> Mukherjee" <amukherj> with a commit message- glusterd: process
> pmap sign in only when port is marked as free
> [...]
> Change-Id: I995c348c7b6988956d24b06bf3f09ab64280fc32
> BUG: 1537362
> Signed-off-by: Atin Mukherjee <amukherj>

This issue appeared again in my local test environment:

# gluster volume heal log info
Brick sn-0.local:/mnt/bricks/log/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick sn-1.local:/mnt/bricks/log/brick
/master/fsaudit/internal-ssh.log
/
/master/fsaudit/auth.log
/blackboxes
/master/fsaudit/alarms
/rcploglib/sn-0/DHAAgent.level
/rcploglib/sn-0/DVMAgent.level
/master/syslog
Status: Connected
Number of entries: 8

[root@sn-0:/root]
# gluster volume heal mstate info
Brick sn-0.local:/mnt/bricks/mstate/brick
Status: Transport endpoint is not connected
Number of entries: -

Brick sn-1.local:/mnt/bricks/mstate/brick
/
Status: Connected
Number of entries: 1

The glustershd process does exist:

[root@sn-0:/root]
# ps aux | grep glustershd | grep -v grep | wc -l
1

[root@sn-0:/root]
# gluster volume status log
Status of volume: log
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sn-0.local:/mnt/bricks/log/brick      49157     0          Y       1402
Brick sn-1.local:/mnt/bricks/log/brick      49155     0          Y       3817
Self-heal Daemon on localhost               N/A       N/A        Y       1428
Self-heal Daemon on sn-1.local              N/A       N/A        Y       20316
Self-heal Daemon on sn-2.local              N/A       N/A        Y       15565

Task Status of Volume log
------------------------------------------------------------------------------
There are no active volume tasks

[root@sn-0:/root]
# gluster volume status mstate
Status of volume: mstate
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick sn-0.local:/mnt/bricks/mstate/brick   49158     0          Y       1403
Brick sn-1.local:/mnt/bricks/mstate/brick   49153     0          Y       2967
Self-heal Daemon on localhost               N/A       N/A        Y       1428
Self-heal Daemon on sn-1.local              N/A       N/A        Y       20316
Self-heal Daemon on sn-2.local              N/A       N/A        Y       15565

Task Status of Volume mstate
------------------------------------------------------------------------------
There are no active volume tasks

[root@sn-0:/root]
On the sn-0 node, the mstate and log glusterfsd processes were started twice. The port actually in use is the first one assigned, but glustershd keeps connecting to the latter one.

/////////////////// log volume ///////////////////

[2018-04-15 18:06:00.280807] I [MSGID: 100030] [glusterfsd.c:2521:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.3 (args: /usr/sbin/glusterfsd -s sn-0.local --volfile-id log.sn-0.local.mnt-bricks-log-brick -p /var/run/gluster/vols/log/sn-0.local-mnt-bricks-log-brick.pid -S /var/run/gluster/0af76d9541921cccad34e5fd3041a44d.socket --brick-name /mnt/bricks/log/brick -l /var/log/glusterfs/bricks/mnt-bricks-log-brick.log --xlator-option *-posix.glusterd-uuid=ac000adb-afb9-4826-8dba-9e4a3b641068 --brick-port 49154 --xlator-option log-server.listen-port=49154 --xlator-option transport.socket.bind-address=sn-0.local)

[2018-04-15 18:06:00.767856] I [MSGID: 100030] [glusterfsd.c:2521:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.3 (args: /usr/sbin/glusterfsd -s sn-0.local --volfile-id log.sn-0.local.mnt-bricks-log-brick -p /var/run/gluster/vols/log/sn-0.local-mnt-bricks-log-brick.pid -S /var/run/gluster/0af76d9541921cccad34e5fd3041a44d.socket --brick-name /mnt/bricks/log/brick -l /var/log/glusterfs/bricks/mnt-bricks-log-brick.log --xlator-option *-posix.glusterd-uuid=ac000adb-afb9-4826-8dba-9e4a3b641068 --brick-port 49157 --xlator-option log-server.listen-port=49157 --xlator-option transport.socket.bind-address=sn-0.local)

The real working port is 49154, but glustershd keeps trying to connect to 49157, so it never succeeds:

[2018-04-15 18:21:25.854249] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-log-client-0: changing port to 49157 (from 0)
[2018-04-15 18:21:25.854597] E [socket.c:2369:socket_connect_finish] 0-log-client-0: connection to 192.168.1.12:49157 failed (Connection refused); disconnecting socket

/////////////////// mstate volume ///////////////////

[2018-04-15 18:06:00.383103] I [MSGID: 100030] [glusterfsd.c:2521:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.3 (args: /usr/sbin/glusterfsd -s sn-0.local --volfile-id mstate.sn-0.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/79d00394b1085adcf7dc1fd4c57f71bc.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=ac000adb-afb9-4826-8dba-9e4a3b641068 --brick-port 49155 --xlator-option mstate-server.listen-port=49155 --xlator-option transport.socket.bind-address=sn-0.local)

[2018-04-15 18:06:00.869951] I [MSGID: 100030] [glusterfsd.c:2521:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.12.3 (args: /usr/sbin/glusterfsd -s sn-0.local --volfile-id mstate.sn-0.local.mnt-bricks-mstate-brick -p /var/run/gluster/vols/mstate/sn-0.local-mnt-bricks-mstate-brick.pid -S /var/run/gluster/79d00394b1085adcf7dc1fd4c57f71bc.socket --brick-name /mnt/bricks/mstate/brick -l /var/log/glusterfs/bricks/mnt-bricks-mstate-brick.log --xlator-option *-posix.glusterd-uuid=ac000adb-afb9-4826-8dba-9e4a3b641068 --brick-port 49158 --xlator-option mstate-server.listen-port=49158 --xlator-option transport.socket.bind-address=sn-0.local)

The real listening port for glusterfsd is 49155, but glustershd keeps trying to connect through 49158, so it always fails:

[2018-04-15 18:21:33.855829] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-mstate-client-0: changing port to 49158 (from 0)
[2018-04-15 18:21:33.856229] E [socket.c:2369:socket_connect_finish] 0-mstate-client-0: connection to 192.168.1.12:49158 failed (Connection refused); disconnecting socket
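The pattern in both volumes — two starts of the same brick signing in with different ports, while only the first port is actually listening — can be modeled as duplicate portmap entries where lookups return the most recent (dead) one. This is a hypothetical Python sketch of that failure mode, not glusterd code; the class and method names are invented for illustration.

```python
class PortMapEntries:
    """Toy model: one entry is appended per brick sign in."""

    def __init__(self):
        self.entries = []  # (brickname, port), in sign-in order

    def sign_in(self, brickname, port):
        self.entries.append((brickname, port))

    def search(self, brickname):
        # Most recent entry wins: if a second, failed start also
        # signed in, clients are handed the dead port.
        for name, port in reversed(self.entries):
            if name == brickname:
                return port
        return None

pm = PortMapEntries()
pm.sign_in("/mnt/bricks/log/brick", 49154)  # first start: actually listening
pm.sign_in("/mnt/bricks/log/brick", 49157)  # second start: never came up
print(pm.search("/mnt/bricks/log/brick"))   # 49157 -> glustershd dials a dead port
```

This is exactly why the fix makes the second, stale sign in a no-op: with only one entry per brick, the lookup can only return the live port.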
What branch of glusterfs are you running? If it's a private branch, have you pulled in this fix?
The version I am using is glusterfs 3.12.3, and I have added https://review.gluster.org/#/c/19263/, but occasionally this issue still exists. I think the real root cause is that the glusterfsd process is started twice, consistent with the previous analysis.

What I can see is that every time I restart all the gluster processes (simulating a reboot scenario), the respective brick disconnect events are received at glusterd:

#0  __glusterd_brick_rpc_notify (rpc=rpc@entry=0x7fffe8008810, mydata=mydata@entry=0x7fffe800e0d0, event=event@entry=RPC_CLNT_DISCONNECT, data=data@entry=0x0) at glusterd-handler.c:6007
#1  0x00007ffff2936169 in glusterd_big_locked_notify (rpc=0x7fffe8008810, mydata=0x7fffe800e0d0, event=RPC_CLNT_DISCONNECT, data=0x0, notify_fn=0x7ffff2932440 <__glusterd_brick_rpc_notify>) at glusterd-handler.c:68
#2  0x00007ffff78c0343 in rpc_clnt_handle_disconnect (conn=0x7fffe8008840, clnt=0x7fffe8008810) at rpc-clnt.c:876
#3  rpc_clnt_notify (trans=0x7fffe8008ae0, mydata=0x7fffe8008840, event=RPC_TRANSPORT_DISCONNECT, data=<optimized out>) at rpc-clnt.c:939
#4  0x00007ffff78bca73 in rpc_transport_notify (this=this@entry=0x7fffe8008ae0, event=event@entry=RPC_TRANSPORT_DISCONNECT, data=data@entry=0x7fffe8008ae0) at rpc-transport.c:545
#5  0x00007fffefc252bf in socket_event_poll_err (idx=<optimized out>, gen=<optimized out>, this=0x7fffe8008ae0) at socket.c:1210
#6  socket_event_handler (fd=9, idx=<optimized out>, gen=<optimized out>, data=0x7fffe8008ae0, poll_in=0, poll_out=<optimized out>, poll_err=16) at socket.c:2488
#7  0x00007ffff7b5097c in event_dispatch_epoll_handler (event=0x7fffee9dce84, event_pool=0x64f040) at event-epoll.c:583
#8  event_dispatch_epoll_worker (data=0x70b820) at event-epoll.c:659
#9  0x00007ffff693f36d in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff61ecbbf in clone () from /lib64/libc.so.6

The question I have for the RPC team is: why do we see a disconnect event in this case? Any pointers, Mohit/Raghu/Milind?
The side effect of this disconnect event is what caused this issue. In __glusterd_brick_rpc_notify we set brickinfo->start_triggered to false, so when two glusterd_brick_start threads were racing with each other and a disconnect arrived in between, the flag got reset and we ended up trying to spawn the same brick twice.
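The interaction described above can be sketched as a small concurrency model. This is an illustrative Python toy, not glusterd's C code; the Brick class and its methods are invented stand-ins for brickinfo, glusterd_brick_start, and the disconnect handler.

```python
import threading

class Brick:
    def __init__(self):
        self.lock = threading.Lock()
        self.start_triggered = False  # stands in for brickinfo->start_triggered
        self.spawn_count = 0

    def start(self):
        # Two racing glusterd_brick_start callers reach here; the flag
        # is meant to let only the first one spawn the process.
        with self.lock:
            if self.start_triggered:
                return
            self.start_triggered = True
            self.spawn_count += 1  # stands in for forking glusterfsd

    def on_disconnect(self):
        # __glusterd_brick_rpc_notify resets the flag on
        # RPC_CLNT_DISCONNECT, reopening the race window.
        with self.lock:
            self.start_triggered = False

brick = Brick()
brick.start()             # first start thread wins
brick.start()             # second racer is correctly rejected
brick.on_disconnect()     # disconnect event resets the flag...
brick.start()             # ...so a later friend-update start spawns again
print(brick.spawn_count)  # 2: the same brick was spawned twice
```

The guard alone is sound; it is the disconnect-driven reset landing between the two start attempts that defeats it, which matches the double glusterfsd starts seen in the logs above.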
This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-v4.1.0, please open a new bug report.

glusterfs-v4.1.0 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/announce/2018-June/000102.html
[2] https://www.gluster.org/pipermail/gluster-users/