Bug 1549023

Summary: Observing continuous "disconnecting socket" error messages on client glusterd logs
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Manisha Saini <msaini>
Component: glusterd
Assignee: Atin Mukherjee <amukherj>
Status: CLOSED ERRATA
QA Contact: Manisha Saini <msaini>
Severity: unspecified
Docs Contact:
Priority: high
Version: rhgs-3.4
CC: amukherj, kaushal, msaini, rhinduja, rhs-bugs, srakonde, storage-qa-internal, vbellur
Target Milestone: ---
Target Release: RHGS 3.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: glusterfs-3.12.2-8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-04 06:42:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1548829, 1549996
Bug Blocks: 1503137

Description Manisha Saini 2018-02-26 09:25:21 UTC
Description of problem:

Observing continuous "disconnecting socket" error messages in the client glusterd logs on all 4 clients. All bricks are up and running for the volume.

On debugging the setup with Dev, it appears the client is trying to reach a brick on a stale port.

Version-Release number of selected component (if applicable):
glusterfs-server-3.12.2-4.el7rhgs.x86_64

How reproducible:
Reporting first instance

Steps to Reproduce:
1. Create a 4x3 distributed-replicate volume.
2. Set client access permissions on the volume via the auth.allow volume set option to export the FUSE subdirs:

gluster v set Ganeshavol1 auth.allow "/dir1(10.70.46.125),/dir1(10.70.46.20),/dir3(10.70.47.33),/dir4(10.70.47.170),/(*)"

3. Mount the subdirs and the volume on 4 different clients (see the example mount command below).
4. Start running I/O such as bonnie, file renames, etc. from the clients having the FUSE subdir mounts.
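
For reference, the FUSE subdir mount in step 3 uses the glusterfs subdir mount syntax (server:/volume/subdir); the mount point /mnt/dir1 below is a hypothetical example, not taken from this setup:

# mount -t glusterfs dhcp47-193.lab.eng.blr.redhat.com:/Ganeshavol1/dir1 /mnt/dir1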

Actual results:
On the client side, there are continuous ongoing "disconnecting socket" error messages.


Expected results:
No "disconnecting socket" messages should be observed on the client side.


Additional info:
Before testing the FUSE subdir feature, the same setup was used for Ganesha and GNFS testing.
I recreated a new volume for FUSE subdir testing, but with the same volume name and brick paths.

======================================
Snippet from the glusterd statedump:

glusterd.pmap[49154].brickname=/gluster/brick1/b1
glusterd.pmap_port=49155
glusterd.pmap[49155].type=4
glusterd.pmap[49155].brickname=                   /gluster/brick1/b1

======================================

# gluster v status
Status of volume: Ganeshavol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick dhcp47-193.lab.eng.blr.redhat.com:/gl
uster/brick1/b1                             49153     0          Y       7984 
Brick dhcp46-116.lab.eng.blr.redhat.com:/gl
uster/brick1/b1                             49152     0          Y       26930
Brick dhcp46-184.lab.eng.blr.redhat.com:/gl
uster/brick1/b1                             49156     0          Y       6500 
Brick dhcp47-2.lab.eng.blr.redhat.com:/glus
ter/brick1/b1                               49154     0          Y       12190
Brick dhcp47-193.lab.eng.blr.redhat.com:/gl
uster/brick2/b2                             49153     0          Y       7984 
Brick dhcp46-116.lab.eng.blr.redhat.com:/gl
uster/brick2/b2                             49152     0          Y       26930
Brick dhcp46-184.lab.eng.blr.redhat.com:/gl
uster/brick2/b2                             49156     0          Y       6500 
Brick dhcp47-2.lab.eng.blr.redhat.com:/glus
ter/brick2/b2                               49154     0          Y       12190
Brick dhcp47-193.lab.eng.blr.redhat.com:/gl
uster/brick3/b3                             49153     0          Y       7984 
Brick dhcp46-116.lab.eng.blr.redhat.com:/gl
uster/brick3/b3                             49152     0          Y       26930
Brick dhcp46-184.lab.eng.blr.redhat.com:/gl
uster/brick3/b3                             49156     0          Y       6500 
Brick dhcp47-2.lab.eng.blr.redhat.com:/glus
ter/brick3/b3                               49154     0          Y       12190
Self-heal Daemon on localhost               N/A       N/A        Y       12241
Self-heal Daemon on dhcp47-193.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       8034 
Self-heal Daemon on dhcp46-184.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       6552 
Self-heal Daemon on dhcp46-116.lab.eng.blr.
redhat.com                                  N/A       N/A        Y       26978
 
Task Status of Volume Ganeshavol1
------------------------------------------------------------------------------
There are no active volume tasks


Attaching sosreports and statedumps in a while...
Also, the setup is kept in the same state for further debugging.

Comment 2 Manisha Saini 2018-02-26 09:31:05 UTC
Client-side glusterd volume logs:

[2018-02-26 09:30:10.504195] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-Ganeshavol1-client-0: changing port to 49155 (from 0)
[2018-02-26 09:30:10.509405] E [socket.c:2369:socket_connect_finish] 0-Ganeshavol1-client-0: connection to 10.70.47.193:49155 failed (Connection refused); disconnecting socket
[2018-02-26 09:30:14.509735] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-Ganeshavol1-client-0: changing port to 49155 (from 0)
[2018-02-26 09:30:14.515452] E [socket.c:2369:socket_connect_finish] 0-Ganeshavol1-client-0: connection to 10.70.47.193:49155 failed (Connection refused); disconnecting socket
[2018-02-26 09:30:18.515330] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-Ganeshavol1-client-0: changing port to 49155 (from 0)
[2018-02-26 09:30:18.520898] E [socket.c:2369:socket_connect_finish] 0-Ganeshavol1-client-0: connection to 10.70.47.193:49155 failed (Connection refused); disconnecting socket
[2018-02-26 09:30:22.521159] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-Ganeshavol1-client-0: changing port to 49155 (from 0)
[2018-02-26 09:30:22.526885] E [socket.c:2369:socket_connect_finish] 0-Ganeshavol1-client-0: connection to 10.70.47.193:49155 failed (Connection refused); disconnecting socket
[2018-02-26 09:30:26.527157] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-Ganeshavol1-client-0: changing port to 49155 (from 0)
[2018-02-26 09:30:26.533034] E [socket.c:2369:socket_connect_finish] 0-Ganeshavol1-client-0: connection to 10.70.47.193:49155 failed (Connection refused); disconnecting socket
[2018-02-26 09:30:30.539459] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-Ganeshavol1-client-0: changing port to 49155 (from 0)
[2018-02-26 09:30:30.545487] E [socket.c:2369:socket_connect_finish] 0-Ganeshavol1-client-0: connection to 10.70.47.193:49155 failed (Connection refused); disconnecting socket
[2018-02-26 09:30:34.545203] I [rpc-clnt.c:1986:rpc_clnt_reconfig] 0-Ganeshavol1-client-0: changing port to 49155 (from 0)
[2018-02-26 09:30:34.550923] E [socket.c:2369:socket_connect_finish] 0-Ganeshavol1-client-0: connection to 10.70.47.193:49155 failed (Connection refused); disconnecting socket

Comment 7 Manisha Saini 2018-02-27 06:49:46 UTC
Clearing the needinfo. Setup details have been provided over email, and the log permissions have been changed so they are accessible.

Comment 15 Kaushal 2018-05-07 16:18:43 UTC
After initial analysis, I can conclude that the problem you're seeing now is caused by stale entries in the pmap_registry.

From gdb,
```
(gdb) p $8.ports[49152]
$13 = {type = GF_PMAP_PORT_BRICKSERVER,
  brickname = 0x7f10d40b1230 "/var/lib/glusterd/ss_brick /gluster/brick1/b1 /gluster/brick2/b1 /gluster/brick1/b2 /gluster/brick2/b2 /gluster/brick1/b3 /gluster/brick2/b3 /gluster/brick1/b4 /gluster/brick1/b5 /gluster/brick1/b6 /g"..., xprt = 0x7f10dc083eb0}
(gdb) p $8.ports[49153]
$14 = {type = GF_PMAP_PORT_BRICKSERVER, 
  brickname = 0x7f10d40724c0 "/gluster/brick1/b1 /gluster/brick2/b1", ' ' <repeats 20 times>, "/gluster/brick2/b2 /gluster/brick1/b3 /gluster/brick2/b3", ' ' <repeats 20 times>, "/gluster/brick1/b5", ' ' <repeats 49 times>..., 
  xprt = 0x0}
```

Note that the pmap_registry contains a stale pmap_entry on port 49153 for the affected bricks.

pmap_registry_search does a reverse search from the highest port to the lowest. This causes the stale port to be returned to clients when they perform a pmap query for one of the bricks listed in the stale entry, which in turn leads to the clients trying to connect to the stale port and getting the "Connection refused" errors.
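
As a minimal sketch of that behaviour (simplified types and names only, not the actual glusterd code), a highest-to-lowest scan over a registry that still holds a stale entry resolves the brick to the stale port:

```
#include <stdio.h>
#include <string.h>

/* Simplified model of the pmap registry: an array indexed by port,
 * each entry holding a space-separated list of brick paths. */
#define PORT_MIN 49152
#define PORT_MAX 49156

struct pmap_entry {
    int         active;      /* a stale entry still looks "active" */
    const char *bricknames;  /* space-separated brick paths */
};

/* Reverse search: walk from the highest port down and return the first
 * entry whose brickname list contains the requested brick.  A stale
 * entry on a higher port therefore shadows the live registration. */
static int pmap_search(struct pmap_entry *ports, const char *brick)
{
    for (int p = PORT_MAX; p >= PORT_MIN; p--) {
        int idx = p - PORT_MIN;
        if (ports[idx].active && ports[idx].bricknames &&
            strstr(ports[idx].bricknames, brick))
            return p;
    }
    return -1;
}

int main(void)
{
    struct pmap_entry ports[PORT_MAX - PORT_MIN + 1] = {0};

    /* live registration of the brick */
    ports[49152 - PORT_MIN] = (struct pmap_entry){1, "/gluster/brick1/b1"};
    /* stale entry left behind on a higher port (transport already gone) */
    ports[49153 - PORT_MIN] = (struct pmap_entry){1, "/gluster/brick1/b1 /gluster/brick2/b1"};

    /* the client's port query gets the stale port 49153 back */
    printf("resolved port: %d\n", pmap_search(ports, "/gluster/brick1/b1"));
    return 0;
}
```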

I've not yet root-caused how the stale entries remained.
The only path I can see that leads to the situation above (a pmap_entry with xprt = NULL, but still carrying brickname entries and type = BRICKSERVER) is glusterd_rpcsvc_notify calling pmap_registry_remove on a DISCONNECT event when the brick process stops.

Normally, a glusterd_rpcsvc_notify DISCONNECT event should have had a related __glusterd_brick_rpc_notify DISCONNECT event, which in turn would also have called pmap_registry_remove, but with brick_disconnect=true, causing forceful removal of the pmap_entry. In this case that doesn't seem to have happened, and I don't know why.

glusterd_rpcsvc_notify calls pmap_registry_remove with brick_disconnect=false. This leads to pmap_registry_remove setting pmap_entry.xprt to NULL, but it does not reset pmap_entry.type or clean up pmap_entry.brickname while the latter still has entries, leaving the stale pmap_entry behind.

Ensuring that glusterd_rpcsvc_notify calls pmap_registry_remove with brick_disconnect=true (which it should have been doing already) should fix the observed problem.
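
A minimal sketch of the two removal paths described above, with simplified types and illustrative names (the real pmap_registry_remove differs in detail):

```
#include <stdio.h>
#include <string.h>

/* Illustrative-only model of a pmap entry. */
struct entry {
    int   type;            /* 4 == GF_PMAP_PORT_BRICKSERVER in the dump */
    char  bricknames[256]; /* space-separated brick paths */
    void *xprt;            /* transport pointer */
};

/* With brick_disconnect=0 only the transport is dropped: the brickname
 * list and type survive, which is exactly the stale state seen in gdb.
 * With brick_disconnect=1 the brick's name is blanked out as well, so
 * the entry can eventually be reset once it holds no names. */
static void registry_remove(struct entry *e, const char *brick,
                            int brick_disconnect)
{
    e->xprt = NULL;
    if (brick_disconnect) {
        char *pos = strstr(e->bricknames, brick);
        if (pos)
            memset(pos, ' ', strlen(brick)); /* blank the brick name */
    }
}

int main(void)
{
    int dummy_xprt = 0;
    struct entry e = {4, "/gluster/brick1/b1 /gluster/brick2/b1", &dummy_xprt};

    registry_remove(&e, "/gluster/brick1/b1", 0); /* stale entry remains */
    printf("brick_disconnect=0: type=%d brickname=\"%s\"\n", e.type, e.bricknames);

    registry_remove(&e, "/gluster/brick1/b1", 1); /* name blanked out */
    printf("brick_disconnect=1: type=%d brickname=\"%s\"\n", e.type, e.bricknames);
    return 0;
}
```

With brick_disconnect=0 the brickname string survives intact, matching the gdb output above (including the runs of blank characters where names were later removed); with brick_disconnect=1 the name is cleared, so no stale entry is left to shadow the live port.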

Comment 16 Atin Mukherjee 2018-05-11 03:50:42 UTC
Sanju - Can you please check if commit e2ee5cb fixes this problem?

Comment 17 Atin Mukherjee 2018-05-14 05:18:05 UTC
Have we done any SIGKILL of the brick process i.e. kill -9 <brick process> during this testing?

Comment 18 Manisha Saini 2018-05-14 06:20:42 UTC
(In reply to Atin Mukherjee from comment #17)
> Have we done any SIGKILL of the brick process i.e. kill -9 <brick process>
> during this testing?

No. It was a freshly created volume without any brick kill operation.

Comment 19 Kaushal 2018-05-15 06:29:43 UTC
The problem is not with the current/new volume, but with the previous volume. How was it stopped?

Comment 20 Atin Mukherjee 2018-05-15 09:00:28 UTC
I went through all the code paths and can confirm this can happen only when a brick is SIGKILLed. Kaushal's analysis, although technically correct, describes a case I don't think we can actually land in.

As discussed with Manisha yesterday, we need a proper reproducer of this problem, without which nothing can be done here; hence Manisha is going to try the scenario out and update the bug.

Comment 28 Manisha Saini 2018-07-27 09:55:06 UTC
Verified this with:

# rpm -qa | grep gluster
glusterfs-3.12.2-14.el7rhgs.x86_64
glusterfs-ganesha-3.12.2-14.el7rhgs.x86_64
gluster-nagios-addons-0.2.10-2.el7rhgs.x86_64
glusterfs-fuse-3.12.2-14.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
python2-gluster-3.12.2-14.el7rhgs.x86_64
nfs-ganesha-gluster-2.5.5-8.el7rhgs.x86_64
glusterfs-client-xlators-3.12.2-14.el7rhgs.x86_64
glusterfs-server-3.12.2-14.el7rhgs.x86_64
glusterfs-rdma-3.12.2-14.el7rhgs.x86_64
vdsm-gluster-4.19.43-2.3.el7rhgs.noarch
glusterfs-cli-3.12.2-14.el7rhgs.x86_64
glusterfs-libs-3.12.2-14.el7rhgs.x86_64
glusterfs-geo-replication-3.12.2-14.el7rhgs.x86_64
glusterfs-api-3.12.2-14.el7rhgs.x86_64
glusterfs-debuginfo-3.12.2-14.el7rhgs.x86_64
libvirt-daemon-driver-storage-gluster-3.9.0-14.el7_5.6.x86_64


Tried creating/deleting the volume multiple times. I am unable to hit this issue. Moving this BZ to the verified state.

Comment 30 errata-xmlrpc 2018-09-04 06:42:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607