Bug 1309215 - Mgmt-path-SSL-enabled-cluster ends in disconnected state after multiple 'socket poller: error in polling loop' errors
Status: CLOSED WORKSFORME
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: core
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Assigned To: Mohit Agrawal
QA Contact: storage-qa-internal@redhat.com
Keywords: ZStream
Depends On:
Blocks:
Reported: 2016-02-17 04:02 EST by Sweta Anandpara
Modified: 2018-02-06 23:26 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-02-06 23:26:27 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Sweta Anandpara 2016-02-17 04:02:26 EST
Description of problem:
Had a 2-node cluster. Enabled SSL on the management path by following the steps in https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html/Administration_Guide/ch09s03.html. Ran the tiering automation suite, which consists of 30-odd test cases. After the first 6 test cases ran successfully, 'gluster pool list' showed one of the nodes as disconnected, and every subsequent test case failed as a result. Peer probe also fails.

Multiple socket_poller errors are seen in the logs:

[2016-02-16 16:51:01.564991] I [socket.c:347:ssl_setup_connection] 0-socket.management: peer CN = dhcp42-245
[2016-02-16 16:51:01.566302] E [socket.c:2501:socket_poller] 0-socket.management: error in polling loop
[2016-02-16 16:51:03.543447] I [socket.c:347:ssl_setup_connection] 0-socket.management: peer CN = dhcp42-217
[2016-02-16 16:51:03.549040] E [socket.c:2501:socket_poller] 0-socket.management: error in polling loop
[2016-02-16 16:51:26.210562] I [socket.c:347:ssl_setup_connection] 0-socket.management: peer CN = client81
[2016-02-16 16:51:26.285161] I [socket.c:347:ssl_setup_connection] 0-socket.management: peer CN = client81
[2016-02-16 16:51:26.289947] E [socket.c:2501:socket_poller] 0-socket.management: error in polling loop
[2016-02-16 16:52:27.766917] E [socket.c:2501:socket_poller] 0-socket.management: error in polling loop
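
To quantify these errors over a longer log, a minimal Python sketch along the following lines could tally the 'error in polling loop' entries and the peer CNs seen on the management socket. It assumes the log line layout shown in the excerpt above; the function name summarize_poller_errors and the sample input are illustrative and not part of GlusterFS.

```python
import re
from collections import Counter

# Layout assumed from the excerpts in this report:
# [timestamp] LEVEL [optional MSGID] [file:line:function] domain: message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*\d+\]\s+)?"
    r"\[(?P<src>[^\]]+)\]\s+(?P<domain>\S+):\s+(?P<msg>.*)$"
)

PEER_PREFIX = "peer CN = "


def summarize_poller_errors(lines):
    """Return (count of polling-loop errors, Counter of peer CNs)."""
    errors = 0
    peers = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if match.group("level") == "E" and "error in polling loop" in msg:
            errors += 1
        elif msg.startswith(PEER_PREFIX):
            peers[msg[len(PEER_PREFIX):]] += 1
    return errors, peers


# Sample lines copied from the excerpt above.
sample = [
    "[2016-02-16 16:51:01.564991] I [socket.c:347:ssl_setup_connection] 0-socket.management: peer CN = dhcp42-245",
    "[2016-02-16 16:51:01.566302] E [socket.c:2501:socket_poller] 0-socket.management: error in polling loop",
    "[2016-02-16 16:51:03.543447] I [socket.c:347:ssl_setup_connection] 0-socket.management: peer CN = dhcp42-217",
    "[2016-02-16 16:51:03.549040] E [socket.c:2501:socket_poller] 0-socket.management: error in polling loop",
]

errors, peers = summarize_poller_errors(sample)
print(errors, dict(peers))
```

In practice the lines would come from the glusterd log file on each node rather than from an inline list.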


Version-Release number of selected component (if applicable):
glusterfs-3.7.5-19.el7rhgs.x86_64

How reproducible: 2/2

Additional info:

[root@dhcp42-245 ~]# rpm -qa | grep gluster
glusterfs-libs-3.7.5-19.el7rhgs.x86_64
python-gluster-3.7.5-19.el7rhgs.noarch
glusterfs-3.7.5-19.el7rhgs.x86_64
glusterfs-api-3.7.5-19.el7rhgs.x86_64
glusterfs-fuse-3.7.5-19.el7rhgs.x86_64
glusterfs-rdma-3.7.5-19.el7rhgs.x86_64
gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64
glusterfs-geo-replication-3.7.5-19.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
gluster-nagios-common-0.2.3-1.el7rhgs.noarch
glusterfs-client-xlators-3.7.5-19.el7rhgs.x86_64
glusterfs-cli-3.7.5-19.el7rhgs.x86_64
glusterfs-server-3.7.5-19.el7rhgs.x86_64
[root@dhcp42-245 ~]# 
[root@dhcp42-245 ~]# 
[root@dhcp42-245 ~]# gluster peer status
Number of Peers: 1

Hostname: 10.70.42.217
Uuid: 1c9025bb-9a31-445d-909d-9f8a866c7934
State: Peer in Cluster (Connected)
[root@dhcp42-245 ~]# 
[root@dhcp42-245 ~]# cd /etc/ssl
[root@dhcp42-245 ssl]# ll
total 12
lrwxrwxrwx. 1 root root   16 Feb 16 15:35 certs -> ../pki/tls/certs
-rw-r--r--. 1 root root 3288 Feb 16 16:29 glusterfs.ca
-rw-r--r--. 1 root root 1675 Feb 16 16:16 glusterfs.key
-rw-r--r--. 1 root root 1099 Feb 16 16:17 glusterfs.pem
[root@dhcp42-245 ssl]# 
[root@dhcp42-245 ssl]# ll /var/lib/glusterd/secure-access 
-rw-r--r--. 1 root root 0 Feb 16 16:21 /var/lib/glusterd/secure-access
[root@dhcp42-245 ssl]# 
[root@dhcp42-245 ssl]# 
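
For an automation suite that needs to notice the disconnect early, a rough Python sketch like the one below could parse the 'gluster peer status' output shown above and report peers that are not connected. The helper names parse_peer_status and disconnected_peers are hypothetical, and the "(Disconnected)" state string in the usage line is an assumption about what the CLI prints for a dropped peer.

```python
def parse_peer_status(text):
    """Parse `gluster peer status` output into a list of dicts."""
    peers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Hostname:"):
            # Each peer record starts with a Hostname line.
            current = {"hostname": line.split(":", 1)[1].strip()}
            peers.append(current)
        elif current is not None and ":" in line:
            key, value = line.split(":", 1)
            current[key.strip().lower()] = value.strip()
    return peers


def disconnected_peers(peers):
    """Return hostnames whose State does not end in (Connected)."""
    return [
        p["hostname"]
        for p in peers
        if "(Connected)" not in p.get("state", "")
    ]


# Output copied from the dhcp42-245 session above.
status_output = """Number of Peers: 1

Hostname: 10.70.42.217
Uuid: 1c9025bb-9a31-445d-909d-9f8a866c7934
State: Peer in Cluster (Connected)
"""

print(disconnected_peers(parse_peer_status(status_output)))
```

Running the same check between test cases, with the text obtained from the gluster CLI, would pinpoint which test case precedes the disconnect.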



[root@dhcp42-217 ~]# 
[root@dhcp42-217 ~]# cd /etc/ssl
[root@dhcp42-217 ssl]# ll
total 12
lrwxrwxrwx. 1 root root   16 Feb 16 15:36 certs -> ../pki/tls/certs
-rw-r--r--. 1 root root 3288 Feb 16 16:29 glusterfs.ca
-rw-r--r--. 1 root root 1675 Feb 16 16:16 glusterfs.key
-rw-r--r--. 1 root root 1099 Feb 16 16:17 glusterfs.pem
[root@dhcp42-217 ssl]# 
[root@dhcp42-217 ssl]# ll /var/lib/glusterd/secure-access 
-rw-r--r--. 1 root root 0 Feb 16 16:21 /var/lib/glusterd/secure-access
[root@dhcp42-217 ssl]# 



[root@client81 mnt]# ll /etc/ssl/
total 12
lrwxrwxrwx. 1 root root   16 Dec 14 17:49 certs -> ../pki/tls/certs
-rw-r--r--. 1 root root 3288 Feb 16 21:33 glusterfs.ca
-rw-r--r--. 1 root root 1679 Feb 16 21:31 glusterfs.key
-rw-r--r--. 1 root root 1090 Feb 16 21:32 glusterfs.pem
[root@client81 mnt]# 
[root@client81 mnt]# 
[root@client81 mnt]# ll /var/lib/glusterd/secure-access 
-rw-r--r--. 1 root root 0 Feb 16 19:55 /var/lib/glusterd/secure-access
[root@client81 mnt]# 
[root@client81 mnt]# 
[root@client81 mnt]# ll
total 32
drwxr-xr-x. 4 root root 32768 Feb 17 14:26 glusterfs
[root@client81 mnt]# 
[root@client81 mnt]# 
[root@client81 mnt]# df -k glusterfs/
Filesystem            1K-blocks    Used Available Use% Mounted on
10.70.42.245:/testvol 104857600 2433792 102423808   3% /mnt/glusterfs
[root@client81 mnt]# 
[root@client81 mnt]# 
[root@client81 mnt]# 


[2016-02-16 16:51:01.337169] I [socket.c:347:ssl_setup_connection] 0-socket.management: peer CN = dhcp42-245
[2016-02-16 16:51:01.398896] I [MSGID: 106143] [glusterd-pmap.c:229:pmap_registry_bind] 0-pmap: adding brick /bricks/brick0/test on port 49152
[2016-02-16 16:51:01.400743] I [rpc-clnt.c:986:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2016-02-16 16:51:01.400972] I [socket.c:3931:socket_init] 0-management: SSL support for glusterd is ENABLED
[2016-02-16 16:51:01.401087] E [socket.c:4009:socket_init] 0-management: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled
[2016-02-16 16:51:01.415991] I [rpc-clnt.c:986:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2016-02-16 16:51:01.416087] I [socket.c:3931:socket_init] 0-snapd: SSL support for glusterd is ENABLED
[2016-02-16 16:51:01.416183] E [socket.c:4009:socket_init] 0-snapd: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled
[2016-02-16 16:51:01.416844] I [rpc-clnt.c:986:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2016-02-16 16:51:01.416922] I [socket.c:3931:socket_init] 0-nfs: SSL support for glusterd is ENABLED
[2016-02-16 16:51:01.416999] E [socket.c:4009:socket_init] 0-nfs: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled
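
The file listings and the dhparam.pem messages above suggest a simple per-node sanity check. The Python sketch below is one possible way to do it, using only the paths that appear in this report. Treating /etc/ssl/dhparam.pem as optional is an assumption based on the log message saying only that DH ciphers are disabled; the function name check_tls_files and its root parameter are illustrative. The check also flags a private key readable by group or other, as seen in the listings (mode -rw-r--r--), although whether that is related to this bug is not established.

```python
import os
import stat

# Paths taken from the listings and log excerpts in this report.
REQUIRED = [
    "/etc/ssl/glusterfs.pem",
    "/etc/ssl/glusterfs.key",
    "/etc/ssl/glusterfs.ca",
    "/var/lib/glusterd/secure-access",
]
# Assumption: a missing dhparam.pem only disables DH ciphers.
OPTIONAL = ["/etc/ssl/dhparam.pem"]


def check_tls_files(root="/"):
    """Return a list of (path, finding) tuples for the given root."""
    findings = []
    for rel in REQUIRED + OPTIONAL:
        path = os.path.join(root, rel.lstrip("/"))
        if not os.path.isfile(path):
            kind = "missing" if rel in REQUIRED else "missing (optional)"
            findings.append((rel, kind))
            continue
        mode = stat.S_IMODE(os.stat(path).st_mode)
        # Flag a private key that group or other can read.
        if rel.endswith(".key") and mode & (stat.S_IRGRP | stat.S_IROTH):
            findings.append(
                (rel, "private key readable by group/other (mode %o)" % mode)
            )
    return findings


if __name__ == "__main__":
    for path, finding in check_tls_files():
        print(path, "->", finding)
```

The root parameter exists so the check can be pointed at an unpacked sosreport tree instead of the live filesystem.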


The log files will be updated in http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/<bugnumber>/
Comment 2 Kaushal 2016-02-23 03:56:52 EST
I suspect this to be similar to the other RPC connection issues we're seeing in GlusterD. I'll go through the logs and update if I find anything different.
Comment 6 Amar Tumballi 2018-02-06 23:26:27 EST
We have noticed that the bug does not reproduce in the latest version of the product (RHGS-3.3.1+).

If the bug is still relevant and can be reproduced, feel free to reopen it.
