Bug 1412930 - [SSL] - when a node or glusternw is down all the clients logs are flooded with SSL connect error and client setup failed messages
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: core
Version: 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: RHGS 3.3.0
Assigned To: Mohit Agrawal
QA Contact: Bala Konda Reddy M
ssl
Depends On:
Blocks: Gluster-HC-2 1417147 1433896
Reported: 2017-01-13 02:13 EST by RamaKasturi
Modified: 2017-09-21 00:56 EDT

See Also:
Fixed In Version: glusterfs-3.8.4-19
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-09-21 00:30:55 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2017:2774
Priority: normal
Status: SHIPPED_LIVE
Summary: glusterfs bug fix and enhancement update
Last Updated: 2017-09-21 04:16:29 EDT

Description RamaKasturi 2017-01-13 02:13:46 EST
Description of problem:
With SSL enabled on the volumes, bring down one of the nodes or the glusternw in the cluster. Until the node or glusternw comes back up, the client logs are flooded with the messages below.

[2017-01-13 06:28:09.111117] E [socket.c:3135:socket_connect] 0-engine-client-0: connection attempt on 10.70.36.79:49153 failed, (No route to host)
[2017-01-13 06:28:09.111325] E [socket.c:353:ssl_setup_connection] 0-engine-client-0: SSL connect error (client: 10.70.36.79:49153)
[2017-01-13 06:28:09.111344] E [socket.c:2443:socket_poller] 0-engine-client-0: client setup failed
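
To quantify the flooding, the three entries above can be counted per logging function. The following is a minimal sketch in Python, assuming the log layout is `[timestamp] LEVEL [file:line:function] domain: message` as it appears in this report. The pattern `LOG_LINE` and the helper `count_by_function` are hypothetical names used for illustration and are not part of GlusterFS.

```python
import re
from collections import Counter

# Layout assumed from the excerpt in this report:
# [timestamp] LEVEL [optional MSGID] [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"(?:\[MSGID: (?P<msgid>\d+)\] )?"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)$"
)


def count_by_function(lines):
    """Count entries per (level, function); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["level"], match["func"])] += 1
    return counts


sample = [
    "[2017-01-13 06:28:09.111117] E [socket.c:3135:socket_connect] 0-engine-client-0: connection attempt on 10.70.36.79:49153 failed, (No route to host)",
    "[2017-01-13 06:28:09.111325] E [socket.c:353:ssl_setup_connection] 0-engine-client-0: SSL connect error (client: 10.70.36.79:49153)",
    "[2017-01-13 06:28:09.111344] E [socket.c:2443:socket_poller] 0-engine-client-0: client setup failed",
]

counts = count_by_function(sample)
for (level, func), n in sorted(counts.items()):
    print(level, func, n)
```

Running the same function over a full client log would show how many entries each of the three functions contributes per failed reconnect.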


Version-Release number of selected component (if applicable):
glusterfs-3.8.4-11.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Enable SSL on the volumes of a three-node cluster.
2. Bring down one of the nodes or the glusternw.

Actual results:
Client logs are flooded with the messages shown in the description until the node or network comes back up.

Expected results:
Client logs should record a single instance of the port being unreachable, and should not be flooded with these messages.
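
One way to picture the behaviour requested here is a small suppressor that logs the first occurrence of a message and drops repeats for a while. This is only an illustrative Python sketch, not how GlusterFS handles logging. The class `RepeatSuppressor`, its `interval` parameter, and the key layout are assumptions made for the example.

```python
import time


class RepeatSuppressor:
    """Allow the first message for a key, then at most one per interval."""

    def __init__(self, interval=60.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self._state = {}  # key -> [time of last emitted message, suppressed count]

    def should_log(self, key):
        """Return (emit, suppressed_since_last_emit) for this key."""
        now = self.clock()
        entry = self._state.get(key)
        if entry is None or now - entry[0] >= self.interval:
            suppressed = entry[1] if entry else 0
            self._state[key] = [now, 0]
            return True, suppressed
        entry[1] += 1
        return False, 0

    def reset(self, key):
        """Forget a key, for example once the connection is re-established."""
        self._state.pop(key, None)


# Simulate ten failed attempts, three seconds apart, using a fake clock.
fake_now = [0.0]
suppressor = RepeatSuppressor(interval=60.0, clock=lambda: fake_now[0])
key = ("10.70.36.79:49153", "No route to host")

emitted = []
for attempt in range(10):
    fake_now[0] = attempt * 3.0
    emit, _ = suppressor.should_log(key)
    if emit:
        emitted.append(attempt)

print(emitted)
```

Keying on the endpoint and error text means a different failure on the same endpoint is still logged immediately, which keeps the root cause visible while removing the repeats.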

Additional info:
Comment 3 Atin Mukherjee 2017-02-27 01:08:29 EST
upstream patch : https://review.gluster.org/16767
Comment 5 Atin Mukherjee 2017-03-24 05:57:24 EDT
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/101323/
Comment 7 Bala Konda Reddy M 2017-06-19 11:18:34 EDT
BUILD : 3.8.4-26

Followed the steps mentioned in the description.
1. When the node is down or glusterd is stopped, the following error messages still appear in the client logs:
[2017-06-19 15:08:14.321261] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-cross2-client-0: disconnected from cross2-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2017-06-19 15:08:17.327234] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:20.333123] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:26.343173] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:29.349299] E [socket.c:3219:socket_connect] 0-cross2-client-0: connection attempt on 10.70.37.135:49152 failed, (No route to host)
[2017-06-19 15:08:32.355294] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:35.361269] E [socket.c:3219:socket_connect] 0-cross2-client-0: connection attempt on 10.70.37.135:49152 failed, (No route to host)
[2017-06-19 15:08:38.367180] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:41.373326] E [socket.c:3219:socket_connect] 0-cross2-client-0: connection attempt on 10.70.37.135:49152 failed, (No route to host)

The following are glusterd logs from one node that is part of the trusted storage pool:

[2017-06-19 15:08:16.360551] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:22.370603] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:25.376630] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:31.386711] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:37.400693] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:40.406546] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:46.416718] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:49.422555] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:55.432719] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:58.439598] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)

Hence marking it as failed QA.
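
The retry cadence behind the glusterd excerpt in comment 7 can be read off the timestamps. The sketch below, in Python, computes the gap between consecutive `0-management` entries; the helper name `retry_intervals` is hypothetical, and the timestamps are copied from the excerpt.

```python
from datetime import datetime


def retry_intervals(timestamps):
    """Return the gaps, rounded to whole seconds, between consecutive timestamps."""
    parsed = [datetime.strptime(t, "%Y-%m-%d %H:%M:%S.%f") for t in timestamps]
    return [round((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]


# Timestamps of the 0-management socket_connect errors in comment 7.
stamps = [
    "2017-06-19 15:08:16.360551",
    "2017-06-19 15:08:22.370603",
    "2017-06-19 15:08:25.376630",
    "2017-06-19 15:08:31.386711",
    "2017-06-19 15:08:37.400693",
    "2017-06-19 15:08:40.406546",
    "2017-06-19 15:08:46.416718",
    "2017-06-19 15:08:49.422555",
    "2017-06-19 15:08:55.432719",
    "2017-06-19 15:08:58.439598",
]

intervals = retry_intervals(stamps)
print(intervals)
```

For this excerpt every gap is about three or six seconds, so the error is written several times per minute for as long as the peer stays unreachable.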
Comment 8 Bala Konda Reddy M 2017-06-19 11:23:00 EDT
The error messages on the client and server nodes are not related to SSL, but the logs are still flooded with the error messages mentioned in comment 7.
Comment 9 Mohit Agrawal 2017-06-19 11:32:55 EDT
Hi Bala,

  I fixed the problem only to avoid the SSL-specific errors, not the socket_connect errors.
  As you can see in the logs in the problem description, there were previously ssl_setup_connection errors after a failed socket_connect, and the patch resolves that.
  I did not change the log for the socket_connect error; I think the user needs it to know the root cause of the failure to establish the connection.

  IMO it is working as expected.

Regards
Mohit Agrawal
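
The behaviour described in comment 9 can be pictured as follows: when the TCP connect fails, the SSL setup is skipped, so only the connect error is logged. This is a hedged Python sketch of that control flow and not the actual patch, which is C code in GlusterFS and is not reproduced in this report. The function `setup_connection` and its callback parameters are hypothetical.

```python
def setup_connection(connect, tls_handshake, log):
    """Return True when the transport is ready; log one error per failure cause."""
    try:
        sock = connect()
    except OSError as exc:
        # The root cause is logged once; there is no socket to run SSL over.
        log(f"connection attempt failed ({exc.strerror or exc})")
        return False
    try:
        tls_handshake(sock)
    except OSError as exc:  # ssl.SSLError is a subclass of OSError
        log(f"SSL connect error ({exc})")
        return False
    return True


messages = []


def unreachable():
    # Stand-in for a connect call that fails because the peer is down.
    raise OSError(113, "No route to host")


def handshake(sock):
    # Would propagate if the handshake ran without a connected socket.
    raise AssertionError("handshake must not run without a socket")


ready = setup_connection(unreachable, handshake, messages.append)
print(ready, messages)
```

With this ordering a down node produces the connect error on each retry, which matches the logs in comment 7, but no follow-up SSL error.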
Comment 10 Atin Mukherjee 2017-06-19 11:36:35 EDT
Based on comment 9, moving this back to ON_QA
Comment 11 Mohit Agrawal 2017-06-19 11:38:46 EDT
Even without SSL you will see the same kind of error logs; these logs are not specific to SSL.



Regards
Mohit Agrawal
Comment 12 Bala Konda Reddy M 2017-06-19 22:55:17 EDT
As per Mohit's comment 9, the changes made are specific to SSL and these error messages are expected, hence marking this as verified.
Comment 14 errata-xmlrpc 2017-09-21 00:30:55 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774
Comment 15 errata-xmlrpc 2017-09-21 00:56:40 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2774
