Description of problem:
SSL is enabled on the volumes. Bring down one of the nodes, or its glusternw, in the cluster. Until the node or glusternw comes back up, the client logs are flooded with the messages below.

[2017-01-13 06:28:09.111117] E [socket.c:3135:socket_connect] 0-engine-client-0: connection attempt on 10.70.36.79:49153 failed, (No route to host)
[2017-01-13 06:28:09.111325] E [socket.c:353:ssl_setup_connection] 0-engine-client-0: SSL connect error (client: 10.70.36.79:49153)
[2017-01-13 06:28:09.111344] E [socket.c:2443:socket_poller] 0-engine-client-0: client setup failed

Version-Release number of selected component (if applicable):
glusterfs-3.8.4-11.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Enable SSL on the volumes of a three-node cluster.
2. Bring down one of the nodes or its glusternw.

Actual results:
The client logs are flooded with the messages shown in the description until the node or network comes back up.

Expected results:
The client logs should record a single instance of the failure to reach the port; they should not be flooded with these messages.

Additional info:
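The expected behaviour (one log entry per outage rather than one per retry) can be sketched as follows. This is a minimal illustration in Python, not GlusterFS code: the class name `RepeatSuppressor` and its methods are hypothetical, and it assumes that a message is identified by its logger name and text.

```python
class RepeatSuppressor:
    """Emit a message once per (name, text) until reset() is called for that name."""

    def __init__(self, emit=print):
        self._emit = emit   # callable that actually writes the log line
        self._seen = {}     # (name, text) -> number of suppressed repeats

    def log(self, name, text):
        """Write the message the first time only; return True if it was written."""
        key = (name, text)
        if key in self._seen:
            self._seen[key] += 1   # count the repeat, but do not write it
            return False
        self._seen[key] = 0
        self._emit(f"E {name}: {text}")
        return True

    def reset(self, name):
        """Forget messages for `name` (e.g. on reconnect); return suppressed count."""
        dropped = {k: v for k, v in self._seen.items() if k[0] == name}
        for k in dropped:
            del self._seen[k]
        return sum(dropped.values())
```

With such a wrapper, the first "No route to host" for a given brick would be written and later identical failures would only be counted until the connection is restored.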
upstream patch : https://review.gluster.org/16767
downstream patch : https://code.engineering.redhat.com/gerrit/#/c/101323/
BUILD : 3.8.4-26

Followed the steps mentioned in the description.

1. When the node is down or glusterd is stopped, the following error messages still appear in the client logs:

[2017-06-19 15:08:14.321261] I [MSGID: 114018] [client.c:2280:client_rpc_notify] 0-cross2-client-0: disconnected from cross2-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2017-06-19 15:08:17.327234] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:20.333123] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:26.343173] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:29.349299] E [socket.c:3219:socket_connect] 0-cross2-client-0: connection attempt on 10.70.37.135:49152 failed, (No route to host)
[2017-06-19 15:08:32.355294] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:35.361269] E [socket.c:3219:socket_connect] 0-cross2-client-0: connection attempt on 10.70.37.135:49152 failed, (No route to host)
[2017-06-19 15:08:38.367180] E [socket.c:3219:socket_connect] 0-glusterfs: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:41.373326] E [socket.c:3219:socket_connect] 0-cross2-client-0: connection attempt on 10.70.37.135:49152 failed, (No route to host)

The following are the glusterd logs of one node that is part of the trusted storage pool:

[2017-06-19 15:08:16.360551] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:22.370603] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:25.376630] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:31.386711] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:37.400693] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:40.406546] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:46.416718] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:49.422555] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:55.432719] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)
[2017-06-19 15:08:58.439598] E [socket.c:3219:socket_connect] 0-management: connection attempt on 10.70.37.135:24007 failed, (No route to host)

Hence marking this as failed QA.
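To quantify the flooding in logs like the ones above, repeated entries can be grouped and the gaps between them measured. The following is a rough Python sketch, not a GlusterFS tool: the function name `summarize` and the regular expression are assumptions based only on the line format visible in this report (timestamp, level, optional MSGID, source location, logger name, message).

```python
import re
from collections import Counter
from datetime import datetime

# Pattern inferred from the log lines in this report; other formats are skipped.
LINE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\] "
    r"(?P<level>[A-Z]) "
    r"(?:\[MSGID: \d+\] )?"
    r"\[(?P<src>[^\]]+)\] "
    r"(?P<name>[^:]+): (?P<msg>.*)$"
)

def summarize(lines):
    """Return (counts, gaps) keyed by (level, logger name, message).

    counts: how many times each message occurred.
    gaps:   whole seconds between consecutive occurrences of each message.
    """
    counts = Counter()
    stamps = {}
    for line in lines:
        m = LINE.match(line.strip())
        if not m:
            continue  # not a recognised log line
        key = (m["level"], m["name"], m["msg"])
        counts[key] += 1
        ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S.%f")
        stamps.setdefault(key, []).append(ts)
    gaps = {
        key: [round((b - a).total_seconds()) for a, b in zip(ts, ts[1:])]
        for key, ts in stamps.items()
    }
    return counts, gaps
```

Feeding it the lines of a client or glusterd log (for example via `open(path)`, with a path of your choosing) gives a per-message count and the retry spacing, which makes it easy to compare builds before and after a fix.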
(In reply to Bala Konda Reddy M from comment #7)

The error messages on the client and server nodes are not related to SSL, but the logs are still flooded with the error messages quoted in comment 7.
Hi Bala,

I fixed the problem only to avoid the SSL-specific errors, not the socket_connect errors. As you can see in the logs in the problem description, earlier there were ssl_setup_connection errors after each failed socket_connect, so I resolved that issue with the patch. I did not change the log message specific to the socket_connect error; I think the user needs it to know the root cause of the failure to establish the connection.

IMO it is working as expected.

Regards
Mohit Agrawal
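The distinction drawn above (the socket_connect error is still logged, but no SSL error follows it) can be illustrated with a small sketch. This is not the code from the patch, whose contents are not shown in this report. The function name `establish` and the injected `tcp_connect`, `tls_handshake` and `log` callables are hypothetical, and the sketch simply assumes that the SSL step is skipped once the TCP connect has failed.

```python
def establish(tcp_connect, tls_handshake, log):
    """Return True when both the TCP connect and the TLS handshake succeed.

    tcp_connect:   callable returning a connected object, raising OSError on failure.
    tls_handshake: callable taking that object, raising OSError on failure.
    log:           callable taking one string.
    """
    try:
        conn = tcp_connect()
    except OSError as err:
        # Root cause: report it, and do not attempt (or log) the SSL step.
        log(f"connection attempt failed, ({err.strerror or err})")
        return False
    try:
        tls_handshake(conn)
    except OSError as err:  # ssl.SSLError is a subclass of OSError
        log(f"SSL connect error ({err})")
        return False
    return True
```

Passing the two steps in as callables keeps the ordering logic testable without a network; in real use they could wrap `socket.create_connection` and `ssl.SSLContext.wrap_socket`.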
Based on comment 9, moving this back to ON_QA
Even in the non-SSL case you will see the same kind of error logs; these logs are not specific to SSL.

Regards
Mohit Agrawal
(In reply to Mohit Agrawal from comment #9)

As per Mohit's comment, the changes were made only for the SSL-specific errors and these error messages are expected, hence marking this as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:2774