Description of problem:
The glustershd process crashes intermittently on different nodes in the cluster.

Version-Release number of selected component (if applicable):
RHGS 3.2

How reproducible:
Happens intermittently in the customer environment.

Actual results:
glustershd crashes.

Expected results:
glustershd should not crash.

Additional info:
- Cluster/volume-specific information is in the next comment.
- The complete backtrace from the crash dump will be attached to the BZ as 'gdb.txt'.
Hi,

The shd log contains the messages below. They show ssl_setup_connection reporting an SSL connect error, but the underlying failure is that socket_connect gets "Connection refused" from the glusterd side, possibly because the other endpoint is not available.

>>>>>>>>>>>>>>>>>
[2017-06-08 00:32:43.786637] E [socket.c:3142:socket_connect] 0-glusterfs: connection attempt on failed, (Connection refused)
[2017-06-08 00:32:43.786851] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-06-08 00:32:43.786890] E [socket.c:358:ssl_setup_connection] 0-glusterfs: SSL connect error (client: )
[2017-06-08 00:32:43.786912] E [socket.c:2447:socket_poller] 0-glusterfs: client setup failed
[2017-06-08 00:32:43.786955] E [glusterfsd-mgmt.c:1928:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport endpoint is not connected)
>>>>>>>>>>>>>>>>>>>>>>>

We have already fixed this problem in the SSL code: it now tries to establish the SSL connection with the peer only when socket_connect succeeds.

Downstream, the issue is fixed by the patch below:
https://code.engineering.redhat.com/gerrit/#/c/106256/

Regards
Mohit Agrawal
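
For readers following the thread, the ordering described above (attempt the SSL handshake only after the TCP connect has succeeded) can be illustrated with a small sketch. This is not the glusterfs code and not the referenced patch. It is a minimal Python illustration using the standard socket and ssl modules, and the names connect_then_handshake and unused_local_port are hypothetical helpers introduced only for this example.

```python
import socket
import ssl


def connect_then_handshake(host, port, context=None, timeout=2.0):
    """Attempt the TLS handshake only if the TCP connect succeeded.

    Returns a (stage, detail) tuple, where stage is one of
    "connect_failed", "handshake_failed" or "connected".
    Hypothetical helper for illustration; not a glusterfs API.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # TCP connect failed (for example, connection refused).
        # Stop here: no SSL setup is attempted on a dead socket.
        return ("connect_failed", exc)

    context = context or ssl.create_default_context()
    try:
        # Only reached when the TCP connection is established.
        tls = context.wrap_socket(sock, server_hostname=host)
    except (ssl.SSLError, OSError) as exc:
        sock.close()
        return ("handshake_failed", exc)
    return ("connected", tls)


def unused_local_port():
    """Pick a loopback port that currently has no listener (hypothetical helper)."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.bind(("127.0.0.1", 0))
    port = probe.getsockname()[1]
    probe.close()
    return port


if __name__ == "__main__":
    stage, detail = connect_then_handshake("127.0.0.1", unused_local_port())
    print(stage, detail)
```

With no listener on the chosen port, the sketch reports the connect failure and never reaches the handshake step, so the caller sees the refused connection rather than a secondary SSL connect error.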
*** Bug 1478010 has been marked as a duplicate of this bug. ***
*** Bug 1499666 has been marked as a duplicate of this bug. ***