Bug 1460245

Summary: [GSS]Glustershd process crashes intermittently on SSL enabled volume in RHGS 3.2
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Riyas Abdulrasak <rnalakka>
Component: core
Assignee: Ravishankar N <ravishankar>
Status: CLOSED NEXTRELEASE
QA Contact: Rahul Hinduja <rhinduja>
Severity: high
Docs Contact:
Priority: unspecified
Version: rhgs-3.2
CC: amukherj, moagrawa, nbalacha, nchilaka, ravishankar, rgowdapp, rhs-bugs, rnalakka, sreber, storage-qa-internal
Target Milestone: ---
Keywords: Reopened, ZStream
Target Release: ---
Hardware: All
OS: All
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-08-20 02:41:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1596513, 1597229, 1597230
Bug Blocks:

Description Riyas Abdulrasak 2017-06-09 13:14:34 UTC
Description of problem:

The glustershd process crashes intermittently on different nodes in the cluster.

Version-Release number of selected component (if applicable):

RHGS 3.2

How reproducible:

Happens intermittently in the customer environment.


Actual results:

glustershd crashes

Expected results:

glustershd should not crash. 


Additional info:

- Cluster/volume-specific information is in the next comment
- The complete bt of the crashdump will be attached to the bz as 'gdb.txt'

Comment 19 Mohit Agrawal 2017-06-28 16:16:28 UTC
Hi,

shd is logging the messages below. They show ssl_setup_connection reporting an SSL connect error, but the underlying failure is that socket_connect gets "Connection refused" from the glusterd side, possibly because the other endpoint is not available.

>>>>>>>>>>>>>>>>>

[2017-06-08 00:32:43.786637] E [socket.c:3142:socket_connect] 0-glusterfs: connection attempt on  failed, (Connection refused)
[2017-06-08 00:32:43.786851] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-06-08 00:32:43.786890] E [socket.c:358:ssl_setup_connection] 0-glusterfs: SSL connect error (client: )
[2017-06-08 00:32:43.786912] E [socket.c:2447:socket_poller] 0-glusterfs: client setup failed
[2017-06-08 00:32:43.786955] E [glusterfsd-mgmt.c:1928:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport endpoint is not connected)

>>>>>>>>>>>>>>>>>>>>>>>

We have already fixed this problem in the SSL code: it now tries to establish the SSL connection with the peer only when socket_connect succeeds.

Downstream, the issue is fixed by the patch below:
 https://code.engineering.redhat.com/gerrit/#/c/106256/
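
To illustrate the ordering described above, here is a minimal sketch in Python. It is not the actual patch and not the Gluster code. The function name connect_then_handshake and the handshake callable are hypothetical, chosen only to show that the SSL step runs solely when the TCP connect has succeeded.

```python
import socket


def connect_then_handshake(host, port, handshake, timeout=2.0):
    """Run `handshake` only after the TCP connect has succeeded.

    Returns (connection, stage), where stage is "ok", "connect-failed",
    or "handshake-failed". `handshake` is a hypothetical callable that
    takes the connected socket and returns the secured connection.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except OSError:
        # For example "Connection refused": report a connect failure and
        # never enter the SSL layer.
        sock.close()
        return None, "connect-failed"
    try:
        return handshake(sock), "ok"
    except OSError:
        # ssl.SSLError is a subclass of OSError, so TLS failures land here.
        sock.close()
        return None, "handshake-failed"
```

With the standard library, the handshake argument could be something like lambda s: ssl.create_default_context().wrap_socket(s, server_hostname=host). With this ordering, a refused connection is reported as a connect failure and never surfaces as an SSL connect error.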


Regards
Mohit Agrawal

Comment 22 Ravishankar N 2017-08-24 09:15:01 UTC
*** Bug 1478010 has been marked as a duplicate of this bug. ***

Comment 23 Simon Reber 2017-10-11 14:44:56 UTC
*** Bug 1499666 has been marked as a duplicate of this bug. ***