Bug 1378528
| Summary: | [SSL] glustershd disconnected from glusterd | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | SATHEESARAN <sasundar> | ||||||
| Component: | replicate | Assignee: | Mohit Agrawal <moagrawa> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | SATHEESARAN <sasundar> | ||||||
| Severity: | urgent | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | rhgs-3.2 | CC: | amukherj, asrivast, moagrawa, rcyriac, rhinduja, rhs-bugs, storage-qa-internal | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | RHGS 3.2.0 | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | glusterfs-3.8.4-2 | Doc Type: | If docs needed, set a value | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | 
       RHEV-RHGS-HCI
RHEL 7.2 
 | 
        |||||||
| Last Closed: | 2017-03-23 05:48:36 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 1351528 | ||||||||
| Attachments: | 
            
  | 
      ||||||||
| 
 
        
          Description
        
        
          SATHEESARAN
        
        
        
        
        
          2016-09-22 16:23:43 UTC
        
       
      
      
      
    Error messages in glusterd.log <snip> [2016-09-22 16:00:47.345717] I [MSGID: 106006] [glusterd-svc-mgmt.c:327:glusterd_svc_common_rpc_notify] 0-management: glustershd has disconnected from glusterd. [2016-09-22 16:02:23.354480] W [socket.c:590:__socket_rwv] 0-glustershd: readv on /var/run/gluster/6f50c5e5a1fe717d1cea6d3d9944edd4.socket failed (Invalid argument) The message "I [MSGID: 106006] [glusterd-svc-mgmt.c:327:glusterd_svc_common_rpc_notify] 0-management: glustershd has disconnected from glusterd." repeated 39 times between [2016-09-22 16:00:47.345717] and [2016- 09-22 16:02:44.356399] [2016-09-22 16:02:47.356743] I [MSGID: 106006] [glusterd-svc-mgmt.c:327:glusterd_svc_common_rpc_notify] 0-management: glustershd has disconnected from glusterd. [2016-09-22 16:04:13.854091] E [socket.c:353:ssl_setup_connection] 0-socket.management: SSL connect error (client: ) [2016-09-22 16:04:13.854170] E [socket.c:202:ssl_dump_error_stack] 0-socket.management: error:1408F10B:SSL routines:SSL3_GET_RECORD:wrong version number [2016-09-22 16:04:13.854195] E [socket.c:2419:socket_poller] 0-socket.management: server setup failed [2016-09-22 16:04:29.366244] W [socket.c:590:__socket_rwv] 0-glustershd: readv on /var/run/gluster/6f50c5e5a1fe717d1cea6d3d9944edd4.socket failed (Invalid argument) The message "I [MSGID: 106006] [glusterd-svc-mgmt.c:327:glusterd_svc_common_rpc_notify] 0-management: glustershd has disconnected from glusterd." repeated 39 times between [2016-09-22 16:02:47.356743] and [2016- 09-22 16:04:44.367580] [2016-09-22 16:04:47.367856] I [MSGID: 106006] [glusterd-svc-mgmt.c:327:glusterd_svc_common_rpc_notify] 0-management: glustershd has disconnected from glusterd. [2016-09-22 16:05:14.426256] I [MSGID: 106499] [glusterd-handler.c:4360:__glusterd_handle_status_volume] 0-management: Received status volume req for volume engine </snip> Error in glustershd.log <snip> [2016-09-22 16:09:16.412011] I [MSGID: 100030] [glusterfsd.c:2412:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.8.4 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/6f50c5e5a1fe717d1cea6d3d9944edd4.socket --xlator-option *replicate*.node-uuid=aec6ef55-7a72-4a14-b17e-fab690817e8c) [2016-09-22 16:09:16.420485] I [socket.c:3974:socket_init] 0-socket.glusterfsd: SSL support for glusterd is ENABLED [2016-09-22 16:09:16.420888] E [socket.c:4052:socket_init] 0-socket.glusterfsd: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled [2016-09-22 16:09:16.421894] I [socket.c:3974:socket_init] 0-glusterfs: SSL support for glusterd is ENABLED [2016-09-22 16:09:16.422010] E [socket.c:4052:socket_init] 0-glusterfs: failed to open /etc/ssl/dhparam.pem, DH ciphers are disabled [2016-09-22 16:09:16.430991] E [socket.c:3060:socket_connect] 0-glusterfs: connection attempt on failed, (Connection refused) [2016-09-22 16:09:16.431141] I [MSGID: 101190] [event-epoll.c:628:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2016-09-22 16:09:16.431302] W [socket.c:590:__socket_rwv] 0-glusterfs: writev on ::1:24007 failed (Success) [2016-09-22 16:09:16.431885] E [rpc-clnt.c:365:saved_frames_unwind] (--> /lib64/libglusterfs.so.0(_gf_log_callingfn+0x192)[0x7f76666dd2a2] (--> /lib64/libgfrpc.so.0(saved_frames_unwind+0x1de)[0x7f76664a37fe] (--> /lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7f76664a390e] (--> /lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x84)[0x7f76664a5064] (--> /lib64/libgfrpc.so.0(rpc_clnt_notify+0x120)[0x7f76664a5940] ))))) 0-glusterfs: forced unwinding frame type(GlusterFS Handshake) op(GETSPEC(2)) called at 2016-09-22 16:09:16.431317 (xid=0x1) [2016-09-22 16:09:16.431927] E [glusterfsd-mgmt.c:1706:mgmt_getspec_cbk] 0-mgmt: failed to fetch volume file (key:gluster/glustershd) [2016-09-22 16:09:16.431976] W [glusterfsd.c:1288:cleanup_and_exit] (-->/lib64/libgfrpc.so.0(saved_frames_unwind+0x205) [0x7f76664a3825] -->/usr/sbin/glusterfs(mgmt_getspec_cbk+0x536) [0x7f7666bd8056] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x7f7666bd1abb] ) 0-: received signum (0), shutting down </snip> Self-heals pending for long time <snip> [root@ ~]# gluster volume heal engine info Brick 10.70.36.73:/rhgs/engine/enginebrick /file2 /.shard Status: Connected Number of entries: 2 Brick 10.70.36.76:/rhgs/engine/enginebrick Status: Connected Number of entries: 0 Brick 10.70.36.77:/rhgs/engine/enginebrick /file2 /.shard Status: Connected Number of entries: 2 </snip> Note block size is just 512MB Created attachment 1203846 [details]
glusterd.log
    Created attachment 1203847 [details]
glustershd.log
    Hi, After analyse the logs it shows it is failing because other end point of socket (unix socket) is not connected so it is throwing "Connection refused".After backport patch from upstream (https://bugzilla.redhat.com/show_bug.cgi?id=1333317) it seems issue is resolved. Patch(http://review.gluster.org/#/c/15567/) is upload to merge the same. Regards Mohit Agrawal Tested with glusterfs-3.8.4-2.el7rhgs. After enabling SSL/TLS encryption on the management and data path, self-heal daemon comes up and there are no errors/problems with its communication with glusterd # gluster volume status vmstore Status of volume: vmstore Gluster process TCP Port RDMA Port Online Pid ------------------------------------------------------------------------------ Brick 10.70.36.74:/rhgs/brick1/vmb1 49153 0 Y 18318 Brick 10.70.36.75:/rhgs/brick1/vmb1 49153 0 Y 17434 Brick 10.70.36.78:/rhgs/brick1/vmb1 49153 0 Y 15767 Self-heal Daemon on localhost N/A N/A Y 12863 Self-heal Daemon on 10.70.36.75 N/A N/A Y 14456 Self-heal Daemon on 10.70.36.78 N/A N/A Y 15789 Task Status of Volume vmstore ------------------------------------------------------------------------------ There are no active volume tasks Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html  |