Bug 1001056

Summary: glusterd: Probing a machine which is part of another cluster fails but no error reported in CLI and glusterd logs
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Rahul Hinduja <rhinduja>
Component: glusterd
Assignee: Nagaprasad Sathyanarayana <nsathyan>
Status: CLOSED EOL
QA Contact: SATHEESARAN <sasundar>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 2.1
CC: amukherj, rhs-bugs, sasundar, smohan, vbellur
Target Milestone: ---
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard: glusterd
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-03 17:23:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Rahul Hinduja 2013-08-26 12:24:16 UTC
Description of problem:
=======================

Probing a machine that is part of another cluster fails, which is expected, but no error is reported in the CLI or in the logs.


[root@dj ~]# gluster peer probe 10.70.34.119
peer probe: failed: 
[root@dj ~]# 


logs on dj:
===========


[2013-08-26 04:41:37.573560] I [glusterd-handler.c:821:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 10.70.34.119 24007
[2013-08-26 04:41:37.603588] I [glusterd-handler.c:2905:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 10.70.34.119 (24007)
[2013-08-26 04:41:37.608636] I [rpc-clnt.c:967:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2013-08-26 04:41:37.608721] I [socket.c:3487:socket_init] 0-management: SSL support is NOT enabled
[2013-08-26 04:41:37.608740] I [socket.c:3502:socket_init] 0-management: using system polling thread
[2013-08-26 04:41:37.612267] I [glusterd-handler.c:2886:glusterd_friend_add] 0-management: connect returned 0
[2013-08-26 04:41:37.690146] I [glusterd-rpc-ops.c:241:__glusterd_probe_cbk] 0-glusterd: Received probe resp from uuid: 5e3e1a7c-5bc6-4bb7-add9-afd45b8ff33c, host: 10.70.34.119
[2013-08-26 04:41:37.690349] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=2236 max=2 total=4
[2013-08-26 04:41:37.690370] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=124 max=2 total=4





Log on 10.70.34.119 machine which was being probed.
===================================================

[2013-08-26 11:59:08.587823] I [glusterd-handshake.c:553:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 2
[2013-08-26 11:59:08.594390] I [glusterd-handler.c:2324:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: 2dde2c42-1616-4109-b782-dd37185702d8
[2013-08-26 11:59:08.597958] I [glusterd-handler.c:2376:__glusterd_handle_probe_query] 0-glusterd: Responded to 10.70.34.90, op_ret: -1, op_errno: 3, ret: 0
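The "Responded to" line above is the only place where the failure shows up (op_ret: -1, op_errno: 3). As a minimal sketch, assuming the log format shown in this report, the following Python snippet extracts those fields from a glusterd log line. The function name `parse_probe_response` and the regular expression are illustrative and not part of glusterd; other glusterd versions may format this line differently.

```python
import re

# Pattern for the "Responded to ..." line as it appears in this report.
# Assumption: the field order and separators match the sample log line.
_PROBE_RESP = re.compile(
    r"Responded to (?P<peer>[^,\s]+), "
    r"op_ret: (?P<op_ret>-?\d+), "
    r"op_errno: (?P<op_errno>-?\d+), "
    r"ret: (?P<ret>-?\d+)"
)


def parse_probe_response(line):
    """Return a dict with peer, op_ret, op_errno and ret, or None if no match."""
    match = _PROBE_RESP.search(line)
    if match is None:
        return None
    return {
        "peer": match.group("peer"),
        "op_ret": int(match.group("op_ret")),
        "op_errno": int(match.group("op_errno")),
        "ret": int(match.group("ret")),
    }


if __name__ == "__main__":
    sample = (
        "[2013-08-26 11:59:08.597958] I "
        "[glusterd-handler.c:2376:__glusterd_handle_probe_query] 0-glusterd: "
        "Responded to 10.70.34.90, op_ret: -1, op_errno: 3, ret: 0"
    )
    print(parse_probe_response(sample))
```

A negative `op_ret` in the parsed result marks a rejected probe, which could be used to flag failures that the probing node's own log does not record.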

Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.4.0.20rhs-2.el6rhs.x86_64


Steps to Reproduce:
===================
1. Probe a machine which is part of another cluster.

Actual results:
===============


# gluster peer probe 10.70.34.119
peer probe: failed: 
# 


Expected results:
The CLI and the glusterd logs should report a proper error stating that the machine is part of another cluster.
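
One way to check for this condition in an automated test is to inspect the text that follows `peer probe: failed:`. The sketch below assumes the CLI output format shown in this report; `probe_failure_reason` is a hypothetical helper name, and it only examines a string that has already been captured from the CLI.

```python
# Assumption: failure lines start with this prefix, as in the output above.
_PREFIX = "peer probe: failed:"


def probe_failure_reason(cli_output):
    """Return the failure reason from captured `gluster peer probe` output.

    Returns None if no failure line is present, and an empty string if a
    failure is reported without any reason.
    """
    for line in cli_output.splitlines():
        line = line.strip()
        if line.startswith(_PREFIX):
            return line[len(_PREFIX):].strip()
    return None


if __name__ == "__main__":
    # Output captured in this report: failure with no reason given.
    print(repr(probe_failure_reason("peer probe: failed: ")))
```

An empty string from this helper corresponds to the behaviour reported here: the probe fails, but the user is not told why.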

Additional info:
================

separating this issue from bug 1000986

Comment 2 SATHEESARAN 2014-11-13 11:17:42 UTC
This issue is no longer seen in RHS 3.0.3
I have tested the same, but the log messages still require changes.

Here are the steps
1. rhss1 and rhss2 formed the trusted storage pool
2. rhss3 is a new node from which I tried to probe rhss1

Console logs on rhss3
----------------------
[root@rhss3 ~]# gluster peer probe 10.70.37.44
peer probe: failed: 10.70.37.44 is already part of another cluster

Content in .cmd_history on rhss3
---------------------------------
[2014-11-13 16:23:41.426243]  : pe probe 10.70.37.44 : FAILED : 10.70.37.44 is already part of another cluster

Corresponding glusterd logs on that machine rhss3
---------------------------------------------------
[2014-11-13 16:22:41.239253] E [rpc-transport.c:481:rpc_transport_unref] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f) [0x7fc31d726e4f] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xcb) [0x7fc31d7259eb] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_unref+0x63) [0x7fc31d724623]))) 0-rpc_transport: invalid argument: this
[2014-11-13 16:23:41.407235] I [glusterd-handler.c:1109:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 10.70.37.44 24007
[2014-11-13 16:23:41.411602] I [glusterd-handler.c:3199:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 10.70.37.44 (24007)
[2014-11-13 16:23:41.416065] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-11-13 16:23:41.418799] I [glusterd-handler.c:3180:glusterd_friend_add] 0-management: connect returned 0
[2014-11-13 16:23:41.426210] I [glusterd-rpc-ops.c:237:__glusterd_probe_cbk] 0-glusterd: Received probe resp from uuid: 1d9677dc-6159-405e-9319-ad85ec030880, host: 10.70.37.44

Corresponding glusterd logs on that machine rhss1
--------------------------------------------------
[2014-11-13 16:23:41.341482] I [glusterd-handshake.c:1011:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30000
[2014-11-13 16:23:41.344539] I [glusterd-handler.c:2611:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: b5ab2ec3-5411-45fa-a30f-43bd04caf96b
[2014-11-13 16:23:41.346843] I [glusterd-handler.c:2663:__glusterd_handle_probe_query] 0-glusterd: Responded to 10.70.37.216, op_ret: -1, op_errno: 3, ret: 0

tl;dr :

The glusterd logs need to be improved to record the "Peer Probe failure" and the reason behind it.

Comment 4 Vivek Agarwal 2015-12-03 17:23:12 UTC
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release that you asked us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.