Bug 1001056 - glusterd: Probing a machine which is part of another cluster fails but no error reported in CLI and glusterd logs
Status: CLOSED EOL
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 2.1
Hardware/OS: x86_64 Linux
Priority: unspecified   Severity: medium
Assigned To: Nagaprasad Sathyanarayana
QA Contact: SATHEESARAN
Whiteboard: glusterd
Keywords: ZStream
Depends On:
Blocks:
Reported: 2013-08-26 08:24 EDT by Rahul Hinduja
Modified: 2016-02-17 19:20 EST
CC List: 5 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-12-03 12:23:12 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Rahul Hinduja 2013-08-26 08:24:16 EDT
Description of problem:
=======================

Probing a machine which is part of another cluster fails, which is expected, but no error is reported in the CLI or in the glusterd logs.


[root@dj ~]# gluster peer probe 10.70.34.119
peer probe: failed: 
[root@dj ~]# 


logs on dj:
===========


[2013-08-26 04:41:37.573560] I [glusterd-handler.c:821:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 10.70.34.119 24007
[2013-08-26 04:41:37.603588] I [glusterd-handler.c:2905:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 10.70.34.119 (24007)
[2013-08-26 04:41:37.608636] I [rpc-clnt.c:967:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2013-08-26 04:41:37.608721] I [socket.c:3487:socket_init] 0-management: SSL support is NOT enabled
[2013-08-26 04:41:37.608740] I [socket.c:3502:socket_init] 0-management: using system polling thread
[2013-08-26 04:41:37.612267] I [glusterd-handler.c:2886:glusterd_friend_add] 0-management: connect returned 0
[2013-08-26 04:41:37.690146] I [glusterd-rpc-ops.c:241:__glusterd_probe_cbk] 0-glusterd: Received probe resp from uuid: 5e3e1a7c-5bc6-4bb7-add9-afd45b8ff33c, host: 10.70.34.119
[2013-08-26 04:41:37.690349] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=2236 max=2 total=4
[2013-08-26 04:41:37.690370] I [mem-pool.c:539:mem_pool_destroy] 0-management: size=124 max=2 total=4





Logs on the 10.70.34.119 machine which was being probed.
===================================================

[2013-08-26 11:59:08.587823] I [glusterd-handshake.c:553:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 2
[2013-08-26 11:59:08.594390] I [glusterd-handler.c:2324:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: 2dde2c42-1616-4109-b782-dd37185702d8
[2013-08-26 11:59:08.597958] I [glusterd-handler.c:2376:__glusterd_handle_probe_query] 0-glusterd: Responded to 10.70.34.90, op_ret: -1, op_errno: 3, ret: 0

Version-Release number of selected component (if applicable):
=============================================================

glusterfs-3.4.0.20rhs-2.el6rhs.x86_64


Steps to Reproduce:
===================
1. Probe a machine which is part of another cluster.

Actual results:
===============


# gluster peer probe 10.70.34.119
peer probe: failed: 
# 


Expected results:
=================

The CLI should report a proper error stating that the machine is already part of another cluster.
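
For illustration only, here is a minimal standalone sketch of the kind of mapping the CLI could use to turn the probe response code into a readable message. The helper name probe_errno_to_str and the numeric values are assumptions for this example; op_errno 3 is simply the value seen in the "Responded to ..., op_errno: 3" log line above, not a value confirmed from the glusterd sources.

#include <stdio.h>

/* Hypothetical helper: map a probe response code to a user-visible
 * message.  The value 3 mirrors the op_errno logged by the probed
 * node above; names and values here are illustrative only. */
static const char *
probe_errno_to_str (int op_errno)
{
        switch (op_errno) {
        case 0:
                return "success";
        case 3:
                return "is already part of another cluster";
        default:
                return "probe failed for an unknown reason";
        }
}

int
main (void)
{
        const char *host     = "10.70.34.119";
        int         op_errno = 3;   /* value returned by the probed node */

        /* Instead of the bare "peer probe: failed: ", the CLI could print: */
        printf ("peer probe: failed: %s %s\n",
                host, probe_errno_to_str (op_errno));
        return 0;
}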

Additional info:
================

Separating this issue from bug 1000986.
Comment 2 SATHEESARAN 2014-11-13 06:17:42 EST
This issue is no longer seen in RHS 3.0.3.
I have tested the same, but the log messages still require changes.

Here are the steps:
1. rhss1 and rhss2 formed the trusted storage pool
2. rhss3 is a new node from which I tried to probe rhss1

Console logs on rhss3
----------------------
[root@rhss3 ~]# gluster peer probe 10.70.37.44
peer probe: failed: 10.70.37.44 is already part of another cluster

Content in .cmd_history on rhss3
---------------------------------
[2014-11-13 16:23:41.426243]  : pe probe 10.70.37.44 : FAILED : 10.70.37.44 is already part of another cluster

Corresponding glusterd logs on that machine rhss3
---------------------------------------------------
[2014-11-13 16:22:41.239253] E [rpc-transport.c:481:rpc_transport_unref] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x13f) [0x7fc31d726e4f] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xcb) [0x7fc31d7259eb] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_unref+0x63) [0x7fc31d724623]))) 0-rpc_transport: invalid argument: this
[2014-11-13 16:23:41.407235] I [glusterd-handler.c:1109:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 10.70.37.44 24007
[2014-11-13 16:23:41.411602] I [glusterd-handler.c:3199:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 10.70.37.44 (24007)
[2014-11-13 16:23:41.416065] I [rpc-clnt.c:969:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2014-11-13 16:23:41.418799] I [glusterd-handler.c:3180:glusterd_friend_add] 0-management: connect returned 0
[2014-11-13 16:23:41.426210] I [glusterd-rpc-ops.c:237:__glusterd_probe_cbk] 0-glusterd: Received probe resp from uuid: 1d9677dc-6159-405e-9319-ad85ec030880, host: 10.70.37.44

Corresponding glusterd logs on that machine rhss1
--------------------------------------------------
[2014-11-13 16:23:41.341482] I [glusterd-handshake.c:1011:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 30000
[2014-11-13 16:23:41.344539] I [glusterd-handler.c:2611:__glusterd_handle_probe_query] 0-glusterd: Received probe from uuid: b5ab2ec3-5411-45fa-a30f-43bd04caf96b
[2014-11-13 16:23:41.346843] I [glusterd-handler.c:2663:__glusterd_handle_probe_query] 0-glusterd: Responded to 10.70.37.216, op_ret: -1, op_errno: 3, ret: 0

tl;dr:

The glusterd logs need to be improved to contain information about the "peer probe" failure and the reason behind it.
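
For illustration only, a standalone sketch of the kind of log line that would make the failure reason visible in the glusterd log. LOG_ERROR below is a printf-based stand-in for glusterd's real logging macro, and the handler name and arguments are assumptions for this example, not the actual __glusterd_probe_cbk signature.

#include <stdio.h>

/* printf-based stand-in for glusterd's logging macro, for this example only. */
#define LOG_ERROR(fmt, ...) \
        fprintf (stderr, "E [probe-example] " fmt "\n", __VA_ARGS__)

/* Illustrative probe-response handler: on failure it logs the peer, the
 * response code and a readable reason -- the information that is missing
 * from the glusterd log excerpts above. */
static int
handle_probe_resp (const char *host, int op_ret, int op_errno,
                   const char *errstr)
{
        if (op_ret != 0) {
                LOG_ERROR ("Probe of peer %s failed (op_errno: %d): %s",
                           host, op_errno,
                           errstr ? errstr : "no error string returned by peer");
                return -1;
        }
        return 0;
}

int
main (void)
{
        /* Values mirror the failed probe of 10.70.37.44 seen above. */
        return (handle_probe_resp ("10.70.37.44", -1, 3,
                                   "peer is already part of another cluster") != 0);
}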
Comment 4 Vivek Agarwal 2015-12-03 12:23:12 EST
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release against which you requested this review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.
