Description of problem:
I have a cluster of 4 nodes and tried to add one more node to the existing cluster. The addition resulted in a Peer Rejected (Connected) state. glusterfsd.log shows:

[2015-06-25 11:51:24.197134] E [MSGID: 106010] [glusterd-utils.c:2649:glusterd_compare_friend_volume] 0-management: Version of Cksums gluster_shared_storage differ. local cksum = 1662818965, remote cksum = 2342825121 on peer 10.70.46.39

I could not work out why the cksum differs only for the volume gluster_shared_storage, whereas for the other existing volume, vol2, it is the same.

Version-Release number of selected component (if applicable):
glusterfs-3.7.1-5.el6rhs.x86_64

How reproducible:
always

Steps to Reproduce:
1. Create the meta volume gluster_shared_storage, start it, and mount it on all nodes with the native client.
2. Create another volume called vol2 and start it.
3. gluster peer probe <ip of new node>

Actual results:
From the existing node, where the peer probe command was executed:

[root@nfs11 ~]# gluster peer status
Number of Peers: 4

Hostname: 10.70.46.27
Uuid: 238ffb63-3548-43c3-a527-ed53aa8f188c
State: Peer in Cluster (Connected)

Hostname: 10.70.46.25
Uuid: 99ddf436-01d1-4c62-8b21-096b1a08a6de
State: Peer in Cluster (Connected)

Hostname: 10.70.46.29
Uuid: d05e8c04-9142-406a-9374-51b478ced7e5
State: Peer in Cluster (Connected)

Hostname: 10.70.46.39
Uuid: 745b58ba-c963-4004-93fe-5ada9b39d107
State: Peer Rejected (Connected)

From the new node, which attempted to join the cluster:

[root@nfs15 ~]# gluster peer status
Number of Peers: 1

Hostname: 10.70.46.8
Uuid: dea164a8-d55c-409e-92a5-960fd1dcf7d5
State: Peer Rejected (Connected)

Expected results:
Peer probe should be successful; the cksum should be the same for existing volumes on all nodes, including the new one.

Additional info:
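A minimal command sequence for the steps above (the brick paths, replica layout, and mount point are assumptions for illustration; the volume names and node IPs are from this setup):

# step 1: create the shared meta volume, start it, and mount it on each node
gluster volume create gluster_shared_storage replica 3 \
    10.70.46.8:/bricks/ss 10.70.46.27:/bricks/ss 10.70.46.25:/bricks/ss
gluster volume start gluster_shared_storage
mkdir -p /mnt/shared_storage
mount -t glusterfs localhost:/gluster_shared_storage /mnt/shared_storage

# step 2: create and start a second volume
gluster volume create vol2 10.70.46.8:/bricks/vol2
gluster volume start vol2

# step 3: probe the new node from an existing cluster member
gluster peer probe 10.70.46.39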
Created attachment 1042978 [details] sosreport of nfs11
Created attachment 1042980 [details] sosreport of nfs15
I tried to reproduce the issue with a new VM installed from the latest ISO, and the issue occurs with this one as well.

[root@nfs11 ~]# gluster peer probe 10.70.46.22
peer probe: success.
[root@nfs11 ~]# gluster peer status
Number of Peers: 4

Hostname: 10.70.46.27
Uuid: 238ffb63-3548-43c3-a527-ed53aa8f188c
State: Peer in Cluster (Connected)

Hostname: 10.70.46.25
Uuid: 99ddf436-01d1-4c62-8b21-096b1a08a6de
State: Peer in Cluster (Connected)

Hostname: 10.70.46.29
Uuid: d05e8c04-9142-406a-9374-51b478ced7e5
State: Peer in Cluster (Connected)

Hostname: 10.70.46.22
Uuid: 39aea6ea-602f-472c-8a00-e72d253d04d6
State: Peer Rejected (Connected)

[root@nfs16 ~]# gluster peer status
Number of Peers: 1

Hostname: 10.70.46.8
Uuid: dea164a8-d55c-409e-92a5-960fd1dcf7d5
State: Peer Rejected (Connected)

[2015-06-25 10:24:00.818488] E [MSGID: 106010] [glusterd-utils.c:2649:glusterd_compare_friend_volume] 0-management: Version of Cksums gluster_shared_storage differ. local cksum = 1662818965, remote cksum = 2342825121 on peer 10.70.46.22
[2015-06-25 10:24:00.818626] I [glusterd-handler.c:3719:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.70.46.22 (0), ret: 0
[2015-06-25 10:24:03.543396] I [glusterd-handler.c:1395:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
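The mismatch reported in the log can also be confirmed from glusterd's on-disk state; a sketch, assuming the default /var/lib/glusterd working directory:

# run on each node: the stored checksum for the volume lives under vols/
cat /var/lib/glusterd/vols/gluster_shared_storage/cksum
# expected to show info=1662818965 on the existing node and
# info=2342825121 on the rejected node, matching the log message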
Initial RCA:
ganesha is started with the command below:

gluster nfs-ganesha enable

This command disables gluster NFS, so "nfs.disable" is set to "on" in volinfo for all the volumes in the cluster. The option is set in the in-memory volinfo but not persisted to the store. As a result, during the handshake the new node receives "nfs.disable" as "on", while the current node does not have this data present in its store, leading to a mismatched cksum. I will investigate further and send a patch soon.
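To see the discrepancy the RCA describes, one can compare the persisted volume info on an existing node against what the newly probed node wrote out from the handshake data (a sketch, again assuming the default /var/lib/glusterd working directory):

# on the existing node: the option was never persisted, so no line is found
grep nfs.disable /var/lib/glusterd/vols/gluster_shared_storage/info

# on the newly probed node: the handshake data included the option
grep nfs.disable /var/lib/glusterd/vols/gluster_shared_storage/info
# nfs.disable=on

# the volume cksum is derived from the info file's contents,
# so the extra line on the new node produces a different checksum
cat /var/lib/glusterd/vols/gluster_shared_storage/cksum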
Upstream master: http://review.gluster.org/11412/
Upstream release 3.7: http://review.gluster.org/11428
RHGS 3.1: https://code.engineering.redhat.com/gerrit/51703/
Executed a peer probe to a new VM and it worked fine.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html