Description of problem:
-----------------------
When a gluster node has multiple network interfaces and both are to be used for gluster traffic, the peer probe should be done with all of the node's network identifiers (i.e., IP addresses or FQDNs). When this is done, the other names for that peer are updated. The problem is that the other names of the host are not propagated to all the nodes in the cluster. Any volume-related operation then fails on the other hosts with the error "staging failed on the host", because those nodes are unaware of the new hostname or IP.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
3.7.8

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Create 3 gluster nodes, each with 2 network interfaces connected to different (isolated) networks
2. Form a gluster cluster with 2 gluster nodes by peer probing with one set of IPs (from network1)
3. Probe node2 (from node1) with its IP from network2
4. Check peer status on both the nodes
5. From node1, peer probe node3 with its IP from network1
6. From node1, peer probe node3 with its IP from network2

Actual results:
---------------
Peer status on node2 doesn't get updated with the other name of node3

Expected results:
-----------------
Peer information should be consistent/updated across all the nodes in the cluster
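Steps 2 to 6 above can be sketched as a small script run from node1. This is only an illustration: the hostnames follow the transcript in this report, where the mgmt-* names appear to be on network1 and the data-* names on network2 (an assumption drawn from the probe order shown there). DRY_RUN, DOMAIN, run() and multi_net_probe() are hypothetical helpers, not part of gluster. By default the script only prints the commands it would execute.

```shell
#!/bin/sh
# Sketch of reproduction steps 2-6, to be run on node1.
# DRY_RUN, DOMAIN, run() and multi_net_probe() are illustrative helpers,
# not part of gluster. Only "gluster peer probe" and "gluster peer status"
# are real CLI commands.
DRY_RUN="${DRY_RUN:-1}"
DOMAIN="${DOMAIN:-lab.eng.blr.redhat.com}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Probe one node by both of its names, then show the local peer list.
# $1 is the node suffix, e.g. "node2".
multi_net_probe() {
    run gluster peer probe "mgmt-$1.$DOMAIN"   # assumed network1 name
    run gluster peer probe "data-$1.$DOMAIN"   # assumed network2 name
    run gluster peer status
}

multi_net_probe node2   # steps 2-4
multi_net_probe node3   # steps 5-6
```

Setting DRY_RUN=0 would run the commands for real. Peer status on node2 still has to be checked separately to see the missing other name.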
Peer status on 2 nodes
-----------------------
[root@data-node1 ~]# gluster peer status
Number of Peers: 1

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node2

[root@data-node2 ~]# gluster peer status
Number of Peers: 1

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Peer status on 3 nodes after probing node3 with network1
---------------------------------------------------------
[root@data-node1 ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node2

Hostname: mgmt-node3.lab.eng.blr.redhat.com
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)

[root@data-node2 ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Hostname: mgmt-node3.lab.eng.blr.redhat.com
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)

[root@localhost ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node2

Peer status on 3 nodes after probing node3 with network2
---------------------------------------------------------
[root@data-node1 ~]# gluster peer probe data-node3.lab.eng.blr.redhat.com
peer probe: success. Host data-node3.lab.eng.blr.redhat.com port 24007 already in peer list

[root@data-node1 ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node2

Hostname: mgmt-node3.lab.eng.blr.redhat.com
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)
Other names:
data-node3.lab.eng.blr.redhat.com      <--- other name updated in node1

[root@data-node2 ~]# gluster pe s
Number of Peers: 2

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Hostname: mgmt-node3.lab.eng.blr.redhat.com      <--- not updated with other name
Uuid: 5b4abfd3-9397-4527-a39e-ee3bc00f5710
State: Peer in Cluster (Connected)

[root@localhost ~]# gluster peer status
Number of Peers: 2

Hostname: mgmt-node1.lab.eng.blr.redhat.com
Uuid: 5ba71f4c-fe2e-410d-939a-d5fc903a1ec4
State: Peer in Cluster (Connected)
Other names:
data-node1.lab.eng.blr.redhat.com

Hostname: mgmt-node2.lab.eng.blr.redhat.com
Uuid: 204a51d3-3c2c-4bec-a005-4e974a49aa7e
State: Peer in Cluster (Connected)
Other names:
data-node2.lab.eng.blr.redhat.com
mgmt-node

[root@data-node1 ~]# gluster volume create testvol data-node3.lab.eng.blr.redhat.com:/rhs/brick1/brc1
volume create: testvol: failed: Staging failed on mgmt-node2.lab.eng.blr.redhat.com. Error: Host data-node3.lab.eng.blr.redhat.com is not in 'Peer in Cluster' state

Error messages in glusterd log in node1 -
<snip>
[2016-03-03 18:40:38.034436] I [MSGID: 106487] [glusterd-handler.c:1411:__glusterd_handle_cli_list_friends] 0-glusterd: Received cli list req
[2016-03-03 18:45:20.723287] E [MSGID: 106452] [glusterd-utils.c:5735:glusterd_new_brick_validate] 0-management: Host data-node3.lab.eng.blr.redhat.com is not in 'Peer in Cluster' state
[2016-03-03 18:45:20.723323] E [MSGID: 106536] [glusterd-volume-ops.c:1336:glusterd_op_stage_create_volume] 0-management: Host data-node3.lab.eng.blr.redhat.com is not in 'Peer in Cluster' state
[2016-03-03 18:45:20.723338] E [MSGID: 106301] [glusterd-op-sm.c:5241:glusterd_op_ac_stage_op] 0-management: Stage failed on operation 'Volume Create', Status : -1
</snip>
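The inconsistency in the transcript above can be checked mechanically by listing, per node, every name known for a given peer UUID and comparing the lists. The following is a minimal sketch, not part of gluster: peer_names() and names_for_uuid() are hypothetical helpers, and the parsing assumes the `gluster peer status` layout used in this report, with each other name on its own line under `Other names:`.

```shell
#!/bin/sh
# Sketch: extract the names each node knows for a peer from saved
# "gluster peer status" output. peer_names() and names_for_uuid() are
# illustrative helpers, not gluster commands.

# Read peer status text on stdin; print sorted "<uuid> <name>" pairs,
# one for the primary hostname and one per entry under "Other names:".
peer_names() {
    awk '
        /^Hostname:/        { host = $2; other = 0; next }
        /^Uuid:/            { uuid = $2; print uuid, host; next }
        /^State:/           { next }
        /^Number of Peers:/ { next }
        /^Other names:/     { other = 1; next }
        NF == 0             { other = 0; next }
        other               { print uuid, $1 }
    ' | sort
}

# $1 = peer UUID, stdin = peer status text; print the names known for it.
names_for_uuid() {
    peer_names | awk -v u="$1" '$1 == u { print $2 }'
}

# Example use (file names are placeholders):
#   gluster peer status > node1.txt     # on node1
#   gluster peer status > node2.txt     # on node2
#   names_for_uuid 5b4abfd3-9397-4527-a39e-ee3bc00f5710 < node1.txt
#   names_for_uuid 5b4abfd3-9397-4527-a39e-ee3bc00f5710 < node2.txt
```

With the outputs captured in this report, the list from node1 for node3's UUID would contain both the mgmt and data names, while the list from node2 would contain only the mgmt name.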
REVIEW: http://review.gluster.org/13817 (glusterd: Add a new event to handle multi-net probes) posted (#1) for review on master by Kaushal M (kaushal)
REVIEW: http://review.gluster.org/13840 (glusterd: Add a new event to handle multi-net probes) posted (#1) for review on release-3.7 by Kaushal M (kaushal)
COMMIT: http://review.gluster.org/13840 committed in release-3.7 by Atin Mukherjee (amukherj)
------
commit de450e8cf8f2bd483523a2721a289d3f1027dacc
Author: Kaushal M <kaushal>
Date:   Tue Mar 22 16:32:32 2016 +0530

    glusterd: Add a new event to handle multi-net probes

    Backport of d0cb21b from master

    This allows GlusterD to send updates to all other nodes when attaching
    new addresses using multi-net peer probe.

    Change-Id: I62846be750ab3721912e7b49656594347ea61723
    BUG: 1314366
    Signed-off-by: Kaushal M <kaushal>
    Reviewed-originally-on: http://review.gluster.org/13817
    Reviewed-on: http://review.gluster.org/13840
    Smoke: Gluster Build System <jenkins.com>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.10, please open a new bug report.

glusterfs-3.7.10 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2016-April/026164.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user