Description of problem:
If glusterd on one of the nodes goes down during a rebalance and comes back up after a few seconds, the rebalance status for that node is lost.

Version-Release number of selected component (if applicable):
3.3.0qa33

How reproducible:

Steps to Reproduce:
1. Let A be a node in the cluster.
2. Create a distribute volume and fill it with some data.
3. Add a new peer, say B, to the cluster.
4. Add a brick from peer B to the volume.
5. Initiate a rebalance and keep checking the status.
6. Up to this point everything works fine.
7. Bring down glusterd on node B for 5 seconds, then restart it.
8. While glusterd on node B is down, check the rebalance status on nodes A and B.
9. Bring glusterd on node B back up and check the status again.

Actual results:
While glusterd on node B is down, the rebalance status on node A does not list node B as a participant in the rebalance, and checking the rebalance status on node B shows nothing.
After glusterd on node B comes back up, node A still never lists node B in the status, and running rebalance status on node B still shows nothing.

Expected results:

Additional info:
After glusterd restarts, peer status on both nodes shows State: Peer Rejected (Connected).

[root@gqac023 mnt]# gluster peer status
Number of Peers: 1
Hostname: 10.16.157.72
Uuid: 1045cf7a-482d-4e12-9434-948cfbe582c1
State: Peer Rejected (Connected)

[root@gqac025 ~]# gluster peer status
Number of Peers: 1
Hostname: 10.16.157.66
Uuid: a8da2343-647c-447b-9b17-4876810f9f60
State: Peer Rejected (Connected)
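For anyone re-running this reproduction, the rejected state can be checked mechanically rather than by eye. The following is only a rough sketch: it assumes the `gluster peer status` output keeps the `Hostname:` / `Uuid:` / `State:` layout shown above, and the helper names `parse_peer_status` and `rejected_peers` are made up for illustration, not part of gluster.

```python
def parse_peer_status(output):
    """Parse `gluster peer status` text into a list of dicts, one per peer.

    Assumes each peer block starts with a "Hostname:" line followed by
    "Uuid:" and "State:" lines, as in the output pasted in this report.
    """
    peers = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # skip blank lines and shell prompt lines
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Hostname":
            # A Hostname line opens a new peer record.
            current = {"Hostname": value}
            peers.append(current)
        elif current is not None and key in ("Uuid", "State"):
            current[key] = value
    return peers


def rejected_peers(output):
    """Return hostnames of peers whose State starts with 'Peer Rejected'."""
    return [
        peer["Hostname"]
        for peer in parse_peer_status(output)
        if peer.get("State", "").startswith("Peer Rejected")
    ]


# Sample text copied from the output of node gqac023 above.
SAMPLE = """Number of Peers: 1
Hostname: 10.16.157.72
Uuid: 1045cf7a-482d-4e12-9434-948cfbe582c1
State: Peer Rejected (Connected)
"""
```

With the sample above, `rejected_peers(SAMPLE)` returns `['10.16.157.72']`. On a live node the text could be captured with something like `subprocess.check_output(["gluster", "peer", "status"], text=True)` and passed to the same function.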
CHANGE: http://review.gluster.com/3110 (glusterd/rebalance: re-establish conn between rebalance process) merged in master by Vijay Bellur (vijay)
Now when glusterd goes down on any node, that node is not shown in the status list, but once glusterd comes back up the entry for that node appears again.