Description of problem:
While a rebalance is in progress, a peer probe to a new host causes one of the existing healthy nodes in the cluster to go into the "Peer Rejected" state. I think this takes the brick on that node offline, which in turn makes the rebalance process fail.

Version-Release number of selected component (if applicable):
glusterfs-server-3.6.0.28-1.el6rhs

How reproducible:
Intermittent

Steps to Reproduce (a CLI sketch of these steps appears after the log excerpts below):
1. Create a 2-brick distribute volume and start it.
2. Fuse-mount the volume on a RHEL 6.5 client and create data on it.
3. Add a spare brick to the volume.
4. Start rebalance.
5. From the cluster, peer probe a new host.
6. Check the output of "gluster peer status".
7. Also check the rebalance progress.

Actual results:
An existing node in the cluster goes to the "Peer Rejected" state.

Expected results:
The peer probe succeeds without disturbing existing peers, and the rebalance completes.

Additional info:
Beaker job link: https://beaker.engineering.redhat.com/jobs/737692

TESTOUT.log from the master node:

:: [ PASS ] :: Command 'qeVolumeCreate.sh rebalvol 2 0 0 tcp' (Expected 0, got 0)
xxxxx
:: [ 02:45:17 ] :: Logging gluster volume info:
:: [ BEGIN ] :: Running 'gluster volume info'

Volume Name: rebalvol
Type: Distribute
Volume ID: 7b8d92c6-5689-4aa9-a2a6-5ed2a1e3c888
Status: Started
Snap Volume: no
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: rhsauto022.lab.eng.blr.redhat.com:/bricks/rebalvol_brick0
Brick2: rhsauto019.lab.eng.blr.redhat.com:/bricks/rebalvol_brick1
Options Reconfigured:
server.allow-insecure: on
performance.stat-prefetch: off
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256

:: [ 02:58:53 ] :: attempting to run rebalance
:: [ PASS ] :: Adding a spare brick (Expected 0, got 0)
:: [ BEGIN ] :: starting rebalance :: actually running 'gluster volume rebalance rebalvol start'
volume rebalance: rebalvol: success: Starting rebalance on volume rebalvol has been successful. ID: 48851e59-9f9d-4d79-b6c2-9c8a03965044
:: [ PASS ] :: starting rebalance (Expected 0, got 0)
:: [ BEGIN ] :: Peer probing rhsauto062.lab.eng.blr.redhat.com :: actually running 'gluster peer probe rhsauto062.lab.eng.blr.redhat.com'
peer probe: success.
:: [ PASS ] :: Peer probing rhsauto062.lab.eng.blr.redhat.com (Expected 0, got 0)
:: [ 02:59:06 ] :: gluster peer status
:: [ BEGIN ] :: Running 'gluster peer status'
Number of Peers: 2

Hostname: rhsauto019.lab.eng.blr.redhat.com
Uuid: 141ec389-ab2a-42df-a96e-fa28462f6c89
State: Peer Rejected (Connected)

Hostname: rhsauto062.lab.eng.blr.redhat.com
Uuid: 398605a1-488f-4117-95c3-ce342438fb31
State: Peer in Cluster (Connected)

Checked the glusterd logs (http://lab-02.rhts.eng.blr.redhat.com/beaker/logs/tasks/24142+/24142546/etc-glusterfs-glusterd.vol.log) and found the following:

[2014-09-03 21:17:55.671776] W [socket.c:529:__socket_rwv] 0-management: readv on 10.70.36.249:24007 failed (Connection timed out)
[2014-09-03 21:17:55.671880] I [MSGID: 106004] [glusterd-handler.c:4388:__glusterd_peer_rpc_notify] 0-management: Peer 141ec389-ab2a-42df-a96e-fa28462f6c89, in Peer in Cluster state, has disconnected from glusterd.
[2014-09-03 21:17:55.671910] W [glusterd-locks.c:632:glusterd_mgmt_v3_unlock] 0-management: Lock for vol rebalvol not held
[2014-09-03 21:18:24.119712] E [socket.c:2169:socket_connect_finish] 0-management: connection to 10.70.36.249:24007 failed (No route to host)
[2014-09-03 21:18:24.119814] I [MSGID: 106004] [glusterd-handler.c:4388:__glusterd_peer_rpc_notify] 0-management: Peer 141ec389-ab2a-42df-a96e-fa28462f6c89, in Peer in Cluster state, has disconnected from glusterd.
The message "I [MSGID: 106004] [glusterd-handler.c:4388:__glusterd_peer_rpc_notify] 0-management: Peer 141ec389-ab2a-42df-a96e-fa28462f6c89, in Peer in Cluster state, has disconnected from glusterd." repeated 28 times between [2014-09-03 21:18:24.119814] and [2014-09-03 21:20:12.272821]
So far I can see that the rebalance client saw a CHILD_DOWN event, which led to the rebalance failure:

> [2014-09-03 21:29:00.739735] E [client-handshake.c:1498:client_query_portmap_cbk] 0-rebalvol-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
> [2014-09-03 21:29:00.740355] I [client.c:2215:client_rpc_notify] 0-rebalvol-client-1: disconnected from rebalvol-client-1. Client process will keep trying to connect to glusterd until brick's port is available
> [2014-09-03 21:29:00.740395] W [dht-common.c:5914:dht_notify] 0-rebalvol-dht: Received CHILD_DOWN. Exiting

Talked to Lala and he said the brick did not crash. From the glusterd logs we can see disconnections between peers leading to "Peer Rejected":

>>>>>>>>>>>
Hostname: rhsauto019.lab.eng.blr.redhat.com
Uuid: 141ec389-ab2a-42df-a96e-fa28462f6c89
State: Peer Rejected (Connected)
>>>>>>>>>>>

At this point it looks like a network disconnection issue.
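As a side note (not from the original report): when a client logs "failed to get the port number for remote subvolume", checks along these lines on the server hosting the affected brick (rhsauto019 here, whose IP appears in the glusterd log above) would distinguish a crashed brick from a network or portmap problem:

gluster volume status rebalvol   # is the brick listed as online, and on which port?
pgrep -fl glusterfsd             # is the brick daemon process still running?
ping -c 3 10.70.36.249           # "No route to host" points at the network layer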
This issue has not been reproducible in the BVT runs of the last couple of weeks, hence lowering the severity.
Upstream patch link : http://review.gluster.org/8932
Downstream patch link : https://code.engineering.redhat.com/gerrit/#/c/35648/
The glusterd op-sm (operation state machine) uses the global peer list, which can be modified if a peer membership change is requested while an op-sm transaction is in progress. This can leave the transaction holding an incorrect peerinfo structure, causing either the op-sm transaction or the peer membership command to behave incorrectly. The fix is to have each transaction use a local copy of the peer list instead of the global peer list.
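The actual change is in the patches linked above; purely as an illustration of the pattern, here is a minimal C sketch (hypothetical types and function names, not glusterd's real data structures) of why iterating a shared global peer list during a transaction is unsafe, and how a transaction-local snapshot avoids it:

/* Minimal sketch of the race and the fix. Build with: gcc -std=gnu99 -pthread */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <pthread.h>

struct peer {
    char         hostname[64];
    struct peer *next;
};

static struct peer    *global_peers;   /* shared; mutated by probe/detach */
static pthread_mutex_t peers_lock = PTHREAD_MUTEX_INITIALIZER;

/* Peer membership change: unlinks and frees an entry in place. */
static void peer_detach(const char *hostname)
{
    pthread_mutex_lock(&peers_lock);
    for (struct peer **pp = &global_peers; *pp; pp = &(*pp)->next) {
        if (strcmp((*pp)->hostname, hostname) == 0) {
            struct peer *dead = *pp;
            *pp = dead->next;
            free(dead);  /* a transaction still holding 'dead' now has a dangling peerinfo */
            break;
        }
    }
    pthread_mutex_unlock(&peers_lock);
}

/* BUGGY pattern: the transaction walks global_peers directly, so a
 * concurrent detach can free the node it is currently pointing at. */
static void txn_over_global_list(void)
{
    for (struct peer *p = global_peers; p; p = p->next)
        printf("txn step on %s\n", p->hostname);
}

/* FIXED pattern: copy the list under the lock, then run the (possibly
 * long) transaction against the private snapshot. */
static struct peer *snapshot_peers(void)
{
    struct peer *head = NULL, **tail = &head;
    pthread_mutex_lock(&peers_lock);
    for (struct peer *p = global_peers; p; p = p->next) {
        *tail = malloc(sizeof(**tail));
        memcpy(*tail, p, sizeof(**tail));
        (*tail)->next = NULL;
        tail = &(*tail)->next;
    }
    pthread_mutex_unlock(&peers_lock);
    return head;
}

static void txn_over_local_list(void)
{
    struct peer *local = snapshot_peers();
    for (struct peer *p = local; p; p = p->next)
        printf("txn step on %s\n", p->hostname);  /* immune to concurrent probe/detach */
    while (local) {                               /* free the snapshot */
        struct peer *next = local->next;
        free(local);
        local = next;
    }
}

int main(void)
{
    /* Seed the global list with two peers, then run both patterns. */
    const char *names[] = { "rhsauto019", "rhsauto022" };
    for (int i = 0; i < 2; i++) {
        struct peer *p = calloc(1, sizeof(*p));
        snprintf(p->hostname, sizeof(p->hostname), "%s", names[i]);
        p->next = global_peers;
        global_peers = p;
    }
    txn_over_local_list();
    peer_detach("rhsauto019");
    txn_over_global_list();  /* safe here only because no txn runs concurrently */
    return 0;
}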
Verified with glusterfs-3.6.0.34-1.el6rhs. Did the following tests (a CLI sketch follows below):
1. Created a cluster of 2 nodes.
2. Created a distribute volume of 2 bricks.
3. Fuse-mounted the volume and created 10 files of 10GB each.
4. Added more bricks and triggered rebalance.
5. While rebalance was going on, tried to probe a new peer.

Also tested probing/detaching a peer and volume set operations; all worked seamlessly. Tried peer probe/detach and volume set operations during a remove-brick with data migration too; all worked well.
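For reference, the concurrent operations in the verification above map roughly to this CLI sequence (node names, brick paths, and the chosen volume options are illustrative placeholders, not taken from the verification run):

# While the rebalance started earlier is still running:
gluster peer probe newnode           # membership change during rebalance
gluster peer detach newnode
gluster volume set rebalvol performance.readdir-ahead on

# Repeat the same operations during remove-brick data migration:
gluster volume remove-brick rebalvol node1:/bricks/rebalvol_brick2 start
gluster peer probe newnode
gluster volume set rebalvol performance.stat-prefetch off
gluster volume remove-brick rebalvol node1:/bricks/rebalvol_brick2 status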
Atin, Please review the edited doc text and sign-off.
Doc text looks okay to me, verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0038.html