Bug 1138547 - Peer probe during rebalance causing "Peer rejected" state for an existing node in trusted cluster
Summary: Peer probe during rebalance causing "Peer rejected" state for an existing node in trusted cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: RHGS 3.0.3
Assignee: Atin Mukherjee
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks: 1152890 1162694
 
Reported: 2014-09-05 07:10 UTC by Lalatendu Mohanty
Modified: 2016-09-17 14:43 UTC
CC List: 8 users

Fixed In Version: glusterfs-3.6.0.31-1
Doc Type: Bug Fix
Doc Text:
Previously, peer probe failed during rebalance because the global peerinfo structure was modified while a transaction was in progress. The peer was rejected and could not be added to the trusted cluster. With this fix, a local peer list is maintained in the gluster op state machine on a per-transaction basis so that peer probe and rebalance can proceed independently. Probing a peer during a rebalance operation now succeeds.
Clone Of:
Clones: 1152890
Environment:
Last Closed: 2015-01-15 13:39:24 UTC
Embargoed:




Links
System: Red Hat Product Errata
ID: RHBA-2015:0038
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Storage 3.0 enhancement and bug fix update #3
Last Updated: 2015-01-15 18:35:28 UTC

Description Lalatendu Mohanty 2014-09-05 07:10:52 UTC
Description of problem:

While rebalance is in progress, a peer probe to a new host causes one of the existing healthy nodes in the cluster to go to the "Peer Rejected" state. This appears to take the brick on that particular node offline, and hence the rebalance process fails.

Version-Release number of selected component (if applicable):
glusterfs-server-3.6.0.28-1.el6rhs

How reproducible:
Intermittent

Steps to Reproduce (a command-level sketch follows this list):
1. Create a two-brick distribute volume and start it.
2. Fuse-mount the volume on a RHEL 6.5 client and create data on it.
3. Add a spare brick to the volume.
4. Start rebalance.
5. Probe a new host from one of the cluster nodes.
6. Check the output of "gluster peer status".
7. Also check the rebalance progress.
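A minimal command-level sketch of these steps (the hostnames node1/node2/node3, brick paths, and mount point below are illustrative, not taken from the original test run):

# On node1: create and start a two-brick distribute volume
gluster volume create rebalvol node1:/bricks/rebalvol_brick0 node2:/bricks/rebalvol_brick1
gluster volume start rebalvol

# On the RHEL client: fuse-mount the volume and create some data
mount -t glusterfs node1:/rebalvol /mnt/rebalvol
for i in $(seq 1 100); do dd if=/dev/zero of=/mnt/rebalvol/file$i bs=1M count=100; done

# Back on node1: add a spare brick and start rebalance
gluster volume add-brick rebalvol node1:/bricks/rebalvol_brick2
gluster volume rebalance rebalvol start

# While rebalance is running, probe a new host and check peer and rebalance state
gluster peer probe node3
gluster peer status
gluster volume rebalance rebalvol status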

Actual results:

An existing node in the cluster goes to the "Peer Rejected" state.

Expected results:
The peer probe succeeds, all existing peers remain in the "Peer in Cluster" state, and rebalance continues without failure.

Additional info:

Beaker Job link: https://beaker.engineering.redhat.com/jobs/737692
TESTOUT.log from the master node:

:: [   PASS   ] :: Command 'qeVolumeCreate.sh rebalvol 2 0 0 tcp' (Expected 0, got 0)
xxxxx
:: [ 02:45:17 ] :: Logging gluster volume info:
:: [  BEGIN   ] :: Running 'gluster volume info'
 
Volume Name: rebalvol
Type: Distribute
Volume ID: 7b8d92c6-5689-4aa9-a2a6-5ed2a1e3c888
Status: Started
Snap Volume: no
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: rhsauto022.lab.eng.blr.redhat.com:/bricks/rebalvol_brick0
Brick2: rhsauto019.lab.eng.blr.redhat.com:/bricks/rebalvol_brick1
Options Reconfigured:
server.allow-insecure: on
performance.stat-prefetch: off
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256

:: [ 02:58:53 ] :: attempting to run rebalance

:: [   PASS   ] :: Adding a spare brick  (Expected 0, got 0)
:: [  BEGIN   ] :: starting rebalance :: actually running 'gluster volume rebalance  rebalvol  start'
 volume rebalance: rebalvol: success: Starting rebalance on volume rebalvol has been successful.
 ID: 48851e59-9f9d-4d79-b6c2-9c8a03965044
:: [   PASS   ] :: starting rebalance (Expected 0, got 0)

 :: [  BEGIN   ] :: Peer probing rhsauto062.lab.eng.blr.redhat.com :: actually running 'gluster peer probe rhsauto062.lab.eng.blr.redhat.com'
 peer probe: success. 
:: [   PASS   ] :: Peer probing rhsauto062.lab.eng.blr.redhat.com (Expected 0, got 0)


:: [ 02:59:06 ] :: gluster peer status
:: [  BEGIN   ] :: Running 'gluster peer status'
Number of Peers: 2

Hostname: rhsauto019.lab.eng.blr.redhat.com
Uuid: 141ec389-ab2a-42df-a96e-fa28462f6c89
State: Peer Rejected (Connected)

Hostname: rhsauto062.lab.eng.blr.redhat.com
Uuid: 398605a1-488f-4117-95c3-ce342438fb31
State: Peer in Cluster (Connected)

Checked the glusterd logs (http://lab-02.rhts.eng.blr.redhat.com/beaker/logs/tasks/24142+/24142546/etc-glusterfs-glusterd.vol.log) and found the following:

[2014-09-03 21:17:55.671776] W [socket.c:529:__socket_rwv] 0-management: readv on 10.70.36.249:24007 failed (Connection timed out)
[2014-09-03 21:17:55.671880] I [MSGID: 106004] [glusterd-handler.c:4388:__glusterd_peer_rpc_notify] 0-management: Peer 141ec389-ab2a-42df-a96e-fa28462f6c89, in Peer in Cluster state, has disconnected from glusterd.
[2014-09-03 21:17:55.671910] W [glusterd-locks.c:632:glusterd_mgmt_v3_unlock] 0-management: Lock for vol rebalvol not held
[2014-09-03 21:18:24.119712] E [socket.c:2169:socket_connect_finish] 0-management: connection to 10.70.36.249:24007 failed (No route to host)
[2014-09-03 21:18:24.119814] I [MSGID: 106004] [glusterd-handler.c:4388:__glusterd_peer_rpc_notify] 0-management: Peer 141ec389-ab2a-42df-a96e-fa28462f6c89, in Peer in Cluster state, has disconnected from glusterd.
The message "I [MSGID: 106004] [glusterd-handler.c:4388:__glusterd_peer_rpc_notify] 0-management: Peer 141ec389-ab2a-42df-a96e-fa28462f6c89, in Peer in Cluster state, has disconnected from glusterd." repeated 28 times between [2014-09-03 21:18:24.119814] and [2014-09-03 21:20:12.272821]

Comment 2 Susant Kumar Palai 2014-09-05 09:58:17 UTC
So far I can see that the rebalance client saw a child down event which led to rebalance failure. 

> [2014-09-03 21:29:00.739735] E
> [client-handshake.c:1498:client_query_portmap_cbk]
> 0-rebalvol-client-1: failed to get the port number for remote
> subvolume. Please run 'gluster volume status' on server to see if
> brick process is running.
> [2014-09-03 21:29:00.740355] I [client.c:2215:client_rpc_notify]
> 0-rebalvol-client-1: disconnected from rebalvol-client-1. Client
> process will keep trying to connect to glusterd until brick's port is
> available
> [2014-09-03 21:29:00.740395] W [dht-common.c:5914:dht_notify]
> 0-rebalvol-dht: Received CHILD_DOWN. Exiting

Talked to Lala and he said the brick did not crash.
From the glusterd logs we can see a disconnection between peers, leading to "Peer Rejected".

>>>>>>>>>>>
Hostname: rhsauto019.lab.eng.blr.redhat.com
Uuid: 141ec389-ab2a-42df-a96e-fa28462f6c89
State: Peer Rejected (Connected)
>>>>>>>>>>>


At this point it seems to be a network disconnection issue.

Comment 4 Lalatendu Mohanty 2014-09-29 09:44:44 UTC
This issue has not been reproducible in the BVT runs over the last couple of weeks, hence lowering the severity.

Comment 5 Atin Mukherjee 2014-10-29 07:11:49 UTC
Upstream patch link : http://review.gluster.org/8932

Comment 6 Atin Mukherjee 2014-10-29 08:53:13 UTC
Downstream patch link : https://code.engineering.redhat.com/gerrit/#/c/35648/

Comment 7 Atin Mukherjee 2014-11-11 04:51:00 UTC
glusterd's op state machine (op-sm) uses the global peer list, which may be modified if a peer membership change is requested while an op-sm transaction is in progress. This can leave the transaction with an incorrect peerinfo structure, causing either the op-sm or the peer membership command to behave incorrectly.

The fix is to use a local, per-transaction peer list instead of the global peer list.
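A CLI-level way to check the fixed behaviour (this exercises only the externally visible behaviour, not the op-sm internals; the volume and host names are illustrative):

# Start rebalance, then probe a new peer while it is running
gluster volume rebalance rebalvol start
gluster peer probe node3

# Poll until rebalance finishes; no peer should ever report "Peer Rejected"
while gluster volume rebalance rebalvol status | grep -q 'in progress'; do
    gluster peer status | grep 'Peer Rejected' && echo "peer rejected during rebalance"
    sleep 5
done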

Comment 8 SATHEESARAN 2014-11-27 12:26:18 UTC
Verified with glusterfs-3.6.0.34-1.el6rhs

Did the following tests,

1. Created a cluster of 2 nodes
2. Created a distribute volume of 2 bricks
3. Fuse mounted the volume and created 10 files of 10GB each
4. Added more bricks and triggered rebalance
5. While rebalance is going on, tried to probe a new peer.
Also tested probing/detaching a peer and volume set operations while rebalance was in progress. All worked seamlessly.

Tried peer probe/detach and volume set operations during remove-brick with data migration as well. All worked well.
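For reference, a sketch of the remove-brick variant of this check (the brick path, host names, and volume option below are illustrative):

# Start remove-brick with data migration and watch its progress
gluster volume remove-brick rebalvol node1:/bricks/rebalvol_brick2 start
gluster volume remove-brick rebalvol node1:/bricks/rebalvol_brick2 status

# While the migration is in progress, exercise peer and volume-set operations
gluster peer probe node4
gluster peer detach node4
gluster volume set rebalvol performance.readdir-ahead on
gluster peer status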

Comment 9 Divya 2015-01-09 08:50:35 UTC
Atin,

Please review the edited doc text and sign-off.

Comment 10 Atin Mukherjee 2015-01-13 07:02:51 UTC
Doc text looks okay to me, verified.

Comment 12 errata-xmlrpc 2015-01-15 13:39:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0038.html

