Bug 1120589 - Peers moved to rejected state because of mismatch in volume checksums after creating the volume.
Summary: Peers moved to rejected state because of mismatch in volume checksums after creating the volume.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: glusterd
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Kaushal
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1120198
 
Reported: 2014-07-17 08:46 UTC by Kaushal
Modified: 2014-11-11 08:37 UTC
CC List: 9 users

Fixed In Version: glusterfs-3.6.0beta1
Doc Type: Bug Fix
Doc Text:
Clone Of: 1120198
Environment:
Last Closed: 2014-11-11 08:37:08 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Kaushal 2014-07-17 08:46:06 UTC
+++ This bug was initially created as a clone of Bug #1120198 +++

Description of problem:
========================
There were 4 nodes in the cluster. Using 2 of the nodes, a distribute volume was created with 2 bricks, one brick per node.

If glusterd is restarted on any of the nodes before the volume is started, the nodes move to the "Peer Rejected (Connected)" state. After this the nodes are out of sync.

Version-Release number of selected component (if applicable):
===============================================================
glusterfs 3.6.0.24 built on Jul  3 2014 11:03:38

How reproducible:
===================
Often

Steps to Reproduce:
========================
1. Have a cluster with 4 nodes. Create a distribute volume with 2 bricks, one brick per node (a consolidated command sketch for steps 1-3 follows the peer-status transcripts below).

root@rhs-client13 [Jul-16-2014-16:55:15] >gluster v info dis1
 
Volume Name: dis1
Type: Distribute
Volume ID: daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
Status: Created
Snap Volume: no
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: rhs-client13:/rhs/device1/b1
Brick2: rhs-client14:/rhs/device1/b2
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable


2. Check the volume checksum on all the nodes. The checksums differ between nodes.

Node1:
~~~~~~
root@rhs-client11 [Jul-16-2014-16:54:59] >cat /var/lib/glusterd/vols/dis1/cksum 
info=3943198411

Node2:
~~~~~~
root@rhs-client12 [Jul-16-2014-16:54:59] >cat /var/lib/glusterd/vols/dis1/cksum 
info=3943198411

Node3:
~~~~~~
root@rhs-client13 [Jul-16-2014-16:55:21] >cat /var/lib/glusterd/vols/dis1/cksum 
info=1802542222

Node4:
~~~~~
root@rhs-client14 [Jul-16-2014-16:57:44] >cat /var/lib/glusterd/vols/dis1/cksum 
info=1802542222


3. Restart glusterd on Node1 and check the peer status on each node.

Node1 peer status
~~~~~~~~~~~~~~~~~~
root@rhs-client11 [Jul-16-2014-17:01:05] >gluster peer status
Number of Peers: 4

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer Rejected (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer Rejected (Connected)
root@rhs-client11 [Jul-16-2014-17:01:14] >


Node2 peer status:
~~~~~~~~~~~~~~~~~~~
root@rhs-client12 [Jul-16-2014-16:59:51] >gluster peer status
Number of Peers: 4

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer in Cluster (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer in Cluster (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)
root@rhs-client12 [Jul-16-2014-17:08:09] >


Node3 peer status:
~~~~~~~~~~~~~~~~~~~
root@rhs-client13 [Jul-16-2014-17:00:43] >gluster peer status
Number of Peers: 4

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer Rejected (Connected)
root@rhs-client13 [Jul-16-2014-17:00:57] >

Node4 peer status:
~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client14 [Jul-16-2014-16:59:51] >gluster peer status
Number of Peers: 4

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer Rejected (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer in Cluster (Connected)
root@rhs-client14 [Jul-16-2014-17:07:39] >
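
The steps above can be driven from one node with something like the sketch below. Hostnames, brick paths and the volume name are taken from the transcripts in this report; passwordless ssh to the peers and an init-script managed glusterd (service glusterd restart) are assumed, so treat it as an illustration rather than an exact test script.

#!/bin/sh
# Reproduction sketch for steps 1-3 of this report.
NODES="rhs-client11 rhs-client12 rhs-client13 rhs-client14"
VOL=dis1

# Step 1: create (but do not start) a 2-brick distribute volume.
gluster volume create "$VOL" \
    rhs-client13:/rhs/device1/b1 rhs-client14:/rhs/device1/b2

# Step 2: print the stored volume checksum from every node.
for node in $NODES; do
    printf '%s: ' "$node"
    ssh "$node" cat "/var/lib/glusterd/vols/$VOL/cksum"
done

# Step 3: restart glusterd on Node1, then check its view of the peers.
ssh rhs-client11 service glusterd restart
ssh rhs-client11 gluster peer status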


Expected results:
====================
The volume checksums should match on all nodes, so that peers are not moved to the rejected state and the nodes stay in sync.

Additional info:
===================
Node1 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

root@rhs-client11 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info 
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
caps=15
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2
root@rhs-client11 [Jul-16-2014-17:10:09] >

Node2 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client12 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info 
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
caps=15
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2
root@rhs-client12 [Jul-16-2014-17:10:09] >

Node3 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client13 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info 
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2
root@rhs-client13 [Jul-16-2014-17:10:09] >

Node4 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client14 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info 
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2
root@rhs-client14 [Jul-16-2014-17:10:09] >

--- Additional comment from Kaushal on 2014-07-17 12:19:11 IST ---

Shwetha,
Could you describe how you had the bricks set up (LVM/ThinP, etc.)? I've not verified it yet, but I suspect this is part of the cause.

~kaushal

--- Additional comment from  on 2014-07-17 13:12:35 IST ---

The bricks were set up by running the script: ./mkfs_snapshot1.sh "create"

--- Additional comment from SATHEESARAN on 2014-07-17 13:42:23 IST ---

Kaushal,

I feel this bug is dependent on https://bugzilla.redhat.com/show_bug.cgi?id=1116264

'gluster volume info' lists some extra capabilities; does that change the volume checksums?

--- Additional comment from Kaushal on 2014-07-17 14:04:14 IST ---

Thanks Shwetha.

It turns out the problem wasn't with the bricks. It was with the way BD xlator capabilities were being set, or rather erased, during volume create.

During volume create, the caps are initially set to all enabled. This happens for every volume, irrespective of whether it is a BD volume or not. Then, based on the bricks' capabilities, unsupported capabilities were removed. But this removal was being done only on the peers which contained bricks for the volume; on the other peers the capabilities were not erased, which led to the checksums differing.
This led to peers being rejected when they were restarted.
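
This matches the info files captured earlier in this report: the brick-hosting peers (rhs-client13 and rhs-client14) have no caps= line in /var/lib/glusterd/vols/dis1/info, while the non-brick peers (rhs-client11 and rhs-client12) still carry caps=15, and the checksums split along the same boundary. As a diagnostic sketch (passwordless ssh assumed), the differing key can be spotted with:

ssh rhs-client11 cat /var/lib/glusterd/vols/dis1/info > info.client11
ssh rhs-client13 cat /var/lib/glusterd/vols/dis1/info > info.client13
diff info.client11 info.client13    # per the dumps above, only caps=15 differs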

--- Additional comment from Kaushal on 2014-07-17 14:07:32 IST ---

(In reply to SATHEESARAN from comment #4)
> Kaushal,
> 
> I feel this bug is dependent on,
> https://bugzilla.redhat.com/show_bug.cgi?id=1116264
> 
> 'gluster volume info' has got some extra capabilities listed and does that
> change the volume checksums ?

Yes. Both of these are caused by the same problem. I would close this bug as a duplicate of the above, but that would probably cause some other procedural problems. Can you check and let me know if it is okay?

Comment 1 Anand Avati 2014-07-17 09:09:00 UTC
REVIEW: http://review.gluster.org/8323 (Correctly reset volinfo->caps during volume create) posted (#1) for review on master by Kaushal M (kaushal)

Comment 2 Anand Avati 2014-07-17 10:04:25 UTC
REVIEW: http://review.gluster.org/8323 (glusterd: Correctly reset volinfo->caps during volume create) posted (#2) for review on master by Kaushal M (kaushal)

Comment 3 Anand Avati 2014-07-17 12:21:06 UTC
COMMIT: http://review.gluster.org/8323 committed in master by Krishnan Parthasarathi (kparthas) 
------
commit 8896ffd86b1856de17d65874f89a76ad84b6258b
Author: Kaushal M <kaushal>
Date:   Thu Jul 17 14:17:17 2014 +0530

    glusterd: Correctly reset volinfo->caps during volume create
    
    Change-Id: I012899be08a06d39ea5c9fb98a66acf833d7213f
    BUG: 1120589
    Signed-off-by: Kaushal M <kaushal>
    Reviewed-on: http://review.gluster.org/8323
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Krishnan Parthasarathi <kparthas>
    Tested-by: Krishnan Parthasarathi <kparthas>

Comment 4 Niels de Vos 2014-09-22 12:44:55 UTC
A beta release for GlusterFS 3.6.0 has been released [1]. Please verify whether this release resolves this bug report for you. If the glusterfs-3.6.0beta1 release does not have a resolution for this issue, leave a comment in this bug and move the status to ASSIGNED. If this release fixes the problem for you, leave a note and change the status to VERIFIED.

Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure (possibly an "updates-testing" repository) for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-September/018836.html
[2] http://supercolony.gluster.org/pipermail/gluster-users/

Comment 5 Niels de Vos 2014-11-11 08:37:08 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.6.1, please reopen this bug report.

glusterfs-3.6.1 has been announced [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://supercolony.gluster.org/pipermail/gluster-users/2014-November/019410.html
[2] http://supercolony.gluster.org/mailman/listinfo/gluster-users

