Description of problem:
========================
There were 4 nodes in the cluster. A distribute volume was created with 2 bricks, one brick per node, using 2 of the nodes. Before the volume was started, restarting glusterd on any of the nodes moved the peers to the "Peer Rejected (Connected)" state. After this the nodes are out of sync.

Version-Release number of selected component (if applicable):
===============================================================
glusterfs 3.6.0.24 built on Jul 3 2014 11:03:38

How reproducible:
===================
Often

Steps to Reproduce:
========================
1. Have a cluster with 4 nodes. Create a distribute volume with 2 bricks, one brick per node.

root@rhs-client13 [Jul-16-2014-16:55:15] >gluster v info dis1

Volume Name: dis1
Type: Distribute
Volume ID: daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
Status: Created
Snap Volume: no
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: rhs-client13:/rhs/device1/b1
Brick2: rhs-client14:/rhs/device1/b2
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable

2. Check the volume checksum on all the nodes. The volume checksum differs.

Node1:
~~~~~~
root@rhs-client11 [Jul-16-2014-16:54:59] >cat /var/lib/glusterd/vols/dis1/cksum
info=3943198411

Node2:
~~~~~~
root@rhs-client12 [Jul-16-2014-16:54:59] >cat /var/lib/glusterd/vols/dis1/cksum
info=3943198411

Node3:
~~~~~~
root@rhs-client13 [Jul-16-2014-16:55:21] >cat /var/lib/glusterd/vols/dis1/cksum
info=1802542222

Node4:
~~~~~~
root@rhs-client14 [Jul-16-2014-16:57:44] >cat /var/lib/glusterd/vols/dis1/cksum
info=1802542222

3. Restart glusterd on Node1.
Node1 peer status:
~~~~~~~~~~~~~~~~~~
root@rhs-client11 [Jul-16-2014-17:01:05] >gluster peer status
Number of Peers: 4

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer Rejected (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer Rejected (Connected)

Node2 peer status:
~~~~~~~~~~~~~~~~~~~
root@rhs-client12 [Jul-16-2014-16:59:51] >gluster peer status
Number of Peers: 4

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer in Cluster (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer in Cluster (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Node3 peer status:
~~~~~~~~~~~~~~~~~~~
root@rhs-client13 [Jul-16-2014-17:00:43] >gluster peer status
Number of Peers: 4

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer Rejected (Connected)

Node4 peer status:
~~~~~~~~~~~~~~~~~~~
root@rhs-client14 [Jul-16-2014-16:59:51] >gluster peer status
Number of Peers: 4

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer Rejected (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer in Cluster (Connected)

Expected results:
====================
The volume checksums should match on all nodes, so that no peer is moved to the rejected state and the nodes stay in sync.

Additional info:
===================

Node1 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client11 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
caps=15
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2

Node2 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client12 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
caps=15
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2

Node3 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client13 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2

Node4 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client14 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2
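The mismatch between the info files above can be spotted mechanically by parsing glusterd's key=value format and diffing across nodes. A minimal sketch (abbreviated node contents are inlined from the outputs above, since this does not run against a live cluster; in practice you would collect /var/lib/glusterd/vols/dis1/info from each node):

```python
def parse_info(text):
    """Parse glusterd's key=value volume info format into a dict."""
    pairs = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        pairs[key] = value
    return pairs

# Abbreviated contents from the four nodes above; only the lines relevant
# to the mismatch are kept.  Note caps=15 is present on rhs-client11/12
# (peers without bricks) but absent on rhs-client13/14 (the brick peers).
nodes = {
    "rhs-client11": "type=0\ncount=2\nversion=1\ncaps=15",
    "rhs-client12": "type=0\ncount=2\nversion=1\ncaps=15",
    "rhs-client13": "type=0\ncount=2\nversion=1",
    "rhs-client14": "type=0\ncount=2\nversion=1",
}

parsed = {node: parse_info(text) for node, text in nodes.items()}
all_keys = set().union(*(d.keys() for d in parsed.values()))

# A key mismatches when its value (or absence) differs between nodes.
mismatched = sorted(
    key for key in all_keys
    if len({d.get(key) for d in parsed.values()}) > 1
)
print(mismatched)  # ['caps']
```

Since glusterd's cksum is computed over this file, any such per-node difference is enough to make peers reject each other on handshake.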
Shwetha,

Could you describe how you had the bricks set up (LVM/ThinP etc.)? I've not verified it yet, but I feel this is a part of the cause.

~kaushal
Created attachment 918626 [details]
Script to create bricks

Running the script: ./mkfs_snapshot1.sh "create"
Kaushal,

I feel this bug is dependent on https://bugzilla.redhat.com/show_bug.cgi?id=1116264.

'gluster volume info' has some extra capabilities listed; does that change the volume checksums?
Thanks Shwetha. Turns out the problem wasn't with the bricks. It was with the way BD xlator capabilities were being set, or rather erased, during volume create. During volume create, the caps are initially set to all-enabled. This happens for all volumes, irrespective of whether it is a BD volume or not. Then, based on the bricks' capabilities, unsupported capabilities were removed. But this removal was being done only on the peers which contained bricks for the volume. On the other peers the capabilities were not erased, which led to the checksums differing. This in turn led to peers being rejected when they were restarted.
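The behaviour just described can be modelled with a small sketch (names and the checksum function are illustrative, not glusterd's actual symbols): caps start fully enabled on every peer, but the erase step only runs on brick-hosting peers, so the persisted state, and hence its checksum, diverges; the fix is to erase on every peer.

```python
import zlib

CAPS_ALL = 15  # all BD xlator capabilities enabled (illustrative value)

def create_volume(peers, brick_peers, supported_caps, fixed=False):
    """Model the caps handling during volume create.

    Buggy behaviour: unsupported caps are erased only on peers that
    host bricks.  Fixed behaviour: erased on every peer.
    """
    state = {}
    for peer in peers:
        caps = CAPS_ALL                 # initially all-enabled, everywhere
        if fixed or peer in brick_peers:
            caps &= supported_caps      # erase unsupported capabilities
        state[peer] = "caps=%d" % caps  # persisted volume-info fragment
    return state

def cksum(text):
    """Stand-in for glusterd's volume-info checksum."""
    return zlib.crc32(text.encode())

peers = ["n1", "n2", "n3", "n4"]
bricks = {"n3", "n4"}               # only n3/n4 host bricks, as in this bug

buggy = create_volume(peers, bricks, supported_caps=0)
fixed = create_volume(peers, bricks, supported_caps=0, fixed=True)

# Buggy: checksums diverge between brick and non-brick peers.
print(len({cksum(v) for v in buggy.values()}))   # 2
# Fixed: all peers agree on one checksum.
print(len({cksum(v) for v in fixed.values()}))   # 1
```

This matches the captured info files, where caps=15 survives only on the two peers that hold no bricks.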
(In reply to SATHEESARAN from comment #4)
> Kaushal,
>
> I feel this bug is dependent on,
> https://bugzilla.redhat.com/show_bug.cgi?id=1116264
>
> 'gluster volume info' has got some extra capabilities listed and does that
> change the volume checksums ?

Yes. Both of these are caused by the same problem. I would close this bug as a duplicate of the above, but that would probably cause some other procedural problems. Can you check and let me know if it is okay?
*** Bug 1116264 has been marked as a duplicate of this bug. ***
Verified with glusterfs-3.6.0.25-1.el6rhs

Performed the following steps:
1. Created a 4-node cluster (Trusted Storage Pool).
2. Created thinp bricks as recommended for the gluster volume snapshot feature.
3. Created a distribute volume and a distribute-replicate volume.
4. Checked 'gluster volume info' on all the nodes.
   Observation: No BD xlator related capabilities are shown in the 'gluster volume info' output.
5. Restarted glusterd on all nodes.
   Observation: 'gluster peer status' shows all the nodes with "Peer in Cluster".
6. Started the volume.
7. Checked 'gluster peer status' and 'gluster volume info'.
   Observations remain the same.

Marking this bug as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html