Description of problem:
========================
There were 4 nodes in the cluster. A distribute volume was created with 2 bricks, one brick per node, using 2 of the nodes. Before the volume was started, restarting glusterd on any of the nodes moved the peers to the "Peer Rejected (Connected)" state. After this the nodes are out of sync.

Version-Release number of selected component (if applicable):
===============================================================
glusterfs 3.6.0.24 built on Jul 3 2014 11:03:38

How reproducible:
===================
Often

Steps to Reproduce:
========================
1. Have a cluster with 4 nodes. Create a distribute volume with 2 bricks, one brick per node.

root@rhs-client13 [Jul-16-2014-16:55:15] >gluster v info dis1

Volume Name: dis1
Type: Distribute
Volume ID: daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
Status: Created
Snap Volume: no
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: rhs-client13:/rhs/device1/b1
Brick2: rhs-client14:/rhs/device1/b2
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable

2. Check the volume checksum on all the nodes. The volume checksum differs.

Node1:
~~~~~~
root@rhs-client11 [Jul-16-2014-16:54:59] >cat /var/lib/glusterd/vols/dis1/cksum
info=3943198411

Node2:
~~~~~~
root@rhs-client12 [Jul-16-2014-16:54:59] >cat /var/lib/glusterd/vols/dis1/cksum
info=3943198411

Node3:
~~~~~~
root@rhs-client13 [Jul-16-2014-16:55:21] >cat /var/lib/glusterd/vols/dis1/cksum
info=1802542222

Node4:
~~~~~~
root@rhs-client14 [Jul-16-2014-16:57:44] >cat /var/lib/glusterd/vols/dis1/cksum
info=1802542222

3. Restart glusterd on Node1.
Node1 peer status:
~~~~~~~~~~~~~~~~~~
root@rhs-client11 [Jul-16-2014-17:01:05] >gluster peer status
Number of Peers: 4

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer Rejected (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer Rejected (Connected)

Node2 peer status:
~~~~~~~~~~~~~~~~~~~
root@rhs-client12 [Jul-16-2014-16:59:51] >gluster peer status
Number of Peers: 4

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer in Cluster (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer in Cluster (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Node3 peer status:
~~~~~~~~~~~~~~~~~~~
root@rhs-client13 [Jul-16-2014-17:00:43] >gluster peer status
Number of Peers: 4

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client14
Uuid: adf3e812-f95f-431d-81f6-2f58df8f9aee
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer Rejected (Connected)

Node4 peer status:
~~~~~~~~~~~~~~~~~~~
root@rhs-client14 [Jul-16-2014-16:59:51] >gluster peer status
Number of Peers: 4

Hostname: rhs-client12
Uuid: 6c86ec43-d220-49f1-8b34-364c745d8422
State: Peer in Cluster (Connected)

Hostname: 10.70.36.35
Uuid: 60add8e4-7cf6-4d35-8b7c-1b6884ed8c6a
State: Peer Rejected (Connected)

Hostname: mia
Uuid: 647348db-1489-4795-871f-e9a71ffa2ef2
State: Peer Rejected (Connected)

Hostname: rhs-client13
Uuid: aa523db7-35fa-4737-b5b3-fb150b78ba8e
State: Peer in Cluster (Connected)

Expected results:
====================
The volume checksums should match on all nodes, so that no peer is moved to the rejected state and the nodes stay in sync.

Additional info:
===================

Node1 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client11 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
caps=15
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2

Node2 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client12 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
caps=15
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2

Node3 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client13 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2

Node4 glusterd vol info for the volume dis1:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
root@rhs-client14 [Jul-16-2014-17:09:59] >cat /var/lib/glusterd/vols/dis1/info
type=0
count=2
status=0
sub_count=0
stripe_count=1
replica_count=1
version=1
transport-type=0
volume-id=daf0dbe1-ceb1-4a62-9e43-c57fc7c4de8e
username=8dc282dd-bbb4-46b2-9c29-9419f046edf3
password=57277be1-5589-4191-8d52-529d4cbc9779
parent_volname=N/A
restored_from_snap=00000000-0000-0000-0000-000000000000
snap-max-hard-limit=256
performance.readdir-ahead=on
brick-0=rhs-client13:-rhs-device1-b1
brick-1=rhs-client14:-rhs-device1-b2
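The mismatch between the info files above can be spotted mechanically by parsing glusterd's key=value format and diffing across nodes. A minimal sketch (abbreviated node contents are inlined from the outputs above, since this does not run against a live cluster; in practice you would collect /var/lib/glusterd/vols/dis1/info from each node):

```python
def parse_info(text):
    """Parse glusterd's key=value volume info format into a dict."""
    pairs = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition("=")
        pairs[key] = value
    return pairs

# Abbreviated contents from the four nodes above; only the lines relevant
# to the mismatch are kept.  Note caps=15 is present on rhs-client11/12
# (peers without bricks) but absent on rhs-client13/14 (the brick peers).
nodes = {
    "rhs-client11": "type=0\ncount=2\nversion=1\ncaps=15",
    "rhs-client12": "type=0\ncount=2\nversion=1\ncaps=15",
    "rhs-client13": "type=0\ncount=2\nversion=1",
    "rhs-client14": "type=0\ncount=2\nversion=1",
}

parsed = {node: parse_info(text) for node, text in nodes.items()}
all_keys = set().union(*(d.keys() for d in parsed.values()))

# A key mismatches when its value (or absence) differs between nodes.
mismatched = sorted(
    key for key in all_keys
    if len({d.get(key) for d in parsed.values()}) > 1
)
print(mismatched)  # ['caps']
```

Since glusterd's cksum is computed over this file, any such per-node difference is enough to make peers reject each other on handshake.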
Shwetha,

Could you describe how you had the bricks set up (LVM/ThinP etc.)? I've not verified it yet, but I feel this is a part of the cause.

~kaushal
Created attachment 918626 [details]
Script to create bricks

Running the script: ./mkfs_snapshot1.sh "create"
Kaushal,

I feel this bug is dependent on https://bugzilla.redhat.com/show_bug.cgi?id=1116264.

'gluster volume info' has some extra capabilities listed; does that change the volume checksums?
Thanks Shwetha. Turns out the problem wasn't with the bricks. It was with the way BD xlator capabilities were being set, or rather erased, during volume create. During volume create, the caps are initially set to all-enabled. This happens for all volumes, irrespective of whether it is a BD volume or not. Then, based on the bricks' capabilities, unsupported capabilities were removed. But this removal was being done only on the peers which contained bricks for the volume. On the other peers the capabilities were not erased, which led to the checksums differing. This in turn led to peers being rejected when they were restarted.
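The behaviour just described can be modelled with a small sketch (names and the checksum function are illustrative, not glusterd's actual symbols): caps start fully enabled on every peer, but the erase step only runs on brick-hosting peers, so the persisted state, and hence its checksum, diverges; the fix is to erase on every peer.

```python
import zlib

CAPS_ALL = 15  # all BD xlator capabilities enabled (illustrative value)

def create_volume(peers, brick_peers, supported_caps, fixed=False):
    """Model the caps handling during volume create.

    Buggy behaviour: unsupported caps are erased only on peers that
    host bricks.  Fixed behaviour: erased on every peer.
    """
    state = {}
    for peer in peers:
        caps = CAPS_ALL                 # initially all-enabled, everywhere
        if fixed or peer in brick_peers:
            caps &= supported_caps      # erase unsupported capabilities
        state[peer] = "caps=%d" % caps  # persisted volume-info fragment
    return state

def cksum(text):
    """Stand-in for glusterd's volume-info checksum."""
    return zlib.crc32(text.encode())

peers = ["n1", "n2", "n3", "n4"]
bricks = {"n3", "n4"}               # only n3/n4 host bricks, as in this bug

buggy = create_volume(peers, bricks, supported_caps=0)
fixed = create_volume(peers, bricks, supported_caps=0, fixed=True)

# Buggy: checksums diverge between brick and non-brick peers.
print(len({cksum(v) for v in buggy.values()}))   # 2
# Fixed: all peers agree on one checksum.
print(len({cksum(v) for v in fixed.values()}))   # 1
```

This matches the captured info files, where caps=15 survives only on the two peers that hold no bricks.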
(In reply to SATHEESARAN from comment #4)
> Kaushal,
>
> I feel this bug is dependent on,
> https://bugzilla.redhat.com/show_bug.cgi?id=1116264
>
> 'gluster volume info' has got some extra capabilities listed and does that
> change the volume checksums ?

Yes. Both of these are caused by the same problem. I would close this bug as a duplicate of the above, but that would probably cause some other procedural problems. Can you check and let me know if it is okay?
*** Bug 1116264 has been marked as a duplicate of this bug. ***
Verified with glusterfs-3.6.0.25-1.el6rhs

Performed the following steps:
1. Created a 4-node cluster (Trusted Storage Pool).
2. Created thinp bricks as recommended for the gluster volume snapshot feature.
3. Created a distribute volume and a distribute-replicate volume.
4. Checked 'gluster volume info' on all the nodes.
   Observation: No BD xlator related capabilities are shown in the 'gluster volume info' output.
5. Restarted glusterd on all nodes.
   Observation: 'gluster peer status' shows all the nodes with "Peer in Cluster".
6. Started the volume.
7. Checked 'gluster peer status' and 'gluster volume info'.
   Observations remain the same.

Marking this bug as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-1278.html