Red Hat Bugzilla – Bug 1019817
Node's glusterd operating-version "2" is reset to operating-version "1" by stale peers
Last modified: 2015-12-03 12:21:59 EST
Description of problem:
A node that was part of a storage cluster with operating-version=1 was powered off and re-installed with the latest RHS 2.1 ISO, which sets the glusterd operating-version to 2.
When the node came back online, its glusterd operating-version changed from operating-version=2 to operating-version=1: as soon as it reconnected, the other peers in the cluster established the connection and reset its "operating-version" to "1". Because the node's glusterd UUID changed (due to the fresh install) while its hostname and IP address stayed the same, the node was moved to the "Peer Rejected" state by the cluster, though it still shows as connected.
The node had no peer or volume information, so it was added to another cluster. Since the node carried glusterd "operating-version":"1", every other node in that cluster, all of which had "operating-version" "2", was then downgraded to "operating-version" "1" by the new node.
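The downgrade behavior described above can be sketched as a small model (an illustrative sketch, not glusterd's actual implementation): each peer advertises its operating-version during the handshake, and the effective cluster op-version settles on the minimum advertised value, so a single peer at version 1 drags the whole cluster down.

```python
# Hypothetical model of the op-version negotiation described in this bug:
# the cluster-wide operating-version is the minimum advertised by any peer,
# so one stale or downgraded peer lowers the op-version for everyone.

def effective_op_version(peer_versions):
    """Return the cluster-wide op-version: the lowest among all peers."""
    return min(peer_versions)

# node1 and node3 advertise op-version 2; the rejoining node2 advertises 1.
cluster = {"node1": 2, "node3": 2, "node2": 1}
print(effective_op_version(cluster.values()))  # prints 1 -- cluster downgraded
```

This is why features gated on op-version 2 (such as the quotad status command in step 6) become unavailable as soon as a version-1 peer joins.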
Version-Release number of selected component (if applicable):
glusterfs 126.96.36.199rhs built on Oct 15 2013 14:06:04
Steps to Reproduce:
1. Create a cluster with 2 nodes (node1 and node2) in "operating-version" : "1"
2. Re-install RHS on node2 with latest RHS2.1 iso.
At this phase, peer status output from node1 looks like:
[root@upgrade-4 ~]# gluster peer status
Number of Peers: 1
State: Peer Rejected (Connected)
[root@upgrade-4 ~]# cat /var/lib/glusterd/glusterd.info
peer status output from node2 looks like:
root@rhs-client11 [Oct-16-2013-10:19:03] >gluster peer status
Number of Peers: 0
root@rhs-client11 [Oct-16-2013-11:16:26] >cat glusterd.info
3. Peer probe from node3 (latest RHS 2.1, operating-version=2) to node2 (peer probe successful).
4. Now node3's operating version has also changed to "operating-version": "1".
5. Create a volume and start it.
6. Try to check the volume status of quotad:
root@rhs-client11 [Oct-16-2013-10:26:07] >gluster v status `gluster v list` quotad
The cluster is operating at version 1. Getting the status of quotad is not allowed in this state.
Before adding node2 to a different cluster (step 3), it must be detached completely from the old cluster it was part of; that is, 'gluster peer detach node2 force' should be executed from node1 before node2 is added to the new cluster.
With this, node2's op-version is not affected, and when node2 is probed by node3 the cluster's op-version remains 2 and the quotad command succeeds.
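The detach-before-reuse procedure above can be sketched as a short shell sequence (node names are taken from the reproduction steps; this is an illustrative ops sketch to be run against a live cluster, not output captured from this bug):

```shell
# On node1: forcibly detach the re-installed node so no stale peer state
# (old UUID, old op-version) lingers in the original cluster.
gluster peer detach node2 force

# On node3: only now probe node2 into the new cluster; both sides are at
# op-version 2, so the cluster op-version is not downgraded.
gluster peer probe node2

# quotad status is permitted again, since the cluster stays at op-version 2.
gluster v status `gluster v list` quotad
```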
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release you asked us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/
If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.