Bug 1168897 - Attempt remove-brick after node has terminated in cluster gives error: volume remove-brick commit force: failed: One or more nodes do not support the required op-version. Cluster op-version must atleast be 30600.
Summary: Attempt remove-brick after node has terminated in cluster gives error: volume...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: GlusterFS
Classification: Community
Component: build
Version: 3.6.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2014-11-28 11:03 UTC by john.lane
Modified: 2016-08-30 12:47 UTC (History)
7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-30 12:47:25 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description john.lane 2014-11-28 11:03:00 UTC
One node failed/terminated in a 2-node cluster with volumes replicated across the 2 nodes.
When attempting to remove the bricks associated with the now non-existent node, it fails as below:
gluster volume remove-brick gv_bmo replica 1 10.110.25.137:/data/brick/gv_bmo force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: failed: One or more nodes do not support the required op-version. Cluster op-version must atleast be 30600.

If I attach a new node to the cluster (with the same OS version and glusterfs package versions, all components 3.6.1-1.el7.x86_64) and try to add a brick to the volume, a similar error is reported:

 gluster vol add-brick gv_bmo replica 3 10.110.25.243:/data/brick/gv_bmo
volume add-brick: failed: One or more nodes do not support the required op-version. Cluster op-version must atleast be 30600.

Removing bricks in the same manner previously worked with 3.5.1.


Version-Release number of selected component (if applicable):

glusterfs-libs-3.6.1-1.el7.x86_64
glusterfs-3.6.1-1.el7.x86_64
glusterfs-server-3.6.1-1.el7.x86_64
glusterfs-api-3.6.1-1.el7.x86_64
glusterfs-fuse-3.6.1-1.el7.x86_64
glusterfs-cli-3.6.1-1.el7.x86_64



How reproducible:


Steps to Reproduce:
1. Set up volume replicated across bricks on 2 nodes
2. Shutdown one node
3. Try to remove the brick associated with failed node.

Actual results:
failed: One or more nodes do not support the required op-version. Cluster op-version must atleast be 30600.

Expected results:


Additional info:

Comment 1 john.lane 2014-11-28 15:54:10 UTC
I have found I can 'fix' the issue by changing the op-version setting in glusterd.info:

 diff glusterd.info glusterd.info.orig
2c2
< operating-version=30600
---
> operating-version=30501

and restarting glusterd.

Presumably, the glusterfs-server package update forgets to modify this file.
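The workaround above amounts to a one-line edit of glusterd.info on each node, followed by a glusterd restart. A minimal sketch, demonstrated on a scratch copy so it is safe to run anywhere (on a real node the file is typically /var/lib/glusterd/glusterd.info, and the UUID below is a placeholder):

```shell
# Create a scratch copy resembling glusterd.info (placeholder UUID, old op-version).
printf 'UUID=00000000-0000-0000-0000-000000000000\noperating-version=30501\n' > /tmp/glusterd.info

# The edit from this comment: bump the recorded op-version to 30600.
sed -i 's/^operating-version=.*$/operating-version=30600/' /tmp/glusterd.info

# Confirm the change took effect.
grep '^operating-version=' /tmp/glusterd.info
```

On the real file you would follow this with a restart of glusterd on each node.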

Comment 2 john.lane 2014-11-28 15:55:01 UTC
It probably should be operating-version=30601, but the above works.

Comment 3 Khoi Mai 2014-12-09 00:13:44 UTC
Is there a CLI command that would update that file in 3.5.3? When I try to execute what I found by searching:

https://botbot.me/freenode/gluster/search/?q=op-version

# gluster volume set all cluster.op-version 30501
volume set: failed: option : cluster.op-version does not exist
Did you mean cluster.eager-lock?


Am I doing something wrong?

Comment 4 Anatoly Pugachev 2015-03-06 09:53:20 UTC
Can someone bump the version for this bug to mainline, since it is still present in 3.6.2 (package glusterfs-3.6.2-1.fc21.x86_64)? Thanks.


[root@node01 ~]# rpm -q glusterfs
glusterfs-3.6.2-1.fc21.x86_64
[root@node01 ~]# gluster vol info vol1
 
Volume Name: vol1
Type: Replicate
Volume ID: 5b6ae6c7-6139-45b5-b4af-d2f21709f6f2
Status: Stopped
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: node01:/export/sdb1/brick
Brick2: node02:/export/sdb1/brick
[root@node01 ~]# gluster vol remove-brick vol1 replica 1 node02:/export/sdb1/brick force
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit force: failed: One or more nodes do not support the required op-version. Cluster op-version must atleast be 30600.

Comment 5 mailbox 2015-05-20 13:59:18 UTC
Seen also while using 3.6.2 from Ubuntu PPA (http://ppa.launchpad.net/gluster/glusterfs-3.6/ubuntu) on Ubuntu 14.04.

My /var/lib/glusterd.info contains an `operating-version=2` line; changing it to `operating-version=30601` prevented the daemon from starting, leaving the following in the logs:

E [glusterd-store.c:2037:glusterd_restore_op_version] 0-management: wrong op-version (30601) retrieved
E [glusterd-store.c:4278:glusterd_restore] 0-management: Failed to restore op_version
E [xlator.c:425:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again

Comment 6 Kaushal 2016-08-30 12:47:25 UTC
GlusterFS doesn't automatically update the op-version. This is done to prevent incompatibilities when doing a rolling update of a GlusterFS cluster.

Users need to bump the op-version after all servers in the cluster have been upgraded. From GlusterFS-3.6, the command `gluster volume set all cluster.op-version <version>` can be used to set the op-version. More information regarding this can be obtained from https://gluster.readthedocs.io/en/latest/Upgrade-Guide/op_version/
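Before running the volume set command, it is worth confirming the op-version currently recorded on each node. A small illustrative helper (the function name and scratch file are assumptions for this sketch, not part of GlusterFS; on a real node the file is typically /var/lib/glusterd/glusterd.info):

```shell
# Hypothetical helper: report the op-version recorded in a glusterd.info file.
get_op_version() {
    awk -F= '$1 == "operating-version" { print $2 }' "$1"
}

# Exercise it on a scratch file.
printf 'operating-version=30600\n' > /tmp/glusterd.info.sample
get_op_version /tmp/glusterd.info.sample
```

Once every upgraded node is confirmed, `gluster volume set all cluster.op-version 30600` needs to be run only once, from any node in the cluster.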

