Bug 763850 (GLUSTER-2118)

Summary: [3.1.1qa5] : gluster volume sync doesn't start already started volumes
Product: [Community] GlusterFS
Reporter: Harshavardhana <fharshav>
Component: cli
Assignee: krishnan parthasarathi <kparthas>
Status: CLOSED CURRENTRELEASE
Severity: low
Priority: low
Version: 3.1.1
CC: amarts, cww, gluster-bugs, nsathyan, pkarampu
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Fixed In Version: glusterfs-3.4.0
Doc Type: Bug Fix
Last Closed: 2013-07-24 17:37:24 UTC

Description Harshavardhana 2010-11-16 18:09:10 UTC
Steps to reproduce:

Two server nodes; "gluster volume info" shows:

Volume Name: repl
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.1.10.112:/sdb
Brick2: 10.1.10.113:/sdc


One of the nodes (10.1.10.112) goes down, and we reconfigure ping-timeout to 5 on 10.1.10.113 using the command below:

gluster volume set repl network.ping-timeout 5


Once the node 10.1.10.112 comes back up, its peer status is "Rejected" because the volume configuration no longer matches between the two nodes.

What we do next, on 10.1.10.112, is stop the volume and delete the contents, then initiate a volume sync.

NOTE: this volume stop and delete does not take effect on 10.1.10.113, since both peers are in the rejected state.

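But the interesting part is what happens when you run "gluster volume sync 10.1.10.113 repl" from 10.1.10.112.

It syncs even the current volume state from 10.1.10.113, where the volume is actually started, but it fails to start the volume on its own side: the volume info below reports Started, yet ps shows only glusterd running.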

[root@compel2 ~]# gluster volume sync 10.1.10.113
volume sync: successful
[root@compel2 ~]# gluster volume info

Volume Name: repl
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.1.10.112:/sdb
Brick2: 10.1.10.113:/sdc
Options Reconfigured:
network.ping-timeout: 5


[root@compel2 ~]# ps -ef | grep gluster
root      3713     1  0 10:01 ?        00:00:00 /usr/sbin/glusterd
root      3989  3740  0 10:08 pts/0    00:00:00 grep gluster
[root@compel2 ~]#
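
The mismatch shown above (volume reported as Started while only glusterd is running) can be checked mechanically. The following is a minimal sketch, not part of the gluster CLI: the function names are hypothetical, and it assumes that brick daemons show up in the process list under the name "glusterfsd", which this report does not show.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; none of these names exist in gluster.

# Print the Status value of the named volume, reading
# `gluster volume info` text from stdin.
volume_status() {
    awk -v vol="$1" '
        /^Volume Name:/ { current = $3 }
        /^Status:/ && current == vol { print $2; exit }
    '
}

# Succeed if the `ps -ef` text on stdin lists a brick process.
# Assumption: brick daemons run as "glusterfsd".
# The [g] keeps a live grep from matching its own command line.
has_brick_process() {
    grep -q '[g]lusterfsd'
}

# Compare the reported volume state with the running processes.
# Arguments: volume name, `gluster volume info` text, `ps -ef` text.
check_volume() {
    vol=$1
    info=$2
    procs=$3
    status=$(printf '%s\n' "$info" | volume_status "$vol")
    if [ "$status" = "Started" ] && ! printf '%s\n' "$procs" | has_brick_process; then
        echo "MISMATCH: $vol is Started but no brick process is running"
        return 1
    fi
    echo "OK: $vol status=${status:-unknown}"
}

# Intended use on the affected node:
#   check_volume repl "$(gluster volume info)" "$(ps -ef)"
```

Fed the volume info and ps output from this report, check_volume would take the MISMATCH branch and return non-zero, since the status is Started and no glusterfsd line is present.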

This puts me into jinx!!

Comment 1 Amar Tumballi 2011-03-03 04:20:54 UTC
Taking up this task after the 3.1.3 release, so the fix should most likely land in 3.2.x (or in a release after 3.1.3).

Comment 2 Pranith Kumar K 2011-03-17 23:00:26 UTC
We found that the glusterd store version was not being updated for some operations, such as set, reset, start, and stop. We fixed this while remodeling the glusterd store for the fix to bug 763486. With that fix, this problem should occur only when an operation is performed on a volume with the same name on *BOTH* peer1 and peer2 while they are not connected to each other.

Comment 3 Amar Tumballi 2012-11-29 10:12:44 UTC
http://review.gluster.org/4188