Bug 765268 (GLUSTER-3536)

Summary: split brain for file copy for three way replication
Product: [Community] GlusterFS
Reporter: Saurabh <saurabh>
Component: replicate
Assignee: Kaushal <kaushal>
Status: CLOSED WORKSFORME
Severity: medium
Priority: medium
Version: 3.2.3
CC: gluster-bugs, vijay
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix

Description Saurabh 2011-09-12 12:40:00 UTC
[root@Centos1 ~]# /opt/glusterfs/3.2.3/sbin/gluster volume info dist-3rep

Volume Name: dist-3rep
Type: Distributed-Replicate
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 10.1.12.134:/export/d-3rep
Brick2: 10.1.12.135:/export/d-3rep-rep
Brick3: 10.1.12.135:/export/d-3rep-rep-rep
Brick4: 10.1.12.134:/export/d-d-3rep
Brick5: 10.1.12.135:/export/d-d-3rep-rep
Brick6: 10.1.12.135:/export/d-d-3rep-rep-rep
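
The "2 x 3 = 6" line means two replica sets of three bricks each, and GlusterFS forms replica sets from bricks in the order they are listed. A minimal sketch of that grouping follows; the `replica_sets` helper is hypothetical and not part of the gluster CLI. It shows that each replica set has one brick on 10.1.12.134 and two on 10.1.12.135.

```shell
#!/bin/sh
# Group an ordered brick list into replica sets of the given size.
# Hypothetical helper for illustration; not a gluster command.
# The body runs in a subshell so its variables stay local.
replica_sets() (
    replica=$1; shift
    i=0; n=1; line=""
    for brick in "$@"; do
        line="${line:+$line }$brick"
        i=$((i + 1))
        # Every $replica consecutive bricks close one replica set.
        if [ $((i % replica)) -eq 0 ]; then
            echo "replica set $n: $line"
            n=$((n + 1))
            line=""
        fi
    done
)

replica_sets 3 \
    10.1.12.134:/export/d-3rep \
    10.1.12.135:/export/d-3rep-rep \
    10.1.12.135:/export/d-3rep-rep-rep \
    10.1.12.134:/export/d-d-3rep \
    10.1.12.135:/export/d-d-3rep-rep \
    10.1.12.135:/export/d-d-3rep-rep-rep
```

Under this layout, a file written while 10.1.12.135 is down exists on only one brick of its replica set, the one on 10.1.12.134.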


[root@Centos3 nfs-test]#
[root@Centos3 nfs-test]# ls
[root@Centos3 nfs-test]# cat >> a.log
kkkkkk
[root@Centos3 nfs-test]# cat a.log
kkkkkk


Bring down the bricks on node 10.1.12.135:



[root@Centos3 nfs-test]# cp a.log b.log
[root@Centos3 nfs-test]# ls -l
total 16
-rw-r--r-- 1 root root 7 Sep 12 08:12 a.log
-rw-r--r-- 1 root root 7 Sep 12 08:13 b.log

Kill the glusterfsd processes on node 10.1.12.134:

[root@Centos3 nfs-test]# ls
ls: .: Input/output error

Bring the bricks on node 10.1.12.135 back up:
[root@Centos3 nfs-test]# ls -li
total 8
16635320244810077074 -rw-r--r-- 1 root root 7 Sep 12 08:12 a.log

[root@Centos3 nfs-test]# cp a.log abc.log
[root@Centos3 nfs-test]# ls -l
total 16
-rw-r--r-- 1 root root 7 Sep 12 08:14 abc.log
-rw-r--r-- 1 root root 7 Sep 12 08:12 a.log

Now bring the glusterfsd processes on 10.1.12.134 back up as well, so that all nodes should be working properly. b.log is still not listed:

[root@Centos3 nfs-test]# ls -lR
.:
total 16
-rw-r--r-- 1 root root 7 Sep 12 08:14 abc.log
-rw-r--r-- 1 root root 7 Sep 12 08:12 a.log
[root@Centos3 nfs-test]# ls -lR
.:
total 16
-rw-r--r-- 1 root root 7 Sep 12 08:14 abc.log
-rw-r--r-- 1 root root 7 Sep 12 08:12 a.log
[root@Centos3 nfs-test]# 

[root@Centos3 nfs-test]# mount | grep nfs
10.1.12.134:/dist-3rep on /mnt/nfs-test type nfs (rw,nfsvers=3,nolock,addr=10.1.12.134)
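
The steps above can be sketched as a script. This is a dry-run outline under stated assumptions, not the reporter's actual procedure: the report does not say how the bricks were stopped or restarted, so `pkill glusterfsd` over ssh and `service glusterd restart` (assumed to respawn that node's brick processes) are guesses, and the `run`, `stop_bricks`, `start_bricks` and `reproduce` helpers are hypothetical. With `DRY_RUN=1` (the default) it only prints the commands.

```shell
#!/bin/sh
# Dry-run outline of the reproduction; prints commands unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}
MNT=/mnt/nfs-test
N1=10.1.12.134
N2=10.1.12.135

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Assumption: bricks are stopped by killing glusterfsd on the node.
stop_bricks()  { run ssh "root@$1" pkill glusterfsd; }
# Assumption: restarting glusterd respawns that node's brick processes.
start_bricks() { run ssh "root@$1" service glusterd restart; }

reproduce() {
    run sh -c "echo kkkkkk >> $MNT/a.log"
    stop_bricks "$N2"
    run cp "$MNT/a.log" "$MNT/b.log"     # only bricks on $N1 are up here
    stop_bricks "$N1"
    start_bricks "$N2"
    run cp "$MNT/a.log" "$MNT/abc.log"   # only bricks on $N2 are up here
    start_bricks "$N1"
    run ls -lR "$MNT"                    # a.log, abc.log and b.log should all exist
}

reproduce
```

Keeping the default dry-run mode makes it possible to review the sequence before running anything against real nodes.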

Comment 1 Kaushal 2011-10-11 06:44:59 UTC
I followed the steps as described and saw the same results at every step except the last one: in the end, 'ls' displayed all 3 files that should have been present.

Most likely the problem here is that the bricks on *.134 went down again after they were brought up, which led to the file 'b.log' not being shown.
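
One way to check that hypothesis is to confirm, on 10.1.12.134, that a glusterfsd process is running for each of its bricks before the final 'ls'. A rough sketch, assuming each brick's export path appears as a separate word in its glusterfsd command line; the `missing_bricks` helper is hypothetical and reads a process listing on stdin:

```shell
#!/bin/sh
# Print every expected brick path that has no glusterfsd process in the
# process listing supplied on stdin.
# Hypothetical helper; assumes the brick path appears in the command line.
missing_bricks() (
    # '[g]lusterfsd' keeps this grep from matching its own command line.
    # sed appends a space to each line so a path at end of line still matches.
    procs=$(grep '[g]lusterfsd' | sed 's/$/ /')
    for brick in "$@"; do
        # The trailing space stops /export/d-3rep matching /export/d-3rep-rep.
        printf '%s\n' "$procs" | grep -qF -- "$brick " || echo "$brick"
    done
)

# Example, run on 10.1.12.134:
#   ps -eo args | missing_bricks /export/d-3rep /export/d-d-3rep
```

Empty output would mean both bricks on that node have a running process; any path printed is a brick that is down.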