Bug 999430

Summary: [RFE] : AFR : Provide information about the type of split-brain
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: spandura
Component: replicate
Assignee: Anuradha <atalur>
Status: CLOSED CURRENTRELEASE
QA Contact: storage-qa-internal <storage-qa-internal>
Severity: medium
Priority: medium
Version: 2.1
CC: nsathyan, pkarampu, rhs-bugs, smohan, storage-qa-internal, vbellur
Target Milestone: ---
Keywords: FutureFeature
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Enhancement
Story Points: ---
Last Closed: 2015-08-31 10:28:26 UTC
Type: Bug

Description spandura 2013-08-21 10:17:11 UTC
Description of problem:
=======================
When a directory or file was in split-brain, we used to log the type of split-brain it was in.

For example, we previously printed the cause of the I/O error, i.e. "filetype differs" or "gfid differs":

[2013-08-20 07:40:23.439787] D [afr-common.c:1505:afr_conflicting_iattrs] 0-vol_dis_rep-replicate-0: /a: filetype differs on subvolumes (1, 0)

[2013-08-20 07:40:41.776612] D [afr-common.c:1513:afr_conflicting_iattrs] 0-vol_dis_rep-replicate-1: /testdir/subdir1: gfid differs on subvolume 

After the fix for bug 920870, this information is no longer logged at the WARNING log level. It has been moved to the DEBUG log level.
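
As an illustration of the information being requested, the following is a minimal sketch that extracts the split-brain type and path from afr_conflicting_iattrs log lines. The function name classify_split_brain is hypothetical, and the pattern is written only against the two sample lines quoted above; other message formats are not covered.

```shell
#!/bin/sh
# Hypothetical helper: read client log lines on stdin and print
# "<type> <path>" for each afr_conflicting_iattrs message.
classify_split_brain() {
    sed -nE 's/^.*afr_conflicting_iattrs\] [^ ]+: (.+): (filetype|gfid) differs on subvolume.*$/\2 \1/p'
}

# The two sample lines from this report.
sample='[2013-08-20 07:40:23.439787] D [afr-common.c:1505:afr_conflicting_iattrs] 0-vol_dis_rep-replicate-0: /a: filetype differs on subvolumes (1, 0)
[2013-08-20 07:40:41.776612] D [afr-common.c:1513:afr_conflicting_iattrs] 0-vol_dis_rep-replicate-1: /testdir/subdir1: gfid differs on subvolume'

result=$(printf '%s\n' "$sample" | classify_split_brain)
echo "$result"
```

With the client log level at DEBUG, the same function could be fed the mount's log file instead of the sample text.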

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.4.0.21rhs built on Aug 20 2013 12:09:43

How reproducible:
==============
Often

Steps to Reproduce:
===================
1. Create a replicate volume (1 x 2). Set "self-heal-daemon" to "off" and "nfs.disable" to "on".

2. Start the volume.

3. Create a FUSE mount. Create a directory "testdir".

4. Bring down brick-1.

5. From mount point execute:
touch a 
dd if=/dev/urandom of=testdir/file1 bs=1M count=1
mkdir testdir/subdir1
dd if=/dev/urandom of=testdir/subdir1/file1 bs=1M count=1

6. Bring back brick-1. Bring down brick-0.

7. From mount point execute:
mkdir a 
dd if=/dev/urandom of=testdir/file1 bs=1M count=1
mkdir testdir/subdir1
dd if=/dev/urandom of=testdir/subdir1/file1 bs=1M count=1

8. Bring back brick-0

9. From mount point execute:
find . | xargs stat   (or: ls -lR)
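
Steps 1 to 3 can be sketched with the gluster CLI roughly as follows. The volume name, server names, brick path, and mount point are all hypothetical placeholders, not values from this report. By default the sketch only prints the commands; set DRY_RUN=0 to execute them on a real test setup.

```shell
#!/bin/sh
# Hypothetical names: adjust to the actual test setup.
VOL=testvol
SERVER1=server1
SERVER2=server2
BRICK_DIR=/bricks/b1
MNT=/mnt/testvol
DRY_RUN="${DRY_RUN:-1}"   # default: print the commands, do not run them

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

setup_volume() {
    # Step 1: 1 x 2 replicate volume, self-heal daemon off, NFS disabled.
    run gluster volume create "$VOL" replica 2 "$SERVER1:$BRICK_DIR" "$SERVER2:$BRICK_DIR"
    run gluster volume set "$VOL" cluster.self-heal-daemon off
    run gluster volume set "$VOL" nfs.disable on
    # Step 2: start the volume.
    run gluster volume start "$VOL"
    # Step 3: FUSE mount and the test directory.
    run mount -t glusterfs "$SERVER1:/$VOL" "$MNT"
    run mkdir "$MNT/testdir"
}

plan=$(setup_volume)
echo "$plan"
```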

Actual results:
=================
[2013-08-20 07:37:46.367814] E [afr-self-heal-common.c:1456:afr_sh_common_lookup_cbk] 0-vol_dis_rep-replicate-1: Conflicting entries for /testdir/subdir1
[2013-08-20 07:37:46.367962] E [afr-self-heal-common.c:1456:afr_sh_common_lookup_cbk] 0-vol_dis_rep-replicate-0: Conflicting entries for /testdir/file1
[2013-08-20 07:40:41.749653] E [afr-self-heal-common.c:1456:afr_sh_common_lookup_cbk] 0-vol_dis_rep-replicate-0: Conflicting entries for /a
[2013-08-20 07:40:41.789540] E [afr-self-heal-common.c:2744:afr_log_self_heal_completion_status] 0-vol_dis_rep-replicate-1:  gfid or missing entry self heal  failed, on /testdir/subdir1
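
The conflicting entry for /a follows from the steps above: step 5 created "a" as a regular file while brick-1 was down, and step 7 created "a" as a directory while brick-0 was down. The sketch below reproduces only that end state using two plain temporary directories as stand-ins for the bricks. It is a local illustration under that assumption, not how AFR itself compares entries, and the entry_type helper is hypothetical.

```shell
#!/bin/sh
# Two plain directories stand in for the bricks (not real Gluster bricks).
work=$(mktemp -d)
mkdir "$work/brick0" "$work/brick1"

touch "$work/brick0/a"   # step 5: only brick-0 was up, "a" is a regular file
mkdir "$work/brick1/a"   # step 7: only brick-1 was up, "a" is a directory

# Hypothetical helper: report the kind of entry at a path.
entry_type() {
    if [ -d "$1" ]; then
        echo directory
    elif [ -f "$1" ]; then
        echo file
    else
        echo other
    fi
}

t0=$(entry_type "$work/brick0/a")
t1=$(entry_type "$work/brick1/a")

if [ "$t0" != "$t1" ]; then
    verdict="filetype differs"
else
    verdict="filetype matches"
fi
echo "/a: $verdict (brick0=$t0, brick1=$t1)"

rm -rf "$work"
```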


10. Set the client-log-level to DEBUG.

11. From mount point execute:
find . | xargs stat

Expected results:
===================
[2013-08-20 07:40:23.439787] D [afr-common.c:1505:afr_conflicting_iattrs] 0-vol_dis_rep-replicate-0: /a: filetype differs on subvolumes (1, 0)

[2013-08-20 07:40:41.776612] D [afr-common.c:1513:afr_conflicting_iattrs] 0-vol_dis_rep-replicate-1: /testdir/subdir1: gfid differs on subvolume