Bug 857270

Summary: Healed file still showing up as split-brain
Product: [Community] GlusterFS
Component: replicate
Version: 3.3.0
Status: CLOSED DUPLICATE
Severity: unspecified
Priority: unspecified
Reporter: Joe Julian <joe>
Assignee: Pranith Kumar K <pkarampu>
CC: gluster-bugs
Hardware: Unspecified
OS: Unspecified
Type: Bug
Doc Type: Bug Fix
Last Closed: 2012-09-14 07:31:43 UTC
Attachments: Dump of the process that claims split-brain. (flags: none)

Description Joe Julian 2012-09-14 00:24:47 UTC
Created attachment 612689
Dump of the process that claims split-brain.

Description of problem:
I had a file that was reporting split-brain. I deleted the file and its .glusterfs counterpart from two of the three replica servers. After a lookup() was performed on the file, it self-healed. Hash sums and extended attributes confirm that the file is clean.

The client is still reporting a split-brain condition even though the file is healed.

I mounted the volume on a second directory and could read the file through that mount.

[2012-09-13 16:45:49.280447] W [afr-common.c:1226:afr_detect_self_heal_by_lookup_status] 2-home-replicate-3: split brain detected during lookup of /ROBING/.thunderbird/393yixum.default/training.dat.
[2012-09-13 16:45:49.280524] I [afr-common.c:1340:afr_launch_self_heal] 2-home-replicate-3: background  data gfid self-heal triggered. path: /ROBING/.thunderbird/393yixum.default/training.dat, reason: lookup detected pending operations
[2012-09-13 16:45:49.281544] I [afr-self-heal-common.c:1189:afr_sh_missing_entry_call_impunge_recreate] 2-home-replicate-3: no missing files - /ROBING/.thunderbird/393yixum.default/training.dat. proceeding to metadata check
[2012-09-13 16:45:49.281904] I [afr-self-heal-common.c:994:afr_sh_missing_entries_done] 2-home-replicate-3: split brain found, aborting selfheal of /ROBING/.thunderbird/393yixum.default/training.dat
[2012-09-13 16:45:49.281931] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 2-home-replicate-3: background  data gfid self-heal failed on /ROBING/.thunderbird/393yixum.default/training.dat
[2012-09-13 16:45:49.282098] W [afr-open.c:213:afr_open] 2-home-replicate-3: failed to open as split brain seen, returning EIO
[2012-09-13 16:45:49.282159] W [fuse-bridge.c:713:fuse_fd_cbk] 0-glusterfs-fuse: 972877: OPEN() /ROBING/.thunderbird/393yixum.default/training.dat => -1 (Input/output error)
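
When triaging a longer client log, it can help to pull out only the split-brain messages. The following is a minimal Python sketch that assumes every line follows the layout seen above (`[timestamp] LEVEL [file:line:function] xlator: message`). The names `parse_log_line` and `split_brain_events` are hypothetical helpers for illustration, not part of GlusterFS.

```python
import re

# Layout assumed from the log excerpt above:
# [timestamp] LEVEL [source.c:line:function] xlator-name: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\]\s+"
    r"(?P<xlator>\S+):\s+(?P<msg>.*)$"
)


def parse_log_line(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["line"] = int(record["line"])
    return record


def split_brain_events(lines):
    """Return the parsed records whose message mentions split brain."""
    events = []
    for line in lines:
        record = parse_log_line(line)
        if record and re.search(r"split[ -]brain", record["msg"], re.IGNORECASE):
            events.append(record)
    return events
```

Applied to the seven lines above, this would keep the `afr_detect_self_heal_by_lookup_status`, `afr_sh_missing_entries_done`, and `afr_open` messages and drop the rest.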

[root@ewcs2 ~]# getfattr -m . -d -e hex /var/spool/glusterfs/d_home/ROBING/.thunderbird/393yixum.default/training.dat
getfattr: Removing leading '/' from absolute path names
# file: var/spool/glusterfs/d_home/ROBING/.thunderbird/393yixum.default/training.dat
trusted.afr.home-client-10=0x000000000000000000000000
trusted.afr.home-client-11=0x000000000000000000000000
trusted.afr.home-client-9=0x000000000000000000000000
trusted.gfid=0xfd593e58555b42689bea73208d083ce7
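
For readers checking the same thing on their own bricks: as far as I understand the AFR changelog layout, each `trusted.afr.*` value is 12 bytes holding three big-endian 32-bit counters of pending data, metadata, and entry operations, so an all-zero value means that brick holds nothing pending against that client. A minimal Python sketch of that decoding follows; `decode_afr_changelog` and `is_clean` are hypothetical names used only for illustration.

```python
import struct


def decode_afr_changelog(hex_value):
    """Decode a trusted.afr.* value as printed by `getfattr -e hex`.

    Assumes the 12-byte layout of three big-endian 32-bit counters:
    pending data, metadata, and entry operations.
    """
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(text)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    data, metadata, entry = struct.unpack(">III", raw)
    return {"data": data, "metadata": metadata, "entry": entry}


def is_clean(xattrs):
    """True if every trusted.afr.* attribute in the dict has all-zero counters."""
    return all(
        not any(decode_afr_changelog(value).values())
        for name, value in xattrs.items()
        if name.startswith("trusted.afr.")
    )


# Values copied from the getfattr output above.
xattrs = {
    "trusted.afr.home-client-10": "0x000000000000000000000000",
    "trusted.afr.home-client-11": "0x000000000000000000000000",
    "trusted.afr.home-client-9": "0x000000000000000000000000",
    "trusted.gfid": "0xfd593e58555b42689bea73208d083ce7",
}
print(is_clean(xattrs))
```

With the values from this report, every counter decodes to zero, which matches the claim that the bricks no longer record any pending operations.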

[root@ewcs2 ~]# getfattr -m . -d -e hex /var/spool/glusterfs/d_home/.glusterfs/fd/59/*3ce7
getfattr: Removing leading '/' from absolute path names
# file: var/spool/glusterfs/d_home/.glusterfs/fd/59/fd593e58-555b-4268-9bea-73208d083ce7
trusted.afr.home-client-10=0x000000000000000000000000
trusted.afr.home-client-11=0x000000000000000000000000
trusted.afr.home-client-9=0x000000000000000000000000
trusted.gfid=0xfd593e58555b42689bea73208d083ce7
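
The `.glusterfs` counterpart queried above can be located from the `trusted.gfid` value rather than with a glob. Judging from the paths in this report, the gfid is formatted as a UUID and stored under two directory levels taken from its first four hex digits. A minimal Python sketch of that mapping follows; `gfid_backend_path` is a hypothetical helper name, and the brick root is the one used in this report.

```python
import uuid
from pathlib import PurePosixPath


def gfid_backend_path(brick_root, gfid_hex):
    """Build the .glusterfs path for a gfid as printed by `getfattr -e hex`."""
    text = gfid_hex[2:] if gfid_hex.lower().startswith("0x") else gfid_hex
    gfid = str(uuid.UUID(hex=text))  # canonical 8-4-4-4-12 form
    return PurePosixPath(brick_root) / ".glusterfs" / gfid[0:2] / gfid[2:4] / gfid


path = gfid_backend_path(
    "/var/spool/glusterfs/d_home",
    "0xfd593e58555b42689bea73208d083ce7",
)
print(path)
```

For the gfid in this report, the result is the same path that the `*3ce7` glob resolved to.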

The other two servers produce identical results.

Comment 1 Joe Julian 2012-09-14 07:31:43 UTC

*** This bug has been marked as a duplicate of bug 832305 ***