Created attachment 612689 [details]
Dump of the process that claims split-brain.

Description of problem:

I had a file that was reporting split-brain. I deleted the file and the .glusterfs counterpart from two of the replica 3 servers. After a lookup() was performed on the file, it self-healed. Hash sums and extended attributes confirm that the file is clean.

The client is still reporting a split-brain condition even though the file is healed. I mounted the volume on a second directory and could read the file through that mount.

[2012-09-13 16:45:49.280447] W [afr-common.c:1226:afr_detect_self_heal_by_lookup_status] 2-home-replicate-3: split brain detected during lookup of /ROBING/.thunderbird/393yixum.default/training.dat.
[2012-09-13 16:45:49.280524] I [afr-common.c:1340:afr_launch_self_heal] 2-home-replicate-3: background data gfid self-heal triggered. path: /ROBING/.thunderbird/393yixum.default/training.dat, reason: lookup detected pending operations
[2012-09-13 16:45:49.281544] I [afr-self-heal-common.c:1189:afr_sh_missing_entry_call_impunge_recreate] 2-home-replicate-3: no missing files - /ROBING/.thunderbird/393yixum.default/training.dat. proceeding to metadata check
[2012-09-13 16:45:49.281904] I [afr-self-heal-common.c:994:afr_sh_missing_entries_done] 2-home-replicate-3: split brain found, aborting selfheal of /ROBING/.thunderbird/393yixum.default/training.dat
[2012-09-13 16:45:49.281931] E [afr-self-heal-common.c:2156:afr_self_heal_completion_cbk] 2-home-replicate-3: background data gfid self-heal failed on /ROBING/.thunderbird/393yixum.default/training.dat
[2012-09-13 16:45:49.282098] W [afr-open.c:213:afr_open] 2-home-replicate-3: failed to open as split brain seen, returning EIO
[2012-09-13 16:45:49.282159] W [fuse-bridge.c:713:fuse_fd_cbk] 0-glusterfs-fuse: 972877: OPEN() /ROBING/.thunderbird/393yixum.default/training.dat => -1 (Input/output error)

[root@ewcs2 ~]# getfattr -m . -d -e hex /var/spool/glusterfs/d_home/ROBING/.thunderbird/393yixum.default/training.dat
getfattr: Removing leading '/' from absolute path names
# file: var/spool/glusterfs/d_home/ROBING/.thunderbird/393yixum.default/training.dat
trusted.afr.home-client-10=0x000000000000000000000000
trusted.afr.home-client-11=0x000000000000000000000000
trusted.afr.home-client-9=0x000000000000000000000000
trusted.gfid=0xfd593e58555b42689bea73208d083ce7

[root@ewcs2 ~]# getfattr -m . -d -e hex /var/spool/glusterfs/d_home/.glusterfs/fd/59/*3ce7
getfattr: Removing leading '/' from absolute path names
# file: var/spool/glusterfs/d_home/.glusterfs/fd/59/fd593e58-555b-4268-9bea-73208d083ce7
trusted.afr.home-client-10=0x000000000000000000000000
trusted.afr.home-client-11=0x000000000000000000000000
trusted.afr.home-client-9=0x000000000000000000000000
trusted.gfid=0xfd593e58555b42689bea73208d083ce7

The other two servers produce identical results.
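
For anyone wanting to double-check the getfattr output above by script, here is a minimal Python sketch. It rests on two assumptions: each trusted.afr.* value is three big-endian 32-bit pending counters (data, metadata, entry), and the .glusterfs path is built from the first two byte pairs of the gfid, as the second getfattr command suggests. The function names are hypothetical helpers, not part of GlusterFS.

```python
import struct
import uuid


def decode_afr_changelog(hex_value):
    """Split a trusted.afr.* value into (data, metadata, entry) counters.

    Assumption: the value is 12 bytes, three big-endian unsigned 32-bit ints.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 12:
        raise ValueError("expected 12 bytes, got %d" % len(raw))
    return struct.unpack(">III", raw)


def gfid_backend_path(hex_gfid):
    """Build the .glusterfs path (relative to the brick root) for a gfid."""
    digits = hex_gfid[2:] if hex_gfid.lower().startswith("0x") else hex_gfid
    gfid = str(uuid.UUID(hex=digits))
    return ".glusterfs/%s/%s/%s" % (gfid[:2], gfid[2:4], gfid)


def is_clean(xattrs):
    """True when every trusted.afr.* changelog has no pending operations."""
    return all(
        decode_afr_changelog(value) == (0, 0, 0)
        for name, value in xattrs.items()
        if name.startswith("trusted.afr.")
    )


# Values copied from the getfattr output in this report.
xattrs = {
    "trusted.afr.home-client-10": "0x000000000000000000000000",
    "trusted.afr.home-client-11": "0x000000000000000000000000",
    "trusted.afr.home-client-9": "0x000000000000000000000000",
    "trusted.gfid": "0xfd593e58555b42689bea73208d083ce7",
}

print(is_clean(xattrs))
print(gfid_backend_path(xattrs["trusted.gfid"]))
```

With the values from this report, every changelog decodes to all-zero counters and the derived path matches the .glusterfs entry queried above. This is consistent with the bricks being clean and the stale split-brain state living in the client process.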
*** This bug has been marked as a duplicate of bug 832305 ***