Description of problem:
This is the sample directory on which, even when there is no split-brain, it is shown as split-brain.

root@localhost - ~ 22:25:02 :( ⚡ getfattr -d -m. -e hex /home/gfs/r2_?/a
getfattr: Removing leading '/' from absolute path names
# file: home/gfs/r2_0/a
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.gfid=0x0c450469ba184449b5808625245e2e8a

# file: home/gfs/r2_1/a
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000010000000100000000
trusted.gfid=0x0c450469ba184449b5808625245e2e8a

root@localhost - ~ 22:25:04 :) ⚡ getfattr -d -m. -e hex /home/gfs/r2_?
getfattr: Removing leading '/' from absolute path names
# file: home/gfs/r2_0
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000000000000000000001
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x83b0576aa0924f71aa108c5a54bb793a

# file: home/gfs/r2_1
security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a66696c655f743a733000
trusted.afr.r2-client-0=0x000000000000000000000001
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x83b0576aa0924f71aa108c5a54bb793a

root@localhost - ~ 22:25:09 :) ⚡ gluster volume heal r2 info
Brick localhost.localdomain:/home/gfs/r2_0
/ - Is in split-brain
Status: Connected
Number of entries: 1

Brick localhost.localdomain:/home/gfs/r2_1
/a
/ - Is in split-brain
Status: Connected
Number of entries: 2

root@localhost - ~ 22:25:19 :) ⚡ gluster volume set r2 entry-self-heal on
volume set: success

root@localhost - ~ 22:26:40 :) ⚡ gluster volume heal r2 enable
Enable heal on volume r2 has been successful

root@localhost - ~ 22:26:45 :) ⚡ gluster volume heal r2
Launching heal operation to perform index self heal on volume r2 has been successful
Use heal info commands to check status

root@localhost - ~ 22:26:49 :) ⚡ gluster volume heal r2 info
Brick localhost.localdomain:/home/gfs/r2_0
Status: Connected
Number of entries: 0

Brick localhost.localdomain:/home/gfs/r2_1
Status: Connected
Number of entries: 0

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
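For reference, here is a minimal sketch (not part of the original report) of how the AFR changelog xattrs shown above can be decoded: the 12-byte value of a trusted.afr.&lt;volume&gt;-client-&lt;n&gt; xattr is three big-endian 32-bit counters holding the data, metadata and entry pending counts. The helper name decode_afr_xattr is illustrative only.

# Decode an AFR changelog xattr into its three pending counters (bash).
decode_afr_xattr() {
    local hex=${1#0x}                      # strip the leading "0x"
    printf 'data=%d metadata=%d entry=%d\n' \
        "0x${hex:0:8}" "0x${hex:8:8}" "0x${hex:16:8}"
}

# Example values taken from the getfattr output above:
decode_afr_xattr 0x000000010000000100000000   # /a on r2_1 -> data=1 metadata=1 entry=0
decode_afr_xattr 0x000000000000000000000001   # /  on both bricks -> data=0 metadata=0 entry=1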
QATP:
====
1) Have an AFR (replicate) volume.
2) Mount the volume and create some directories and files in them.
3) Bring down one of the replica bricks.
4) From the mount, change the permissions and ownership of some directories and their files.
5) Disable the self-heal daemon and all the client-side heal options (data, metadata, entry), so as to avoid client-side healing.
6) Bring the brick back online.
7) Keep monitoring "heal info" and "heal info split-brain" in a loop until the test case is complete.
8) Enable heal and start a heal of the volume.
9) Check the healing: the heal must complete, and the new file permissions and ownership must be updated on the sink brick, which can be verified on the backend brick.

Also, no split-brain errors must be seen and all heals must pass successfully. A rough shell sketch of these steps follows below.
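A rough, non-authoritative shell sketch of the QATP steps above, assuming a 1x2 replicate volume named r2 mounted at /mnt/r2 with bricks under /home/gfs (as in this report); the directory name, user/group names and the kill-based way of taking a brick down are illustrative only.

# 1)-2) volume already mounted at /mnt/r2; create some test dirs and files
mkdir -p /mnt/r2/dir1
touch /mnt/r2/dir1/file{1..5}

# 3) bring down one replica brick (here: kill its brick process; picking the
#    PID out of "volume status" like this is only an illustration)
kill -9 "$(gluster volume status r2 | awk '/r2_1/ {print $NF}')"

# 4) from the mount, change permissions and ownership while the brick is down
chmod -R 750 /mnt/r2/dir1
chown -R testuser:testgroup /mnt/r2/dir1

# 5) disable the self-heal daemon and all client-side heal options
gluster volume set r2 cluster.self-heal-daemon off
gluster volume set r2 cluster.data-self-heal off
gluster volume set r2 cluster.metadata-self-heal off
gluster volume set r2 cluster.entry-self-heal off

# 6) bring the brick back online
gluster volume start r2 force

# 7) monitor heal state in a loop while the test runs
gluster volume heal r2 info
gluster volume heal r2 info split-brain

# 8) re-enable healing and trigger a heal
gluster volume heal r2 enable
gluster volume heal r2

# 9) verify: heal info must drop to zero entries, no split-brain may be
#    reported, and the new permissions/ownership must show up on the
#    previously-down (sink) brick
gluster volume heal r2 info
stat /home/gfs/r2_1/dir1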
Validation:
=========
The above case was automated and run both manually and through automation; it passed in both runs, hence moving to VERIFIED.

[root@dhcp35-191 ~]# rpm -qa|grep gluste
glusterfs-cli-3.7.9-6.el7rhgs.x86_64
glusterfs-libs-3.7.9-6.el7rhgs.x86_64
glusterfs-fuse-3.7.9-6.el7rhgs.x86_64
glusterfs-client-xlators-3.7.9-6.el7rhgs.x86_64
glusterfs-server-3.7.9-6.el7rhgs.x86_64
python-gluster-3.7.9-5.el7rhgs.noarch
glusterfs-3.7.9-6.el7rhgs.x86_64
glusterfs-api-3.7.9-6.el7rhgs.x86_64
When directory operations failed with errors other than the brick being offline, the parent directories containing the failed entries were shown as being in a split-brain state even when they were not. This has been corrected so that the state is now reported correctly in this situation.
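As a quick, hedged illustration of the distinction (not part of the doc text above): a directory flagged by plain "heal info" is only in a genuine entry split-brain when each brick's trusted.afr xattr carries a non-zero pending count blaming the other brick. In the getfattr output from the description, both bricks only carry counts against r2-client-0, which is consistent with the false positive this bug describes. The paths below are the ones used in this report.

# Compare what "heal info split-brain" reports with the raw AFR xattrs on both bricks.
gluster volume heal r2 info split-brain
getfattr -d -m trusted.afr -e hex /home/gfs/r2_0
getfattr -d -m trusted.afr -e hex /home/gfs/r2_1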
Looks good to me.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1240