Description of problem:
If the network disconnects while a rebalance is in progress, split-brain occurs.

Version-Release number of selected component (if applicable):
3.4.0.59rhs-1.1.toyota.hotfix.el6rhs.x86_64

How reproducible:
Tried once

Steps to Reproduce:
1. Created a 40-brick distributed-replicate volume.
2. Untarred a kernel tarball on the mount point and calculated the arequal checksum.
3. Added a pair of bricks with add-brick.
4. Started rebalance.
5. While migration was in progress, stopped and started the network service on some of the nodes.
6. After the network came back, glusterd was dead on those nodes.
7. Started glusterd.
8. In the meantime, rebalance status on these nodes showed "failed".
9. Once rebalance completed, started rebalance again so that migration also ran from the remaining nodes (which were down during the first run).
10. Calculated the arequal checksum and checked the mount logs.

Actual results:

mount logs
=========
[2014-08-06 17:58:37.483774] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-18: metadata self heal failed, on /
[2014-08-06 17:58:37.609166] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-2: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.609391] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-6: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.609536] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-10: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 2 0 ] ]
[2014-08-06 17:58:37.609669] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-8: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.609781] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-12: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.610013] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-3: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.610274] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-7: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.610757] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-4: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.610905] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-6: metadata self heal failed, on /
[2014-08-06 17:58:37.611015] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-10: metadata self heal failed, on /
[2014-08-06 17:58:37.611147] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-8: metadata self heal failed, on /
[2014-08-06 17:58:37.611216] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-12: metadata self heal failed, on /
[2014-08-06 17:58:37.611324] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-16: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.611440] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-3: metadata self heal failed, on /
[2014-08-06 17:58:37.611562] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-18: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 2 0 ] ]
[2014-08-06 17:58:37.611660] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-7: metadata self heal failed, on /
[2014-08-06 17:58:37.611830] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-5: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.612035] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-2: metadata self heal failed, on /
[2014-08-06 17:58:37.612171] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-16: metadata self heal failed, on /
[2014-08-06 17:58:37.612279] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-18: metadata self heal failed, on /
[2014-08-06 17:58:37.612394] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-13: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 2 0 ] ]
[2014-08-06 17:58:37.612536] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-14: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.612666] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-15: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 2 0 ] ]
[2014-08-06 17:58:37.612774] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-4: metadata self heal failed, on /
[2014-08-06 17:58:37.612871] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-19: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.613019] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-5: metadata self heal failed, on /
[2014-08-06 17:58:37.613127] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-20: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:37.613253] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-13: metadata self heal failed, on /
[2014-08-06 17:58:37.613343] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-14: metadata self heal failed, on /
[2014-08-06 17:58:37.613438] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-15: metadata self heal failed, on /
[2014-08-06 17:58:37.613544] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-19: metadata self heal failed, on /
[2014-08-06 17:58:37.613743] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-20: metadata self heal failed, on /

arequal-checksum mismatch
==========================

BEFORE REBALANCE
----------------
[root@localhost ~]# ./arequal-checksum /shylesh/

Entry counts
Regular files   : 30493
Directories     : 1879
Symbolic links  : 0
Other           : 0
Total           : 32372

Metadata checksums
Regular files   : 478cd9
Directories     : 24d74c
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : f371a4f37e79dc78780c5a6f2284407c
Directories     : 3034929077a144b
Symbolic links  : 0
Other           : 0
Total           : 887eb7b55b87884f

AFTER REBALANCE
---------------
[root@localhost ~]# ./arequal-checksum /shylesh/

Entry counts
Regular files   : 30494
Directories     : 1879
Symbolic links  : 0
Other           : 0
Total           : 32373

Metadata checksums
Regular files   : ee85
Directories     : 24d74c
Symbolic links  : 3e9
Other           : 3e9

Checksums
Regular files   : bbe20a53de1782ef31cb2def2c646fe
Directories     : 36b674264157828
Symbolic links  : 0
Other           : 0
Total           : fbc9f539ab3246f8

In fact, the second run shows an increase in the number of regular files (30493 before, 30494 after).

Uploading the rebalance and mount logs.
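To make the mount log easier to read, the split-brain lines can be reduced to one record per replicate subvolume. The following is only a rough sketch, not a GlusterFS tool: the regular expression is written against the log lines quoted in this report, and the function names are made up for illustration. Reading row i of the pending matrix as the counters that brick i holds against each other brick is an assumption about the AFR log format, not something stated in this report.

```python
import re

# Matches afr_sh_print_split_brain_log lines as quoted in this report.
SPLIT_BRAIN_RE = re.compile(
    r"\] (?P<xl>\S+): Unable to self-heal contents of '(?P<path>[^']*)' "
    r"\(possible split-brain\)\..*?Pending matrix:\s*(?P<matrix>\[.*\])"
)


def parse_matrix(text):
    """Turn '[ [ 0 1 ] [ 1 0 ] ]' into [[0, 1], [1, 0]]."""
    rows = re.findall(r"\[([\d\s]+)\]", text)
    return [[int(n) for n in row.split()] for row in rows]


def split_brain_entries(lines):
    """Yield (translator, path, matrix) for every split-brain log line."""
    for line in lines:
        m = SPLIT_BRAIN_RE.search(line)
        if m:
            yield m.group("xl"), m.group("path"), parse_matrix(m.group("matrix"))


def mutual_blame(matrix):
    """True if some pair of bricks holds non-zero counters against each other.

    Assumption: matrix[i][j] is the pending count brick i records for brick j.
    """
    n = len(matrix)
    return any(
        matrix[i][j] > 0 and matrix[j][i] > 0
        for i in range(n)
        for j in range(i + 1, n)
    )


# Three lines copied from the mount log above; the first one is not a
# split-brain line and is skipped by the parser.
SAMPLE = [
    "[2014-08-06 17:58:37.483774] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-18: metadata self heal failed, on /",
    "[2014-08-06 17:58:37.609166] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-2: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]",
    "[2014-08-06 17:58:37.609536] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-10: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 2 0 ] ]",
]

if __name__ == "__main__":
    for xl, path, matrix in split_brain_entries(SAMPLE):
        print(xl, path, matrix, "mutual blame" if mutual_blame(matrix) else "one-sided")
```

Every matrix quoted in the log above has both off-diagonal counters non-zero, which is the shape this sketch flags as mutual blame.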
Found the duplicate entry:

[root@localhost shylesh]# find linux-2.6.32.63 | grep linux-2.6.32.63/arch/arm/mach-u300/clock.h |xargs ls -l
-rw-rw-r-- 1 root root 1320 Jun 18 18:26 linux-2.6.32.63/arch/arm/mach-u300/clock.h
-rw-rw-r-- 1 root root 1320 Jun 18 18:26 linux-2.6.32.63/arch/arm/mach-u300/clock.h

This file appears twice.

[root@localhost shylesh]# getfattr -n trusted.glusterfs.pathinfo linux-2.6.32.63/arch/arm/mach-u300/clock.h
# file: linux-2.6.32.63/arch/arm/mach-u300/clock.h
trusted.glusterfs.pathinfo="(<DISTRIBUTE:shylesh-dht> (<REPLICATE:shylesh-replicate-2> <POSIX(/brick/shylesh5):localhost.localdomain:/brick/shylesh5/linux-2.6.32.63/arch/arm/mach-u300/clock.h> <POSIX(/brick/shylesh4):localhost.localdomain:/brick/shylesh4/linux-2.6.32.63/arch/arm/mach-u300/clock.h>))"
I could see the file present on more than one subvolume (more than one replica pair in this case):

192.168.12.17
-rw-rw-r-- 2 root root 1320 Jun 18 18:26 /brick/shylesh4/linux-2.6.32.63/arch/arm/mach-u300/clock.h
192.168.12.18
-rw-rw-r-- 2 root root 1320 Jun 18 18:26 /brick/shylesh5/linux-2.6.32.63/arch/arm/mach-u300/clock.h
--
-rw-r--r-- 1 root root 9635840 Jul 31 09:14 rpm.tar
192.168.12.73
-rw-rw-r-- 2 root root 1320 Jun 18 18:26 /brick/shylesh40/linux-2.6.32.63/arch/arm/mach-u300/clock.h
192.168.12.74
-rw-rw-r-- 2 root root 1320 Jun 18 18:26 /brick/shylesh41/linux-2.6.32.63/arch/arm/mach-u300/clock.h

Volume Name: shylesh
Type: Distributed-Replicate
Volume ID: 96c5814f-8da0-4fcd-ad32-07135e3aa527
Status: Started
Number of Bricks: 22 x 2 = 44
Transport-type: tcp
Bricks:
Brick1: 192.168.12.13:/brick/shylesh0
Brick2: 192.168.12.14:/brick/shylesh1
Brick3: 192.168.12.15:/brick/shylesh2
Brick4: 192.168.12.16:/brick/shylesh3
Brick5: 192.168.12.17:/brick/shylesh4
Brick6: 192.168.12.18:/brick/shylesh5
Brick7: 192.168.12.19:/brick/shylesh6
Brick8: 192.168.12.22:/brick/shylesh7
Brick9: 192.168.12.23:/brick/shylesh8
Brick10: 192.168.12.24:/brick/shylesh9
Brick11: 192.168.12.25:/brick/shylesh10
Brick12: 192.168.12.26:/brick/shylesh11
Brick13: 192.168.12.27:/brick/shylesh12
Brick14: 192.168.12.28:/brick/shylesh13
Brick15: 192.168.12.29:/brick/shylesh14
Brick16: 192.168.12.32:/brick/shylesh15
Brick17: 192.168.12.33:/brick/shylesh16
Brick18: 192.168.12.34:/brick/shylesh17
Brick19: 192.168.12.35:/brick/shylesh18
Brick20: 192.168.12.36:/brick/shylesh19
Brick21: 192.168.12.37:/brick/shylesh20
Brick22: 192.168.12.38:/brick/shylesh21
Brick23: 192.168.12.39:/brick/shylesh22
Brick24: 192.168.12.42:/brick/shylesh23
Brick25: 192.168.12.43:/brick/shylesh24
Brick26: 192.168.12.44:/brick/shylesh25
Brick27: 192.168.12.45:/brick/shylesh26
Brick28: 192.168.12.46:/brick/shylesh27
Brick29: 192.168.12.47:/brick/shylesh28
Brick30: 192.168.12.48:/brick/shylesh29
Brick31: 192.168.12.49:/brick/shylesh30
Brick32: 192.168.12.62:/brick/shylesh31
Brick33: 192.168.12.63:/brick/shylesh32
Brick34: 192.168.12.64:/brick/shylesh33
Brick35: 192.168.12.65:/brick/shylesh34
Brick36: 192.168.12.66:/brick/shylesh35
Brick37: 192.168.12.67:/brick/shylesh36
Brick38: 192.168.12.68:/brick/shylesh37
Brick39: 192.168.12.69:/brick/shylesh38
Brick40: 192.168.12.72:/brick/shylesh39
Brick41: 192.168.12.73:/brick/shylesh40
Brick42: 192.168.12.74:/brick/shylesh41
Brick43: 192.168.12.75:/brick/shylesh42
Brick44: 192.168.12.76:/brick/shylesh43
parent xattrs
===============
[root@gqas003 ~]# ssh root.12.17 'getfattr -d -m . -e hex /brick/shylesh4/linux-2.6.32.63/arch/arm/mach-u300'
getfattr: Removing leading '/' from absolute path names
# file: brick/shylesh4/linux-2.6.32.63/arch/arm/mach-u300
trusted.afr.shylesh-client-4=0x000000000000000000000000
trusted.afr.shylesh-client-5=0x000000000000000000000000
trusted.gfid=0x7fe698bdfe44486e841c49030d637733
trusted.glusterfs.dht=0x00000001000000001745d17422e8ba2d

[root@gqas003 ~]# ssh root.12.18 'getfattr -d -m . -e hex /brick/shylesh5/linux-2.6.32.63/arch/arm/mach-u300'
getfattr: Removing leading '/' from absolute path names
# file: brick/shylesh5/linux-2.6.32.63/arch/arm/mach-u300
trusted.afr.shylesh-client-4=0x000000000000000000000000
trusted.afr.shylesh-client-5=0x000000000000000000000000
trusted.gfid=0x7fe698bdfe44486e841c49030d637733
trusted.glusterfs.dht=0x00000001000000001745d17422e8ba2d

[root@gqas003 ~]# ssh root.12.73 'getfattr -d -m . -e hex /brick/shylesh40/linux-2.6.32.63/arch/arm/mach-u300'
# file: brick/shylesh40/linux-2.6.32.63/arch/arm/mach-u300
trusted.gfid=0x7fe698bdfe44486e841c49030d637733
trusted.glusterfs.dht=0x000000010000000022e8ba2e2e8ba2e7
getfattr: Removing leading '/' from absolute path names

[root@gqas003 ~]# ssh root.12.74 'getfattr -d -m . -e hex /brick/shylesh41/linux-2.6.32.63/arch/arm/mach-u300'
# file: brick/shylesh41/linux-2.6.32.63/arch/arm/mach-u300
trusted.gfid=0x7fe698bdfe44486e841c49030d637733
trusted.glusterfs.dht=0x000000010000000022e8ba2e2e8ba2e7
getfattr: Removing leading '/' from absolute path names

xattrs from the file
=======================
[root@gqas003 ~]# ssh root.12.74 'getfattr -d -m . -e hex /brick/shylesh41/linux-2.6.32.63/arch/arm/mach-u300/clock.h'
getfattr: Removing leading '/' from absolute path names
# file: brick/shylesh41/linux-2.6.32.63/arch/arm/mach-u300/clock.h
trusted.afr.shylesh-client-40=0x000000000000000000000000
trusted.afr.shylesh-client-41=0x000000000000000000000000
trusted.gfid=0x4b4afc0e2d524c24a5cfdf583ca5ee0b

[root@gqas003 ~]# ssh root.12.73 'getfattr -d -m . -e hex /brick/shylesh40/linux-2.6.32.63/arch/arm/mach-u300/clock.h'
# file: brick/shylesh40/linux-2.6.32.63/arch/arm/mach-u300/clock.h
trusted.afr.shylesh-client-40=0x000000000000000000000000
trusted.afr.shylesh-client-41=0x000000000000000000000000
trusted.gfid=0x4b4afc0e2d524c24a5cfdf583ca5ee0b

[root@gqas003 ~]# ssh root.12.18 'getfattr -d -m . -e hex /brick/shylesh5/linux-2.6.32.63/arch/arm/mach-u300/clock.h'
getfattr: Removing leading '/' from absolute path names
# file: brick/shylesh5/linux-2.6.32.63/arch/arm/mach-u300/clock.h
trusted.afr.shylesh-client-4=0x000000000000000000000000
trusted.afr.shylesh-client-5=0x000000000000000000000000
trusted.gfid=0x4b4afc0e2d524c24a5cfdf583ca5ee0b

[root@gqas003 ~]# ssh root.12.17 'getfattr -d -m . -e hex /brick/shylesh4/linux-2.6.32.63/arch/arm/mach-u300/clock.h'
getfattr: Removing leading '/' from absolute path names
# file: brick/shylesh4/linux-2.6.32.63/arch/arm/mach-u300/clock.h
trusted.afr.shylesh-client-4=0x000000000000000000000000
trusted.afr.shylesh-client-5=0x000000000000000000000000
trusted.gfid=0x4b4afc0e2d524c24a5cfdf583ca5ee0b
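For anyone reading the hex values above, here is a small sketch that unpacks them. It rests on two assumptions that are not stated anywhere in this report: that trusted.glusterfs.dht is four big-endian 32-bit fields (count, type, hash-range start, hash-range stop), and that each trusted.afr.* value is three big-endian 32-bit pending counters (data, metadata, entry). The function names are hypothetical helpers, not part of any Gluster tooling.

```python
import struct


def _to_bytes(hex_value):
    """Accept the '0x...' form printed by 'getfattr -e hex'."""
    text = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    return bytes.fromhex(text)


def decode_dht_layout(hex_value):
    """Assumed layout: count, type, start, stop as big-endian uint32."""
    count, layout_type, start, stop = struct.unpack(">IIII", _to_bytes(hex_value))
    return {"count": count, "type": layout_type, "start": start, "stop": stop}


def decode_afr_changelog(hex_value):
    """Assumed layout: data, metadata, entry counters as big-endian uint32."""
    data, metadata, entry = struct.unpack(">III", _to_bytes(hex_value))
    return {"data": data, "metadata": metadata, "entry": entry}


if __name__ == "__main__":
    # Values copied from the getfattr output on the parent directory.
    first = decode_dht_layout("0x00000001000000001745d17422e8ba2d")   # shylesh4 / shylesh5
    second = decode_dht_layout("0x000000010000000022e8ba2e2e8ba2e7")  # shylesh40 / shylesh41
    for name, layout in (("shylesh4/5", first), ("shylesh40/41", second)):
        print(f"{name}: {layout['start']:#010x} .. {layout['stop']:#010x}")

    # Value copied from the trusted.afr.* xattrs on clock.h.
    print(decode_afr_changelog("0x000000000000000000000000"))
```

Under that reading, the parent directory on shylesh4/shylesh5 and on shylesh40/shylesh41 carries two adjacent, non-overlapping hash ranges, and every trusted.afr counter shown for clock.h is zero.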
Shylesh,
Do you still have the setup? I see that the following entries have gone into split-brain. Are they all in metadata split-brain? Could you also update the bug with the nodes you have taken down? Could you give more information about which nodes' network interfaces were taken down, in what order they were taken down, and in what order they were brought back up?

 714 1-shylesh-replicate-10: '/'
 587 1-shylesh-replicate-10: '/linux-2.6.32.63'
  19 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation'
   3 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI/testing'
   4 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/DocBook'
   2 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/DocBook/dvb'
   2 1-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/DocBook/v4l'
 714 1-shylesh-replicate-12: '/'
 587 1-shylesh-replicate-12: '/linux-2.6.32.63'
  18 1-shylesh-replicate-12: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-12: '/linux-2.6.32.63/Documentation/ABI'
 714 1-shylesh-replicate-13: '/'
 587 1-shylesh-replicate-13: '/linux-2.6.32.63'
  18 1-shylesh-replicate-13: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI/testing'
 714 1-shylesh-replicate-14: '/'
 587 1-shylesh-replicate-14: '/linux-2.6.32.63'
 714 1-shylesh-replicate-15: '/'
 587 1-shylesh-replicate-15: '/linux-2.6.32.63'
  18 1-shylesh-replicate-15: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI/testing'
   4 1-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/DocBook'
 714 1-shylesh-replicate-16: '/'
 587 1-shylesh-replicate-16: '/linux-2.6.32.63'
  18 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI/testing'
   4 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/DocBook'
   2 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/DocBook/dvb'
   2 1-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/DocBook/v4l'
 714 1-shylesh-replicate-18: '/'
 587 1-shylesh-replicate-18: '/linux-2.6.32.63'
  18 1-shylesh-replicate-18: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 1-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI/testing'
 714 1-shylesh-replicate-19: '/'
 587 1-shylesh-replicate-19: '/linux-2.6.32.63'
  18 1-shylesh-replicate-19: '/linux-2.6.32.63/Documentation'
 714 1-shylesh-replicate-2: '/'
 714 1-shylesh-replicate-20: '/'
 589 1-shylesh-replicate-20: '/linux-2.6.32.63'
  18 1-shylesh-replicate-20: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-20: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-20: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-20: '/linux-2.6.32.63/Documentation/ABI/removed'
 587 1-shylesh-replicate-2: '/linux-2.6.32.63'
  18 1-shylesh-replicate-2: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI/testing'
   4 1-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/DocBook'
 714 1-shylesh-replicate-3: '/'
 587 1-shylesh-replicate-3: '/linux-2.6.32.63'
  18 1-shylesh-replicate-3: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI/testing'
 715 1-shylesh-replicate-4: '/'
 587 1-shylesh-replicate-4: '/linux-2.6.32.63'
  18 1-shylesh-replicate-4: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 1-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI/testing'
   4 1-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/DocBook'
 714 1-shylesh-replicate-5: '/'
 587 1-shylesh-replicate-5: '/linux-2.6.32.63'
  18 1-shylesh-replicate-5: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI/testing'
 714 1-shylesh-replicate-6: '/'
 587 1-shylesh-replicate-6: '/linux-2.6.32.63'
  18 1-shylesh-replicate-6: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI/testing'
   4 1-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/DocBook'
   2 1-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/DocBook/dvb'
 714 1-shylesh-replicate-7: '/'
 587 1-shylesh-replicate-7: '/linux-2.6.32.63'
  18 1-shylesh-replicate-7: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-7: '/linux-2.6.32.63/Documentation/ABI'
   2 1-shylesh-replicate-7: '/linux-2.6.32.63/Documentation/ABI/obsolete'
 714 1-shylesh-replicate-8: '/'
 587 1-shylesh-replicate-8: '/linux-2.6.32.63'
  18 1-shylesh-replicate-8: '/linux-2.6.32.63/Documentation'
   2 1-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI'
   3 1-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   2 1-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI/removed'
   2 1-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI/stable'
   2 1-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI/testing'
   4 1-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/DocBook'
 925 2-shylesh-replicate-10: '/'
 677 2-shylesh-replicate-10: '/linux-2.6.32.63'
  20 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/ABI/testing'
   7 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/DocBook'
   3 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/DocBook/dvb'
   3 2-shylesh-replicate-10: '/linux-2.6.32.63/Documentation/DocBook/v4l'
 926 2-shylesh-replicate-12: '/'
 677 2-shylesh-replicate-12: '/linux-2.6.32.63'
  20 2-shylesh-replicate-12: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-12: '/linux-2.6.32.63/Documentation/ABI'
 926 2-shylesh-replicate-13: '/'
 677 2-shylesh-replicate-13: '/linux-2.6.32.63'
  20 2-shylesh-replicate-13: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-13: '/linux-2.6.32.63/Documentation/ABI/testing'
 926 2-shylesh-replicate-14: '/'
 677 2-shylesh-replicate-14: '/linux-2.6.32.63'
 925 2-shylesh-replicate-15: '/'
 677 2-shylesh-replicate-15: '/linux-2.6.32.63'
  20 2-shylesh-replicate-15: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI/stable'
   4 2-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/ABI/testing'
   6 2-shylesh-replicate-15: '/linux-2.6.32.63/Documentation/DocBook'
 926 2-shylesh-replicate-16: '/'
 677 2-shylesh-replicate-16: '/linux-2.6.32.63'
  20 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/ABI/testing'
   6 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/DocBook'
   3 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/DocBook/dvb'
   3 2-shylesh-replicate-16: '/linux-2.6.32.63/Documentation/DocBook/v4l'
 926 2-shylesh-replicate-18: '/'
 679 2-shylesh-replicate-18: '/linux-2.6.32.63'
  20 2-shylesh-replicate-18: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-18: '/linux-2.6.32.63/Documentation/ABI/testing'
 926 2-shylesh-replicate-19: '/'
 677 2-shylesh-replicate-19: '/linux-2.6.32.63'
  20 2-shylesh-replicate-19: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-1: '/linux-2.6.32.63/arch/arm/mach-s3c6410/cpu.c'
 923 2-shylesh-replicate-2: '/'
 925 2-shylesh-replicate-20: '/'
 677 2-shylesh-replicate-20: '/linux-2.6.32.63'
  20 2-shylesh-replicate-20: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-20: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-20: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-20: '/linux-2.6.32.63/Documentation/ABI/removed'
 677 2-shylesh-replicate-2: '/linux-2.6.32.63'
   4 2-shylesh-replicate-2: '/linux-2.6.32.63/arch/arm/mach-u300/Makefile'
   4 2-shylesh-replicate-2: '/linux-2.6.32.63/arch/arm/mach-u300/padmux.c'
   4 2-shylesh-replicate-2: '/linux-2.6.32.63/arch/arm/mach-u300/spi.h'
   4 2-shylesh-replicate-2: '/linux-2.6.32.63/arch/arm/plat-mxc/include/mach/iomux-mx27.h'
   4 2-shylesh-replicate-2: '/linux-2.6.32.63/arch/avr32/boards/atstk1000/setup.c'
  20 2-shylesh-replicate-2: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/ABI/testing'
   6 2-shylesh-replicate-2: '/linux-2.6.32.63/Documentation/DocBook'
 926 2-shylesh-replicate-3: '/'
 677 2-shylesh-replicate-3: '/linux-2.6.32.63'
  20 2-shylesh-replicate-3: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-3: '/linux-2.6.32.63/Documentation/ABI/testing'
 926 2-shylesh-replicate-4: '/'
 677 2-shylesh-replicate-4: '/linux-2.6.32.63'
  20 2-shylesh-replicate-4: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI'
   4 2-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/ABI/testing'
   6 2-shylesh-replicate-4: '/linux-2.6.32.63/Documentation/DocBook'
 926 2-shylesh-replicate-5: '/'
 677 2-shylesh-replicate-5: '/linux-2.6.32.63'
  20 2-shylesh-replicate-5: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-5: '/linux-2.6.32.63/Documentation/ABI/testing'
 926 2-shylesh-replicate-6: '/'
 677 2-shylesh-replicate-6: '/linux-2.6.32.63'
  20 2-shylesh-replicate-6: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI/removed'
   3 2-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/ABI/testing'
   6 2-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/DocBook'
   3 2-shylesh-replicate-6: '/linux-2.6.32.63/Documentation/DocBook/dvb'
 926 2-shylesh-replicate-7: '/'
 677 2-shylesh-replicate-7: '/linux-2.6.32.63'
  20 2-shylesh-replicate-7: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-7: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-7: '/linux-2.6.32.63/Documentation/ABI/obsolete'
 926 2-shylesh-replicate-8: '/'
 677 2-shylesh-replicate-8: '/linux-2.6.32.63'
  20 2-shylesh-replicate-8: '/linux-2.6.32.63/Documentation'
   4 2-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI'
   3 2-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI/obsolete'
   3 2-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI/removed'
   4 2-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI/stable'
   3 2-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/ABI/testing'
   6 2-shylesh-replicate-8: '/linux-2.6.32.63/Documentation/DocBook'
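A list in this shape can be rebuilt from a mount log with a few lines of Python. This is a sketch under one assumption the comment above does not confirm: that each number is how many times a "possible split-brain" line was logged for that translator and path. The function names and the sample input are illustrative only.

```python
import re
from collections import Counter

# Captures the translator name and the path from a split-brain log line.
ENTRY_RE = re.compile(
    r"\] (\S+): Unable to self-heal contents of '([^']*)' \(possible split-brain\)"
)


def tally(lines):
    """Count split-brain log lines per (translator, path)."""
    counts = Counter()
    for line in lines:
        m = ENTRY_RE.search(line)
        if m:
            counts[(m.group(1), m.group(2))] += 1
    return counts


def format_tally(counts):
    """Render 'count translator: path' rows, sorted by translator then path."""
    return [f"{n:4d} {xl}: '{path}'" for (xl, path), n in sorted(counts.items())]


if __name__ == "__main__":
    # Shortened stand-ins for real log lines; only the matched part matters.
    sample = [
        "[ts] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-6: Unable to self-heal contents of '/' (possible split-brain). ...",
        "[ts] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-6: Unable to self-heal contents of '/' (possible split-brain). ...",
        "[ts] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-6: metadata self heal failed, on /",
    ]
    for row in format_tally(tally(sample)):
        print(row)
```

The leading graph number (1- or 2-) is kept as part of the translator name, so entries from different client graphs are counted separately, as in the list above.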
192.168.12.14, 192.168.12.16 and 192.168.12.18 are the nodes on which the network was brought down. I don't remember the exact order in which the nodes were brought back. The setup is only partially available because somebody accidentally screwed up my hypervisor.

split-brain logs
===================
[2014-08-06 17:58:06.854872] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-16: metadata self heal failed, on /
[2014-08-06 17:58:06.854972] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-15: metadata self heal failed, on /
[2014-08-06 17:58:06.855053] E [afr-self-heal-common.c:2906:afr_log_self_heal_completion_status] 2-shylesh-replicate-18: metadata self heal failed, on /
[2014-08-06 17:58:06.855148] E [afr-self-heal-common.c:2906:afr_
[2014-08-06 17:58:05.805057] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-6: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:05.805202] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-7: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
[2014-08-06 17:58:05.805416] E [afr-self-heal-common.c:233:afr_sh_print_split_brain_log] 2-shylesh-replicate-3: Unable to self-heal contents of '/' (possible split-brain). Please delete the file from all but the preferred subvolume.- Pending matrix: [ [ 0 1 ] [ 1 0 ] ]
Shylesh,
Is there at least one entry in the list I gave for which you can tell what kind of split-brain it is?
Pranith
Changing the component to AFR as there is "split brain" in bug description.
Component is gluster-afr, so removing zteam from devel whiteboard.
Thank you for submitting this issue for consideration in Red Hat Gluster Storage. The release you asked us to review is now End of Life. Please see https://access.redhat.com/support/policy/updates/rhs/

If you can reproduce this bug against a currently maintained version of Red Hat Gluster Storage, please feel free to file a new report against the current release.