Description of problem:
=======================
Automatic split-brain resolution with "size" as the policy can lead to inconsistency when only conservative merges are pending and the parent directory is in metadata split-brain. In a 2x2 volume we end up in a situation where, for a given directory, each DHT subvolume has different metadata from the other DHT subvolume.

Node 1, post automatic split-brain resolution with the size policy. Notice that the "dirty" directory has different permissions on different subvolumes:
=========
[root@dhcp35-37 glusterfs]# ll /rhs/brick*/distrep
/rhs/brick1/distrep:
total 0
drw-rw-rw-. 2 root root 97 Jan 28 23:51 dirty
d--x--x--x. 2 root root 6 Jan 28 23:37 fold1

/rhs/brick2/distrep:
total 0
d--x--x--x. 2 root root 117 Jan 28 23:51 dirty
d--x--x--x. 2 root root 6 Jan 28 23:37 fold1
[root@dhcp35-37 glusterfs]#

Node 2:
[root@dhcp35-116 glusterfs]# ll /rhs/brick*/distrep
/rhs/brick1/distrep:
total 0
drw-rw-rw-. 2 root root 97 Jan 28 23:51 dirty
d--x--x--x. 2 root root 6 Jan 28 23:37 fold1

/rhs/brick2/distrep:
total 0
d--x--x--x. 2 root root 117 Jan 28 23:51 dirty
d--x--x--x. 2 root root 6 Jan 28 23:37 fold1

Volume Name: distrep
Type: Distributed-Replicate
Volume ID: df5319f0-d889-4030-bb39-b8a41936a726
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick1/distrep
Brick2: 10.70.35.116:/rhs/brick1/distrep
Brick3: 10.70.35.37:/rhs/brick2/distrep
Brick4: 10.70.35.116:/rhs/brick2/distrep
Options Reconfigured:
cluster.favorite-child-policy: size
cluster.self-heal-daemon: enable
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on

Version-Release number of selected component (if applicable):
==========
3.8.4-12

How reproducible:
=====
Always

Steps to Reproduce:
=================
1. Create a 2x2 volume spanning two nodes (b1 on n1, b2 on n2, b3 on n1, b4 on n2; b1-b2 and b3-b4 are the replica pairs).
2. Mount on clients as below:
   c1 sees only n1
   c2 sees only n2
   c3 sees both n1 and n2
3. Set the favorite-child policy to "size".
4. Create a directory, say dir1, from c3.
5. Disable the self-heal daemon.
6. Under dir1, create files f{1..10} from c2 and x{1..10} from c1. This means a conservative merge is required on dir1.
7. chmod dir1 to, say, 0000 from c1.
8. chmod dir1 to, say, 0777 from c2.
9. Check heal info; it must show dir1 as in split-brain.
10. Enable the self-heal daemon and trigger a heal.

Actual results:
===============
dir1 on b1 (on n1) and dir1 on b3 (on n1) end up with different permissions, i.e. different DHT subvolumes have different permissions. The same holds for b2 and b4 on n2.

Expected results:

Additional info:
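The mismatch reported above can be spotted mechanically by comparing the mode bits of the same directory across brick roots. Below is a minimal sketch of such a check; the brick paths are placeholders (on the reporter's setup they would be /rhs/brick1/distrep and /rhs/brick2/distrep), and the demo at the end simulates two bricks with temporary directories using the same modes seen in the bug (drw-rw-rw- = 666, d--x--x--x = 111):

```shell
#!/bin/sh
# Sketch: flag a directory whose permission bits differ across brick roots.
# Usage: check_dir_modes <dirname> <brick-root>...
check_dir_modes() {
    dir=$1; shift
    first=
    for brick in "$@"; do
        mode=$(stat -c '%a' "$brick/$dir") || return 2
        if [ -z "$first" ]; then
            first=$mode
        elif [ "$mode" != "$first" ]; then
            echo "MISMATCH: $dir ($first vs $mode)"
            return 1
        fi
    done
    echo "OK: $dir ($first)"
}

# Demo on simulated bricks (stand-ins for the real brick roots):
b1=$(mktemp -d); b2=$(mktemp -d)
mkdir "$b1/dirty" "$b2/dirty"
chmod 0666 "$b1/dirty"
chmod 0111 "$b2/dirty"
check_dir_modes dirty "$b1" "$b2" || echo "exit status: $?"
```

Running this prints `MISMATCH: dirty (666 vs 111)`, matching the inconsistent "dirty" directory in the ll output above.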
[root@dhcp35-196 glusterfs]# gluster v info distrep

Volume Name: distrep
Type: Distributed-Replicate
Volume ID: df5319f0-d889-4030-bb39-b8a41936a726
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick1/distrep
Brick2: 10.70.35.116:/rhs/brick1/distrep
Brick3: 10.70.35.37:/rhs/brick2/distrep
Brick4: 10.70.35.116:/rhs/brick2/distrep
Options Reconfigured:
cluster.favorite-child-policy: size
cluster.self-heal-daemon: enable
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
[root@dhcp35-196 glusterfs]#
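The steps to reproduce can be sketched as a gluster CLI sequence. This is a dry-run sketch that only echoes each command, since executing them needs a live two-node cluster; hostnames and brick paths are taken from the volume info above, while the client mount points (/mnt/c1, /mnt/c2, /mnt/c3) are assumptions:

```shell
#!/bin/sh
# Dry-run sketch of the reproduction steps; "run" only prints each command.
run() { echo "# $*"; }

# Step 1: create the 2x2 volume (b1-b2 and b3-b4 are the replica pairs).
run gluster volume create distrep replica 2 \
    10.70.35.37:/rhs/brick1/distrep 10.70.35.116:/rhs/brick1/distrep \
    10.70.35.37:/rhs/brick2/distrep 10.70.35.116:/rhs/brick2/distrep
run gluster volume start distrep

# Step 3: set the favorite-child policy to size.
run gluster volume set distrep cluster.favorite-child-policy size

# Step 5: disable the self-heal daemon before creating the split-brain.
run gluster volume set distrep cluster.self-heal-daemon disable

# Steps 4, 6-8: from the partitioned clients (c1 sees only n1, c2 only n2).
run 'mkdir /mnt/c3/dir1'            # from c3
run 'touch /mnt/c2/dir1/f{1..10}'   # from c2
run 'touch /mnt/c1/dir1/x{1..10}'   # from c1
run 'chmod 0000 /mnt/c1/dir1'       # from c1
run 'chmod 0777 /mnt/c2/dir1'       # from c2

# Steps 9-10: confirm the split-brain, then re-enable and trigger the heal.
run gluster volume heal distrep info
run gluster volume set distrep cluster.self-heal-daemon enable
run gluster volume heal distrep
```

Replacing `run() { echo "# $*"; }` with `run() { "$@"; }` would execute the sequence for real on a matching test bed.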
This BZ hasn't received any updates since it was reported. Are we going to work on this bug in the coming months? Can we have a decision on it?