Description of problem:
=======================
Firstly, I don't see a real reason for having a "majority" policy here. The definition given (from the patch information at review.gluster.org) is as below:

https://review.gluster.org/#/c/14535/
"The majority policy will not pick a source if there is no majority. The other three policies pick the first brick with a valid reply and non-zero ctime/mtime/size as source."

Since a majority requires more than half of the bricks in a replica set to agree, and the two bricks of a replica-2 pair by definition disagree in a split-brain, the policy can never pick a source on such a volume.

When I create a data split-brain with a metadata heal pending, the split-brain sometimes gets resolved and sometimes does not. Over the long run, i.e. by disabling and re-enabling the self-heal daemon (shd), the file does get healed.

The problems are:
1) What is the purpose of the majority option on an x2 volume, where there can be no majority?
2) If the majority policy is required at all, it must have no effect on an x2 volume.

Version-Release number of selected component (if applicable):
=============================================================
3.8.4-12

How reproducible:
=================
Mostly

Steps to Reproduce:
===================
1. Create a 2x2 volume spanning two nodes (b1 on n1, b2 on n2, b3 on n1, b4 on n2; b1-b2 and b3-b4 are the replica pairs).
2. Mount on clients as below:
   c1 sees only n1
   c2 sees only n2
   c3 sees both n1 and n2
3. Set the favorite-child policy to majority.
4. Create a file f1 from c3.
5. Disable the self-heal daemon.
6. From c1 and from c2, echo the client's hostname into f1 ==> results in a data split-brain.
7. From c2, chmod f1 to, say, 0000.
8. Check heal info; it must show f1 as in split-brain.
9. Re-enable healing and trigger a heal (a shell sketch of these steps is given below).

The file sometimes fails to heal the split-brain, with the following shd logs:

[2017-01-29 07:41:44.017813] W [MSGID: 108042] [afr-self-heal-common.c:828:afr_mark_split_brain_source_sinks_by_policy] 60-dhtafr-replicate-0: Source dhtafr-client-0 selected as authentic to resolve conflicting data in file (gfid:5c510dc1-b210-4c56-bea5-07448d22dee9) by SIZE (34 bytes @ 2017-01-29 13:05:04 mtime, 2017-01-29 13:11:44 ctime).
[2017-01-29 07:41:44.018377] I [MSGID: 108026] [afr-self-heal-metadata.c:52:__afr_selfheal_metadata_do] 60-dhtafr-replicate-0: performing metadata selfheal on 5c510dc1-b210-4c56-bea5-07448d22dee9

However, if we disable and re-enable the heal, the split-brain gets resolved.
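For concreteness, a rough shell sketch of steps 3-9 follows. The mount points /mnt/c1, /mnt/c2 and /mnt/c3 are illustrative assumptions, and the mechanics of restricting c1 and c2 to a single node each are left out:

# gluster volume set dhtafr cluster.favorite-child-policy majority

From c3 (sees both nodes), create the file:
# touch /mnt/c3/f1

Disable healing so the conflicting writes stay unresolved:
# gluster volume set dhtafr cluster.self-heal-daemon disable

From c1 (sees only n1) and from c2 (sees only n2), write conflicting data:
# echo "$(hostname)" > /mnt/c1/f1
# echo "$(hostname)" > /mnt/c2/f1

From c2, leave a metadata heal pending on top of the data split-brain:
# chmod 0000 /mnt/c2/f1

Confirm the split-brain, then re-enable healing and trigger it:
# gluster volume heal dhtafr info
# gluster volume set dhtafr cluster.self-heal-daemon enable
# gluster volume heal dhtafr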
Additional info:
================
Volume Name: dhtafr
Type: Distributed-Replicate
Volume ID: dc74c0b6-eb4c-402e-b6af-b38d3fddb3c1
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.35.37:/rhs/brick1/dhtafr
Brick2: 10.70.35.116:/rhs/brick1/dhtafr
Brick3: 10.70.35.37:/rhs/brick2/dhtafr
Brick4: 10.70.35.116:/rhs/brick2/dhtafr
Options Reconfigured:
cluster.self-heal-daemon: disable
cluster.favorite-child-policy: majority
performance.readdir-ahead: on
nfs.disable: on
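For reference, the workaround described above (toggling the self-heal daemon and retriggering the heal) amounts to the following; after one or more such cycles, heal info eventually stops listing f1 as in split-brain:

# gluster volume set dhtafr cluster.self-heal-daemon disable
# gluster volume set dhtafr cluster.self-heal-daemon enable
# gluster volume heal dhtafr
# gluster volume heal dhtafr info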
This BZ hasn't received any updates in a long time. What's the plan for this bug?