Description of problem:
======================
With BZ#1315781 (AFR returns the node uuid of the same node for every file in the replica), AFR now returns both node UUIDs, and hence DHT can rebalance files from both nodes.

However, DHT effectively hard-codes the set of files to be migrated by each node. Say I have files f{1..10000}. DHT splits them so that about 5000 files are rebalanced by n1 and the remaining files by n2.

If for some reason n1 is not able to rebalance (say brick b1 went down, or n1 went down as mentioned in another bug, BZ#1476676), then n2 must be able to take care of rebalancing the files that were n1's responsibility.

In particular, once the setup is healthy again (i.e. n1 is up), and assuming n2 completed its share of the rebalance before n1 came up, triggering rebalance again results in only n1 participating; n2 does not rebalance anything. Hence load balancing is lost.

Version-Release number of selected component (if applicable):
=======
3.8.4-36

How reproducible:
==========
Always

Steps to Reproduce:
1. Create a 1x2 volume with b1 on n1 and b2 on n2.
2. Create files dir{1..10}/f{1..10000}.
3. add-brick b3 on n1 and b4 on n2.
4. Trigger rebalance.
5. Rebalance status shows both n1 and n2 participating in the rebalance.
6. Now bring down n1; n2 goes ahead with rebalancing its pre-allotted files.
7. After the rebalance is completed (by n2), bring n1 back online and trigger heal.
8. After heal, re-trigger rebalance.

Actual results:
=========
Only n1 is rebalancing files, while n2 is just a passive watcher.

Expected results:
=======
n2 must also pitch in to rebalance files, to help balance the load.
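
Additional info:
=======
The behaviour above can be illustrated with a small toy model. This is only a sketch, not the DHT code: static_owner() is a hypothetical stand-in for how each file is tied to one node (a CRC32 of the file name is assumed here purely for illustration), every file is treated as needing migration, and rebalance_shared() is an assumed alternative in which the nodes that are up split whatever is still pending.

```python
import zlib

NODES = ["n1", "n2"]


def static_owner(name, nodes=NODES):
    # Hypothetical stand-in for the per-file node selection: a fixed
    # function of the file name, independent of what is still pending.
    return nodes[zlib.crc32(name.encode()) % len(nodes)]


def rebalance_static(pending, up_nodes):
    """Each node that is up migrates only the files statically assigned to it."""
    done = {n: 0 for n in NODES}
    remaining = set()
    for f in pending:
        owner = static_owner(f)
        if owner in up_nodes:
            done[owner] += 1
        else:
            remaining.add(f)  # owner is down, file stays unmigrated
    return done, remaining


def rebalance_shared(pending, up_nodes):
    """Assumed alternative: split the pending files across the nodes that are up."""
    done = {n: 0 for n in NODES}
    up = sorted(up_nodes)
    for i, f in enumerate(sorted(pending)):
        done[up[i % len(up)]] += 1
    return done, set()


files = {f"f{i}" for i in range(1, 10001)}

# Run 1: n1 is down, so only n2 migrates its statically assigned share.
run1, leftover = rebalance_static(files, {"n2"})
# Run 2: n1 is back; everything left belongs to n1, so n2 has nothing to do.
run2, _ = rebalance_static(leftover, {"n1", "n2"})
# Same second run, with the pending files shared between both nodes.
run2_shared, _ = rebalance_shared(leftover, {"n1", "n2"})

print("run 1 (n1 down):     ", run1)
print("run 2 (static split):", run2)
print("run 2 (shared split):", run2_shared)
```

In this model the second run with the static split always reports zero files for n2, which matches the "passive watcher" seen in the actual results, while the shared split divides the leftover files roughly evenly between n1 and n2.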