Created attachment 583493 [details]
sos report

Description of problem:
While self-heal is in progress, rebalance does not migrate the files completely; rerunning rebalance migrates some more of the files.

Version-Release number of selected component (if applicable):
3.3.0qa40

How reproducible:

Steps to Reproduce:
1. Created a pure replica volume with replica count 2 (say brick1, brick2).
2. Created 100 files of 10 MB each, and a directory tree of depth 100 with 1 file of 1 MB at each level.
3. Peer probed another machine and did an add-brick (brick3, brick4).
4. Initiated the rebalance and brought down brick2.
5. After some time brought the brick back up.
6. Rebalance status says completed with a migrated-file count of 34; the log messages say "transport endpoint not connected".
7. Rerunning rebalance migrates some more of the files, bringing the total up to 64.
8. Arequal-checksums on the mount point before and after rebalance are the same.

Actual results:
Status should not say "completed" until proper data migration happens.

Attached the SOS report. Rebalance log location: var/log/glusterfs/repl-rebalance.log
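For reference, the steps above map roughly to the following commands (volume name, host names and brick paths here are placeholders, not the ones from the attached sosreport):

# on host1
gluster volume create repl replica 2 host1:/export/brick1 host1:/export/brick2
gluster volume start repl
mount -t glusterfs host1:/repl /mnt/repl
# populate /mnt/repl with the 100 x 10 MB files and the 100-level directory tree
gluster peer probe host2
gluster volume add-brick repl host2:/export/brick3 host2:/export/brick4
gluster volume rebalance repl start
# kill the brick2 process here, bring it back after a while
gluster volume rebalance repl status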
I suspect the bug is mostly due to the issue of migrating from a pure replica to a distributed-replicate volume, but considering the arequal-checksums are fine, I would reduce the priority.
I tried to re-create this on a single node and still got failures. When the brick is down, we seem to be getting duplicate entries from readdirp, and hence multiple migration attempts are made:

[2012-06-08 13:32:28.202224] I [dht-rebalance.c:639:dht_migrate_file] 0-new-dht: /55.file: attempting to move from new-replicate-0 to new-replicate-1
.
.
[2012-06-08 13:32:31.624447] I [dht-rebalance.c:639:dht_migrate_file] 0-new-dht: /55.file: attempting to move from new-replicate-0 to new-replicate-1
.
.
[2012-06-08 13:32:31.640475] I [dht-rebalance.c:639:dht_migrate_file] 0-new-dht: /55.file: attempting to move from new-replicate-0 to new-replicate-1

This causes a few of the migrations to fail. It looks like when a child of replicate is down, readdir does not behave as expected. Note that if a child of DHT is down, we stop rebalance (assert-on-child-down is set to on).
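To make the failure mode concrete, here is a small standalone C sketch (not glusterfs source; the file names and d_off values are made up) showing how an offset-resumed readdirp can return already-processed entries when the read switches from one replica child to the other:

/* Standalone sketch: a crawl that resumes a directory read by offset.
 * Both replica children hold the same names, but at different (made-up)
 * d_off cookies, as the backend filesystems would. */
#include <stdio.h>

struct entry { long d_off; const char *name; };

static struct entry child0[] = { {10, "54.file"}, {20, "55.file"}, {30, "56.file"} };
static struct entry child1[] = { {100, "54.file"}, {200, "55.file"}, {300, "56.file"} };

/* Minimal readdirp: return up to 'max' entries with d_off greater than 'off'. */
static int readdirp(struct entry *c, int n, long off, struct entry *out, int max)
{
    int i, cnt = 0;
    for (i = 0; i < n && cnt < max; i++)
        if (c[i].d_off > off)
            out[cnt++] = c[i];
    return cnt;
}

int main(void)
{
    struct entry buf[3];
    long off = 0;
    int i, n;

    /* First batch served by child0: 54.file and 55.file get migrated. */
    n = readdirp(child0, 3, off, buf, 2);
    for (i = 0; i < n; i++) {
        printf("attempting to move %s\n", buf[i].name);
        off = buf[i].d_off;          /* off == 20, only meaningful on child0 */
    }

    /* child0 goes down; the crawl resumes on child1 with the stale offset.
     * child1 returns 54.file and 55.file again, so a second migration is
     * attempted for files that were already moved. */
    n = readdirp(child1, 3, off, buf, 3);
    for (i = 0; i < n; i++)
        printf("attempting to move %s (duplicate if already migrated)\n", buf[i].name);

    return 0;
}

Since the d_off cookies come from each brick's backend filesystem, an offset handed out by one child has no meaning on the other, which is why the same files show up again and get duplicate migration attempts.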
Part of the issue was fixed by loading distribute by default even in the case of a single subvolume (for the layout xattr creation). Also, considering we disable replicate self-heal in the rebalance process, this issue is not seen much. Other than that, the issue is happening because replicate returns a wrong (incorrect) offset in readdirp_cbk() when a brick goes down. Need more feedback from the replicate team to handle this issue. Currently I don't see an issue with distribute (i.e., the rebalance process).
This is a replicate-related bug. Readdir returns different entries at different offsets from the subvolumes when a brick in a replica pair goes down.
*** This bug has been marked as a duplicate of bug 859387 ***