Description of problem:
When a rebalance is run while geo-rep is syncing files, the geo-rep xsync crawl fails to pick up some of the files. Consequently, those files are not synced to the slave.

Observations:
1. All the missing files had entries in the changelog of one brick.
2. That brick changelog was present in either the .processing or the .processed directory of the geo-rep working_dir of the brick.
3. Those files had no entries in the XSYNC-CHANGELOG either, which means that after the stop and start of geo-rep, the first xsync crawl failed to get those entries.

Version-Release number of selected component (if applicable):
3.4.0.12rhs.beta6-1.el6rhs.x86_64

How reproducible:
Didn't try reproducing it.

Steps to Reproduce:
1. Create and start a geo-rep relationship between master (DIST_REP) and slave.
2. Add bricks to the master volume.
3. Start creating files on the master volume and, in parallel, do geo-rep stop && geo-rep start && rebalance start.

Actual results:
Fails to sync a few files.

Expected results:
Should sync all the files.

Additional info:
One more observation to add: the .processed directory in the geo-rep working directory of the brick that all the missing files came from has entries for the changelogs CHANGELOG.1374662896 and CHANGELOG.1374662936, but not for CHANGELOG.1374662916. In the backend changelog dir, all the missing files have entries in CHANGELOG.1374662916. The xsync crawl changelog that was processed at that time was XSYNC-CHANGELOG.1374662916. Hope this helps.
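The manual comparison above (backend changelog dir versus the brick's .processing and .processed directories) could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, it assumes changelog files are named CHANGELOG.<unix timestamp> as in this report, and it works on plain lists of file names rather than assuming any directory layout.

```python
import re

# Matches names such as CHANGELOG.1374662916. The numeric suffix is assumed
# to be a Unix timestamp. XSYNC-CHANGELOG.* names deliberately do not match.
CHANGELOG_RE = re.compile(r"^CHANGELOG\.(\d+)$")


def changelog_stamps(names):
    """Return the set of timestamps found among changelog file names."""
    stamps = set()
    for name in names:
        match = CHANGELOG_RE.match(name)
        if match:
            stamps.add(int(match.group(1)))
    return stamps


def unconsumed_changelogs(backend_names, processing_names, processed_names):
    """List backend changelogs absent from both .processing and .processed."""
    backend = changelog_stamps(backend_names)
    consumed = changelog_stamps(processing_names) | changelog_stamps(processed_names)
    return ["CHANGELOG.%d" % stamp for stamp in sorted(backend - consumed)]
```

To use it, pass the os.listdir() output of the backend changelog dir and of the brick's .processing and .processed directories. Any name it returns is a changelog that geo-rep never moved into its working directory, which is the pattern seen here for CHANGELOG.1374662916.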
As per the discussions in the mailing thread:

>
> Can we consider taking 'blocker' flag from this and mark bug as 'medium'
> priority? Its an issue when rebalancing, you are not having the
> geo-replication stopped and started. For now, if geo-replication is
> continuously running, rebalance is handled properly.
>

Sayan: Agree that this is not a blocker.

Amar: taking this out of 'blocker' list now.
Closing this bug since the RHGS 2.1 release has reached EOL. Required bugs have been cloned to RHGS 3.1. Please re-open this issue if it is found again.