Description of problem:
1. Upload an object.
2. The replicator runs.
3. Upload another object that uses the same partition but a different suffix.
4. Lose a disk before the replicator has touched this partition, then replace it.
5. Run the replicators again.

The object suffix from step 1 will be replicated; the suffix of the second object (step 3) won't.

On every replicator run, 10% of the hashes are treated as outdated and updated. However, the order of partitions is randomized, which means it can take more than 10 replicator runs before a given hashes.pkl is fixed, while others are updated much more often. This should be more deterministic. In fact it is very likely that it takes 50..70 replication cycles until this is fixed, depending on the partition count per replicator.

See upstream bugs for more details:
https://bugs.launchpad.net/swift/+bug/1634967
https://bugs.launchpad.net/swift/+bug/1644807

Version-Release number of selected component (if applicable):
Mitaka, Newton (OSP9 & OSP10)

How reproducible:
Always

Steps to Reproduce:
See above.

Actual results:
See above.

Expected results:
All hashes.pkl updated after the 10th replication pass.
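To illustrate why randomized partition order can push the repair well past 10 passes, here is a rough simulation. It is only a sketch under a simplifying assumption, not Swift's actual replicator code: a running job counter is kept across passes, every tenth job re-lists its partition, and partitions are shuffled before each pass. The names `passes_until_all_refreshed`, `job_count` and `every` are made up for this example.

```python
import random


def passes_until_all_refreshed(num_partitions, every=10, seed=0):
    """Count replication passes until every partition was re-listed once.

    Simplified model (assumption): one job per partition per pass, partitions
    visited in random order, and every `every`-th job triggers a re-listing.
    """
    rng = random.Random(seed)
    partitions = list(range(num_partitions))
    stale = set(partitions)   # partitions whose hashes.pkl is still outdated
    job_count = 0             # running counter across all passes
    passes = 0
    while stale:
        passes += 1
        rng.shuffle(partitions)  # randomized order on every pass
        for part in partitions:
            job_count += 1
            if job_count % every == 0:
                stale.discard(part)  # this partition gets refreshed now
    return passes


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, "partitions:", passes_until_all_refreshed(n), "passes")
```

In this model each pass refreshes only a tenth of the partitions, and because the selection is random the same partitions can be picked repeatedly while others wait; the pass count therefore grows with the number of partitions instead of stopping at 10.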
The bug in the hashes.pkl handling has been fixed on master, and backports have been submitted for review: https://review.openstack.org/#/q/Ie2700f6e6171f2ecfa7d07b0f18b79e90cbf1c8a,n,z

The fix to make the invalidation of the hashes.pkl deterministic is ready for review on master: https://review.openstack.org/#/c/402376/

IMO both patches are critical, and we need to backport both of them to Mitaka and Newton and publish updated RPMs quickly. Thoughts?
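For illustration, one way a deterministic invalidation could look is to derive the decision from the partition number and the replication cycle, so that every partition is guaranteed to come up once within 10 cycles. This is a hedged sketch, not necessarily what the patch under review does; `should_relist` and `cycles_until_all_refreshed` are hypothetical names.

```python
def should_relist(partition, cycle, every=10):
    """Deterministic rule (assumption): re-list when (partition + cycle)
    is a multiple of `every`, so each partition is hit once per `every` cycles."""
    return (partition + cycle) % every == 0


def cycles_until_all_refreshed(num_partitions, every=10):
    """Count replication cycles until every partition was re-listed once."""
    stale = set(range(num_partitions))
    cycle = 0
    while stale:
        # Drop every partition that is due for a re-listing in this cycle.
        stale = {p for p in stale if not should_relist(p, cycle, every)}
        cycle += 1
    return cycle


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        print(n, "partitions:", cycles_until_all_refreshed(n), "cycles")
```

With such a rule the upper bound no longer depends on the partition count or on the visiting order: no partition waits longer than `every` cycles.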
Requires backports through 9.0 when repaired upstream.
Updating for stable/newton reviews.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2017-0235.html