
Bug 1399079

Summary:	Swift replication might skip suffixes temporarily
Product:	Red Hat OpenStack	Reporter:	Christian Schwede (cschwede) <cschwede>
Component:	openstack-swift	Assignee:	Pete Zaitcev <zaitcev>
Status:	CLOSED ERRATA	QA Contact:	Mike Abrams <mabrams>
Severity:	urgent	Docs Contact:
Priority:	medium
Version:	9.0 (Mitaka)	CC:	acanan, cschwede, ddomingo, derekh, egafford, jschluet, pgrist, scohen, srevivo, thiago, zaitcev
Target Milestone:	z1	Keywords:	Triaged, ZStream
Target Release:	10.0 (Newton)
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	openstack-swift-2.10.1-1.el7ost	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-02-01 14:36:10 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Christian Schwede (cschwede) 2016-11-28 09:08:17 UTC

Description of problem:

1. Upload an object
2. Replicator runs
3. Upload another object that uses the same partition but different suffix
4. Lose a disk before the replicator has touched this partition, then replace it
5. Run replicators again

The suffix of the first object (step 1) will be replicated; the suffix of the second object (step 3) won't.
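
To illustrate what "same partition but different suffix" means in step 3: as far as I understand Swift's on-disk layout, the partition is taken from the leading bits of the MD5 of the object path, and the suffix directory is the last three hex characters of that same hash. The sketch below is not Swift code. PART_POWER, the hash prefix/suffix values and all function names are made up for illustration; a real cluster takes these from the ring and swift.conf.

```python
import hashlib
import struct

# Assumptions for illustration only; real values come from the ring and swift.conf.
PART_POWER = 8
HASH_PATH_PREFIX = b""
HASH_PATH_SUFFIX = b"changeme"


def partition_and_suffix(account, container, obj):
    """Return (partition, suffix) for an object path under the assumed settings."""
    path = "/%s/%s/%s" % (account, container, obj)
    digest = hashlib.md5(
        HASH_PATH_PREFIX + path.encode("utf-8") + HASH_PATH_SUFFIX
    ).digest()
    # Partition: top PART_POWER bits of the first 4 bytes of the hash.
    partition = struct.unpack_from(">I", digest)[0] >> (32 - PART_POWER)
    # Suffix directory: last three hex characters of the hash.
    suffix = digest.hex()[-3:]
    return partition, suffix


def find_same_partition_different_suffix(account, container, limit=100000):
    """Search for a second object name that shares the partition of 'obj-0'
    but ends up in a different suffix directory."""
    first = "obj-0"
    first_part, first_suffix = partition_and_suffix(account, container, first)
    for i in range(1, limit):
        name = "obj-%d" % i
        part, suffix = partition_and_suffix(account, container, name)
        if part == first_part and suffix != first_suffix:
            return first, name
    raise RuntimeError("no matching object name found within limit")


if __name__ == "__main__":
    a, b = find_same_partition_different_suffix("AUTH_test", "c")
    print(a, partition_and_suffix("AUTH_test", "c", a))
    print(b, partition_and_suffix("AUTH_test", "c", b))
```

Uploading two such objects, with a replicator run in between, would correspond to steps 1 to 3 above.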

On every replicator run, 10% of the hashes are treated as outdated and updated.
However, the order of partitions is randomized, so it can take more than 10 replicator runs before a given hashes.pkl is fixed, while others are updated much more often. This should be more deterministic.

In fact, it very likely takes 50 to 70 replication cycles until this is fixed for all partitions, depending on the partition count per replicator.
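
A rough way to see where numbers like these come from: if each pass refreshes a random 10% of the partitions, a single partition is picked with probability 0.1 per pass, and waiting until every one of N partitions has been picked at least once takes roughly ln(N) / ln(1/0.9) passes (about 66 for N = 1000). The sketch below is a simplified model, not the replicator code. The function names are hypothetical, and the deterministic rule `(partition + cycle) % 10 == 0` is only my assumption of what a deterministic schedule could look like.

```python
import random


def passes_until_all_refreshed_random(partitions, seed=0):
    """Every 10th job triggers a refresh, but the job order is reshuffled on
    each pass, so which partitions get refreshed is effectively random."""
    rng = random.Random(seed)
    pending = set(range(partitions))
    order = list(range(partitions))
    job_counter = 0  # keeps counting across passes
    passes = 0
    while pending:
        passes += 1
        rng.shuffle(order)
        for part in order:
            job_counter += 1
            if job_counter % 10 == 0:
                pending.discard(part)
    return passes


def passes_until_all_refreshed_deterministic(partitions):
    """Refresh is keyed on partition number and cycle, so every partition is
    hit exactly once within any 10 consecutive cycles."""
    pending = set(range(partitions))
    cycle = 0
    while pending:
        for part in range(partitions):
            if (part + cycle) % 10 == 0:
                pending.discard(part)
        cycle += 1
    return cycle


if __name__ == "__main__":
    for n in (100, 1000, 5000):
        print(n,
              passes_until_all_refreshed_random(n),
              passes_until_all_refreshed_deterministic(n))
```

In this model the deterministic variant always finishes in 10 passes, while the randomized one lands well above that and grows with the partition count.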

See upstream bugs for more details:

https://bugs.launchpad.net/swift/+bug/1634967
https://bugs.launchpad.net/swift/+bug/1644807


Version-Release number of selected component (if applicable):

Mitaka, Newton (OSP9 & OSP10)

How reproducible: Always

Steps to Reproduce: See above.

Actual results: See above.

Expected results: All hashes.pkl files are updated after the 10th replication pass.

Comment 1 Christian Schwede (cschwede) 2016-11-28 09:16:59 UTC

The bug in the hashes.pkl has been fixed on master, and backports have been submitted for review:

https://review.openstack.org/#/q/Ie2700f6e6171f2ecfa7d07b0f18b79e90cbf1c8a,n,z

The fix to make the invalidation of the hashes.pkl deterministic is ready for review on master:

https://review.openstack.org/#/c/402376/

IMO both patches are critical, and we need to backport them both for Mitaka and Newton & publish updated rpms quickly. Thoughts?

Comment 3 Elise Gafford 2016-11-30 14:10:30 UTC

Requires backports through 9.0 when repaired upstream.

Comment 5 Jon Schlueter 2017-01-16 16:53:03 UTC

updating for stable/newton reviews

Comment 11 errata-xmlrpc 2017-02-01 14:36:10 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0235.html