Bug 1663570 - multisite sync errors from operations on a versioning-suspended bucket
Summary: multisite sync errors from operations on a versioning-suspended bucket
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW-Multisite
Version: 3.2
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z1
Target Release: 3.2
Assignee: Casey Bodley
QA Contact: Tejas
Docs Contact: Bara Ancincova
URL:
Whiteboard:
Depends On:
Blocks: 1629656 1690927
Reported: 2019-01-04 21:50 UTC by Casey Bodley
Modified: 2019-11-12 13:02 UTC (History)

Fixed In Version: RHEL: ceph-12.2.8-65.el7cp Ubuntu: ceph_12.2.8-52redhat1xenial
Doc Type: Bug Fix
Doc Text:
.Objects are now synced correctly in versioning-suspended buckets
Due to a bug in multi-site sync of versioning-suspended buckets, certain object versioning attributes were overwritten with incorrect values. Consequently, the affected objects failed to sync, and the sync process retried them endlessly, blocking further sync progress. With this update, the sync process no longer overwrites versioning attributes. In addition, any broken attributes are now detected and repaired. As a result, objects are synced correctly in versioning-suspended buckets.
Clone Of:
Clones: 1690927
Environment:
Last Closed: 2019-03-07 15:51:27 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 37792 | 0 | None | None | None | 2019-01-04 21:50:35 UTC
Github | ceph ceph pull 25794 | 0 | 'None' | closed | rgw: fetch_remote_obj filters out olh attrs | 2020-11-10 12:36:27 UTC
Github | ceph ceph pull 26157 | 0 | 'None' | closed | rgw: repair olh attributes that were broken by sync | 2020-11-10 12:36:26 UTC
Red Hat Product Errata | RHBA-2019:0475 | 0 | None | None | None | 2019-03-07 15:51:36 UTC

Description Casey Bodley 2019-01-04 21:50:35 UTC
Description of problem:

Overwrites in a versioning-suspended bucket will fail to sync with errors like "cls_rgw_bucket_link_olh() returned r=-125" and block further progress on the bucket.


How reproducible: 100%


Steps to Reproduce:

1. deploy a two-zone multisite configuration
2. on the master zone, create a bucket
3. upload an object "obj"
4. enable versioning on the bucket
5. reupload the same object "obj"
6. suspend versioning on the bucket
7. reupload the same object "obj"
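
Steps 2 through 7 can be scripted against the master zone. The following is a minimal sketch, not part of the original report. It assumes `s3` is a boto3 S3 client already pointed at the master zone's RGW endpoint (for example, one created with `boto3.client("s3", endpoint_url=...)`); the function name, the bucket name passed in, and the object bodies are hypothetical.

```python
def reproduce_suspended_overwrite(s3, bucket, key="obj"):
    """Run reproduction steps 2-7 against the master zone.

    Assumption: `s3` behaves like a boto3 S3 client. Only create_bucket,
    put_object and put_bucket_versioning are called on it.
    """

    def set_versioning(status):
        # status is "Enabled" or "Suspended", as in the S3 API
        s3.put_bucket_versioning(
            Bucket=bucket,
            VersioningConfiguration={"Status": status},
        )

    s3.create_bucket(Bucket=bucket)                    # step 2
    s3.put_object(Bucket=bucket, Key=key, Body=b"v1")  # step 3
    set_versioning("Enabled")                          # step 4
    s3.put_object(Bucket=bucket, Key=key, Body=b"v2")  # step 5
    set_versioning("Suspended")                        # step 6
    s3.put_object(Bucket=bucket, Key=key, Body=b"v3")  # step 7: the upload that fails to sync
```

Step 1 (deploying the two-zone configuration) is outside the scope of this sketch.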


Actual results:

Multisite sync on the secondary zone gets stuck attempting to sync the third upload, producing radosgw log errors like "cls_rgw_bucket_link_olh() returned r=-125" and osd log errors like "NOTICE: op.olh_tag (zxopy27aag3jjr38ddtow7517gdpgz4c) != olh.tag (bne5h7ou7gingobf89ae5crr2p3p284y)".
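
To confirm that a stuck bucket is hitting this bug, the OSD log can be scanned for the tag mismatch notice quoted above. The following is a rough sketch, assuming the log lines have the same shape as the sample in this report; the helper name is hypothetical and the pattern is derived only from that one sample.

```python
import re

# Pattern derived from the sample OSD log line in this report (an assumption:
# other Ceph versions may format the notice differently).
OLH_MISMATCH = re.compile(
    r"op\.olh_tag \((?P<op_tag>[^)]+)\) != olh\.tag \((?P<olh_tag>[^)]+)\)"
)


def find_olh_tag_mismatches(lines):
    """Yield (op_tag, olh_tag) for each log line reporting a mismatch."""
    for line in lines:
        match = OLH_MISMATCH.search(line)
        if match:
            yield match.group("op_tag"), match.group("olh_tag")
```

The function accepts any iterable of strings, so an open log file can be passed directly.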

Expected results:

Sync in versioning-suspended buckets should succeed and converge on the same objects/versions in both zones.
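
One way to check convergence is to compare the object versions listed by each zone. The following is a simplified sketch, assuming `master` and `secondary` are boto3 S3 clients for the two zones; the function names are hypothetical. It reads only the first page of `list_object_versions` results and ignores delete markers, which is enough for the single-object reproduction case.

```python
def version_set(s3, bucket):
    """Return the set of (key, version id, etag, is_latest) in one zone.

    Assumption: `s3` behaves like a boto3 S3 client. Only the first page
    of results is read, and delete markers are not compared.
    """
    resp = s3.list_object_versions(Bucket=bucket)
    return {
        (v["Key"], v["VersionId"], v["ETag"], v["IsLatest"])
        for v in resp.get("Versions", [])
    }


def zones_converged(master, secondary, bucket):
    """True if both zones list the same object versions for the bucket."""
    return version_set(master, bucket) == version_set(secondary, bucket)
```

Because sync is asynchronous, a check like this would need to be retried for a while before a mismatch is treated as a failure.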

Comment 38 errata-xmlrpc 2019-03-07 15:51:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0475

