Description of problem:
Multisite sync encounters temporary EBUSY errors in normal operation, and will gracefully retry the operations until success. These temporary errors get written to the sync.error-log objects (visible via 'radosgw-admin sync error list'). The 'radosgw-admin sync error list' command should only contain actual sync errors that could require admin intervention. Including temporary EBUSY errors only serves to waste space in rados and obscure the more serious sync errors.

Version-Release number of selected component (if applicable):
RHCS 2.0 and later

How reproducible:
Easily reproducible, especially with multiple gateways per zone.

Steps to Reproduce:
1. Create a multisite configuration with two zones and two gateways each.
2. On the master zone, create a bucket and upload some objects.
3. On the secondary zone, wait a few minutes, then run 'radosgw-admin sync error list'.

Actual results:
The output of 'radosgw-admin sync error list' contains errors of the form:
"message": "failed to sync bucket instance: (16) Device or resource busy"

Expected results:
The output of 'radosgw-admin sync error list' should only contain real sync failures that would require admin intervention.

Additional info:
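Until the fix lands, one workaround is to filter the transient EBUSY entries out of the JSON emitted by 'radosgw-admin sync error list' before reviewing it. A minimal sketch follows; note that the exact JSON layout of the error-list output is an assumption here (only the "message" string quoted above is taken from this report), so the key lookups may need adjusting on a real cluster:

```python
import json

# Substring of the transient error quoted in this report:
# "failed to sync bucket instance: (16) Device or resource busy"
EBUSY_MARKER = "Device or resource busy"

def filter_transient(entries):
    """Drop entries whose message mentions EBUSY; keep real sync failures.

    'entries' is assumed to be a list of dicts each carrying a "message"
    field -- a hypothetical simplification of the real error-list JSON.
    """
    return [e for e in entries if EBUSY_MARKER not in e.get("message", "")]

if __name__ == "__main__":
    # Sample data mimicking the messages seen in this report.
    sample = [
        {"message": "failed to sync bucket instance: (16) Device or resource busy"},
        {"message": "failed to sync object: (5) Input/output error"},
    ]
    print(json.dumps(filter_transient(sample), indent=2))
```

In practice the JSON from 'radosgw-admin sync error list' would be loaded with json.load() and the relevant list of entries passed through filter_transient() so that only actionable failures remain.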
Would you please open the jewel and luminous backport PRs upstream as well, so we don't have to carry this patch long-term?
This bug is targeted for RHCEPH 2.5, and the fix is not yet in RHCEPH 3. Would you please cherry-pick the change to ceph-3.0-rhel-patches (referencing the RHCEPH 3 clone's bug ID, "Resolves: rhbz#1530665") so customers do not experience a regression?
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0340