Bug 1527132 - sync.error-log objects fill up with temporary EBUSY errors
Summary: sync.error-log objects fill up with temporary EBUSY errors
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW-Multisite
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: 2.5
Assignee: Casey Bodley
QA Contact: Tejas
URL:
Whiteboard:
Depends On: 1530665
Blocks:
 
Reported: 2017-12-18 15:49 UTC by Casey Bodley
Modified: 2022-02-21 18:01 UTC
CC: 7 users

Fixed In Version: RHEL: ceph-10.2.10-9.el7cp Ubuntu: ceph_10.2.10-6redhat1xenial
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1530665 (view as bug list)
Environment:
Last Closed: 2018-02-21 19:47:28 UTC
Embargoed:




Links:
Ceph Project Bug Tracker 22473 (last updated 2017-12-18 15:49:57 UTC)
Red Hat Issue Tracker RHCEPH-1535 (last updated 2021-09-09 13:00:56 UTC)
Red Hat Product Errata RHBA-2018:0340, SHIPPED_LIVE: Red Hat Ceph Storage 2.5 bug fix and enhancement update (last updated 2018-02-22 00:50:32 UTC)

Description Casey Bodley 2017-12-18 15:49:58 UTC
Description of problem:

Multisite sync encounters temporary EBUSY errors in normal operation and gracefully retries the affected operations until they succeed. These temporary errors are nevertheless written to the sync.error-log objects (visible via 'radosgw-admin sync error list').

The 'radosgw-admin sync error list' output should contain only actual sync errors that may require admin intervention. Logging temporary EBUSY errors wastes space in RADOS and obscures the more serious sync errors.

Version-Release number of selected component (if applicable): RHCS 2.0 and later


How reproducible:

Easily reproducible, especially with multiple gateways per zone.

Steps to Reproduce:
1. Create a multisite configuration with two zones, each served by two gateways.
2. On master zone, create a bucket and upload some objects.
3. On secondary zone, wait a few minutes, then run 'radosgw-admin sync error list'.

Actual results:

The output of 'radosgw-admin sync error list' contains errors of the form:

"message": "failed to sync bucket instance: (16) Device or resource busy"

Expected results:

The output of 'radosgw-admin sync error list' should only contain real sync failures that would require admin intervention.
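Until the fix lands, an admin reviewing the error log can separate transient EBUSY noise from genuine failures by filtering on errno 16. The sketch below assumes an illustrative JSON shape for the 'radosgw-admin sync error list' output (a list of shards, each with an "entries" array whose items carry an "info" object with "error_code" and "message" fields); field names here are a simplified approximation, not the exact on-disk format.

```python
import json

EBUSY = 16  # errno for "Device or resource busy"

def split_errors(error_list_json):
    """Partition sync-error entries into transient EBUSY noise and real failures.

    Assumes an illustrative JSON shape: a list of shards, each with an
    "entries" array whose items carry an "info" object containing
    "error_code" and "message".
    """
    transient, real = [], []
    for shard in json.loads(error_list_json):
        for entry in shard.get("entries", []):
            info = entry.get("info", {})
            if info.get("error_code") == EBUSY:
                transient.append(entry)
            else:
                real.append(entry)
    return transient, real

# Illustrative sample mimicking the entries reported in this bug
sample = json.dumps([{
    "shard_id": 0,
    "entries": [
        {"id": "1_1513612800.0_1.1", "info": {
            "error_code": 16,
            "message": "failed to sync bucket instance: (16) Device or resource busy"}},
        {"id": "1_1513612801.0_2.1", "info": {
            "error_code": 5,
            "message": "failed to sync object: (5) Input/output error"}},
    ],
}])

transient, real = split_errors(sample)
print(len(transient), len(real))  # 1 transient EBUSY entry, 1 real failure
```

With the sample above, the EBUSY entry is set aside and only the I/O error remains for admin attention.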

Additional info:

Comment 5 Ken Dreyer (Red Hat) 2018-01-03 15:31:53 UTC
Would you please do the jewel and luminous backport PRs upstream as well so we don't have to carry this patch long-term?

Comment 6 Ken Dreyer (Red Hat) 2018-01-03 15:37:25 UTC
This bug is targeted for RHCEPH 2.5 and this fix is not in RHCEPH 3.

Would you please cherry-pick the change to ceph-3.0-rhel-patches (with the RHCEPH 3 clone ID number, "Resolves: rhbz#1530665") so customers do not experience a regression?

Comment 15 errata-xmlrpc 2018-02-21 19:47:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0340

