Bug 1699478

Summary: rgw-multisite: log trimming does not make progress unless zones 'sync_from_all'
Product: Red Hat Ceph Storage Reporter: Casey Bodley <cbodley>
Component: RGW-Multisite Assignee: Casey Bodley <cbodley>
Status: CLOSED ERRATA QA Contact: Tejas <tchandra>
Severity: urgent Docs Contact: Aron Gunn <agunn>
Priority: low    
Version: 3.2 CC: agunn, anharris, assingh, ceph-eng-bugs, ceph-qe-bugs, mbenjamin, mmuench, roemerso, tserlin, vimishra, vumrao
Target Milestone: z2   
Target Release: 3.2   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHEL: ceph-12.2.8-128.el7cp Ubuntu: ceph_12.2.8-111redhat1 Doc Type: Bug Fix
Doc Text:
.A multi-site Ceph Object Gateway is not trimming the data and bucket index logs
Configuring zones for a multi-site Ceph Object Gateway without setting the `sync_from_all` option caused the data and bucket index logs not to be trimmed. With this release, the automated trimming process only consults the synchronization status of peer zones that are configured to synchronize. As a result, the data and bucket index logs are trimmed properly.
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-04-30 15:57:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1629656    

Description Casey Bodley 2019-04-12 21:05:10 UTC
Description of problem:

The trimming process for data logs and bucket index logs queries the sync status of peer zones to determine how much of the local log is safe to trim. A log entry is only safe to trim if every peer zone reports a sync status marker that is larger than that entry.

The default zone configuration sets 'sync_from_all=true', meaning that the zone syncs data from each peer zone in its zonegroup. A zone can also be configured to only 'sync_from' a subset of the zonegroup. When such a zone does not sync from one of its peers, it returns empty markers when that peer requests its sync status. This prevents the peer zone from making progress in trimming its data logs and bucket index logs.
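
To make the failure mode concrete, here is a minimal Python sketch of the trim decision described above. It is not the RGW implementation: the function name, the integer log positions, and the use of None for an empty marker are all hypothetical simplifications. It also assumes that, once a non-syncing peer is no longer consulted, it places no constraint on trimming, which is inferred from the expected results in this report.

```python
from typing import Dict, List, Optional, Set


def trimmable_entries(
    log_entries: List[int],
    peer_markers: Dict[str, Optional[int]],
    consulted_peers: Optional[Set[str]] = None,
) -> List[int]:
    """Return the local log entries that are safe to trim.

    log_entries     -- hypothetical integer positions of local log entries
    peer_markers    -- sync status marker reported by each peer zone;
                       None stands in for an empty marker
    consulted_peers -- if given, only these peers are consulted (models the
                       fix: only peers configured to sync from this zone)
    """
    if consulted_peers is not None:
        peer_markers = {
            zone: marker
            for zone, marker in peer_markers.items()
            if zone in consulted_peers
        }

    # An empty marker from any consulted peer means nothing can be trimmed.
    if any(marker is None for marker in peer_markers.values()):
        return []

    # An entry is safe only if every consulted peer's marker is larger.
    return [
        entry
        for entry in log_entries
        if all(marker > entry for marker in peer_markers.values())
    ]


# Zone 'b' has three log entries; zone 'a' does not sync from 'b',
# so 'a' reports an empty marker.
zone_b_log = [1, 2, 3]

# Consulting every peer: the empty marker blocks all trimming.
before = trimmable_entries(zone_b_log, {"a": None})

# Consulting only peers that sync from 'b' (none here): 'a' is ignored.
after = trimmable_entries(zone_b_log, {"a": None}, consulted_peers=set())
```

In this sketch `before` is an empty list, matching the stalled trimming on zone 'b', while `after` contains every entry because no consulted peer constrains the trim.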


Version-Release number of selected component (if applicable):


How reproducible:

Whenever a zone is configured to -not- sync from one of its peer zones.


Steps to Reproduce:
1. Create a multisite configuration with two zones 'a' and 'b'.

2. On the primary cluster, modify zone 'a' to not sync from zone 'b':
$ radosgw-admin zone modify --rgw-zone a --sync-from-all=0
$ radosgw-admin period update --commit

3. On the primary cluster, create a bucket 'bucket' and upload some objects.

4. Verify that the objects sync to the secondary zone and that sync status catches up.

5. Wait for at least rgw_sync_log_trim_interval (default 20 minutes).

6. List the data log and bucket index log on each zone:
$ radosgw-admin datalog list
$ radosgw-admin bilog list --bucket bucket


Actual results:

The logs on zone 'a' are empty, but the logs on zone 'b' are not.

Expected results:

The logs on both zones are empty.
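
For scripted verification, the check in step 6 could be automated along these lines. This is a sketch that assumes both list commands print a JSON array of entries on stdout; the helper name log_is_empty is hypothetical, and the sample strings are made up for illustration rather than captured from a real cluster.

```python
import json


def log_is_empty(listing_json: str) -> bool:
    """Return True if a log listing (assumed to be a JSON array) has no entries."""
    entries = json.loads(listing_json)
    if not isinstance(entries, list):
        # The JSON-array shape is an assumption; fail loudly if it does not hold.
        raise ValueError("expected a JSON array of log entries")
    return len(entries) == 0


# Illustrative inputs only, not real command output.
trimmed_listing = "[]"
untrimmed_listing = '[{"key": "bucket"}]'

trimmed_ok = log_is_empty(trimmed_listing)
untrimmed_ok = log_is_empty(untrimmed_listing)
```

The output of 'radosgw-admin datalog list' or 'radosgw-admin bilog list --bucket bucket' on each zone could be passed to this helper as a string; with the fix applied, it should report empty logs on both zones.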

Additional info:

Comment 1 Vikhyat Umrao 2019-04-15 16:56:46 UTC
*** Bug 1700042 has been marked as a duplicate of this bug. ***

Comment 20 errata-xmlrpc 2019-04-30 15:57:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:0911