Bug 1459967 - [RGW]: Data sync issue seen post failover and failback on a multisite environment
Status: ASSIGNED
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RGW
Version: 2.3
Hardware: Unspecified
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: 3.0
Assigned To: Casey Bodley
QA Contact: ceph-qe-bugs
Docs Contact:
Depends On:
Blocks: 1437916
 
Reported: 2017-06-08 13:08 EDT by Tejas
Modified: 2017-07-30 11:58 EDT

See Also:
Fixed In Version:
Doc Type: Known Issue
Doc Text:
.Failover and failback cause data sync issues in multi-site environments
In environments using the Ceph Object Gateway multi-site feature, failover and failback cause data sync to stall: the `radosgw-admin sync status` command reports that `data sync is behind` for an extended period of time. To work around this issue, run `radosgw-admin data sync init` and restart the gateways.
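A minimal sketch of that workaround, assuming the zone names from the status output below (us-west is the stalled data sync source) and systemd-managed gateways:
  # re-initialize data sync from the source zone, then restart the local gateways
  radosgw-admin data sync init --source-zone=us-west
  systemctl restart ceph-radosgw.target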
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
console log (9.11 KB, text/plain), 2017-06-08 13:08 EDT, Tejas

Description Tejas 2017-06-08 13:08:33 EDT
Created attachment 1286194 [details]
console log

Description of problem:

   In a two-site setup, after a failover and failback are done, we see errors like "data is behind on 3 shards" that persist.

Version-Release number of selected component (if applicable):
ceph version 10.2.7-29redhat1xenial

How reproducible:
Always

Steps to Reproduce:
1. Create a user and a few buckets from either site.
2. Bring site A (the primary) down and promote site B's zone to master (see the command sketch after these steps). Create a new bucket from site B.
3. Bring site A back up and switch it back to master. Now list the contents of the bucket created while A was down.
4. The bucket can be listed, but its contents are not visible from A.
5. Upload another object to the same bucket from B; the data previously written to that bucket then also becomes visible from A.
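For reference, a rough sketch of the failover/failback commands used in steps 2 and 3, assuming the zone names from this setup (us-east is the original master, us-west the secondary) and systemd-managed gateways:
  # failover: promote us-west to master while site A is down
  radosgw-admin zone modify --rgw-zone=us-west --master --default
  radosgw-admin period update --commit
  systemctl restart ceph-radosgw.target
  # failback: once site A is back up, make us-east the master again
  radosgw-admin zone modify --rgw-zone=us-east --master --default
  radosgw-admin period update --commit
  systemctl restart ceph-radosgw.target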


Additional info:

status after failback:
radosgw-admin sync status --cluster master
          realm 0b2eced7-a62e-4509-bf5c-97b0273eb333 (movies)
      zonegroup d3177342-0542-48ad-aac9-6d654415769c (us)
           zone 7a75101f-019b-4735-b9e1-ff4f5758ac4c (us-east)
  metadata sync no sync (zone is master)
      data sync source: b7476a47-586d-4c21-a795-57fd32297c61 (us-west)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is behind on 3 shards
I will attach a log of my findings to this BZ.
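To see which shards are lagging and whether any sync errors were logged, something like the following can be run on the master site (a sketch; the source zone matches the status output above):
  radosgw-admin data sync status --source-zone=us-west --cluster master
  radosgw-admin sync error list --cluster master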
Comment 9 Erin Donnelly 2017-06-15 08:28:18 EDT
Thanks Casey--updated doc text info.
