Bug 1351137 - Data stopped syncing from master to non-master while uploading objects to both the zones simultaneously
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: RGW
Version: 2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: 2.0
Assignee: Yehuda Sadeh
QA Contact: shilpa
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-06-29 10:34 UTC by shilpa
Modified: 2017-07-30 16:04 UTC (History)

Fixed In Version: RHEL: ceph-10.2.2-15.el7cp Ubuntu: ceph_10.2.2-11redhat1xenial
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-23 19:43:08 UTC
Target Upstream Version:


Attachments
boto script to create workload on magna115 (1.20 KB, text/plain)
2016-06-29 15:41 UTC, shilpa


Links
System ID Priority Status Summary Last Updated
Ceph Project Bug Tracker 16530 None None None 2016-06-29 18:37:02 UTC
Red Hat Product Errata RHBA-2016:1755 normal SHIPPED_LIVE Red Hat Ceph Storage 2.0 bug fix and enhancement update 2016-08-23 23:23:52 UTC

Description shilpa 2016-06-29 10:34:57 UTC
Description of problem:
Create 100 buckets and upload an object to each bucket on both RGW zones in parallel. The objects sync successfully to the master zone, whereas on the non-master zone the buckets are synced from the master but the objects are not.

Version-Release number of selected component (if applicable):
ceph-radosgw-10.2.2-5.el7cp.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Start a boto script on each zone that creates 100 buckets and uploads one 1.5G object to each bucket.
2. Observe that all buckets and objects sync to the master zone, but only the buckets sync to the non-master zone; no objects are created there.
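
The attached boto script is not reproduced in this report. The following is only a minimal sketch of a comparable workload, assuming boto 2 and plain HTTP access to each RGW endpoint. The bucket naming scheme, the object name "object-1", the reading of 1.5G as 1.5 GiB, and the helper names (bucket_names, make_payload, run_workload, connect) are all assumptions made for illustration, not taken from the attachment.

```python
import tempfile


def bucket_names(zone_tag, count=100):
    """Return per-zone bucket names so the two zones never collide (hypothetical scheme)."""
    return ["%s-bucket-%03d" % (zone_tag, i) for i in range(count)]


def make_payload(size_bytes):
    """Create a temporary file of the requested size (sparse where the filesystem allows)."""
    fp = tempfile.TemporaryFile()
    if size_bytes > 0:
        fp.seek(size_bytes - 1)
        fp.write(b"\0")
    fp.seek(0)
    return fp


def run_workload(conn, zone_tag, count=100, size_bytes=1536 * 1024 * 1024):
    """Create `count` buckets and upload one object of `size_bytes` to each.

    `conn` is expected to behave like a boto 2 S3Connection.
    Returns the names of the buckets that were created and filled.
    """
    created = []
    payload = make_payload(size_bytes)
    try:
        for name in bucket_names(zone_tag, count):
            bucket = conn.create_bucket(name)
            key = bucket.new_key("object-1")  # assumed object name
            payload.seek(0)  # reuse the same payload for every upload
            key.set_contents_from_file(payload)
            created.append(name)
    finally:
        payload.close()
    return created


def connect(host, access_key, secret_key, port=80):
    """Open a boto 2 connection to an RGW endpoint (boto is imported lazily)."""
    import boto
    import boto.s3.connection
    return boto.connect_s3(
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        host=host,
        port=port,
        is_secure=False,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )
```

To approximate step 1, one copy would be started against each zone at the same time, for example run_workload(connect(host, access_key, secret_key), "us-1") on the master and the same call with "us-2" on the non-master, with the host and credentials supplied by the tester.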

Actual results:

On master:

# radosgw-admin sync status --rgw-zone=us-1 --debug-rgw=0 --debug-ms=0
          realm 1c60c863-689d-441f-b370-62390562e2aa (earth)
      zonegroup 540c9b3f-5eb7-4a67-a581-54bc704ce827 (us)
           zone 505a3a8e-19cf-4295-a43d-559e763891f6 (us-1)
  metadata sync no sync (zone is master)
      data sync source: d48cb942-a5fa-4597-89fd-0bab3bb9c5a3 (us-2)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

On non-master:

# radosgw-admin sync status --rgw-zone=us-2 --debug-rgw=0 --debug-ms=0
          realm 1c60c863-689d-441f-b370-62390562e2aa (earth)
      zonegroup 540c9b3f-5eb7-4a67-a581-54bc704ce827 (us)
           zone d48cb942-a5fa-4597-89fd-0bab3bb9c5a3 (us-2)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is behind on 1 shards
                oldest incremental change not applied: 2016-06-23 09:57:35.0.097857s
      data sync source: 505a3a8e-19cf-4295-a43d-559e763891f6 (us-1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is behind on 8 shards
                        oldest incremental change not applied: 2016-06-29 07:34:15.0.194232s
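
When repeating this check across many runs, the "behind on N shards" lines in the sync status output above can be extracted programmatically. The following is a rough sketch, assuming the output keeps the wording shown in this report; the function name lagging_shards is hypothetical and not part of any Ceph tooling.

```python
import re

# Matches lines such as "metadata is behind on 1 shards" or "data is behind on 8 shards".
BEHIND_RE = re.compile(r"^\s*(metadata|data) is behind on (\d+) shards?\s*$")


def lagging_shards(status_text):
    """Return a dict like {'metadata': 1, 'data': 8} built from 'behind on N shards' lines.

    Kinds that are not reported as behind are simply absent from the result.
    Counts are summed in case several data sync sources are listed.
    """
    behind = {}
    for line in status_text.splitlines():
        match = BEHIND_RE.match(line)
        if match:
            kind, count = match.group(1), int(match.group(2))
            behind[kind] = behind.get(kind, 0) + count
    return behind
```

The input text would be the captured output of the radosgw-admin sync status command shown above; an empty result means no shard was reported as behind.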


Expected results:
Sync should work in both directions simultaneously.

Additional info:
Will provide the path to logs

Comment 4 shilpa 2016-06-29 15:41:06 UTC
Created attachment 1174005 [details]
boto script to create workload on magna115

Comment 5 Yehuda Sadeh 2016-06-29 18:40:47 UTC
It seems that data sync stopped following errors in requests from the non-master to the master: retrieval of all incremental data log shards failed, possibly timing out due to a lack of request processing threads on the master.

Comment 13 shilpa 2016-07-14 08:54:44 UTC
Tested and verified on 10.2.2-15. Issues related to segfaults in multisite operations are tracked in other BZs.

Comment 15 errata-xmlrpc 2016-08-23 19:43:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1755.html

