Bug 2228612 - [5.3][RGW][archive]: recovering shards in sync status on archive site upon a bucket delete from primary
Summary: [5.3][RGW][archive]: recovering shards in sync status on archive site upon a bucket delete from primary
Keywords:
Status: ON_QA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW-Multisite
Version: 5.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 5.3z5
Assignee: shilpa
QA Contact: Vidushi Mishra
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-08-02 20:30 UTC by shilpa
Modified: 2023-08-03 15:39 UTC
CC: 7 users

Fixed In Version: ceph-16.2.10-195.el8cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:


Links:
Red Hat Issue Tracker RHCEPH-7148 (last updated 2023-08-02 20:30:59 UTC)

Description shilpa 2023-08-02 20:30:13 UTC
This bug was initially created as a copy of Bug #2128421


Description of problem:
Recovering shards are seen in 'radosgw-admin sync status' on the archive site when a bucket is deleted from the primary site.
snippet:
[root@ceph-arc-kvm-5-3-archive-pn959k-node5 cephuser]# radosgw-admin sync status
          realm 2512e36d-cb3f-4955-87ce-a5744fe6a135 (india)
      zonegroup eb4a0d90-aef9-48d6-947e-316e68d414ca (shared)
           zone 363592b3-6183-43c2-ad32-3f80038350bb (archive)
   current time 2022-09-20T13:03:34Z
zonegroup features enabled: resharding
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: a793ad9b-3e3c-48e4-b120-182825478817 (primary)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        1 shards are recovering
                        recovering shards: [124]
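
For reference, a minimal diagnostic sketch for digging into the stuck shard on the archive site. The shard id (124) and source zone name (primary) are taken from the output above; the exact invocation is an assumption based on standard radosgw-admin usage, not commands from the original report:

# run on the archive site; inspects the data sync state of the shard
# that 'sync status' reports as recovering
radosgw-admin data sync status --source-zone=primary --shard-id=124

# list any recorded sync errors for this zone (empty output would suggest
# the shard is stuck in retry rather than failing outright)
radosgw-admin sync error list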

~8 hours after the bucket is deleted, 1 recovering shard still exists.

Version-Release number of selected component (if applicable):
ceph version 16.2.10-43.el8cp

How reproducible:
2/2

Steps to Reproduce:
1. Deploy a Ceph cluster with an archive zone (1 RGW daemon each in the primary and archive zones)
2. Create a bucket on the primary site and upload an object (size 2G)
3. Allow the sync to complete
4. Delete the bucket from the primary site using
'radosgw-admin bucket rm --bucket kvm-dlo-bkt1 --purge-objects'
5. The bucket and object are deleted from both the primary and archive sites
snippet:
primary site:
(env) [root@ceph-pri-kvm-5-3-archive-pn959k-node5 lib]# radosgw-admin bucket stats --bucket kvm-dlo-bkt1
failure: (2002) Unknown error 2002: 

archive site:
[root@ceph-arc-kvm-5-3-archive-pn959k-node5 ~]# radosgw-admin bucket stats --bucket kvm-dlo-bkt1
failure: (2002) Unknown error 2002: 

6. Check the sync status on both sites (a consolidated command sketch of steps 1-6 follows this walkthrough)
primary site:
(env) [root@ceph-pri-kvm-5-3-archive-pn959k-node5 lib]# radosgw-admin sync status
          realm 2512e36d-cb3f-4955-87ce-a5744fe6a135 (india)
      zonegroup eb4a0d90-aef9-48d6-947e-316e68d414ca (shared)
           zone a793ad9b-3e3c-48e4-b120-182825478817 (primary)
   current time 2022-09-20T03:51:28Z
zonegroup features enabled: resharding
  metadata sync no sync (zone is master)
      data sync source: 363592b3-6183-43c2-ad32-3f80038350bb (archive)
                        not syncing from zone

archive site:
After ~8 hours the status still shows 1 recovering shard:
[root@ceph-arc-kvm-5-3-archive-pn959k-node5 cephuser]# radosgw-admin sync status
          realm 2512e36d-cb3f-4955-87ce-a5744fe6a135 (india)
      zonegroup eb4a0d90-aef9-48d6-947e-316e68d414ca (shared)
           zone 363592b3-6183-43c2-ad32-3f80038350bb (archive)
   current time 2022-09-20T13:03:34Z
zonegroup features enabled: resharding
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: a793ad9b-3e3c-48e4-b120-182825478817 (primary)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        1 shards are recovering
                        recovering shards: [124]
Almost 8 hours after the bucket was deleted, the sync status still reports a recovering shard.
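
A consolidated sketch of the reproduction commands. The radosgw-admin invocations are the ones from the steps above; the client-side bucket creation and 2G object upload are illustrative (the original report does not state which S3 client was used), with the endpoint and object path as placeholders:

# on a client pointed at the primary site RGW (AWS CLI shown as an example)
aws --endpoint-url http://<primary-rgw>:80 s3 mb s3://kvm-dlo-bkt1
aws --endpoint-url http://<primary-rgw>:80 s3 cp ./2G.bin s3://kvm-dlo-bkt1/obj1

# wait for the archive site to catch up, then delete the bucket on the primary
radosgw-admin bucket rm --bucket kvm-dlo-bkt1 --purge-objects

# verify the bucket is gone on both sites, then compare sync status
radosgw-admin bucket stats --bucket kvm-dlo-bkt1
radosgw-admin sync status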


Actual results:
A recovering shard (shard 124) is still reported in 'radosgw-admin sync status' on the archive site ~8 hours after the bucket delete.

Expected results:
Sync status on the archive site should report that data is caught up with the source, with no recovering shards, once the bucket deletion has been processed (see the illustrative snippet below).
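
For comparison, an illustrative snippet of how the data sync section on the archive site is expected to look once the delete has been fully processed (wording based on the standard 'radosgw-admin sync status' output when a source is caught up; not taken from this cluster):

      data sync source: a793ad9b-3e3c-48e4-b120-182825478817 (primary)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source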

Additional info:

setup details: root/r
site1: rgw - 10.0.211.224
archive site:  rgw - 10.0.208.162

Console logs and RGW logs are available at http://magna002.ceph.redhat.com/ceph-qe-logs/madhavi/5.3/archive/

Comment 1 RHEL Program Management 2023-08-02 20:30:22 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

