Bug 2237490 - [7.0][RGW multisite][Archive zone][Duplicate objects in the archive zone]
Summary: [7.0][RGW multisite][Archive zone][Duplicate objects in the archive zone]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RGW-Multisite
Version: 7.0
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 7.0
Assignee: shilpa
QA Contact: Madhavi Kasturi
Docs Contact: Rivka Pollack
URL:
Whiteboard:
Depends On: 2182022
Blocks: 2237662
Reported: 2023-09-05 17:45 UTC by Vidushi Mishra
Modified: 2024-05-25 04:25 UTC
CC List: 11 users

Fixed In Version: ceph-18.2.0-43.el9cp
Doc Type: No Doc Update
Doc Text:
Clone Of: 2182022
Environment:
Last Closed: 2023-12-13 15:22:49 UTC
Embargoed:




Links
System                    ID              Last Updated
Red Hat Issue Tracker     RHCEPH-7326     2023-09-05 17:49:25 UTC
Red Hat Product Errata    RHBA-2023:7780  2023-12-13 15:22:52 UTC

Description Vidushi Mishra 2023-09-05 17:45:54 UTC
+++ This bug was initially created as a clone of Bug #2182022 +++

Description of problem:

Duplicate objects observed in the RGW archive zone

--- Additional comment from  on 2023-03-27 10:10:18 UTC ---

Description of problem:

* Inditex runs an archive zone in two production multisite setups:

- AXEC (Spain)
- IEEC (Ireland)

* Reading Soumya's latest update in upstream PR #50676 [1], they decided to test whether duplicate objects were being created in the archive zone. The test, run in AXEC, confirmed that this is the case:

1. Created a new bucket called axec-test-az-replication (via a playbook)

2. Created a random file:

	    dd if=/dev/urandom bs=1M count=10 of=random.dat > /dev/null 2>&1

3. Uploaded the file to the above bucket:

	    s3cmd -c ~/.s3cfg_axec_fernandodlhdt put random.dat s3://axec-test-az-replication
	    upload: 'random.dat' -> 's3://axec-test-az-replication/random.dat'  [1 of 1]
	    10485760 of 10485760   100% in    4s     2.00 MB/s  done


4. Bucket stats from master zone (1 object):

		    [root@axec1cs1o01 ~]# radosgw-admin bucket stats --bucket="axec-test-az-replication"
		    {
		     "bucket": "axec-test-az-replication",
		     "num_shards": 11,
		     "tenant": "",
		     "zonegroup": "41c13a2e-53cb-47cf-8d12-4b856c88f1e2",
		     "placement_rule": "default-placement",
		     "explicit_placement": {
		         "data_pool": "",
		         "data_extra_pool": "",
		         "index_pool": ""
		     },
		     "id": "34eb3c10-0184-4ac5-8607-5a07c4cdd9ca.445449.2",
		     "marker": "34eb3c10-0184-4ac5-8607-5a07c4cdd9ca.445449.2",
		     "index_type": "Normal",
		     "owner": "axec-fernandodlhdt",
		     "ver": "0#1,1#1,2#1,3#1,4#1,5#2,6#1,7#1,8#1,9#1,10#1",
		     "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
		     "mtime": "2023-03-27T08:19:47.317031Z",
		     "creation_time": "2023-03-27T08:19:46.474116Z",
		     "max_marker": "0#,1#,2#,3#,4#,5#00000000001.392652.6,6#,7#,8#,9#,10#",
		     "usage": {
		         "rgw.main": {
		             "size": 10485760,
		             "size_actual": 10485760,
		             "size_utilized": 10485760,
		             "size_kb": 10240,
		             "size_kb_actual": 10240,
		             "size_kb_utilized": 10240,
		             "num_objects": 1
		         }
		     },
		     "bucket_quota": {
		         "enabled": true,
		         "check_on_raw": false,
		         "max_size": 1073741824,
		         "max_size_kb": 1048576,
		         "max_objects": -1
		     },
		     "tagset": {
		         "contact_email": "fernandodlhdt.com",
		         "product_owner": "ismaelpf",
		         "valhalla_team": "INFRADIG"
		     }
		    }


5. Bucket stats from secondary zone (1 object):

		    [root@axec2cs1o01 ~]# radosgw-admin bucket stats --bucket="axec-test-az-replication"
		    {
		     "bucket": "axec-test-az-replication",
		     "num_shards": 11,
		     "tenant": "",
		     "zonegroup": "41c13a2e-53cb-47cf-8d12-4b856c88f1e2",
		     "placement_rule": "default-placement",
		     "explicit_placement": {
		         "data_pool": "",
		         "data_extra_pool": "",
		         "index_pool": ""
		     },
		     "id": "34eb3c10-0184-4ac5-8607-5a07c4cdd9ca.445449.2",
		     "marker": "34eb3c10-0184-4ac5-8607-5a07c4cdd9ca.445449.2",
		     "index_type": "Normal",
		     "owner": "axec-fernandodlhdt",
		     "ver": "0#1,1#1,2#1,3#1,4#1,5#2,6#1,7#1,8#1,9#1,10#1",
		     "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
		     "mtime": "2023-03-27T08:19:47.317031Z",
		     "creation_time": "2023-03-27T08:19:46.474116Z",
		     "max_marker": "0#,1#,2#,3#,4#,5#00000000001.395509.6,6#,7#,8#,9#,10#",
		     "usage": {
		         "rgw.main": {
		             "size": 10485760,
		             "size_actual": 10485760,
		             "size_utilized": 10485760,
		             "size_kb": 10240,
		             "size_kb_actual": 10240,
		             "size_kb_utilized": 10240,
		             "num_objects": 1
		         }
		     },
		     "bucket_quota": {
		         "enabled": true,
		         "check_on_raw": false,
		         "max_size": 1073741824,
		         "max_size_kb": 1048576,
		         "max_objects": -1
		     },
		     "tagset": {
		         "contact_email": "fernandodlhdt.com",
		         "product_owner": "ismaelpf",
		         "valhalla_team": "INFRADIG"
		     }
		    }


6. Bucket stats from archive zone (2 objects):

		    [root@axec2csaz1o01 ~]# radosgw-admin bucket stats --bucket="axec-test-az-replication"
		    {
		     "bucket": "axec-test-az-replication",
		     "num_shards": 11,
		     "tenant": "",
		     "zonegroup": "41c13a2e-53cb-47cf-8d12-4b856c88f1e2",
		     "placement_rule": "default-placement",
		     "explicit_placement": {
		         "data_pool": "",
		         "data_extra_pool": "",
		         "index_pool": ""
		     },
		     "id": "34eb3c10-0184-4ac5-8607-5a07c4cdd9ca.445449.2",
		     "marker": "34eb3c10-0184-4ac5-8607-5a07c4cdd9ca.445449.2",
		     "index_type": "Normal",
		     "owner": "axec-fernandodlhdt",
		     "ver": "0#1,1#1,2#1,3#1,4#1,5#5,6#1,7#1,8#1,9#1,10#1",
		     "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
		     "mtime": "2023-03-27T08:19:47.317031Z",
		     "creation_time": "2023-03-27T08:19:46.474116Z",
		     "max_marker": "0#,1#,2#,3#,4#,5#00000000004.1855724.12,6#,7#,8#,9#,10#",
		     "usage": {
		         "rgw.main": {
		             "size": 20971520,
		             "size_actual": 20971520,
		             "size_utilized": 20971520,
		             "size_kb": 20480,
		             "size_kb_actual": 20480,
		             "size_kb_utilized": 20480,
		             "num_objects": 2
		         }
		     },
		     "bucket_quota": {
		         "enabled": true,
		         "check_on_raw": false,
		         "max_size": 1073741824,
		         "max_size_kb": 1048576,
		         "max_objects": -1
		     },
		     "tagset": {
		         "contact_email": "fernandodlhdt.com",
		         "product_owner": "ismaelpf",
		         "valhalla_team": "INFRADIG"
		     }
		    }


Version-Release number of selected component (if applicable):

RHCS 5.3

How reproducible:

Reliably, via the steps above: configure an archive zone, upload objects in a multisite setup, and compare the number of objects in the bucket on each zone (a scripted sketch follows).
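
For illustration, a minimal scripted form of this check (a sketch only: the bucket name is a placeholder, and the final stats command must be run on a node in each zone):

	    # Create a 10M random object and upload it from the master zone.
	    BUCKET=az-dup-test
	    dd if=/dev/urandom bs=1M count=10 of=random.dat > /dev/null 2>&1
	    s3cmd mb "s3://${BUCKET}"
	    s3cmd put random.dat "s3://${BUCKET}/random.dat"

	    # After sync completes, run this on the master, secondary, and
	    # archive zones; every zone should report num_objects = 1.
	    radosgw-admin bucket stats --bucket="${BUCKET}" | grep num_objects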

Actual results:

Duplicate objects created in the archive zone

Expected results:

The archive zone should not duplicate objects from the zones it backs up.

Additional info:

[1] https://github.com/ceph/ceph/pull/50676?notification_referrer_id=NT_kwDOAFtribI1OTc1MTgxODkyOjU5OTEzMDU#issuecomment-1484678834

Thank you. Best regards,
Natalia Ravina
Storage TAM
Working hours: Mon - Fri 09:00 - 17:00 CET

--- Additional comment from Matt Benjamin (redhat) on 2023-03-27 16:08:58 UTC ---

Hi Casey,

Did you just recently send a report of a similar issue?

thanks

Matt

--- Additional comment from Casey Bodley on 2023-03-27 16:18:59 UTC ---

(In reply to Matt Benjamin (redhat) from comment #2)
> Hi Casey,
> 
> Did you just recently send a report of a similar issue?

Soumya and i discussed this in https://github.com/ceph/ceph/pull/50676

--- Additional comment from Matt Benjamin (redhat) on 2023-03-27 16:23:47 UTC ---

(In reply to Casey Bodley from comment #3)
> (In reply to Matt Benjamin (redhat) from comment #2)
> > Hi Casey,
> > 
> > Did you just recently send a report of a similar issue?
> 
> Soumya and i discussed this in https://github.com/ceph/ceph/pull/50676

ah, right, I was reading it this morning and couldn't find it again--thank you, Casey

Matt

--- Additional comment from Matt Benjamin (redhat) on 2023-04-13 16:04:51 UTC ---

Most likely will not finish by week of 4/24, moving to rhcs-6.1z1.

Matt

--- Additional comment from  on 2023-05-12 07:47:21 UTC ---

A very similar issue was already fixed in RHCS 4.0:

https://bugzilla.redhat.com/show_bug.cgi?id=1760862

Is this a regression?

Thank you. Best regards,
Natalia Ravina
Storage TAM
Working hours: Mon - Fri 09:00 - 17:00 CET

--- Additional comment from  on 2023-05-14 03:49:36 UTC ---

A testfix build is available for QE (based on latest 6.1 candidate build, ceph-17.2.6-47):

* rhceph-container-6-155.0.TEST.bz2182022
* Pull from: registry-proxy.engineering.redhat.com/rh-osbs/rhceph:6-155.0.TEST.bz2182022
* Brew link: https://brewweb.engineering.redhat.com/brew/buildinfo?buildID=2502667
* Ceph build in container: ceph-17.2.6-47.0.TEST.bz2182022.el9cp

Based on this -patches branch (7dbb7ff8c3483db224a0a9102c7e5ae92164904d):
https://gitlab.cee.redhat.com/ceph/ceph/-/commits/private-tserlin-ceph-6.1-rhel-9-test-bz2182022-patches

Thomas

--- Additional comment from Vidushi Mishra on 2023-05-23 07:55:26 UTC ---

Hi Shilpa,

The test fix [17.2.6-47.0.TEST.bz2182022.el9cp] is working for:

a) new buckets: duplicate objects are not created on the archive site, and
b) old buckets: writing objects to older buckets does not create duplicate objects on the test-fix build; however, the previously created duplicates still exist.


We will proceed with further sanity tests on multisite and archive as mentioned in [1].

[1] https://docs.google.com/document/d/1JVu3q4a35w5T2l4JqHHCLE5jjF54HjB98w9gFulc4WU/edit

Thanks,
vidushi

--- Additional comment from shilpa on 2023-05-23 14:48:37 UTC ---

(In reply to Vidushi Mishra from comment #8)
> Hi Shilpa,
> 
> The test fix [17.2.6-47.0.TEST.bz2182022.el9cp] is working for 
> 
> a) new buckets: duplicate objects are not created on the archive site and
> b) old buckets: writing objects to older buckets does not create duplicate
> objects on the test fix build, however, the previous duplicates exist.
> 
> 
> We would be progressing with further sanity tests on multisite and archive
> as mentioned in [1]
> 
> [1]
> https://docs.google.com/document/d/
> 1JVu3q4a35w5T2l4JqHHCLE5jjF54HjB98w9gFulc4WU/edit
> 
> Thanks,
> vidushi

Hi vidushi,

That's right, the old objects will have to be cleaned up manually.
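
For reference, one possible shape of that manual cleanup (a sketch only, not a procedure verified in this bug; the AWS CLI profile name and the example key are placeholders). Because the archive zone keeps objects versioned, leftover duplicates appear as extra versions of the same key with identical ETags:

	    # List every version of each key in the affected bucket on the
	    # archive zone; duplicates show up as repeated Key entries with
	    # the same ETag.
	    aws --profile archive s3api list-object-versions \
	        --bucket axec-test-az-replication \
	        --query 'Versions[].{Key:Key,VersionId:VersionId,ETag:ETag}'

	    # Remove a redundant duplicate explicitly by its version id.
	    aws --profile archive s3api delete-object \
	        --bucket axec-test-az-replication \
	        --key random.dat --version-id <redundant-version-id>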

--- Additional comment from shilpa on 2023-05-24 14:53:34 UTC ---

pushed the commit https://gitlab.cee.redhat.com/ceph/ceph/-/commit/79d812dd62870b73ab67fd6caae71f345ca55595
to ceph-6.1-rhel-patches

--- Additional comment from errata-xmlrpc on 2023-05-25 01:54:41 UTC ---

This bug has been added to advisory RHBA-2023:112314 by Thomas Serlin (tserlin)

--- Additional comment from errata-xmlrpc on 2023-05-25 01:54:43 UTC ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2023:112314-01
https://errata.devel.redhat.com/advisory/112314

--- Additional comment from Vidushi Mishra on 2023-05-25 19:10:10 UTC ---

Hi Shilpa,

We tested ceph version 17.2.6-64.el9cp and observed that objects are getting duplicated when performing multipart uploads. This was observed in 2 of 3 attempts.

Steps:

Create 3 buckets, green-test-dup-{1..3}, and perform multipart object uploads as below:

@PRIMARY SITE
---------------

1. Upload a file 'phoenix.txt' of 18M size using s3cmd.
2. Upload 3 objects with key names 'phoenix.txt', 'big1', and 'big2' via multipart uploads of the same 18M file (phoenix.txt).

@SECONDARY
------------

3. Upload 1 more object, 'big3', via multipart from the secondary.


Repeat steps 1-3 for the buckets 'green-test-dup-1' and 'green-test-dup-2' (see the scripted sketch below).
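
A scripted sketch of steps 1-3 (the secondary-site s3cmd config path is an assumption of this sketch; s3cmd splits the 18M file into multipart parts automatically, since it exceeds the default 15MB chunk size):

	    for b in green-test-dup-1 green-test-dup-2 green-test-dup-3; do
	        # Primary site: one upload under its own key, two more as big1/big2.
	        s3cmd mb "s3://${b}"
	        s3cmd put phoenix.txt "s3://${b}/"
	        s3cmd put phoenix.txt "s3://${b}/big1"
	        s3cmd put phoenix.txt "s3://${b}/big2"
	        # Secondary site: run with the secondary's s3cmd configuration.
	        s3cmd -c ~/.s3cfg_secondary put phoenix.txt "s3://${b}/big3"
	    done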


Expected Result:
-----------------

- We should observe a total of 4 objects in each bucket at every site (primary, secondary, and archive).

Actual Result:
--------------

We observed 7 objects at the Archive site for the bucket 'green-test-dup-3' and 5 objects for the bucket 'green-test-dup-1'.


Reproducibility
-----------------

Seen in 2 of 3 attempts.

Thanks,
Vidushi

--- Additional comment from Vidushi Mishra on 2023-05-25 19:22:58 UTC ---

=============== Sharing the console snippet for bucket 'green-test-dup-3' ====================

Primary site
--------------

[root@ceph-pri-mip-ywbdhn-node6 ~]# s3cmd mb s3://green-test-dup-3
Bucket 's3://green-test-dup-3/' created
[root@ceph-pri-mip-ywbdhn-node6 ~]# 
[root@ceph-pri-mip-ywbdhn-node6 ~]# s3cmd put phoenix.txt s3://green-test-dup-3/
upload: 'phoenix.txt' -> 's3://green-test-dup-3/phoenix.txt'  [part 1 of 2, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    45.96 MB/s  done
upload: 'phoenix.txt' -> 's3://green-test-dup-3/phoenix.txt'  [part 2 of 2, 3MB] [1 of 1]
 3145728 of 3145728   100% in    0s    27.70 MB/s  done
[root@ceph-pri-mip-ywbdhn-node6 ~]# s3cmd put phoenix.txt s3://green-test-dup-3/big1
upload: 'phoenix.txt' -> 's3://green-test-dup-3/big1'  [part 1 of 2, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    49.06 MB/s  done
upload: 'phoenix.txt' -> 's3://green-test-dup-3/big1'  [part 2 of 2, 3MB] [1 of 1]
 3145728 of 3145728   100% in    0s    28.34 MB/s  done
[root@ceph-pri-mip-ywbdhn-node6 ~]# s3cmd put phoenix.txt s3://green-test-dup-3/big2
upload: 'phoenix.txt' -> 's3://green-test-dup-3/big2'  [part 1 of 2, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    45.86 MB/s  done
upload: 'phoenix.txt' -> 's3://green-test-dup-3/big2'  [part 2 of 2, 3MB] [1 of 1]
 3145728 of 3145728   100% in    0s    31.05 MB/s  done
[root@ceph-pri-mip-ywbdhn-node6 ~]# 


secondary site
------------------

[root@ceph-sec-mip-ywbdhn-node6 ~]#  s3cmd put phoenix.txt s3://green-test-dup-3/big3
upload: 'phoenix.txt' -> 's3://green-test-dup-3/big3'  [part 1 of 2, 15MB] [1 of 1]
 15728640 of 15728640   100% in    0s    41.82 MB/s  done
upload: 'phoenix.txt' -> 's3://green-test-dup-3/big3'  [part 2 of 2, 3MB] [1 of 1]
 3145728 of 3145728   100% in    0s    19.27 MB/s  done
[root@ceph-sec-mip-ywbdhn-node6 ~]# 


objects on all sites
-----------------------


primary :
------------

[root@ceph-pri-mip-ywbdhn-node6 ~]# radosgw-admin bucket stats --bucket green-test-dup-3 | egrep 'num_shards|num_objects'
    "num_shards": 11,
            "num_objects": 4
            "num_objects": 0
[root@ceph-pri-mip-ywbdhn-node6 ~]# 


secondary
------------

[root@ceph-sec-mip-ywbdhn-node6 ~]# radosgw-admin bucket stats --bucket green-test-dup-3 | egrep 'num_shards|num_objects'
    "num_shards": 11,
            "num_objects": 4
            "num_objects": 0
[root@ceph-sec-mip-ywbdhn-node6 ~]# 


archive 
------------

[root@ceph-arc-mip-ywbdhn-node6 ~]# radosgw-admin bucket stats --bucket green-test-dup-3 | egrep 'num_shards|num_objects'
    "num_shards": 11,
            "num_objects": 7
[root@ceph-arc-mip-ywbdhn-node6 ~]# 
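
To see which keys account for the extra count, the archive zone's bucket index can be listed directly (a sketch; duplicated objects appear as repeated entries for the same key):

	    radosgw-admin bucket list --bucket green-test-dup-3 | grep '"name"'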



Logs:
--------
1. bucket 'green-test-dup-3': http://magna002.ceph.redhat.com/ceph-qe-logs/vidushi/2182022/green-test-dup-3
2. bucket 'green-test-dup-1': http://magna002.ceph.redhat.com/ceph-qe-logs/vidushi/2182022/green-test-dup-1
3. Host and setup details: http://magna002.ceph.redhat.com/ceph-qe-logs/vidushi/2182022/host_details


Thanks,
Vidushi

--- Additional comment from Vidushi Mishra on 2023-05-25 19:29:52 UTC ---

I discussed with Shilpa the multipart objects still being duplicated. She suggested opening a separate bug to track the duplication with multipart uploads.

--- Additional comment from shilpa on 2023-05-25 19:40:26 UTC ---

(In reply to Vidushi Mishra from comment #15)
> I discussed with Shilpa about the multipart objects being still duplicated.
> She suggested opening a separate bug for tracking the duplicity with
> multipart uploads.

Right, this bz should not be blocked, because it still addresses regular objects.
If there is a bug in multipart uploads, the problem likely lies elsewhere
and will need a separate fix.

--- Additional comment from Vidushi Mishra on 2023-05-25 19:45:05 UTC ---

Hi Shilpa,

Have created Bug #2210132 for tracking the multipart uploads issue.

We will perform some sanity tests as mentioned in the test plan [1] before moving the bug to verified on ceph version 17.2.6-64.el9cp.


[1]https://docs.google.com/document/d/1JVu3q4a35w5T2l4JqHHCLE5jjF54HjB98w9gFulc4WU/edit

Thanks,
Vidushi

--- Additional comment from Vidushi Mishra on 2023-06-01 12:46:19 UTC ---

Moving bug to verified on ceph version 17.2.6-69.el9cp.


Test plan: https://docs.google.com/document/d/1JVu3q4a35w5T2l4JqHHCLE5jjF54HjB98w9gFulc4WU/edit

Verification logs: https://docs.google.com/document/d/1ILQZ7fwwwSD4ZDjJRHZp_-qdOk7jKgbgz-u9FE6S0iE/edit#


--- Additional comment from errata-xmlrpc on 2023-06-15 09:08:26 UTC ---

Bug report changed to RELEASE_PENDING status by Errata System.
Advisory RHSA-2023:112314-11 has been changed to PUSH_READY status.
https://errata.devel.redhat.com/advisory/112314

--- Additional comment from errata-xmlrpc on 2023-06-15 09:16:51 UTC ---

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 6.1 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:3623

Comment 8 errata-xmlrpc 2023-12-13 15:22:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7780

Comment 9 Red Hat Bugzilla 2024-05-25 04:25:08 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days

