Description of problem:
When deleting a bucket containing an incomplete multipart upload with roughly 2000 parts uploaded, we observed an infinite loop that prevented s3cmd from ever deleting the bucket. On inspection, when the bucket index was sharded (for example, 128 shards), the original logic in RGWRados::cls_bucket_list_unordered() did not calculate the bucket shard ID correctly when the index key of a data part was used as the marker.

The issue does not reproduce every time; it depends on the object key. To reproduce it in a 128-shard bucket, we use "334" as the key for the incomplete multipart upload, which lands in shard 127 (determined by experiment). In this setup, the original logic usually derives a shard ID smaller than 127 (the largest shard) from the marker, so the listing revisits earlier shards and forms a cycle, which results in an infinite loop.

PS: Sometimes the shard ID calculation may incorrectly go forward instead of backward. In that case, the check logic may skip shards that still hold regular keys, and some non-empty buckets may be deleted by accident.

Version-Release number of selected component (if applicable):

How reproducible:
Depends on the object key; see above.

Steps to Reproduce:
1. See above

Actual results:

Expected results:

Additional info:
Corresponding upstream PR: https://github.com/ceph/ceph/pull/39358
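The failure mode can be illustrated with a minimal toy model. This is a sketch only, not the actual Ceph code: the hash function, the multipart index-key format, and the constant 128 are all illustrative assumptions. The point it shows is that hashing the raw marker key (which carries a multipart prefix) yields a different shard than the one that actually stores the entry, so resuming from the marker-derived shard either goes backward (infinite loop) or forward (skipped shards).

```python
# Toy model of the shard-ID miscalculation (NOT the actual Ceph code).
# A bucket-index entry lives in the shard of its plain object name, but
# a multipart part is recorded under a prefixed index key. Hashing that
# raw index key to pick the shard to resume from gives the wrong shard.

NUM_SHARDS = 128  # matches the 128-shard reproducer above


def shard_of(key: str) -> int:
    """Hypothetical stand-in for the real bucket-index hash."""
    return sum(key.encode()) % NUM_SHARDS


# Hypothetical multipart part entry for object name "334":
obj_name = "334"
index_key = "_multipart_334.2~upload-id.1"  # prefix form is illustrative

home_shard = shard_of(obj_name)    # shard that actually holds the entry
resume_shard = shard_of(index_key)  # shard the buggy logic resumes at

# When resume_shard != home_shard, unordered listing either restarts
# behind the entry (re-reading it forever: the infinite loop) or ahead
# of it (skipping shards that may still hold regular keys).
print(f"home={home_shard} resume={resume_shard}")
```

In this toy model the two shard IDs differ, which is exactly the precondition for the loop (or the skip) described above; the upstream PR fixes the real code so the resume shard is derived consistently with the shard the entry was stored in.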
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3294