2008835 – [GSS][RGW]Arbitrarily-large space leaks generated by re-uploading the same multi-part part multiple times

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Ceph Storage
Classification:	Red Hat Storage
Component:	RGW
Sub Component:
Version:	4.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	7.0
Assignee:	Matt Benjamin (redhat)
QA Contact:	Tejas
Docs Contact:	Rivka Pollack
URL:
Whiteboard:
Depends On:
Blocks:	2237662

Reported:	2021-09-29 10:06 UTC by Geo Jose
Modified:	2024-12-20 21:14 UTC
CC List:	10 users
Fixed In Version:	ceph-18.2.0-1
Doc Type:	Bug Fix
Doc Text:	.RADOS object multipart upload workflows complete properly. Previously, in some cases, RADOS objects that were part of a multipart upload workflow and had been created by a previous upload would cause certain parts to not complete or to stop in the middle of the upload. With this fix, all parts upload correctly once the multipart upload workflow is complete.
Clone Of:
Environment:
Last Closed:	2023-12-13 15:18:39 UTC
Embargoed:
Dependent Products:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	ceph ceph pull 49709	None	Merged	rgw: fix multipart upload object leaks due to re-upload	2023-03-26 19:17:11 UTC
Red Hat Issue Tracker	RHCEPH-1924	None	None	None	2021-09-29 10:08:07 UTC
Red Hat Product Errata	RHBA-2023:7780	None	None	None	2023-12-13 15:18:43 UTC

Description Geo Jose 2021-09-29 10:06:10 UTC

In an RGW workload, large quantities of objects that appear to be orphans accumulate in the data pool. The leaked part objects belong to completed multipart uploads.

Engineering believes this is the primary underlying issue: the ability to generate arbitrarily large space leaks by re-uploading the same multipart part multiple times. This affects all RGW versions that have supported S3 multipart upload.
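A minimal sketch of the triggering workload described above, for illustration only. It assumes an S3 client object exposing the boto3-style calls create_multipart_upload, upload_part, and complete_multipart_upload; the function name, bucket, key, and attempt count are hypothetical and do not come from this report.

```python
def reupload_same_part(client, bucket, key, body, attempts):
    """Upload part number 1 `attempts` times, then complete the upload.

    `client` is assumed to be a boto3-style S3 client pointed at an RGW
    endpoint (for example boto3.client("s3", endpoint_url=...)); the
    endpoint and credentials are left to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    upload = client.create_multipart_upload(Bucket=bucket, Key=key)
    upload_id = upload["UploadId"]

    etag = None
    for _ in range(attempts):
        # Every iteration re-sends the SAME part number within the SAME upload.
        resp = client.upload_part(
            Bucket=bucket,
            Key=key,
            PartNumber=1,
            UploadId=upload_id,
            Body=body,
        )
        etag = resp["ETag"]

    # Only the ETag of the last attempt is referenced on completion.
    client.complete_multipart_upload(
        Bucket=bucket,
        Key=key,
        UploadId=upload_id,
        MultipartUpload={"Parts": [{"ETag": etag, "PartNumber": 1}]},
    )
    return upload_id
```

On an affected RGW, the data written by the earlier attempts would be expected to remain in the data pool after completion; how to confirm that depends on the deployment and is not shown here.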

The root cause is that, although these RGWs contain logic to detect that an upload-part operation has conflicted with a prior upload of the same part, the code handling that case only addresses the naming conflict. It does not correctly accumulate the full set of object names generated by all upload attempts for a given part; instead, it overwrites the metadata for prior uploads of the part with that of the latest one.
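The overwrite behaviour can be illustrated with a toy model. This is a sketch under simplifying assumptions, not RGW code: ToyPool, its object-naming scheme, and its methods are hypothetical stand-ins for the data pool and the per-part upload metadata.

```python
import itertools


class ToyPool:
    """Toy model of a data pool plus per-part upload metadata (hypothetical)."""

    def __init__(self):
        self.objects = {}    # object name -> payload
        self.part_meta = {}  # part number -> object name of the latest attempt
        self._suffix = itertools.count()

    def upload_part(self, part_num, data):
        # Each attempt gets a unique name, so the naming conflict is avoided.
        name = f"part.{part_num}.{next(self._suffix)}"
        self.objects[name] = data
        # Modeled bug: the entry for the prior attempt is overwritten,
        # so the earlier object name is forgotten.
        self.part_meta[part_num] = name
        return name

    def leaked(self):
        # Objects that exist in the pool but that no metadata entry references.
        return set(self.objects) - set(self.part_meta.values())


pool = ToyPool()
for _ in range(5):
    pool.upload_part(1, b"same part, uploaded again")

print(len(pool.objects), "objects stored,", len(pool.leaked()), "unreferenced")
```

In this model every re-upload of a part adds one unreferenced object, so the leaked space grows with the number of attempts rather than with the size of the final object.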

To fix this, we propose moving the current serialization and storage of part upload metadata into RGW's OSD-side CLS interface, where it is straightforward to combine existing and new part metadata and to avoid races between simultaneous uploads of the same part. Secondarily, this extra historical data will be used by the code that cleans up completed and aborted multipart uploads.
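The proposed merge-on-update idea can be sketched as follows. This is an assumption-laden illustration rather than the actual CLS change: PartRecord and MergingIndex are hypothetical names, and a threading.Lock stands in for the atomicity that running the update on the OSD side would provide.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class PartRecord:
    current: str                              # object name of the latest attempt
    past: list = field(default_factory=list)  # names from earlier attempts


class MergingIndex:
    """Toy part-metadata index that merges instead of overwriting (hypothetical)."""

    def __init__(self):
        # The lock models an atomic read-modify-write; it is not how RGW does it.
        self._lock = threading.Lock()
        self.parts = {}

    def record_upload(self, part_num, obj_name):
        with self._lock:
            rec = self.parts.get(part_num)
            if rec is None:
                self.parts[part_num] = PartRecord(current=obj_name)
            else:
                # Keep the superseded name so cleanup can find it later.
                rec.past.append(rec.current)
                rec.current = obj_name

    def stale_on_complete(self):
        # After completion, only superseded attempts need to be removed.
        with self._lock:
            return [n for rec in self.parts.values() for n in rec.past]

    def all_on_abort(self):
        # After an abort, every attempt, including the latest, is removed.
        with self._lock:
            return [n for rec in self.parts.values()
                    for n in rec.past + [rec.current]]


index = MergingIndex()
threads = [
    threading.Thread(target=index.record_upload, args=(1, f"part.1.{i}"))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(index.stale_on_complete()), "stale names tracked for cleanup")
```

Because every attempt's name is retained, both the completion path and the abort path have the full list of objects to delete, even when uploads of the same part race with each other.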


Pull request: https://github.com/ceph/ceph/pull/37260

Comment 1 RHEL Program Management 2021-09-29 10:06:17 UTC

Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 26 errata-xmlrpc 2023-12-13 15:18:39 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:7780

Comment 27 Red Hat Bugzilla 2024-04-12 04:25:04 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.
