Bug 1905431 - [GSS][RFE] Optimize PG removal for huge number of objects in Red Hat Ceph Storage 4
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 4.1
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.2z2
Assignee: Neha Ojha
QA Contact: Pawan
Docs Contact: Amrita
URL:
Whiteboard:
Duplicates: 1770510 1952920
Depends On:
Blocks: 1890121
 
Reported: 2020-12-08 11:03 UTC by Gaurav Sitlani
Modified: 2022-05-03 18:02 UTC
CC List: 18 users

Fixed In Version: ceph-14.2.11-151.el8cp, ceph-14.2.11-151.el7cp
Doc Type: Enhancement
Doc Text:
.Improvement in the efficiency of the PG removal code
Previously, the code was inefficient because it did not keep a pointer to the last deleted object in the placement group (PG) on each pass, which caused an unnecessary iteration over all the objects every time. With this release, PG deletion performance is improved, with less impact on client I/O. The parameters `osd_delete_sleep_ssd` and `osd_delete_sleep_hybrid` now have a default value of 1 second.
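
To give some intuition for the enhancement, here is a minimal, illustrative Python sketch of the idea in the doc text above. It is a toy in-memory model, not the actual OSD C++ code, and every name in it is made up for the example; it only shows why resuming each deletion pass from a saved cursor avoids re-walking objects that were already handled.

    # Illustrative toy model only -- not the actual OSD code.
    # Counts how many objects each strategy has to visit in total.

    def delete_all_rescan(objects: list, batch_size: int) -> int:
        """Old pattern: every deletion pass starts iterating from the first object again."""
        visited = 0
        while any(o is not None for o in objects):
            deleted = 0
            for i, obj in enumerate(objects):          # always restarts at index 0
                visited += 1
                if obj is not None:
                    objects[i] = None                  # "delete" the object
                    deleted += 1
                    if deleted == batch_size:
                        break
        return visited

    def delete_all_with_cursor(objects: list, batch_size: int) -> int:
        """New pattern: remember where the previous pass stopped and resume there."""
        visited, cursor = 0, 0
        while cursor < len(objects):
            deleted = 0
            while cursor < len(objects) and deleted < batch_size:
                visited += 1
                if objects[cursor] is not None:
                    objects[cursor] = None
                    deleted += 1
                cursor += 1                            # the cursor is carried between passes
        return visited

    if __name__ == "__main__":
        n, batch = 20_000, 100
        print(delete_all_rescan([object()] * n, batch))       # roughly n*n/(2*batch) visits
        print(delete_all_with_cursor([object()] * n, batch))  # exactly n visits

With 100M objects in a PG, the quadratic rescan cost is what kept the OSDs busy and hurt client I/O; the cursor makes the total work linear in the number of objects.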
Clone Of:
Environment:
Last Closed: 2021-06-15 17:13:09 UTC
Embargoed:




Links
  Ceph Project Bug Tracker 45765 (last updated 2020-12-08 11:11:38 UTC)
  Ceph Project Bug Tracker 47174 (last updated 2020-12-10 17:56:19 UTC)
  GitHub ceph/ceph pull 38478 (closed): nautilus: osd: optimize PG removal (part1) (last updated 2021-02-09 11:59:48 UTC)
  Red Hat Issue Tracker RHCEPH-4227 (last updated 2022-05-03 18:02:53 UTC)
  Red Hat Product Errata RHSA-2021:2445 (last updated 2021-06-15 17:13:33 UTC)

Description Gaurav Sitlani 2020-12-08 11:03:56 UTC
Description of problem:

This is a request to backport the PG removal optimization from the following upstream pull requests into Red Hat Ceph Storage 4.1:

https://github.com/ceph/ceph/pull/37314
https://github.com/ceph/ceph/pull/37496


Version-Release number of selected component (if applicable):
ceph version 14.2.8-111.el7cp

Steps to Reproduce:

1. Instantiate a test cluster with two pools sharing the same CRUSH rule (or at least the same OSDs), one of them filled with a large number of objects (say, 100M)
2. Delete the pool containing the 100M objects
3. Observe the read/write latencies on the other pool increasing over time while the deleted PGs are purged (a rough command-level sketch of these steps follows below)
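
The following is a rough sketch of those steps against a disposable test cluster, driving the stock `ceph` / `rados` command-line tools from Python. The pool names, PG counts, durations and object size here are illustrative choices, not values from this report; at the scale described above (on the order of 100M objects), the population step would need to run far longer or be parallelized.

    #!/usr/bin/env python3
    """Rough reproduction sketch for the steps above (illustrative values only).

    Assumes a throwaway test cluster and the stock `ceph` / `rados` CLI tools.
    """
    import subprocess

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Two pools that land on the same OSDs (both use the default CRUSH rule).
    run("ceph", "osd", "pool", "create", "victim", "128")
    run("ceph", "osd", "pool", "create", "bystander", "128")

    # Fill the victim pool with many small objects; --no-cleanup leaves them behind.
    run("rados", "-p", "victim", "bench", "600", "write", "-b", "4096", "--no-cleanup")

    # 2. Delete the object-heavy pool, which queues the PG removals on the OSDs.
    run("ceph", "config", "set", "mon", "mon_allow_pool_delete", "true")
    run("ceph", "osd", "pool", "delete", "victim", "victim",
        "--yes-i-really-really-mean-it")

    # 3. Keep a steady load on the surviving pool and watch its latencies climb
    #    while the deleted PGs are purged (e.g. `ceph osd perf` in another shell).
    run("rados", "-p", "bystander", "bench", "300", "write")

    # On a fixed build (ceph-14.2.11-151 or later), running
    #   ceph daemon osd.<id> config get osd_delete_sleep_hybrid
    # on an OSD host should report the new default of 1 second.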

Comment 4 Vikhyat Umrao 2020-12-10 17:56:21 UTC
Earlier reports from workload dfg:

https://bugzilla.redhat.com/show_bug.cgi?id=1770510
https://tracker.ceph.com/issues/47174

Comment 10 Vikhyat Umrao 2021-05-17 11:59:32 UTC
*** Bug 1952920 has been marked as a duplicate of this bug. ***

Comment 18 Amrita 2021-06-02 13:09:15 UTC
Hi Neha,

Could you please provide the doc text? It is needed for inclusion in the 4.2z2 Release Notes.

Thanks
Amrita

Comment 25 errata-xmlrpc 2021-06-15 17:13:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2445

Comment 26 Vikhyat Umrao 2022-05-03 17:56:53 UTC
*** Bug 1770510 has been marked as a duplicate of this bug. ***

