Bug 1905431 - [GSS][RFE] Optimize PG removal for huge number of objects in Red Hat Ceph Storage 4
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 4.1
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.2z2
Assignee: Neha Ojha
QA Contact: Pawan
Docs Contact: Amrita
URL:
Whiteboard:
Duplicates: 1770510 1952920
Depends On:
Blocks: 1890121
 
Reported: 2020-12-08 11:03 UTC by Gaurav Sitlani
Modified: 2022-05-03 18:02 UTC
CC List: 18 users

Fixed In Version: ceph-14.2.11-151.el8cp, ceph-14.2.11-151.el7cp
Doc Type: Enhancement
Doc Text:
.Improvement in the efficiency of the PG removal code
Previously, the code was inefficient because it did not keep a pointer to the last deleted object in the placement group (PG) on each pass, which caused an unnecessary iteration over all the objects every time. With this release, PG deletion performance is improved, with less impact on client I/O. The parameters `osd_delete_sleep_ssd` and `osd_delete_sleep_hybrid` now have a default value of 1 second.
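
To give some intuition for the enhancement, here is a minimal, illustrative Python sketch of the idea in the doc text above. It is a toy in-memory model, not the actual OSD C++ code, and every name in it is made up for the example; it only shows why resuming each deletion pass from a saved cursor avoids re-walking objects that were already handled.

    # Illustrative toy model only -- not the actual OSD code.
    # Counts how many objects each strategy has to visit in total.

    def delete_all_rescan(objects: list, batch_size: int) -> int:
        """Old pattern: every deletion pass starts iterating from the first object again."""
        visited = 0
        while any(o is not None for o in objects):
            deleted = 0
            for i, obj in enumerate(objects):          # always restarts at index 0
                visited += 1
                if obj is not None:
                    objects[i] = None                  # "delete" the object
                    deleted += 1
                    if deleted == batch_size:
                        break
        return visited

    def delete_all_with_cursor(objects: list, batch_size: int) -> int:
        """New pattern: remember where the previous pass stopped and resume there."""
        visited, cursor = 0, 0
        while cursor < len(objects):
            deleted = 0
            while cursor < len(objects) and deleted < batch_size:
                visited += 1
                if objects[cursor] is not None:
                    objects[cursor] = None
                    deleted += 1
                cursor += 1                            # the cursor is carried between passes
        return visited

    if __name__ == "__main__":
        n, batch = 20_000, 100
        print(delete_all_rescan([object()] * n, batch))       # roughly n*n/(2*batch) visits
        print(delete_all_with_cursor([object()] * n, batch))  # exactly n visits

With 100M objects in a PG, the quadratic rescan cost is what kept the OSDs busy and hurt client I/O; the cursor makes the total work linear in the number of objects.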
Clone Of:
Environment:
Last Closed: 2021-06-15 17:13:09 UTC
Embargoed:




Links
  Ceph Project Bug Tracker 45765 (last updated 2020-12-08 11:11:38 UTC)
  Ceph Project Bug Tracker 47174 (last updated 2020-12-10 17:56:19 UTC)
  GitHub ceph/ceph pull 38478 (closed): nautilus: osd: optimize PG removal (part1) (last updated 2021-02-09 11:59:48 UTC)
  Red Hat Issue Tracker RHCEPH-4227 (last updated 2022-05-03 18:02:53 UTC)
  Red Hat Product Errata RHSA-2021:2445 (last updated 2021-06-15 17:13:33 UTC)

Description Gaurav Sitlani 2020-12-08 11:03:56 UTC
Description of problem:

This is a request to backport the PG removal optimization from the following upstream pull requests into Red Hat Ceph Storage 4.1:

https://github.com/ceph/ceph/pull/37314
https://github.com/ceph/ceph/pull/37496


Version-Release number of selected component (if applicable):
ceph version 14.2.8-111.el7cp

Steps to Reproduce:

1. Instantiate a test cluster with two pools sharing the same CRUSH rule (or at least the same OSDs), one of them filled with a large number of objects (say, 100M)
2. Delete the pool containing the 100M objects
3. Observe the read/write latencies on the other pool increasing over time while the deleted PGs are purged (a rough command-level sketch of these steps follows below)
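
The following is a rough sketch of those steps against a disposable test cluster, driving the stock `ceph` / `rados` command-line tools from Python. The pool names, PG counts, durations and object size here are illustrative choices, not values from this report; at the scale described above (on the order of 100M objects), the population step would need to run far longer or be parallelized.

    #!/usr/bin/env python3
    """Rough reproduction sketch for the steps above (illustrative values only).

    Assumes a throwaway test cluster and the stock `ceph` / `rados` CLI tools.
    """
    import subprocess

    def run(*cmd):
        print("+", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1. Two pools that land on the same OSDs (both use the default CRUSH rule).
    run("ceph", "osd", "pool", "create", "victim", "128")
    run("ceph", "osd", "pool", "create", "bystander", "128")

    # Fill the victim pool with many small objects; --no-cleanup leaves them behind.
    run("rados", "-p", "victim", "bench", "600", "write", "-b", "4096", "--no-cleanup")

    # 2. Delete the object-heavy pool, which queues the PG removals on the OSDs.
    run("ceph", "config", "set", "mon", "mon_allow_pool_delete", "true")
    run("ceph", "osd", "pool", "delete", "victim", "victim",
        "--yes-i-really-really-mean-it")

    # 3. Keep a steady load on the surviving pool and watch its latencies climb
    #    while the deleted PGs are purged (e.g. `ceph osd perf` in another shell).
    run("rados", "-p", "bystander", "bench", "300", "write")

    # On a fixed build (ceph-14.2.11-151 or later), running
    #   ceph daemon osd.<id> config get osd_delete_sleep_hybrid
    # on an OSD host should report the new default of 1 second.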

Comment 4 Vikhyat Umrao 2020-12-10 17:56:21 UTC
Earlier reports from workload dfg:

https://bugzilla.redhat.com/show_bug.cgi?id=1770510
https://tracker.ceph.com/issues/47174

Comment 10 Vikhyat Umrao 2021-05-17 11:59:32 UTC
*** Bug 1952920 has been marked as a duplicate of this bug. ***

Comment 18 Amrita 2021-06-02 13:09:15 UTC
Hi Neha,

Could you please provide the doc text? It is needed for inclusion in the 4.2z2 Release Notes.

Thanks
Amrita

Comment 25 errata-xmlrpc 2021-06-15 17:13:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix Update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2445

Comment 26 Vikhyat Umrao 2022-05-03 17:56:53 UTC
*** Bug 1770510 has been marked as a duplicate of this bug. ***

