Bug 1608060 - Limit pg log length during recovery/backfill so that we don't run out of memory
Summary: Limit pg log length during recovery/backfill so that we don't run out of memory
Keywords:
Status: CLOSED NEXTRELEASE
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z1
Target Release: 3.2
Assignee: Neha Ojha
QA Contact: Parikshith
Docs Contact: Bara Ancincova
URL:
Whiteboard:
Duplicates: 1644409
Depends On:
Blocks: 1584264 1644409 1673654
Reported: 2018-07-24 21:31 UTC by Neha Ojha
Modified: 2024-01-18 09:56 UTC (History)

Fixed In Version: RHEL: ceph-12.2.8-66.el7cp Ubuntu: ceph_12.2.8-52redhat1xenial
Doc Type: Bug Fix
Doc Text:
.PG log length is now limited
Previously, the `osd_max_pg_log_entries` option did not set a hard limit for the placement group (PG) log length. Consequently, during recovery and backfill, the log could grow significantly and consume a lot of memory, in some cases all of it. With this update, a hard limit is set on the number of entries in the PG log, even during recovery and backfill. One corner case in which it might be hard to limit the PG log length is an erasure-coded pool where the rollback information on some of the replicas is too old.

The limit is enabled by a new flag, `pglog_hardlimit`, which is off by default. Run `ceph osd set pglog_hardlimit` only after the upgrade of the whole cluster is complete. Once all the OSDs have this flag set, the length of the PG log is capped by a hard limit. Do not unset this flag.
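
As a rough illustration of the upgrade step described in the Doc Text, the following Python sketch checks whether the `pglog_hardlimit` flag is already set on a cluster. It assumes that the JSON output of `ceph osd dump --format json` carries a top-level `flags` field holding a comma-separated string; the helper name `has_pglog_hardlimit` is hypothetical, and the field name should be verified against the installed release.

```python
# Hypothetical helper: not part of Ceph. It only inspects an already parsed
# `ceph osd dump --format json` document.
def has_pglog_hardlimit(osd_dump):
    """Return True if the cluster-wide pglog_hardlimit flag is set."""
    # Assumption: "flags" is a comma-separated string such as
    # "sortbitwise,recovery_deletes,pglog_hardlimit".
    flags = osd_dump.get("flags", "")
    return "pglog_hardlimit" in flags.split(",")
```

The parsed document could come from `json.loads()` applied to the command output. If the helper returns `False` after every OSD has been upgraded, that would be the point at which to run `ceph osd set pglog_hardlimit`.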
Clone Of:
: 1644409 1673654 (view as bug list)
Environment:
Last Closed: 2019-02-07 17:14:29 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 23979 | 0 | None | None | None | 2018-07-24 21:31:11 UTC
Github | ceph/ceph/pull/25949 | 0 | None | closed | luminous: osd/mon: pg log hard limit with upgrades fixed | 2020-07-09 16:43:27 UTC
Github | https://github.com/ceph/ceph/pull/23211 | 0 | None | None | None | 2020-07-09 16:43:27 UTC
Red Hat Issue Tracker | RHCEPH-3718 | 0 | None | None | None | 2022-03-13 15:21:54 UTC

Description Neha Ojha 2018-07-24 21:31:12 UTC
Description of problem: osd_max_pg_log_entries is not a hard upper limit for the pg log length. During recovery/backfill, the pg log can grow considerably and use a lot of memory.


Comment 3 Neha Ojha 2018-07-24 23:51:51 UTC
For QA purposes:

The length of the pg log can be observed from the output of ceph pg dump.
It can also be viewed in the OSD logs as "approx pg log length = ".
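
The two checks above could be scripted along these lines. This is a minimal sketch, not a QA tool: it assumes that `ceph pg dump --format json` exposes a `pg_stats` list (at the top level or nested under `pg_map`) whose entries have `pgid` and `log_size` fields, and that the OSD log message is followed by an integer. All function names are hypothetical, and the limit to compare against is left to the caller.

```python
import re


# Assumed layout of a parsed `ceph pg dump --format json` document: a
# "pg_stats" list, either at the top level or nested under "pg_map", whose
# entries carry "pgid" and "log_size". Verify these names on your release.
def pg_log_sizes(dump):
    """Return {pgid: log_size} from a parsed `ceph pg dump` JSON document."""
    stats = dump.get("pg_map", dump).get("pg_stats", [])
    return {pg["pgid"]: int(pg["log_size"]) for pg in stats}


def pgs_over_limit(dump, limit):
    """Return the sorted pgids whose log holds more than `limit` entries."""
    sizes = pg_log_sizes(dump)
    return sorted(pgid for pgid, size in sizes.items() if size > limit)


# The OSD log prints "approx pg log length = "; we assume an integer follows.
_LOG_LENGTH = re.compile(r"approx pg log length =\s*(\d+)")


def log_length_from_osd_line(line):
    """Return the logged length as an int, or None if the line has no match."""
    match = _LOG_LENGTH.search(line)
    return int(match.group(1)) if match else None
```

The `dump` argument could be produced by `json.loads()` on the command output, and `log_length_from_osd_line` could be applied to each line of an OSD log file during recovery or backfill.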

Comment 4 Neha Ojha 2018-09-14 00:57:15 UTC
Pushed changes to ceph-3.2-rhel-patches

Comment 14 Neha Ojha 2018-10-30 21:13:15 UTC
Hi Bara,

I have fixed a typo and added some information to the Doc Text.

Thanks,
Neha

Comment 17 Neha Ojha 2018-11-05 23:48:43 UTC
Moving this to z1 in light of http://tracker.ceph.com/issues/36686.

Comment 23 errata-xmlrpc 2019-01-03 19:01:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0020

Comment 24 Vikhyat Umrao 2019-01-08 00:32:27 UTC
*** Bug 1644409 has been marked as a duplicate of this bug. ***

Comment 32 Red Hat Bugzilla 2023-09-15 00:11:03 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.

