Bug 1489060

Summary:	backport prune past_intervals capability to 2.y
Product:	[Red Hat Storage] Red Hat Ceph Storage
Component:	RADOS
Version:	2.4
Status:	CLOSED ERRATA
Severity:	high
Priority:	high
Hardware:	All
OS:	All
Target Milestone:	rc
Target Release:	2.5
Reporter:	Sage Weil <sweil>
Assignee:	Josh Durgin <jdurgin>
QA Contact:	Manohar Murthy <mmurthy>
Docs Contact:	Bara Ancincova <bancinco>
CC:	agunn, anharris, ceph-eng-bugs, dzafman, hnallurv, jdurgin, kchai, kdreyer, mhackett, vumrao
Fixed In Version:	RHEL: ceph-10.2.10-1.el7cp; Ubuntu: ceph_10.2.10-2redhat1xenial
Doc Type:	Enhancement
Doc Text:	The `osd_hack_prune_past_interval` option is now supported. The option helps reduce the memory used by past intervals entries, which can help with recovery of unhealthy clusters. WARNING: This option can cause data loss; use it only when instructed by Red Hat Support Engineers.
Last Closed:	2018-02-21 19:43:32 UTC
Type:	Bug
Bug Blocks:	1536401

Description Sage Weil 2017-09-06 15:54:32 UTC

Description of problem:

The past_intervals structure can get very big, consuming memory and ultimately making recovery difficult.

How reproducible:

It has happened several times with customers with unhealthy clusters.

Steps to Reproduce:
1. Make the cluster unhealthy.
2. Thrash the OSDs.
3. OSD memory requirements increase, eventually beyond what the host has available.
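
The growth described above can be illustrated with a toy model. This is only a sketch and not Ceph code: the `ToyPG` class, its methods, and the `thrash` helper are hypothetical names invented for illustration, and the model assumes that a placement group records one past-interval entry per acting-set change until it becomes clean again.

```python
# Toy model only: this is not Ceph code and every name here is hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyPG:
    """A placement group that records one entry per acting-set change."""
    past_intervals: dict = field(default_factory=dict)  # first epoch -> acting set

    def on_map_change(self, epoch, acting):
        # Assumption: every change of the acting set starts a new interval.
        self.past_intervals[epoch] = tuple(acting)

    def mark_clean(self, epoch):
        # A clean PG no longer needs history from before this epoch.
        self.past_intervals = {
            e: a for e, a in self.past_intervals.items() if e >= epoch
        }

    def prune(self):
        # Stand-in for the prune capability: drop the recorded history.
        # The bug's Doc Text warns that the real option can cause data loss.
        self.past_intervals.clear()


def thrash(pg, start_epoch, flaps):
    """Simulate OSDs flapping: each flap changes the acting set once."""
    for i in range(flaps):
        acting = [i % 3, (i + 1) % 3]
        pg.on_map_change(start_epoch + i, acting)
    return len(pg.past_intervals)


if __name__ == "__main__":
    pg = ToyPG()
    print("entries after thrashing:", thrash(pg, start_epoch=1, flaps=1000))
    pg.prune()
    print("entries after pruning:", len(pg.past_intervals))
```

In this model a PG that never goes clean keeps every entry, so the history grows with the number of flaps, while a PG that reaches a clean state trims itself without any manual pruning.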


The fix is now in upstream jewel:
https://github.com/ceph/ceph/pull/17351

Backport that patch to downstream 2.y.

Note that luminous (and thus 3.y) does not have this problem.

Comment 3 Ken Dreyer (Red Hat) 2018-01-02 21:29:54 UTC

Fix is in Ceph v10.2.10 upstream

Comment 12 errata-xmlrpc 2018-02-21 19:43:32 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0340