Bug 1814057

Summary:	Ceph Monitor heartbeat grace period does not reset
Product:	[Red Hat Storage] Red Hat Ceph Storage	Reporter:	Steve Baldwin <sbaldwin>
Component:	RADOS	Assignee:	Sridhar Seshasayee <sseshasa>
Status:	CLOSED ERRATA	QA Contact:	Pawan <pdhiran>
Severity:	high	Docs Contact:	Aron Gunn <agunn>
Priority:	medium
Version:	3.2	CC:	agunn, ceph-eng-bugs, dzafman, gsitlani, jdurgin, kchai, marjones, mmuench, nojha, pdhiran, segutier, sseshasa, tserlin, vumrao
Target Milestone:	z2
Target Release:	4.1
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:	ceph-14.2.8-100.el8cp, ceph-14.2.8-100.el7cp	Doc Type:	Enhancement
Doc Text:	.Update to the heartbeat grace period Previously, there was no mechanism to reset the heartbeat grace timer back to the default value, even when a Ceph OSD had no failures for more than 48 hours. With this release, the heartbeat grace timer is reset to the default value of 20 seconds if there have been no failures on a Ceph OSD for 48 hours. That is, when the interval between the previous failure and the latest failure exceeds 48 hours, the grace timer is reset to the default value of 20 seconds. The grace time is the interval without a heartbeat after which a Ceph storage cluster considers a Ceph OSD to be down. The grace time is scaled based on lag estimations or on how frequently a Ceph OSD is experiencing failures.	Story Points:	---
Clone Of:
Clones:	1855472		Environment:
Last Closed:	2020-09-30 17:24:49 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1816167, 1855472
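
Illustration of the Doc Text above: a minimal Python sketch of the reset rule it describes, not the actual Ceph monitor code. The 20-second default and the 48-hour threshold come from the Doc Text; the names OsdGraceState, grace_after_failure, and penalty_secs, and the additive scaling, are hypothetical and chosen only to show when the reset happens.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_GRACE_SECS = 20.0             # default grace stated in the Doc Text
RESET_THRESHOLD_SECS = 48 * 60 * 60   # 48 hours, as stated in the Doc Text


@dataclass
class OsdGraceState:
    """Hypothetical per-OSD bookkeeping; not a Ceph data structure."""
    extra_grace_secs: float = 0.0            # scaling accumulated from past failures
    last_failure_at: Optional[float] = None  # timestamp of the previous failure


def grace_after_failure(state: OsdGraceState, now: float, penalty_secs: float) -> float:
    """Return the grace time to apply after a failure reported at `now`.

    Assumption: each failure inside the 48-hour window adds `penalty_secs`.
    The real scaling is based on lag estimations and failure frequency.
    """
    previous = state.last_failure_at
    if previous is None or now - previous > RESET_THRESHOLD_SECS:
        # No failure history, or the previous failure is more than
        # 48 hours old: fall back to the default grace.
        state.extra_grace_secs = 0.0
    else:
        # Repeated failure inside the window: keep scaling the grace up.
        state.extra_grace_secs += penalty_secs
    state.last_failure_at = now
    return DEFAULT_GRACE_SECS + state.extra_grace_secs
```

In this sketch, a second failure one hour after the first returns a scaled grace, while a failure more than 48 hours after the previous one returns the 20-second default again.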

Comment 30 errata-xmlrpc 2020-09-30 17:24:49 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 4.1 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4144