Bug 1814057

Summary: Ceph Monitor heartbeat grace period does not reset
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Steve Baldwin <sbaldwin>
Component: RADOS Assignee: Sridhar Seshasayee <sseshasa>
Status: CLOSED ERRATA QA Contact: Pawan <pdhiran>
Severity: high Docs Contact: Aron Gunn <agunn>
Priority: medium    
Version: 3.2 CC: agunn, ceph-eng-bugs, dzafman, gsitlani, jdurgin, kchai, marjones, mmuench, nojha, pdhiran, segutier, sseshasa, tserlin, vumrao
Target Milestone: z2   
Target Release: 4.1   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: ceph-14.2.8-100.el8cp, ceph-14.2.8-100.el7cp Doc Type: Enhancement
Doc Text:
.Update to the heartbeat grace period
Previously, there was no mechanism to reset the heartbeat grace timer back to the default value, even when a Ceph OSD had experienced no failures for more than 48 hours. With this release, when the interval between the previous failure and the latest failure on a Ceph OSD exceeds 48 hours, the grace timer is reset to the default value of 20 seconds. The grace time is the interval without a heartbeat after which a Ceph storage cluster considers a Ceph OSD to be down. The grace time is scaled based on lag estimations or on how frequently a Ceph OSD is experiencing failures.
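
The reset rule described above can be illustrated with a minimal sketch. This is not the Ceph Monitor implementation; the function name, constants, and the `scaled_grace_secs` parameter are hypothetical and exist only for illustration. The sketch assumes the scaled grace value has already been computed elsewhere, and that a first-ever failure (no previous failure on record) keeps that scaled value.

```python
from typing import Optional

# Values taken from the doc text; the names themselves are hypothetical.
DEFAULT_GRACE_SECS = 20.0
RESET_INTERVAL_SECS = 48 * 60 * 60  # 48 hours


def grace_after_failure(
    previous_failure_ts: Optional[float],
    latest_failure_ts: float,
    scaled_grace_secs: float,
) -> float:
    """Return the grace time to apply after the latest OSD failure.

    If more than 48 hours passed between the previous and the latest
    failure, fall back to the default grace time. Otherwise keep the
    scaled value (assumed to be computed by the caller).
    """
    if previous_failure_ts is not None:
        interval = latest_failure_ts - previous_failure_ts
        if interval > RESET_INTERVAL_SECS:
            return DEFAULT_GRACE_SECS
    return scaled_grace_secs


# Failures 49 hours apart: the grace time resets to the default.
print(grace_after_failure(0.0, 49 * 3600.0, 35.0))
# Failures 1 hour apart: the scaled grace time is kept.
print(grace_after_failure(0.0, 3600.0, 35.0))
```

Keeping the rule as a pure function of two timestamps makes the 48-hour boundary easy to check in isolation; how the scaled value is derived from lag estimations is outside the scope of this sketch.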
Story Points: ---
Clone Of:
: 1855472 Environment:
Last Closed: 2020-09-30 17:24:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1816167, 1855472    

Comment 30 errata-xmlrpc 2020-09-30 17:24:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 4.1 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4144