Bug 2130116

Summary: standby-replay mds is removed from MDSMap unexpectedly
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Venky Shankar <vshankar>
Component: CephFS
Assignee: Venky Shankar <vshankar>
Status: CLOSED ERRATA
QA Contact: Hemanth Kumar <hyelloji>
Severity: high
Docs Contact: Akash Raj <akraj>
Priority: unspecified
Version: 5.2
CC: akraj, asagare, bkunal, ceph-eng-bugs, cephqe-warriors, gfarnum, hyelloji, vereddy, vumrao
Target Milestone: ---   
Target Release: 5.3   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: ceph-16.2.10-54.el8cp
Doc Type: Bug Fix
Doc Text:
.The standby-replay Metadata Server daemon is no longer unexpectedly removed
Previously, the Ceph Monitor would remove a standby-replay Metadata Server (MDS) daemon from the MDS map under certain conditions. This removed the standby-replay MDS daemon from the MDS cluster and generated cluster warnings. With this fix, the logic that Ceph Monitors use when considering whether to remove an MDS daemon from the MDS map now includes information about standby-replay MDS daemons holding a rank. This ensures that standby-replay MDS daemons are no longer unexpectedly removed from the MDS cluster.
Story Points: ---
Clone Of:
: 2130118
Environment:
Last Closed: 2023-01-11 17:41:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2130118    
Bug Blocks: 2126049, 2130925    

Description Venky Shankar 2022-09-27 09:05:47 UTC
Description of problem:

standby-replay mds is removed from MDSMap unexpectedly.

Commit https://github.com/ceph/ceph/commit/20509bb6c82e872127ab838d45402be0d0b91b5f makes the monitor evict an MDS when it sees a garbage beacon or an invalid state transition from that MDS. To reproduce the issue, the standby-replay daemon needs to become laggy; when it then resumes normal operation, the monitor removes the standby-replay MDS from the MDSMap.
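
The behaviour described above, and the change summarised in the Doc Text, can be illustrated with a toy model. This is only a sketch under stated assumptions, not the MDSMonitor code: DaemonInfo, should_remove, the consider_rank flag, and the "unexpected" beacon state are all hypothetical names invented for illustration, and the model treats any mismatch between the recorded state and the beacon state as an invalid transition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DaemonInfo:
    """Hypothetical, simplified record of one MDS daemon in a toy map."""
    name: str
    state: str            # state recorded in the toy map
    rank: Optional[int]   # rank held or followed; None if the daemon has no rank


def should_remove(info: DaemonInfo, beacon_state: str, consider_rank: bool) -> bool:
    """Return True if the toy monitor would drop the daemon from its map.

    Assumption: any beacon whose state differs from the recorded state
    counts as an invalid transition. The real monitor logic is more detailed.
    """
    if beacon_state == info.state:
        return False  # nothing changed, nothing to evict
    if consider_rank and info.state == "standby-replay" and info.rank is not None:
        # Models the fix described in the Doc Text: a standby-replay
        # daemon that holds a rank is not evicted.
        return False
    return True


# A standby-replay daemon following rank 0 sends a beacon the toy
# monitor does not expect after a laggy period.
replay = DaemonInfo(name="mds.b", state="standby-replay", rank=0)
before_fix = should_remove(replay, "unexpected", consider_rank=False)
after_fix = should_remove(replay, "unexpected", consider_rank=True)
print(before_fix, after_fix)
```

With consider_rank disabled the toy monitor evicts the daemon; with it enabled the daemon stays in the map, while a daemon without a rank is still subject to eviction.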

Comment 1 RHEL Program Management 2022-09-27 09:05:58 UTC
Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 33 errata-xmlrpc 2023-01-11 17:41:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 security update and Bug Fix), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:0076