Bug 2130118 - standby-replay mds is removed from MDSMap unexpectedly
Summary: standby-replay mds is removed from MDSMap unexpectedly
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: CephFS
Version: 5.2
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 6.0
Assignee: Venky Shankar
QA Contact: Hemanth Kumar
Docs Contact: Eliska
URL:
Whiteboard:
Depends On:
Blocks: 2126050 2130116
Reported: 2022-09-27 09:07 UTC by Venky Shankar
Modified: 2023-03-20 18:58 UTC (History)

Fixed In Version: ceph-17.2.3-43.el9cp
Doc Type: Bug Fix
Doc Text:
.The standby-replay Metadata Server daemon is no longer unexpectedly removed
Previously, under certain conditions, the Ceph Monitor removed a standby-replay Metadata Server (MDS) daemon from the MDS map. As a result, the daemon was dropped from the MDS cluster, which generated cluster warnings. With this fix, the logic that Ceph Monitors use when deciding whether to remove an MDS daemon from the MDS map takes into account standby-replay MDS daemons that hold a rank. Standby-replay MDS daemons are therefore no longer unexpectedly removed from the MDS cluster.
Clone Of: 2130116
Environment:
Last Closed: 2023-03-20 18:58:17 UTC
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 57370 0 None None None 2022-09-27 09:07:52 UTC
Red Hat Issue Tracker RHCEPH-5362 0 None None None 2022-09-27 09:18:36 UTC
Red Hat Product Errata RHBA-2023:1360 0 None None None 2023-03-20 18:58:54 UTC

Description Venky Shankar 2022-09-27 09:07:53 UTC
+++ This bug was initially created as a clone of Bug #2130116 +++

Description of problem:

standby-replay mds is removed from MDSMap unexpectedly.

The change in https://github.com/ceph/ceph/commit/20509bb6c82e872127ab838d45402be0d0b91b5f makes the monitor evict an MDS when it sees a garbage beacon or an invalid state transition. To reproduce the problem, the standby-replay daemon must first become laggy; when it resumes normal operation, the monitor removes the standby-replay MDS from the MDSMap.
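
The sequence above (standby-replay daemon goes laggy, resumes, and is evicted) can be sketched with a toy model. This is not the MDSMonitor code. The names `DaemonInfo`, `ToyMonitor`, `rank_aware`, and `demo` are hypothetical, and the validity rule in `_transition_ok` is a placeholder, because this report does not spell out the actual condition the monitor checks. It only illustrates how ignoring the rank held by a standby-replay daemon could turn a resumed beacon into an eviction.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class DaemonInfo:
    """Hypothetical per-daemon record kept by the toy monitor."""
    name: str
    state: str                   # e.g. "standby" or "standby-replay"
    rank: Optional[int] = None   # rank a standby-replay daemon follows
    laggy: bool = False


class ToyMonitor:
    """Illustrative model only; not the Ceph MDSMonitor implementation."""

    def __init__(self, rank_aware: bool) -> None:
        # rank_aware=False models the old behavior, True the fixed one.
        self.rank_aware = rank_aware
        self.mds_map: Dict[str, DaemonInfo] = {}

    def add(self, info: DaemonInfo) -> None:
        self.mds_map[info.name] = info

    def mark_laggy(self, name: str) -> None:
        self.mds_map[name].laggy = True

    def _transition_ok(self, info: DaemonInfo, beacon_state: str) -> bool:
        # The toy model only scrutinizes beacons from daemons marked laggy.
        if not info.laggy:
            return True
        holds_rank = info.rank is not None
        if self.rank_aware and info.state == "standby-replay" and holds_rank:
            # Fixed model: a standby-replay daemon holding a rank may resume.
            return beacon_state == "standby-replay"
        # Placeholder for the old behavior (assumption): the held rank is
        # ignored, so a resumed standby-replay beacon looks invalid.
        return info.state != "standby-replay"

    def handle_beacon(self, name: str, beacon_state: str) -> bool:
        """Return True if the daemon is still in the map afterwards."""
        info = self.mds_map.get(name)
        if info is None:
            return False
        if not self._transition_ok(info, beacon_state):
            del self.mds_map[name]   # eviction from the map
            return False
        info.state = beacon_state
        info.laggy = False
        return True


def demo(rank_aware: bool) -> bool:
    """Replay the laggy-then-resume sequence for one standby-replay daemon."""
    mon = ToyMonitor(rank_aware)
    mon.add(DaemonInfo("mds.b", "standby-replay", rank=0))
    mon.mark_laggy("mds.b")
    return mon.handle_beacon("mds.b", "standby-replay")
```

In this sketch, `demo(False)` models the behavior reported here (the daemon is dropped after resuming) and `demo(True)` models the behavior after the fix (the daemon stays in the map).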

--- Additional comment from RHEL Program Management on 2022-09-27 09:05:58 UTC ---

Please specify the severity of this bug. Severity is defined here:
https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.

Comment 16 errata-xmlrpc 2023-03-20 18:58:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 6.0 Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:1360

