Bug 2142174 - mon/Elector: notify_rank_removed erase rank from both live_pinging and dead_pinging sets for highest ranked MON
Summary: mon/Elector: notify_rank_removed erase rank from both live_pinging and dead_p...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RADOS
Version: 5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 5.3
Assignee: Kamoltat (Junior) Sirivadhna
QA Contact: Pawan
Docs Contact: Akash Raj
URL:
Whiteboard:
Depends On: 2142143
Blocks: 2121452 2126049 2142674 2142983 2150223
Reported: 2022-11-11 21:54 UTC by Kamoltat (Junior) Sirivadhna
Modified: 2023-01-17 07:53 UTC

Fixed In Version: ceph-16.2.10-85.el8cp
Doc Type: Bug Fix
Doc Text:
.Rank is removed from the `live_pinging` and `dead_pinging` sets to mitigate the inconsistent connectivity score issue
Previously, when two monitors were removed consecutively and the removed rank was equal to the Paxos size, the monitor would not remove that rank from the `dead_pinging` set. The rank remained in the `dead_pinging` set, which would cause problems such as an inconsistent connectivity score when stretch-cluster mode was enabled.
With this fix, a case is added for removing the highest-ranked monitor: when the rank is equal to the Paxos size, the rank is removed from both the `live_pinging` and `dead_pinging` sets. The monitor stays healthy with clean `live_pinging` and `dead_pinging` sets.
Clone Of: 2142143
Environment:
Last Closed: 2023-01-11 17:42:24 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | ceph ceph pull 47087 | 0 | None | Merged | pacific:mon/Elector: notify_rank_removed erase rank from both live_pinging and dead_pinging sets for highest ranked MON | 2022-11-11 21:57:04 UTC
Red Hat Issue Tracker | RHCEPH-5603 | 0 | None | None | None | 2022-11-11 22:01:11 UTC
Red Hat Product Errata | RHSA-2023:0076 | 0 | None | None | None | 2023-01-11 17:43:38 UTC

Description Kamoltat (Junior) Sirivadhna 2022-11-11 21:54:42 UTC
+++ This bug was initially created as a clone of Bug #2142143 +++

Description of problem:

Added a case to notify_rank_removed for removing the highest-ranked
monitor. The old version did not handle this, because it only entered
the loop when rank_removed < paxos_size(). We therefore added an else
case for rank_removed == paxos_size() that erases the rank from both
the live_pinging and dead_pinging sets.
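
The branch structure described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Ceph source: `PingState` and `notify_rank_removed_sketch` are hypothetical names, `paxos_size` is modeled as a plain integer field standing in for `paxos_size()`, and the pre-existing renumbering loop for lower ranks is deliberately left out.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the elector's ping bookkeeping (not the Ceph class).
struct PingState {
  std::set<int> live_pinging;  // ranks currently in the live ping set
  std::set<int> dead_pinging;  // ranks currently in the dead ping set
  int paxos_size;              // assumed: the value paxos_size() would return
};

// Sketch of the two branches only; the real function does more work.
void notify_rank_removed_sketch(PingState& s, int rank_removed) {
  assert(rank_removed >= 0);  // ranks are non-negative in this sketch

  if (rank_removed < s.paxos_size) {
    // Pre-existing path: ranks above rank_removed shift down by one.
    // The renumbering loop is intentionally omitted from this sketch.
    return;
  }

  // Added path: the highest-ranked monitor was removed
  // (rank_removed == paxos_size in the bug description).
  // Drop the rank from both sets so no stale entry is left behind.
  // std::set::erase(key) is a no-op when the key is absent.
  s.live_pinging.erase(rank_removed);
  s.dead_pinging.erase(rank_removed);
}
```

For example, with `paxos_size` set to 2 and rank 2 still present in `dead_pinging`, calling the sketch with `rank_removed = 2` leaves neither set containing rank 2, which is the stale-entry situation the fix targets.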

Comment 1 Veera Raghava Reddy 2022-11-14 12:14:29 UTC
Hi Scott,
Looks like this bug is for ODF. Can you review if this is a blocker for 5.3 or can be deferred to 5.3z1?

Comment 2 Vikhyat Umrao 2022-11-15 15:39:56 UTC
(In reply to Veera Raghava Reddy from comment #1)
> Hi Scott,
> Looks like this bug is for ODF. Can you review if this is a blocker for 5.3
> or can be deferred to 5.3z1?

Yesterday, I had a discussion with Junior and Neha. Since we are giving the ODF customer a hotfix, this can be taken out of 5.3, because we won't be able to meet the 5.3 timelines!

Comment 30 Kamoltat (Junior) Sirivadhna 2023-01-11 10:24:38 UTC
Hi Akash,

here is the doc text,

Thank you!

Comment 31 errata-xmlrpc 2023-01-11 17:42:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 security update and Bug Fix), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:0076

