Back to bug 2142143

Who When What Removed Added
Kamoltat (Junior) Sirivadhna 2022-11-11 19:25:54 UTC Status NEW ASSIGNED
Assignee nojha ksirivad
Red Hat One Jira (issues.redhat.com) 2022-11-11 19:29:35 UTC Link ID Red Hat Issue Tracker RHCEPH-5598
Kamoltat (Junior) Sirivadhna 2022-11-11 21:54:42 UTC Blocks 2142174
Kamoltat (Junior) Sirivadhna 2022-11-14 05:58:03 UTC Target Release 6.1 6.0
Kamoltat (Junior) Sirivadhna 2022-11-15 17:23:52 UTC Status ASSIGNED POST
Keywords Rebase
Veera Raghava Reddy 2022-11-15 17:36:38 UTC Fixed In Version ceph-17.2.5-2.el9cp
CC tserlin
Status POST MODIFIED
Flags needinfo?(pdhiran)
Flags needinfo?(vereddy)
CC pdhiran, vereddy
Flags needinfo?(pdhiran) needinfo?(vereddy)
errata-xmlrpc 2022-11-15 17:37:53 UTC Status MODIFIED ON_QA
Eliska 2022-11-16 12:42:58 UTC CC ekristov
Flags needinfo?(ksirivad)
Kamoltat (Junior) Sirivadhna 2022-11-16 19:53:22 UTC Doc Type If docs needed, set a value Bug Fix
Flags needinfo?(ksirivad)
Doc Text Cause:

There is a case where the paxos_size of the MonMap gets updated before we change the rank of the monitor. For example, if paxos_size gets reduced from 5 to 4 while the highest rank of the Monitors is still 4, the old code would skip deleting the rank from dead_pinging.

Consequence:

The targeted rank remains in `dead_pinging` forever, which can cause strange `peer_tracker` scores in election strategy: 3.

Fix:

Added a case for rank_removed == paxos_size(),
in which we erase the rank from both the live_pinging and dead_pinging sets.

Result:

The targeted rank_removed gets erased from live_pinging and dead_pinging, and doesn't get stuck forever in either of these sets.
Eliska 2022-11-24 11:30:32 UTC Flags needinfo?(ksirivad)
Docs Contact ekristov
Doc Text Cause:

There is a case where the paxos_size of the MonMap gets updated before we change the rank of the monitor. For example, if paxos_size gets reduced from 5 to 4 while the highest rank of the Monitors is still 4, the old code would skip deleting the rank from dead_pinging.

Consequence:

The targeted rank remains in `dead_pinging` forever, which can cause strange `peer_tracker` scores in election strategy: 3.

Fix:

Added a case for rank_removed == paxos_size(),
in which we erase the rank from both the live_pinging and dead_pinging sets.

Result:

The targeted rank_removed gets erased from live_pinging and dead_pinging, and doesn't get stuck forever in either of these sets.
.The targeted `rank_removed` no longer gets stuck in the `live_pinging` and `dead_pinging` sets

Previously, in some cases, the `paxos_size` of the Monitor Map was updated before the rank of the Monitor was changed.
For example, `paxos_size` was reduced from 5 to 4 while the highest Monitor rank was still 4, so the old code skipped deleting the rank from the `dead_pinging` set.
The targeted rank then remained in `dead_pinging` indefinitely, which caused unexpected `peer_tracker` scores in election strategy: 3.

With this fix, a case is added for `rank_removed == paxos_size()` that erases the targeted rank from both the `live_pinging` and `dead_pinging` sets, so the rank no longer gets stuck in either set.
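
To make the added case concrete, here is a minimal C++ sketch of the behaviour the Doc Text describes. It is not the Ceph source: `PingSets`, `renumber`, and the free function `notify_rank_removed` with its explicit `paxos_size` parameter are hypothetical names chosen for illustration, and the handling of ranks above the removed one is heavily simplified.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the monitor's ping bookkeeping (not the Ceph types).
struct PingSets {
  std::set<unsigned> live_pinging;
  std::set<unsigned> dead_pinging;
};

// Simplified renumbering (an assumption): drop `removed` and shift every
// higher rank down by one.
inline void renumber(std::set<unsigned>& ranks, unsigned removed) {
  std::set<unsigned> shifted;
  for (unsigned r : ranks) {
    if (r == removed) continue;
    shifted.insert(r > removed ? r - 1 : r);
  }
  ranks.swap(shifted);
}

// Assumed shape of the fixed logic. `paxos_size` is the size of the monitor
// map after it has already been reduced.
inline void notify_rank_removed(PingSets& s, unsigned rank_removed,
                                unsigned paxos_size) {
  if (rank_removed < paxos_size) {
    // A rank below the new map size was removed: renumber the survivors.
    renumber(s.live_pinging, rank_removed);
    renumber(s.dead_pinging, rank_removed);
  } else if (rank_removed == paxos_size) {
    // The case the fix adds: the highest rank was removed after paxos_size
    // had already shrunk, so erase it from both sets explicitly.
    s.live_pinging.erase(rank_removed);
    s.dead_pinging.erase(rank_removed);
  }
}
```

With `paxos_size` already reduced to 4 and rank 4 still sitting in `dead_pinging`, calling `notify_rank_removed(s, 4, 4)` takes the second branch and leaves `dead_pinging` empty, which is the outcome the Doc Text describes.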
Eliska 2022-11-24 11:36:25 UTC Blocks 2126050
Kamoltat (Junior) Sirivadhna 2022-11-29 03:50:34 UTC Flags needinfo?(ksirivad)
Bipin Kunal 2022-12-02 06:55:50 UTC CC bkunal
Pawan 2022-12-09 05:36:51 UTC Flags needinfo?(ksirivad)
Status ON_QA ASSIGNED
Kamoltat (Junior) Sirivadhna 2022-12-14 19:43:58 UTC Flags needinfo?(ksirivad) needinfo?(pdhiran)
Ken Dreyer (Red Hat) 2022-12-14 21:46:59 UTC CC kdreyer
Neha Ojha 2022-12-14 22:26:33 UTC Status ASSIGNED MODIFIED
Pawan 2022-12-15 03:51:32 UTC Flags needinfo?(pdhiran)
Status MODIFIED VERIFIED
Red Hat Bugzilla 2022-12-31 19:13:37 UTC CC amathuri
Red Hat Bugzilla 2022-12-31 19:32:46 UTC CC pdhiran
QA Contact pdhiran
Red Hat Bugzilla 2022-12-31 20:00:12 UTC CC sseshasa
Red Hat Bugzilla 2022-12-31 22:43:40 UTC CC rfriedma
Red Hat Bugzilla 2022-12-31 23:43:47 UTC CC rzarzyns
Red Hat Bugzilla 2022-12-31 23:46:04 UTC CC akupczyk
Red Hat Bugzilla 2023-01-01 05:35:32 UTC Assignee ksirivad nojha
CC ksirivad
Red Hat Bugzilla 2023-01-01 05:40:00 UTC CC tserlin
Red Hat Bugzilla 2023-01-01 06:03:40 UTC CC kdreyer
Red Hat Bugzilla 2023-01-01 06:27:20 UTC CC lflores
Red Hat Bugzilla 2023-01-01 06:29:12 UTC CC choffman
Red Hat Bugzilla 2023-01-01 08:30:01 UTC CC bkunal
Red Hat Bugzilla 2023-01-01 08:39:05 UTC Assignee nojha nobody
CC nojha
Red Hat Bugzilla 2023-01-01 08:39:59 UTC CC pdhange
Red Hat Bugzilla 2023-01-01 08:47:55 UTC CC vereddy
Red Hat Bugzilla 2023-01-01 08:50:23 UTC CC vumrao
Alasdair Kergon 2023-01-04 04:40:45 UTC CC akupczyk
Alasdair Kergon 2023-01-04 04:43:11 UTC Assignee nobody ksirivad
Alasdair Kergon 2023-01-04 04:43:34 UTC CC amathuri
Alasdair Kergon 2023-01-04 04:56:54 UTC QA Contact pdhiran
Alasdair Kergon 2023-01-04 05:03:42 UTC CC kdreyer
Alasdair Kergon 2023-01-04 05:08:58 UTC CC ksirivad
Alasdair Kergon 2023-01-04 05:10:58 UTC CC lflores
Alasdair Kergon 2023-01-04 05:21:38 UTC CC nojha
Alasdair Kergon 2023-01-04 05:28:18 UTC CC pdhange
Alasdair Kergon 2023-01-04 05:30:13 UTC CC pdhiran
Alasdair Kergon 2023-01-04 05:34:52 UTC CC rfriedma
Alasdair Kergon 2023-01-04 05:37:37 UTC CC rzarzyns
Alasdair Kergon 2023-01-04 05:59:30 UTC CC vumrao
Alasdair Kergon 2023-01-04 06:09:44 UTC CC bkunal
Alasdair Kergon 2023-01-04 06:13:47 UTC CC choffman
Alasdair Kergon 2023-01-04 06:29:04 UTC CC vereddy
Alasdair Kergon 2023-01-04 06:56:31 UTC CC sseshasa
Red Hat Bugzilla 2023-01-09 08:28:41 UTC CC ceph-eng-bugs
Alasdair Kergon 2023-01-09 19:43:36 UTC CC ceph-eng-bugs
errata-xmlrpc 2023-03-20 18:59:13 UTC Resolution --- ERRATA
Status VERIFIED CLOSED
Last Closed 2023-03-20 18:59:13 UTC
errata-xmlrpc 2023-03-20 19:00:16 UTC Link ID Red Hat Product Errata RHBA-2023:1360
