Bug 2310114
| Summary: | [Stretch Mode] Cluster unresponsive and commands are stuck during Netsplit scenario b/w the two data sites | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Kamoltat (Junior) Sirivadhna <ksirivad> |
| Component: | RADOS | Assignee: | Kamoltat (Junior) Sirivadhna <ksirivad> |
| Status: | CLOSED ERRATA | QA Contact: | Pawan <pdhiran> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 7.0 | CC: | akraj, bhubbard, ceph-eng-bugs, cephqe-warriors, hakumar, ksirivad, ngangadh, nojha, pdhange, pdhiran, rzarzyns, tserlin, vereddy, vumrao |
| Target Milestone: | --- | Keywords: | Automation, Regression |
| Target Release: | 7.1z2 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | ceph-18.2.1-237.el9cp | Doc Type: | Bug Fix |
| Doc Text: |
.Monitors no longer get stuck in elections during crash/shutdown tests
Previously, the `disallowed_leaders` attribute of the MonitorMap was filled only when a Monitor entered `stretch_mode`. However, a Monitor that had just been revived would not enter `stretch_mode` right away, because it would still be in the `probing` state. This led to a mismatch in the `disallowed_leaders` set between the Monitors across the cluster. As a result, the Monitors failed to elect a leader and the election got stuck, leaving Ceph unresponsive.
With this fix, Monitors do not have to be in `stretch_mode` to fill the `disallowed_leaders` attribute, so Monitors no longer get stuck in elections during crash/shutdown tests.
| Story Points: | --- | ||
| Clone Of: | 2249962 | Environment: | |
| Last Closed: | 2024-11-07 14:39:19 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 2249962 | ||
| Bug Blocks: | 2267614, 2298578, 2298579 | ||
|
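The mismatch described in the Doc Text can be illustrated with a toy model. This is a minimal sketch and not Ceph's election code: the `Monitor` class, the `fill_disallowed`, `preferred_leader`, `run_election`, and `simulate` helpers, the monitor names, and the rule "pick the lowest-ranked monitor that is not disallowed" are all assumptions made for illustration. It only shows why monitors holding different `disallowed_leaders` sets can fail to agree on a leader.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Toy monitor holding its own view of disallowed_leaders (hypothetical)."""
    name: str
    rank: int
    in_stretch_mode: bool = True
    disallowed_leaders: set = field(default_factory=set)


def fill_disallowed(mon, tiebreaker, require_stretch_mode):
    """Populate the monitor's disallowed_leaders set.

    require_stretch_mode=True models the old, conditional behaviour;
    False models the behaviour described for the fix.
    """
    if require_stretch_mode and not mon.in_stretch_mode:
        return  # a monitor that is still probing keeps an empty set
    mon.disallowed_leaders = {tiebreaker}


def preferred_leader(mon, ranks):
    """Lowest-ranked monitor that this monitor is willing to accept."""
    allowed = [n for n in ranks if n not in mon.disallowed_leaders]
    return min(allowed, key=ranks.get)


def run_election(monitors):
    """Return the agreed leader, or None if the monitors' views conflict."""
    ranks = {m.name: m.rank for m in monitors}
    choices = {preferred_leader(m, ranks) for m in monitors}
    return choices.pop() if len(choices) == 1 else None


def simulate(require_stretch_mode):
    # Assumed layout: the tiebreaker has the lowest rank, and monitor "b"
    # has just been revived and has not yet entered stretch mode.
    mons = [
        Monitor("tiebreaker", 0),
        Monitor("a", 1),
        Monitor("b", 2, in_stretch_mode=False),
    ]
    for m in mons:
        fill_disallowed(m, "tiebreaker", require_stretch_mode)
    return run_election(mons)
```

In this model, `simulate(True)` returns `None` because the revived monitor still considers the tiebreaker a valid leader while the others do not, and `simulate(False)` returns `"a"` because every monitor holds the same `disallowed_leaders` set.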
Description
Kamoltat (Junior) Sirivadhna
2024-09-05 02:24:07 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 7.1 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:9010

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.