Bug 1851347

Summary: [RFE][DOC] Disaster recovery steps to restore ceph-monitor quorum when 2 out of 3 monitors are lost
Product: [Red Hat Storage] Red Hat OpenShift Container Storage Reporter: Karun Josy <kjosy>
Component: documentation Assignee: Kusuma <kbg>
Status: CLOSED CURRENTRELEASE QA Contact: Elad <ebenahar>
Severity: high Docs Contact:
Priority: urgent    
Version: 4.3 CC: akrai, assingh, bkunal, ebenahar, etamir, hnallurv, karunjosyc, kbg, ocs-bugs, tdesala, tnielsen
Target Milestone: --- Keywords: FutureFeature
Target Release: --- Flags: kbg: needinfo-
kbg: needinfo-
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-03-02 07:11:51 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 5 Elad 2020-09-24 12:36:10 UTC
This is not part of 4.6 content. Can we move this out?

Comment 20 Travis Nielsen 2021-10-04 17:18:03 UTC
After the mons scale up to 3 mons again, we should still see mon-a in the quorum. It's unexpected that mon-a is not showing up in quorum. Since all the daemons are crashing, this likely indicates that a new mon quorum has been created and the original mon quorum was lost. Something must not have worked when restoring the quorum.

After mon-b and mon-c are taken down and the mon quorum is reset to include only mon-a, does the ceph status show the cluster as healthy with the single mon? Before continuing with the guide to restore up to 3 mons again, we need to see that the cluster is healthy with a single mon.
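
The single-mon check described above could be scripted roughly as follows. This is only a sketch, not part of the documented procedure. It assumes the JSON from `ceph status --format json` exposes a top-level `quorum_names` list and a `health.status` string, and that the surviving mon is named `a`. The helper names are hypothetical. Requiring HEALTH_OK is also an assumption: a cluster may report HEALTH_WARN for unrelated reasons and still be usable.

```python
import json
import subprocess


def quorum_names(status: dict) -> list:
    """Return the mon names currently in quorum (empty list if the key is absent)."""
    return list(status.get("quorum_names", []))


def healthy_with_single_mon(status: dict, mon: str = "a") -> bool:
    """True only if the given mon is the sole quorum member and health is HEALTH_OK.

    Assumption: HEALTH_OK is treated as the only acceptable state.
    """
    health = status.get("health", {}).get("status")
    return quorum_names(status) == [mon] and health == "HEALTH_OK"


def fetch_status() -> dict:
    """Run `ceph status --format json` (for example from the toolbox pod) and parse it."""
    result = subprocess.run(
        ["ceph", "status", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `healthy_with_single_mon(fetch_status())` before scaling back up gives a yes/no answer to the question above. After the scale-up, checking `"a" in quorum_names(fetch_status())` would show whether the original mon is still part of the quorum.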