Bug 2215880
| Summary: | [GSS] ceph status shows no mgr although mgr pod is running | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | amansan <amanzane> |
| Component: | ceph | Assignee: | Brad Hubbard <bhubbard> |
| Ceph sub component: | Ceph-MGR | QA Contact: | Elad <ebenahar> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | bhubbard, bniver, hnallurv, muagarwa, nojha, ocs-bugs, odf-bz-bot, rzarzyns, sostapov |
| Version: | 4.11 | | |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-06-20 15:02:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description of problem (please be detailed as possible and provide log snippets):

Ceph status shows the mgr as missing, although the mgr pod is still Running in ODF.

Version of all relevant components (if applicable):

OCP 4.11 / ODF 4.11

`ceph versions` output (note the empty `"mgr"` section):

```
{
    "mon": {
        "ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 3
    },
    "mgr": {},
    "osd": {
        "ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 72
    },
    "mds": {
        "ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 2
    },
    "rgw": {
        "ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 1
    },
    "overall": {
        "ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 78
    }
}
```

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

The customer is moving into production and cannot trust ODF at this moment, because the mgr dies randomly.

Is there any workaround available to the best of your knowledge?

Deleting the mgr pod in ODF does the trick.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

2

Is this issue reproducible?

No

Can this issue be reproduced from the UI?

No

Additional info:

There is a Ceph bug similar to this situation: https://bugzilla.redhat.com/show_bug.cgi?id=2192479. However, that problem appears to arise when a standby mgr transitions to active, whereas ODF runs only a single mgr, so it is not clear whether it is the same issue.
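The pod-deletion workaround described above can be sketched as a short `oc` session. This is an illustrative sketch, not taken from the report: it assumes the default `openshift-storage` namespace, the standard Rook label `app=rook-ceph-mgr` on the mgr pod, and an available `rook-ceph-tools` deployment for running ceph commands; adjust these for the actual cluster.

```shell
# Hypothetical workaround sketch -- assumes the default openshift-storage
# namespace, the standard Rook mgr pod label, and a rook-ceph-tools toolbox.
NS=openshift-storage

# 1. Confirm the symptom: ceph reports no mgr while the pod shows Running.
oc -n "$NS" rsh deploy/rook-ceph-tools ceph status
oc -n "$NS" get pods -l app=rook-ceph-mgr

# 2. Delete the mgr pod; the Rook operator recreates it automatically.
oc -n "$NS" delete pod -l app=rook-ceph-mgr

# 3. Verify the new mgr registers with the cluster again.
oc -n "$NS" rsh deploy/rook-ceph-tools ceph status
```

Since deletion only restarts the daemon, this restores the mgr but does not address whatever caused it to drop out of the cluster in the first place.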