Bug 2266583
| Summary: | Number failure domain value is hardcoded in CephMonLowNumber alert | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Joy John Pinto <jopinto> |
| Component: | ocs-operator | Assignee: | Nikhil Ladha <nladha> |
| Status: | CLOSED ERRATA | QA Contact: | Joy John Pinto <jopinto> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.15 | CC: | branto, muagarwa, odf-bz-bot |
| Target Milestone: | --- | ||
| Target Release: | ODF 4.15.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | 4.15.0-155 | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2024-03-19 15:33:15 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Verified with OCP 4.15 and ODF 4.15.0-157. The alert text is changed to 'The number of failure zones available allow to increase the number of Ceph monitors from 3 to 5 in order to improve cluster resilience.'. Please refer to monlow_alert_verified.png.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383
Description of problem (please be as detailed as possible and provide log snippets):
The number of failure domains is hardcoded in the CephMonLowNumber alert.

Version of all relevant components (if applicable):
OCP 4.15
ODF 4.15.0-150

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?
NA

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:
NA

Steps to Reproduce:
1. Install a cluster with 6 worker nodes and label the worker nodes in 6 different racks.
2. When more than five (say, six) failure domains are present, wait for the CephMonLowNumber alert.
3. Inspect the alert message: 'The number of zone failure domains available (5) allow to increase the ceph monitors from 3 to 5 in order to improve cluster resilience'

Actual results:
The alert has a hardcoded value for the number of failure domains available: 'The number of zone failure domains available (5) allow to increase the ceph monitors from 3 to 5 in order to improve cluster resilience'

Expected results:
The failure domain value in 'The number of zone failure domains available (5) allow to increase the ceph monitors from 3 to 5 in order to improve cluster resilience' should not be hardcoded.

Additional info:
Please refer to mon_low_no_alert.jpg.

Failure domains available:

[jopinto@jopinto ceph-csi]$ oc get storagecluster -o jsonpath='{.items[*].status.failureDomainValues}' -n openshift-storage | tr ',' '\n' | sort -u | wc -l
6
[jopinto@jopinto ceph-csi]$
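To make the mismatch concrete, here is a minimal Python sketch of the check behind the `oc` query above. It assumes the jsonpath expression prints the failure domain values as a single JSON array; the rack names are made-up sample data, and `count_failure_domains` and `mon_low_message` are hypothetical helpers, not part of ocs-operator. The fix that was verified removed the count from the alert text rather than interpolating it, so this only illustrates the behaviour the report expected.

```python
import json


def count_failure_domains(raw: str) -> int:
    """Count distinct failure domain values in a JSON array string.

    Assumption: `raw` looks like '["rack0","rack1"]', as a jsonpath query
    on .status.failureDomainValues might print it.
    """
    values = json.loads(raw)
    return len(set(values))


def mon_low_message(domain_count: int, current_mons: int = 3, target_mons: int = 5) -> str:
    """Hypothetical message builder that uses the measured count
    instead of a hardcoded '(5)'."""
    return (
        f"The number of zone failure domains available ({domain_count}) "
        f"allow to increase the ceph monitors from {current_mons} to "
        f"{target_mons} in order to improve cluster resilience"
    )


# Made-up sample output for a cluster labelled with six racks.
sample = '["rack0","rack1","rack2","rack3","rack4","rack5"]'
domains = count_failure_domains(sample)
print(mon_low_message(domains))
```

With six racks, the count in the generated message matches the count reported by the `oc` query, whereas the original alert always showed 5.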