Bug 2266583 - Number failure domain value is hardcoded in CephMonLowNumber alert
Summary: Number failure domain value is hardcoded in CephMonLowNumber alert
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ODF 4.15.0
Assignee: Nikhil Ladha
QA Contact: Joy John Pinto
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-02-28 12:12 UTC by Joy John Pinto
Modified: 2024-03-19 15:33 UTC (History)

Fixed In Version: 4.15.0-155
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-03-19 15:33:15 UTC
Embargoed:


Links
Github red-hat-storage ocs-operator pull 2490 (Private: 0, Priority: None, Status: open): Bug 2266583: [release-4.15] Update CephMonLowNumber alert description (Last Updated: 2024-03-04 06:10:10 UTC)
Red Hat Product Errata RHSA-2024:1383 (Private: 0, Priority: None, Status: None, Summary: None) (Last Updated: 2024-03-19 15:33:17 UTC)

Description Joy John Pinto 2024-02-28 12:12:47 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
The number of failure domains is hardcoded in the CephMonLowNumber alert message.

Version of all relevant components (if applicable):
OCP 4.15
ODF 4.15.0-150

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
NA

Is there any workaround available to the best of your knowledge?
NA

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:
NA

Steps to Reproduce:
1. Install a cluster with 6 worker nodes and label the worker nodes so that they are in 6 different racks.
2. With more than five failure domains present (six, for example), wait for the CephMonLowNumber alert.
3. Inspect the alert message. It reads: 'The number of zone failure domains available  (5) allow to increase the ceph monitors from 3 to 5 in order to improve cluster resilience'


Actual results:
The alert message contains a hardcoded value for the number of failure domains available. With six failure domains present it still reads: 'The number of zone failure domains available  (5) allow to increase the ceph monitors from 3 to 5 in order to improve cluster resilience'

Expected results:
The failure domain value in the alert message should not be hardcoded: 'The number of zone failure domains available  (5) allow to increase the ceph monitors from 3 to 5 in order to improve cluster resilience'
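
The mismatch between the actual and expected results can be checked mechanically. The following is only a minimal sketch: `reported_domains` is a hypothetical helper (not part of ODF or `oc`), the message is copied from this report, and the actual count of 6 is the value observed on the test cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number found in parentheses in an alert message.
# Prints nothing if the message contains no parenthesized number.
reported_domains() {
  printf '%s\n' "$1" | sed -n 's/.*(\([0-9][0-9]*\)).*/\1/p'
}

# Alert text as quoted in this report.
message='The number of zone failure domains available  (5) allow to increase the ceph monitors from 3 to 5 in order to improve cluster resilience'
actual=6   # failure domains observed on the cluster in this report

reported="$(reported_domains "$message")"
if [ "$reported" != "$actual" ]; then
  echo "mismatch: alert reports $reported failure domains, cluster has $actual"
fi
```

With the values from this report the check prints a mismatch line, because the message says 5 while the cluster has 6 failure domains.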

Additional info:
Please refer to mon_low_no_alert.jpg

Failure domains available:
[jopinto@jopinto ceph-csi]$ oc get storagecluster -o jsonpath='{.items[*].status.failureDomainValues}' -n openshift-storage | tr ',' '\n' | sort -u | wc -l
6
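
The counting step of the pipeline above can be tried without a cluster. This is a minimal sketch under two assumptions: `count_failure_domains` is a hypothetical helper name, and the jsonpath query is assumed to print a JSON-style array such as `["rack0","rack1"]` (the sample rack names are made up for illustration). Unlike the one-liner above, it strips brackets and quotes before counting, so duplicate values across several arrays are counted only once.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count distinct failure domain values read from stdin.
# Assumed input: one or more JSON-style arrays, e.g. ["rack0","rack1"],
# separated by whitespace if there is more than one.
count_failure_domains() {
  sed 's/[]["]//g; s/[[:space:]]\{1,\}/,/g' \
    | tr ',' '\n' \
    | sed '/^$/d' \
    | sort -u \
    | wc -l \
    | tr -d ' '
}

# Sample input standing in for the real `oc get storagecluster ...` output.
sample='["rack0","rack1","rack2","rack3","rack4","rack5"]'
printf '%s\n' "$sample" | count_failure_domains
```

On a live cluster, the output of the `oc get storagecluster` command shown above could be piped into `count_failure_domains` in place of the sample string.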

Comment 9 Joy John Pinto 2024-03-08 04:30:49 UTC
Verified with OCP 4.15 and ODF 4.15.0-157

The alert text has been changed to 'The number of failure zones available allow to increase the number of Ceph monitors from 3 to 5 in order to improve cluster resilience.' and no longer contains a failure domain count. Please refer to monlow_alert_verified.png

Comment 11 errata-xmlrpc 2024-03-19 15:33:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383

