Created attachment 1950560 [details]
node warning symbol exists, no alert description

Description of problem (please be as detailed as possible and provide log snippets):
When a pod has warnings, the same warnings are expected to be presented at the node level, on the node where the pod is deployed. rook-ceph-nfs-ocs-storagecluster-cephfs-a has 3 warnings listed in the alerts sidebar, but the node does not show them.

Version of all relevant components (if applicable):

OC version:
Client Version: 4.12.0-202208031327
Kustomize Version: v4.5.4
Server Version: 4.13.0-0.nightly-2023-03-11-033820
Kubernetes Version: v1.26.2+bc894ae

OCS version:
ocs-operator.v4.13.0-98.stable   OpenShift Container Storage   4.13.0-98.stable   Succeeded

Cluster version:
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.0-0.nightly-2023-03-11-033820   True        False         22h     Cluster version is 4.13.0-0.nightly-2023-03-11-033820

Rook version:
rook: v4.13.0-0.78a2f4d47d1565993575ab9d2130d543eb1f27a4
go: go1.19.2

Ceph version:
ceph version 17.2.5-69.el9cp (b7b25cbd1fb79976b1ec7eda3fa2e6fcc48246d6) quincy (stable)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?
Use the dashboard to see alerts.

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
2

Is this issue reproducible?

Can this issue be reproduced from the UI?
yes

If this is a regression, please provide more details to justify this:
Not a regression. Not supported now.

Steps to Reproduce:
1. Deploy a cluster with ipi_6az_rhcos_3m_6w.yaml and trigger a warning on one of the pods.

Actual results:
The node doesn't show the same warning as a pod deployed on that node.

Expected results:
The node should show the same warning as a pod deployed on that node.

Additional info:
The issue was discussed with the dev team and the assignee of the feature before opening.

must-gather logs: https://drive.google.com/drive/folders/1hUxWoh1QO_-zlY92IXsQf2ik3Vs6pS6W?usp=share_link
@badhikar I need to add to this BZ that the cluster itself should present alerts gathered from the nodes.
I see that the node now presents alerts from the hosted pods, but the cluster does not present the alerts from the nodes.

ODF
Server Version: 4.13.0-0.nightly-2023-03-17-161027
Kubernetes Version: v1.26.2+06e8c46
(In reply to Daniel Osypenko from comment #2)
> @badhikar I need to add to this BZ, that the cluster itself
> should present Alerts gathered from the nodes.
> I see that now the node presents alerts from the hosted pods, but the cluster
> does not present the alerts from the nodes.
>
> ODF
> Server Version: 4.13.0-0.nightly-2023-03-17-161027
> Kubernetes Version: v1.26.2+06e8c46

After going through the code and trying multiple iterations, I found that it is not possible to show the Node's alerts at the OCS level in a way that still makes sense. Once the node degrades to the point that it affects the storage cluster, an alert should be generated on the OCS component, which in turn would be shown at the Storage Cluster group level. Just combining node alerts at the Cluster Group level can be confusing.

How can it be confusing?
==> We show the Cluster in a warning state because one of the Nodes is in warning due to some alert. When the user clicks the Storage Cluster group and opens the sidebar to see what's wrong, they will not see any Alerts. This behavior can be more confusing than helpful.

WDYT?
(In reply to Bipul Adhikari from comment #5)
> After going through the code and trying multiple iterations, it is not
> possible to show alerts of the Node at the OCS level and still make some
> sort of sense. After the node degrades and it finally affects the storage
> cluster then an alert should be generated on the OCS component which in turn
> would be shown on the Storage Cluster group level. Just combining node
> alerts on the Cluster Group level can be confusing. How can it be confusing?
> ==> We show the Cluster in a warning state because one of the Nodes is in
> warning due to some alert. When the user clicks the Storage Cluster group
> and opens the sidebar to see what's wrong he will not see any Alerts. This
> behavior can be more confusing than helpful. WDYT?

If the question is whether we want to show the node warnings whenever the cluster is in a warning state, I think the logic should comply with the rest of the topology states and transitions: show every underlying warning.
If alert generation for the Storage Cluster group is a separate process that may not happen, or may have a significant delay, we need to notify the user about that whenever they select the Storage Cluster group, and show no warnings until they are generated.
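To make the "show every underlying warning" rule concrete, here is a minimal sketch of one way such a roll-up could work. It is not the ODF console implementation: `TopologyNode`, `rolled_up_alerts`, `has_warning`, the node name `worker-0`, the group name, and the alert names are all hypothetical and used only for illustration. Only the pod name comes from this report.

```python
from dataclasses import dataclass, field


@dataclass
class TopologyNode:
    """Hypothetical topology element: a pod, a node, or the Storage Cluster group."""
    name: str
    own_alerts: list[str] = field(default_factory=list)
    children: list["TopologyNode"] = field(default_factory=list)


def rolled_up_alerts(element: TopologyNode) -> list[tuple[str, str]]:
    """Return (source name, alert) pairs for the element and everything below it."""
    collected = [(element.name, alert) for alert in element.own_alerts]
    for child in element.children:
        collected.extend(rolled_up_alerts(child))
    return collected


def has_warning(element: TopologyNode) -> bool:
    """The warning symbol is derived from the same list the sidebar would display."""
    return bool(rolled_up_alerts(element))


# Illustrative topology; the alert names are placeholders, not real alert names.
pod = TopologyNode(
    "rook-ceph-nfs-ocs-storagecluster-cephfs-a",
    own_alerts=["ExampleAlertA", "ExampleAlertB", "ExampleAlertC"],
)
node = TopologyNode("worker-0", children=[pod])
cluster = TopologyNode("storage-cluster-group", children=[node])

for source, alert in rolled_up_alerts(cluster):
    print(f"{source}: {alert}")
```

Because the warning symbol and the sidebar list come from the same function in this sketch, an element cannot end up in a warning state with an empty alert list, which is the confusing case described in the quoted comment.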
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.0 enhancement and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:3742