Bug 2260344 - There is a pending alert CephClusterWarningState for a brief time with a timestamp of firing alert when firing alert appears
Summary: There is a pending alert CephClusterWarningState for a brief time with a timestamp of firing alert when firing alert appears
Keywords:
Status: ASSIGNED
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ceph-monitoring
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: arun kumar mohan
QA Contact: Harish NV Rao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2024-01-25 13:57 UTC by Filip Balák
Modified: 2024-08-30 10:28 UTC
CC: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Embargoed:



Description Filip Balák 2024-01-25 13:57:03 UTC
Description of problem (please be detailed as possible and provide log
snippets):
During testing of the CephClusterWarningState alert (in a stop-one-OSD scenario), the following behaviour is observed:
 - the alert is raised correctly and enters the `pending` state
 - when the alert should transition to the `firing` state, an extra `pending` alert appears with a changed activeAt timestamp
 - a `firing` alert with the correct timestamp is also present

Version of all relevant components (if applicable):
ocs 4.13.6-1

Is this issue reproducible?
Yes, reproduced 2/2 times:
https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster-prod/10153/testReport/junit/tests.manage.monitoring.prometheus/test_deployment_status/test_ceph_osd_stopped/


Steps to Reproduce:
1. Downscale an OSD deployment
2. Monitor incoming alerts

Actual results:
Collected alerts:

{'labels': {'alertname': 'CephClusterWarningState', 'container': 'mgr', 'endpoint': 'http-metrics', 'instance': '172.17.174.59:9283', 'job': 'rook-ceph-mgr', 'managedBy': 'ocs-storagecluster', 'namespace': 'openshift-storage', 'pod': 'rook-ceph-mgr-a-79989c4657-w48qm', 'service': 'rook-ceph-mgr', 'severity': 'warning'}, 'annotations': {'description': 'Storage cluster is in warning state for more than 15m.', 'message': 'Storage cluster is in degraded state', 'severity_level': 'warning', 'storage_type': 'ceph'}, 'state': 'pending', 'activeAt': '2023-12-22T20:58:22.229278715Z', 'value': '1e+00'}, 

{'labels': {'alertname': 'CephClusterWarningState', 'container': 'mgr', 'endpoint': 'http-metrics', 'instance': '172.17.174.59:9283', 'job': 'rook-ceph-mgr', 'managedBy': 'ocs-storagecluster', 'namespace': 'openshift-storage', 'pod': 'rook-ceph-mgr-a-79989c4657-w48qm', 'service': 'rook-ceph-mgr', 'severity': 'warning'}, 'annotations': {'description': 'Storage cluster is in warning state for more than 15m.', 'message': 'Storage cluster is in degraded state', 'severity_level': 'warning', 'storage_type': 'ceph'}, 'state': 'pending', 'activeAt': '2023-12-22T21:27:52.229278715Z', 'value': '1e+00'}, 

{'labels': {'alertname': 'CephClusterWarningState', 'container': 'mgr', 'endpoint': 'http-metrics', 'instance': '172.17.174.59:9283', 'job': 'rook-ceph-mgr', 'managedBy': 'ocs-storagecluster', 'namespace': 'openshift-storage', 'pod': 'rook-ceph-mgr-a-79989c4657-w48qm', 'service': 'rook-ceph-mgr', 'severity': 'warning'}, 'annotations': {'description': 'Storage cluster is in warning state for more than 15m.', 'message': 'Storage cluster is in degraded state', 'severity_level': 'warning', 'storage_type': 'ceph'}, 'state': 'firing', 'activeAt': '2023-12-22T21:27:52.229278715Z', 'value': '1e+00'}

Expected results:
Only 2 alerts should be collected: 1 `pending` and 1 `firing`.
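The expected invariant can be expressed as a small check over the collected alert dicts (a minimal sketch; `alerts_look_correct` is a hypothetical helper, and the field names match the JSON dumps above):

```python
from collections import Counter

def alerts_look_correct(alerts):
    """Return True when the collected sequence contains exactly one
    'pending' and one 'firing' entry for the alert, i.e. no duplicate
    pending alert with a shifted activeAt timestamp."""
    return Counter(a['state'] for a in alerts) == Counter(
        {'pending': 1, 'firing': 1}
    )

# Observed (buggy) sequence from this report: an extra 'pending' entry
# appears carrying the firing alert's activeAt timestamp.
observed = [
    {'state': 'pending', 'activeAt': '2023-12-22T20:58:22.229278715Z'},
    {'state': 'pending', 'activeAt': '2023-12-22T21:27:52.229278715Z'},
    {'state': 'firing',  'activeAt': '2023-12-22T21:27:52.229278715Z'},
]
print(alerts_look_correct(observed))  # False
```

Dropping the duplicate pending entry (`observed[1:]`) yields the expected one-pending/one-firing sequence, for which the check returns True.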

Additional info:

Comment 2 arun kumar mohan 2024-05-08 14:08:22 UTC
Moving this out of 4.16, not a blocker...

