Bug 2009397

Summary: Alert CephMonQuorumAtRisk is not propagated to PagerDuty
Product: [Red Hat Storage] Red Hat OpenShift Container Storage Reporter: Filip Balák <fbalak>
Component: odf-managed-service Assignee: Dhruv Bindra <dbindra>
Status: CLOSED CURRENTRELEASE QA Contact: Filip Balák <fbalak>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.8 CC: aeyal, dbindra, ocs-bugs, omitrani, sabose
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-12-16 19:50:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Filip Balák 2021-09-30 14:27:54 UTC
Description of problem:
When the CephMonQuorumAtRisk alert is triggered in Prometheus, it is not propagated to PagerDuty.

Version-Release number of selected component (if applicable):
ocs-operator.v4.8.1
ocs-osd-deployer-qe.v1.1.0

How reproducible:
1/1

Steps to Reproduce:
1. Drain all nodes in one rack that hosts a Ceph monitor. Make sure that the monitor is not rescheduled elsewhere and that the number of Ceph monitors is even.
2. Check in Prometheus that the CephMonQuorumAtRisk alert is in the Pending state.
3. Wait 15 minutes.
4. Check that the alert is propagated to PagerDuty.
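
Step 1 could be scripted roughly as follows. This is only a sketch under assumptions that are not stated in this report: that rack membership is exposed through the node label `topology.rook.io/rack`, that the rack name is passed in by the caller, and that the drain flags shown are acceptable for the cluster. The function names `drain_rack` and `uncordon_rack` are hypothetical helpers.

```shell
#!/bin/sh
# Sketch only: the label key and the drain flags are assumptions, not taken
# from this bug report. Adjust them to the cluster under test.

# Drain every node carrying the given rack label value.
drain_rack() {
  rack="$1"
  for node in $(oc get nodes -l "topology.rook.io/rack=${rack}" -o name); do
    # "-o name" prints "node/<name>"; strip the prefix before draining.
    oc adm drain "${node#node/}" --ignore-daemonsets --delete-emptydir-data
  done
}

# Make the same nodes schedulable again once the test is finished.
uncordon_rack() {
  rack="$1"
  for node in $(oc get nodes -l "topology.rook.io/rack=${rack}" -o name); do
    oc adm uncordon "${node#node/}"
  done
}

# Example (rack name is a placeholder):
#   drain_rack rack0
#   oc get pods -n openshift-storage -o wide   # confirm the monitor did not move
#   uncordon_rack rack0
```

Whether the monitor stays unscheduled depends on the cluster topology, so the pod listing in the example is still worth checking by hand.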

Actual results:
Alert is not propagated into PagerDuty.

Expected results:
Alert is propagated into PagerDuty.

Additional info:
To check Prometheus, the user needs to forward a port:
 $ oc port-forward svc/prometheus-operated 9090 -n openshift-storage
Then the user can open http://localhost:9090/alerts in a browser and see the managed alerts.
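
The alert state can also be read from the command line instead of the browser. The sketch below assumes the port-forward above is still running on localhost:9090 and queries the Prometheus HTTP API for the built-in `ALERTS` series. The function name `alert_state` is a hypothetical helper, and the grep-based extraction assumes the compact JSON that Prometheus normally returns; a proper JSON parser would be more robust.

```shell
#!/bin/sh
# Sketch only: prints the state ("pending" or "firing") of each active
# instance of the named alert, one per line. Prints nothing if the alert
# is not active.
alert_state() {
  alertname="$1"
  curl -s -G 'http://localhost:9090/api/v1/query' \
    --data-urlencode "query=ALERTS{alertname=\"${alertname}\"}" |
    grep -o '"alertstate":"[a-z]*"' |
    cut -d'"' -f4
}

# Example:
#   alert_state CephMonQuorumAtRisk
```

This only shows what Prometheus reports; whether the alert reaches PagerDuty still has to be checked on the PagerDuty side.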

Comment 2 Filip Balák 2021-12-14 15:12:53 UTC
The alert is propagated correctly, and when the nodes are uncordoned again, the alert is cleared correctly. --> VERIFIED

Tested with:
ocs-operator.v4.8.5
ocs-osd-deployer-qe.v1.1.2
ocp 4.9.9