Bug 1932620

Summary: Alerts during a test run should fail the test job, but did not
Product: OpenShift Container Platform Reporter: Clayton Coleman <ccoleman>
Component: Bare Metal Hardware Provisioning Assignee: sdasu
Bare Metal Hardware Provisioning sub component: cluster-baremetal-operator QA Contact: Amit Ugol <augol>
Status: CLOSED CURRENTRELEASE Docs Contact:
Severity: high    
Priority: high CC: anpicker, bfournie, erooth, juzhao, surbania, tsedovic, wking
Version: 4.8 Keywords: Triaged
Target Milestone: ---   
Target Release: 4.6.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1932619 Environment:
Last Closed: 2022-06-13 15:22:43 UTC Type: ---
Bug Depends On: 1932618, 1932619    
Bug Blocks:    

Description Clayton Coleman 2021-02-24 19:13:17 UTC
+++ This bug was initially created as a clone of Bug #1932619 +++

+++ This bug was initially created as a clone of Bug #1932618 +++

https://github.com/openshift/cluster-baremetal-operator/pull/110 merged even though the alerts ClusterOperatorBaremetalDown and TargetDown were firing during its test run.

This was the passing run: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-baremetal-operator/110/pull-ci-openshift-cluster-baremetal-operator-master-e2e-agnostic/1364398641615212544

The query in the alert test "shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured" is subtly wrong.

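For 4.6, 4.7, and 4.8, we can remove the broken filter clause because both KubeAPILatencyHigh and KubePodCrashLooping in the kcm namespace were fixed. In the future we must use "unless X" instead of joining with a "-" because of the way the series match: "-" only returns series whose labels are present on both sides, so firing alerts with no counterpart in the filter clause are silently dropped.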
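
The difference between the two operators can be illustrated with a small Python model. This is only a sketch of PromQL's one-to-one vector matching, not the real engine and not the actual query used by the test; the helper names (series, subtract, unless) and the sample values are hypothetical.

```python
# Hypothetical model of PromQL one-to-one vector matching (not the real engine).
# An instant vector is a dict mapping a label set to a sample value.


def series(**labels):
    """Build a hashable label set identifying one series."""
    return frozenset(labels.items())


def subtract(lhs, rhs):
    """Model `lhs - rhs`: only label sets present on BOTH sides survive."""
    return {k: v - rhs[k] for k, v in lhs.items() if k in rhs}


def unless(lhs, rhs):
    """Model `lhs unless rhs`: keep lhs series that have no match in rhs."""
    return {k: v for k, v in lhs.items() if k not in rhs}


# Assumed sample data: two alerts fired; the tolerated alerts did not.
firing = {
    series(alertname="ClusterOperatorBaremetalDown"): 120,
    series(alertname="TargetDown"): 45,
}
excluded = {}

print(subtract(firing, excluded))  # empty: the firing alerts vanish
print(unless(firing, excluded))    # both firing alerts are still reported
```

In this model, subtracting an exclusion vector that happens to be empty hides every firing alert, which would let a run pass, while "unless" removes only the series that are explicitly excluded.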

After this fix we will be correctly enforcing "no alerts may fire during a CI test run".

Comment 7 Tomas Sedovic 2022-06-13 15:22:43 UTC
This has been fixed in OpenShift 4.8. I'm closing this BZ; if a backport to 4.6 is requested, please reopen.