Bug 1790825
Summary: | [ci] Flaky test: Prometheus when installed on the cluster shouldn't report any alerts in firing state: FailingOperator etcdoperator.v0.9.4 | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Martin André <m.andre> | |
Component: | OLM | Assignee: | Evan Cordell <ecordell> | |
OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | urgent | |||
Priority: | urgent | CC: | bparees, ccoleman, dcbw, jerzhang, jiazha, nhale, wking | |
Version: | 4.4 | |||
Target Milestone: | --- | |||
Target Release: | 4.5.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1805588 | Environment: | ||
Last Closed: | 2020-07-13 17:13:17 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1805588, 1822232 |
Description
Martin André
2020-01-14 10:36:29 UTC
*** Bug 1796739 has been marked as a duplicate of this bug. ***

*** Bug 1801907 has been marked as a duplicate of this bug. ***

This bug has been identified by our buildcops as a significant blocker for our merge queue. Please ensure the fix is merged asap or provide updates here as to what progress is being made.

If the etcd operator is unreliable, switch to a simpler operator that won't be so flaky.

Likely this test is still flaking (due to other alerts). If someone can confirm that the OLM alert is no longer firing in recent CI jobs, we can self-verify. Nick can you do that? QE isn't generally in a good position to verify fixes to CI flakes.

https://search.svc.ci.openshift.org/chart?search=%22olm-operator-metrics%22%2C%22name%22%3A%22etcdoperator.v0.9.4%22&maxAge=336h&context=2&type=all

This shows only 2 instances of the issue over the last 14 days, down from a relatively high % of runs. The instances I looked at seemed to have underlying cluster etcd issues, which could manifest as stale cache in OLM (and thus see this issue).

I picked one of the issues to use in the bug title, so it's more specific than just the generic alert unit name.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409
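For anyone re-running the verification, the CI search link quoted in the comments is easier to read once its query string is decoded. The sketch below uses only Python's standard library to unpack that URL. The helper name `decode_search` is hypothetical, and reading `maxAge=336h` as a number of hours is an assumption, though it matches the "last 14 days" mentioned in the comment.

```python
from urllib.parse import urlsplit, parse_qs

# The search URL exactly as quoted in the bug comments.
SEARCH_URL = (
    "https://search.svc.ci.openshift.org/chart"
    "?search=%22olm-operator-metrics%22%2C%22name%22%3A%22etcdoperator.v0.9.4%22"
    "&maxAge=336h&context=2&type=all"
)


def decode_search(url):
    """Return the query parameters of a CI search URL as a flat dict.

    Hypothetical helper for illustration; it is not part of any CI tooling.
    """
    params = parse_qs(urlsplit(url).query)  # percent-decoding happens here
    flat = {key: values[0] for key, values in params.items()}
    # Assumption: maxAge is a count of hours with an "h" suffix.
    hours = int(flat["maxAge"].rstrip("h"))
    flat["maxAgeDays"] = hours / 24
    return flat


decoded = decode_search(SEARCH_URL)
for key, value in decoded.items():
    print(f"{key}: {value}")
```

The decoded `search` parameter is the literal string the query matches in job output: the `olm-operator-metrics` label followed by the `name` of the failing CSV, `etcdoperator.v0.9.4`.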