Bug 1872874 - [sig-instrumentation] Prometheus when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Early]
Summary: [sig-instrumentation] Prometheus when installed on the cluster shouldn't repo...
Keywords:
Status: CLOSED DUPLICATE of bug 1886726
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Ben Bennett
QA Contact: Anurag saxena
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-26 19:08 UTC by Douglas Smith
Modified: 2020-12-04 18:03 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
[sig-instrumentation] Prometheus when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Early]
Last Closed: 2020-10-12 14:51:31 UTC
Target Upstream Version:
Embargoed:


Description Douglas Smith 2020-08-26 19:08:43 UTC
test:
[sig-instrumentation] Prometheus when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Early] 

is failing frequently in CI; see the search results:
https://search.ci.openshift.org/?maxAge=168h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job&search=%5C%5Bsig-instrumentation%5C%5D+Prometheus+when+installed+on+the+cluster+shouldn%27t+report+any+alerts+in+firing+state+apart+from+Watchdog+and+AlertmanagerReceiversNotConfigured+%5C%5BEarly%5C%5D


https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-openstack-4.5/1298666591172431872

-----

[sig-instrumentation] Prometheus when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Early] [Suite:openshift/conformance/parallel] 1m38s
fail [github.com/openshift/origin/test/extended/util/prometheus/helpers.go:174]: Expected
    <map[string]error | len:1>: {
        "ALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|PrometheusRemoteWriteDesiredShards\",alertstate=\"firing\",severity!=\"info\"} >= 1": {
            s: "promQL query: ALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|PrometheusRemoteWriteDesiredShards\",alertstate=\"firing\",severity!=\"info\"} >= 1 had reported incorrect results:\n[{\"metric\":{\"__name__\":\"ALERTS\",\"alertname\":\"KubeAPIDown\",\"alertstate\":\"firing\",\"severity\":\"critical\"},\"value\":[1598466340.548,\"1\"]},{\"metric\":{\"__name__\":\"ALERTS\",\"alertname\":\"KubeControllerManagerDown\",\"alertstate\":\"firing\",\"severity\":\"critical\"},\"value\":[1598466340.548,\"1\"]},{\"metric\":{\"__name__\":\"ALERTS\",\"alertname\":\"KubeSchedulerDown\",\"alertstate\":\"firing\",\"severity\":\"critical\"},\"value\":[1598466340.548,\"1\"]}]",
        },
    }
to be empty
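
For triage, it can help to pull the alert names and severities out of the result payload in the failure message. What follows is a minimal sketch in Python, assuming the payload is the JSON array shown above with the Go string escaping removed. `firing_alerts` is a hypothetical helper written for illustration; it is not part of openshift/origin.

```python
import json

# Result payload copied from the failure message above, with the Go string
# escaping removed (assumption: the unescaped text is valid JSON).
RESULT = """
[
  {"metric": {"__name__": "ALERTS", "alertname": "KubeAPIDown",
              "alertstate": "firing", "severity": "critical"},
   "value": [1598466340.548, "1"]},
  {"metric": {"__name__": "ALERTS", "alertname": "KubeControllerManagerDown",
              "alertstate": "firing", "severity": "critical"},
   "value": [1598466340.548, "1"]},
  {"metric": {"__name__": "ALERTS", "alertname": "KubeSchedulerDown",
              "alertstate": "firing", "severity": "critical"},
   "value": [1598466340.548, "1"]}
]
"""


def firing_alerts(payload):
    """Return sorted (alertname, severity) pairs from an ALERTS result vector.

    Hypothetical helper: it only reads the "metric" labels of each sample.
    """
    samples = json.loads(payload)
    pairs = {(s["metric"]["alertname"], s["metric"]["severity"]) for s in samples}
    return sorted(pairs)


if __name__ == "__main__":
    for name, severity in firing_alerts(RESULT):
        print(f"{severity}: {name}")
```

All three alerts in this run are critical and refer to control plane components.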

Comment 1 Sergiusz Urbaniak 2020-08-27 07:43:08 UTC
KubeAPIDown and KubeControllerManagerDown indicate issues with the control plane, hence reassigning to kube-apiserver.

Comment 2 Stefan Schimanski 2020-08-27 10:37:56 UTC
This is not actionable. The query mixes many root causes that are already tracked elsewhere. Either give an analysis or point to some concrete issue (e.g. by platform, networking stack, component).

Comment 5 Mike Dame 2020-12-04 18:03:51 UTC
*** Bug 1891068 has been marked as a duplicate of this bug. ***