Bug 1821708

Summary: TargetDown alert reported in firing state in e2e-gcp-4.5 ci tests run
Product: OpenShift Container Platform Reporter: Sinny Kumari <skumari>
Component: Management ConsoleAssignee: bpeterse
Status: CLOSED DUPLICATE QA Contact: Yadan Pei <yapei>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.5CC: aos-bugs, jokerman, spadgett
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
test: [sig-instrumentation][Late] Alerts shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Suite:openshift/conformance/parallel]
Last Closed: 2020-04-07 13:02:32 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Sinny Kumari 2020-04-07 12:51:56 UTC
Seeing this in release-openshift-origin-installer-e2e-gcp-4.5 CI tests reported by https://testgrid.k8s.io/redhat-openshift-ocp-release-4.5-blocking#release-openshift-origin-installer-e2e-gcp-4.5&sort-by-flakiness=

"[sig-instrumentation][Late] Alerts shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Suite:openshift/conformance/parallel]"

Example of failing jobs: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-gcp-4.5/1260

Error message from one of the failing jobs:

fail [github.com/openshift/origin/test/extended/prometheus/prometheus_builds.go:167]: Expected
    <map[string]error | len:1>: {
        "count_over_time(ALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|KubeAPILatencyHigh\",alertstate=\"firing\",severity!=\"info\"}[2h]) >= 1": {
            s: "promQL query: count_over_time(ALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|KubeAPILatencyHigh\",alertstate=\"firing\",severity!=\"info\"}[2h]) >= 1 had reported incorrect results:\n[{\"metric\":{\"alertname\":\"TargetDown\",\"alertstate\":\"firing\",\"job\":\"metrics\",\"namespace\":\"openshift-console-operator\",\"service\":\"metrics\",\"severity\":\"warning\"},\"value\":[1586249135.461,\"25\"]}]",
        },
    }
to be empty

Comment 1 Samuel Padgett 2020-04-07 13:02:32 UTC

*** This bug has been marked as a duplicate of bug 1783881 ***