Bug 1738291 - Failing Test: Prometheus when installed on the cluster should report less than two alerts in firing or pending state
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.4.0
Assignee: bpeterse
QA Contact: Yadan Pei
URL:
Whiteboard: buildcop
Depends On:
Blocks:
Reported: 2019-08-06 17:16 UTC by Russell Teague
Modified: 2020-02-21 14:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-02-21 14:54:52 UTC
Target Upstream Version:
Embargoed:


Description Russell Teague 2019-08-06 17:16:57 UTC
Description of problem:
Test failure:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_openshift-ansible/11801/pull-ci-openshift-openshift-ansible-master-e2e-aws-scaleup-rhel7/836

This appears to happen consistently on RHEL 7 nodes that are being scaled up. RHEL 7 nodes should probably be ignored for this check.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Install 4.2 cluster
2. Scale up RHEL 7 nodes
3. Run tests

Actual results:


Expected results:


Additional info:

Comment 4 Russell Teague 2019-08-08 17:13:01 UTC
The failure was caused by a separate MCO issue that prevented nodes from joining the cluster. Once that issue was resolved, the alert was no longer raised.

Comment 5 Hongkai Liu 2019-11-21 17:15:45 UTC
Saw this in the CI test:
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-4.2/4933

Failing tests:
[Feature:Prometheus][Conformance] Prometheus when installed on the cluster should report less than two alerts in firing or pending state [Suite:openshift/conformance/parallel/minimal]

Reopening.

Comment 7 Lokesh Mandvekar 2019-11-25 14:14:52 UTC
Seeing this at https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-metal-4.2/209

[Feature:Prometheus][Conformance] Prometheus when installed on the cluster should report less than two alerts in firing or pending state [Suite:openshift/conformance/parallel/minimal] 9m48s
fail [github.com/openshift/origin/test/extended/prometheus/prometheus_builds.go:135]: Expected
    <map[string]error | len:1>: {
        "ALERTS{alertname!=\"Watchdog\",alertstate=\"firing\"} >= 1": {
            s: "promQL query: ALERTS{alertname!=\"Watchdog\",alertstate=\"firing\"} >= 1 had reported incorrect results: ALERTS{alertname=\"ClusterOperatorDegraded\", alertstate=\"firing\", condition=\"Degraded\", endpoint=\"metrics\", instance=\"147.75.69.131:9099\", job=\"cluster-version-operator\", name=\"ingress\", namespace=\"openshift-cluster-version\", pod=\"cluster-version-operator-57556d999d-n8wpf\", reason=\"IngressControllersDegraded\", service=\"cluster-version-operator\", severity=\"critical\"} => 1 @[1574685135.377]\nALERTS{alertname=\"ClusterOperatorDown\", alertstate=\"firing\", endpoint=\"metrics\", instance=\"147.75.69.131:9099\", job=\"cluster-version-operator\", name=\"ingress\", namespace=\"openshift-cluster-version\", pod=\"cluster-version-operator-57556d999d-n8wpf\", service=\"cluster-version-operator\", severity=\"critical\", version=\"4.2.0-0.nightly-2019-11-25-111442\"} => 1 @[1574685135.377]\nALERTS{alertname=\"KubeDeploymentReplicasMismatch\", alertstate=\"firing\", deployment=\"router-default\", endpoint=\"https-main\", instance=\"10.130.0.8:8443\", job=\"kube-state-metrics\", namespace=\"openshift-ingress\", pod=\"kube-state-metrics-5499974b5f-x95hx\", service=\"kube-state-metrics\", severity=\"critical\"} => 1 @[1574685135.377]\nALERTS{alertname=\"KubePodNotReady\", alertstate=\"firing\", namespace=\"openshift-ingress\", pod=\"router-default-5789d7d4c6-kkk8j\", severity=\"critical\"} => 1 @[1574685135.377]",
        },
    }
to be empty
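
The failure text above packs every firing alert into one escaped string, which is hard to read. As a triage aid, here is a minimal Python sketch that pulls the label sets out of such a message. It is not part of the origin test suite; the function name `parse_firing_alerts` and the `SAMPLE` string are hypothetical, and `SAMPLE` is an abbreviated copy of the output above with most labels dropped.

```python
import re

# One series in the error text looks like:
#   ALERTS{alertname="KubePodNotReady", alertstate="firing", ...} => 1 @[1574685135.377]
# Requiring "=>" after the closing brace skips the query itself, which is
# followed by ">= 1" instead.
SERIES_RE = re.compile(r'ALERTS\{([^}]*)\}\s*=>')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_firing_alerts(message):
    """Return one dict of labels per ALERTS series found in message."""
    # The Go test output escapes quotes and newlines; undo that first.
    text = message.replace('\\"', '"').replace('\\n', '\n')
    return [dict(LABEL_RE.findall(m.group(1))) for m in SERIES_RE.finditer(text)]


# Abbreviated sample based on the failure above (labels trimmed for brevity).
SAMPLE = (
    r'promQL query: ALERTS{alertname!=\"Watchdog\",alertstate=\"firing\"} >= 1 '
    r'had reported incorrect results: '
    r'ALERTS{alertname=\"ClusterOperatorDown\", alertstate=\"firing\", '
    r'name=\"ingress\", severity=\"critical\"} => 1 @[1574685135.377]\n'
    r'ALERTS{alertname=\"KubePodNotReady\", alertstate=\"firing\", '
    r'namespace=\"openshift-ingress\", severity=\"critical\"} => 1 @[1574685135.377]'
)

if __name__ == "__main__":
    for labels in parse_firing_alerts(SAMPLE):
        print(labels["alertname"], labels.get("namespace") or labels.get("name"))
```

Run against the full message, this would list the four alerts reported in this run (ClusterOperatorDegraded, ClusterOperatorDown, KubeDeploymentReplicasMismatch, KubePodNotReady), all of which point at the ingress operator and its router-default deployment.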

Comment 9 bpeterse 2020-02-21 14:54:52 UTC
Closing as this seems to be a flake that we no longer see.

