Description of problem:
HAProxyDown is not firing when all router pods (-n openshift-ingress) are down, or when all nodes on which router pods are scheduled are down.

Version-Release number of selected component (if applicable):
RHOCP 4.6

Actual results:
HAProxyDown is not firing.

Expected results:
HAProxyDown should be firing.
The HAProxyDown alert fires when haproxy is down, not when there are no openshift router pods running. We will fix the message so that it reports that "haproxy is down" to avoid confusion. ClusterOperatorDegraded and ClusterOperatorDown alerts should fire if no router pods are scheduled or running. For example: https://github.com/openshift/cluster-version-operator/blob/master/install/0000_90_cluster-version-operator_02_servicemonitor.yaml#L73-L88
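To illustrate why the alert stays silent when no router pods exist, here is a minimal Python sketch of how a PromQL comparison such as `haproxy_up == 0` behaves. It is a simplified model, not the Prometheus implementation: the comparison filters the series that are present, so if no pod exports `haproxy_up` at all, there is nothing to match and nothing fires. The function name and the pod name `router-default-a` are hypothetical and used only for illustration.

```python
from typing import Dict, List, Tuple

# One sample of the haproxy_up metric: (labels, value).
Sample = Tuple[Dict[str, str], float]


def haproxy_down(samples: List[Sample]) -> List[Dict[str, str]]:
    """Simplified model of `haproxy_up == 0`: keep only series whose value is 0."""
    return [labels for labels, value in samples if value == 0]


# Case 1: the router pod is running and scraped, but HAProxy inside it is down.
# The pod name is hypothetical.
process_down: List[Sample] = [
    ({"namespace": "openshift-ingress", "pod": "router-default-a"}, 0.0),
]

# Case 2: no router pods are running, so no haproxy_up series exist at all.
no_pods: List[Sample] = []

print(haproxy_down(process_down))  # one matching series, so the alert can fire
print(haproxy_down(no_pods))       # empty result, so the alert never fires
```

In the second case the expression returns an empty result, which is why the missing-pods situation has to be caught by other alerts such as ClusterOperatorDegraded or ClusterOperatorDown.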
I will work on this bug during the 4.8 bug fix phase.
Attempted to verify in 4.8.0-0.nightly-2021-04-21-084059. Pull #597 is listed in the release status for this build, but the Prometheus rule definition still uses the old description: "HAProxy metrics are reporting that the router is down". I suspect pull #597 is not in this build. Will wait for the next build to verify.
Verified https://github.com/openshift/cluster-ingress-operator/pull/597 in 4.8.0-0.nightly-2021-04-21-172405

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-04-21-172405   True        False         42m     Cluster version is 4.8.0-0.nightly-2021-04-21-172405

$ oc -n openshift-ingress-operator get PrometheusRule -oyaml
<--snip-->
    rules:
    - alert: HAProxyReloadFail
      annotations:
        message: HAProxy reloads are failing on {{ $labels.pod }}. Router is not respecting recently created or modified routes
      expr: template_router_reload_failure == 1
      for: 5m
      labels:
        severity: warning
    - alert: HAProxyDown
      annotations:
        message: HAProxy metrics are reporting that HAProxy is down on pod {{ $labels.namespace }} / {{ $labels.pod }}   <--verified https://github.com/openshift/cluster-ingress-operator/pull/597/
      expr: haproxy_up == 0
      for: 5m
      labels:
        severity: critical
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438