Bug 2070047
Summary: | Kuryr: Prometheus when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Maysa Macedo <mdemaced> |
Component: | Networking | Assignee: | Maysa Macedo <mdemaced> |
Networking sub component: | kuryr | QA Contact: | Itay Matza <imatza> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | imatza, mbooth, mdulko, prachaud |
Version: | 4.6 | Keywords: | Triaged |
Target Milestone: | --- | ||
Target Release: | 4.11.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-08-10 11:02:42 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2077384 |
Description
Maysa Macedo
2022-03-30 11:20:13 UTC
Verified with the following steps:

- Installed OCP 4.10.0-0.nightly-2022-04-27-212741 on top of RHOS-16.1-RHEL-8-20220329.n.1 with Kuryr.
- Make sure the cluster is up and the Watchdog and AlertmanagerReceiversNotConfigured alerts exist:

```
(shiftstack) [stack@undercloud-0 ~]$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname'
"Watchdog"
"NodeClockNotSynchronising"
"NodeClockNotSynchronising"
"NodeClockNotSynchronising"
"NodeClockNotSynchronising"
"NodeClockNotSynchronising"
"APIRemovedInNextEUSReleaseInUse"
"APIRemovedInNextEUSReleaseInUse"
"AlertmanagerReceiversNotConfigured"
```

- Upgraded successfully to 4.11.0-0.nightly-2022-04-26-181148 using the upgrade command:

```
$ oc adm upgrade --to-image="registry.ci.openshift.org/ocp/release:4.11.0-0.nightly-2022-04-26-181148" --allow-explicit-upgrade --force=true
```

- Make sure the cluster is up.
- Check the alerts: the Watchdog and AlertmanagerReceiversNotConfigured alerts exist, but KuryrCNISlow does not.

```
(shiftstack) [stack@undercloud-0 ~]$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname'
"NodeClockNotSynchronising"
"NodeClockNotSynchronising"
"NodeClockNotSynchronising"
"NodeClockNotSynchronising"
"NodeClockNotSynchronising"
"AlertmanagerReceiversNotConfigured"
"Watchdog"
```

- Keep checking the alerts and make sure KuryrCNISlow is not raised.
- Destroy the cluster and re-create it with OCP version 4.11.0-0.nightly-2022-04-26-181148.
- Keep checking the alerts and make sure KuryrCNISlow is not raised.
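The repeated "keep checking the alerts" step can be scripted. Below is a minimal sketch, assuming the JSON returned by the `/api/v1/alerts` call above has already been saved or fetched. The function names `alert_names` and `is_alert_present` and the trimmed `sample` payload are hypothetical, made up for illustration. The extraction mirrors the `jq` filter used in the steps.

```python
import json
from typing import List


def alert_names(payload: dict) -> List[str]:
    """Return the alertname label of every alert that has one, in response order.

    Mirrors: jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname'
    """
    alerts = payload.get("data", {}).get("alerts", [])
    return [
        alert["labels"]["alertname"]
        for alert in alerts
        if alert.get("labels", {}).get("alertname")
    ]


def is_alert_present(payload: dict, name: str) -> bool:
    """True if at least one alert in the payload carries the given alertname."""
    return name in alert_names(payload)


# Hypothetical, trimmed payload modelled on the post-upgrade output above;
# a real response carries more fields per alert.
sample = json.loads("""
{
  "status": "success",
  "data": {
    "alerts": [
      {"labels": {"alertname": "NodeClockNotSynchronising"}},
      {"labels": {"alertname": "AlertmanagerReceiversNotConfigured"}},
      {"labels": {"alertname": "Watchdog"}}
    ]
  }
}
""")

if __name__ == "__main__":
    for expected in ("Watchdog", "AlertmanagerReceiversNotConfigured"):
        print(expected, "present:", is_alert_present(sample, expected))
    print("KuryrCNISlow present:", is_alert_present(sample, "KuryrCNISlow"))
```

To run this against a live cluster, the `curl` output could be piped to a file and loaded with `json.load` in place of the sample payload.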
A similar issue is seen in version 4.8.45.

Description of problem:
The test "[sig-instrumentation] Prometheus when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Early] [Skipped:Disconnected] [Suite:openshift/conformance/parallel]" is failing consistently on the latest 4.8.45 build.

Version-Release number of selected component (if applicable):
[root@rdr-zscurst-348a-bastion-0 ~]# oc version
Client Version: 4.8.44
Server Version: 4.8.45
Kubernetes Version: v1.21.11+6b3cbdd

How reproducible:
Deploy the newly released 4.8.45 on the Power platform and run the e2e tests.

Actual results:
The test is failing.

Flaky invariants:
[sig-arch] Monitor cluster while tests execute

Failing tests:
[sig-instrumentation] Prometheus when installed on the cluster shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Early] [Skipped:Disconnected] [Suite:openshift/conformance/parallel]

Expected results:
The test should pass without any errors.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069