Bug 1977470
Summary: | Monitoring operator is in degraded state for ~64 sec during the API server rollout in SNO | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Naga Ravi Chaitanya Elluri <nelluri> |
Component: | Monitoring | Assignee: | Jan Fajerski <jfajersk> |
Status: | CLOSED DUPLICATE | QA Contact: | Junqi Zhao <juzhao> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 4.8 | CC: | alegrand, anpicker, aos-bugs, erooth, kakkoyun, nelluri, pkrupa, pnair, spasquie, wking |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | chaos | ||
Last Closed: | 2021-07-26 06:34:55 UTC | Type: | Bug |
Bug Blocks: | 1984730 |
Description
Naga Ravi Chaitanya Elluri
2021-06-29 20:23:01 UTC
Possibly a dup of bug 1949840? I'm not sure how broad a change bug 1949840 is aiming for.

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=96h&type=junit&search=clusteroperator/monitoring+should+not+change+condition' | grep 'single-node.*failures match' | grep -v 'pull-ci-\|rehearse-' | sort
periodic-ci-openshift-release-master-ci-4.8-e2e-aws-upgrade-single-node (all) - 4 runs, 100% failed, 50% of failures match = 50% impact
periodic-ci-openshift-release-master-ci-4.8-e2e-azure-upgrade-single-node (all) - 4 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.9-e2e-aws-upgrade-single-node (all) - 3 runs, 100% failed, 67% of failures match = 67% impact
periodic-ci-openshift-release-master-ci-4.9-e2e-azure-upgrade-single-node (all) - 4 runs, 75% failed, 133% of failures match = 100% impact
periodic-ci-openshift-release-master-nightly-4.8-e2e-aws-single-node (all) - 4 runs, 100% failed, 75% of failures match = 75% impact
periodic-ci-openshift-release-master-nightly-4.9-e2e-aws-single-node (all) - 4 runs, 100% failed, 75% of failures match = 75% impact

Yes, I think it's the same issue at the core. The fix for https://bugzilla.redhat.com/show_bug.cgi?id=1949840 should improve the SNO situation quite a bit, though I think there is still room for improvement. Let's keep this open for now to track the impact on SNO.

The fix for the related bug was merged last week (PR https://github.com/openshift/cluster-monitoring-operator/pull/1193). I'd be interested if and how this improves the situation for SNO.

Jan, we no longer see the monitoring operator in a degraded state during the API rollout in Single Node OpenShift, which has been tuned to last for around 60 seconds in the latest builds.

*** This bug has been marked as a duplicate of bug 1949840 ***
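To compare the per-job impact figures before and after a fix, the summary lines from the CI search output quoted in this bug could be parsed with a short script. This is only a minimal sketch: `LINE_RE`, `parse_impact`, and `SAMPLE` are illustrative names and not part of any CI tooling, and the line format is assumed from the output pasted above rather than from a documented schema.

```python
import re

# Assumed format, taken from the search output quoted in this bug:
#   <job> (all) - N runs, X% failed, Y% of failures match = Z% impact
LINE_RE = re.compile(
    r"^(?P<job>\S+) \(all\) - (?P<runs>\d+) runs, "
    r"(?P<failed>\d+)% failed, (?P<match>\d+)% of failures match = "
    r"(?P<impact>\d+)% impact$"
)


def parse_impact(lines):
    """Return {job name: impact percent} for lines matching the summary format.

    Lines that do not match (headers, blank lines, other text) are skipped.
    """
    result = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            result[m.group("job")] = int(m.group("impact"))
    return result


# Two lines copied from the output in this bug, used as sample input.
SAMPLE = [
    "periodic-ci-openshift-release-master-ci-4.8-e2e-aws-upgrade-single-node"
    " (all) - 4 runs, 100% failed, 50% of failures match = 50% impact",
    "periodic-ci-openshift-release-master-ci-4.9-e2e-azure-upgrade-single-node"
    " (all) - 4 runs, 75% failed, 133% of failures match = 100% impact",
]

if __name__ == "__main__":
    # Print impact first so the jobs are easy to scan by eye.
    for job, impact in sorted(parse_impact(SAMPLE).items()):
        print(f"{impact:3d}%  {job}")
```

Feeding the script the output of the same `w3m ... | sort` pipeline on different days would give a rough view of whether the single-node jobs still hit the condition change.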