Bug 1880337
Summary: | [release 4.6] cluster-monitoring-operator: Fix bug in reflector not recovering from "Too large resource version" | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Lukasz Szaszkiewicz <lszaszki> | |
Component: | Monitoring | Assignee: | Lili Cosic <lcosic> | |
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 4.5 | CC: | alegrand, anpicker, erooth, kakkoyun, lcosic, mloibl, pkrupa, spasquie, surbania | |
Target Milestone: | --- | Keywords: | Reopened | |
Target Release: | 4.6.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | No Doc Update | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1881043 1881072 1881079 1881107 1881109 1892585 | Environment: | |
Last Closed: | 2020-10-27 16:42:20 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 1880369 | |||
Bug Blocks: | 1881043, 1892585 |
Description
Lukasz Szaszkiewicz
2020-09-18 10:10:19 UTC
cluster-monitoring-operator 4.5 depends on k8s.io/client-go v0.17.1 [1] so it isn't affected by this issue. The same goes for prometheus-operator [2]. That being said, the 4.6 branches use k8s.io/client-go v0.18.3 and v0.18.2 [3][4] and they probably need to be fixed.

@Lukasz Should we open another BZ?

[1] https://github.com/openshift/cluster-monitoring-operator/blob/0c110b7edadad09182983e48013125a07284116d/go.mod#L37
[2] https://github.com/openshift/prometheus-operator/blob/99b893905d26d85d50d1178be195388e5c000322/go.mod#L42
[3] https://github.com/openshift/cluster-monitoring-operator/blob/922578d7d8a33f39b43b577e74c469b4374e90bd/go.mod#L31
[4] https://github.com/openshift/prometheus-operator/blob/52492b3b48ed1e4f851a78a51817e92404cf2767/go.mod#L36

(In reply to Simon Pasquier from comment #1)

The Kube API in 4.5 is affected. It can return an error that the operators must understand and recover from. Basically anything that uses an informer. For 4.5/4.6 you should bump at least to 1.18.6 (which has the fix).

Targeting this bug against 4.6.0. I'll create a clone for 4.5.z.
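For readers unfamiliar with the failure mode, here is a rough sketch of what "understand and recover from" could look like. It is not the client-go reflector code. It assumes the error can be recognised by the message text from the bug title, and that retrying the LIST with an empty resource version asks the server for the most recent state. The names `listFunc`, `fakeServer`, `isTooLargeResourceVersion` and `listWithFallback` are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// listFunc is a hypothetical stand-in for a LIST call against the API server.
type listFunc func(resourceVersion string) ([]string, error)

// isTooLargeResourceVersion matches on the message text only. This is an
// assumption of the sketch; a real client would inspect the structured
// status error returned by the server.
func isTooLargeResourceVersion(err error) bool {
	return err != nil && strings.Contains(err.Error(), "Too large resource version")
}

// listWithFallback lists at the last synced resource version and, if the
// server reports that version as too large, retries once with an empty
// resource version instead of failing with the same request forever.
func listWithFallback(list listFunc, lastSyncRV string) ([]string, error) {
	items, err := list(lastSyncRV)
	if isTooLargeResourceVersion(err) {
		return list("")
	}
	return items, err
}

// fakeServer simulates a server whose newest resource version is current.
func fakeServer(current int) listFunc {
	return func(rv string) ([]string, error) {
		if rv != "" {
			var requested int
			if _, err := fmt.Sscanf(rv, "%d", &requested); err != nil {
				return nil, err
			}
			if requested > current {
				return nil, fmt.Errorf("Too large resource version: %d, current: %d", requested, current)
			}
		}
		return []string{"pod-a", "pod-b"}, nil
	}
}

func main() {
	// The client remembers version 200, but the server is only at 100.
	items, err := listWithFallback(fakeServer(100), "200")
	fmt.Println(items, err)
}
```

Without the fallback branch, the same stale version would be sent on every relist, which matches the "not recovering" behaviour named in the summary. Operators that rely on informers get the actual fix by bumping the dependency rather than by adding logic like this themselves.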
Tested with 4.6.0-0.nightly-2020-09-24-184015: the fix is in the payload and I did not see the "Too large resource version" error.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196