Bug 1905599
Summary: | Errant change to lastupdatetime in copied CSV status can trigger runaway csv syncs | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Evan Cordell <ecordell> | |
Component: | OLM | Assignee: | Ben Luddy <bluddy> | |
OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | urgent | |||
Priority: | urgent | CC: | bluddy, dsover, krizza, mmohan, pneedle, rgregory, RUAIRIH5, shsaxena | |
Version: | 4.4 | |||
Target Milestone: | --- | |||
Target Release: | 4.7.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1906416 | Environment: | |
Last Closed: | 2021-02-24 15:41:14 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1906416 |
Description
Evan Cordell
2020-12-08 15:53:00 UTC
*** Bug 1905624 has been marked as a duplicate of this bug. ***

Cluster version is 4.7.0-0.nightly-2020-12-09-112139

[root@preserve-olm-env data]# oc -n openshift-operator-lifecycle-manager exec catalog-operator-5bff7985dc-bc764 -- olm --version
OLM version: 0.17.0
git commit: 2294bcc907c834c160c5b99fbf15988d0706853c

LGTM; verified as follows.

1. Subscribe to an operator with cluster scope, such as etcd.

[root@preserve-olm-env data]# oc get sub -A
NAMESPACE             NAME   PACKAGE   SOURCE                CHANNEL
openshift-operators   etcd   etcd      community-operators   clusterwide-alpha

[root@preserve-olm-env data]# oc get csv -n openshift-operators
NAME                              DISPLAY   VERSION             REPLACES                          PHASE
etcdoperator.v0.9.4-clusterwide   etcd      0.9.4-clusterwide   etcdoperator.v0.9.2-clusterwide   Succeeded

2. Create many namespaces.

3. Check whether the lastUpdateTime of the copied CSV is the same as that of the original CSV.

[root@preserve-olm-env data]# oc get csv -n jian4 etcdoperator.v0.9.4-clusterwide -o yaml
...
  - lastTransitionTime: "2020-12-09T08:59:52Z"
    lastUpdateTime: "2020-12-09T08:59:52Z"
    message: install strategy completed with no errors
    phase: Succeeded
    reason: InstallSucceeded

[root@preserve-olm-env data]# oc get csv -n openshift-operators etcdoperator.v0.9.4-clusterwide -o yaml
...
  - lastTransitionTime: "2020-12-09T08:59:52Z"
    lastUpdateTime: "2020-12-09T08:59:52Z"
    message: install strategy completed with no errors
    phase: Succeeded
    reason: InstallSucceeded

Hi ecordell,

I've tested the CSV update frequency issue and it still seems to be present in OCP 4.6.8. See the attached OCP 4.6.8 Cluster settings page:
https://bugzilla.redhat.com/attachment.cgi?id=1738461

Over 5 minutes there were 6549 PUT operations on etcd, and 3899 of those were to CSVs:

sh-4.4# cat etcd_watch.log | grep "Key" | wc -l
6549
sh-4.4# cat etcd_watch.log | grep "Key" | grep "clusterserviceversions" | wc -l
3899

In one namespace I can see the lastUpdateTime of the CSV still incrementing, so I suspect the fix is not in place:

[ruairi@localhost ibm-apicatalog]$ oc get csv ibm-apiconnect.v2.1.0 -o yaml | grep lastUpdateTime
    lastUpdateTime: "2020-12-11T16:08:55Z"
[ruairi@localhost ibm-apicatalog]$ oc get csv ibm-apiconnect.v2.1.0 -o yaml | grep lastUpdateTime
    lastUpdateTime: "2020-12-11T16:09:15Z"
[ruairi@localhost ibm-apicatalog]$ oc get csv ibm-apiconnect.v2.1.0 -o yaml | grep lastUpdateTime
    lastUpdateTime: "2020-12-11T16:10:13Z"
[ruairi@localhost ibm-apicatalog]$ oc get csv ibm-apiconnect.v2.1.0 -o yaml | grep lastUpdateTime
    lastUpdateTime: "2020-12-11T16:10:52Z"
[ruairi@localhost ibm-apicatalog]$ oc get csv ibm-apiconnect.v2.1.0 -o yaml | grep lastUpdateTime
    lastUpdateTime: "2020-12-11T16:11:28Z"

You can also see the memory rising in the attached memory metrics graph:
https://bugzilla.redhat.com/attachment.cgi?id=1738462

Can you confirm that the fix didn't make 4.6.8 and that it should be available in the next release?

Hi Ruairi, the 4.6 backport only merged a couple of hours ago due to a test infrastructure problem. The progress of that backport is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=1906416. At the moment, it's awaiting QE verification. Since the backport is also marked as urgent, it should be verified soon and I'd expect it to be present in the following z-release.

Hi bluddy,

Do you have a timeline for when the 4.6.9 release, which will have this fix in it, is scheduled?

Thanks,
Ruairi

4.6.9 is now released and has this hotfix in the payload. It is ready for testing.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633
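
The checks used in the comments above (counting CSV writes in an etcd watch log, sampling lastUpdateTime repeatedly, and comparing a copied CSV against the original) can be sketched offline in Python. This is only an illustrative sketch, not part of OLM or of the fix: the function names are hypothetical, the log is assumed to be plain text where each written key appears on a line containing "Key" (as the grep commands imply), and the timestamps are assumed to use the second-resolution UTC form shown in the oc output.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse a timestamp such as 2020-12-11T16:08:55Z (assumed format)."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def csv_put_share(log_lines):
    """Return (total_keys, csv_keys), mirroring the two grep | wc -l pipelines."""
    keyed = [line for line in log_lines if "Key" in line]
    csv_keyed = [line for line in keyed if "clusterserviceversions" in line]
    return len(keyed), len(csv_keyed)


def update_intervals(samples):
    """Seconds between consecutive lastUpdateTime samples.

    All zeros means the value was stable across samples; positive
    values mean the CSV status is still being rewritten.
    """
    times = [parse_ts(sample) for sample in samples]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


def copies_in_sync(original, copies):
    """True if every copied CSV reports the original's lastUpdateTime."""
    expected = parse_ts(original)
    return all(parse_ts(copy) == expected for copy in copies)


if __name__ == "__main__":
    # Samples taken from the 4.6.8 report in this thread.
    samples = [
        "2020-12-11T16:08:55Z",
        "2020-12-11T16:09:15Z",
        "2020-12-11T16:10:13Z",
        "2020-12-11T16:10:52Z",
        "2020-12-11T16:11:28Z",
    ]
    print(update_intervals(samples))
    # Values from the 4.7 verification: copy in jian4 vs. openshift-operators.
    print(copies_in_sync("2020-12-09T08:59:52Z", ["2020-12-09T08:59:52Z"]))
```

Applied to the numbers reported for 4.6.8, 3899 of 6549 keyed writes is just under 60% of the etcd PUT traffic going to CSVs, and the sampled lastUpdateTime moves forward every 20 to 58 seconds, which matches the runaway sync described in the summary.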