Bug 1749031 - OLM takes about 5 minutes to detect internal CatalogSource changes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: OLM
Version: 4.2.z
Hardware: All
OS: All
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.2.z
Assignee: Vu Dinh
QA Contact: Bruno Andrade
URL:
Whiteboard:
Depends On: 1775323
Blocks: 1775322
Reported: 2019-09-04 17:56 UTC by Bruno Andrade
Modified: 2020-01-14 16:46 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1775322 1775323 1779313
Environment:
Last Closed: 2020-01-14 16:46:31 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | operator-framework operator-lifecycle-manager pull 1118 | 0 | None | closed | Bug 1749031: test for OLM takes a while to detect internal CatalogSource changes | 2020-03-13 12:05:49 UTC
Github | operator-framework operator-lifecycle-manager pull 1125 | 0 | None | closed | Bug 1779313: Enable multiple namespaces sync if catsrc is updated in global ns | 2020-03-13 12:05:54 UTC
Github | operator-framework operator-lifecycle-manager pull 1169 | 0 | None | closed | [release-4.2] Bug 1749031: Enable multiple namespaces sync if catsrc is updated in global ns | 2020-03-13 12:05:54 UTC
Red Hat Product Errata | RHBA-2020:0066 | 0 | None | None | None | 2020-01-14 16:46:34 UTC

Description Bruno Andrade 2019-09-04 17:56:21 UTC
Description of the problem:
OLM takes about 5 minutes to pick up changes from a ConfigMap and act on them, for example to upgrade an operator when a new CSV is added to the same channel.

Cluster Version: 4.2.0-0.nightly-2019-09-03-102130

OLM Version:
          "io.openshift.build.commit.id": "09537286f6e8ca771f99287b3d09e6e595f5b8e2",


How reproducible:
Always

Steps to reproduce:
1) Create a dedicated namespace for this test:
oc create ns test-operators

2) Create the ConfigMap and the Catalog Source.

oc apply -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/configmap/configmap_etcd.yaml -n openshift-marketplace
oc apply -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/catalogsource/catalogsource.yaml -n openshift-marketplace

3) Create the OperatorGroup
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: test-operators-og
  namespace: test-operators
spec:
  targetNamespaces:
  - test-operators
EOF

4) Create the subscription, as below:
oc create -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/subscription/test.yaml -n test-operators

5) Check the csv status.

oc get csv -n test-operators
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.2   etcd      0.9.2                Succeeded

6) Update the ConfigMap, adding a new version of the operator to the same channel:
oc apply -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/configmap/configmap_etcdv4.yaml -n openshift-marketplace

7) Wait a few minutes; the CSV should be updated automatically:
oc get csv
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Succeeded

oc get ip -o jsonpath='{range .items[*]}{"\t"}{.metadata.name}{"\t"}{.metadata.creationTimestamp}{"\t"}{.spec.clusterServiceVersionNames}{"\n"}' -n test-operators
	install-6bmnd	2019-09-04T17:46:57Z	[etcdoperator.v0.9.4]
	install-hjt57	2019-09-04T17:40:23Z	[etcdoperator.v0.9.2]
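
The two InstallPlan creation timestamps above give a rough upper bound on the delay. A minimal sketch of that arithmetic, assuming bash and GNU `date` (the `-d` option is not available in BSD/macOS `date`); the helper name `elapsed_seconds` is made up for illustration:

```shell
# Hypothetical helper: seconds between two timestamps (requires GNU date).
elapsed_seconds() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( end - start ))
}

# Creation times of install-hjt57 (v0.9.2) and install-6bmnd (v0.9.4),
# copied from the output above.
elapsed_seconds 2019-09-04T17:40:23Z 2019-09-04T17:46:57Z   # prints 394
```

394 seconds is 6 minutes 34 seconds. That interval starts at the first install, so it also includes the time before the ConfigMap was updated in step 6; the detection delay itself is somewhat shorter.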


Actual results:
OLM takes about 5 minutes to fetch changes from a configmap source


Expected results:
Updates like this should be synced promptly, within seconds.

Comment 4 Daniel Sover 2019-11-06 22:04:49 UTC
I tested this out on 4.2.0-0.nightly-2019-09-04-102339

When updating the configmap, I saw that the installed-community-global-operators catalog source pod restarted instantly, as expected. However, when the new catalog source started, Last Observed State: CONNECTING was consistently present for several minutes. This is likely causing the delay in this bug. I have been looking at this issue in a separate report: https://bugzilla.redhat.com/show_bug.cgi?id=1768819.

To verify, run the following after creating the updated configmap and check the Last Observed State in the status field:
k describe catalogsources.operators.coreos.com installed-community-global-operators
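
The same check can probably be scripted instead of reading the `describe` output by eye. This is a sketch under assumptions: it assumes bash, that the field shown as Last Observed State is exposed at `.status.connectionState.lastObservedState`, and that the catalog source lives in openshift-marketplace. The function names `catsrc_state` and `watch_catsrc_state` are hypothetical.

```shell
# Assumption: the state is stored at .status.connectionState.lastObservedState.
# $1 = CatalogSource name, $2 = namespace (defaults to openshift-marketplace).
catsrc_state() {
  oc get catalogsource "$1" -n "${2:-openshift-marketplace}" \
    -o jsonpath='{.status.connectionState.lastObservedState}'
}

# Print a timestamped state every POLL_INTERVAL seconds (default 5)
# and stop once the connection reports READY. Interrupt with Ctrl-C otherwise.
watch_catsrc_state() {
  local state
  while true; do
    state=$(catsrc_state "$@")
    echo "$(date -u +%T) ${state:-<unknown>}"
    [ "$state" = "READY" ] && break
    sleep "${POLL_INTERVAL:-5}"
  done
}
```

Running `watch_catsrc_state installed-community-global-operators` right after applying the updated configmap should show how long the catalog source stays in CONNECTING.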

*** This bug has been marked as a duplicate of bug 1768819 ***

Comment 5 Vu Dinh 2019-11-12 15:36:01 UTC
Hi Bruno,

This is the expected behavior in terms of the sync period for OLM, specifically for the catalog operator, which handles resource sync (including Subscriptions). When you update the ConfigMap for a CatalogSource in the openshift-marketplace namespace, the CatalogSource is updated almost instantly because it is a modification to an existing object. At the same time, OLM (the catalog operator) detects the change and triggers a resource sync, but only in the namespace where the change occurred, which in this case is openshift-marketplace. It does not trigger a sync in other namespaces. OLM triggers a resource sync across all namespaces only during the normal resync period (every 15 minutes). Because you created the Subscription in the test-operators namespace, it may take up to 15 minutes to sync the Subscription and detect the change.

Instead of creating a global CatalogSource in openshift-marketplace as you did, you can create a local CatalogSource/ConfigMap in test-operators and then update the ConfigMap. The sync then happens right away because everything is in the same namespace.
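
A namespace-local CatalogSource for that workaround might look roughly like the sketch below. The helper `local_catalogsource`, the catalog name, the display name, and the publisher are placeholders, and `sourceType: internal` is assumed to match the original catalog source; adjust them to the actual test files.

```shell
# Hypothetical helper: print a CatalogSource manifest backed by a ConfigMap
# in the same namespace. $1 = namespace, $2 = CatalogSource name,
# $3 = ConfigMap name.
local_catalogsource() {
  local ns=$1 name=$2 configmap=$3
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ${name}
  namespace: ${ns}
spec:
  sourceType: internal
  configMap: ${configmap}
  displayName: Local test catalog
  publisher: test
EOF
}
```

For example, `local_catalogsource test-operators local-etcd-catalog installed-community-global-operators | oc apply -f -`, after applying the ConfigMap itself with `-n test-operators`. The Subscription's `source` and `sourceNamespace` would also need to point at the local catalog.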

Thanks,
Vu

Comment 8 Bruno Andrade 2019-12-23 19:03:56 UTC
It took approximately 10 seconds to update the catalog and to update the operator to the newer available version. Marking as VERIFIED.


Cluster version: 4.2.0-0.nightly-2019-12-19-211218
OLM version: 0.11.0
git commit: e77d11535ab96c39a00ec5f732f26b8dd5023281


Steps used to verify:

oc get csv -n test-operators
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.2   etcd      0.9.2                Succeeded

date
Mon Dec 23 18:59:34 UTC 2019

oc apply -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/configmap/configmap_etcdv4.yaml -n openshift-marketplace
configmap/installed-community-global-operators configured

date
Mon Dec 23 18:59:44 UTC 2019

oc get csv -n test-operators
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.2   etcd      0.9.2                           Replacing
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Installing
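
The manual `date` calls above could be replaced by a small polling helper that reports how long the new CSV takes to show up. This is only a sketch, assuming bash; the name `wait_for_csv`, the 2-second poll interval, and the default 900-second timeout (chosen to match the 15-minute resync period mentioned in comment 5) are illustrative choices, not part of the original verification.

```shell
# Hypothetical helper: wait until a CSV exists in a namespace and report
# the elapsed time. $1 = CSV name, $2 = namespace, $3 = timeout in seconds.
wait_for_csv() {
  local csv=$1 ns=$2 timeout=${3:-900} start=$SECONDS
  until oc get csv "$csv" -n "$ns" >/dev/null 2>&1; do
    if (( SECONDS - start >= timeout )); then
      echo "timed out after ${timeout}s waiting for $csv" >&2
      return 1
    fi
    sleep 2
  done
  echo "$csv appeared after $(( SECONDS - start ))s"
}
```

Started immediately after the `oc apply` of the updated configmap, `wait_for_csv etcdoperator.v0.9.4 test-operators` would print the approximate detection delay. It only checks that the CSV object exists, not that it has reached the Succeeded phase.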

Comment 10 errata-xmlrpc 2020-01-14 16:46:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0066

