Bug 1749031

Summary: OLM takes about 5 minutes to detect internal CatalogSource changes
Product: OpenShift Container Platform
Reporter: Bruno Andrade <bandrade>
Component: OLM
Assignee: Vu Dinh <vdinh>
OLM sub component: OLM
QA Contact: Bruno Andrade <bandrade>
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
CC: chuo, jiazha, nhale, scolange, vdinh
Version: 4.2.z
Keywords: Reopened
Target Release: 4.2.z
Hardware: All
OS: All
Doc Type: If docs needed, set a value
Clone Of:
: 1775322 1775323 1779313
Last Closed: 2020-01-14 16:46:31 UTC
Type: Bug
Bug Depends On: 1775323
Bug Blocks: 1775322

Description Bruno Andrade 2019-09-04 17:56:21 UTC
Description of the problem:
OLM takes about 5 minutes to pick up changes from a ConfigMap and act on them, for example upgrading an operator that has a new CSV in the same channel.

Cluster Version: 4.2.0-0.nightly-2019-09-03-102130

OLM Version:
"io.openshift.build.commit.id": "09537286f6e8ca771f99287b3d09e6e595f5b8e2"


How reproducible:
Always

Steps to reproduce:
1) Create a dedicated namespace for this test:
oc create ns test-operators

2) Create the ConfigMap and the CatalogSource:

oc apply -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/configmap/configmap_etcd.yaml -n openshift-marketplace
oc apply -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/catalogsource/catalogsource.yaml -n openshift-marketplace

3) Create the OperatorGroup
oc create -f - <<EOF
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: test-operators-og
  namespace: test-operators
spec:
  targetNamespaces:
  - test-operators
EOF

4) Create the Subscription:
oc create -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/subscription/test.yaml -n test-operators

5) Check the csv status.

oc get csv -n test-operators
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.2   etcd      0.9.2                Succeeded

6) Update the ConfigMap, adding a new version of the operator to the same channel:
oc apply -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/configmap/configmap_etcdv4.yaml -n openshift-marketplace

7) Wait a few minutes; the CSV should be updated automatically:
oc get csv
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Succeeded

oc get ip -o jsonpath='{range .items[*]}{"\t"}{.metadata.name}{"\t"}{.metadata.creationTimestamp}{"\t"}{.spec.clusterServiceVersionNames}{"\n"}{end}' -n test-operators
	install-6bmnd	2019-09-04T17:46:57Z	[etcdoperator.v0.9.4]
	install-hjt57	2019-09-04T17:40:23Z	[etcdoperator.v0.9.2]
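
The gap between the two InstallPlan creation timestamps above bounds how long the upgrade took to be picked up. A minimal sketch for computing it, assuming GNU date (for the -d flag); elapsed_seconds is a hypothetical helper name, not part of oc:

```shell
# Print the number of seconds between two RFC 3339 timestamps.
# Assumption: GNU date is available; BSD/macOS date uses different flags.
elapsed_seconds() {
  local start end
  start=$(date -u -d "$1" +%s) || return 1
  end=$(date -u -d "$2" +%s) || return 1
  echo $(( end - start ))
}
```

For the timestamps listed above, `elapsed_seconds 2019-09-04T17:40:23Z 2019-09-04T17:46:57Z` prints 394, about six and a half minutes. Because the ConfigMap was updated some time after the first InstallPlan was created, this is an upper bound on the detection delay rather than an exact measurement.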


Actual results:
OLM takes about 5 minutes to fetch changes from a configmap source


Expected results:
Updates like this should sync promptly, within seconds.

Comment 4 Daniel Sover 2019-11-06 22:04:49 UTC
I tested this out on 4.2.0-0.nightly-2019-09-04-102339

When updating the configmap I saw that the installed-community-global-operators catalog source pod restarted instantly, as expected. However, when the new catalog source started, Last Observed State: CONNECTING was consistently present for several minutes. This is likely causing the delay in this bug. I have been looking at this in a separate report: https://bugzilla.redhat.com/show_bug.cgi?id=1768819.

To verify, after creating the updated configmap, run
k describe catalogsources.operators.coreos.com installed-community-global-operators
and check the Last Observed State in the status field.
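
The same check can be scripted rather than read from the describe output. A minimal sketch, assuming the state is exposed at .status.connectionState.lastObservedState on the CatalogSource (the field path is inferred from the describe output and may differ between OLM versions); catalog_connection_state and catalog_is_ready are hypothetical helper names:

```shell
# Print the last observed connection state of a CatalogSource.
# Arguments: CatalogSource name, namespace.
# Assumption: the field path below matches the installed OLM version.
catalog_connection_state() {
  oc get catalogsource "$1" -n "$2" \
    -o jsonpath='{.status.connectionState.lastObservedState}'
}

# Succeed only when the catalog reports READY (a standard gRPC connectivity state).
catalog_is_ready() {
  [ "$(catalog_connection_state "$1" "$2")" = "READY" ]
}
```

Running `catalog_connection_state installed-community-global-operators openshift-marketplace` repeatedly after the ConfigMap update and seeing CONNECTING for several minutes would match the behavior described in this comment.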

*** This bug has been marked as a duplicate of bug 1768819 ***

Comment 5 Vu Dinh 2019-11-12 15:36:01 UTC
Hi Bruno,

This is the expected behavior in terms of the sync period for OLM, specifically for the catalog operator, which handles resource sync (including Subscriptions). When you update the ConfigMap for a CatalogSource in the openshift-marketplace namespace, the CatalogSource is updated almost instantly because it is a modification of an existing object. At the same time, OLM (the catalog operator) detects the change and triggers a resource sync, but only in the namespace where the change occurred, which in this case is openshift-marketplace. It doesn't trigger a sync in other namespaces. OLM triggers a resource sync across all namespaces only during the normal resync period (every 15 minutes). Given that you created the Subscription in the test-operators namespace, it may take up to 15 minutes to sync the Subscription and detect the change.

Instead of creating a global CatalogSource in openshift-marketplace as you did, you can create a local CatalogSource/ConfigMap in test-operators and then update the ConfigMap. You will then see the sync process happen right away, as it is in the same namespace.
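
A local CatalogSource along these lines might look like the sketch below. It assumes a ConfigMap-backed catalog (sourceType: internal); render_local_catalogsource is a hypothetical helper, and the displayName and publisher values are placeholders rather than values taken from the test files referenced in this bug:

```shell
# Print a CatalogSource manifest that points at a ConfigMap in the same namespace.
# Arguments: namespace, CatalogSource name, ConfigMap name.
# Assumption: displayName and publisher are illustrative placeholders.
render_local_catalogsource() {
  local ns="$1" name="$2" configmap="$3"
  cat <<EOF
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: ${name}
  namespace: ${ns}
spec:
  sourceType: internal
  configMap: ${configmap}
  displayName: Local test catalog
  publisher: test
EOF
}
```

For example, `render_local_catalogsource test-operators local-catalog local-catalog-configmap | oc apply -f -` would create it, where both names are made up for illustration. The Subscription would also need its spec.source and spec.sourceNamespace pointed at the local catalog and at test-operators.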

Thanks,
Vu

Comment 8 Bruno Andrade 2019-12-23 19:03:56 UTC
It took approximately 10 seconds to update the catalog and to upgrade the operator to the newer available version. Marking as VERIFIED.


Cluster version: 4.2.0-0.nightly-2019-12-19-211218
OLM version: 0.11.0
git commit: e77d11535ab96c39a00ec5f732f26b8dd5023281


Steps used to verify:

oc get csv -n test-operators
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.2   etcd      0.9.2                Succeeded

date
Mon Dec 23 18:59:34 UTC 2019

oc apply -f https://raw.githubusercontent.com/bandrade/v3-testfiles/v4.1/olm/configmap/configmap_etcdv4.yaml -n openshift-marketplace
configmap/installed-community-global-operators configured

date
Mon Dec 23 18:59:44 UTC 2019

oc get csv -n test-operators
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.2   etcd      0.9.2                           Replacing
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Installing
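
The manual date / oc get csv sequence above can be automated. A minimal sketch that polls until the new CSV appears and prints the elapsed seconds; wait_for_csv is a hypothetical helper, and the default timeout of 900 seconds is an assumption based on the 15-minute resync period mentioned in comment 5:

```shell
# Poll until the named CSV exists in the namespace, then print elapsed seconds.
# Arguments: CSV name, namespace, timeout in seconds (default 900),
# poll interval in seconds (default 2).
wait_for_csv() {
  local csv="$1" ns="$2" timeout="${3:-900}" interval="${4:-2}"
  local start now
  start=$(date +%s)
  while true; do
    if oc get csv "$csv" -n "$ns" >/dev/null 2>&1; then
      now=$(date +%s)
      echo $(( now - start ))
      return 0
    fi
    now=$(date +%s)
    if [ $(( now - start )) -ge "$timeout" ]; then
      echo "timed out after ${timeout}s waiting for $csv" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

Start it immediately after applying the updated ConfigMap, for example `wait_for_csv etcdoperator.v0.9.4 test-operators`. It only checks that the CSV object exists, not that it has reached the Succeeded phase.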

Comment 10 errata-xmlrpc 2020-01-14 16:46:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0066