Verifying with pre-merge build failed. Build a payload with the linked PR: $ build operator-framework/operator-lifecycle-manager#2192 job https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp/1409806438900961280 succeeded Install a cluster the payload $ oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.7.0-0.ci.test-2021-06-29-094313-ci-ln-7kgwcsb-latest True False 39m Cluster version is 4.7.0-0.ci.test-2021-06-29-094313-ci-ln-7kgwcsb-latest Check the packageserver in audit log $ masters=$(oc get no -l node-role.kubernetes.io/master | sed '1d' | awk '{print $1}') $ oc adm node-logs $masters --path=kube-apiserver/audit.log --raw | zgrep -h '"verb":"update".*"resource":".*"packageserver"' 2>/dev/null | jq -r '.user.username + " " + (.objectRef | .resource + " " + .namespace + " " + .name + " " + .apiGroup) + " " + .stageTimestamp + " " + (.responseStatus | tostring)' | grep "cluster-version" | sort system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-06-29T11:28:10.934577Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-06-29T11:31:28.956997Z {"metadata":{},"code":200} There are 2 records indicating that CVO is updating the packageserver After a few minutes, check it again # oc adm node-logs $masters --path=kube-apiserver/audit.log --raw | zgrep -h '"verb":"update".*"resource":".*"packageserver"' 2>/dev/null | jq -r '.user.username + " " + (.objectRef | .resource + " " + .namespace + " " + .name + " " + .apiGroup) + " " + .stageTimestamp + " " + (.responseStatus | tostring)' | grep "cluster-version" | sort The records disappear. After a few minutes, check it once more. # oc adm node-logs $masters --path=kube-apiserver/audit.log --raw | zgrep -h '"verb":"update".*"resource":".*"packageserver"' 2>/dev/null | jq -r '.user.username + " " + (.objectRef | .resource + " " + .namespace + " " + .name + " " + .apiGroup) + " " + .stageTimestamp + " " + (.responseStatus | tostring)' | grep "cluster-version" | sort system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-06-29T11:47:59.213667Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-06-29T11:51:17.181699Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-06-29T11:54:35.202873Z {"metadata":{},"code":200} CVO re-creates the packageserver constantly once more.
Right, we're waiting for https://github.com/operator-framework/operator-lifecycle-manager/pull/2192 to be approved for cherry-picking (same fields missing in CVO logs of latest e2e-serial run in 4.7)
Verifying with 4.7.0-0.nightly-2021-08-16-204701 failed. $ oc adm release info --commits registry.ci.openshift.org/ocp/release:4.7.0-0.nightly-2021-08-16-204701 | grep operator-lifecycle-manager operator-lifecycle-manager https://github.com/operator-framework/operator-lifecycle-manager 4bfcded86491aa59e42d5b4288a79fcc673afc91 operator-lifecycle-manager/ (master) $ git --no-pager log --first-parent --oneline -3 origin/release-4.7 [13:47:58] 4bfcded86 (origin/release-4.7) packageserver CSV: add missing properties (#2192) b0453a0cb Fix for resolution error on channels with deprecated inner entries. (#2313) dcbab9aff Merge pull request #2287 from dinhxuanvu/test-4.7 So the payload includes the olm fix. # oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.7.0-0.nightly-2021-08-16-204701 True False 6m38s Cluster version is 4.7.0-0.nightly-2021-08-16-204701 # masters=$(oc get no -l node-role.kubernetes.io/master | sed '1d' | awk '{print $1}') # oc adm node-logs $masters --path=kube-apiserver/audit.log --raw | zgrep -h '"verb":"update".*"resource":".*"packageserver"' 2>/dev/null | jq -r '.user.username + " " + (.objectRef | .resource + " " + .namespace + " " + .name + " " + .apiGroup) + " " + .stageTimestamp + " " + (.responseStatus | tostring)' | sort system:serviceaccount:kube-system:deployment-controller deployments openshift-operator-lifecycle-manager packageserver apps 2021-08-17T06:11:14.015350Z {"metadata":{},"code":200} system:serviceaccount:kube-system:deployment-controller deployments openshift-operator-lifecycle-manager packageserver apps 2021-08-17T06:11:14.028274Z {"metadata":{},"status":"Failure","reason":"Conflict","code":409} system:serviceaccount:kube-system:deployment-controller deployments openshift-operator-lifecycle-manager packageserver apps 2021-08-17T06:11:14.048081Z {"metadata":{},"code":200} system:serviceaccount:kube-system:deployment-controller deployments openshift-operator-lifecycle-manager packageserver apps 2021-08-17T06:11:14.374924Z {"metadata":{},"code":200} system:serviceaccount:kube-system:deployment-controller deployments openshift-operator-lifecycle-manager packageserver apps 2021-08-17T06:11:14.389804Z {"metadata":{},"status":"Failure","reason":"Conflict","code":409} system:serviceaccount:kube-system:deployment-controller deployments openshift-operator-lifecycle-manager packageserver apps 2021-08-17T06:11:14.410616Z {"metadata":{},"code":200} system:serviceaccount:kube-system:deployment-controller deployments openshift-operator-lifecycle-manager packageserver apps 2021-08-17T06:11:14.420149Z {"metadata":{},"status":"Failure","reason":"Conflict","code":409} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:19:38.715374Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:43:56.864047Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:47:14.952895Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:50:32.963959Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:53:51.035193Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:57:09.090293Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T07:00:27.203482Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T07:03:45.506066Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T07:07:03.421687Z {"metadata":{},"code":200} system:serviceaccount:openshift-cluster-version:default clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T07:10:21.476466Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:13:02.936977Z {"metadata":{},"status":"Failure","reason":"Conflict","code":409} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:13:03.053232Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:13:03.216338Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:13:04.001196Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:13:04.640870Z {"metadata":{},"status":"Failure","reason":"Conflict","code":409} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:16:00.965559Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:16:00.996226Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:16:01.014947Z {"metadata":{},"status":"Failure","reason":"Conflict","code":409} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:19:38.736421Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:19:38.755656Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:19:38.767784Z {"metadata":{},"status":"Failure","reason":"Conflict","code":409} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T06:19:38.767877Z {"metadata":{},"status":"Failure","reason":"Conflict","code":409} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T07:10:21.495879Z {"metadata":{},"code":200} system:serviceaccount:openshift-operator-lifecycle-manager:olm-operator-serviceaccount clusterserviceversions openshift-operator-lifecycle-manager packageserver operators.coreos.com 2021-08-17T07:10:21.512740Z {"metadata":{},"code":200} CVO still loops on the clusterserviceversions packageserver. Checking CVO log: # grep "Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver" /tmp/cvo.log I0817 06:12:46.514363 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:16:00.823235 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:19:38.585097 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:23:54.677414 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:27:26.615236 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:30:44.640852 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:34:02.749232 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:37:20.793513 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:40:38.797883 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:43:56.846988 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:47:14.936912 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:50:32.935138 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:53:51.018298 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 06:57:09.073345 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 07:00:27.186033 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 07:03:45.491120 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ I0817 07:07:03.401386 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ Re-opening it.
``` I0817 06:12:46.514363 1 generic.go:81] Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver due to diff: &unstructured.Unstructured{ Object: map[string]interface{}{ "apiVersion": string("operators.coreos.com/v1alpha1"), "kind": string("ClusterServiceVersion"), "metadata": map[string]interface{}{"annotations": map[string]interface{}{"include.release.openshift.io/self-managed-high-availability": string("true"), "olm.operatorGroup": string("olm-operators"), "olm.operatorNamespace": string("openshift-operator-lifecycle-manager"), "olm.targetNamespaces": string("openshift-operator-lifecycle-manager")}, "creationTimestamp": string("2021-08-17T05:58:20Z"), "generation": int64(1), "labels": map[string]interface{}{"olm.api.4bca9f23e412d79d": string("provided"), "olm.clusteroperator.name": string("operator-lifecycle-manager-packageserver"), "olm.version": string("0.17.0")}, ...}, "spec": map[string]interface{}{ "apiservicedefinitions": map[string]interface{}{"owned": []interface{}{map[string]interface{}{"containerPort": int64(5443), "deploymentName": string("packageserver"), "description": string("A PackageManifest is a resource generated from existing CatalogS"...), "displayName": string("PackageManifest"), ...}}}, - "cleanup": map[string]interface{}{"enabled": bool(false)}, ``` Created https://github.com/operator-framework/operator-lifecycle-manager/pull/2329 to remove cleanup, which is not available in 4.7
Pre-merge verification gets passed. 1. Build a payload with operator-framework/operator-lifecycle-manager#2329 using cluster-bot build operator-framework/operator-lifecycle-manager#2329 2. Install a cluster with the payload built in step 1. 3. Check the audit log # oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.7.0-0.ci.test-2021-08-18-061332-ci-ln-6jdrj8k-latest True False 18m Cluster version is 4.7.0-0.ci.test-2021-08-18-061332-ci-ln-6jdrj8k-latest # masters=$(oc get no -l node-role.kubernetes.io/master | sed '1d' | awk '{print $1}') # oc adm node-logs $masters --path=kube-apiserver/audit.log --raw | zgrep -h '"verb":"update".*"resource":".*"packageserver"' 2>/dev/null | jq -r '.user.username + " " + (.objectRef | .resource + " " + .namespace + " " + .name + " " + .apiGroup) + " " + .stageTimestamp + " " + (.responseStatus | tostring)' | sort null 4. Check CVO log # grep "Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver" /tmp/cvo1.log null CVO no longer hotloops on packageserver.
It looks like PR[1] doesn't allow me to do the PR pre-merge verification. I'll test it again when it goes to ON_QA. [1] https://github.com/operator-framework/operator-lifecycle-manager/pull/2329
OpenShift engineering has decided to NOT ship 4.8.6 on 8/23 due to the following issue. https://bugzilla.redhat.com/show_bug.cgi?id=1995785 All the fixes part will be now included in 4.8.7 on 8/30.
OpenShift engineering has decided to NOT ship 4.7.25 on 8/24 due to the following issue. https://bugzilla.redhat.com/show_bug.cgi?id=1995785 All the fixes part will be now included in 4.7.26 on 8/31.
Verifying with 4.7.0-0.nightly-2021-09-09-200524 passed. # oc adm release info --commits registry.ci.openshift.org/ocp/release:4.7.0-0.nightly-2021-09-09-200524 | grep operator-lifecycle-manager operator-lifecycle-manager https://github.com/operator-framework/operator-lifecycle-manager cfea59e1905c10386581a991b989c55d0d8ff393 $ git --no-pager log --first-parent --oneline -3 origin/release-4.7 [10:18:33] ebe84506e (origin/release-4.7) installplans: retry crd updates on conflicts (#2293) (#2314) cfea59e19 manifests/0000_50_olm_15-packageserver.clusterserviceversion.yaml: remove spec.cleanup (#2329) cf5ef8521 Merge pull request #2351 from timflannagan/fix-build-root-dockerfile # oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.7.0-0.nightly-2021-09-09-200524 True False 5m29s Cluster version is 4.7.0-0.nightly-2021-09-09-200524 # oc adm node-logs $masters --path=kube-apiserver/audit.log --raw | zgrep -h '"verb":"update".*"resource":".*"packageserver"' 2>/dev/null | jq -r '.user.username + " " + (.objectRef | .resource + " " + .namespace + " " + .name + " " + .apiGroup) + " " + .stageTimestamp + " " + (.responseStatus | tostring)' | grep "cluster-version" | sort null # oc logs cluster-version-operator-7d4bdf9578-7ngp5 -n openshift-cluster-version | grep "Updating ClusterServiceVersion openshift-operator-lifecycle-manager/packageserver" null CVO no longer hotloops on packageserver. Moving it to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.7.31 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3510