From [1]:

      annotations:
        include.release.openshift.io/self-managed-high-availability: "true"
        include.release.openshift.io/single-node-developer: "true"
        include.release.openshift.io/ibm-cloud-managed: "true"

You want ibm-cloud-managed in that IBM-specific manifest, but you don't want the other two, because they're covered by the sibling, non-IBM manifest [2]. You should at least drop self-managed-high-availability from the IBM-specific manifest, to keep the self-managed-high-availability cluster-version operator from trying to reconcile both the IBM-specific and non-IBM manifests for that one deployment simultaneously. Depending on how much you want to clean up, you can also drop the unused single-node-developer profile across the board; see [3].

Seems like this affects 4.9 too:

  $ git grep include.release.openshift.io/self-managed-high-availability origin/release-4.9 -- manifests/ | grep ibm
  origin/release-4.9:manifests/0000_50_olm_06-psm-operator.deployment.ibm-cloud-managed.yaml:    include.release.openshift.io/self-managed-high-availability: "true"

and a backport is probably worth the trouble, to avoid the CVO flapping the nodeSelector:

  $ git checkout origin/release-4.9
  $ git --no-pager log -1 --oneline
  5fc4c78bb (HEAD, origin/release-4.9) Merge pull request #215 from dinhxuanvu/upgrade-delay-4.9
  $ diff -u manifests/0000_50_olm_06-psm-operator.deployment.yaml manifests/0000_50_olm_06-psm-operator.deployment.ibm-cloud-managed.yaml
  --- manifests/0000_50_olm_06-psm-operator.deployment.yaml	2022-01-04 22:34:58.219169459 -0800
  +++ manifests/0000_50_olm_06-psm-operator.deployment.ibm-cloud-managed.yaml	2022-01-04 22:34:58.219169459 -0800
  @@ -8,6 +8,7 @@
     annotations:
       include.release.openshift.io/self-managed-high-availability: "true"
       include.release.openshift.io/single-node-developer: "true"
  +    include.release.openshift.io/ibm-cloud-managed: "true"
   spec:
     strategy:
       type: RollingUpdate
  @@ -64,7 +65,6 @@
           terminationMessagePolicy: FallbackToLogsOnError
         nodeSelector:
           kubernetes.io/os: linux
  -        node-role.kubernetes.io/master: ""
         tolerations:
         - effect: NoSchedule
           key: node-role.kubernetes.io/master

Poking at recent 4.9 CI [4]:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-aws/1478247345723281408/artifacts/e2e-aws/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-6f8b969579-q8dx4_cluster-version-operator.log | grep 'Running sync.*in state\|openshift-operator-lifecycle-manager/package-server-manager' | tail
  I0104 07:12:57.829476 1 sync_worker.go:542] Running sync 4.9.0-0.nightly-2022-01-04-060802 (force=false) on generation 2 in state Reconciling at attempt 0
  I0104 07:13:25.186757 1 sync_worker.go:753] Running sync for deployment "openshift-operator-lifecycle-manager/package-server-manager" (547 of 737)
  I0104 07:13:25.286909 1 sync_worker.go:765] Done syncing for deployment "openshift-operator-lifecycle-manager/package-server-manager" (547 of 737)
  I0104 07:13:25.286941 1 sync_worker.go:753] Running sync for deployment "openshift-operator-lifecycle-manager/package-server-manager" (548 of 737)
  I0104 07:13:25.384516 1 sync_worker.go:765] Done syncing for deployment "openshift-operator-lifecycle-manager/package-server-manager" (548 of 737)
  I0104 07:16:16.647386 1 sync_worker.go:542] Running sync 4.9.0-0.nightly-2022-01-04-060802 (force=false) on generation 2 in state Reconciling at attempt 0
  I0104 07:16:44.002400 1 sync_worker.go:753] Running sync for deployment "openshift-operator-lifecycle-manager/package-server-manager" (547 of 737)
  I0104 07:16:44.102762 1 sync_worker.go:765] Done syncing for deployment "openshift-operator-lifecycle-manager/package-server-manager" (547 of 737)
  I0104 07:16:44.102795 1 sync_worker.go:753] Running sync for deployment "openshift-operator-lifecycle-manager/package-server-manager" (548 of 737)
  I0104 07:16:44.204445 1 sync_worker.go:765] Done syncing for deployment "openshift-operator-lifecycle-manager/package-server-manager" (548 of 737)

So you're currently not actually getting CVO contention, because our nodeSelector merge strategy is "require the cluster to contain everything in the manifest, but do not remove unrecognized entries" [5]. But still, assuming that the 4.9 CVO will never become more strict about nodeSelector reconciliation is brittle, and asking the CVO to reconcile the same Deployment twice in each sync cycle isn't very efficient.

[1]: https://github.com/openshift/operator-framework-olm/blame/ca5d761a86bd1556b7bea1250fcd7a02f2fff337/manifests/0000_50_olm_06-psm-operator.deployment.ibm-cloud-managed.yaml#L9-L10
[2]: https://github.com/openshift/operator-framework-olm/blob/ca5d761a86bd1556b7bea1250fcd7a02f2fff337/manifests/0000_50_olm_06-psm-operator.deployment.yaml#L9-L10
[3]: https://github.com/openshift/cluster-version-operator/pull/685
[4]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.9-e2e-aws/1478247345723281408
[5]: https://github.com/openshift/cluster-version-operator/blob/a14f4e2b87e04d6b81aaa55890be088281f5a550/lib/resourcemerge/core.go#L50
[cloud-user@preserve-olm-env jian]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-10-144202   True        False         8h      Cluster version is 4.10.0-0.nightly-2022-01-10-144202

The single-node-developer annotation has been removed from the PSM deployment, which now carries only self-managed-high-availability:

[cloud-user@preserve-olm-env jian]$ oc get deployment package-server-manager -o=jsonpath='{.metadata.annotations}'
{"deployment.kubernetes.io/revision":"1","include.release.openshift.io/self-managed-high-availability":"true"}
[cloud-user@preserve-olm-env jian]$ oc get deployment packageserver -o=jsonpath='{.metadata.annotations}'
{"deployment.kubernetes.io/revision":"1"}
[cloud-user@preserve-olm-env jian]$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-aws/1480634289883189248/artifacts/e2e-aws/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-76dfccdf84-bsfpx_cluster-version-operator.log | grep 'Running sync.*in state\|openshift-operator-lifecycle-manager/package-server-manager' | tail
I0110 21:01:49.045384 1 sync_worker.go:771] Done syncing for deployment "openshift-operator-lifecycle-manager/package-server-manager" (573 of 766)
I0110 21:05:11.528899 1 sync_worker.go:546] Running sync 4.10.0-0.ci-2022-01-10-042939 (force=false) on generation 2 in state Reconciling at attempt 0
I0110 21:05:39.847068 1 sync_worker.go:759] Running sync for deployment "openshift-operator-lifecycle-manager/package-server-manager" (573 of 766)
I0110 21:05:39.939473 1 sync_worker.go:771] Done syncing for deployment "openshift-operator-lifecycle-manager/package-server-manager" (573 of 766)
I0110 21:09:02.425954 1 sync_worker.go:546] Running sync 4.10.0-0.ci-2022-01-10-042939 (force=false) on generation 2 in state Reconciling at attempt 0
I0110 21:09:30.680512 1 sync_worker.go:759] Running sync for deployment "openshift-operator-lifecycle-manager/package-server-manager" (573 of 766)
I0110 21:09:30.780506 1 sync_worker.go:771] Done syncing for deployment "openshift-operator-lifecycle-manager/package-server-manager" (573 of 766)
I0110 21:12:53.266564 1 sync_worker.go:546] Running sync 4.10.0-0.ci-2022-01-10-042939 (force=false) on generation 2 in state Reconciling at attempt 0
I0110 21:13:21.572470 1 sync_worker.go:759] Running sync for deployment "openshift-operator-lifecycle-manager/package-server-manager" (573 of 766)
I0110 21:13:21.671508 1 sync_worker.go:771] Done syncing for deployment "openshift-operator-lifecycle-manager/package-server-manager" (573 of 766)

The CVO now reconciles the deployment once per sync cycle (573 of 766) instead of twice. Looks good to me; marking this VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056