4.7's bug 1905221, backported to 4.6.19 [1], avoided a too-early transition from install-mode to update-mode by ignoring overrides changes while we were in install-mode. Touching overrides during installation had been infrequent, but yesterday 4.10 (bug 2008119) and 4.9 (bug 2009342) had the installer updated to set overrides in the initial ClusterVersion resource and then clear them as bootstrapping was wrapping up. This is causing some pain in CI, like [2]:

  [sig-arch][Early] Managed cluster should start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel] 1s
  fail [github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Oct 1 09:56:05.536: ClusterVersion Progressing=True: : Working towards 4.10.0-0.ci-2021-10-01-091426: 381 of 744 done (51% complete)

where the cluster takes an extra 30s between going Available=True (which unblocks the installer to declare install-complete) and going Progressing=False:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial/1443867156340019200/artifacts/e2e-aws-serial/gather-extra/artifacts/clusterversion.json | jq -r '.items[].status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + (.reason // "-") + ": " + (.message // "-")'
2021-10-01T09:55:37Z Available=True -: Done applying 4.10.0-0.ci-2021-10-01-091426
2021-10-01T09:55:37Z Failing=False -: -
2021-10-01T09:56:07Z Progressing=False -: Cluster version is 4.10.0-0.ci-2021-10-01-091426
2021-10-01T09:30:40Z RetrievedUpdates=False NoChannel: The update channel has not been configured.

because of the deferred overrides change:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial/1443867156340019200/artifacts/e2e-aws-serial/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-5c577f8d67-4wzqm_cluster-version-operator.log | grep 'overrides\|Running sync.*in state'
W1001 09:37:51.814042 1 sync_worker.go:490] Ignoring detected version and overrides changes during payload initialization
I1001 09:37:51.814176 1 sync_worker.go:542] Running sync 4.10.0-0.ci-2021-10-01-091426 (force=false) on generation 1 in state Initializing at attempt 0
W1001 09:37:58.407651 1 sync_worker.go:490] Ignoring detected overrides change during payload initialization
...
W1001 09:55:07.524312 1 sync_worker.go:490] Ignoring detected overrides change during payload initialization
I1001 09:55:19.050794 1 sync_worker.go:542] Running sync 4.10.0-0.ci-2021-10-01-091426 (force=false) on generation 2 in state Initializing at attempt 4
W1001 09:55:22.520587 1 sync_worker.go:490] Ignoring detected overrides change during payload initialization
W1001 09:55:37.520888 1 sync_worker.go:490] Ignoring detected overrides change during payload initialization
I1001 09:55:37.528893 1 sync_worker.go:493] Detected overrides change
I1001 09:55:37.529222 1 sync_worker.go:493] Detected overrides change
I1001 09:55:37.529329 1 sync_worker.go:542] Running sync 4.10.0-0.ci-2021-10-01-091426 (force=false) on generation 2 in state Updating at attempt 0
I1001 09:58:58.297770 1 sync_worker.go:542] Running sync 4.10.0-0.ci-2021-10-01-091426 (force=false) on generation 2 in state Reconciling at attempt 0

We can be more responsive to users, and avoid the CI issues, if we adjust the CVO to respect 'overrides' changes during install time while staying in installation mode.
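For anyone checking other runs for the same symptom, a narrower filter (a sketch; this particular jq expression is mine, not from the job artifacts) that isolates just the two transition times whose gap matters here:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial/1443867156340019200/artifacts/e2e-aws-serial/gather-extra/artifacts/clusterversion.json | jq -r '.items[].status.conditions[] | select(.type == "Available" or .type == "Progressing") | .lastTransitionTime + " " + .type + "=" + .status'
2021-10-01T09:55:37Z Available=True
2021-10-01T09:56:07Z Progressing=False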
We should also consider how we treat 'desiredUpdate'. Things might break if we use install-mode's flattened manifest graph instead of update-mode's dependency-ordered graph, so I think we should continue to defer accepting target-release changes until install completes, unless someone gives us a solid use case for install-time target-release pivots. But I'm open to having that discussion now, if folks don't agree with my proposal to defer acceptance.

We should also confirm that observedGeneration:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial/1443867156340019200/artifacts/e2e-aws-serial/gather-extra/artifacts/clusterversion.json | jq '.items[].status.observedGeneration'
2

gets incremented to match metadata.generation when the CVO notices and acts on a spec change. For example, when we see an overrides change, we should update observedGeneration when we start acting on the new spec. When we see a desiredUpdate bump, we should increment observedGeneration even if we are not yet applying the new target. Users can make the "CVO realizes I've asked for something different, but is still applying an earlier target" vs. "CVO is applying the new target" distinction by checking status.desired. But there is no status.updates, so we need to have accepted that change by the time we bump observedGeneration.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1921180#c5
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial/1443867156340019200
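As a concrete example of that observedGeneration contract, here is one way (a sketch, using only standard oc and jq; the field names match the ClusterVersion output shown later in this bug) to check on a live cluster whether the CVO has caught up with the latest spec edit:

$ oc get clusterversion version -o json | jq '{generation: .metadata.generation, observedGeneration: .status.observedGeneration, desired: .status.desired.version}'

If metadata.generation is ahead of status.observedGeneration, the CVO has not yet acted on the newest spec; once they match, status.desired tells you which target it is actually applying.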
[1] is a temporary mitigation to keep CI happy while we work on this.

[1]: https://github.com/openshift/release/pull/22434
I see the error from a recent (<6h) CI job:

# [sig-arch][Early] Managed cluster should start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel]
fail [github.com/onsi/ginkgo.0-origin.0+incompatible/internal/leafnodes/runner.go:113]: Jan 11 07:42:29.166: ClusterVersion Progressing=True: : Working towards 4.10.0-0.nightly-2022-01-10-144202: 369 of 765 done (48% complete)

# curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-azure-ovn-upgrade/1480795473177481216/artifacts/e2e-azure-ovn-upgrade/gather-extra/artifacts/clusterversion.json | jq -r '.items[].status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + (.reason // "-") + ": " + (.message // "-")'
2022-01-11T07:42:09Z Available=True -: Done applying 4.10.0-0.nightly-2022-01-10-144202
2022-01-11T07:40:24Z Failing=False -: -
2022-01-11T07:42:54Z Progressing=False -: Cluster version is 4.10.0-0.nightly-2022-01-10-144202
2022-01-11T07:17:26Z RetrievedUpdates=False NoChannel: The update channel has not been configured.

Hi @Trevor, @Jack
Any suggestion for QE to reproduce the issue? Or is it enough to verify the bug from the above CI job search?
(In reply to liujia from comment #3)
> Hi @Trevor, @Jack
> Any suggestion for QE to reproduce the issue? Or is it enough to verify the
> bug from above ci job search?

I think that's fine. But you can also just capture a cluster's initialization, and now you should see:

I1217 21:21:11.059604 1 sync_worker.go:496] Detected version change
W1217 21:21:11.060024 1 sync_worker.go:499] Ignoring detected version change during payload initialization

rather than:

W0111 07:32:39.299462 1 sync_worker.go:494] Ignoring detected overrides change during payload initialization

And the Progressing=False transition time should not differ by 30 seconds.
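For example (a sketch, assuming cluster credentials are already usable while the install is still running), QE could stream the CVO deployment's logs during initialization and then compare the condition transition times once the install completes:

$ oc -n openshift-cluster-version logs -f deployment/cluster-version-operator | grep 'Detected .* change\|Ignoring detected\|Running sync.*in state'
$ oc get clusterversion version -o json | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status'

With the fix, Available=True and Progressing=False should land at (nearly) the same time.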
Cannot reproduce the deferred transition time on either a normal installation or a hacked installation with cvo-overrides on 4.10.0-fc.0.

# ./oc get clusterversion -ojson | jq -r '.items[].status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + (.reason // "-") + ": " + (.message // "-")'
2022-01-14T06:08:08Z Upgradeable=False ClusterVersionOverridesSet: Disabling ownership via cluster version overrides prevents upgrades. Please remove overrides before continuing.
2022-01-14T06:30:14Z Available=True -: Done applying 4.10.0-fc.0
2022-01-14T06:30:14Z Failing=False -: -
2022-01-14T06:30:14Z Progressing=False -: Cluster version is 4.10.0-fc.0

So, per comment 3 and comment 4, verified the issue against the specific CI job. No nightly builds (>=4.10.0-0.nightly-2022-01-13-061145) have failed with the same error in the past 3 days.

# w3m -r -dump -cols 200 'https://search.ci.openshift.org/?search=Managed+cluster+should+start+all+core+operators&maxAge=72h&context=2&type=junit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job'|grep -o 'ClusterVersion Progressing=True.*4.10.*nightly.*2022-01.*'
ClusterVersion Progressing=True: : Working towards 4.10.0-0.nightly-2022-01-11-065245: 752 of 765 done (98% complete)
ClusterVersion Progressing=True: : Working towards 4.10.0-0.nightly-2022-01-11-065245: 369 of 765 done (48% complete)
ClusterVersion Progressing=True: : Working towards 4.10.0-0.nightly-2022-01-11-065245: 375 of 765 done (49% complete)
ClusterVersion Progressing=True: : Working towards 4.10.0-0.nightly-2022-01-11-065245: 389 of 765 done (50% complete)
ClusterVersion Progressing=True: : Working towards 4.10.0-0.nightly-2022-01-11-233044: 2 of 765 done (0% complete)

No deferred transition time on 4.10.0-0.nightly-2022-01-13-061145:

# curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-azure-ovn-upgrade/1481715413581565952/artifacts/e2e-azure-ovn-upgrade/gather-extra/artifacts/clusterversion.json | jq -r '.items[].status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + (.reason // "-") + ": " + (.message // "-")'
2022-01-13T20:12:19Z RetrievedUpdates=False NoChannel: The update channel has not been configured.
2022-01-13T20:38:33Z Available=True -: Done applying 4.10.0-0.nightly-2022-01-13-061145
2022-01-13T20:38:33Z Failing=False -: -
2022-01-13T20:38:33Z Progressing=False -: Cluster version is 4.10.0-0.nightly-2022-01-13-061145

Double-check from the log that `overrides` is dropped:

# curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-azure-ovn-upgrade/1481715413581565952/artifacts/e2e-azure-ovn-upgrade/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-6895f7f7c4-2gjhd_cluster-version-operator.log|grep -E "Detected|detected|overrides"
I0113 20:25:47.739431 1 sync_worker.go:496] Detected version change
W0113 20:25:47.739436 1 sync_worker.go:499] Ignoring detected version change during payload initialization
Pulling this back to Assigned, because we're seeing failures like [1]:

level=info msg=Waiting up to 40m0s (until 4:06PM) for the cluster at https://api.ci-op-7bzs3rvq-a9175.**********************:6443 to initialize...
...
level=error msg=Cluster initialization failed because one or more operators are not functioning properly.
...
level=fatal msg=failed to initialize the cluster: Multiple errors are preventing progress:
level=fatal msg=* Could not update clusterrole "cloud-controller-manager" (135 of 770)
level=fatal msg=* Could not update clusterrolebinding "cluster-storage-operator-role" (467 of 770)
level=fatal msg=* Could not update configmap "openshift-machine-config-operator/machine-config-osimageurl" (648 of 770)
...

Seems fairly common:

$ w3m -dump -cols 200 'https://search.ci.openshift.org/?maxAge=48h&type=build-log&search=failed+to+initialize+the+cluster:+Multiple+errors+are+preventing+progress' | grep '^periodic-.*failures match' | sort
periodic-ci-openshift-aws-efs-csi-driver-operator-release-4.10-nightly-operator-e2e (all) - 4 runs, 100% failed, 50% of failures match = 50% impact
periodic-ci-openshift-aws-efs-csi-driver-operator-release-4.11-nightly-operator-e2e (all) - 4 runs, 100% failed, 25% of failures match = 25% impact
periodic-ci-openshift-driver-toolkit-release-4.11-e2e-aws-driver-toolkit (all) - 6 runs, 33% failed, 50% of failures match = 17% impact
periodic-ci-openshift-insights-operator-release-4.11-insights-operator-e2e-tests-periodic (all) - 2 runs, 50% failed, 100% of failures match = 50% impact
periodic-ci-openshift-openshift-tests-private-release-4.11-extended-vsphere-ipi-parallel (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
periodic-ci-openshift-ovn-kubernetes-release-4.11-e2e-ibmcloud-ipi-ovn-periodic (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
periodic-ci-openshift-release-master-ci-4.10-e2e-aws-crun (all) - 16 runs, 25% failed, 25% of failures match = 6% impact
periodic-ci-openshift-release-master-ci-4.10-e2e-aws-ovn-upgrade (all) - 94 runs, 69% failed, 2% of failures match = 1% impact
periodic-ci-openshift-release-master-ci-4.10-e2e-aws-serial (all) - 10 runs, 10% failed, 100% of failures match = 10% impact
periodic-ci-openshift-release-master-ci-4.10-e2e-azure-ovn-upgrade (all) - 51 runs, 82% failed, 7% of failures match = 6% impact
periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-cilium (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-techpreview-serial (all) - 5 runs, 40% failed, 50% of failures match = 20% impact
periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-upgrade (all) - 109 runs, 56% failed, 8% of failures match = 5% impact
periodic-ci-openshift-release-master-nightly-4.10-console-aws (all) - 5 runs, 80% failed, 25% of failures match = 20% impact
periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-cgroupsv2 (all) - 16 runs, 38% failed, 17% of failures match = 6% impact
periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-fips (all) - 5 runs, 20% failed, 100% of failures match = 20% impact
periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-fips-serial (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-nightly-4.10-e2e-azure-csi (all) - 6 runs, 50% failed, 67% of failures match = 33% impact
periodic-ci-openshift-release-master-nightly-4.10-e2e-azure-deploy-cnv (all) - 7 runs, 57% failed, 25% of failures match = 14% impact
periodic-ci-openshift-release-master-nightly-4.10-e2e-gcp-csi-migration (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-techpreview-serial (all) - 10 runs, 40% failed, 25% of failures match = 10% impact
periodic-ci-openshift-release-master-nightly-4.10-e2e-vsphere-upi-serial (all) - 10 runs, 40% failed, 25% of failures match = 10% impact
periodic-ci-openshift-verification-tests-master-stable-4.10-e2e-vsphere-cucushift-upi (all) - 2 runs, 100% failed, 50% of failures match = 50% impact
periodic-ci-openshift-windows-machine-config-operator-release-4.10-vsphere-e2e-periodic (all) - 6 runs, 33% failed, 50% of failures match = 17% impact
periodic-ci-openshift-windows-machine-config-operator-release-4.11-vsphere-e2e-periodic (all) - 6 runs, 50% failed, 33% of failures match = 17% impact

although that query will also pick up jobs which fail around this point for other reasons.

Back to [1]:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-upgrade/1483093759888461824/artifacts/e2e-gcp-upgrade/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-675f8bcf96-q7mlx_cluster-version-operator.log | grep -B3 'Notify the sync worker that new work is available' | tail -n8
I0117 16:10:31.122027 1 sync_worker.go:496] Detected overrides change
I0117 16:10:31.122036 1 sync_worker.go:253] Cancel the sync worker's current loop
I0117 16:10:31.122097 1 sync_worker.go:259] Notify the sync worker that new work is available
--
I0117 16:10:46.121492 1 cvo.go:552] Desired version from operator is v1.Update{Version:"4.10.0-0.ci-2022-01-17-080901", Image:"registry.build03.ci.openshift.org/ci-op-7bzs3rvq/release@sha256:bed94c1e285b46597091bdd130a13716bf898fba036747d89fbbcf2103c5f2bc", Force:false}
I0117 16:10:46.121509 1 sync_worker.go:496] Detected overrides change
I0117 16:10:46.121515 1 sync_worker.go:253] Cancel the sync worker's current loop
I0117 16:10:46.121560 1 sync_worker.go:259] Notify the sync worker that new work is available

Strangely, there are no overrides in that run:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-upgrade/1483093759888461824/artifacts/e2e-gcp-upgrade/gather-extra/artifacts/clusterversion.json | jq -r '.items[].spec'
{
  "clusterID": "86304ee7-13ab-4229-87fd-d3493e4f83c5"
}

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.10-e2e-gcp-upgrade/1483093759888461824
#729 (just attached) should get test coverage like:

1. Begin installing version A.

2. During installation, run 'oc adm upgrade --to-image "${VERSION_B_PULLSPEC}"' or similar (might need --allow-explicit-upgrade and similar flags to make oc happy enough) to retarget to version B.

3. Confirm:

   $ oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep 'Running sync.*in state'

   gives output like:

   ...Running sync ${VERSION_A} (force=false) on generation ... in state Initializing at attempt ...

   to confirm that the attempt to retarget is being ignored.

4. Insert some overrides [1] (still during install). Look at the CVO logs to ensure that the overridden objects are no longer being reconciled:

   $ oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep "Running sync for ${KIND} \"${NAMESPACE}/${NAME}\""

   which should give output up until the new override was inserted, but stop giving output after.

5. Once install completes, confirm that the cluster immediately transitions to an update to version B, with 'oc adm upgrade' or similar (see the rough sketch after this list).

[1]: https://github.com/openshift/enhancements/blob/12546519c04f3a25245352d722466c7786fb44b3/dev-guide/cluster-version-operator/dev/clusterversion.md#setting-objects-unmanaged
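A rough, hedged sketch of steps 2-5 as a shell session; VERSION_B_PULLSPEC is a placeholder for a valid release image, the override below is just an example object, and the exact oc flags and log phrasing may need adjusting:

# Step 2: mid-install, ask to retarget to version B.
$ oc adm upgrade --to-image "${VERSION_B_PULLSPEC}" --allow-explicit-upgrade

# Step 3: the sync worker should keep installing version A in the Initializing state.
$ oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep 'Running sync.*in state'

# Step 4: add an override (example object), then confirm the CVO stops reconciling it.
$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/overrides", "value": [{"group": "monitoring.coreos.com", "kind": "PrometheusRule", "namespace": "openshift-machine-api", "name": "machine-api-operator-prometheus-rules", "unmanaged": true}]}]'
$ oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep 'machine-api-operator-prometheus-rules'

# Step 5: after install completes, the cluster should immediately begin updating to B.
$ oc adm upgrade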
For clusters that don't retarget the version during install, #729 is irrelevant, and we can look at existing nightly CI to see how #728 is doing (it was built in [1]). e2e-aws-serial [2]:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-serial/1483666868551028736/artifacts/e2e-aws-serial/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-79bd96bf66-jhxpm_cluster-version-operator.log | grep 'Running sync.*in state\|Detected while\|Ignoring detected version\|Propagating initial target version'
I0119 05:30:50.439864 1 sync_worker.go:255] Propagating initial target version {4.10.0-0.nightly-2022-01-19-050236 registry.build04.ci.openshift.org/ci-op-0h0k3h7d/release@sha256:a7fd0f26d9a6bc21f1e749bf17b261cd479ea19354d0bda86673c732b53a9b6c false} to sync worker loop in state Initializing.
I0119 05:30:50.440381 1 sync_worker.go:507] Detected while calculating next work: version change (from { false} to {4.10.0-0.nightly-2022-01-19-050236 registry.build04.ci.openshift.org/ci-op-0h0k3h7d/release@sha256:a7fd0f26d9a6bc21f1e749bf17b261cd479ea19354d0bda86673c732b53a9b6c false})
I0119 05:30:50.440433 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Initializing at attempt 0
I0119 05:33:37.206016 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Initializing at attempt 1
I0119 05:36:41.726962 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Initializing at attempt 2
I0119 05:40:33.255167 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Initializing at attempt 3
I0119 05:44:46.367113 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Reconciling at attempt 0
...

That looks good to me, and apparently nothing touched overrides while that pod was running. But the installer's overrides touches happen on the bootstrap node, so let's see if we can dig those out of Loki for this run. Following [2]'s custom-link-tools.html to Loki and running:

{invoker="openshift-internal-ci/periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-serial/1483666868551028736"} | unpack | container_name="cluster-version-operator" |~ "Running sync.*in state|Detected while|Ignoring detected version|Propagating initial target version"

only gives hits for cluster-version-operator-79bd96bf66-jhxpm, so I guess no access to the static bootstrap pod. Luckily (ish), console-aws [3] failed during bootstrapping on bug 2040533.
Looking at the gathered log bundle [4]:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-console-aws/1483666878021767168/artifacts/console-aws/ipi-install-install/artifacts/log-bundle-20220119054933.tar >log-bundle.tar.gz
$ tar tvz <log-bundle.tar.gz | grep bootkube
-rw------- 1003220000/root 115809 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/journals/bootkube.log
-rw------- 1003220000/root 3397 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/services/bootkube.json
$ tar xOz log-bundle-20220119054933/bootstrap/journals/bootkube.log <log-bundle.tar.gz | grep -A1 overrides | tail -n2
Jan 19 05:25:56 ip-10-0-41-215 bootkube.sh[2338]: Sending bootstrap-finished event.Restoring CVO overrides
Jan 19 05:26:21 ip-10-0-41-215 bootkube.sh[2338]: clusterversion.config.openshift.io/version patched

So the overrides were cleared at 05:26:21. Checking the overrides that existed before:

$ tar tvz <log-bundle.tar.gz | grep 'rendered-assets.*override'
-rw------- 1003220000/root 0 2022-01-18 21:50 log-bundle-20220119054933/rendered-assets/openshift/cvo-overrides.done
-rw------- 1003220000/root 2098 2022-01-18 21:50 log-bundle-20220119054933/rendered-assets/openshift/manifests/cvo-overrides.yaml
-rw------- 1003220000/root 27 2022-01-18 21:50 log-bundle-20220119054933/rendered-assets/openshift/original_cvo_overrides.patch
$ tar xOz log-bundle-20220119054933/rendered-assets/openshift/manifests/cvo-overrides.yaml <log-bundle.tar.gz
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version
  namespace: openshift-cluster-version
spec:
  clusterID: e0fabcfa-c14f-4a29-b122-cc7dc7a03563
  overrides:
  - group: ""
    kind: ConfigMap
    name: cluster-config-v1
    namespace: kube-system
    unmanaged: true
  - group: config.openshift.io
    kind: DNS
    name: cluster
    namespace: ""
    unmanaged: true
  ...
$ tar xOz log-bundle-20220119054933/rendered-assets/openshift/original_cvo_overrides.patch <log-bundle.tar.gz
{"spec":{"overrides":null}}

Checking CRI-O for details about the bootstrap CVO pod:

$ tar tvz <log-bundle.tar.gz | grep 'bootstrap.*crio.log'
-rw------- 1003220000/root 95625 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/journals/crio.log
$ tar xOz log-bundle-20220119054933/bootstrap/journals/crio.log <log-bundle.tar.gz | grep cluster-version-operator
Jan 19 05:18:55 ip-10-0-41-215 crio[2039]: time="2022-01-19 05:18:55.020873287Z" level=info msg="Running pod sandbox: openshift-cluster-version/bootstrap-cluster-version-operator-ip-10-0-41-215/POD" id=0b545baa-a393-4d9e-ab36-90ea6b4afc1f name=/runtime.v1alpha2.RuntimeService/RunPodSandbox
Jan 19 05:18:55 ip-10-0-41-215 crio[2039]: time="2022-01-19 05:18:55.040700182Z" level=info msg="Ran pod sandbox 493e6975cb94bc535f599b901ec247b3bd872b92060fe87cb882acd2cd2dbae4 with infra container: openshift-cluster-version/bootstrap-cluster-version-operator-ip-10-0-41-215/POD" id=0b545baa-a393-4d9e-ab36-90ea6b4afc1f name=/runtime.v1alpha2.RuntimeService/RunPodSandbox
Jan 19 05:18:55 ip-10-0-41-215 crio[2039]: time="2022-01-19 05:18:55.399877167Z" level=info msg="Creating container: openshift-cluster-version/bootstrap-cluster-version-operator-ip-10-0-41-215/cluster-version-operator" id=b4eb70ba-8878-4857-9ef1-df610ddff4cf name=/runtime.v1alpha2.RuntimeService/CreateContainer
Jan 19 05:18:55 ip-10-0-41-215 crio[2039]: time="2022-01-19 05:18:55.464250669Z" level=info msg="Created container b179297c2037711f22dc9c2e765efb99e44186f9a92a0b4484ddd0d5cb034630: openshift-cluster-version/bootstrap-cluster-version-operator-ip-10-0-41-215/cluster-version-operator" id=b4eb70ba-8878-4857-9ef1-df610ddff4cf name=/runtime.v1alpha2.RuntimeService/CreateContainer
Jan 19 05:18:55 ip-10-0-41-215 crio[2039]: time="2022-01-19 05:18:55.473935725Z" level=info msg="Started container" PID=6988 containerID=b179297c2037711f22dc9c2e765efb99e44186f9a92a0b4484ddd0d5cb034630 description=openshift-cluster-version/bootstrap-cluster-version-operator-ip-10-0-41-215/cluster-version-operator id=24f4a512-0b1a-4806-b116-f3b55b8e5d81 name=/runtime.v1alpha2.RuntimeService/StartContainer sandboxID=493e6975cb94bc535f599b901ec247b3bd872b92060fe87cb882acd2cd2dbae4
Jan 19 05:26:22 ip-10-0-41-215 crio[2039]: time="2022-01-19 05:26:22.604051678Z" level=info msg="Stopped container b179297c2037711f22dc9c2e765efb99e44186f9a92a0b4484ddd0d5cb034630: openshift-cluster-version/bootstrap-cluster-version-operator-ip-10-0-41-215/cluster-version-operator" id=ced43978-f3e1-4335-b816-16754f916234 name=/runtime.v1alpha2.RuntimeService/StopContainer
Jan 19 05:27:22 ip-10-0-41-215 crio[2039]: time="2022-01-19 05:27:22.358425543Z" level=info msg="Removed container b179297c2037711f22dc9c2e765efb99e44186f9a92a0b4484ddd0d5cb034630: openshift-cluster-version/bootstrap-cluster-version-operator-ip-10-0-41-215/cluster-version-operator" id=e654c8c8-5a6e-45f1-b660-0a80b426558a name=/runtime.v1alpha2.RuntimeService/RemoveContainer

OK, so the static pod was removed shortly after the overrides were cleared. Did we gather logs from that static CVO pod?
$ tar tvz <log-bundle.tar.gz | grep bootstrap/containers
drwxr-xr-x 1003220000/root 0 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers
drwxr-xr-x 1003220000/root 0 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/bootstrap-control-plane
-rw------- 1003220000/root 93449 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/bootstrap-control-plane/cluster-policy-controller.log
-rw------- 1003220000/root 1435023 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/bootstrap-control-plane/kube-apiserver.log
-rw------- 1003220000/root 930242 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/bootstrap-control-plane/kube-controller-manager.log
-rw------- 1003220000/root 97390 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/bootstrap-control-plane/kube-scheduler.log
-rw------- 1003220000/root 45868 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/etcd-4799c9eb4370c2bd4e561f8f8bae4c50a8304923f8cf9d03737fbf73fd4077dc.inspect
-rw------- 1003220000/root 841840 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/etcd-4799c9eb4370c2bd4e561f8f8bae4c50a8304923f8cf9d03737fbf73fd4077dc.log
-rw------- 1003220000/root 43520 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/etcdctl-cd9bdd7b98ef5f3f516c3ce6e996c4cdf4f27cecc3c125d212faf295d12deeb4.inspect
-rw------- 1003220000/root 0 2022-01-18 21:50 log-bundle-20220119054933/bootstrap/containers/etcdctl-cd9bdd7b98ef5f3f516c3ce6e996c4cdf4f27cecc3c125d212faf295d12deeb4.log

Nope. Ah well. There were clearly overrides. The overrides were cleared. Ah, I guess we can look at the control-plane CVO logs to confirm that the clearance was active:

$ tar tvz <log-bundle.tar.gz | grep 'cluster-version-operator.*log'
-rw------- 1003220000/root 2123471 2022-01-18 21:50 log-bundle-20220119054933/control-plane/10.0.216.199/containers/cluster-version-operator-7f580d7bcd19a0c9d9d6e02141679fc765d4b5e911c7f1bb3050912fed839922.log
$ tar xOz log-bundle-20220119054933/control-plane/10.0.216.199/containers/cluster-version-operator-7f580d7bcd19a0c9d9d6e02141679fc765d4b5e911c7f1bb3050912fed839922.log <log-bundle.tar.gz | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version'
I0119 05:26:33.055769 1 sync_worker.go:255] Propagating initial target version {4.10.0-0.nightly-2022-01-19-050236 registry.build04.ci.openshift.org/ci-op-k7bw6ht7/release@sha256:a7fd0f26d9a6bc21f1e749bf17b261cd479ea19354d0bda86673c732b53a9b6c false} to sync worker loop in state Initializing.
I0119 05:26:33.056453 1 sync_worker.go:507] Detected while calculating next work: version change (from { false} to {4.10.0-0.nightly-2022-01-19-050236 registry.build04.ci.openshift.org/ci-op-k7bw6ht7/release@sha256:a7fd0f26d9a6bc21f1e749bf17b261cd479ea19354d0bda86673c732b53a9b6c false})
I0119 05:26:33.056512 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Initializing at attempt 0
I0119 05:26:40.138413 1 sync_worker.go:768] Running sync for dns "cluster" (23 of 769)
I0119 05:29:15.098599 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Initializing at attempt 1
I0119 05:29:18.903251 1 sync_worker.go:768] Running sync for dns "cluster" (23 of 769)
I0119 05:32:16.364124 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Initializing at attempt 2
I0119 05:32:22.664397 1 sync_worker.go:768] Running sync for dns "cluster" (23 of 769)
I0119 05:35:51.226422 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Initializing at attempt 3
I0119 05:35:57.484032 1 sync_worker.go:768] Running sync for dns "cluster" (23 of 769)
I0119 05:38:53.968561 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Reconciling at attempt 0
I0119 05:38:54.187795 1 sync_worker.go:768] Running sync for dns "cluster" (23 of 769)
I0119 05:46:25.927309 1 sync_worker.go:555] Running sync 4.10.0-0.nightly-2022-01-19-050236 (force=false) on generation 2 in state Reconciling at attempt 1
I0119 05:46:53.449977 1 sync_worker.go:768] Running sync for dns "cluster" (23 of 769)

So it's clear that the cluster DNS override did get cleared and the CVO began reconciling it while staying in Initializing, which is good. All signs point to things working well for these overrides-touching installs, but it would be nice to have the static bootstrap CVO pod logs to confirm things were working as expected. Probably not worth blocking this bug's verification on access to logs like that, though.

[1]: https://amd64.ocp.releases.ci.openshift.org/releasestream/4.10.0-0.nightly/release/4.10.0-0.nightly-2022-01-19-050236
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-aws-serial/1483666868551028736
[3]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-console-aws/1483666878021767168
[4]: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-console-aws/1483666878021767168/artifacts/console-aws/ipi-install-install/artifacts/
Hi Trevor,

I want to confirm the verification steps for this bug, especially for the new PRs.

For PR #729, should I follow https://bugzilla.redhat.com/show_bug.cgi?id=2009845#c10 to verify it? That seems like a hacky way of verifying from the logs, right? I don't think we will support a scenario that allows overrides or 'oc adm upgrade' during installation.

For PR #728, it is for the original issue, which was fixed in #713 earlier. Should I verify it the same way as https://bugzilla.redhat.com/show_bug.cgi?id=2009845#c7 and check that the installation failures do not regress, or follow the steps you recorded in https://bugzilla.redhat.com/show_bug.cgi?id=2009845#c11?
(In reply to liujia from comment #14)
> I don't think we will support such scenario that allowing overrides or oc adm upgrade during installation.

It's not install-specific; we don't support overrides at all [1]:

  Setting a CVO override puts the entire cluster in an unsupported state. Reported issues must be reproduced after removing any overrides for support to proceed.

But, since bug 2008119 [2], the installer will set overrides for some fraction of bootstrapping, and we support those installs. You can probably supply some manifests to the installer and watch the installer's injected overrides to confirm that the CVO's adjusted overrides handling is working, without having to inject your own overrides, but the timing there would be pretty tight, because the bootstrap CVO will exit very shortly after the installer clears those overrides. Maybe you could get at it by streaming down the bootstrap CVO's logs (see the sketch at the end of this comment)?

I don't think we explicitly talk about folks asking for updates mid-install in the docs, probably because it seems like it might fall under common sense "this is probably a bad idea". So while I expect us to gracefully ignore these mid-install retarget requests, I don't have a problem with us deciding that it's so unsupported that it's not worth QEing.

> For pr #728, it is for the original issue which was fixed in #713 earlier. Should i verify it the same way with https://bugzilla.redhat.com/show_bug.cgi?id=2009845#c7 and check if regression on installation failures does not happen, or follow the steps you recorded in https://bugzilla.redhat.com/show_bug.cgi?id=2009845#c11?

I'm still a bit fuzzy on why #713 played poorly in CI. The fact that #728 recovered CI confirms that there was a problem with #713 hotlooping on overrides-change detection. The fact that #728 doesn't seem to have caused CI trouble is a good sign that we actually did fix that hotlooping. #729 isn't exercised by CI. So I'm fine with a repeat of the earlier verification steps, or something like comment 11 that pulls down bootstrap-CVO logs, either way.

[1]: https://docs.openshift.com/container-platform/4.9/updating/understanding-the-update-service.html#unmanaged-operators_understanding-the-update-service
[2]: https://github.com/openshift/installer/pull/5258
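One possible way (a sketch; BOOTSTRAP_IP is a placeholder, and the crictl filter may need adjusting to match the bootstrap CVO container's actual name) to stream the bootstrap CVO's logs while the installer-injected overrides are still in place:

$ ssh core@"${BOOTSTRAP_IP}"
[core@bootstrap ~]$ CVO_CONTAINER="$(sudo crictl ps --name cluster-version-operator -q | head -n1)"
[core@bootstrap ~]$ sudo crictl logs -f "${CVO_CONTAINER}" 2>&1 | grep 'overrides\|Running sync.*in state\|Skipping .* as unmanaged'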
Version: 4.10.0-0.nightly-2022-01-20-082726

1. Trigger a cluster with a cv.overrides setting in the manifests before the installation.

...
overrides:
- group: imageregistry.operator.openshift.io
  kind: Config
  name: cluster
  namespace: ''
  unmanaged: true
...

2. SSH into the bootstrap VM to check the CVO behavior during the installation stage.

# cat /var/opt/openshift/original_cvo_overrides.patch
{"spec":{"overrides":[{"group":"imageregistry.operator.openshift.io","kind":"Config","name":"cluster","namespace":"","unmanaged":true}]}}
//It proves the cv.overrides is as expected.

# cat /var/opt/openshift/manifests/cvo-overrides.yaml|grep -A4 "DNS"
  kind: DNS
  name: cluster
  namespace: ""
  unmanaged: true
- group: config.openshift.io
//From the above we can see DNS/cluster comes from the installer-provided overrides.

# journalctl -u bootkube.service |grep "overrides\|patched"
Jan 21 07:09:02 jliu-410-8wrv8-bootstrap.c.openshift-qe.internal bootkube.sh[2503]: Created "cvo-overrides.yaml" clusterversions.v1.config.openshift.io/version -n openshift-cluster-version
Jan 21 07:14:41 jliu-410-8wrv8-bootstrap.c.openshift-qe.internal bootkube.sh[2503]: Skipped "cvo-overrides.yaml" clusterversions.v1.config.openshift.io/version -n openshift-cluster-version as it already exists
Jan 21 07:14:46 jliu-410-8wrv8-bootstrap.c.openshift-qe.internal bootkube.sh[2503]: Sending bootstrap-finished event.Restoring CVO overrides
Jan 21 07:16:02 jliu-410-8wrv8-bootstrap.c.openshift-qe.internal bootkube.sh[2503]: clusterversion.config.openshift.io/version patched

//Bootkube said it patched cv at 07:16:02 (cluster-bootstrap completed and cleared the installer's cvo-overrides), but from the container log of the CVO pod on the bootstrap VM we can see that before the patch (07:16:02) the CVO was still running sync for resources listed in cvo-overrides.
? I thought the overrides should be respected before the cv was restored to the patched one, but it seems not.

$ sudo crictl logs 75234b8d0a708 2>&1|grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster'
I0121 07:09:02.705099 1 sync_worker.go:255] Propagating initial target version {4.10.0-0.nightly-2022-01-20-082726 registry.ci.openshift.org/ocp/release@sha256:bdc27b9ff4a1a482d00fc08924f1157d782ded9f3e91af09fe9f3596bcea877c false} to sync worker loop in state Initializing.
I0121 07:09:02.706104 1 sync_worker.go:510] Detected while calculating next work: version (from { false} to {4.10.0-0.nightly-2022-01-20-082726 registry.ci.openshift.org/ocp/release@sha256:bdc27b9ff4a1a482d00fc08924f1157d782ded9f3e91af09fe9f3596bcea877c false}) and overrides changes
I0121 07:09:02.706166 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 1 in state Initializing at attempt 0
I0121 07:09:06.387267 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0121 07:09:21.034533 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0121 07:11:38.447158 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 1 in state Initializing at attempt 1
I0121 07:11:39.243413 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0121 07:11:45.743278 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0121 07:14:30.519224 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 1 in state Initializing at attempt 2
I0121 07:14:31.876562 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)

3. After the installation finishes, check the final CVO log (not from the bootstrap VM) through oc.
? There is still syncing for the overridden resource Config/cluster.

# ./oc logs cluster-version-operator-674c64b446-hfh7q|grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster'
I0121 07:16:07.192798 1 sync_worker.go:255] Propagating initial target version {4.10.0-0.nightly-2022-01-20-082726 registry.ci.openshift.org/ocp/release@sha256:bdc27b9ff4a1a482d00fc08924f1157d782ded9f3e91af09fe9f3596bcea877c false} to sync worker loop in state Initializing.
I0121 07:16:07.193745 1 sync_worker.go:510] Detected while calculating next work: version (from { false} to {4.10.0-0.nightly-2022-01-20-082726 registry.ci.openshift.org/ocp/release@sha256:bdc27b9ff4a1a482d00fc08924f1157d782ded9f3e91af09fe9f3596bcea877c false}) and overrides changes
I0121 07:16:07.193790 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 2 in state Initializing at attempt 0
I0121 07:16:09.825347 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0121 07:16:15.451087 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0121 07:20:14.316656 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 2 in state Initializing at attempt 1
I0121 07:20:16.624234 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0121 07:20:21.099399 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0121 07:24:50.373206 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 2 in state Initializing at attempt 2
I0121 07:24:51.829592 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0121 07:24:56.554058 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0121 07:30:17.812583 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 2 in state Initializing at attempt 3
I0121 07:30:19.752220 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0121 07:30:24.176248 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0121 07:34:13.980893 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 2 in state Reconciling at attempt 0
I0121 07:34:14.177649 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0121 07:34:24.537535 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0121 07:38:20.102591 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 2 in state Reconciling at attempt 0
I0121 07:38:20.303590 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0121 07:38:30.709982 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0121 07:42:26.234541 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-20-082726 (force=false) on generation 2 in state Reconciling at attempt 0
...

Based on the above check, it looks unexpected. Trevor, could you help correct my verification?
(In reply to liujia from comment #17)
> I0121 07:09:06.387267 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)

Config "cluster" isn't very specific without the group information, which the CVO doesn't log today. See bug 2022509 for some of the CVO manifests that have a 'Config' kind, a 'cluster' name, and assorted groups.

> I0121 07:09:21.034533 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)

This one is more surprising to me. Checking the source [1], I see that we are actually still logging that, but then following up with "Skipping ... as unmanaged". Can we try again with:

$ ... | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged'

and possibly with an override that's not a Config cluster entry? Maybe:

overrides:
- group: monitoring.coreos.com
  kind: PrometheusRule
  namespace: openshift-machine-api
  name: machine-api-operator-prometheus-rules
  unmanaged: true

or similar?

[1]: https://github.com/openshift/cluster-version-operator/blob/404b1970876b64f7adec4e38710c69594e42e317/pkg/cvo/sync_worker.go#L771-L775
Attempting with an override on 4.10.0-0.nightly-2022-01-24-070025:

overrides:
- group: monitoring.coreos.com
  kind: PrometheusRule
  namespace: openshift-machine-api
  name: machine-api-operator-prometheus-rules
  unmanaged: true

[core@yanyang0125e-lcfd7-bootstrap ~]$ cat /var/opt/openshift/original_cvo_overrides.patch | jq
{
  "spec": {
    "overrides": [
      {
        "group": "monitoring.coreos.com",
        "kind": "PrometheusRule",
        "name": "machine-api-operator-prometheus-rules",
        "namespace": "openshift-machine-api",
        "unmanaged": true
      }
    ]
  }
}

[core@yanyang0125e-lcfd7-bootstrap ~]$ cat /var/opt/openshift/manifests/cvo-overrides.yaml
apiVersion: config.openshift.io/v1
kind: ClusterVersion
metadata:
  name: version
  namespace: openshift-cluster-version
spec:
  channel: stable-4.10
  clusterID: b985ec34-98f9-4e1e-b55e-c8f9f9e54542
  overrides:
  - group: ""
    kind: ConfigMap
    name: ingress-uid
    namespace: kube-system
    unmanaged: true
  - group: ""
    kind: ConfigMap
    name: cloud-provider-config
    namespace: openshift-config
    unmanaged: true
  - group: ""
    kind: ConfigMap
    name: cluster-config-v1
    namespace: kube-system
    unmanaged: true
  - group: config.openshift.io
    kind: DNS
    name: cluster
    namespace: ""
    unmanaged: true
  - group: config.openshift.io
    kind: Infrastructure
    name: cluster
    namespace: ""
    unmanaged: true
  - group: config.openshift.io
    kind: Ingress
    name: cluster
    namespace: ""
    unmanaged: true
  - group: apiextensions.k8s.io
    kind: CustomResourceDefinition
    name: networks.config.openshift.io
    namespace: ""
    unmanaged: true
  - group: config.openshift.io
    kind: Network
    name: cluster
    namespace: ""
    unmanaged: true
  - group: config.openshift.io
    kind: Proxy
    name: cluster
    namespace: ""
    unmanaged: true
  - group: config.openshift.io
    kind: Scheduler
    name: cluster
    namespace: ""
    unmanaged: true
  - group: ""
    kind: Secret
    name: kube-cloud-cfg
    namespace: kube-system
    unmanaged: true
  - group: ""
    kind: ConfigMap
    name: root-ca
    namespace: kube-system
    unmanaged: true
  - group: ""
    kind: Secret
    name: machine-config-server-tls
    namespace: openshift-machine-config-operator
    unmanaged: true
  - group: ""
    kind: Secret
    name: pull-secret
    namespace: openshift-config
    unmanaged: true
  - group: ""
    kind: Secret
    name: gcp-credentials
    namespace: kube-system
    unmanaged: true
  - group: ""
    kind: Secret
    name: kubeadmin
    namespace: kube-system
    unmanaged: true
  - group: rbac.authorization.k8s.io
    kind: Role
    name: gcp-creds-secret-reader
    namespace: kube-system
    unmanaged: true
  - group: ""
    kind: ConfigMap
    name: openshift-install-manifests
    namespace: openshift-config
    unmanaged: true
  - group: monitoring.coreos.com
    kind: PrometheusRule
    name: machine-api-operator-prometheus-rules
    namespace: openshift-machine-api
    unmanaged: true

[core@yanyang0125e-lcfd7-bootstrap ~]$ journalctl -u bootkube.service |grep "overrides\|patched"
Jan 25 11:25:19 yanyang0125e-lcfd7-bootstrap.c.openshift-qe.internal bootkube.sh[2423]: Created "cvo-overrides.yaml" clusterversions.v1.config.openshift.io/version -n openshift-cluster-version
Jan 25 11:34:18 yanyang0125e-lcfd7-bootstrap.c.openshift-qe.internal bootkube.sh[2423]: Skipped "cvo-overrides.yaml" clusterversions.v1.config.openshift.io/version -n openshift-cluster-version as it already exists
Jan 25 11:34:23 yanyang0125e-lcfd7-bootstrap.c.openshift-qe.internal bootkube.sh[2423]: Sending bootstrap-finished event.Restoring CVO overrides
Jan 25 11:35:01 yanyang0125e-lcfd7-bootstrap.c.openshift-qe.internal bootkube.sh[2423]: clusterversion.config.openshift.io/version patched

[core@yanyang0125e-lcfd7-bootstrap ~]$ sudo crictl logs beee526925a7e 2>&1 | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged'
I0125 11:25:19.920164 1 sync_worker.go:255] Propagating initial target version {4.10.0-0.nightly-2022-01-24-070025 registry.ci.openshift.org/ocp/release@sha256:87c1f0019d44107031c662a3c387bba46d1f2710909b583e5f4c0ad61cdba36e false} to sync worker loop in state Initializing.
I0125 11:25:19.920940 1 sync_worker.go:510] Detected while calculating next work: version (from { false} to {4.10.0-0.nightly-2022-01-24-070025 registry.ci.openshift.org/ocp/release@sha256:87c1f0019d44107031c662a3c387bba46d1f2710909b583e5f4c0ad61cdba36e false}) and overrides changes
I0125 11:25:19.921017 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 1 in state Initializing at attempt 0
I0125 11:25:23.462739 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 11:25:37.633450 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 11:25:37.633465 1 sync_worker.go:775] Skipping dns "cluster" (23 of 769) as unmanaged
I0125 11:25:45.658020 1 sync_worker.go:775] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 769) as unmanaged
I0125 11:25:50.259162 1 sync_worker.go:775] Skipping infrastructure "cluster" (26 of 769) as unmanaged
I0125 11:25:50.259180 1 sync_worker.go:775] Skipping ingress "cluster" (27 of 769) as unmanaged
I0125 11:25:50.259195 1 sync_worker.go:775] Skipping network "cluster" (28 of 769) as unmanaged
I0125 11:25:58.934214 1 sync_worker.go:775] Skipping proxy "cluster" (32 of 769) as unmanaged
I0125 11:25:58.934233 1 sync_worker.go:775] Skipping scheduler "cluster" (33 of 769) as unmanaged
I0125 11:26:03.934560 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged
I0125 11:29:46.323880 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 1 in state Initializing at attempt 1
I0125 11:29:46.446552 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 11:29:50.557416 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged
I0125 11:29:55.981236 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 11:29:55.981247 1 sync_worker.go:775] Skipping dns "cluster" (23 of 769) as unmanaged
I0125 11:29:57.855967 1 sync_worker.go:775] Skipping infrastructure "cluster" (26 of 769) as unmanaged
I0125 11:29:57.855999 1 sync_worker.go:775] Skipping ingress "cluster" (27 of 769) as unmanaged
I0125 11:29:57.856018 1 sync_worker.go:775] Skipping network "cluster" (28 of 769) as unmanaged
I0125 11:29:58.707702 1 sync_worker.go:775] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 769) as unmanaged
I0125 11:30:02.181421 1 sync_worker.go:775] Skipping proxy "cluster" (32 of 769) as unmanaged
I0125 11:30:02.181437 1 sync_worker.go:775] Skipping scheduler "cluster" (33 of 769) as unmanaged

Looks right to me. The bootstrap CVO skips prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules".
Looking at the CVO log after installation:

# oc -n openshift-cluster-version logs cluster-version-operator-5c6ff84ff4-lqczv | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged'
I0125 11:35:00.432109 1 sync_worker.go:255] Propagating initial target version {4.10.0-0.nightly-2022-01-24-070025 registry.ci.openshift.org/ocp/release@sha256:87c1f0019d44107031c662a3c387bba46d1f2710909b583e5f4c0ad61cdba36e false} to sync worker loop in state Initializing.
I0125 11:35:00.433405 1 sync_worker.go:510] Detected while calculating next work: version (from { false} to {4.10.0-0.nightly-2022-01-24-070025 registry.ci.openshift.org/ocp/release@sha256:87c1f0019d44107031c662a3c387bba46d1f2710909b583e5f4c0ad61cdba36e false}) and overrides changes
I0125 11:35:00.433568 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 1 in state Initializing at attempt 0
I0125 11:35:01.050566 1 sync_worker.go:510] Detected while considering cluster version generation 2: overrides change ([{ConfigMap kube-system ingress-uid true} {ConfigMap openshift-config cloud-provider-config true} {ConfigMap kube-system cluster-config-v1 true} {DNS config.openshift.io cluster true} {Infrastructure config.openshift.io cluster true} {Ingress config.openshift.io cluster true} {CustomResourceDefinition apiextensions.k8s.io networks.config.openshift.io true} {Network config.openshift.io cluster true} {Proxy config.openshift.io cluster true} {Scheduler config.openshift.io cluster true} {Secret kube-system kube-cloud-cfg true} {ConfigMap kube-system root-ca true} {Secret openshift-machine-config-operator machine-config-server-tls true} {Secret openshift-config pull-secret true} {Secret kube-system gcp-credentials true} {Secret kube-system kubeadmin true} {Role rbac.authorization.k8s.io kube-system gcp-creds-secret-reader true} {ConfigMap openshift-config openshift-install-manifests true} {PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}] to [{PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}])
I0125 11:35:01.216362 1 sync_worker.go:510] Detected while calculating next work: overrides change ([{ConfigMap kube-system ingress-uid true} {ConfigMap openshift-config cloud-provider-config true} {ConfigMap kube-system cluster-config-v1 true} {DNS config.openshift.io cluster true} {Infrastructure config.openshift.io cluster true} {Ingress config.openshift.io cluster true} {CustomResourceDefinition apiextensions.k8s.io networks.config.openshift.io true} {Network config.openshift.io cluster true} {Proxy config.openshift.io cluster true} {Scheduler config.openshift.io cluster true} {Secret kube-system kube-cloud-cfg true} {ConfigMap kube-system root-ca true} {Secret openshift-machine-config-operator machine-config-server-tls true} {Secret openshift-config pull-secret true} {Secret kube-system gcp-credentials true} {Secret kube-system kubeadmin true} {Role rbac.authorization.k8s.io kube-system gcp-creds-secret-reader true} {ConfigMap openshift-config openshift-install-manifests true} {PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}] to [{PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}])
I0125 11:35:01.216430 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 2 in state Updating at attempt 0
I0125 11:35:33.509703 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 11:43:25.104619 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 2 in state Updating at attempt 1
I0125 11:43:28.469841 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 11:48:32.337140 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 11:52:37.232509 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 2 in state Updating at attempt 2
I0125 11:52:40.290296 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 11:52:47.195681 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 11:56:16.943332 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged
I0125 12:00:01.607790 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 2 in state Reconciling at attempt 0
I0125 12:00:05.719082 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 12:00:24.066865 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 12:00:31.319361 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged
I0125 12:04:27.084960 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 2 in state Reconciling at attempt 0
I0125 12:04:31.144522 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 12:04:45.844369 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 12:04:52.896325 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged
I0125 12:08:48.659965 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 2 in state Reconciling at attempt 0
I0125 12:08:52.717931 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 12:09:07.366991 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 12:09:18.368424 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged
I0125 12:13:14.133998 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 2 in state Reconciling at attempt 0
I0125 12:13:18.194508 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 12:13:32.843531 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 12:13:39.944947 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged
I0125 12:17:35.716483 1 sync_worker.go:558] Running sync 4.10.0-0.nightly-2022-01-24-070025 (force=false) on generation 2 in state Reconciling at attempt 0
I0125 12:17:39.775264 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 769)
I0125 12:17:54.425639 1 sync_worker.go:771] Running sync for config "cluster" (155 of 769)
I0125 12:18:01.527160 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged

Well, we can still see syncs for dns "cluster" and config "cluster".
Trevor, could you please help confirm if it's expected? Thanks!
Comment 19's bootstrapping looks good to me, up through the 11:35:01 patch to remove all of the installer-injected overrides. Logs from cluster-version-operator-5c6ff84ff4-lqczv successfully detect the overrides change and begin syncing everything except the user-injected PrometheusRule override, which is good, because that one wasn't cleared.

But I don't understand:

  Running sync ... on generation 2 in state Updating at attempt 0

after the overrides change. We want to stay in state Initializing, per bug 1905221. I've opened [1] to poke around.

[1]: https://github.com/openshift/cluster-version-operator/pull/730
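A quick, hedged check (assuming the log phrasing stays as above) that whoever verifies the follow-up can run against a freshly installed, never-updated cluster, to confirm the sync worker never reports an Updating state:

$ oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep 'Running sync.*in state' | grep 'state Updating' || echo 'no Updating transitions'

On a fixed build, only Initializing and Reconciling should show up during and after install.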
Thanks, Trevor, for helping me understand it. I'm re-opening this for now. Please move it back to ON_QA if you'd like to address the issue in comment #20 separately. Thanks.
# oc get clusterversion -oyaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: ClusterVersion
  metadata:
    creationTimestamp: "2022-01-26T06:44:25Z"
    generation: 2
    name: version
    resourceVersion: "26635"
    uid: e6e23de2-6358-48be-bd00-b7fc3a3a8bf7
  spec:
    channel: stable-4.10
    clusterID: 2704b169-c64b-4534-ad83-5a53edb4d163
    overrides:
    - group: monitoring.coreos.com
      kind: PrometheusRule
      name: machine-api-operator-prometheus-rules
      namespace: openshift-machine-api
      unmanaged: true
  status:
    availableUpdates: null
    conditions:
    - lastTransitionTime: "2022-01-26T06:44:25Z"
      message: Disabling ownership via cluster version overrides prevents upgrades. Please remove overrides before continuing.
      reason: ClusterVersionOverridesSet
      status: "False"
      type: Upgradeable
    - lastTransitionTime: "2022-01-26T07:02:28Z"
      message: Done applying 4.10.0-0.nightly-2022-01-25-023600
      status: "True"
      type: Available
    - lastTransitionTime: "2022-01-26T07:00:57Z"
      status: "False"
      type: Failing
    - lastTransitionTime: "2022-01-26T07:02:28Z"
      message: Cluster version is 4.10.0-0.nightly-2022-01-25-023600
      status: "False"
      type: Progressing
    - lastTransitionTime: "2022-01-26T06:44:25Z"
      message: 'Unable to retrieve available updates: currently reconciling cluster version 4.10.0-0.nightly-2022-01-25-023600 not found in the "stable-4.10" channel'
      reason: VersionNotFound
      status: "False"
      type: RetrievedUpdates
    desired:
      image: registry.ci.openshift.org/ocp/release@sha256:19fd4b9a313f2dfcdc982f088cffcc5859484af3cb8966cde5ec0be90b262dbc
      version: 4.10.0-0.nightly-2022-01-25-023600
    history:
    - completionTime: "2022-01-26T07:02:28Z"
      image: registry.ci.openshift.org/ocp/release@sha256:19fd4b9a313f2dfcdc982f088cffcc5859484af3cb8966cde5ec0be90b262dbc
      startedTime: "2022-01-26T06:44:25Z"
      state: Completed
      verified: false
      version: 4.10.0-0.nightly-2022-01-25-023600
    observedGeneration: 2
    versionHash: 1rTvIGx_lXw=
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
Looking in Loki for non-update jobs that still had a CVO transition to Updating: {invoker!~"|.*upgrade.*"} | unpack | container_name="cluster-version-operator" |~ "Running sync .* in state Updating at attempt 0" Turned up an ...-e2e-aws-serial/1486077917686403072 invoker (among a handful of others). Finding the run-detail page from that job while Prow still remembers it: $ curl -s https://prow.ci.openshift.org/prowjobs.js | jq . | grep 'https://.*1486077917686403072' "url": "https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_oc/851/pull-ci-openshift-oc-master-e2e-aws-serial/1486077917686403072", And it does indeed look like it reproduces the comment 19 transition to Updating: $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_oc/851/pull-ci-openshift-oc-master-e2e-aws-serial/1486077917686403072/artifacts/e2e-aws-serial/gather-extra/artifacts/pods/openshift-cluster-version_cluster-version-operator-6f8f6947c4-pfqqz_cluster-version-operator.log | grep 'ClusterVersionOperator\|became leader\|Running sync.*in state\|Result of work\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged' I0125 21:16:07.740960 1 start.go:23] ClusterVersionOperator v1.0.0-752-g037f1986-dirty I0125 21:19:07.639366 1 event.go:285] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"openshift-cluster-version", Name:"version", UID:"5f934228-6ba4-4843-8abe-4f9a2c2dfb96", APIVersion:"v1", ResourceVersion:"14083", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' ip-10-0-177-211_32d9affd-d048-4d50-a5f5-16778218d71a became leader I0125 21:19:07.639536 1 cvo.go:338] Starting ClusterVersionOperator with minimum reconcile period 2m42.77595144s I0125 21:19:07.640196 1 sync_worker.go:255] Propagating initial target version {4.10.0-0.ci.test-2022-01-25-205921-ci-op-42ntw0i2-latest registry.build02.ci.openshift.org/ci-op-42ntw0i2/release@sha256:8929f52954b25a8ce26c4fb62a9268b6fbf3d11b8676887b9fe908ae7930c318 false} to sync worker loop in state Initializing. I0125 21:19:07.641298 1 sync_worker.go:510] Detected while calculating next work: version (from { false} to {4.10.0-0.ci.test-2022-01-25-205921-ci-op-42ntw0i2-latest registry.build02.ci.openshift.org/ci-op-42ntw0i2/release@sha256:8929f52954b25a8ce26c4fb62a9268b6fbf3d11b8676887b9fe908ae7930c318 false}) and overrides changes I0125 21:19:07.641364 1 sync_worker.go:558] Running sync 4.10.0-0.ci.test-2022-01-25-205921-ci-op-42ntw0i2-latest (force=false) on generation 1 in state Initializing at attempt 0 I0125 21:19:08.201327 1 sync_worker.go:771] Running sync for config "cluster" (155 of 770) I0125 21:19:14.772127 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 770) I0125 21:19:14.772134 1 sync_worker.go:775] Skipping dns "cluster" (23 of 770) as unmanaged I0125 21:19:16.800720 1 sync_worker.go:775] Skipping infrastructure "cluster" (26 of 770) as unmanaged I0125 21:19:16.800728 1 sync_worker.go:775] Skipping ingress "cluster" (27 of 770) as unmanaged I0125 21:19:16.800737 1 sync_worker.go:775] Skipping network "cluster" (28 of 770) as unmanaged I0125 21:19:17.417768 1 sync_worker.go:510] Detected while considering cluster version generation 2: overrides change ([...blah blah lots... to []) I0125 21:19:17.457437 1 task_graph.go:546] Result of work: [Could not update oauth "cluster" (29 of 770) ...blah blah lots...] 
I0125 21:19:17.458850 1 sync_worker.go:510] Detected while calculating next work: overrides change ([...blah blah lots...] to []) I0125 21:19:17.458933 1 sync_worker.go:558] Running sync 4.10.0-0.ci.test-2022-01-25-205921-ci-op-42ntw0i2-latest (force=false) on generation 2 in state Updating at attempt 0 I0125 21:19:18.378231 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 770) I0125 21:24:43.012398 1 task_graph.go:546] Result of work: [Cluster operator kube-apiserver is not available] I0125 21:25:24.602709 1 sync_worker.go:558] Running sync 4.10.0-0.ci.test-2022-01-25-205921-ci-op-42ntw0i2-latest (force=false) on generation 2 in state Updating at attempt 1 I0125 21:25:24.692595 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 770) I0125 21:30:21.270700 1 sync_worker.go:771] Running sync for config "cluster" (155 of 770) I0125 21:30:50.155119 1 task_graph.go:546] Result of work: [Cluster operator authentication is updating versions Cluster operator console is not available] I0125 21:32:23.824640 1 sync_worker.go:558] Running sync 4.10.0-0.ci.test-2022-01-25-205921-ci-op-42ntw0i2-latest (force=false) on generation 2 in state Updating at attempt 2 I0125 21:32:23.946566 1 sync_worker.go:771] Running sync for dns "cluster" (23 of 770) I0125 21:32:28.329871 1 sync_worker.go:771] Running sync for config "cluster" (155 of 770) I0125 21:32:54.329864 1 task_graph.go:546] Result of work: [] I0125 21:35:37.109595 1 sync_worker.go:558] Running sync 4.10.0-0.ci.test-2022-01-25-205921-ci-op-42ntw0i2-latest (force=false) on generation 2 in state Reconciling at attempt 0 ... I still don't understand why it's making that transition though...
Logging the test result for #729. The cluster starts from 4.10.0-0.nightly-2022-01-25-023600 and upgrades to 4.10.0-0.nightly-2022-01-26-234447 ***When cluster is bootstrapping, upgrade it # oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d --allow-explicit-upgrade --force warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures. error: the cluster is already upgrading: Reason: MultipleErrors Message: Unable to apply 4.10.0-0.nightly-2022-01-25-023600: an unknown error has occurred: MultipleErrors If you want to upgrade anyway, use --allow-upgrade-with-warnings. ***Adding --allow-upgrade-with-warnings to the oc adm upgrade command line. # oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d --allow-explicit-upgrade --force --allow-upgrade-with-warnings warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures. warning: --allow-upgrade-with-warnings is bypassing: the cluster is already upgrading: Reason: Message: Working towards 4.10.0-0.nightly-2022-01-25-023600: 637 of 769 done (82% complete) Updating to release image registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d # oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version False True 6m24s Working towards 4.10.0-0.nightly-2022-01-25-023600: 643 of 769 done (83% complete) # oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep 'Running sync.*in state' I0127 11:35:53.245929 1 sync_worker.go:558] Running sync registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d (force=true) on generation 3 in state Initializing at attempt 0 ... 
# oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version False True 14m Unable to apply 4.10.0-0.nightly-2022-01-26-234447: some cluster operators have not yet rolled out # oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep 'Running sync.*in state' I0127 11:36:29.968216 1 sync_worker.go:558] Running sync registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d (force=true) on generation 3 in state Initializing at attempt 0 I0127 11:37:46.119764 1 sync_worker.go:558] Running sync registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d (force=true) on generation 4 in state Updating at attempt 0 # oc get clusterversion NAME VERSION AVAILABLE PROGRESSING SINCE STATUS version 4.10.0-0.nightly-2022-01-26-234447 True False 5m44s Cluster version is 4.10.0-0.nightly-2022-01-26-234447 # oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep 'Running sync.*in state' I0127 11:52:55.748789 1 sync_worker.go:558] Running sync registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d (force=true) on generation 4 in state Initializing at attempt 0 I0127 11:55:41.909815 1 sync_worker.go:558] Running sync registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d (force=true) on generation 4 in state Reconciling at attempt 0 I0127 11:59:22.383694 1 sync_worker.go:558] Running sync registry.ci.openshift.org/ocp/release@sha256:694c18617c3ba2d8b9fa2cfd5ddf19919603f6179fe630ccf1a5f25b84d74d1d (force=true) on generation 4 in state Reconciling at attempt 0 Well, after bootstrapping completes, cluster installation goes to cluster initialization stage. At the moment the cluster is upgrading to the 4.10.0-0.nightly-2022-01-26-234447.
Attempting with 4.11.0-0.nightly-2022-01-28-013835 for #728 $ journalctl -u bootkube.service |grep "overrides\|patched" Jan 28 03:01:44 yanyang0128a-d6vjr-bootstrap.c.openshift-qe.internal bootkube.sh[2405]: Created "cvo-overrides.yaml" clusterversions.v1.config.openshift.io/version -n openshift-cluster-version Jan 28 03:10:32 yanyang0128a-d6vjr-bootstrap.c.openshift-qe.internal bootkube.sh[2405]: Skipped "cvo-overrides.yaml" clusterversions.v1.config.openshift.io/version -n openshift-cluster-version as it already exists Jan 28 03:10:37 yanyang0128a-d6vjr-bootstrap.c.openshift-qe.internal bootkube.sh[2405]: Sending bootstrap-finished event.Restoring CVO overrides $ sudo crictl logs c53539a62c0f1 2>&1 | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged' I0128 03:01:44.129776 1 sync_worker.go:255] Propagating initial target version {4.11.0-0.nightly-2022-01-28-013835 registry.ci.openshift.org/ocp/release@sha256:5b00033968582c10546c1c3c1b6b3fc7836e088da36570b19f9a1f38555594c2 false} to sync worker loop in state Initializing. I0128 03:01:44.130076 1 sync_worker.go:511] Detected while calculating next work: version (from { false} to {4.11.0-0.nightly-2022-01-28-013835 registry.ci.openshift.org/ocp/release@sha256:5b00033968582c10546c1c3c1b6b3fc7836e088da36570b19f9a1f38555594c2 false}) and overrides changes I0128 03:01:44.130128 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 1 in state Initializing at attempt 0 I0128 03:01:47.595537 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:02:05.749177 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:02:05.749187 1 sync_worker.go:776] Skipping dns "cluster" (23 of 770) as unmanaged I0128 03:02:10.774614 1 sync_worker.go:776] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 770) as unmanaged I0128 03:02:14.041505 1 sync_worker.go:776] Skipping infrastructure "cluster" (26 of 770) as unmanaged I0128 03:02:14.041519 1 sync_worker.go:776] Skipping ingress "cluster" (27 of 770) as unmanaged I0128 03:02:14.041531 1 sync_worker.go:776] Skipping network "cluster" (28 of 770) as unmanaged I0128 03:02:19.691541 1 sync_worker.go:776] Skipping proxy "cluster" (32 of 770) as unmanaged I0128 03:02:19.691557 1 sync_worker.go:776] Skipping scheduler "cluster" (33 of 770) as unmanaged I0128 03:02:24.808726 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:05:01.601783 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 1 in state Initializing at attempt 1 I0128 03:05:02.435896 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:05:04.886746 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:05:10.135932 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:05:10.135942 1 sync_worker.go:776] Skipping dns "cluster" (23 of 770) as unmanaged I0128 03:05:14.812545 1 sync_worker.go:776] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 770) as unmanaged I0128 03:05:15.160240 1 sync_worker.go:776] Skipping infrastructure "cluster" (26 of 770) as unmanaged I0128 03:05:15.160260 1 sync_worker.go:776] Skipping ingress 
"cluster" (27 of 770) as unmanaged I0128 03:05:15.160278 1 sync_worker.go:776] Skipping network "cluster" (28 of 770) as unmanaged I0128 03:05:17.010804 1 sync_worker.go:776] Skipping proxy "cluster" (32 of 770) as unmanaged I0128 03:05:17.010819 1 sync_worker.go:776] Skipping scheduler "cluster" (33 of 770) as unmanaged I0128 03:08:45.839430 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 1 in state Initializing at attempt 2 I0128 03:08:47.048941 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:08:49.674497 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:08:55.873266 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:08:55.873279 1 sync_worker.go:776] Skipping dns "cluster" (23 of 770) as unmanaged I0128 03:08:58.650169 1 sync_worker.go:776] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 770) as unmanaged I0128 03:09:09.787946 1 sync_worker.go:776] Skipping infrastructure "cluster" (26 of 770) as unmanaged I0128 03:09:09.787967 1 sync_worker.go:776] Skipping ingress "cluster" (27 of 770) as unmanaged I0128 03:09:09.787990 1 sync_worker.go:776] Skipping network "cluster" (28 of 770) as unmanaged I0128 03:09:09.839780 1 sync_worker.go:776] Skipping proxy "cluster" (32 of 770) as unmanaged I0128 03:09:09.841401 1 sync_worker.go:776] Skipping scheduler "cluster" (33 of 770) as unmanaged # oc -n openshift-cluster-version logs cluster-version-operator-7f585665db-vd4nv | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged\|Result of work' I0128 03:19:50.073278 1 sync_worker.go:255] Propagating initial target version {4.11.0-0.nightly-2022-01-28-013835 registry.ci.openshift.org/ocp/release@sha256:5b00033968582c10546c1c3c1b6b3fc7836e088da36570b19f9a1f38555594c2 false} to sync worker loop in state Initializing. 
I0128 03:19:50.073545 1 sync_worker.go:511] Detected while calculating next work: version (from { false} to {4.11.0-0.nightly-2022-01-28-013835 registry.ci.openshift.org/ocp/release@sha256:5b00033968582c10546c1c3c1b6b3fc7836e088da36570b19f9a1f38555594c2 false}) and overrides changes I0128 03:19:50.073694 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Initializing at attempt 0 I0128 03:19:51.813710 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:19:59.699347 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:20:05.191465 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:23:48.449944 1 task_graph.go:546] Result of work: [Cluster operator authentication is not available Cluster operator monitoring is not available] I0128 03:24:21.659487 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Initializing at attempt 1 I0128 03:24:23.192583 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:24:25.518616 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:24:27.869683 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:26:51.460456 1 task_graph.go:546] Result of work: [] I0128 03:30:49.824637 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 03:30:50.263254 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:31:00.385524 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:31:05.702367 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:31:18.635183 1 task_graph.go:546] Result of work: [] I0128 03:35:16.999976 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 03:35:17.206185 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:35:27.608428 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:35:32.928148 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:35:45.861166 1 task_graph.go:546] Result of work: [] I0128 03:39:44.226120 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 03:39:44.462258 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:39:54.785176 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:40:00.109708 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:40:13.013744 1 task_graph.go:546] Result of work: [] I0128 03:44:11.378505 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 03:44:11.614544 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:44:21.987465 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:44:27.311034 1 sync_worker.go:776] Skipping prometheusrule 
"openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:44:40.165856 1 task_graph.go:546] Result of work: [] I0128 03:48:38.530995 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 03:48:38.779480 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:48:49.138897 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:48:54.460949 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:49:07.287722 1 task_graph.go:546] Result of work: [] I0128 03:53:05.651942 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 03:53:05.909901 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:53:16.211026 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:53:21.526338 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:53:34.396584 1 task_graph.go:546] Result of work: [] I0128 03:57:32.762925 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 03:57:32.987048 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 03:57:43.369621 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 03:57:48.692071 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 03:58:01.497199 1 task_graph.go:546] Result of work: [] I0128 04:01:59.861504 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 04:02:00.107528 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 04:02:10.469386 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 04:02:15.797134 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 04:02:28.601345 1 task_graph.go:546] Result of work: [] I0128 04:06:26.966923 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-01-28-013835 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 04:06:27.192029 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 04:06:37.574027 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 04:06:42.897887 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 04:06:55.717041 1 task_graph.go:546] Result of work: [] We see the delay between post-bootstrap CVO and bootstrap CVO. "state Updating" is not observed. The installation passed. Cluster installation in comment#19 passed as well.
My bad, I used incorrect build in comment#29. Re-testing it with 4.10.0-0.nightly-2022-01-28-042103 and observing "state Updating" once more! # oc -n openshift-cluster-version logs cluster-version-operator-7c678c568b-j2j46 | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged\|Result of work\|Work changed' I0128 05:23:50.171958 1 sync_worker.go:255] Propagating initial target version {4.10.0-0.nightly-2022-01-28-042103 registry.ci.openshift.org/ocp/release@sha256:2f40c0fb545eb81b2bb4dbc87d381a1ab33beb9b6f42c33b3fb28806f63dea84 false} to sync worker loop in state Initializing. I0128 05:23:50.181040 1 sync_worker.go:511] Detected while calculating next work: version (from { false} to {4.10.0-0.nightly-2022-01-28-042103 registry.ci.openshift.org/ocp/release@sha256:2f40c0fb545eb81b2bb4dbc87d381a1ab33beb9b6f42c33b3fb28806f63dea84 false}) and overrides changes I0128 05:23:50.181148 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 1 in state Initializing at attempt 0 I0128 05:23:53.967907 1 sync_worker.go:511] Detected while considering cluster version generation 2: overrides change ([{ConfigMap kube-system ingress-uid true} {ConfigMap openshift-config cloud-provider-config true} {ConfigMap kube-system cluster-config-v1 true} {DNS config.openshift.io cluster true} {Infrastructure config.openshift.io cluster true} {Ingress config.openshift.io cluster true} {CustomResourceDefinition apiextensions.k8s.io networks.config.openshift.io true} {Network config.openshift.io cluster true} {Proxy config.openshift.io cluster true} {Scheduler config.openshift.io cluster true} {Secret kube-system kube-cloud-cfg true} {ConfigMap kube-system root-ca true} {Secret openshift-machine-config-operator machine-config-server-tls true} {Secret openshift-config pull-secret true} {Secret kube-system gcp-credentials true} {Secret kube-system kubeadmin true} {Role rbac.authorization.k8s.io kube-system gcp-creds-secret-reader true} {ConfigMap openshift-config openshift-install-manifests true} {PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}] to [{PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}]) I0128 05:23:54.501144 1 task_graph.go:546] Result of work: [Could not update configmap "openshift-config-managed/admin-gates" (3 of 770) Could not update servicemonitor "openshift-cluster-version/cluster-version-operator" (10 of 770) Could not update apiserver "cluster" (19 of 770) Could not update customresourcedefinition "builds.config.openshift.io" (37 of 770) Could not update etcd "cluster" (64 of 770) Could not update clusterrole "system:openshift:scc:anyuid" (75 of 770) Could not update namespace "openshift-kube-controller-manager-operator" (101 of 770) Could not update namespace "openshift-kube-scheduler-operator" (110 of 770) Could not update namespace "openshift-cloud-controller-manager-operator" (119 of 770) Could not update customresourcedefinition "bmceventsubscriptions.metal3.io" (150 of 770) Could not update namespace "openshift-config-operator" (154 of 770) Could not update credentialsrequest "openshift-cloud-credential-operator/openshift-machine-api-aws" (164 of 770) Could not update configmap "openshift-machine-api/cbo-trusted-ca" (208 of 770) Could not update clusterrole 
"system:openshift:cloud-credential-operator:cluster-reader" (224 of 770) Could not update namespace "openshift-authentication-operator" (241 of 770) Could not update customresourcedefinition "clusterautoscalers.autoscaling.openshift.io" (256 of 770) Could not update customresourcedefinition "csisnapshotcontrollers.operator.openshift.io" (278 of 770) Could not update customresourcedefinition "configs.imageregistry.operator.openshift.io" (296 of 770) Could not update clusterrole "openshift-ingress-operator" (318 of 770) Could not update customresourcedefinition "storageversionmigrations.migration.k8s.io" (335 of 770) Could not update namespace "openshift-cluster-machine-approver" (345 of 770) Could not update customresourcedefinition "alertmanagerconfigs.monitoring.coreos.com" (357 of 770) Could not update credentialsrequest "openshift-cloud-credential-operator/openshift-cloud-network-config-controller-gcp" (378 of 770) Could not update namespace "openshift-cluster-node-tuning-operator" (381 of 770) Could not update namespace "openshift-apiserver-operator" (398 of 770) Could not update namespace "openshift-controller-manager-operator" (410 of 770) Could not update namespace "openshift-cluster-samples-operator" (421 of 770) Could not update namespace "openshift-cluster-csi-drivers" (451 of 770) Could not update priorityclass "openshift-user-critical" (471 of 770) Could not update helmchartrepository "openshift-helm-charts" (472 of 770) Could not update imagestream "openshift/driver-toolkit" (526 of 770) Could not update namespace "openshift-insights" (527 of 770) Could not update customresourcedefinition "catalogsources.operators.coreos.com" (556 of 770) Could not update namespace "openshift-marketplace" (589 of 770) Could not update clusterrolebinding "system:openshift:operator:service-ca-operator" (604 of 770) Could not update namespace "openshift-network-operator" (613 of 770) Could not update clusterrole "openshift-dns-operator" (622 of 770) Could not update clusterrole "system:openshift:machine-config-operator:cluster-reader" (632 of 770) Could not update rolebinding "openshift-cloud-credential-operator/prometheus-k8s" (651 of 770) Could not update role "openshift-authentication/prometheus-k8s" (655 of 770) Could not update clusterrolebinding "registry-monitoring" (665 of 770) Could not update rolebinding "openshift-cluster-machine-approver/prometheus-k8s" (671 of 770) Could not update operatorgroup "openshift-monitoring/openshift-cluster-monitoring" (674 of 770) Could not update rolebinding "openshift-cluster-storage-operator/prometheus" (689 of 770) Could not update role "openshift-console-operator/prometheus-k8s" (692 of 770) Could not update role "openshift-console/prometheus-k8s" (695 of 770) Could not update rolebinding "openshift-dns-operator/prometheus-k8s" (699 of 770) Could not update rolebinding "openshift-etcd-operator/prometheus-k8s" (703 of 770) Could not update rolebinding "openshift-ingress-operator/prometheus-k8s" (707 of 770) Could not update rolebinding "openshift-kube-apiserver-operator/prometheus-k8s" (711 of 770) Could not update rolebinding "openshift-kube-controller-manager-operator/prometheus-k8s" (720 of 770) Could not update rolebinding "openshift-kube-scheduler-operator/prometheus-k8s" (727 of 770) Could not update servicemonitor "openshift-machine-api/machine-api-operator" (735 of 770) Could not update servicemonitor "openshift-machine-config-operator/machine-config-daemon" (738 of 770) Could not update rolebinding 
"openshift-operator-lifecycle-manager/operator-lifecycle-manager-metrics" (741 of 770) Could not update rolebinding "openshift-apiserver-operator/prometheus-k8s" (746 of 770) Could not update rolebinding "openshift-controller-manager-operator/prometheus-k8s" (754 of 770) Could not update rolebinding "openshift-service-ca-operator/prometheus-k8s" (760 of 770) Could not update credentialsrequest "openshift-cloud-credential-operator/openshift-machine-api" (762 of 770)] I0128 05:23:54.503066 1 sync_worker.go:511] Detected while calculating next work: overrides change ([{ConfigMap kube-system ingress-uid true} {ConfigMap openshift-config cloud-provider-config true} {ConfigMap kube-system cluster-config-v1 true} {DNS config.openshift.io cluster true} {Infrastructure config.openshift.io cluster true} {Ingress config.openshift.io cluster true} {CustomResourceDefinition apiextensions.k8s.io networks.config.openshift.io true} {Network config.openshift.io cluster true} {Proxy config.openshift.io cluster true} {Scheduler config.openshift.io cluster true} {Secret kube-system kube-cloud-cfg true} {ConfigMap kube-system root-ca true} {Secret openshift-machine-config-operator machine-config-server-tls true} {Secret openshift-config pull-secret true} {Secret kube-system gcp-credentials true} {Secret kube-system kubeadmin true} {Role rbac.authorization.k8s.io kube-system gcp-creds-secret-reader true} {ConfigMap openshift-config openshift-install-manifests true} {PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}] to [{PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}]) I0128 05:23:54.503129 1 sync_worker.go:442] Work changed, transitioning from Initializing and the suggested incoming Updating state to Updating I0128 05:23:54.503328 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Updating at attempt 0 I0128 05:23:55.805780 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 05:31:15.138816 1 task_graph.go:546] Result of work: [Cluster operator kube-apiserver is not available] I0128 05:32:14.583090 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Updating at attempt 1 I0128 05:32:14.757400 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 05:34:41.490702 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 05:39:08.311104 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 05:39:11.898243 1 task_graph.go:546] Result of work: [] I0128 05:42:52.216291 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 05:42:52.417818 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 05:43:02.775162 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 05:43:08.096067 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 05:43:20.953035 1 task_graph.go:546] Result of work: [] I0128 05:47:01.273745 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 05:47:01.499238 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 05:47:11.880847 1 
sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 05:47:17.200723 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 05:47:30.021159 1 task_graph.go:546] Result of work: [] I0128 05:51:10.338927 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 05:51:10.539055 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 05:51:20.946109 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 05:51:26.270105 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 05:51:39.078904 1 task_graph.go:546] Result of work: [] I0128 05:55:19.397703 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 05:55:19.597123 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 05:55:30.007108 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 05:55:35.247921 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 05:55:48.161754 1 task_graph.go:546] Result of work: [] I0128 05:59:28.479700 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 05:59:28.685770 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 05:59:39.087118 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 05:59:44.405217 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 05:59:57.216546 1 task_graph.go:546] Result of work: [] I0128 06:03:37.535971 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 06:03:37.740963 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 06:03:48.155910 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 06:03:53.536760 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 06:04:06.300595 1 task_graph.go:546] Result of work: [] I0128 06:07:46.618448 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 06:07:46.813701 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 06:07:57.276383 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 06:08:02.595566 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 06:08:15.466740 1 task_graph.go:546] Result of work: [] I0128 06:11:55.785389 1 sync_worker.go:559] Running sync 4.10.0-0.nightly-2022-01-28-042103 (force=false) on generation 2 in state Reconciling at attempt 0 I0128 06:11:55.985479 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0128 06:12:06.395114 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0128 06:12:11.716093 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0128 
06:12:24.527108 1 task_graph.go:546] Result of work: [] # oc get clusterversion/version -oyaml apiVersion: config.openshift.io/v1 kind: ClusterVersion metadata: creationTimestamp: "2022-01-28T05:11:13Z" generation: 2 name: version resourceVersion: "28283" uid: 790e4038-e956-4ed6-942f-b41a3ce4491f spec: channel: stable-4.10 clusterID: d44c2e22-4544-4c74-a0d1-a035cb847f96 overrides: - group: monitoring.coreos.com kind: PrometheusRule name: machine-api-operator-prometheus-rules namespace: openshift-machine-api unmanaged: true status: availableUpdates: null conditions: - lastTransitionTime: "2022-01-28T05:11:13Z" message: Disabling ownership via cluster version overrides prevents upgrades. Please remove overrides before continuing. reason: ClusterVersionOverridesSet status: "False" type: Upgradeable - lastTransitionTime: "2022-01-28T05:39:23Z" message: Done applying 4.10.0-0.nightly-2022-01-28-042103 status: "True" type: Available - lastTransitionTime: "2022-01-28T05:32:23Z" status: "False" type: Failing - lastTransitionTime: "2022-01-28T05:39:23Z" message: Cluster version is 4.10.0-0.nightly-2022-01-28-042103 status: "False" type: Progressing - lastTransitionTime: "2022-01-28T05:11:13Z" message: 'Unable to retrieve available updates: currently reconciling cluster version 4.10.0-0.nightly-2022-01-28-042103 not found in the "stable-4.10" channel' reason: VersionNotFound status: "False" type: RetrievedUpdates desired: image: registry.ci.openshift.org/ocp/release@sha256:2f40c0fb545eb81b2bb4dbc87d381a1ab33beb9b6f42c33b3fb28806f63dea84 version: 4.10.0-0.nightly-2022-01-28-042103 history: - completionTime: "2022-01-28T05:39:23Z" image: registry.ci.openshift.org/ocp/release@sha256:2f40c0fb545eb81b2bb4dbc87d381a1ab33beb9b6f42c33b3fb28806f63dea84 startedTime: "2022-01-28T05:11:13Z" state: Completed verified: false version: 4.10.0-0.nightly-2022-01-28-042103 observedGeneration: 2 versionHash: OEdVOFmx3QQ= Trevor, It doesn't bother me too much since the installation passed. I'm re-opening it in case you worry about it.
#729 gets 2nd time attempt: Starting the installation with 4.10.0-0.nightly-2022-01-28-042103 (version_A) and then upgrading to 4.10.0-fc.3 (version_B) # oc get clusterversion/version -oyaml apiVersion: config.openshift.io/v1 kind: ClusterVersion metadata: creationTimestamp: "2022-01-28T05:40:27Z" generation: 4 name: version resourceVersion: "47727" uid: 2701bfea-2cd4-4f70-9935-acff3fd996ab spec: channel: stable-4.10 clusterID: bc00530c-7732-4106-9d66-d0475a31b82e desiredUpdate: force: false image: quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 version: "" overrides: - group: monitoring.coreos.com kind: PrometheusRule name: machine-api-operator-prometheus-rules namespace: openshift-machine-api unmanaged: true status: availableUpdates: null conditions: - lastTransitionTime: "2022-01-28T05:40:27Z" message: Disabling ownership via cluster version overrides prevents upgrades. Please remove overrides before continuing. reason: ClusterVersionOverridesSet status: "False" type: Upgradeable - lastTransitionTime: "2022-01-28T06:00:16Z" message: Done applying 4.10.0-fc.3 status: "True" type: Available - lastTransitionTime: "2022-01-28T06:21:58Z" status: "False" type: Failing - lastTransitionTime: "2022-01-28T06:00:16Z" message: Cluster version is 4.10.0-fc.3 status: "False" type: Progressing - lastTransitionTime: "2022-01-28T05:40:27Z" message: 'Unable to retrieve available updates: currently reconciling cluster version 4.10.0-fc.3 not found in the "stable-4.10" channel' reason: VersionNotFound status: "False" type: RetrievedUpdates desired: image: quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 version: 4.10.0-fc.3 history: - completionTime: "2022-01-28T06:00:16Z" image: quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 startedTime: "2022-01-28T05:48:13Z" state: Completed verified: true version: 4.10.0-fc.3 - completionTime: "2022-01-28T05:48:13Z" image: registry.ci.openshift.org/ocp/release@sha256:2f40c0fb545eb81b2bb4dbc87d381a1ab33beb9b6f42c33b3fb28806f63dea84 startedTime: "2022-01-28T05:40:27Z" state: Partial verified: false version: 4.10.0-0.nightly-2022-01-28-042103 observedGeneration: 4 versionHash: kPcwc_Mohu4= Well, once bootstrapping completes, the cluster transitions to the updates version_B. So we see version_A is partially completed. 
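As a side note, the history entries make this easy to see at a glance. A minimal sketch (assuming jq is available on the host) that lists each history entry's start time, version, and state:
# oc get clusterversion version -o json | jq -r '.status.history[] | .startedTime + " " + .version + " " + .state'
For this cluster it would show the 4.10.0-fc.3 entry as Completed and the 4.10.0-0.nightly-2022-01-28-042103 entry as Partial.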
Seems different from comment#10 # grep -r 'Running sync.*in state' cvo/ | sort cvo/2022-01-28T05:45:55+0000-cluster-version-operator-7c678c568b-gvmb8.log:I0128 05:48:13.408538 1 sync_worker.go:559] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 2 in state Initializing at attempt 0 cvo/2022-01-28T05:45:55+0000-cluster-version-operator-7c678c568b-gvmb8.log:I0128 05:49:11.049978 1 sync_worker.go:559] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 3 in state Updating at attempt 0 cvo/2022-01-28T06:00:31+0000-cluster-version-operator-9b946b49d-9sgd7.log:I0128 05:57:45.919323 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Initializing at attempt 0 cvo/2022-01-28T06:00:31+0000-cluster-version-operator-9b946b49d-9sgd7.log:I0128 06:02:59.124324 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Reconciling at attempt 0 cvo/2022-01-28T06:01:59+0000-cluster-version-operator-9b946b49d-hvjkv.log:I0128 06:05:34.050689 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Reconciling at attempt 0 cvo/2022-01-28T06:01:59+0000-cluster-version-operator-9b946b49d-hvjkv.log:I0128 06:10:48.006766 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Reconciling at attempt 1 cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:16:25.533610 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Reconciling at attempt 0 cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:21:07.328686 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Reconciling at attempt 1 cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:24:04.521057 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Reconciling at attempt 0 cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:27:01.714045 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Reconciling at attempt 0 cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:29:58.905327 1 sync_worker.go:558] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 4 in state Reconciling at attempt 0 # grep -r 'prometheusrule 
"openshift-machine-api/machine-api-operator-prometheus-rules"' cvo/ | sort cvo/2022-01-28T06:00:31+0000-cluster-version-operator-9b946b49d-9sgd7.log:I0128 05:57:50.366409 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) cvo/2022-01-28T06:00:31+0000-cluster-version-operator-9b946b49d-9sgd7.log:I0128 05:57:50.366421 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged cvo/2022-01-28T06:00:31+0000-cluster-version-operator-9b946b49d-9sgd7.log:I0128 06:03:35.770675 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) cvo/2022-01-28T06:00:31+0000-cluster-version-operator-9b946b49d-9sgd7.log:I0128 06:03:35.770688 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged cvo/2022-01-28T06:01:59+0000-cluster-version-operator-9b946b49d-hvjkv.log:I0128 06:06:01.200271 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) cvo/2022-01-28T06:01:59+0000-cluster-version-operator-9b946b49d-hvjkv.log:I0128 06:06:01.200311 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged cvo/2022-01-28T06:01:59+0000-cluster-version-operator-9b946b49d-hvjkv.log:I0128 06:11:08.817030 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) cvo/2022-01-28T06:01:59+0000-cluster-version-operator-9b946b49d-hvjkv.log:I0128 06:11:08.817040 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:16:54.530675 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:16:54.530683 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:21:28.190978 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:21:28.190987 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:24:34.230948 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:24:34.230962 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:27:27.526164 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) 
cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:27:27.526185 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:30:24.714642 1 sync_worker.go:771] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) cvo/2022-01-28T06:12:49+0000-cluster-version-operator-9b946b49d-65299.log:I0128 06:30:24.714652 1 sync_worker.go:775] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (736 of 769) as unmanaged The overridden resource is skipped, which looks good to me.
No hits in Loki: {invoker!~"|.*upgrade.*"} | unpack | container_name="cluster-version-operator" |= "Work changed, transitioning from" since [1] landed, apart from a few older releases that still carry the "and the suggested incoming" wording I dropped in [1]. Not a lot of data yet, but that's a promising start. [1]: https://github.com/openshift/cluster-version-operator/pull/733
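The Loki query covers the CI fleet; for a one-off check on a single cluster, grepping the running CVO's log for the same message text (a sketch assuming the wording shown in the earlier logs) works too:
$ oc -n openshift-cluster-version logs -l k8s-app=cluster-version-operator --tail -1 | grep 'Work changed, transitioning from'
No output is the desired result on a fresh install.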
Verifying #728 730 733 with 4.11.0-0.nightly-2022-02-08-180554 $ journalctl -u bootkube.service |grep "overrides\|patched" Feb 09 09:31:58 yanyang0209a-s6hkw-bootstrap.c.openshift-qe.internal bootkube.sh[2431]: Created "cvo-overrides.yaml" clusterversions.v1.config.openshift.io/version -n openshift-cluster-version Feb 09 09:42:51 yanyang0209a-s6hkw-bootstrap.c.openshift-qe.internal bootkube.sh[2431]: Skipped "cvo-overrides.yaml" clusterversions.v1.config.openshift.io/version -n openshift-cluster-version as it already exists Feb 09 09:42:56 yanyang0209a-s6hkw-bootstrap.c.openshift-qe.internal bootkube.sh[2431]: Sending bootstrap-finished event.Restoring CVO overrides Feb 09 09:45:36 yanyang0209a-s6hkw-bootstrap.c.openshift-qe.internal bootkube.sh[2431]: Trying again to restore CVO overrides Feb 09 09:45:37 yanyang0209a-s6hkw-bootstrap.c.openshift-qe.internal bootkube.sh[2431]: clusterversion.config.openshift.io/version patched $ sudo crictl logs e4dde355b988e 2>&1 | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged' I0209 09:31:58.859133 1 sync_worker.go:255] Propagating initial target version {4.11.0-0.nightly-2022-02-08-180554 registry.ci.openshift.org/ocp/release@sha256:5c3731a7afc9a8dc31fcafabe71dd4b83b6ac7273cd1fa10d12f916445a64709 false} to sync worker loop in state Initializing. I0209 09:31:58.859526 1 sync_worker.go:511] Detected while calculating next work: version (from { false} to {4.11.0-0.nightly-2022-02-08-180554 registry.ci.openshift.org/ocp/release@sha256:5c3731a7afc9a8dc31fcafabe71dd4b83b6ac7273cd1fa10d12f916445a64709 false}) and overrides changes I0209 09:31:58.859600 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 1 in state Initializing at attempt 0 I0209 09:32:02.265415 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:32:20.915218 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 09:32:20.915230 1 sync_worker.go:776] Skipping dns "cluster" (23 of 770) as unmanaged I0209 09:32:24.098689 1 sync_worker.go:776] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 770) as unmanaged I0209 09:32:30.996565 1 sync_worker.go:776] Skipping infrastructure "cluster" (26 of 770) as unmanaged I0209 09:32:30.996582 1 sync_worker.go:776] Skipping ingress "cluster" (27 of 770) as unmanaged I0209 09:32:30.996600 1 sync_worker.go:776] Skipping network "cluster" (28 of 770) as unmanaged I0209 09:32:38.728092 1 sync_worker.go:776] Skipping proxy "cluster" (32 of 770) as unmanaged I0209 09:32:38.728107 1 sync_worker.go:776] Skipping scheduler "cluster" (33 of 770) as unmanaged I0209 09:32:43.841755 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 09:34:51.521114 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 1 in state Initializing at attempt 1 I0209 09:34:51.976528 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:34:52.102641 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 09:34:57.377693 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 09:34:57.377703 1 sync_worker.go:776] Skipping dns "cluster" (23 of 770) as unmanaged I0209 09:35:02.152195 1 
sync_worker.go:776] Skipping infrastructure "cluster" (26 of 770) as unmanaged I0209 09:35:02.152209 1 sync_worker.go:776] Skipping ingress "cluster" (27 of 770) as unmanaged I0209 09:35:02.152227 1 sync_worker.go:776] Skipping network "cluster" (28 of 770) as unmanaged I0209 09:35:03.354162 1 sync_worker.go:776] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 770) as unmanaged I0209 09:35:08.901910 1 sync_worker.go:776] Skipping proxy "cluster" (32 of 770) as unmanaged I0209 09:35:08.901934 1 sync_worker.go:776] Skipping scheduler "cluster" (33 of 770) as unmanaged I0209 09:38:05.251105 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 1 in state Initializing at attempt 2 I0209 09:38:06.509014 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:38:09.457928 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 09:38:15.811300 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 09:38:15.811310 1 sync_worker.go:776] Skipping dns "cluster" (23 of 770) as unmanaged I0209 09:38:17.584653 1 sync_worker.go:776] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 770) as unmanaged I0209 09:38:20.639361 1 sync_worker.go:776] Skipping infrastructure "cluster" (26 of 770) as unmanaged I0209 09:38:20.639380 1 sync_worker.go:776] Skipping ingress "cluster" (27 of 770) as unmanaged I0209 09:38:20.639398 1 sync_worker.go:776] Skipping network "cluster" (28 of 770) as unmanaged I0209 09:38:22.140200 1 sync_worker.go:776] Skipping proxy "cluster" (32 of 770) as unmanaged I0209 09:38:22.141445 1 sync_worker.go:776] Skipping scheduler "cluster" (33 of 770) as unmanaged I0209 09:41:59.924559 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 1 in state Initializing at attempt 3 I0209 09:42:01.232882 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:42:06.365187 1 sync_worker.go:776] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 770) as unmanaged I0209 09:42:08.391388 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged Bootstrap CVO looks good. # oc -n openshift-cluster-version logs cluster-version-operator-98bcdcc86-cpn69 | grep 'Running sync.*in state\|Running sync.*dns.*cluster\|Detected while\|Ignoring detected version\|Propagating initial target version\|Running sync.* config .*cluster\|Skipping .* as unmanaged\|Result of work\|Work changed' I0209 09:43:25.819298 1 sync_worker.go:255] Propagating initial target version {4.11.0-0.nightly-2022-02-08-180554 registry.ci.openshift.org/ocp/release@sha256:5c3731a7afc9a8dc31fcafabe71dd4b83b6ac7273cd1fa10d12f916445a64709 false} to sync worker loop in state Initializing. 
I0209 09:43:25.822631 1 sync_worker.go:511] Detected while calculating next work: version (from { false} to {4.11.0-0.nightly-2022-02-08-180554 registry.ci.openshift.org/ocp/release@sha256:5c3731a7afc9a8dc31fcafabe71dd4b83b6ac7273cd1fa10d12f916445a64709 false}) and overrides changes I0209 09:43:25.822700 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 1 in state Initializing at attempt 0 I0209 09:43:28.205027 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:43:30.432653 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 09:43:32.939159 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 09:43:32.939170 1 sync_worker.go:776] Skipping dns "cluster" (23 of 770) as unmanaged I0209 09:43:35.011136 1 sync_worker.go:776] Skipping infrastructure "cluster" (26 of 770) as unmanaged I0209 09:43:35.011154 1 sync_worker.go:776] Skipping ingress "cluster" (27 of 770) as unmanaged I0209 09:43:35.011171 1 sync_worker.go:776] Skipping network "cluster" (28 of 770) as unmanaged I0209 09:43:37.698614 1 sync_worker.go:776] Skipping proxy "cluster" (32 of 770) as unmanaged I0209 09:43:37.698631 1 sync_worker.go:776] Skipping scheduler "cluster" (33 of 770) as unmanaged I0209 09:43:38.158585 1 sync_worker.go:776] Skipping customresourcedefinition "networks.config.openshift.io" (47 of 770) as unmanaged I0209 09:45:38.549286 1 sync_worker.go:511] Detected while considering cluster version generation 2: overrides change ([{ConfigMap kube-system ingress-uid true} {ConfigMap openshift-config cloud-provider-config true} {ConfigMap kube-system cluster-config-v1 true} {DNS config.openshift.io cluster true} {Infrastructure config.openshift.io cluster true} {Ingress config.openshift.io cluster true} {CustomResourceDefinition apiextensions.k8s.io networks.config.openshift.io true} {Network config.openshift.io cluster true} {Proxy config.openshift.io cluster true} {Scheduler config.openshift.io cluster true} {Secret kube-system kube-cloud-cfg true} {ConfigMap kube-system root-ca true} {Secret openshift-machine-config-operator machine-config-server-tls true} {Secret openshift-config pull-secret true} {Secret kube-system gcp-credentials true} {Secret kube-system kubeadmin true} {Role rbac.authorization.k8s.io kube-system gcp-creds-secret-reader true} {ConfigMap openshift-config openshift-install-manifests true} {PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}] to [{PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}]) I0209 09:45:38.630583 1 task_graph.go:546] Result of work: [Cluster operator kube-apiserver is not available Cluster operator authentication is not available Cluster operator image-registry is not available Cluster operator ingress is not available Cluster operator monitoring is not available Cluster operator openshift-samples is not available Cluster operator console is not available] I0209 09:45:38.648484 1 sync_worker.go:511] Detected while calculating next work: overrides change ([{ConfigMap kube-system ingress-uid true} {ConfigMap openshift-config cloud-provider-config true} {ConfigMap kube-system cluster-config-v1 true} {DNS config.openshift.io cluster true} {Infrastructure config.openshift.io cluster true} {Ingress config.openshift.io cluster true} {CustomResourceDefinition apiextensions.k8s.io 
networks.config.openshift.io true} {Network config.openshift.io cluster true} {Proxy config.openshift.io cluster true} {Scheduler config.openshift.io cluster true} {Secret kube-system kube-cloud-cfg true} {ConfigMap kube-system root-ca true} {Secret openshift-machine-config-operator machine-config-server-tls true} {Secret openshift-config pull-secret true} {Secret kube-system gcp-credentials true} {Secret kube-system kubeadmin true} {Role rbac.authorization.k8s.io kube-system gcp-creds-secret-reader true} {ConfigMap openshift-config openshift-install-manifests true} {PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}] to [{PrometheusRule monitoring.coreos.com openshift-machine-api machine-api-operator-prometheus-rules true}]) I0209 09:45:38.648660 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Initializing at attempt 1 I0209 09:45:41.023666 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:45:42.895609 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 09:45:44.436761 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 09:48:18.350215 1 task_graph.go:546] Result of work: [Cluster operator kube-apiserver is not available Cluster operator authentication is not available Cluster operator ingress is not available Cluster operator monitoring is not available Cluster operator console is not available] I0209 09:49:06.242723 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Initializing at attempt 2 I0209 09:49:07.569792 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:49:10.087267 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 09:49:15.283317 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 09:51:45.944372 1 task_graph.go:546] Result of work: [Cluster operator kube-apiserver is not available Cluster operator authentication is not available Cluster operator console is not available] I0209 09:53:07.442439 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Initializing at attempt 3 I0209 09:53:09.476108 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:53:11.302903 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 09:53:13.755557 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 09:55:24.942482 1 task_graph.go:546] Result of work: [] I0209 09:58:04.644601 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 09:58:04.845645 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 09:58:15.253443 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 09:58:20.583140 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 09:58:33.388160 1 task_graph.go:546] Result of work: [] I0209 10:01:13.089653 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:01:13.290250 1 
sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:01:23.698845 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:01:29.070506 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:01:41.828826 1 task_graph.go:546] Result of work: [] I0209 10:04:21.530633 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:04:21.728784 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:04:32.138515 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:04:37.457063 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:04:50.365215 1 task_graph.go:546] Result of work: [] I0209 10:07:30.066659 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:07:30.274922 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:07:40.676567 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:07:45.995535 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:07:58.814672 1 task_graph.go:546] Result of work: [] I0209 10:10:38.515931 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:10:38.711635 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:10:49.124265 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:10:54.445850 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:11:07.254945 1 task_graph.go:546] Result of work: [] I0209 10:13:46.957555 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:13:47.161303 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:13:57.516142 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:14:02.840961 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:14:15.695227 1 task_graph.go:546] Result of work: [] I0209 10:16:55.398579 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:16:55.606493 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:17:05.958672 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:17:11.277708 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:17:24.162985 1 task_graph.go:546] Result of work: [] I0209 10:20:03.864557 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:20:04.062980 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:20:14.471667 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:20:19.746068 1 sync_worker.go:776] Skipping prometheusrule 
"openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:20:32.604947 1 task_graph.go:546] Result of work: [] I0209 10:23:12.306493 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:23:12.503724 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:23:22.914980 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:23:28.187451 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:23:41.037732 1 task_graph.go:546] Result of work: [] I0209 10:26:20.738867 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:26:20.935367 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:26:31.346095 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:26:36.669632 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:26:49.492239 1 task_graph.go:546] Result of work: [] I0209 10:29:29.194558 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:29:29.391015 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:29:39.801679 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:29:45.123273 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:29:57.942004 1 task_graph.go:546] Result of work: [] I0209 10:32:37.643452 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:32:37.843874 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:32:48.251471 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:32:53.576501 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:33:06.380358 1 task_graph.go:546] Result of work: [] I0209 10:35:46.081615 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:35:46.280946 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:35:56.688856 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:36:02.009086 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:36:14.822498 1 task_graph.go:546] Result of work: [] I0209 10:38:54.524957 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:38:54.718921 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:39:05.082343 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:39:10.401286 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:39:23.262575 1 task_graph.go:546] Result of work: [] I0209 10:42:02.963625 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 
(force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:42:03.161844 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:42:13.577747 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:42:18.895183 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:42:31.721121 1 task_graph.go:546] Result of work: [] I0209 10:45:11.422724 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:45:11.624706 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:45:21.981476 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:45:27.301605 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:45:40.203716 1 task_graph.go:546] Result of work: [] I0209 10:48:19.905441 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:48:20.139336 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:48:30.513643 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:48:35.832514 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:48:48.640087 1 task_graph.go:546] Result of work: [] I0209 10:51:28.341580 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:51:28.539252 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:51:38.949724 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:51:44.270843 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:51:57.082121 1 task_graph.go:546] Result of work: [] I0209 10:54:36.787532 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:54:36.988823 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:54:47.395701 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:54:52.715829 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:55:05.531716 1 task_graph.go:546] Result of work: [] I0209 10:57:45.234883 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 10:57:45.437206 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 10:57:55.844700 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 10:58:01.261798 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 10:58:14.072135 1 task_graph.go:546] Result of work: [] I0209 11:00:53.773560 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:00:53.973840 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:01:04.381726 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) 
I0209 11:01:09.702085 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:01:22.518212 1 task_graph.go:546] Result of work: [] I0209 11:04:02.219243 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:04:02.428450 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:04:12.829033 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:04:18.160905 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:04:30.965196 1 task_graph.go:546] Result of work: [] I0209 11:07:10.673942 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:07:10.941351 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:07:21.301089 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:07:26.667742 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:07:39.431591 1 task_graph.go:546] Result of work: [] I0209 11:10:19.132591 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:10:19.339973 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:10:29.692298 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:10:35.016510 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:10:47.873403 1 task_graph.go:546] Result of work: [] I0209 11:13:27.574900 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:13:27.770469 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:13:38.183337 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:13:43.502852 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:13:56.314521 1 task_graph.go:546] Result of work: [] I0209 11:16:36.015924 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:16:36.221870 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:16:46.624936 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:16:51.942818 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:17:04.760443 1 task_graph.go:546] Result of work: [] I0209 11:19:44.462988 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:19:44.662915 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:19:55.071746 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:20:00.392989 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:20:13.201607 1 task_graph.go:546] Result of work: [] I0209 11:22:52.902555 1 
sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:22:53.104947 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:23:03.512205 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:23:08.831249 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:23:21.655319 1 task_graph.go:546] Result of work: [] I0209 11:26:01.359717 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:26:01.557560 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:26:11.966429 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:26:17.287877 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:26:30.122640 1 task_graph.go:546] Result of work: [] I0209 11:29:09.825291 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-08-180554 (force=false) on generation 2 in state Reconciling at attempt 0 I0209 11:29:10.036769 1 sync_worker.go:772] Running sync for dns "cluster" (23 of 770) I0209 11:29:20.433540 1 sync_worker.go:772] Running sync for config "cluster" (155 of 770) I0209 11:29:25.755072 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged I0209 11:29:38.557356 1 task_graph.go:546] Result of work: [] Control-plane CVO looks good as well. The Updating state never shows up in these logs (the CVO stays in Initializing on generation 2 and then moves straight to Reconciling), so I think verification for #728 passed. For #729, Trevor, could you please take a look at comment #32? Thanks.
Comment 39 looks good to me too. Going back to comment 32:

(In reply to Yang Yang from comment #32)
> - completionTime: "2022-01-28T05:48:13Z"
>   image: registry.ci.openshift.org/ocp/release@sha256:2f40c0fb545eb81b2bb4dbc87d381a1ab33beb9b6f42c33b3fb28806f63dea84
>   startedTime: "2022-01-28T05:40:27Z"
>   state: Partial
>   verified: false
>   version: 4.10.0-0.nightly-2022-01-28-042103

Yeah, we want 'state: Completed' here.

> # grep -r 'Running sync.*in state' cvo/ | sort
> cvo/2022-01-28T05:45:55+0000-cluster-version-operator-7c678c568b-gvmb8.log: I0128 05:48:13.408538 1 sync_worker.go:559] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 2 in state Initializing at attempt 0
> cvo/2022-01-28T05:45:55+0000-cluster-version-operator-7c678c568b-gvmb8.log: I0128 05:49:11.049978 1 sync_worker.go:559] Running sync quay.io/openshift-release-dev/ocp-release@sha256:33a77dfec9665e932183b36541dbaba5f6b873c37d4cc3cc7d094ff2dff057f2 (force=false) on generation 3 in state Updating at attempt 0

Hmm. The full logs from this run would be helpful. The 'Result of work' lines would tell us whether that Initializing attempt completed without error (unlikely, but possible); and if it did not complete successfully, there may be other context that explains the CVO's reasoning for the switch. And maybe #729 needs #733 or some later change to avoid the update transition?
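For anyone gathering that data, a single pass along the lines of the quoted grep should be enough; this is just a sketch reusing the patterns already used elsewhere in this bug (the cvo/ directory name comes from the quoted command; adjust it to wherever the CVO pod logs were collected):

# pull the state transitions, the per-attempt results, and the change-detection lines in one go
# grep -r 'Running sync.*in state\|Result of work\|Detected while' cvo/ | sort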
Verifying #729 with 4.11.0-0.nightly-2022-02-10-031822:

1) Install a cluster with 4.11.0-0.nightly-2022-02-10-031822
2) Override prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules"
3) During installation, upgrade to 4.11.0-0.nightly-2022-02-10-180317 (image registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755)

# oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755 --allow-explicit-upgrade --force --allow-upgrade-with-warnings
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
warning: --allow-upgrade-with-warnings is bypassing: the cluster is already upgrading:
  Reason:
  Message: Working towards 4.11.0-0.nightly-2022-02-10-031822: 656 of 770 done (85% complete)
Updating to release image registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755

After installation completes:

# oc get clusterversion -oyaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: ClusterVersion
  metadata:
    creationTimestamp: "2022-02-11T07:48:15Z"
    generation: 3
    name: version
    resourceVersion: "45975"
    uid: cb868617-7052-4168-a25d-e05356e113ee
  spec:
    channel: stable-4.10
    clusterID: 56b6f356-b7a6-4559-a8cb-d333f05d7554
    desiredUpdate:
      force: true
      image: registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755
      version: ""
    overrides:
    - group: monitoring.coreos.com
      kind: PrometheusRule
      name: machine-api-operator-prometheus-rules
      namespace: openshift-machine-api
      unmanaged: true
  status:
    availableUpdates: null
    capabilities: {}
    conditions:
    - lastTransitionTime: "2022-02-11T07:48:15Z"
      message: Disabling ownership via cluster version overrides prevents upgrades. Please remove overrides before continuing.
      reason: ClusterVersionOverridesSet
      status: "False"
      type: Upgradeable
    - lastTransitionTime: "2022-02-11T08:15:49Z"
      message: Done applying 4.11.0-0.nightly-2022-02-10-180317
      status: "True"
      type: Available
    - lastTransitionTime: "2022-02-11T08:15:49Z"
      status: "False"
      type: Failing
    - lastTransitionTime: "2022-02-11T08:15:49Z"
      message: Cluster version is 4.11.0-0.nightly-2022-02-10-180317
      status: "False"
      type: Progressing
    - lastTransitionTime: "2022-02-11T07:48:15Z"
      message: 'Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-02-10-180317 not found in the "stable-4.10" channel'
      reason: VersionNotFound
      status: "False"
      type: RetrievedUpdates
    desired:
      image: registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755
      version: 4.11.0-0.nightly-2022-02-10-180317
    history:
    - completionTime: "2022-02-11T08:15:49Z"
      image: registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755
      startedTime: "2022-02-11T08:07:07Z"
      state: Completed
      verified: true
      version: 4.11.0-0.nightly-2022-02-10-180317
    - completionTime: "2022-02-11T08:07:07Z"
      image: registry.ci.openshift.org/ocp/release@sha256:fcbbfdc2d06fe1587222dc2f69317a4031d113894fe913c3fd4fde76507169cd
      startedTime: "2022-02-11T07:48:15Z"
      state: Partial
      verified: false
      version: 4.11.0-0.nightly-2022-02-10-031822
    observedGeneration: 3
    versionHash: BpkwDMmFa2M=
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Well, it seems the install-time version change is not a big issue.
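(For step 2 above, the exact command used isn't shown in this comment. One minimal way to set such an override, assuming the default ClusterVersion object named "version", would be something like:

# set spec.overrides to mark the PrometheusRule as unmanaged; a merge patch replaces the whole overrides list
# oc patch clusterversion version --type merge --patch '{"spec":{"overrides":[{"group":"monitoring.coreos.com","kind":"PrometheusRule","namespace":"openshift-machine-api","name":"machine-api-operator-prometheus-rules","unmanaged":true}]}}'

and the resulting Upgradeable=False guard visible in the conditions above can be spot-checked with:

# oc get clusterversion version -o jsonpath='{.status.conditions[?(@.type=="Upgradeable")].message}'

Both commands are illustrative sketches, not necessarily how this verification was actually performed.)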
$ sudo crictl logs ffa028b2a3c94 2>&1 | grep 'Running sync.*in state' I0211 07:48:15.362013 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-10-031822 (force=false) on generation 1 in state Initializing at attempt 0 I0211 07:50:58.479015 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-10-031822 (force=false) on generation 1 in state Initializing at attempt 1 I0211 07:53:59.227421 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-10-031822 (force=false) on generation 1 in state Initializing at attempt 2 I0211 07:57:39.202030 1 sync_worker.go:559] Running sync 4.11.0-0.nightly-2022-02-10-031822 (force=false) on generation 2 in state Initializing at attempt 3 # grep -r 'Running sync.*in state' | sort -k 4,5 2022-02-11T08:04:32+0000-cluster-version-operator-55974f4bd6-s9sd6.log:I0211 08:07:07.223373 1 sync_worker.go:559] Running sync registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755 (force=true) on generation 3 in state Initializing at attempt 0 2022-02-11T08:07:09+0000-cluster-version-operator-85485ddd46-4xgrz.log:I0211 08:07:51.476199 1 sync_worker.go:559] Running sync registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755 (force=true) on generation 3 in state Initializing at attempt 0 2022-02-11T08:07:09+0000-cluster-version-operator-85485ddd46-4xgrz.log:I0211 08:10:44.573443 1 sync_worker.go:559] Running sync registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755 (force=true) on generation 3 in state Initializing at attempt 1 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:11:31.922276 1 sync_worker.go:559] Running sync registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755 (force=true) on generation 3 in state Initializing at attempt 0 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:15:25.815035 1 sync_worker.go:559] Running sync registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755 (force=true) on generation 3 in state Initializing at attempt 1 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:19:13.117953 1 sync_worker.go:559] Running sync registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755 (force=true) on generation 3 in state Reconciling at attempt 0 2022-02-11T08:20:49+0000-cluster-version-operator-85485ddd46-8xcc7.log:I0211 08:21:34.292985 1 sync_worker.go:559] Running sync registry.ci.openshift.org/ocp/release@sha256:2df40e16d9933cbf68737aee761604c22d1d4d1635c3a13a4c34b8efaafb7755 (force=true) on generation 3 in state Reconciling at attempt 0 The state transitions from Initializing (Version_A) -> Initializing (Version_B) -> Reconciling (Version_B). Looks good to me. 
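As an aside, that Initializing -> Initializing -> Reconciling sequence can be condensed directly out of the collected logs by extending the grep | sort pipeline above; this is just a convenience sketch, run from the directory holding the pod logs:

# print only the state names, in the same order the grep above lists them
# grep -r 'Running sync.*in state' . | sort -k 4,5 | sed 's/.*in state //;s/ at attempt.*//'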
# grep -r 'prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules"' 2022-02-11T08:20:49+0000-cluster-version-operator-85485ddd46-8xcc7.log:I0211 08:22:02.108384 1 sync_worker.go:772] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) 2022-02-11T08:20:49+0000-cluster-version-operator-85485ddd46-8xcc7.log:I0211 08:22:02.108395 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:11:39.228554 1 sync_worker.go:772] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:11:39.228652 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:15:29.279041 1 sync_worker.go:772] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:15:29.279054 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:19:29.049442 1 sync_worker.go:772] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) 2022-02-11T08:10:48+0000-cluster-version-operator-85485ddd46-g9mxm.log:I0211 08:19:29.049450 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged 2022-02-11T08:07:09+0000-cluster-version-operator-85485ddd46-4xgrz.log:I0211 08:07:58.020101 1 sync_worker.go:772] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) 2022-02-11T08:07:09+0000-cluster-version-operator-85485ddd46-4xgrz.log:I0211 08:07:58.020533 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged 2022-02-11T08:07:09+0000-cluster-version-operator-85485ddd46-4xgrz.log:I0211 08:10:49.209627 1 sync_worker.go:772] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) 2022-02-11T08:07:09+0000-cluster-version-operator-85485ddd46-4xgrz.log:I0211 08:10:49.209637 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged 2022-02-11T08:04:32+0000-cluster-version-operator-55974f4bd6-s9sd6.log:I0211 08:07:18.010050 1 sync_worker.go:772] Running sync for prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) 2022-02-11T08:04:32+0000-cluster-version-operator-55974f4bd6-s9sd6.log:I0211 08:07:18.010061 1 sync_worker.go:776] Skipping prometheusrule "openshift-machine-api/machine-api-operator-prometheus-rules" (737 of 770) as unmanaged The overrides resource is skipped. Looks good to me. So based on the comment and comment#39, moving it to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069