The image-registry folks added shareProcessNamespace to a Deployment between 4.1 and 4.2:

$ git --no-pager log --oneline -G shareProcessNamespace origin/release-4.2..origin/master -- manifests
...no hits...
$ git --no-pager log --oneline -G shareProcessNamespace origin/release-4.1..origin/master -- manifests
cc9e9fe05 (origin/pr/364) Integrating watchdog as a sidecar to registry operator.
3803d25ff Revert "Integrating watchdog as a sidecar to registry operator."
ffbb403ef (origin/pr/342) Integrating watchdog as a sidecar to registry operator.

But the CVO does not reconcile that property today, so whatever value the in-cluster Deployment had when it was created is preserved, regardless of the value in later manifests. We should start reconciling this property, audit for other pod properties we are missing, and then backport that fix as far as we can, excepting end-of-life versions.

Spun out from bug 1857782.
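To make the gap concrete, here is a minimal, runnable sketch of the field-by-field merge pattern involved (hypothetical type and helper names, not the CVO's actual resourcemerge code): a pod field the merge helper never handles, as shareProcessNamespace was before this fix, is never compared against the manifest, so whatever the in-cluster Deployment had at creation time just sticks.

package main

import "fmt"

// podSpec is a deliberately tiny stand-in for the Deployment pod spec,
// only for illustration; the CVO works on the real appsv1/corev1 types.
type podSpec struct {
	ServiceAccountName    string
	ShareProcessNamespace *bool
}

// ensurePodSpecSketch mimics the field-by-field merge style: only fields it
// explicitly handles are reconciled against the manifest.
func ensurePodSpecSketch(modified *bool, existing *podSpec, required podSpec) {
	if existing.ServiceAccountName != required.ServiceAccountName {
		existing.ServiceAccountName = required.ServiceAccountName
		*modified = true
	}
	// ...other handled fields...
	// No ShareProcessNamespace stanza here: the manifest value is never
	// compared or copied, which is the gap this bug asks the CVO to close.
}

func main() {
	// In-cluster object created from a 4.1 manifest (no shareProcessNamespace).
	inCluster := podSpec{ServiceAccountName: "cluster-image-registry-operator"}

	// Later manifest (4.2+) asks for shareProcessNamespace: true.
	t := true
	manifest := podSpec{ServiceAccountName: "cluster-image-registry-operator", ShareProcessNamespace: &t}

	modified := false
	ensurePodSpecSketch(&modified, &inCluster, manifest)
	fmt.Println(inCluster.ShareProcessNamespace, modified) // <nil> false: the new field never lands
}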
Test plan for this, based on [1]:

1. Create a 4.1 cluster.
2. Confirm that 'oc -n openshift-image-registry get -o yaml deployment cluster-image-registry-operator' does not include shareProcessNamespace.
3. Confirm that 'oc -n openshift-dns-operator get -o yaml deployment dns-operator' does not include .spec.template.spec.terminationGracePeriodSeconds (it will include .spec.template.spec.containers[].terminationGracePeriodSeconds).
4. Update along 4.2 -> 4.3 -> 4.4 -> 4.5.
5. Confirm that the registry deployment still lacks shareProcessNamespace and the DNS operator still has the mispositioned terminationGracePeriodSeconds.
6. Update to a 4.6 nightly with the fix.
7. Confirm that the registry deployment includes 'shareProcessNamespace: true' and that the DNS operator has the correctly-positioned terminationGracePeriodSeconds after the manifest change from [2] (both of which landed in 4.2 manifests but had no effect because of this CVO bug).

The only dnsPolicy consumer in 4.6.0-fc.0 is [3], which hasn't changed since 4.1. So there are no easy testing ideas there, but also no downside if we've somehow botched it in this fix. Standard regression testing should be sufficient.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1857782#c9
[2]: https://github.com/openshift/cluster-dns-operator/pull/114
[3]: https://github.com/openshift/cluster-dns-operator/blame/4f21006b681c1165abc96173c8cc51ad1f50f90e/manifests/0000_70_dns-operator_02-deployment.yaml#L16
No payload is available to verify this bug yet.
Since 4.1 and 4.2 are sunset, I tried to upgrade from 4.1 through 4.2->4.3->4.4->4.5->4.6; the machine-config-operator pod fails to start with the error below:

$ oc logs pods/machine-config-operator-677c5786c8-zdvhn -n openshift-machine-config-operator
I0817 07:31:12.744568 1 start.go:46] Version: 4.6.0-0.nightly-2020-08-16-072105 (Raw: v4.6.0-202008130129.p0-dirty, Hash: b7f3c7043aa9e6a5ca4718f53e26a1db9c5716f6)
I0817 07:31:12.747468 1 leaderelection.go:242] attempting to acquire leader lease openshift-machine-config-operator/machine-config...
I0817 07:33:10.566770 1 leaderelection.go:252] successfully acquired lease openshift-machine-config-operator/machine-config
I0817 07:33:11.183070 1 operator.go:270] Starting MachineConfigOperator
E0817 07:33:13.375289 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 239 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x18151e0, 0x2a37200)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x82
panic(0x18151e0, 0x2a37200)
    /opt/rh/go-toolset-1.14/root/usr/lib/go-toolset-1.14-golang/src/runtime/panic.go:969 +0x166
github.com/openshift/machine-config-operator/lib/resourcemerge.ensureControllerConfigSpec(0xc00082f80f, 0xc000133798, 0xc000c7d5e0, 0xb, 0x0, 0x0, 0xc000c7d718, 0x3, 0xc000dfced0, 0x26, ...)
    /go/src/github.com/openshift/machine-config-operator/lib/resourcemerge/machineconfig.go:83 +0x19f
github.com/openshift/machine-config-operator/lib/resourcemerge.EnsureControllerConfig(0xc00082f80f, 0xc000133680, 0x16c3311, 0x10, 0xc000dfcf60, 0x24, 0xc000d7bc60, 0x19, 0x0, 0x0, ...)
    /go/src/github.com/openshift/machine-config-operator/lib/resourcemerge/machineconfig.go:19 +0xd4
github.com/openshift/machine-config-operator/lib/resourceapply.ApplyControllerConfig(0x7f10896bc890, 0xc000096a90, 0xc000133400, 0x7f10896bc890, 0xc000096a90, 0x5aba, 0x5b33)
    /go/src/github.com/openshift/machine-config-operator/lib/resourceapply/machineconfig.go:67 +0x185
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).syncMachineConfigController(0xc000596000, 0xc000121880, 0xc01334abea, 0x6e164d6c948)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/sync.go:468 +0x438
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).syncAll(0xc000596000, 0xc00082fca8, 0x6, 0x6, 0xc0007c2c01, 0x413893)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/sync.go:69 +0x177
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).sync(0xc000596000, 0xc0007dac90, 0x30, 0x0, 0x0)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/operator.go:362 +0x40a
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).processNextWorkItem(0xc000596000, 0x203000)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/operator.go:318 +0xd2
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).worker(0xc000596000)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/operator.go:307 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000428030)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000428030, 0x1cc05a0, 0xc000bc0000, 0xc000406001, 0xc000094180)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000428030, 0x3b9aca00, 0x0, 0x1, 0xc000094180)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0xe2
k8s.io/apimachinery/pkg/util/wait.Until(0xc000428030, 0x3b9aca00, 0xc000094180)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/openshift/machine-config-operator/pkg/operator.(*Operator).Run
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/operator.go:276 +0x3dc
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1a8 pc=0x13ab99f]

goroutine 239 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:55 +0x105
panic(0x18151e0, 0x2a37200)
    /opt/rh/go-toolset-1.14/root/usr/lib/go-toolset-1.14-golang/src/runtime/panic.go:969 +0x166
github.com/openshift/machine-config-operator/lib/resourcemerge.ensureControllerConfigSpec(0xc00082f80f, 0xc000133798, 0xc000c7d5e0, 0xb, 0x0, 0x0, 0xc000c7d718, 0x3, 0xc000dfced0, 0x26, ...)
    /go/src/github.com/openshift/machine-config-operator/lib/resourcemerge/machineconfig.go:83 +0x19f
github.com/openshift/machine-config-operator/lib/resourcemerge.EnsureControllerConfig(0xc00082f80f, 0xc000133680, 0x16c3311, 0x10, 0xc000dfcf60, 0x24, 0xc000d7bc60, 0x19, 0x0, 0x0, ...)
    /go/src/github.com/openshift/machine-config-operator/lib/resourcemerge/machineconfig.go:19 +0xd4
github.com/openshift/machine-config-operator/lib/resourceapply.ApplyControllerConfig(0x7f10896bc890, 0xc000096a90, 0xc000133400, 0x7f10896bc890, 0xc000096a90, 0x5aba, 0x5b33)
    /go/src/github.com/openshift/machine-config-operator/lib/resourceapply/machineconfig.go:67 +0x185
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).syncMachineConfigController(0xc000596000, 0xc000121880, 0xc01334abea, 0x6e164d6c948)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/sync.go:468 +0x438
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).syncAll(0xc000596000, 0xc00082fca8, 0x6, 0x6, 0xc0007c2c01, 0x413893)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/sync.go:69 +0x177
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).sync(0xc000596000, 0xc0007dac90, 0x30, 0x0, 0x0)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/operator.go:362 +0x40a
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).processNextWorkItem(0xc000596000, 0x203000)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/operator.go:318 +0xd2
github.com/openshift/machine-config-operator/pkg/operator.(*Operator).worker(0xc000596000)
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/operator.go:307 +0x2b
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000428030)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155 +0x5f
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000428030, 0x1cc05a0, 0xc000bc0000, 0xc000406001, 0xc000094180)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156 +0xa3
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc000428030, 0x3b9aca00, 0x0, 0x1, 0xc000094180)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133 +0xe2
k8s.io/apimachinery/pkg/util/wait.Until(0xc000428030, 0x3b9aca00, 0xc000094180)
    /go/src/github.com/openshift/machine-config-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90 +0x4d
created by github.com/openshift/machine-config-operator/pkg/operator.(*Operator).Run
    /go/src/github.com/openshift/machine-config-operator/pkg/operator/operator.go:276 +0x3dc
(In reply to Wenjing Zheng from comment #5)
> Since 4.1 and 4.2 are sunset, I tried to upgrade from 4.1 through
> 4.2->4.3->4.4->4.5->4.6; the machine-config-operator pod fails to start
> with the error below:
...

That looks like a different issue. Did the image-registry pod have spec.shareProcessNamespace set?
comment #5 is about https://bugzilla.redhat.com/show_bug.cgi?id=1861404
(In reply to Vadim Rutkovsky from comment #6)
> (In reply to Wenjing Zheng from comment #5)
> > Since 4.1 and 4.2 are sunset, I tried to upgrade from 4.1 through
> > 4.2->4.3->4.4->4.5->4.6; the machine-config-operator pod fails to start
> > with the error below:
> ...
>
> That looks like a different issue. Did the image-registry pod have
> spec.shareProcessNamespace set?

After upgrading to the latest 4.6 nightly build, the image-registry pod has NO spec.shareProcessNamespace set in the current cluster (machine-config remains at 4.5).
I checked the DNS operator settings in the same upgraded cluster (the one with the machine-config-operator pod error), and the DNS operator seems to have the correctly-positioned terminationGracePeriodSeconds and dnsPolicy:

$ oc -n openshift-dns-operator get deploy/dns-operator -o go-template='{{.spec.template.spec.terminationGracePeriodSeconds}}'
2
$ oc -n openshift-dns-operator get deploy/dns-operator -o go-template='{{.spec.template.spec.dnsPolicy}}'
Default
I am confused about the earlier verification attempt. Bug 1861404 is talking about a 4.6.0-0.nightly-2020-07-25-091217 target, but that is quite old, long before the PR addressing this bug landed:

$ oc adm release info --commits registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-07-25-091217 | grep cluster-version-operator
  cluster-version-operator  https://github.com/openshift/cluster-version-operator  a49fef5c66c6b0707c54fd93f84d2f51d3d28aca
$ git log --oneline origin/master | grep -n 'a49fef5c\|a03a8957'
2:a03a8957 Merge pull request #428 from vrutkovs/shareProcessNamespace
39:a49fef5c Merge pull request #411 from deads2k/emit-events-on-update

And there should be no need to involve nightlies for the earlier 4.y hops, where we have official releases available. You should be able to use:

$ for V in 4.1 4.2 4.3 4.4 4.5; do curl -sH 'Accept:application/json' "https://api.openshift.com/api/upgrades_info/v1/graph?channel=stable-${V}" | jq -r '.nodes[].version' | sort -V | tail -n1; done
4.1.41
4.2.36
4.3.31
4.4.16
4.5.5

and then a hop to a recent 4.6 nightly. In fact, 4.6.0-fc.1 is modern enough to include the patch:

$ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.6.0-fc.1-x86_64 | grep cluster-version-operator
  cluster-version-operator  https://github.com/openshift/cluster-version-operator  71aef74480d199fe96a590f2f1e4e8056a9cb687
$ git log --oneline origin/master | grep -n '71aef744\|a03a8957'
1:71aef744 Merge pull request #430 from wking/drop-available-deployment-check
2:a03a8957 Merge pull request #428 from vrutkovs/shareProcessNamespace

If you do not see the expected behavior in the next verification attempt, can you attach a must-gather with the cluster's final state to this bug?
I still cannot see the "shareProcessNamespace" option when the cluster is upgraded to 4.6.0-0.nightly-2020-08-18-055142 (we can ignore bug https://bugzilla.redhat.com/show_bug.cgi?id=1861404, since it has not been fixed yet and will still be present in the latest 4.6 nightly build; it was just reported against 4.6.0-0.nightly-2020-07-25-091217).

Upgrade path: 4.1.41-x86_64 -> 4.2.36-x86_64 -> 4.3.31-x86_64 -> 4.4.16-x86_64 -> 4.5.5-x86_64 -> 4.6.0-0.nightly-2020-08-18-055142

4.2-4.5 has "shareProcessNamespace": https://github.com/openshift/cluster-image-registry-operator/blob/release-4.5/manifests/07-operator.yaml#L20
4.6 has no "shareProcessNamespace": https://github.com/openshift/cluster-image-registry-operator/blob/master/manifests/07-operator.yaml
https://github.com/openshift/cluster-image-registry-operator/pull/587
Are we sure these nightlies are up to date? I know the names are recent, but are the contents?
(In reply to Wenjing Zheng from comment #11)
> 4.2-4.5 has "shareProcessNamespace"...

In your test cluster? Or are you just talking about the source repositories? I would expect the born-in-4.1 test cluster to lack shareProcessNamespace until it was updated to a release which had both the CVO patch from this bug and a manifest which requested shareProcessNamespace be set.

> https://github.com/openshift/cluster-image-registry-operator/pull/587

Huh, I hadn't realized that they'd removed it in 4.6. I don't know whether there are any 4.6 nightlies which have our CVO patch from this bug landed (Aug. 12th and later [1]) but still have shareProcessNamespace set in the registry manifest (Aug. 5th and earlier [2]). Seems unlikely. I guess we could build a release image like that, if we wanted.

[1]: https://github.com/openshift/cluster-version-operator/pull/428#event-3649049638
[2]: https://github.com/openshift/cluster-image-registry-operator/pull/587#event-3626507070
Ah, the 4.6 change means we can verify via a shorter update path that sticks to 4.6:

$ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.6.0-fc.0-x86_64 | grep 'cluster-version-operator\|cluster-image-registry-operator'
  cluster-image-registry-operator  https://github.com/openshift/cluster-image-registry-operator  8eb457b2b93324c1954f5af439fb9c4612a93fc9
  cluster-version-operator  https://github.com/openshift/cluster-version-operator  d2fc678353769e10a614fb98c15279da3b2b0ca5
$ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.6.0-fc.1-x86_64 | grep 'cluster-version-operator\|cluster-image-registry-operator'
  cluster-image-registry-operator  https://github.com/openshift/cluster-image-registry-operator  5eda706684c9ec69ae8be5745f0daf740fe947fe
  cluster-version-operator  https://github.com/openshift/cluster-version-operator  71aef74480d199fe96a590f2f1e4e8056a9cb687
$ git --no-pager -C cluster-image-registry-operator log --oneline --first-parent origin/master | grep '66bf2fe\|8eb457b2\|5eda7066'
5eda70668 Merge pull request #586 from ricardomaraschini/bz-1857684
66bf2feb3 Merge pull request #587 from ricardomaraschini/remove-shared-namespace
8eb457b2b Merge pull request #584 from dmage/ignore-invalid-refs
$ git --no-pager -C cluster-version-operator log --oneline --first-parent origin/master | grep 'a03a895\|d2fc6783\|71aef744'
71aef744 Merge pull request #430 from wking/drop-available-deployment-check
a03a8957 Merge pull request #428 from vrutkovs/shareProcessNamespace
d2fc6783 Merge pull request #423 from wking/clarify-currently-installed

So you should be able to validate with:

1. Install 4.6.0-fc.0. Verify that shareProcessNamespace is set.
2. Update to 4.6.0-fc.1. Verify that shareProcessNamespace is not set.
Testing:

1. Launch 4.6.0-fc.0 with cluster-bot: launch quay.io/openshift-release-dev/ocp-release:4.6.0-fc.0-x86_64

2. Set a channel:

$ oc patch clusterversion version --type json -p '[{"op": "add", "path": "/spec/channel", "value": "candidate-4.6"}]'

3. Check that the property is set:

$ oc -n openshift-image-registry get -o jsonpath='{.spec.template.spec.shareProcessNamespace}{"\n"}' deployment cluster-image-registry-operator
true

4. Update to 4.6.0-fc.1:

$ oc adm upgrade --to 4.6.0-fc.1
$ sleep # wait for the update to complete

5. Check that the property is not set:

$ oc -n openshift-image-registry get -o jsonpath='{.spec.template.spec.shareProcessNamespace}{"\n"}' deployment cluster-image-registry-operator
true

Huh. Ah, because setBoolPtr treats "unset in the manifest" as "the operator does not care", not as "return to the Kube default for this property" [1]. I'll follow up with the registry folks about that distinction.

Logs and artifacts and whatnot for my run will be in [2] once cluster-bot times the job out and collects them.

[1]: https://github.com/openshift/cluster-version-operator/blob/47d87e1083cbc6921e0485a1c71eb91525ae5d4f/lib/resourcemerge/core.go#L539-L540
[2]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-gcp/1295851130839896064
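For anyone not following the setBoolPtr reference, here is a minimal, runnable sketch of that distinction (hypothetical function name; the real helper lives in lib/resourcemerge/core.go at [1] above, and this only illustrates its "nil required means the operator does not care" behavior rather than copying it): a manifest that simply drops the field leaves the in-cluster value alone, while an explicit false does get reconciled.

package main

import "fmt"

// setBoolPtrSketch: required is the value from the release-image manifest,
// existing is the field on the in-cluster object, and modified records
// whether an update is needed.
func setBoolPtrSketch(modified *bool, existing **bool, required *bool) {
	if required == nil {
		// Manifest omits the field entirely: treat it as "don't care" and
		// leave the in-cluster value alone instead of resetting it to the
		// Kubernetes default.
		return
	}
	if *existing == nil || **existing != *required {
		*existing = required
		*modified = true
	}
}

func main() {
	t := true
	inCluster := &t // shareProcessNamespace: true left over from an older manifest
	modified := false

	// 4.6 manifest drops the field (required == nil), so nothing changes...
	setBoolPtrSketch(&modified, &inCluster, nil)
	fmt.Println(*inCluster, modified) // true false

	// ...whereas an explicit false in the manifest does flip the in-cluster value.
	f := false
	setBoolPtrSketch(&modified, &inCluster, &f)
	fmt.Println(*inCluster, modified) // false true
}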
(In reply to W. Trevor King from comment #16)
> Testing:
>
> 4. Update to 4.6.0-fc.1:
>
> $ oc adm upgrade --to 4.6.0-fc.1
> $ sleep # wait for the update to complete

I am stuck at step #4 and cannot upgrade to 4.6.0-fc.1. Here is some information:

$ oc adm upgrade
Cluster version is 4.6.0-fc.0

Updates:

VERSION     IMAGE
4.6.0-fc.1  quay.io/openshift-release-dev/ocp-release@sha256:b0fcdaaac358ad352bb4a948ac1f88ad728c4b9b044c13a9e1294706d643dc7c

$ oc adm upgrade --to=4.6.0-fc.1
Updating to 4.6.0-fc.1

$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-fc.0   True        False         121m    Cluster version is 4.6.0-fc.0

$ oc get all -n openshift-cluster-version
W0819 15:38:53.829116 5731 warnings.go:67] batch/v1beta1 CronJob is deprecated in v1.22+, unavailable in v1.25+
NAME                                            READY   STATUS    RESTARTS   AGE
pod/cluster-version-operator-77555b6fd9-g86w5   1/1     Running   0          4h2m

NAME                               TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/cluster-version-operator   ClusterIP   172.30.48.43   <none>        9099/TCP   4h29m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/cluster-version-operator   1/1     1            1           4h29m

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/cluster-version-operator-5cf8c489d9   0         0         0       4h29m
replicaset.apps/cluster-version-operator-77555b6fd9   1         1         1       4h29m

$ oc get clusterversion -o json|jq -r '.items[].spec'
{
  "channel": "candidate-4.6",
  "clusterID": "fd5470e8-1ab1-4350-9334-ff097c9d2364",
  "upstream": "https://api.openshift.com/api/upgrades_info/v1/graph"
}

$ oc version
Client Version: 4.6.0-fc.1
Server Version: 4.6.0-fc.0
(In reply to W. Trevor King from comment #14)
> (In reply to Wenjing Zheng from comment #11)
> > 4.2-4.5 has "shareProcessNamespace"...
>
> In your test cluster? Or are you just talking about the source
> repositories? I would expect the born-in-4.1 test cluster to lack
> shareProcessNamespace until it was updated to a release which had both the
> CVO patch from this bug and a manifest which requested shareProcessNamespace
> be set.

My cluster has never had "shareProcessNamespace", matching your expectation; I was saying the source repo has it. Sorry for the confusion!
Correction for the output of the command $ oc get clusterversion -o json|jq -r '.items[].spec'; the output should be:

{
  "channel": "candidate-4.6",
  "clusterID": "fd5470e8-1ab1-4350-9334-ff097c9d2364",
  "desiredUpdate": {
    "force": false,
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:b0fcdaaac358ad352bb4a948ac1f88ad728c4b9b044c13a9e1294706d643dc7c",
    "version": "4.6.0-fc.1"
  },
  "upstream": "https://api.openshift.com/api/upgrades_info/v1/graph"
}
> Ah, because setBoolPtr treats "unset in the manifest" as "the operator does
> not care", not as "return to the Kube default for this property"...

Registry folks have [1] in flight with an explicit 'false'.

> I am stuck at step #4 and cannot upgrade to 4.6.0-fc.1.
> ...
>   "clusterID": "fd5470e8-1ab1-4350-9334-ff097c9d2364",

Huh. Pulling more of the ClusterVersion from the final Insights tarball from that cluster:

$ tar -xOz config/version.json <20200819071410-807d045c114f47899b5ea002f7c1a7aa | jq '{spec, status}'
{
  "spec": {
    "clusterID": "fd5470e8-1ab1-4350-9334-ff097c9d2364",
    "desiredUpdate": {
      "version": "4.6.0-fc.1",
      "image": "quay.io/openshift-release-dev/ocp-release@sha256:b0fcdaaac358ad352bb4a948ac1f88ad728c4b9b044c13a9e1294706d643dc7c",
      "force": false
    },
    "upstream": "xxxxx://xxx.xxxxxxxxx.xxx/xxx/xxxxxxxxxxxxx/xx/xxxxx",
    "channel": "candidate-4.6"
  },
  "status": {
    "desired": {
      "version": "4.6.0-fc.0",
      "image": "quay.io/openshift-release-dev/ocp-release@sha256:45e6bc583040384efb4033b22c58f054b12ac32c7874554885d74a0faf6fef79",
      "force": false
    },
    "history": [
      {
        "state": "Completed",
        "startedTime": "2020-08-19T03:09:41Z",
        "completionTime": "2020-08-19T03:34:06Z",
        "version": "4.6.0-fc.0",
        "image": "quay.io/openshift-release-dev/ocp-release@sha256:45e6bc583040384efb4033b22c58f054b12ac32c7874554885d74a0faf6fef79",
        "verified": false
      }
    ],
    "observedGeneration": 2,
    "versionHash": "kVdi1UOYMBM=",
    "conditions": [
      {
        "type": "Available",
        "status": "True",
        "lastTransitionTime": "2020-08-19T03:34:06Z",
        "message": "Done applying 4.6.0-fc.0"
      },
      {
        "type": "Failing",
        "status": "False",
        "lastTransitionTime": "2020-08-19T03:34:06Z"
      },
      {
        "type": "Progressing",
        "status": "False",
        "lastTransitionTime": "2020-08-19T03:34:06Z",
        "message": "Cluster version is 4.6.0-fc.0"
      },
      {
        "type": "RetrievedUpdates",
        "status": "True",
        "lastTransitionTime": "2020-08-19T05:29:32Z"
      }
    ],
    "availableUpdates": [
      {
        "version": "4.6.0-fc.1",
        "image": "quay.io/openshift-release-dev/ocp-release@sha256:b0fcdaaac358ad352bb4a948ac1f88ad728c4b9b044c13a9e1294706d643dc7c",
        "force": false
      }
    ]
  }
}

Not clear to me why the CVO is neither accepting the requested desiredUpdate nor complaining with a condition about why it isn't accepting it. If you can reproduce, can you attach CVO logs from your stuck cluster?

[1]: https://github.com/openshift/cluster-image-registry-operator/pull/591
-bash-4.2$ ./oc adm upgrade
Cluster version is 4.6.0-fc.0

Updates:

VERSION     IMAGE
4.6.0-fc.1  quay.io/openshift-release-dev/ocp-release@sha256:b0fcdaaac358ad352bb4a948ac1f88ad728c4b9b044c13a9e1294706d643dc7c

-bash-4.2$ ./oc get clusterversion -o json|jq -r '.items[].spec'
{
  "channel": "candidate-4.6",
  "clusterID": "8882329b-b6b6-4752-a560-915775f4b1b4",
  "upstream": "https://api.openshift.com/api/upgrades_info/v1/graph"
}
-bash-4.2$ ./oc adm upgrade --to 4.6.0-fc.1
Updating to 4.6.0-fc.1
-bash-4.2$ ./oc get clusterversion -o json|jq -r '.items[].spec'
{
  "channel": "candidate-4.6",
  "clusterID": "8882329b-b6b6-4752-a560-915775f4b1b4",
  "desiredUpdate": {
    "force": false,
    "image": "quay.io/openshift-release-dev/ocp-release@sha256:b0fcdaaac358ad352bb4a948ac1f88ad728c4b9b044c13a9e1294706d643dc7c",
    "version": "4.6.0-fc.1"
  },
  "upstream": "https://api.openshift.com/api/upgrades_info/v1/graph"
}
The upgrade to 4.6.0-fc.1 is successful now. QE will wait for a nightly that contains the PR below [1] to have more confidence. Thanks for your support, @wking!

[1]: https://github.com/openshift/cluster-image-registry-operator/pull/591
4.6.0-0.nightly-2020-08-21-084833 and later have the explicit false.
Upgrade from 4.6.0-fc.0 to 4.6.0-0.nightly-2020-08-24-034934:

$ oc -n openshift-image-registry get -o jsonpath='{.spec.template.spec.shareProcessNamespace}{"\n"}' deployment cluster-image-registry-operator
true
$ oc adm upgrade --to-image=registry.svc.ci.openshift.org/ocp/release:4.6.0-0.nightly-2020-08-24-034934 --force=true --allow-explicit-upgrade
wait..
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-08-24-034934   True        False         60s     Cluster version is 4.6.0-0.nightly-2020-08-24-034934
$ oc -n openshift-image-registry get -o jsonpath='{.spec.template.spec.shareProcessNamespace}{"\n"}' deployment cluster-image-registry-operator
false
*** Bug 1857782 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196