Description of problem:
After enabling capabilities by changing baselineCapabilitySet from None to v4.11, baremetal, marketplace and openshift-samples are installed, but the CVO does not list the newly enabled capabilities in cv.status.capabilities.enabledCapabilities.

# oc get clusterversion -oyaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: ClusterVersion
  metadata:
    creationTimestamp: "2022-04-01T03:02:32Z"
    generation: 4
    name: version
    resourceVersion: "76564"
    uid: 730ec007-9c8c-4069-8b88-ce5147797353
  spec:
    capabilities:
      additionalEnabledCapabilities:
      - marketplace
      baselineCapabilitySet: v4.11
    channel: stable-4.11
    clusterID: 176f1e28-9c97-4255-a429-4833f17202af
  status:
    availableUpdates: null
    capabilities:
      enabledCapabilities:
      - marketplace
      knownCapabilities:
      - baremetal
      - marketplace
      - openshift-samples
    conditions:
    - lastTransitionTime: "2022-04-01T03:02:35Z"
      message: 'Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest not found in the "stable-4.11" channel'
      reason: VersionNotFound
      status: "False"
      type: RetrievedUpdates
    - lastTransitionTime: "2022-04-01T03:02:35Z"
      message: Capabilities match configured spec
      reason: AsExpected
      status: "False"
      type: ImplicitlyEnabledCapabilities
    - lastTransitionTime: "2022-04-01T03:02:35Z"
      message: Payload loaded version="4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest" image="registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4"
      reason: PayloadLoaded
      status: "True"
      type: ReleaseAccepted
    - lastTransitionTime: "2022-04-01T04:35:59Z"
      message: Done applying 4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest
      status: "True"
      type: Available
    - lastTransitionTime: "2022-04-01T05:54:59Z"
      status: "False"
      type: Failing
    - lastTransitionTime: "2022-04-01T05:55:44Z"
      message: Cluster version is 4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest
      status: "False"
      type: Progressing
    desired:
      image: registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4
      version: 4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest
    history:
    - completionTime: "2022-04-01T04:35:59Z"
      image: registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4
      startedTime: "2022-04-01T03:02:35Z"
      state: Completed
      verified: false
      version: 4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest
    observedGeneration: 4
    versionHash: Q_qcbW4ZuAs=
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

# oc get co | grep 'baremetal\|samples\|marketplace'
baremetal           4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest   True   False   False   133m
marketplace         4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest   True   False   False   3h31m
openshift-samples   4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest   True   False   False   132m

Version-Release number of the following components:
rpm -q openshift-ansible
rpm -q ansible
ansible --version

How reproducible:
1/1

Steps to Reproduce:
1. Install a cluster with the following capability set:
spec:
  capabilities:
    additionalEnabledCapabilities:
    - marketplace
    baselineCapabilitySet: None
2. Change baselineCapabilitySet to v4.11:
spec:
  capabilities:
    additionalEnabledCapabilities:
    - marketplace
    baselineCapabilitySet: v4.11

Actual results:
cv.status.capabilities.enabledCapabilities does not show the capabilities enabled on day 2 through baselineCapabilitySet.

Expected results:
cv.status.capabilities.enabledCapabilities shows the capabilities enabled on day 2 through baselineCapabilitySet.

Additional info:
Capabilities enabled on day 2 through additionalEnabledCapabilities do show up in cv.status.capabilities.enabledCapabilities.
CVO log is available at https://drive.google.com/file/d/1zcPyDqTePN6Hdey4je2Y6pqCXkKkG2U6/view?usp=sharing
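The mismatch between spec and status in this report can be checked mechanically. The following is a minimal Python sketch, not anything the CVO ships. It assumes that the v4.11 baseline set is exactly the three capabilities listed under knownCapabilities in this report, and it only handles the two baselineCapabilitySet values seen here (None and v4.11). The names BASELINE_SETS, expected_enabled and missing_from_status are hypothetical.

```python
# Hypothetical helper, not part of the CVO or the oc client.
# Assumption: the v4.11 baseline set is the three capabilities that this
# report lists under status.capabilities.knownCapabilities.
BASELINE_SETS = {
    "None": set(),
    "v4.11": {"baremetal", "marketplace", "openshift-samples"},
}


def expected_enabled(spec_capabilities):
    """Return the sorted capabilities requested by a spec.capabilities stanza."""
    baseline = BASELINE_SETS[spec_capabilities["baselineCapabilitySet"]]
    additional = spec_capabilities.get("additionalEnabledCapabilities", [])
    return sorted(baseline | set(additional))


def missing_from_status(cluster_version):
    """List capabilities the spec requests that status does not report."""
    expected = expected_enabled(cluster_version["spec"]["capabilities"])
    status_caps = cluster_version.get("status", {}).get("capabilities", {})
    reported = status_caps.get("enabledCapabilities", [])
    return [cap for cap in expected if cap not in reported]
```

Fed the ClusterVersion object from this report (for example via `oc get clusterversion version -o json` and `json.load`), missing_from_status would return baremetal and openshift-samples, which are the two capabilities the status fails to show.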
Hmm, may be getting stepped on. Here we see it gets set:

I0401 05:54:24.245258 1 sync_worker.go:709] Detected while considering cluster version generation 4: capabilities changed ({map[baremetal:{} marketplace:{} openshift-samples:{}] map[marketplace:{}] []} to {map[baremetal:{} marketplace:{} openshift-samples:{}] map[baremetal:{} marketplace:{} openshift-samples:{}] []})
I0401 05:54:24.245266 1 sync_worker.go:232] syncPayload: 4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest (force=false)
I0401 05:54:24.245281 1 sync_worker.go:453] Cancel the sync worker's current loop
I0401 05:54:24.245296 1 sync_worker.go:459] Notify the sync worker that new work is available
I0401 05:54:24.245302 1 status.go:171] Synchronizing status errs=field.ErrorList(nil) status=&cvo.SyncWorkerStatus{Generation:3, Failure:error(nil), Done:252, Total:784, Completed:0, Reconciling:true, Initial:false, VersionHash:"Q_qcbW4ZuAs=", LastProgress:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Actual:v1.Release{Version:"4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest", Image:"registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4", URL:"", Channels:[]string(nil)}, Verified:false, loadPayloadStatus:cvo.LoadPayloadStatus{Step:"PayloadLoaded", Message:"Payload loaded version=\"4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest\" image=\"registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4\"", Failure:error(nil), Release:v1.Release{Version:"4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest", Image:"registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4", URL:"", Channels:[]string(nil)}, Verified:false, LastTransitionTime:time.Time{wall:0xc089b9f015d86925, ext:906723265536, loc:(*time.Location)(0x2c58160)}}, CapabilitiesStatus:cvo.CapabilityStatus{Status:v1.ClusterVersionCapabilitiesStatus{EnabledCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace", "openshift-samples"}, KnownCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace", "openshift-samples"}}, ImplicitlyEnabledCaps:[]v1.ClusterVersionCapability(nil)}}

But then another change is detected and status is incorrect:

I0401 05:54:24.248908 1 sync_worker.go:709] Detected while calculating next work: capabilities changed ({map[baremetal:{} marketplace:{} openshift-samples:{}] map[marketplace:{}] []} to {map[baremetal:{} marketplace:{} openshift-samples:{}] map[baremetal:{} marketplace:{} openshift-samples:{}] []})
I0401 05:54:24.248936 1 sync_worker.go:635] Work changed, transitioning from Reconciling to Updating
I0401 05:54:24.248952 1 sync_worker.go:538] Previous sync status: &cvo.SyncWorkerStatus{Generation:3, Failure:(*payload.UpdateError)(0xc0021f0ba0), Done:252, Total:784, Completed:0, Reconciling:true, Initial:false, VersionHash:"Q_qcbW4ZuAs=", LastProgress:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Actual:v1.Release{Version:"4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest", Image:"registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4", URL:"", Channels:[]string(nil)}, Verified:false, loadPayloadStatus:cvo.LoadPayloadStatus{Step:"PayloadLoaded", Message:"Payload loaded version=\"4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest\" image=\"registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4\"", Failure:error(nil), Release:v1.Release{Version:"4.11.0-0.ci.test-2022-04-01-025354-ci-ln-2ydg80t-latest", Image:"registry.build01.ci.openshift.org/ci-ln-2ydg80t/release@sha256:77bf803cfc45670f8bf7a5c08fce7de247c7c40f10c43f2c1ae51458d294bfc4", URL:"", Channels:[]string(nil)}, Verified:false, LastTransitionTime:time.Time{wall:0xc089b9f015d86925, ext:906723265536, loc:(*time.Location)(0x2c58160)}}, CapabilitiesStatus:cvo.CapabilityStatus{Status:v1.ClusterVersionCapabilitiesStatus{EnabledCapabilities:[]v1.ClusterVersionCapability{"marketplace"}, KnownCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace", "openshift-samples"}}, ImplicitlyEnabledCaps:[]v1.ClusterVersionCapability(nil)}}

Investigating...
This [1] is stomping status because what it's holding here [2] is obsolete.

[1] https://github.com/jottofar/cluster-version-operator/blob/16916ca3863d2303aa266d1b7cf004c92b7b320d/pkg/cvo/sync_worker.go#L887
[2] https://github.com/jottofar/cluster-version-operator/blob/16916ca3863d2303aa266d1b7cf004c92b7b320d/pkg/cvo/sync_worker.go#L771
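For readers unfamiliar with this class of bug, the general pattern (a status report built from a snapshot that has since gone stale) can be modeled in a few lines. This is an illustrative Python sketch only. It does not mirror the code in sync_worker.go, the function names are hypothetical, and it makes no claim about how the actual fix is written. It assumes a status with just two fields: a progress counter and the enabled capabilities.

```python
import copy

# Illustrative model only; names are hypothetical and the real CVO code
# in sync_worker.go is structured differently.


def publish_stale(current, snapshot, done):
    """Buggy pattern: build the report from a snapshot taken earlier."""
    report = copy.deepcopy(snapshot)
    report["done"] = done
    return report


def publish_merged(current, snapshot, done):
    """Safer pattern: start from the current status, change only progress."""
    report = copy.deepcopy(current)
    report["done"] = done
    return report


def simulate(publish):
    """Run one sync loop during which the capabilities change mid-flight."""
    status = {"done": 252, "enabledCapabilities": ["marketplace"]}
    # The sync loop copies the status when it starts.
    snapshot = copy.deepcopy(status)
    # A spec change lands while the loop is still running.
    status["enabledCapabilities"] = ["baremetal", "marketplace", "openshift-samples"]
    # The loop then reports progress; the result replaces the shared status.
    return publish(status, snapshot, done=253)
```

With publish_stale the returned status lists only marketplace again, which matches the symptom in the logs above. With publish_merged all three capabilities survive the progress update.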
$ head -n1 cvo.log
I0401 03:08:37.681883 1 start.go:23] ClusterVersionOperator v1.0.0-797-gc169ad14-dirty
$ git show c169ad14
fatal: ambiguous argument 'c169ad14': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

What code-base was used for the comment 0 test?
Along the same lines as Trevor's question, how was this tested? How did you get the system installed with capabilities "None" such that you could change it to v4.11? I was trying to test my fix by overwriting the capabilities setting to None but my cluster fails to start because of issues with marketplace-operator-metrics resources.
(In reply to Jack Ottofaro from comment #4)
> Along the same lines as Trevor's question, how was this tested? How did you
> get the system installed with capabilities "None" such that you could change
> it to v4.11? I was trying to test my fix by overwriting the capabilities
> setting to None but my cluster fails to start because of issues with
> marketplace-operator-metrics resources.

To answer my own question, I suppose you were simply using the Installer and selected "None". My problem was that I was using cluster bot and changing the setting after install in the CVO code as a "test-only" hack.
(In reply to W. Trevor King from comment #3)
> What code-base was used for the comment 0 test?

The payload I was using was built from cvo#754 by using cluster-bot:
build openshift/cluster-version-operator#754
(In reply to Jack Ottofaro from comment #4)
> Along the same lines as Trevor's question, how was this tested? How did you
> get the system installed with capabilities "None" such that you could change
> it to v4.11? I was trying to test my fix by overwriting the capabilities
> setting to None but my cluster fails to start because of issues with
> marketplace-operator-metrics resources.

I installed the cluster with baselineCapabilitySet: None. The CVO did loop on creating the service "openshift-marketplace/marketplace-operator-metrics" [1], but the cluster was functional after the installation exited. For your testing, I think we can install a cluster with the capability set below to work around the marketplace-operator-metrics issue, and then change the baseline to v4.11.

spec:
  capabilities:
    additionalEnabledCapabilities:
    - marketplace
    baselineCapabilitySet: None

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2070792
I retried with nightly builds: the issue reproduces with 4.11.0-0.nightly-2022-04-01-172551 but not with 4.11.0-0.nightly-2022-04-06-000911. Hmm, possibly it's affected by https://bugzilla.redhat.com/show_bug.cgi?id=2070792.

Starting a 4.11.0-0.nightly-2022-04-01-172551 cluster with baselineCapabilitySet: None and then changing it to v4.11 gives:

# oc get clusterversion -oyaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: ClusterVersion
  metadata:
    creationTimestamp: "2022-04-06T09:59:43Z"
    generation: 3
    name: version
    resourceVersion: "58204"
    uid: c5473edc-ccee-4b81-9cb5-db9d37c3f906
  spec:
    capabilities:
      baselineCapabilitySet: v4.11
    channel: stable-4.11
    clusterID: 8a6164da-b181-48f6-a402-833441f17f6b
  status:
    availableUpdates: null
    capabilities:
      knownCapabilities:
      - baremetal
      - marketplace
      - openshift-samples
    conditions:
    - lastTransitionTime: "2022-04-06T09:59:49Z"
      message: 'Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-04-01-172551 not found in the "stable-4.11" channel'
      reason: VersionNotFound
      status: "False"
      type: RetrievedUpdates
    - lastTransitionTime: "2022-04-06T09:59:49Z"
      message: Capabilities match configured spec
      reason: AsExpected
      status: "False"
      type: ImplicitlyEnabledCapabilities
    - lastTransitionTime: "2022-04-06T09:59:49Z"
      message: Payload loaded version="4.11.0-0.nightly-2022-04-01-172551" image="registry.ci.openshift.org/ocp/release@sha256:6d498a5829f1190815fe5edfc204b14a122ac7bb7358201d15feea0b93639bbf"
      reason: PayloadLoaded
      status: "True"
      type: ReleaseAccepted
    - lastTransitionTime: "2022-04-06T11:57:12Z"
      message: Done applying 4.11.0-0.nightly-2022-04-01-172551
      status: "True"
      type: Available
    - lastTransitionTime: "2022-04-06T11:56:42Z"
      status: "False"
      type: Failing
    - lastTransitionTime: "2022-04-06T11:57:12Z"
      message: Cluster version is 4.11.0-0.nightly-2022-04-01-172551
      status: "False"
      type: Progressing
    desired:
      image: registry.ci.openshift.org/ocp/release@sha256:6d498a5829f1190815fe5edfc204b14a122ac7bb7358201d15feea0b93639bbf
      version: 4.11.0-0.nightly-2022-04-01-172551
    history:
    - completionTime: "2022-04-06T11:57:12Z"
      image: registry.ci.openshift.org/ocp/release@sha256:6d498a5829f1190815fe5edfc204b14a122ac7bb7358201d15feea0b93639bbf
      startedTime: "2022-04-06T09:59:49Z"
      state: Completed
      verified: false
      version: 4.11.0-0.nightly-2022-04-01-172551
    observedGeneration: 3
    versionHash: OLq2WHNzHM0=
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

# oc get co | grep 'baremetal\|samples\|marketplace'
baremetal           4.11.0-0.nightly-2022-04-01-172551   True   False   False   82s
marketplace         4.11.0-0.nightly-2022-04-01-172551   True   False   False   81s
openshift-samples   4.11.0-0.nightly-2022-04-01-172551   True   False   False   66s

Starting a 4.11.0-0.nightly-2022-04-06-000911 cluster with baselineCapabilitySet: None and then changing it to v4.11 gives:

# oc get clusterversion -oyaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: ClusterVersion
  metadata:
    creationTimestamp: "2022-04-06T09:45:35Z"
    generation: 4
    name: version
    resourceVersion: "64807"
    uid: b2faeb06-6421-4909-8355-fafa28f0ab96
  spec:
    capabilities:
      additionalEnabledCapabilities:
      - marketplace
      baselineCapabilitySet: v4.11
    channel: stable-4.11
    clusterID: 14f59b0d-a733-4787-882f-0cfee61d4212
  status:
    availableUpdates: null
    capabilities:
      enabledCapabilities:
      - baremetal
      - marketplace
      - openshift-samples
      knownCapabilities:
      - baremetal
      - marketplace
      - openshift-samples
    conditions:
    - lastTransitionTime: "2022-04-06T09:45:38Z"
      message: 'Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-04-06-000911 not found in the "stable-4.11" channel'
      reason: VersionNotFound
      status: "False"
      type: RetrievedUpdates
    - lastTransitionTime: "2022-04-06T09:45:38Z"
      message: Capabilities match configured spec
      reason: AsExpected
      status: "False"
      type: ImplicitlyEnabledCapabilities
    - lastTransitionTime: "2022-04-06T09:45:38Z"
      message: Payload loaded version="4.11.0-0.nightly-2022-04-06-000911" image="registry.ci.openshift.org/ocp/release@sha256:885b2a993c72a438183a8a7f04f6adb405ef6167c8724fb00682093941de2472"
      reason: PayloadLoaded
      status: "True"
      type: ReleaseAccepted
    - lastTransitionTime: "2022-04-06T10:06:32Z"
      message: Done applying 4.11.0-0.nightly-2022-04-06-000911
      status: "True"
      type: Available
    - lastTransitionTime: "2022-04-06T10:06:32Z"
      status: "False"
      type: Failing
    - lastTransitionTime: "2022-04-06T11:54:47Z"
      message: Cluster version is 4.11.0-0.nightly-2022-04-06-000911
      status: "False"
      type: Progressing
    desired:
      image: registry.ci.openshift.org/ocp/release@sha256:885b2a993c72a438183a8a7f04f6adb405ef6167c8724fb00682093941de2472
      version: 4.11.0-0.nightly-2022-04-06-000911
    history:
    - completionTime: "2022-04-06T10:06:32Z"
      image: registry.ci.openshift.org/ocp/release@sha256:885b2a993c72a438183a8a7f04f6adb405ef6167c8724fb00682093941de2472
      startedTime: "2022-04-06T09:45:38Z"
      state: Completed
      verified: false
      version: 4.11.0-0.nightly-2022-04-06-000911
    observedGeneration: 4
    versionHash: KK4BRYwcro0=
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

# oc get co | grep 'baremetal\|samples\|marketplace'
baremetal           4.11.0-0.nightly-2022-04-06-000911   True   False   False   11m
marketplace         4.11.0-0.nightly-2022-04-06-000911   True   False   False   12m
openshift-samples   4.11.0-0.nightly-2022-04-06-000911   True   False   False   10m
Fix is just about to land, but this is peripheral enough that if for some reason we can't fix this by 4.11 GA, I expect we'd still ship. Setting blocker-
Reproducing it with 4.11.0-0.nightly-2022-04-12-072444

1. Install a cluster with baselineCapabilitySet: None set

2. Delete secret pull-secret
# oc -n openshift-config delete secret pull-secret
secret "pull-secret" deleted

3. Delete machine-config-controller
# oc -n openshift-machine-config-operator delete pod -l k8s-app=machine-config-controller
pod "machine-config-controller-67499cb584-b6949" deleted

4. After a while, check CVO complains about machine-config
    - lastTransitionTime: "2022-04-14T03:17:16Z"
      message: Cluster operator machine-config is not available
      reason: ClusterOperatorNotAvailable
      status: "True"
      type: Failing
    - lastTransitionTime: "2022-04-14T02:39:31Z"
      message: 'Error while reconciling 4.11.0-0.nightly-2022-04-12-072444: the cluster operator machine-config has not yet successfully rolled out'
      reason: ClusterOperatorNotAvailable
      status: "False"
      type: Progressing

5. Enable baremetal
spec:
  capabilities:
    additionalEnabledCapabilities:
    - baremetal
    baselineCapabilitySet: None

6. We see it in the status. So far so good.
status:
  availableUpdates: null
  capabilities:
    enabledCapabilities:
    - baremetal
    knownCapabilities:
    - baremetal
    - marketplace
    - openshift-samples

7. CVO complains about the machine-config
# oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2022-04-14T02:14:23Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-04-12-072444 not found in the "stable-4.11" channel
2022-04-14T02:14:23Z ImplicitlyEnabledCapabilities=False AsExpected: Capabilities match configured spec
2022-04-14T02:14:23Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.nightly-2022-04-12-072444" image="registry.ci.openshift.org/ocp/release@sha256:5434a3279b4b0dea355316914798405f32cbb5551a242c05ebb64eb2abe0eddf"
2022-04-14T02:39:31Z Available=True : Done applying 4.11.0-0.nightly-2022-04-12-072444
2022-04-14T06:31:16Z Failing=True ClusterOperatorNotAvailable: Cluster operator machine-config is not available
2022-04-14T03:24:31Z Progressing=True ClusterOperatorNotAvailable: Unable to apply 4.11.0-0.nightly-2022-04-12-072444: the cluster operator machine-config has not yet successfully rolled out

8. Enable marketplace
spec:
  capabilities:
    additionalEnabledCapabilities:
    - baremetal
    - marketplace
    baselineCapabilitySet: None

9. cv.status.capabilities.enabledCapabilities doesn't show the marketplace
# oc get clusterversion version -o json | jq -r ".status.capabilities"
{
  "enabledCapabilities": [
    "baremetal"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}

It's reproduced.
Trying 4.11.0-0.nightly-2022-04-13-235127 with the same steps as the reproducer:

# oc get clusterversion version -o json | jq -r ".status.capabilities"
{
  "enabledCapabilities": [
    "baremetal",
    "marketplace"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}

# oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2022-04-14T02:23:50Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-04-13-235127 not found in the "stable-4.11" channel
2022-04-14T02:23:50Z ImplicitlyEnabledCapabilities=False AsExpected: Capabilities match configured spec
2022-04-14T02:23:50Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.nightly-2022-04-13-235127" image="registry.ci.openshift.org/ocp/release@sha256:fa07f4de887c8cbdad30f91afe9347519776cba9095c392f513d40749fda76a0"
2022-04-14T02:45:15Z Available=True : Done applying 4.11.0-0.nightly-2022-04-13-235127
2022-04-14T07:14:15Z Failing=False :
2022-04-14T07:12:45Z Progressing=True : Working towards 4.11.0-0.nightly-2022-04-13-235127: 642 of 787 done (81% complete)

We can see that cv.status.capabilities.enabledCapabilities shows all the capabilities enabled on day 2. Moving it to verified state.
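For anyone scripting this verification without jq, the one-line-per-condition summary used above can be produced in Python as well. This is a minimal sketch under the assumption that the input is the parsed JSON of `oc get -o json clusterversion version`. The function name summarize_conditions is hypothetical. Missing fields are rendered as empty strings, which is how the jq expression behaves when it adds null to a string.

```python
def summarize_conditions(cluster_version):
    """Render each status condition as 'time type=status reason: message'."""
    lines = []
    for cond in cluster_version["status"]["conditions"]:
        lines.append(
            "{} {}={} {}: {}".format(
                cond.get("lastTransitionTime", ""),
                cond.get("type", ""),
                cond.get("status", ""),
                # reason and message are optional on a condition; an absent
                # field becomes an empty string, matching the jq output.
                cond.get("reason", ""),
                cond.get("message", ""),
            )
        )
    return lines
```

A typical use would be to pipe the oc output into a small script that calls `json.load(sys.stdin)` and prints each returned line.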
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:5069