Bug 2079789 - cluster drops ImplicitlyEnabledCapabilities during upgrade
Summary: cluster drops ImplicitlyEnabledCapabilities during upgrade
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.11
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Jack Ottofaro
QA Contact: Yang Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-04-28 09:10 UTC by Evgeni Vakhonin
Modified: 2022-08-10 11:09 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:09:16 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-version-operator pull 768 0 None Merged Bug 2079789: pkg/cvo/sync_worker.go: Initialize implicitlyEnabledCaps 2022-05-09 19:22:11 UTC
Github openshift cluster-version-operator pull 773 0 None open Bug 2079789: capability: Init prior known from CV status 2022-05-09 19:22:13 UTC
Red Hat Product Errata RHSA-2022:5069 0 None None None 2022-08-10 11:09:32 UTC

Description Evgeni Vakhonin 2022-04-28 09:10:24 UTC
ImplicitlyEnabledCapabilities is not preserved during an upgrade. As a result, the capability
is removed from status, and the capability's resources are silently left at the old version after the upgrade.

Version-Release number of selected component (if applicable):
Client Version: 4.11.0-0.nightly-2022-04-26-085341
Server Version: 4.11.0-0.nightly-2022-04-26-085341

How reproducible:
100%

Steps to Reproduce:
install a cluster with 
baselineCapabilitySet: None
additionalEnabledCapabilities: ["marketplace"]

spec
{
  "capabilities": {
    "additionalEnabledCapabilities": [
      "marketplace"
    ],
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "6096ef9c-5d0c-4bd9-a422-37c02c5c49b2"
}
status caps
{
  "enabledCapabilities": [
    "marketplace"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-04-27T13:39:15Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-04-26-085341 not found in the "stable-4.11" channel
2022-04-27T13:39:15Z ImplicitlyEnabledCapabilities=False AsExpected: Capabilities match configured spec
2022-04-27T13:39:15Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.nightly-2022-04-26-085341" image="registry.ci.openshift.org/ocp/release@sha256:f9875a76c9867901d6e441f2eca7130a838255324b701604201152aa2f332e57"
2022-04-27T14:05:01Z Available=True : Done applying 4.11.0-0.nightly-2022-04-26-085341
2022-04-27T13:45:30Z Failing=False : 
2022-04-27T14:05:01Z Progressing=False : Cluster version is 4.11.0-0.nightly-2022-04-26-085341


Remove the capability from the spec, which triggers ImplicitlyEnabledCapabilities:

spec
{
  "capabilities": {
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "6096ef9c-5d0c-4bd9-a422-37c02c5c49b2"
}
status caps
{
  "enabledCapabilities": [
    "marketplace"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-04-27T13:39:15Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-04-26-085341 not found in the "stable-4.11" channel
2022-04-27T15:32:59Z ImplicitlyEnabledCapabilities=True CapabilitiesImplicitlyEnabled: The following capabilities could not be disabled: marketplace
2022-04-27T13:39:15Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.nightly-2022-04-26-085341" image="registry.ci.openshift.org/ocp/release@sha256:f9875a76c9867901d6e441f2eca7130a838255324b701604201152aa2f332e57"
2022-04-27T14:05:01Z Available=True : Done applying 4.11.0-0.nightly-2022-04-26-085341
2022-04-27T13:45:30Z Failing=False : 
2022-04-27T14:05:01Z Progressing=False : Cluster version is 4.11.0-0.nightly-2022-04-26-085341
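
One way to read the condition above is as a set difference: a capability that is already enabled in status but no longer requested by the spec cannot be disabled, so it is reported as implicitly enabled. The following is a minimal Python sketch of that reading, not the CVO's actual code. The names `BASELINE_SETS`, `requested_capabilities`, and `implicitly_enabled` are hypothetical, and the sketch assumes the `None` baseline set contains no capabilities.

```python
# Illustrative only: hypothetical helper names, not CVO code.
# Assumption: the "None" baseline capability set contains no capabilities.
BASELINE_SETS = {"None": set()}


def requested_capabilities(spec_caps):
    """Capabilities explicitly requested by .spec.capabilities."""
    baseline = BASELINE_SETS[spec_caps["baselineCapabilitySet"]]
    return baseline | set(spec_caps.get("additionalEnabledCapabilities", []))


def implicitly_enabled(spec_caps, status_enabled):
    """Capabilities that stay enabled although the spec no longer requests them."""
    return sorted(set(status_enabled) - requested_capabilities(spec_caps))


# Spec after "marketplace" was removed, status still listing it as enabled.
spec_after_removal = {"baselineCapabilitySet": "None"}
print(implicitly_enabled(spec_after_removal, ["marketplace"]))
```

With the spec and status shown above, the only capability left over is `marketplace`, which matches the capability named in the condition message.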

Now upgrade the cluster.
The CVO drops ImplicitlyEnabledCapabilities as well as .status.enabledCapabilities:

{
  "capabilities": {
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "6096ef9c-5d0c-4bd9-a422-37c02c5c49b2",
  "desiredUpdate": {
    "force": true,
    "image": "registry.ci.openshift.org/ocp/release@sha256:30452e14cbefed21f883ac38652b9dbaf653a922a1ca0efd6f3a1a10acfc2e1c",
    "version": ""
  }
}
status caps
{
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-04-27T13:39:15Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-04-26-181148 not found in the "stable-4.11" channel
2022-04-27T15:35:49Z ImplicitlyEnabledCapabilities=False AsExpected: Capabilities match configured spec
2022-04-27T13:39:15Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.nightly-2022-04-26-181148" image="registry.ci.openshift.org/ocp/release@sha256:30452e14cbefed21f883ac38652b9dbaf653a922a1ca0efd6f3a1a10acfc2e1c"
2022-04-27T14:05:01Z Available=True : Done applying 4.11.0-0.nightly-2022-04-26-181148
2022-04-27T15:43:34Z Failing=False : 
2022-04-27T16:06:03Z Progressing=False : Cluster version is 4.11.0-0.nightly-2022-04-26-181148

The marketplace resources all still exist on the cluster, but at the outdated version:

oc get co
...........
machine-config                             4.11.0-0.nightly-2022-04-26-181148   True        False         False      4h45m   
marketplace                                4.11.0-0.nightly-2022-04-26-085341   True        False         False      4h46m   
monitoring                                 4.11.0-0.nightly-2022-04-26-181148   True        False         False      4h34m   
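
The stale operator can also be picked out mechanically. This is a small Python sketch that compares each operator's reported version with the cluster version; the rows are copied from the excerpt above, and `lagging_operators` is a hypothetical helper name, not an `oc` feature.

```python
# Rows copied from the `oc get co` excerpt above: (operator name, reported version).
operators = [
    ("machine-config", "4.11.0-0.nightly-2022-04-26-181148"),
    ("marketplace", "4.11.0-0.nightly-2022-04-26-085341"),
    ("monitoring", "4.11.0-0.nightly-2022-04-26-181148"),
]
cluster_version = "4.11.0-0.nightly-2022-04-26-181148"


def lagging_operators(operators, cluster_version):
    """Return the names of operators whose version differs from the cluster's."""
    return [name for name, version in operators if version != cluster_version]


print(lagging_operators(operators, cluster_version))
```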


The same bug is observed when changing the capability set from v4.11 to None and then upgrading.

Comment 2 Jack Ottofaro 2022-04-28 15:45:37 UTC
This [1] gets returned uninitialized and stomps on the implicitly enabled caps in many cases, such as when the "upgrade" version of the CVO comes up.

evakhoni, when you test this again, can you also capture the outgoing CVO pod's log? Tail it so you get it before the pod is deleted.

[1] https://github.com/openshift/cluster-version-operator/blob/76f188564fa5af4b9dc110093b4f6c3e0262587d/pkg/cvo/sync_worker.go#L230
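
To illustrate the kind of failure described here, the following is a toy Python model, not the code in pkg/cvo/sync_worker.go. All names (`CapabilityState`, `compute_state`) are hypothetical. It only shows how computing the new state from a fresh, empty value loses a previously recorded implicit capability, while carrying the previous value forward keeps it.

```python
# Illustrative model only; class and function names are hypothetical and this
# is not the code in pkg/cvo/sync_worker.go.
from dataclasses import dataclass, field


@dataclass
class CapabilityState:
    enabled: set = field(default_factory=set)
    implicitly_enabled: set = field(default_factory=set)


def compute_state(requested, carried_over):
    """Enabled = explicitly requested plus whatever is carried over as implicit."""
    implicit = set(carried_over) - set(requested)
    return CapabilityState(enabled=set(requested) | implicit,
                           implicitly_enabled=implicit)


previous = CapabilityState(enabled={"marketplace"},
                           implicitly_enabled={"marketplace"})

# Problem pattern: a fresh, empty value is used instead of the previous state.
stomped = compute_state(requested=set(),
                        carried_over=CapabilityState().implicitly_enabled)

# Intended pattern: the previous implicit set is carried into the computation.
preserved = compute_state(requested=set(),
                          carried_over=previous.implicitly_enabled)

print(sorted(stomped.enabled), sorted(preserved.enabled))
```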

Comment 4 Evgeni Vakhonin 2022-05-02 17:14:50 UTC
No nightly with the patch is available yet, and since the patch is already merged, cluster-bot is not an option either.
Tested an upgrade from 4.11.0-0.ci-2022-05-01-215302 to 4.11.0-0.ci-2022-05-02-034852; both builds include the patch.


╰─ flexy.sh 99021
connecting flexy job 99021
Client Version: 4.11.0-0.nightly-2022-04-26-181148
Kustomize Version: v4.5.4
Server Version: 4.11.0-0.ci-2022-05-01-215302
Kubernetes Version: v1.23.3-2047+d464c70a480f1c-dirty


╰─ cvoCapsAndConds 
spec
{
  "capabilities": {
    "additionalEnabledCapabilities": [
      "marketplace"
    ],
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "d89fac02-be79-4fe2-832b-c837a459d817"
}
status caps
{
  "enabledCapabilities": [
    "marketplace"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-05-02T15:49:26Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.ci-2022-05-01-215302 not found in the "stable-4.11" channel
2022-05-02T15:49:26Z ImplicitlyEnabledCapabilities=False AsExpected: Capabilities match configured spec
2022-05-02T15:49:26Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.ci-2022-05-01-215302" image="registry.ci.openshift.org/ocp/release@sha256:b4687fa905eb2b708a9e9bf2c09b47f495596a50f00ba39daa68da088ff1a7c4"
2022-05-02T16:11:42Z Available=True : Done applying 4.11.0-0.ci-2022-05-01-215302
2022-05-02T16:11:42Z Failing=False : 
2022-05-02T16:11:42Z Progressing=False : Cluster version is 4.11.0-0.ci-2022-05-01-215302


╰─ oc patch clusterversion version --type json --patch '[{"op": "remove", "path": "/spec/capabilities/additionalEnabledCapabilities" }]'
clusterversion.config.openshift.io/version patched


╰─ cvoCapsAndConds 
spec
{
  "capabilities": {
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "d89fac02-be79-4fe2-832b-c837a459d817"
}
status caps
{
  "enabledCapabilities": [
    "marketplace"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-05-02T15:49:26Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.ci-2022-05-01-215302 not found in the "stable-4.11" channel
2022-05-02T16:50:09Z ImplicitlyEnabledCapabilities=True CapabilitiesImplicitlyEnabled: The following capabilities could not be disabled: marketplace
2022-05-02T15:49:26Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.ci-2022-05-01-215302" image="registry.ci.openshift.org/ocp/release@sha256:b4687fa905eb2b708a9e9bf2c09b47f495596a50f00ba39daa68da088ff1a7c4"
2022-05-02T16:11:42Z Available=True : Done applying 4.11.0-0.ci-2022-05-01-215302
2022-05-02T16:11:42Z Failing=False : 
2022-05-02T16:11:42Z Progressing=False : Cluster version is 4.11.0-0.ci-2022-05-01-215302


╰─ forceupgradeToDigest aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.ci.openshift.org/ocp/release@sha256:aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81


╰─ cvoCapsAndConds 
spec
{
  "capabilities": {
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "d89fac02-be79-4fe2-832b-c837a459d817",
  "desiredUpdate": {
    "force": true,
    "image": "registry.ci.openshift.org/ocp/release@sha256:aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81",
    "version": ""
  }
}
status caps
{
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-05-02T15:49:26Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.ci-2022-05-02-034852 not found in the "stable-4.11" channel
2022-05-02T16:51:09Z ImplicitlyEnabledCapabilities=False AsExpected: Capabilities match configured spec
2022-05-02T15:49:26Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.ci-2022-05-02-034852" image="registry.ci.openshift.org/ocp/release@sha256:aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81"
2022-05-02T16:11:42Z Available=True : Done applying 4.11.0-0.ci-2022-05-01-215302
2022-05-02T16:11:42Z Failing=False : 
2022-05-02T16:51:01Z Progressing=True : Working towards 4.11.0-0.ci-2022-05-02-034852: 96 of 791 done (12% complete)

Not good: it looks like the bug still reproduces.

Comment 6 Evgeni Vakhonin 2022-05-02 18:13:52 UTC
Observed in the CVO log while triggering the upgrade:

I0502 16:50:42.258088       1 status.go:171] Synchronizing status errs=field.ErrorList(nil) status=&cvo.SyncWorkerStatus{Generation:2, Failure:error(nil), Done:750, Total:791, Completed:16, Reconciling:true, Initial:false, VersionHash:"HebJDo095qY=", LastProgress:time.Date(2022, time.May, 2, 16, 49, 22, 752787043, time.Local), Actual:v1.Release{Version:"4.11.0-0.ci-2022-05-01-215302", Image:"registry.ci.openshift.org/ocp/release@sha256:b4687fa905eb2b708a9e9bf2c09b47f495596a50f00ba39daa68da088ff1a7c4", URL:"", Channels:[]string(nil)}, Verified:false, loadPayloadStatus:cvo.LoadPayloadStatus{Step:"PayloadLoaded", Message:"Payload loaded version=\"4.11.0-0.ci-2022-05-01-215302\" image=\"registry.ci.openshift.org/ocp/release@sha256:b4687fa905eb2b708a9e9bf2c09b47f495596a50f00ba39daa68da088ff1a7c4\"", Failure:error(nil), Release:v1.Release{Version:"4.11.0-0.ci-2022-05-01-215302", Image:"registry.ci.openshift.org/ocp/release@sha256:b4687fa905eb2b708a9e9bf2c09b47f495596a50f00ba39daa68da088ff1a7c4", URL:"", Channels:[]string(nil)}, Verified:false, LastTransitionTime:time.Time{wall:0xc0941dc8cf5971e7, ext:345970244632, loc:(*time.Location)(0x2c5c180)}}, CapabilitiesStatus:cvo.CapabilityStatus{Status:v1.ClusterVersionCapabilitiesStatus{EnabledCapabilities:[]v1.ClusterVersionCapability{"marketplace"}, KnownCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace", "openshift-samples"}}, ImplicitlyEnabledCaps:[]v1.ClusterVersionCapability{"marketplace"}}}


I0502 16:51:01.316105       1 status.go:171] Synchronizing status errs=field.ErrorList(nil) status=&cvo.SyncWorkerStatus{Generation:2, Failure:error(nil), Done:750, Total:791, Completed:16, Reconciling:true, Initial:false, VersionHash:"HebJDo095qY=", LastProgress:time.Date(2022, time.May, 2, 16, 49, 22, 752787043, time.Local), Actual:v1.Release{Version:"4.11.0-0.ci-2022-05-01-215302", Image:"registry.ci.openshift.org/ocp/release@sha256:b4687fa905eb2b708a9e9bf2c09b47f495596a50f00ba39daa68da088ff1a7c4", URL:"", Channels:[]string(nil)}, Verified:false, loadPayloadStatus:cvo.LoadPayloadStatus{Step:"PayloadLoaded", Message:"Payload loaded version=\"4.11.0-0.ci-2022-05-02-034852\" image=\"registry.ci.openshift.org/ocp/release@sha256:aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81\"", Failure:error(nil), Release:v1.Release{Version:"4.11.0-0.ci-2022-05-02-034852", Image:"registry.ci.openshift.org/ocp/release@sha256:aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81", URL:"", Channels:[]string(nil)}, Verified:false, LastTransitionTime:time.Time{wall:0xc09420dd52d5efb6, ext:3500028734953, loc:(*time.Location)(0x2c5c180)}}, CapabilitiesStatus:cvo.CapabilityStatus{Status:v1.ClusterVersionCapabilitiesStatus{EnabledCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace"}, KnownCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace", "openshift-samples"}}, ImplicitlyEnabledCaps:[]v1.ClusterVersionCapability{"marketplace", "baremetal"}}}


on the new pod:


I0502 16:51:09.434444       1 status.go:171] Synchronizing status errs=field.ErrorList(nil) status=&cvo.SyncWorkerStatus{Generation:4, Failure:error(nil), Done:0, Total:0, Completed:0, Reconciling:false, Initial:false, VersionHash:"", LastProgress:time.Date(1, time.January, 1, 0, 0, 0, 0, time.UTC), Actual:v1.Release{Version:"", Image:"registry.ci.openshift.org/ocp/release@sha256:aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81", URL:"", Channels:[]string(nil)}, Verified:false, loadPayloadStatus:cvo.LoadPayloadStatus{Step:"PayloadLoaded", Message:"Payload loaded version=\"4.11.0-0.ci-2022-05-02-034852\" image=\"registry.ci.openshift.org/ocp/release@sha256:aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81\"", Failure:error(nil), Release:v1.Release{Version:"4.11.0-0.ci-2022-05-02-034852", Image:"registry.ci.openshift.org/ocp/release@sha256:aa863823a75441c7e53b76cd1886eea2113e35ce78fb085b3093259073889c81", URL:"", Channels:[]string(nil)}, Verified:false, LastTransitionTime:time.Time{wall:0xc09420df59e366a6, ext:2304964580, loc:(*time.Location)(0x2c5c180)}}, CapabilitiesStatus:cvo.CapabilityStatus{Status:v1.ClusterVersionCapabilitiesStatus{EnabledCapabilities:[]v1.ClusterVersionCapability(nil), KnownCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace", "openshift-samples"}}, ImplicitlyEnabledCaps:[]v1.ClusterVersionCapability(nil)}}
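
The capability fields are easy to miss in log lines this long. As a reading aid, here is a Python sketch that pulls the capability lists out of a `Synchronizing status` line with a regular expression. It assumes the Go `%#v`-style formatting seen above stays as shown; `parse_caps` is a hypothetical helper, and the sample strings are the `CapabilitiesStatus` tails of the last outgoing-pod line and of the new-pod line.

```python
import re

# Matches either `[]v1.ClusterVersionCapability(nil)` or
# `[]v1.ClusterVersionCapability{"a", "b"}` and captures the braces' content.
CAP_LIST = r'\[\]v1\.ClusterVersionCapability(?:\(nil\)|\{([^}]*)\})'


def parse_caps(log_line, field_name):
    """Return the capability names logged for field_name, or [] for nil."""
    match = re.search(re.escape(field_name) + ":" + CAP_LIST, log_line)
    if match is None:
        raise ValueError(f"{field_name} not found in log line")
    body = match.group(1)
    return re.findall(r'"([^"]+)"', body) if body else []


old_pod = ('CapabilitiesStatus:cvo.CapabilityStatus{Status:v1.ClusterVersionCapabilitiesStatus{'
           'EnabledCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace"}, '
           'KnownCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace", "openshift-samples"}}, '
           'ImplicitlyEnabledCaps:[]v1.ClusterVersionCapability{"marketplace", "baremetal"}}}')
new_pod = ('CapabilitiesStatus:cvo.CapabilityStatus{Status:v1.ClusterVersionCapabilitiesStatus{'
           'EnabledCapabilities:[]v1.ClusterVersionCapability(nil), '
           'KnownCapabilities:[]v1.ClusterVersionCapability{"baremetal", "marketplace", "openshift-samples"}}, '
           'ImplicitlyEnabledCaps:[]v1.ClusterVersionCapability(nil)}}')

for label, line in (("old pod", old_pod), ("new pod", new_pod)):
    print(label, parse_caps(line, "EnabledCapabilities"),
          parse_caps(line, "ImplicitlyEnabledCaps"))
```

Run against the excerpts, the outgoing pod still reports both lists populated, while the new pod reports both as nil, which is the drop this bug describes.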

Comment 9 Evgeni Vakhonin 2022-05-02 19:47:29 UTC
looks like a failed fix to me 
@jack.ottofaro wdyt?

Comment 10 Jack Ottofaro 2022-05-02 21:08:58 UTC
(In reply to Evgeni Vakhonin from comment #9)
> looks like a failed fix to me 
> @jack.ottofaro wdyt?

The fix was necessary, but unfortunately there's another piece of code that's missing. So, yeah, I agree that it's a fail. I will be sending up another PR to address this missing piece.

Comment 11 Evgeni Vakhonin 2022-05-03 06:46:41 UTC
OK then, moving back to ASSIGNED and waiting for your PR. Thanks!

Comment 13 Evgeni Vakhonin 2022-05-10 18:35:08 UTC
Tried pre-merge verification with the latest PR.

Installed a patched cluster with the following:

Client Version: 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest
Kustomize Version: v4.5.4
Server Version: 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest
Kubernetes Version: v1.23.3-2049+69213f85dee380-dirty

╰─ cvoCapsAndConds 
spec
{
  "capabilities": {
    "additionalEnabledCapabilities": [
      "marketplace"
    ],
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "ac8140ce-4e6a-4390-a200-c0d1b24bec92"
}
status caps
{
  "enabledCapabilities": [
    "marketplace"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-05-10T17:49:22Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest not found in the "stable-4.11" channel
2022-05-10T17:49:22Z ImplicitlyEnabledCapabilities=False AsExpected: Capabilities match configured spec
2022-05-10T17:49:22Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest" image="registry.build01.ci.openshift.org/ci-ln-6vrmgwb/release@sha256:9a1c4edbfae2d13af608d8dca93260a53884dfc68963ddb72c2a1a042aa44fde"
2022-05-10T18:05:57Z Available=True : Done applying 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest
2022-05-10T18:04:12Z Failing=False : 
2022-05-10T18:05:57Z Progressing=False : Cluster version is 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest

╰─ oc patch clusterversion version --type json --patch '[{"op": "remove", "path": "/spec/capabilities/additionalEnabledCapabilities" }]'
clusterversion.config.openshift.io/version patched

╰─ cvoCapsAndConds
spec
{
  "capabilities": {
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "ac8140ce-4e6a-4390-a200-c0d1b24bec92"
}
status caps
{
  "enabledCapabilities": [
    "marketplace"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-05-10T17:49:22Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest not found in the "stable-4.11" channel
2022-05-10T18:19:02Z ImplicitlyEnabledCapabilities=True CapabilitiesImplicitlyEnabled: The following capabilities could not be disabled: marketplace
2022-05-10T17:49:22Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest" image="registry.build01.ci.openshift.org/ci-ln-6vrmgwb/release@sha256:9a1c4edbfae2d13af608d8dca93260a53884dfc68963ddb72c2a1a042aa44fde"
2022-05-10T18:05:57Z Available=True : Done applying 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest
2022-05-10T18:04:12Z Failing=False : 
2022-05-10T18:05:57Z Progressing=False : Cluster version is 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest


After triggering ImplicitlyEnabledCapabilities, upgraded to another fixed version built from the same PR:


╰─ oc adm upgrade --allow-explicit-upgrade --force --to-image registry.build01.ci.openshift.org/ci-ln-fdjflgk/release@sha256:fa4fd5127d78cd2f6128e8c9230f55daa9fe5267ba78c91c71464ef9b8763dfd
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.build01.ci.openshift.org/ci-ln-fdjflgk/release@sha256:fa4fd5127d78cd2f6128e8c9230f55daa9fe5267ba78c91c71464ef9b8763dfd

╰─ cvoCapsAndConds
spec
{
  "capabilities": {
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "ac8140ce-4e6a-4390-a200-c0d1b24bec92",
  "desiredUpdate": {
    "force": true,
    "image": "registry.build01.ci.openshift.org/ci-ln-fdjflgk/release@sha256:fa4fd5127d78cd2f6128e8c9230f55daa9fe5267ba78c91c71464ef9b8763dfd",
    "version": ""
  }
}
status caps
{
  "enabledCapabilities": [
    "baremetal",
    "marketplace"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}
2022-05-10T17:49:22Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.ci.test-2022-05-10-131821-ci-ln-fdjflgk-latest not found in the "stable-4.11" channel
2022-05-10T18:19:02Z ImplicitlyEnabledCapabilities=True CapabilitiesImplicitlyEnabled: The following capabilities could not be disabled: marketplace, baremetal
2022-05-10T17:49:22Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.ci.test-2022-05-10-131821-ci-ln-fdjflgk-latest" image="registry.build01.ci.openshift.org/ci-ln-fdjflgk/release@sha256:fa4fd5127d78cd2f6128e8c9230f55daa9fe5267ba78c91c71464ef9b8763dfd"
2022-05-10T18:05:57Z Available=True : Done applying 4.11.0-0.ci.test-2022-05-10-071934-ci-ln-6vrmgwb-latest
2022-05-10T18:21:12Z Failing=False : 
2022-05-10T18:21:08Z Progressing=True : Working towards 4.11.0-0.ci.test-2022-05-10-131821-ci-ln-fdjflgk-latest: 9 of 796 done (1% complete)


As seen above, ImplicitlyEnabledCapabilities is preserved this time, but it has grown the 'baremetal' capability when it should only contain 'marketplace'.
It appears to me that the original bug is fixed, but we have hit yet another one that was hiding behind it. @jack.ottofaro wdyt?
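
The growth can be checked by comparing the ImplicitlyEnabledCapabilities condition messages before and after the upgrade. This is a minimal Python sketch assuming the message keeps the wording seen in the outputs above; `implicit_caps` and `grown_caps` are hypothetical helper names.

```python
# Assumption: the condition message keeps the wording seen in the outputs above.
PREFIX = "The following capabilities could not be disabled: "


def implicit_caps(message):
    """Parse capability names from the condition message; empty set otherwise."""
    if not message.startswith(PREFIX):
        return set()
    return {name.strip() for name in message[len(PREFIX):].split(",") if name.strip()}


def grown_caps(message_before, message_after):
    """Capabilities that became implicitly enabled between the two messages."""
    return sorted(implicit_caps(message_after) - implicit_caps(message_before))


before = "The following capabilities could not be disabled: marketplace"
after = "The following capabilities could not be disabled: marketplace, baremetal"
print(grown_caps(before, after))
```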

Comment 15 Jack Ottofaro 2022-05-10 21:09:26 UTC
(In reply to Evgeni Vakhonin from comment #14)
> Created attachment 1878422 [details]
> CVO Logs after #773
> 
> for both pods

Thanks for the pre-merge test, Evgeni. From the outgoing CVO's log, it appears to pick up an extra implicit capability when it checks the new payload to see whether any enabled manifests' capabilities have changed. I will let you know when I have figured it out, fixed it, and tested the fix.

Comment 16 Jack Ottofaro 2022-05-11 15:10:21 UTC
From my testing and logs I can see that the ConfigMap "openshift-machine-api/kube-rbac-proxy" is at first not applied due to "disabled capabilities: baremetal", which is correct. But later I see the ConfigMap "openshift-machine-api/kube-rbac-proxy" being applied. Hmm, are there two manifests with this same resource, i.e. the same apiVersion, kind, name, and namespace? So I extracted my PR's image, and it appears there are two: one with the baremetal annotation and one without.

0000_31_cluster-baremetal-operator_05_kube-rbac-proxy-config.yaml:

apiVersion: v1
data:
  config-file.yaml: |+
    authorization:
      resourceAttributes:
        apiVersion: v1
        resource: namespace
        subresource: metrics
        namespace: openshift-machine-api

kind: ConfigMap
metadata:
  annotations:
    capability.openshift.io/name: baremetal
    include.release.openshift.io/self-managed-high-availability: "true"
  name: kube-rbac-proxy
  namespace: openshift-machine-api

0000_30_machine-api-operator_10_kube-rbac-proxy-config.yaml:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-rbac-proxy
  namespace: openshift-machine-api
  annotations:
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
data:
  config-file.yaml: |+
    authorization:
      resourceAttributes:
        apiVersion: v1
        resource: namespace
        subresource: metrics
        namespace: openshift-machine-api

So indirectly this tests, and verifies, the correct behaviour to support https://issues.redhat.com/browse/OTA-574. Since the ConfigMap "openshift-machine-api/kube-rbac-proxy" does get deployed in the initial release (from the manifest that does not contain the baremetal capability annotation), when the upgrade happens the CVO reads the 0000_31_cluster-baremetal-operator_05_kube-rbac-proxy-config.yaml manifest from the upgrade release, compares it to the 0000_30_machine-api-operator_10_kube-rbac-proxy-config.yaml manifest from the initial release, and says, "Oh, this resource is running on the cluster, but the resource is now part of the "baremetal" capability, which is disabled, so I'll enable it and flag it as implicitly enabled".
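
The comparison described above can be sketched roughly as follows. This is a simplified Python model, not the CVO's implementation: `newly_implicit`, `resource_key`, `capability_of`, and `config_map` are hypothetical names, manifests are reduced to plain dicts, and a resource counts as installed when its manifest in the initial release has no capability annotation or names an enabled capability.

```python
# Simplified model of the comparison described above; not the CVO's code.
CAP_ANNOTATION = "capability.openshift.io/name"


def resource_key(manifest):
    meta = manifest["metadata"]
    return (manifest["apiVersion"], manifest["kind"], meta["namespace"], meta["name"])


def capability_of(manifest):
    return manifest["metadata"].get("annotations", {}).get(CAP_ANNOTATION)


def newly_implicit(current_manifests, update_manifests, enabled):
    """Capabilities to enable implicitly because their resources already run."""
    installed = {
        resource_key(m)
        for m in current_manifests
        if capability_of(m) is None or capability_of(m) in enabled
    }
    return sorted({
        capability_of(m)
        for m in update_manifests
        if capability_of(m) is not None
        and capability_of(m) not in enabled
        and resource_key(m) in installed
    })


def config_map(annotations):
    """Build the kube-rbac-proxy ConfigMap identity with the given annotations."""
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": "kube-rbac-proxy",
            "namespace": "openshift-machine-api",
            "annotations": annotations,
        },
    }


ha = "include.release.openshift.io/self-managed-high-availability"
initial_release = [config_map({ha: "true"})]
upgrade_release = [config_map({CAP_ANNOTATION: "baremetal", ha: "true"})]

print(newly_implicit(initial_release, upgrade_release, enabled={"marketplace"}))
```

In this model the ConfigMap is installed from the unannotated manifest, so the annotated manifest in the upgrade release pulls in `baremetal`, matching the extra capability seen in comment 13.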

I suspect there was an oversight and 0000_30_machine-api-operator_10_kube-rbac-proxy-config.yaml shouldn't be around any longer, so I'll get to the bottom of that.
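
Duplicates like this can be found by grouping a payload's manifests by resource identity. The Python sketch below is illustrative only: `duplicate_resources` is a hypothetical helper, and the manifests are reduced to the identity fields shown in the two files above.

```python
from collections import defaultdict


def duplicate_resources(manifests):
    """Map resource identity -> filenames, for identities defined in several files."""
    by_key = defaultdict(list)
    for filename, manifest in manifests.items():
        meta = manifest["metadata"]
        key = (manifest["apiVersion"], manifest["kind"],
               meta.get("namespace", ""), meta["name"])
        by_key[key].append(filename)
    return {key: files for key, files in by_key.items() if len(files) > 1}


# Identity fields taken from the two manifests quoted above.
kube_rbac_proxy = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "kube-rbac-proxy", "namespace": "openshift-machine-api"},
}
payload = {
    "0000_30_machine-api-operator_10_kube-rbac-proxy-config.yaml": kube_rbac_proxy,
    "0000_31_cluster-baremetal-operator_05_kube-rbac-proxy-config.yaml": kube_rbac_proxy,
}

for key, files in duplicate_resources(payload).items():
    print(key, files)
```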

Comment 17 Jack Ottofaro 2022-05-11 16:50:40 UTC
Created bug [1] to correct the resource issue described in https://bugzilla.redhat.com/show_bug.cgi?id=2079789#c16. See the Slack discussion at https://coreos.slack.com/archives/CFP6ST0A3/p1652282996332859 for more detail.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=2084215

Comment 20 Yang Yang 2022-05-18 09:22:27 UTC
Verifying with 4.11.0-0.nightly-2022-05-18-010528

Steps to verify:
1. Install a cluster with baremetal and openshift-samples enabled

# oc get clusterversion/version -ojson | jq -r '.spec, .status'
{
  "capabilities": {
    "additionalEnabledCapabilities": [
      "baremetal",
      "openshift-samples"
    ],
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "afb0a880-efcd-429b-9ab3-479e53723b39"
}
{
  "availableUpdates": null,
  "capabilities": {
    "enabledCapabilities": [
      "baremetal",
      "openshift-samples"
    ],
    "knownCapabilities": [
      "baremetal",
      "marketplace",
      "openshift-samples"
    ]
  },

2. Disable openshift-samples
# oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2022-05-18T05:46:46Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-05-18-010528 not found in the "stable-4.11" channel
2022-05-18T08:06:30Z ImplicitlyEnabledCapabilities=True CapabilitiesImplicitlyEnabled: The following capabilities could not be disabled: openshift-samples
2022-05-18T05:46:46Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.nightly-2022-05-18-010528" image="registry.ci.openshift.org/ocp/release@sha256:414dbf48dc86fba8e94e830e12e0629fec6d540f1991289af05c6c9225738edd"
2022-05-18T06:05:12Z Available=True : Done applying 4.11.0-0.nightly-2022-05-18-010528
2022-05-18T06:05:12Z Failing=False : 
2022-05-18T06:05:12Z Progressing=False : Cluster version is 4.11.0-0.nightly-2022-05-18-010528
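
For scripting the same check without jq, the formatting done by the jq filter above can be approximated in Python. This is a sketch that assumes the ClusterVersion JSON has already been saved (for example from `oc get -o json clusterversion version`); `format_conditions` is a hypothetical helper, and missing `reason` or `message` fields are treated as empty strings.

```python
import json


def format_conditions(cluster_version_json):
    """Render each status condition as 'time type=status reason: message'."""
    cluster_version = json.loads(cluster_version_json)
    return [
        "{} {}={} {}: {}".format(
            cond["lastTransitionTime"], cond["type"], cond["status"],
            cond.get("reason", ""), cond.get("message", ""),
        )
        for cond in cluster_version["status"]["conditions"]
    ]


# Trimmed sample built from two of the conditions shown above.
sample = json.dumps({
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "2022-05-18T08:06:30Z",
                "type": "ImplicitlyEnabledCapabilities",
                "status": "True",
                "reason": "CapabilitiesImplicitlyEnabled",
                "message": "The following capabilities could not be disabled: openshift-samples",
            },
            {
                "lastTransitionTime": "2022-05-18T06:05:12Z",
                "type": "Failing",
                "status": "False",
            },
        ]
    }
})

for line in format_conditions(sample):
    print(line)
```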

3. Upgrade the cluster
# oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:09950da77b12d7e6141f859efc9e958fb412df61d2bc051c575c79e3f535badd --allow-explicit-upgrade --force
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image registry.ci.openshift.org/ocp/release@sha256:09950da77b12d7e6141f859efc9e958fb412df61d2bc051c575c79e3f535badd

4. After upgrade is completed, check cv condition
# oc get -o json clusterversion version | jq -r '.status.conditions[] | .lastTransitionTime + " " + .type + "=" + .status + " " + .reason + ": " + .message'
2022-05-18T05:46:46Z RetrievedUpdates=False VersionNotFound: Unable to retrieve available updates: currently reconciling cluster version 4.11.0-0.nightly-2022-05-18-053037 not found in the "stable-4.11" channel
2022-05-18T08:06:30Z ImplicitlyEnabledCapabilities=True CapabilitiesImplicitlyEnabled: The following capabilities could not be disabled: openshift-samples
2022-05-18T05:46:46Z ReleaseAccepted=True PayloadLoaded: Payload loaded version="4.11.0-0.nightly-2022-05-18-053037" image="registry.ci.openshift.org/ocp/release@sha256:09950da77b12d7e6141f859efc9e958fb412df61d2bc051c575c79e3f535badd"
2022-05-18T06:05:12Z Available=True : Done applying 4.11.0-0.nightly-2022-05-18-053037
2022-05-18T06:05:12Z Failing=False : 
2022-05-18T08:58:42Z Progressing=False : Cluster version is 4.11.0-0.nightly-2022-05-18-053037

Fine: ImplicitlyEnabledCapabilities=True, and the condition still reports openshift-samples.

5. Check cv status
# oc get clusterversion/version -ojson | jq -r '.spec, .status.capabilities'
{
  "capabilities": {
    "additionalEnabledCapabilities": [
      "baremetal"
    ],
    "baselineCapabilitySet": "None"
  },
  "channel": "stable-4.11",
  "clusterID": "afb0a880-efcd-429b-9ab3-479e53723b39",
  "desiredUpdate": {
    "force": true,
    "image": "registry.ci.openshift.org/ocp/release@sha256:09950da77b12d7e6141f859efc9e958fb412df61d2bc051c575c79e3f535badd",
    "version": ""
  }
}
{
  "enabledCapabilities": [
    "baremetal",
    "openshift-samples"
  ],
  "knownCapabilities": [
    "baremetal",
    "marketplace",
    "openshift-samples"
  ]
}

Looks good, no additional caps have grown.

6. Check co
# oc get co | egrep 'baremetal|marketplace|openshift-samples'
baremetal                                  4.11.0-0.nightly-2022-05-18-053037   True        False         False      3h18m   
openshift-samples                          4.11.0-0.nightly-2022-05-18-053037   True        False         False      51m     

Looks good to me. Moving it to verified state.

Comment 22 errata-xmlrpc 2022-08-10 11:09:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

