Bug 1931025 - 4.5.15 and later cluster-version operator does not sync ClusterVersion status before exiting, leaving 'verified: false' even for verified updates
Summary: 4.5.15 and later cluster-version operator does not sync ClusterVersion status...
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.5.z
Assignee: W. Trevor King
QA Contact: Yang Yang
Depends On: 1927515
Reported: 2021-02-20 06:46 UTC by W. Trevor King
Modified: 2021-03-11 06:55 UTC
CC: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The cluster-version operator was not syncing ClusterVersion during graceful shutdowns. Consequence: During updates, the outgoing cluster-version operator was likely to exit after verifying the incoming release, but before pushing the 'verified: true' value into ClusterVersion history. Fix: The cluster-version operator now allows some additional time to perform a final ClusterVersion status synchronization during graceful shutdowns. Result: The ClusterVersion 'verified' values are again consistently 'true' for releases which were verified before being applied, returning to the behavior we had before 4.5.15 and 4.6.0.
Clone Of: 1927515
Last Closed: 2021-03-11 06:55:27 UTC
Target Upstream Version:

Attachments (Terms of Use)

System ID Private Priority Status Summary Last Updated
Github openshift cluster-version-operator pull 525 0 None closed Bug 1931025: pkg/cvo: Use shutdownContext for final status synchronization 2021-02-27 21:42:17 UTC
Red Hat Product Errata RHBA-2021:0714 0 None None None 2021-03-11 06:55:35 UTC

Description W. Trevor King 2021-02-20 06:46:21 UTC
+++ This bug was initially created as a clone of Bug #1927515 +++

+++ This bug was initially created as a clone of Bug #1916384 +++

--- Additional comment from wking on 2021-01-20 04:02:12 UTC ---

Thanks for pushing this, Neil :).  I've pushed up a PR for master/4.7 which has what I expect to be a fix.  As that is verified for 4.7, we'll backport to 4.6, and then to 4.5.  Filling out a formal impact statement:

Who is impacted?  If we have to block upgrade edges based on this issue, which edges would need blocking?
* 4.5.15's bug 1872906 and 4.6.0's bug 1843505 broke the outgoing ClusterVersion status sync.  Clusters updating out of those releases, regardless of which release they are updating to, will be impacted by this bug.  To avoid the bug, we could theoretically block all edges into impacted releases, but that's an awful lot of releases, and as discussed below, the impact isn't particularly terrible.

What is the impact?  Is it serious enough to warrant blocking edges?
* Late-breaking changes to ClusterVersion status may not be pushed into the cluster.  Because it takes some time to pull down and verify the release image, and because the incoming CVO knows which version it has been asked to run, the version name and release image are unlikely to be corrupted, but 'verified' might be reported as 'false' when in reality the incoming release was successfully verified (the outgoing CVO just exited without attempting to sync that 'verified: true' out to the cluster).  This corrupted data is unfortunate, but it has no known in-cluster consumers, and the main "was the target signed?" condition can be confirmed later by manually looking up the signature for the target release image.  That is probably limited enough to not be worth blocking edges.
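For the manual signature lookup mentioned above, release image signatures are published under a per-digest directory on mirror.openshift.com.  A rough sketch, using this bug's 4.6.19 pullspec; treat the mirror URL layout as an assumption if you are checking against a different signature store:

```shell
# Derive the signature directory name from a release image pullspec
# (pure string handling; no cluster access needed for this part).
PULLSPEC="quay.io/openshift-release-dev/ocp-release@sha256:47df4bfe1cfd6d63dd2e880f00075ed1d37f997fd54884ed823ded9f5d96abfc"
DIGEST="${PULLSPEC#*@}"              # sha256:47df4bfe...
SIGPATH="sha256=${DIGEST#sha256:}"   # sha256=47df4bfe...
echo "${SIGPATH}"
# On a connected workstation, fetch the signature (assumed mirror layout):
# curl -sO "https://mirror.openshift.com/pub/openshift-v4/signatures/openshift/release/${SIGPATH}/signature-1"
```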

How involved is remediation (even moderately serious impacts might be acceptable if they are easy to mitigate)?
* Updating out to fixed releases will avoid the problem for future updates.  There's no repairing updates out of impacted releases short of manually forcing 'verified' values, and that's probably not something we want to recommend.

Is this a regression (if all previous versions were also vulnerable, updating to the new, vulnerable version does not increase exposure)?
* Yes, from 4.5.14 (and earlier) into 4.5.15, 4.6, and later.
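The linked PR (openshift/cluster-version-operator#525, "Use shutdownContext for final status synchronization") is Go, but the shape of the fix can be sketched in shell: on SIGTERM, run one last, time-bounded status sync before exiting instead of exiting immediately.  Everything below (the `sync_status` stub and the message it prints) is illustrative only, not the actual CVO code:

```shell
#!/bin/bash
# Illustrative sketch of the fix's shape, not the real Go implementation.
sync_status() {
  # Stand-in for the CVO's final ClusterVersion status push.
  echo "pushed final ClusterVersion status (verified: true)"
}

on_term() {
  # The real CVO bounds this with a shutdownContext deadline rather than
  # dropping the sync on the floor and exiting straight away.
  sync_status
  exit 0
}

trap on_term TERM
kill -TERM $$   # simulate the container runtime's stop signal
```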

Comment 2 Yang Yang 2021-03-01 06:17:31 UTC
Verified with 4.5.0-0.nightly-2021-02-26-170201

Steps to verify it:
1. Install a cluster with 4.5.0-0.nightly-2021-02-26-170201
2. Create a dummy cincy server with 4.5.0-0.nightly-2021-02-26-170201 and 4.6.19
3. Patch to use the cincy server
4. Upgrade the cluster to 4.6.19

# oc get clusterversion -oyaml
apiVersion: v1
items:
- apiVersion: config.openshift.io/v1
  kind: ClusterVersion
  metadata:
    creationTimestamp: "2021-03-01T04:17:22Z"
    generation: 3
    managedFields:
    - apiVersion: config.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:clusterID: {}
      manager: cluster-bootstrap
      operation: Update
      time: "2021-03-01T04:17:22Z"
    - apiVersion: config.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:channel: {}
          f:upstream: {}
      manager: kubectl-edit
      operation: Update
      time: "2021-03-01T06:07:17Z"
    - apiVersion: config.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:desiredUpdate:
            .: {}
            f:force: {}
            f:image: {}
            f:version: {}
      manager: oc
      operation: Update
      time: "2021-03-01T06:07:51Z"
    - apiVersion: config.openshift.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:availableUpdates: {}
          f:conditions: {}
          f:desired:
            .: {}
            f:channels: {}
            f:image: {}
            f:url: {}
            f:version: {}
          f:history: {}
          f:observedGeneration: {}
          f:versionHash: {}
      manager: cluster-version-operator
      operation: Update
      time: "2021-03-01T06:12:52Z"
    name: version
    resourceVersion: "50753"
    selfLink: /apis/config.openshift.io/v1/clusterversions/version
    uid: 5b136b58-5a12-40c0-9b59-45d3f462f387
  spec:
    channel: stable-4.6
    clusterID: 16a9d8a3-a65d-4dda-a23f-dc717ed35a75
    desiredUpdate:
      force: false
      image: quay.io/openshift-release-dev/ocp-release@sha256:47df4bfe1cfd6d63dd2e880f00075ed1d37f997fd54884ed823ded9f5d96abfc
      version: 4.6.19
    upstream: https://raw.githubusercontent.com/shellyyang1989/upgrade-cincy/master/cincy4.json
  status:
    availableUpdates: null
    conditions:
    - lastTransitionTime: "2021-03-01T05:00:59Z"
      message: Done applying 4.5.0-0.nightly-2021-02-26-170201
      status: "True"
      type: Available
    - lastTransitionTime: "2021-03-01T06:08:28Z"
      status: "False"
      type: Failing
    - lastTransitionTime: "2021-03-01T06:07:59Z"
      message: 'Working towards 4.6.19: 15% complete'
      status: "True"
      type: Progressing
    - lastTransitionTime: "2021-03-01T06:07:17Z"
      status: "True"
      type: RetrievedUpdates
    desired:
      channels:
      - stable-4.6
      image: quay.io/openshift-release-dev/ocp-release@sha256:47df4bfe1cfd6d63dd2e880f00075ed1d37f997fd54884ed823ded9f5d96abfc
      url: https://access.redhat.com/errata/RHBA-2021:0634
      version: 4.6.19
    history:
    - completionTime: null
      image: quay.io/openshift-release-dev/ocp-release@sha256:47df4bfe1cfd6d63dd2e880f00075ed1d37f997fd54884ed823ded9f5d96abfc
      startedTime: "2021-03-01T06:07:59Z"
      state: Partial
      verified: true        <--- The state is changed to True.
      version: 4.6.19
    - completionTime: "2021-03-01T05:00:59Z"
      image: registry.ci.openshift.org/ocp/release@sha256:e54366af2e363c90249dceb97a1496d3b4249da69c5400ab383eca63799db762
      startedTime: "2021-03-01T04:17:39Z"
      state: Completed
      verified: false
      version: 4.5.0-0.nightly-2021-02-26-170201
    observedGeneration: 3
    versionHash: llINEEKbEPQ=
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

The 'verified: true' entry is visible, so moving this bug to the VERIFIED state.
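The same check can be scripted.  On a live cluster, `oc get clusterversion version -o json` alone suffices (history[0] is the newest entry); the heredoc below is hypothetical saved output standing in for a cluster so the snippet runs anywhere:

```shell
# On a real cluster:
#   oc get clusterversion version -o json > cv.json
# Hypothetical saved output standing in for live cluster data:
cat > cv.json <<'EOF'
{"status":{"history":[{"version":"4.6.19","state":"Partial","verified":true}]}}
EOF
# history[0] is the most recent entry; after the fix it should carry verified:true
grep -o '"verified":[a-z]*' cv.json
```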

Comment 5 errata-xmlrpc 2021-03-11 06:55:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.5.34 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

