Bug 2064991 - cluster-version operator stops applying manifests when blocked by a precondition check
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 4.10.z
Assignee: Over the Air Updates
QA Contact: liujia
URL:
Whiteboard:
Depends On: 1822752
Blocks: 1822922 2091806
Reported: 2022-03-17 04:00 UTC by liujia
Modified: 2022-06-06 16:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1822752
Environment:
Last Closed: 2022-04-08 05:04:28 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-version-operator pull 753 0 None Merged Bug 2064991: pkg/cvo: Separate payload load from payload apply 2022-04-03 04:22:43 UTC
Red Hat Product Errata RHSA-2022:1162 0 None None None 2022-04-08 05:04:42 UTC

Comment 2 liujia 2022-03-17 06:37:34 UTC
Yeah, I agree that we should be careful with the backport to v4.10/v4.9, since the change is fairly large. I cloned the bug to raise the question here: apart from the incomplete ResourceDeletesInProgress, we need to decide whether it is worth resolving the issue (the CVO stops syncing manifests) in 4.10/4.9, or whether to leave it as a known issue.

Comment 5 liujia 2022-04-02 06:30:29 UTC
Version : 4.10.8

1. Upgrade cluster to an unsigned payload.
# ./oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:edb2f74d5caf03746726808655745baa7f9561f25e9dac39d226380ca0d20295
warning: The requested upgrade image is not one of the available updates.You have used --allow-explicit-upgrade for the update to proceed anyway
Updating to release image registry.ci.openshift.org/ocp/release@sha256:edb2f74d5caf03746726808655745baa7f9561f25e9dac39d226380ca0d20295

2. Check that signature verification of the image failed and that no upgrade happened.
# ./oc get clusterversion -ojson|jq .items[].status.conditions[1]
{
  "lastTransitionTime": "2022-04-02T06:11:53Z",
  "message": "Retrieving payload failed version=\"\" image=\"registry.ci.openshift.org/ocp/release@sha256:edb2f74d5caf03746726808655745baa7f9561f25e9dac39d226380ca0d20295\" failure=The update cannot be verified: unable to locate a valid signature for one or more sources",
  "reason": "RetrievePayload",
  "status": "False",
  "type": "ReleaseAccepted"
}
# ./oc get clusterversion -ojson|jq .items[].status.conditions[]
...
{
  "lastTransitionTime": "2022-04-02T06:11:53Z",
  "message": "Retrieving payload failed version=\"\" image=\"registry.ci.openshift.org/ocp/release@sha256:edb2f74d5caf03746726808655745baa7f9561f25e9dac39d226380ca0d20295\" failure=The update cannot be verified: unable to locate a valid signature for one or more sources",
  "reason": "RetrievePayload",
  "status": "False",
  "type": "ReleaseAccepted"
}
...
{
  "lastTransitionTime": "2022-04-02T04:16:55Z",
  "message": "Cluster version is 4.10.8",
  "status": "False",
  "type": "Progressing"
}
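The first query in step 2 picks the condition by array position (`conditions[1]`), which only works while ReleaseAccepted happens to sit at that index. A minimal sketch of selecting it by `type` instead is shown here; `condition_by_type` is a hypothetical helper name, and it assumes `jq` is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of oc): reads `oc get clusterversion -ojson`
# output on stdin and prints the condition whose .type matches $1.
condition_by_type() {
  jq --arg t "$1" '.items[].status.conditions[] | select(.type == $t)'
}

# Usage against a live cluster (assumes oc is already logged in):
#   ./oc get clusterversion -ojson | condition_by_type ReleaseAccepted
```

Selecting by type keeps the check stable if the order of conditions changes between releases.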

3. Patch maxUnavailable of the marketplace-operator deployment.
# ./oc patch -n openshift-marketplace deployment/marketplace-operator --type=json -p '[{"op": "replace", "path": "/spec/strategy/rollingUpdate/maxUnavailable", "value": "50%"}]'
deployment.apps/marketplace-operator patched

# ./oc -n openshift-marketplace get deployment -ojson|jq .items[].spec.strategy.rollingUpdate
{
  "maxSurge": "25%",
  "maxUnavailable": "50%"
}

4. Wait several minutes, then check that the resource has been reconciled back to 25%.
# ./oc -n openshift-marketplace get deployment -ojson|jq .items[].spec.strategy.rollingUpdate
{
  "maxSurge": "25%",
  "maxUnavailable": "25%"
}
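The manual "wait, then re-check" in step 4 could be scripted as a polling loop. The sketch below is only an illustration: `wait_for_value` and `max_unavailable` are hypothetical helper names, and the 10-minute timeout and 15-second interval in the usage comment are assumed values, not figures from this report.

```shell
#!/usr/bin/env bash
# wait_for_value EXPECTED TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until its stdout equals EXPECTED.
# Returns 0 on a match, 1 once TIMEOUT_SECONDS have elapsed without one.
wait_for_value() {
  local expected=$1 timeout=$2 interval=$3
  shift 3
  local deadline=$(( SECONDS + timeout ))
  local actual
  while :; do
    # Treat a failing command as "no value yet" and keep polling.
    actual=$("$@" 2>/dev/null) || actual=""
    if [ "$actual" = "$expected" ]; then
      return 0
    fi
    if (( SECONDS >= deadline )); then
      echo "timed out; last value: ${actual:-<none>}" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Assumed wrapper: reads maxUnavailable from the deployment patched in step 3.
max_unavailable() {
  ./oc -n openshift-marketplace get deployment/marketplace-operator \
    -o jsonpath='{.spec.strategy.rollingUpdate.maxUnavailable}'
}

# Usage: wait up to 10 minutes, polling every 15 seconds.
#   wait_for_value "25%" 600 15 max_unavailable
```

A non-zero exit status then indicates that the CVO did not reconcile the deployment within the chosen window.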

Comment 7 errata-xmlrpc 2022-04-08 05:04:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.10.8 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1162

