2071211 – CVO does not trigger new upgrade again after fail to update to unavailable payload

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Cluster Version Operator
Sub Component:
Version:	4.10
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.10.z
Assignee:	Over the Air Updates
QA Contact:	liujia
Docs Contact:
URL:
Whiteboard:
Depends On:	2062568
Blocks:

Reported:	2022-04-02 08:03 UTC by liujia
Modified:	2022-08-18 02:25 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	2062568
Environment:
Last Closed:	2022-05-02 18:38:50 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-version-operator pull 764	0	None	Merged	Bug 2071211: lib/resourcebuilder/batch: Stop waiting on Job deadline exceeded	2022-08-18 01:58:53 UTC
Red Hat Product Errata	RHBA-2022:1601	0	None	None	None	2022-05-02 18:39:07 UTC

Comment 2 liujia 2022-04-19 03:21:40 UTC

Checked on a cluster launched by cluster-bot with: 4.10,openshift/cluster-version-operator#764

1. First, try to upgrade to a payload in an unavailable repo; the upgrade fails.
# ./oc get clusterversion -ojson|jq -r '.items[].status.conditions[]| select(.type=="ReleaseAccepted")'
{
  "lastTransitionTime": "2022-04-19T02:59:18Z",
  "message": "Retrieving payload failed version=\"\" image=\"quay.io/openshift-release-dev-test/ocp-release@sha256:39efe13ef67cb4449f5e6cdd8a26c83c07c6a2ce5d235dfbc3ba58c64418fcf3\" failure=Unable to download and prepare the update: deadline exceeded, reason: \"DeadlineExceeded\", message: \"Job was active longer than specified deadline\"",
  "reason": "RetrievePayload",
  "status": "False",
  "type": "ReleaseAccepted"
}
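
For anyone who prefers to script this check, the jq filter above can be approximated in Python. This is only a sketch: the function name release_accepted is hypothetical, and the sample input is an abbreviated, hand-written document shaped like `oc get clusterversion -ojson` output (the "Available" entry and the shortened message are illustrative, not taken from the cluster).

```python
import json


def release_accepted(clusterversion_list_json: str) -> list:
    """Return every ReleaseAccepted condition in a ClusterVersion list document.

    Mirrors: jq '.items[].status.conditions[] | select(.type=="ReleaseAccepted")'
    """
    doc = json.loads(clusterversion_list_json)
    return [
        cond
        for item in doc.get("items", [])
        for cond in item.get("status", {}).get("conditions", [])
        if cond.get("type") == "ReleaseAccepted"
    ]


# Illustrative input only; a real document carries many more fields.
sample = json.dumps({
    "items": [{
        "status": {
            "conditions": [
                {"type": "Available", "status": "True"},
                {
                    "type": "ReleaseAccepted",
                    "status": "False",
                    "reason": "RetrievePayload",
                    "message": "Retrieving payload failed (abbreviated)",
                },
            ]
        }
    }]
})

conds = release_accepted(sample)
for cond in conds:
    print(cond["status"], cond["reason"])  # prints: False RetrievePayload
```

On a live cluster the input string would come from the `oc get clusterversion -ojson` command shown in this comment.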

2. Continue by upgrading to the target payload in the correct repo. The upgrade is still not triggered successfully, but this time because of another known issue.
# ./oc get clusterversion -ojson|jq -r '.items[].status.conditions[]| select(.type=="ReleaseAccepted")'
{
  "lastTransitionTime": "2022-04-19T02:59:18Z",
  "message": "Preconditions failed for payload loaded version=\"4.10.10\" image=\"quay.io/openshift-release-dev/ocp-release@sha256:39efe13ef67cb4449f5e6cdd8a26c83c07c6a2ce5d235dfbc3ba58c64418fcf3\": Precondition \"EtcdRecentBackup\" failed because of \"ControllerStarted\": ",
  "reason": "PreconditionChecks",
  "status": "False",
  "type": "ReleaseAccepted"
}

A further check shows that the new job downloading the new target payload succeeded, which means the new update is not blocked by the issue in this bug.
# ./oc get job
NAME             COMPLETIONS   DURATION   AGE
version--jvldk   1/1           8s         11m
version--snz26   0/1           21m        21m

# ./oc describe pod/version--jvldk-j9zvz|grep "quay.io"
    Image:         quay.io/openshift-release-dev/ocp-release@sha256:39efe13ef67cb4449f5e6cdd8a26c83c07c6a2ce5d235dfbc3ba58c64418fcf3
    Image ID:      quay.io/openshift-release-dev/ocp-release@sha256:39efe13ef67cb4449f5e6cdd8a26c83c07c6a2ce5d235dfbc3ba58c64418fcf3
  Normal  Pulling         10m   kubelet  Pulling image "quay.io/openshift-release-dev/ocp-release@sha256:39efe13ef67cb4449f5e6cdd8a26c83c07c6a2ce5d235dfbc3ba58c64418fcf3"
  Normal  Pulled          10m   kubelet  Successfully pulled image "quay.io/openshift-release-dev/ocp-release@sha256:39efe13ef67cb4449f5e6cdd8a26c83c07c6a2ce5d235dfbc3ba58c64418fcf3" in 2.976460275s
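
The linked pull request is titled "Stop waiting on Job deadline exceeded". As a rough illustration of that idea (not the CVO's actual Go code), the sketch below inspects a Job object, as returned by `oc get job <name> -ojson`, for a Failed condition whose reason is DeadlineExceeded. The function name job_deadline_exceeded and both sample Jobs are hypothetical; only the reason and message strings are taken from the failure reported in step 1.

```python
def job_deadline_exceeded(job: dict) -> bool:
    """Return True if the Job reports a Failed condition with reason DeadlineExceeded."""
    for cond in job.get("status", {}).get("conditions", []):
        if (
            cond.get("type") == "Failed"
            and cond.get("status") == "True"
            and cond.get("reason") == "DeadlineExceeded"
        ):
            return True
    return False


# Hypothetical Job that ran past its deadline.
failed_job = {
    "metadata": {"name": "example-failed"},
    "status": {
        "conditions": [{
            "type": "Failed",
            "status": "True",
            "reason": "DeadlineExceeded",
            "message": "Job was active longer than specified deadline",
        }]
    },
}

# Hypothetical Job that completed normally.
complete_job = {
    "metadata": {"name": "example-complete"},
    "status": {"conditions": [{"type": "Complete", "status": "True"}]},
}

for job in (failed_job, complete_job):
    print(job["metadata"]["name"], job_deadline_exceeded(job))
```

A caller that sees True for the old Job can stop waiting on it and move on to the Job for the new target, which matches the behaviour verified above.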

Comment 7 errata-xmlrpc 2022-05-02 18:38:50 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.12 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1601

Comment 8 W. Trevor King 2022-08-18 02:25:03 UTC

We're considering taking this back to 4.9.z in [1].

[1]: https://issues.redhat.com/browse/OCPBUGS-230
