2094078 – CVO gets stuck downloading an upgrade, with the version pod complaining about invalid options


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Cluster Version Operator
Sub Component:
Version:	4.10
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.10.z
Assignee:	W. Trevor King
QA Contact:	Evgeni Vakhonin
Docs Contact:
URL:
Whiteboard:
Depends On:	2091770
Blocks:

Reported:	2022-06-06 18:45 UTC by OpenShift BugZilla Robot
Modified:	2022-07-27 07:24 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-06-28 11:50:26 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	openshift cluster-version-operator pull 787	None	Merged	Bug 2094078: pkg/cvo/updatepayload: Guard against 'rm -fR -whatever' with ./*	2022-06-17 16:41:45 UTC
Red Hat Knowledge Base (Solution)	6965075	None	None	None	2022-06-29 18:53:24 UTC
Red Hat Product Errata	RHBA-2022:5172	None	None	None	2022-06-28 11:50:45 UTC

Comment 1 Evgeni Vakhonin 2022-06-07 08:20:21 UTC

Reproducing on Server Version: 4.10.0-0.nightly-2022-06-02-223105
Injecting a synthetic dash-prefixed directory on each control-plane node:
$ for NODE in $(oc get -l node-role.kubernetes.io/master= -o jsonpath='{.items[*].metadata.name}' nodes); do oc debug --as-root "node/${NODE}" -- mkdir -p /host/etc/cvo/updatepayloads/-cccccccc; done
upgrading to 4.10.0-0.nightly-2022-06-03-081413
$ oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:a6eeef7ca72d4127123f4e9a2733451553bc6e20c54a1c1ea34003481677ec41 --force 
$ oc -n openshift-cluster-version get pods
NAME                                        READY   STATUS       RESTARTS      AGE
cluster-version-operator-67454b6c58-wvt2g   1/1     Running      0             29m
version--p6rwq-wk6mk                        0/1     Init:Error   3 (28s ago)   46s
$ oc -n openshift-cluster-version logs version--p6rwq-wk6mk 
Error from server (BadRequest): container "rename-to-final-location" in pod "version--p6rwq-wk6mk" is waiting to start: PodInitializing
$ oc -n openshift-cluster-version logs version--p6rwq-wk6mk -c cleanup
rm: invalid option -- 'c'
Try 'rm ./-cccccccc' to remove the file '-cccccccc'.
Try 'rm --help' for more information.
$ oc adm upgrade
Cluster version is 4.10.0-0.nightly-2022-06-02-223105

ReleaseAccepted=False

  Reason: RetrievePayload
  Message: Retrieving payload failed version="" image="registry.ci.openshift.org/ocp/release@sha256:a6eeef7ca72d4127123f4e9a2733451553bc6e20c54a1c1ea34003481677ec41" failure=Unable to download and prepare the update: deadline exceeded, reason: "DeadlineExceeded", message: "Job was active longer than specified deadline"
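
The failure above can be reproduced locally without a cluster. The following is a minimal sketch, assuming a POSIX shell and an rm that rejects the unknown option 'c' (as the GNU rm in the log does). The scratch directory and the variable names are hypothetical stand-ins, not the CVO's actual cleanup script. The ./* form is the guard named in the title of the linked pull request.

```shell
#!/bin/sh
# Hypothetical scratch directory standing in for the update payload directory.
workdir="$(mktemp -d)"
mkdir -p "${workdir}/-cccccccc"
cd "${workdir}" || exit 1

# Unguarded glob: '*' expands to '-cccccccc', which rm parses as options.
if rm -fR * 2>/dev/null; then
    unguarded="succeeded"
else
    unguarded="failed"
fi

# Guarded glob: './*' expands to './-cccccccc', which rm treats as a path.
rm -fR ./*
if [ -e "${workdir}/-cccccccc" ]; then
    guarded="left the directory behind"
else
    guarded="removed the directory"
fi

echo "unguarded: ${unguarded}"
echo "guarded: ${guarded}"
```

Prefixing the glob with ./ means no expanded name can begin with a dash, so rm never mistakes a directory entry for an option.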

Pre-merge verification with a cluster bot build, Server Version: 4.10.0-0.ci.test-2022-06-07-071530-ci-ln-ivyrk82-latest
Injecting the same synthetic dash-prefixed directory:
$ for NODE in $(oc get -l node-role.kubernetes.io/master= -o jsonpath='{.items[*].metadata.name}' nodes); do oc debug --as-root "node/${NODE}" -- mkdir -p /host/etc/cvo/updatepayloads/-cccccccc; done
upgrading to 4.10.0-0.nightly-2022-06-06-184250
$ oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:919f026c74ab4739698eb5bebc5f201bd6028cd9246b0fc4415904abde35b1c1 --force
Checking pod status and logs:
$ oc -n openshift-cluster-version get pods
NAME                                       READY   STATUS      RESTARTS   AGE
cluster-version-operator-ffd6f5b58-r8dm4   1/1     Running     0          32m
version--vwm5f-rbj6j                       0/1     Completed   0          11s
$ oc -n openshift-cluster-version logs version--vwm5f-rbj6j 
Defaulted container "rename-to-final-location" out of: rename-to-final-location, cleanup (init), make-temporary-directory (init), move-operator-manifests-to-temporary-directory (init), move-release-manifests-to-temporary-directory (init)
$ oc -n openshift-cluster-version logs version--vwm5f-rbj6j -c cleanup
(no output)
The pod looks good. Checking the upgrade:
$ oc adm upgrade
info: An upgrade is in progress. Working towards 4.10.0-0.nightly-2022-06-06-184250: 95 of 771 done (12% complete)
The upgrade started and is progressing.

Everything looks good.

Comment 3 Evgeni Vakhonin 2022-06-19 11:02:44 UTC

Already pre-merge verified in https://bugzilla.redhat.com/show_bug.cgi?id=2094078#c1
Since the bug has not transitioned to VERIFIED automatically, changing the status manually.

Comment 6 errata-xmlrpc 2022-06-28 11:50:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.20 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5172
