Bug 2094078 - CVO gets stuck downloading an upgrade, with the version pod complaining about invalid options
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.z
Assignee: W. Trevor King
QA Contact: Evgeni Vakhonin
URL:
Whiteboard:
Depends On: 2091770
Blocks:
Reported: 2022-06-06 18:45 UTC by OpenShift BugZilla Robot
Modified: 2022-07-27 07:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-28 11:50:26 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-version-operator pull 787 | 0 | None | Merged | Bug 2094078: pkg/cvo/updatepayload: Guard against 'rm -fR -whatever' with ./* | 2022-06-17 16:41:45 UTC
Red Hat Knowledge Base (Solution) | 6965075 | 0 | None | None | None | 2022-06-29 18:53:24 UTC
Red Hat Product Errata | RHBA-2022:5172 | 0 | None | None | None | 2022-06-28 11:50:45 UTC

Comment 1 Evgeni Vakhonin 2022-06-07 08:20:21 UTC
Reproducing on Server Version: 4.10.0-0.nightly-2022-06-02-223105.
Injecting a synthetic dash-prefixed directory on each control-plane node:
$ for NODE in $(oc get -l node-role.kubernetes.io/master= -o jsonpath='{.items[*].metadata.name}' nodes); do oc debug --as-root "node/${NODE}" -- mkdir -p /host/etc/cvo/updatepayloads/-cccccccc; done
upgrading to 4.10.0-0.nightly-2022-06-03-081413
$ oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:a6eeef7ca72d4127123f4e9a2733451553bc6e20c54a1c1ea34003481677ec41 --force 
$ oc -n openshift-cluster-version get pods
NAME                                        READY   STATUS       RESTARTS      AGE
cluster-version-operator-67454b6c58-wvt2g   1/1     Running      0             29m
version--p6rwq-wk6mk                        0/1     Init:Error   3 (28s ago)   46s
$ oc -n openshift-cluster-version logs version--p6rwq-wk6mk 
Error from server (BadRequest): container "rename-to-final-location" in pod "version--p6rwq-wk6mk" is waiting to start: PodInitializing
$ oc -n openshift-cluster-version logs version--p6rwq-wk6mk -c cleanup
rm: invalid option -- 'c'
Try 'rm ./-cccccccc' to remove the file '-cccccccc'.
Try 'rm --help' for more information.
$ oc adm upgrade
Cluster version is 4.10.0-0.nightly-2022-06-02-223105

ReleaseAccepted=False

  Reason: RetrievePayload
  Message: Retrieving payload failed version="" image="registry.ci.openshift.org/ocp/release@sha256:a6eeef7ca72d4127123f4e9a2733451553bc6e20c54a1c1ea34003481677ec41" failure=Unable to download and prepare the update: deadline exceeded, reason: "DeadlineExceeded", message: "Job was active longer than specified deadline"




Pre-merge verification with cluster bot, Server Version: 4.10.0-0.ci.test-2022-06-07-071530-ci-ln-ivyrk82-latest.
Injecting the same dash-prefixed directory:
$ for NODE in $(oc get -l node-role.kubernetes.io/master= -o jsonpath='{.items[*].metadata.name}' nodes); do oc debug --as-root "node/${NODE}" -- mkdir -p /host/etc/cvo/updatepayloads/-cccccccc; done
upgrading to 4.10.0-0.nightly-2022-06-06-184250
$ oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:919f026c74ab4739698eb5bebc5f201bd6028cd9246b0fc4415904abde35b1c1 --force
checking pod status and log
$ oc -n openshift-cluster-version get pods                                                                                                                                
NAME                                       READY   STATUS      RESTARTS   AGE
cluster-version-operator-ffd6f5b58-r8dm4   1/1     Running     0          32m
version--vwm5f-rbj6j                       0/1     Completed   0          11s
$ oc -n openshift-cluster-version logs version--vwm5f-rbj6j 
Defaulted container "rename-to-final-location" out of: rename-to-final-location, cleanup (init), make-temporary-directory (init), move-operator-manifests-to-temporary-directory (init), move-release-manifests-to-temporary-directory (init)
$ oc -n openshift-cluster-version logs version--vwm5f-rbj6j -c cleanup
(no output)
The pod looks good. Checking the upgrade:
$ oc adm upgrade
info: An upgrade is in progress. Working towards 4.10.0-0.nightly-2022-06-06-184250: 95 of 771 done (12% complete)
The upgrade has started and is progressing.

everything looks good.
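
For anyone who wants to see the failure mode outside a cluster, here is a minimal local sketch. It assumes the cleanup init container runs something like `rm -fR *` in the payload directory (the linked pull request title suggests this, but the exact command is not quoted in this bug), and it uses a scratch directory from `mktemp` instead of /etc/cvo/updatepayloads. The variable names are only for illustration.

```shell
# Local sketch only: works in a scratch directory, never on a real node.
workdir="$(mktemp -d)" || exit 1
cd "$workdir" || exit 1

# Same kind of entry that the reproduction steps inject.
mkdir -- -cccccccc

# Unguarded glob: the shell expands * to -cccccccc and rm parses it as options.
if rm -fR * 2>/dev/null; then
    unguarded=ok
else
    unguarded=failed
fi

# Guarded glob: ./* expands to ./-cccccccc, which can only be read as a path.
if rm -fR ./*; then
    guarded=ok
else
    guarded=failed
fi

# Count what is left in the scratch directory (tr strips padding from wc).
remaining="$(ls -A | wc -l | tr -d ' ')"
echo "unguarded=$unguarded guarded=$guarded remaining=$remaining"

# Clean up the scratch directory.
cd / && rmdir "$workdir"
```

The first `rm` fails the same way the cleanup container log shows (invalid option 'c'), while the `./*` form removes the directory, which matches the guard named in the pull request title.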

Comment 3 Evgeni Vakhonin 2022-06-19 11:02:44 UTC
Already pre-merge verified in https://bugzilla.redhat.com/show_bug.cgi?id=2094078#c1
Since the bug hasn't transitioned automatically to VERIFIED, changing the status manually.

Comment 6 errata-xmlrpc 2022-06-28 11:50:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.20 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5172

