Bug 2094078

Summary: CVO gets stuck downloading an upgrade, with the version pod complaining about invalid options
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: Cluster Version Operator
Assignee: W. Trevor King <wking>
Status: CLOSED ERRATA
QA Contact: Evgeni Vakhonin <evakhoni>
Severity: high
Docs Contact:
Priority: high
Version: 4.10
CC: aos-team-ota, lmohanty, mbargenq, travi, wking
Target Milestone: ---
Keywords: Regression, ServiceDeliveryImpact, Upgrades
Target Release: 4.10.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-06-28 11:50:26 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2091770
Bug Blocks:

Comment 1 Evgeni Vakhonin 2022-06-07 08:20:21 UTC
reproducing on Server Version: 4.10.0-0.nightly-2022-06-02-223105
injecting synthetic dashed-directory
$ for NODE in $(oc get -l node-role.kubernetes.io/master= -o jsonpath='{.items[*].metadata.name}' nodes); do oc debug --as-root "node/${NODE}" -- mkdir -p /host/etc/cvo/updatepayloads/-cccccccc; done
upgrading to 4.10.0-0.nightly-2022-06-03-081413
$ oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:a6eeef7ca72d4127123f4e9a2733451553bc6e20c54a1c1ea34003481677ec41 --force 
$ oc -n openshift-cluster-version get pods
NAME                                        READY   STATUS       RESTARTS      AGE
cluster-version-operator-67454b6c58-wvt2g   1/1     Running      0             29m
version--p6rwq-wk6mk                        0/1     Init:Error   3 (28s ago)   46s
$ oc -n openshift-cluster-version logs version--p6rwq-wk6mk 
Error from server (BadRequest): container "rename-to-final-location" in pod "version--p6rwq-wk6mk" is waiting to start: PodInitializing
$ oc -n openshift-cluster-version logs version--p6rwq-wk6mk -c cleanup
rm: invalid option -- 'c'
Try 'rm ./-cccccccc' to remove the file '-cccccccc'.
Try 'rm --help' for more information.
$ oc adm upgrade
Cluster version is 4.10.0-0.nightly-2022-06-02-223105

ReleaseAccepted=False

  Reason: RetrievePayload
  Message: Retrieving payload failed version="" image="registry.ci.openshift.org/ocp/release@sha256:a6eeef7ca72d4127123f4e9a2733451553bc6e20c54a1c1ea34003481677ec41" failure=Unable to download and prepare the update: deadline exceeded, reason: "DeadlineExceeded", message: "Job was active longer than specified deadline"
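
For reference, the 'rm: invalid option' failure above can be reproduced outside the cluster. The following is a minimal sketch only: it assumes the cleanup step runs rm against an unguarded glob, which this report does not show, and it uses a scratch directory from mktemp in place of /etc/cvo/updatepayloads. The variable names are made up for illustration. It contrasts the unguarded call with one that ends option parsing with "--", along the lines of the './' hint that rm itself prints.

```shell
#!/bin/sh
# Illustration only: a scratch directory stands in for /etc/cvo/updatepayloads.
# The real cleanup container's command is not shown in this report.
workdir=$(mktemp -d)
mkdir -p "${workdir}/-cccccccc"
cd "${workdir}" || exit 1

# Unsafe: the glob expands to "-cccccccc", which rm parses as a run of options.
if rm -rf * 2>/dev/null; then
    unsafe_result="removed"
else
    unsafe_result="failed"
fi

# Safer: "--" ends option parsing, so "-cccccccc" is treated as an operand.
# Prefixing the glob instead (rm -rf ./*) has the same effect.
rm -rf -- *
if [ -e "./-cccccccc" ]; then
    safe_result="still present"
else
    safe_result="removed"
fi

echo "without --: ${unsafe_result}"
echo "with --:    ${safe_result}"

# Tidy up the scratch directory.
cd / && rm -rf -- "${workdir}"
```

Whether the shipped fix uses "--", a "./" prefix, or something else is not stated in this bug; the sketch only shows why a dash-prefixed directory name breaks an unguarded rm.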




pre-merge verifying with cluster bot, Server Version: 4.10.0-0.ci.test-2022-06-07-071530-ci-ln-ivyrk82-latest
injecting synthetic dashed-directory
$ for NODE in $(oc get -l node-role.kubernetes.io/master= -o jsonpath='{.items[*].metadata.name}' nodes); do oc debug --as-root "node/${NODE}" -- mkdir -p /host/etc/cvo/updatepayloads/-cccccccc; done
upgrading to 4.10.0-0.nightly-2022-06-06-184250
$ oc adm upgrade --allow-explicit-upgrade --to-image registry.ci.openshift.org/ocp/release@sha256:919f026c74ab4739698eb5bebc5f201bd6028cd9246b0fc4415904abde35b1c1 --force
checking pod status and log
$ oc -n openshift-cluster-version get pods
NAME                                       READY   STATUS      RESTARTS   AGE
cluster-version-operator-ffd6f5b58-r8dm4   1/1     Running     0          32m
version--vwm5f-rbj6j                       0/1     Completed   0          11s
$ oc -n openshift-cluster-version logs version--vwm5f-rbj6j 
Defaulted container "rename-to-final-location" out of: rename-to-final-location, cleanup (init), make-temporary-directory (init), move-operator-manifests-to-temporary-directory (init), move-release-manifests-to-temporary-directory (init)
$ oc -n openshift-cluster-version logs version--vwm5f-rbj6j -c cleanup
(no output)
pod looks good! checking upgrade.. 
$ oc adm upgrade
info: An upgrade is in progress. Working towards 4.10.0-0.nightly-2022-06-06-184250: 95 of 771 done (12% complete)
upgrade started and progressing. 

everything looks good.

Comment 3 Evgeni Vakhonin 2022-06-19 11:02:44 UTC
already pre-merge verified in https://bugzilla.redhat.com/show_bug.cgi?id=2094078#c1
since it hasn't transitioned automatically to verified, changing status manually

Comment 6 errata-xmlrpc 2022-06-28 11:50:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.10.20 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5172