Bug 2098219

Summary: When attempting to upgrade from 4.10.16 to 4.10.17 using stable channel, pod version-4.10.17 stuck in Init:Error and crashloop backoff
Product: OpenShift Container Platform
Reporter: Scott Worthington <scott.worthington>
Component: Cluster Version Operator
Assignee: Over the Air Updates <aos-team-ota>
Status: CLOSED DUPLICATE
QA Contact: liujia <jiajliu>
Severity: medium
Docs Contact:
Priority: medium
Version: 4.10
CC: aos-team-ota, jack.ottofaro
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-06-24 17:24:27 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Scott Worthington 2022-06-17 15:37:53 UTC
NOTE: I am unsure if "Cluster Version Operator" is the correct component for this bug.

Description of problem:


When attempting to upgrade an OCP 4 cluster (3 control-plane nodes, 2 workers) from 4.10.16 to 4.10.17 using the stable-4.10 channel, the version-4.10.17 pod in the openshift-cluster-version namespace goes into Init:Error and then crashloop backoff.

operator-lifecycle-manager-catalog reports:
Encountered errors while checking compatibility with the next minor version of OpenShift: Desired release version missing from ClusterVersion

The log for the cleanup container in the version-4.10.17 pod contains:
rm: invalid option -- '0'
Try 'rm ./-0bX7BjpLoBa1j1hWXegtA' to remove the file '-0bX7BjpLoBa1j1hWXegtA'.
Try 'rm --help' for more information.

The log for the make-temporary-directory container in the same pod contains:
rm: invalid option -- '0'
Try 'rm ./-0bX7BjpLoBa1j1hWXegtA' to remove the file '-0bX7BjpLoBa1j1hWXegtA'.
Try 'rm --help' for more information.
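
Both logs are consistent with rm being handed a file name that begins with a dash and treating it as a set of options. The exact command the init containers run is not shown in this report, so the following is only a minimal sketch, assuming an unguarded glob such as `rm -f *`; the scratch directory is hypothetical and only the file name is taken from the log.

```shell
#!/bin/sh
# Sketch only: the scratch directory and the bare `*` glob are assumptions,
# not commands copied from the version pod's manifest.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A file whose name starts with a dash, as in the log above.
touch ./-0bX7BjpLoBa1j1hWXegtA

# The shell expands * to the bare name, so rm parses it as a set of options.
if rm -f * 2>/dev/null; then
    unguarded=succeeded
else
    unguarded=failed
fi

# "--" ends option parsing; prefixing the glob with ./ (rm -f ./*) works too.
if rm -f -- * && [ ! -e ./-0bX7BjpLoBa1j1hWXegtA ]; then
    guarded=succeeded
else
    guarded=failed
fi

echo "unguarded glob: $unguarded"
echo "guarded glob:   $guarded"

# Clean up the scratch directory.
cd / && rmdir "$workdir"
```

If this is the failure mode, the file left behind by an earlier run keeps tripping the cleanup step on every restart, which would explain the crashloop.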


How reproducible:

Steps to Reproduce:
1. Upgrade from 4.10.16 to 4.10.17

Actual results:

Expected results:

Expected upgrade to start and progress.

Additional info:

Comment 1 Scott Worthington 2022-06-17 16:17:14 UTC
Maybe related to?

https://bugzilla.redhat.com/show_bug.cgi?id=2091770

and maybe...
https://bugzilla.redhat.com/show_bug.cgi?id=2097557#c4

Comment 2 Scott Worthington 2022-06-17 17:28:28 UTC
This is an existing bug.

I stopped the upgrade with:

oc adm upgrade --clear

And then deleted the buggy CVO pod, which is the cause of the bug, via:

oc -n openshift-cluster-version delete pod -l k8s-app=cluster-version-operator


Following the deletion of the CVO pod, I was able to initiate the upgrade from 4.10.16 to 4.10.17.

Comment 3 Jack Ottofaro 2022-06-24 17:24:27 UTC

*** This bug has been marked as a duplicate of bug 2091770 ***