2098219 – When attempting to upgrade from 4.10.16 to 4.10.17 using stable channel, pod version-4.10.17 stuck in Init:Error and crashloop backoff

Summary: When attempting to upgrade from 4.10.16 to 4.10.17 using stable channel, pod ...

Keywords:
Status:	CLOSED DUPLICATE of bug 2091770
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Cluster Version Operator
Sub Component:
Version:	4.10
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Over the Air Updates
QA Contact:	liujia
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-06-17 15:37 UTC by Scott Worthington
Modified:	2022-06-24 17:24 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2022-06-24 17:24:27 UTC
Target Upstream Version:
Embargoed:

Description Scott Worthington 2022-06-17 15:37:53 UTC

NOTE: I am unsure if "Cluster Version Operator" is the correct component for this bug.

Description of problem:


When attempting to upgrade an OCP 4 cluster (3 control-plane nodes, 2 workers) from 4.10.16 to 4.10.17 using the stable-4.10 channel, the version-4.10.17 pod in the openshift-cluster-version namespace gets stuck in Init:Error and crashloop backoff.

operator-lifecycle-manager-catalog reports:
Encountered errors while checking compatibility with the next minor version of OpenShift: Desired release version missing from ClusterVersion

The log for the cleanup container in the version-4.10.17 pod contains:
rm: invalid option -- '0'
Try 'rm ./-0bX7BjpLoBa1j1hWXegtA' to remove the file '-0bX7BjpLoBa1j1hWXegtA'.
Try 'rm --help' for more information.

The log for the make-temporary-directory container contains:
rm: invalid option -- '0'
Try 'rm ./-0bX7BjpLoBa1j1hWXegtA' to remove the file '-0bX7BjpLoBa1j1hWXegtA'.
Try 'rm --help' for more information.
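
The rm message above is what rm typically prints when it is handed a file name that begins with a dash and parses it as a set of options. A minimal sketch of that failure mode, assuming (this is a guess, not confirmed by the logs) that the container removes files through an unguarded glob; the temporary directory is hypothetical and only the file name is taken from the log:

```shell
#!/bin/sh
# Hypothetical reproduction in a scratch directory; the way the real
# container invokes rm is an assumption, only the file name is from the log.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Create a file whose name starts with a dash, as in the log above.
touch -- '-0bX7BjpLoBa1j1hWXegtA'

# An unguarded glob expands to the bare name, so rm reads it as options.
if rm * 2>/dev/null; then
    unguarded="removed"
else
    unguarded="failed"
fi

# Prefixing the glob with ./ keeps the name from being parsed as options
# (placing -- before the glob would work as well).
rm ./*
remaining="$(ls -A | wc -l | tr -d ' ')"

echo "unguarded rm: $unguarded, files remaining after guarded rm: $remaining"

# Clean up the scratch directory.
cd / && rmdir "$workdir"
```

The ./ prefix is the same form that rm itself suggests in the log ("Try 'rm ./-0bX7BjpLoBa1j1hWXegtA'").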


How reproducible:

Steps to Reproduce:
1. Upgrade from 4.10.16 to 4.10.17

Actual results:

Expected results:

Expected upgrade to start and progress.

Additional info:

Comment 1 Scott Worthington 2022-06-17 16:17:14 UTC

Maybe related to?

https://bugzilla.redhat.com/show_bug.cgi?id=2091770

and maybe...
https://bugzilla.redhat.com/show_bug.cgi?id=2097557#c4

Comment 2 Scott Worthington 2022-06-17 17:28:28 UTC

This is an existing bug.

I stopped the upgrade with:

oc adm upgrade --clear

I then deleted the CVO pod, which is the cause of the bug, via:

oc -n openshift-cluster-version delete pod -l k8s-app=cluster-version-operator


After deleting the CVO pod, I was able to initiate the upgrade from 4.10.16 to 4.10.17.

Comment 3 Jack Ottofaro 2022-06-24 17:24:27 UTC


*** This bug has been marked as a duplicate of bug 2091770 ***