Bug 2026003

Summary: No seamless upgrade from v4.8.z to v4.9.z through OLM
Product: OpenShift Container Platform Reporter: Oren Cohen <ocohen>
Component: Performance Addon Operator Assignee: Martin Sivák <msivak>
Status: CLOSED DUPLICATE QA Contact: Gowrishankar Rajaiyan <grajaiya>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.9 CC: aos-bugs, shajmakh, ykashtan
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-11-24 09:10:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Oren Cohen 2021-11-23 15:48:58 UTC
Description of problem:
Initial state: OCP 4.8 with PAO 4.8.1 installed.
When upgrading to OCP 4.9, OLM used the production catalog source for v4.9 (pointing to the registry.redhat.io/redhat/redhat-operator-index:v4.9 index image). In this index image PAO publishes only the 4.9 channel, while the subscription created on the previous version has `.spec.channel` set to the 4.8 channel, which produces a resolution error in OLM.

The solution will be to publish also the previous channel of PAO (in this example, 4.8), so OLM package resolution will succeed after switching to v4.9 redhat-operators catalog source, and upgrade to PAO 4.9 will be triggered.
This applies to all PAO versions; 4.8 --> 4.9 is only an example.

A workaround exists to force the upgrade: restart the OLM pods (olm-operator and catalog-operator).
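
A minimal sketch of the restart workaround follows. The namespace `openshift-operator-lifecycle-manager` and the pod labels `app=olm-operator` and `app=catalog-operator` are assumptions based on a default OpenShift install, not values taken from this report, so check them on the cluster first. The helper only builds the `oc` command lines; deleting the pods relies on their Deployments recreating them.

```python
import shlex

# Assumed defaults, NOT stated in this report: verify them on your cluster.
OLM_NAMESPACE = "openshift-operator-lifecycle-manager"
OLM_POD_LABELS = ("app=olm-operator", "app=catalog-operator")


def olm_restart_commands(namespace=OLM_NAMESPACE, labels=OLM_POD_LABELS):
    """Return one `oc delete pod` argument list per OLM component."""
    return [
        ["oc", "delete", "pod", "-n", namespace, "-l", label]
        for label in labels
    ]


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in olm_restart_commands():
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run anywhere; pass each list to `subprocess.run(argv, check=True)` only once the namespace and labels are confirmed.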

Version-Release number of selected component (if applicable):
OCP 4.9
PAO 4.9.1

How reproducible:
100%

Steps to Reproduce:
1. Have an OCP 4.8 cluster with performance-addon-operator 4.8.z installed
2. Upgrade the OCP cluster to 4.9
3. Change the channel in the PAO's subscription from "4.8" to "4.9" (4.9 is the only available channel).
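
Step 3 can be sketched as a merge patch on the Subscription's `.spec.channel`. The subscription name and namespace used below are hypothetical placeholders (the report does not state them); the helper only builds the `oc patch` command line and does not run it.

```python
import json
import shlex


def channel_patch_command(name, namespace, channel):
    """Build an `oc patch` argv that sets .spec.channel on an OLM Subscription."""
    patch = json.dumps({"spec": {"channel": channel}})
    return [
        "oc", "patch", "subscriptions.operators.coreos.com", name,
        "-n", namespace, "--type", "merge", "-p", patch,
    ]


if __name__ == "__main__":
    # Hypothetical name and namespace: replace with the values on your cluster.
    argv = channel_patch_command(
        name="performance-addon-operator",
        namespace="openshift-performance-addon-operator",
        channel="4.9",
    )
    print(shlex.join(argv))
```

The fully qualified resource name `subscriptions.operators.coreos.com` is used so the command cannot be confused with Subscription kinds from other API groups.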


Actual results:
The upgrade is not triggered; there is a ResolutionFailed condition on the Subscription.
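
A rough way to check for this condition is to read the Subscription as JSON (for example from `oc get ... -o json`) and look for a `ResolutionFailed` entry under `.status.conditions`. The sketch below assumes the usual Kubernetes condition layout (`type`, `status`, `message`); the sample object is a hand-written placeholder, not output captured from the affected cluster.

```python
def resolution_failures(subscription):
    """Return the messages of all active ResolutionFailed conditions."""
    conditions = subscription.get("status", {}).get("conditions", [])
    return [
        cond.get("message", "")
        for cond in conditions
        if cond.get("type") == "ResolutionFailed" and cond.get("status") == "True"
    ]


if __name__ == "__main__":
    # Placeholder object for illustration only; the message text is not real output.
    sample = {
        "status": {
            "conditions": [
                {"type": "CatalogSourcesUnhealthy", "status": "False"},
                {
                    "type": "ResolutionFailed",
                    "status": "True",
                    "message": "<resolver message goes here>",
                },
            ]
        }
    }
    for message in resolution_failures(sample):
        print("ResolutionFailed:", message)
```

An empty list means no active resolution failure is reported on that Subscription.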

Expected results:
Upgrade to PAO 4.9.1 should be triggered, new install plan generated, new CSV for 4.9.1 generated, new performance-operator deployment, etc.

Additional info:

Comment 1 Martin Sivák 2021-11-23 22:05:54 UTC
> The solution will be to publish also the previous channel of PAO (in this example, 4.8)

PAO uses a very broad skipRange - basically <4.8.0; 4.9.z) - rather than explicit upgrade edges, so OLM should simply ignore everything and upgrade to the latest version. We intentionally do not publish old CSVs because of this.
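
To illustrate the skipRange argument: a CSV can carry an `olm.skipRange` annotation, and any installed version that falls inside that range may be upgraded directly to the CSV. The sketch below models only the range check. The bounds `>=4.8.0 <4.9.1` are an assumption chosen to match the versions in this report, not the actual annotation shipped by PAO, and the parser handles plain `X.Y.Z` versions only.

```python
def parse_version(text):
    """Parse 'X.Y.Z' into a tuple of ints (no pre-release or build metadata)."""
    return tuple(int(part) for part in text.split("."))


def in_skip_range(installed, lower_inclusive, upper_exclusive):
    """True if lower_inclusive <= installed < upper_exclusive."""
    version = parse_version(installed)
    return parse_version(lower_inclusive) <= version < parse_version(upper_exclusive)


if __name__ == "__main__":
    # Assumed range for illustration: '>=4.8.0 <4.9.1'.
    for installed in ("4.7.9", "4.8.1", "4.9.0", "4.9.1"):
        eligible = in_skip_range(installed, "4.8.0", "4.9.1")
        print(installed, "can skip to 4.9.1:", eligible)
```

Under this assumed range, an installed 4.8.1 is covered without any intermediate CSV being published, which is the behaviour the comment relies on.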

Moreover, if OLM restart fixes this, it is an OLM bug, don't you think? There are no new data for OLM and yet it starts working.

Comment 2 Yuval Kashtan 2021-11-24 07:25:54 UTC
Well, I don't know whose bug it is,
but the fact remains - there is no (easy) seamless way to upgrade PAO.
That should be addressed.

In this case, we observed that once you upgrade your cluster to 4.9, OLM has no info on PAO, because PAO does not publish the 4.8 channel in the 4.9 index. It seems that other operators do publish their previous channel.

Comment 3 Oren Cohen 2021-11-24 09:10:26 UTC
Probably this was caused by the following OLM bug:
https://bugzilla.redhat.com/show_bug.cgi?id=2002276

The OCP cluster I used when upgrading PAO to 4.9.1 was 4.9.4, in which the OLM bug still existed.

I'll close this BZ, thanks.

*** This bug has been marked as a duplicate of bug 2002276 ***