Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1774732

Summary: OLM fails to upgrade Kiali operator when using Manual approvals
Product: OpenShift Container Platform Reporter: Edgar Hernández <ehernand>
Component: OLM Assignee: Bowen Song <bsong>
OLM sub component: OLM QA Contact: Tom Buskey <tbuskey>
Status: CLOSED INSUFFICIENT_DATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: bandrade, bsong, nhale, scolange
Version: 4.2.z   
Target Milestone: ---   
Target Release: 4.2.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1775213 1775216 Environment:
Last Closed: 2019-11-27 16:34:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1775216    
Bug Blocks:    
Attachments:
Description Flags
OLM logs for manual strategy
none
Kiali operator cluster roles as created after installing 1.9.1
none
Kiali operator ClusterRoleBindings as created after installing 1.9.1
none
Kiali operator ServiceAccount as created after installing 1.9.1
none
FYI: OLM logs with automatic upgrade strategy (which does a successful upgrade)
none
Install plan before upgrade (first install of the operator)
none
Install plan when upgrading
none
catalog-operator log
none

Description Edgar Hernández 2019-11-20 19:45:32 UTC
Created attachment 1638205 [details]
OLM logs for manual strategy

Description of problem:

OLM fails to correctly upgrade an operator when using manual approval for upgrades. After approving the install plan, the CSV is applied correctly, but none of the other dependent resources are created. As a result, the affected operator doesn't work correctly due to RBAC issues.

Version-Release number of selected component (if applicable):

OCP 4.2.1.

How reproducible:

This was discovered while working on changes to the Kiali operator. Install version 1.9.1 of the Kiali operator from a custom source, using **manual** approval for upgrades. Then, after uploading a new manifest with version 1.10.0 and approving the install plan, OLM errors out when applying a ClusterRole, sends the operator to a Pending state, and never finishes the installation.


Steps to Reproduce:
1. Open the OCP console and go to the OperatorHub page.
2. Search for the Kiali operator and click Install.
3. On the Install page, choose "Manual" for the update approval strategy. Leave the defaults for the other options. Then, install the operator.
4. Approve the install plan of the Kiali operator 1.9.1 and wait for its install. It should succeed.
5. Push a new version for the Kiali operator (say 1.10.0), and wait for OLM to scan for updates (or force a scan).
6. In the OCP console, go to the Subscription page of the Kiali operator. Wait for it to show "upgrading".
7. Approve the new install plan for the Kiali operator to start the upgrade.
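
For anyone reproducing this without the console, the steps above can be sketched as plain data. This is a minimal illustration, not what was actually run for this report: it builds a Subscription with `installPlanApproval: Manual` (step 3) and the merge patch that marks an InstallPlan as approved (steps 4 and 7). The package name, catalog source name, namespaces, and channel passed in below are hypothetical placeholders.

```python
import json


def manual_subscription(name, namespace, source, source_namespace, channel):
    """Build an OLM Subscription manifest that requires manual approval."""
    return {
        "apiVersion": "operators.coreos.com/v1alpha1",
        "kind": "Subscription",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "name": name,
            "channel": channel,
            "source": source,
            "sourceNamespace": source_namespace,
            # With "Manual", OLM creates InstallPlans but waits for approval.
            "installPlanApproval": "Manual",
        },
    }


def approval_patch():
    """Merge patch that approves a pending InstallPlan."""
    return {"spec": {"approved": True}}


if __name__ == "__main__":
    # All argument values here are assumptions for illustration only.
    sub = manual_subscription(
        name="kiali",
        namespace="openshift-operators",
        source="my-custom-catalog",
        source_namespace="openshift-marketplace",
        channel="stable",
    )
    print(json.dumps(sub, indent=2))
    print(json.dumps(approval_patch()))
```

The patch body could then be applied with something like `oc patch installplan <name> -n <namespace> --type merge -p '<patch>'`, which should be equivalent to approving the install plan in the console.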

Actual results:

After approving the install plan for the upgrade, OLM tries to apply the upgrade, but it fails. Then, apparently, it sends the Kiali operator to a "Pending" state and it never finishes the install.

The previous version (1.9.1) of the Kiali operator appears to be uninstalled successfully. The new version (say 1.10.0) is deployed, but its pod never starts because the ClusterRoles, ServiceAccounts, and other dependent resources are missing.

Expected results:

After the manual approval of the install plan, OLM should properly upgrade the operator.

ALTERNATIVE: Perhaps, the subscription page could show a "retry" button, in case automatic retry is not feasible.

Additional info:

It looks like the issue is specific to manual upgrades. I tried the Automatic upgrade strategy and it applied the upgrade successfully. So the issue is *probably* general and NOT specific to the Kiali operator.

I'm attaching:
- Logs of OLM under manual upgrade. Errors start at line 228.
- The following Kiali operator resources that OLM should be managing: ClusterRoles, ClusterRoleBindings, ServiceAccount

Comment 1 Edgar Hernández 2019-11-20 19:47:05 UTC
Created attachment 1638207 [details]
Kiali operator cluster roles as created after installing 1.9.1

Comment 2 Edgar Hernández 2019-11-20 19:47:52 UTC
Created attachment 1638209 [details]
Kiali operator ClusterRoleBindings as created after installing 1.9.1

Comment 3 Edgar Hernández 2019-11-20 19:48:18 UTC
Created attachment 1638210 [details]
Kiali operator ServiceAccount as created after installing 1.9.1

Comment 4 Edgar Hernández 2019-11-20 19:49:12 UTC
Created attachment 1638211 [details]
FYI: OLM logs with automatic upgrade strategy (which does a successful upgrade)

Comment 5 Edgar Hernández 2019-11-20 19:51:35 UTC
About the logs with the automatic upgrade, the upgrade seems to start at line 780.

Comment 6 Evan Cordell 2019-11-22 21:28:35 UTC
Could you please provide the `InstallPlan` objects that you are seeing in the manual approval case? And can you provide logs from `catalog-operator` during the failed upgrade?
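
As a rough illustration of what to look for in those objects, the sketch below lists the steps of an InstallPlan that were not applied. It assumes the InstallPlan has been exported as JSON (for example with `oc get installplan <name> -o json`) and that each entry under `status.plan` carries a `resource` and a `status`. The sample data is hypothetical and is not taken from the attachments on this bug.

```python
# Step statuses treated as "applied"; anything else is reported.
DONE_STATUSES = {"Created", "Present"}


def unfinished_steps(install_plan):
    """Return (kind, name, status) for each InstallPlan step not yet applied."""
    steps = (install_plan.get("status") or {}).get("plan") or []
    result = []
    for step in steps:
        status = step.get("status", "Unknown")
        if status not in DONE_STATUSES:
            resource = step.get("resource") or {}
            result.append((resource.get("kind"), resource.get("name"), status))
    return result


if __name__ == "__main__":
    # Hypothetical InstallPlan excerpt, for illustration only.
    sample = {
        "status": {
            "plan": [
                {"resource": {"kind": "ClusterServiceVersion",
                              "name": "example-operator.v1.10.0"},
                 "status": "Created"},
                {"resource": {"kind": "ClusterRole",
                              "name": "example-operator-role"},
                 "status": "NotPresent"},
            ]
        }
    }
    for kind, name, status in unfinished_steps(sample):
        print(f"{kind}/{name}: {status}")
```

A CSV step marked `Created` alongside RBAC steps that are still `NotPresent` would match the symptom described in this report.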

Comment 7 Edgar Hernández 2019-11-26 22:14:26 UTC
Created attachment 1639973 [details]
Install plan before upgrade (first install of the operator)

Comment 8 Edgar Hernández 2019-11-26 22:15:05 UTC
Created attachment 1639974 [details]
Install plan when upgrading

Comment 9 Edgar Hernández 2019-11-26 22:16:03 UTC
Created attachment 1639975 [details]
catalog-operator log

Comment 10 Edgar Hernández 2019-11-26 22:20:53 UTC
I'm providing the requested data. However, I can no longer reproduce the issue.
I moved to OCP 4.2.4 and the issue seems to be gone.

Originally, I saw this issue on OCP 4.2.1.