Bug 1924970

Summary: Operator Upgrade failed for serviceaccount ownership
Product: OpenShift Container Platform Reporter: Jatan Malde <jmalde>
Component: OLM Assignee: Evan Cordell <ecordell>
OLM sub component: OLM QA Contact: Jian Zhang <jiazha>
Status: CLOSED DUPLICATE Docs Contact:
Severity: urgent    
Priority: urgent CC: aivaraslaimikis, anbhatta, assingh, bluddy, cpassare, davegord, ecordell, jnordell, krizza, sarora
Version: 4.6.z Keywords: Triaged
Target Milestone: --- Flags: davegord: needinfo-
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-06-18 15:53:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jatan Malde 2021-02-04 02:45:32 UTC
Description of problem:

I have a customer who upgraded the cluster to 4.6.9 to pick up the fix mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1904583.

Once the cluster was on 4.6.9, they attempted to upgrade the OCS operator from 4.5.1 to 4.5.2, and the upgrade is stuck as follows:

~~~
# oc get csv

NAME                  DISPLAY                       VERSION   REPLACES              PHASE
ocs-operator.v4.5.1   OpenShift Container Storage   4.5.1                           Replacing
ocs-operator.v4.5.2   OpenShift Container Storage   4.5.2     ocs-operator.v4.5.1   Pending


# oc get installplan

NAME            CSV                   APPROVAL   APPROVED
install-2b8gh   ocs-operator.v4.5.2   Manual     true
install-9clkl   ocs-operator.v4.5.1   Manual     true
~~~
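
For triage, output like the above can be filtered down to the CSVs that are not in the Succeeded phase. The following is only a minimal sketch: `non_succeeded_csvs` is a hypothetical helper name (not part of `oc`), and it assumes PHASE is the last column of the `oc get csv` table.

```shell
# Hypothetical helper (not part of oc): print name and phase of every CSV
# whose PHASE is not "Succeeded".
# Reads the table printed by `oc get csv` on stdin; assumes PHASE is the
# last whitespace-separated column. Header and blank lines are skipped.
non_succeeded_csvs() {
  awk 'NF > 0 && $1 != "NAME" && $NF != "Succeeded" { print $1, $NF }'
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc get csv | non_succeeded_csvs
```

On the output shown in this report, both ocs-operator.v4.5.1 (Replacing) and ocs-operator.v4.5.2 (Pending) would be listed.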

The CSV of ocs-operator.v4.5.2 shows the following requirements, which were not met:


Phase:                 Installing
Reason:                InstallWaiting
Last Transition Time:  2021-01-13T13:42:04Z
Last Update Time:      2021-01-13T13:42:04Z
Message:               install timeout
Phase:                 Failed
Reason:                InstallCheckFailed
Last Transition Time:  2021-01-13T13:42:04Z
Last Update Time:      2021-01-13T13:42:04Z
Message:               requirements not met
Phase:                 Pending
Reason:                RequirementsNotMet
Last Transition Time:    2021-01-13T13:42:04Z
Last Update Time:        2021-01-13T13:42:07Z
Message:                 one or more requirements couldn't be found
Phase:                   Pending
Reason:                  RequirementsNotMet

[..]

 Name:     noobaa
    Status:   PresentNotSatisfied
    Version:  v1
    Group:
    Kind:     ServiceAccount
    Message:  Service account is not owned by this ClusterServiceVersion       
    Name:     rook-ceph-global
    Status:   PresentNotSatisfied
    Version:  v1
    Group:
    Kind:     ServiceAccount
    Message:  Service account is not owned by this ClusterServiceVersion
    Name:     rook-csi-cephfs-plugin-sa
    Status:   PresentNotSatisfied
    Version:  v1
    Group:
    Kind:     ServiceAccount
---
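
To collect the names of the unmet requirements from a long describe output like the excerpt above, a small filter can help. This is a sketch under assumptions: `unsatisfied_requirements` is a hypothetical helper name, and it assumes that each `Status:` line directly follows the `Name:` line it describes, as in the excerpt.

```shell
# Hypothetical helper (not part of oc): list requirement names whose status
# is PresentNotSatisfied.
# Reads `oc describe csv <name>` text on stdin. Assumes each "Status:" line
# describes the most recent "Name:" line, as in the excerpt above.
unsatisfied_requirements() {
  awk '
    $1 == "Name:"   { name = $2 }
    $1 == "Status:" && $2 == "PresentNotSatisfied" && name != "" { print name }
  '
}

# Usage against a live cluster (requires a logged-in oc session):
#   oc describe csv ocs-operator.v4.5.2 | unsatisfied_requirements
```

For the excerpt above this would print noobaa, rook-ceph-global, and rook-csi-cephfs-plugin-sa, one per line.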

Will attach must-gather and the applied workaround in the following comments.

Comment 26 Ben Luddy 2021-06-18 15:53:36 UTC
The bug identified from the initial reports was resolved for 4.8.0 (https://bugzilla.redhat.com/show_bug.cgi?id=1934080) with backports to 4.7.11 (https://bugzilla.redhat.com/show_bug.cgi?id=1949139) and 4.6.30 (https://bugzilla.redhat.com/show_bug.cgi?id=1955112). The bugfix prevents this condition from occurring, but it cannot automatically resolve the condition on a cluster that is already affected. Removing the affected operator (its Subscription and ClusterServiceVersions) and reinstalling it after applying the bugfix should resolve a blocked upgrade.
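
As a rough illustration of the removal step, the sketch below only prints the `oc delete` commands so they can be reviewed before anything is run. `print_operator_removal` is a hypothetical helper name, and the namespace, Subscription name, and CSV names are supplied by the caller; none of them are taken from this report's must-gather.

```shell
# Hypothetical dry-run helper (not part of oc): prints the commands for
# removing an operator's Subscription and ClusterServiceVersions.
# It executes nothing; review the output before running any of it.
# Arguments: <namespace> <subscription-name> <csv-name>...
print_operator_removal() {
  ns="$1"; sub="$2"; shift 2
  echo "oc delete subscription.operators.coreos.com $sub -n $ns"
  for csv in "$@"; do
    echo "oc delete clusterserviceversion.operators.coreos.com $csv -n $ns"
  done
}

# Example with placeholder namespace and Subscription names:
#   print_operator_removal <namespace> <subscription> ocs-operator.v4.5.1 ocs-operator.v4.5.2
```

Reinstalling afterwards means creating a new Subscription once the cluster is on a release that contains the bugfix; that step is not sketched here because it depends on the catalog source and channel in use.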

This issue remained open in order to prove or exclude https://bugzilla.redhat.com/show_bug.cgi?id=1923111 as an explanation for case (1) described in https://bugzilla.redhat.com/show_bug.cgi?id=1924970#c9, but there hasn't been any new evidence.

I'm closing this issue because it combines troubleshooting steps for multiple clusters under varying circumstances, and there has been no evidence of a second issue. If you believe that you have encountered an issue that is different from https://bugzilla.redhat.com/show_bug.cgi?id=1934080, please open a *new* bug.

Comment 27 Ben Luddy 2021-06-18 15:54:38 UTC

*** This bug has been marked as a duplicate of bug 1934080 ***