Bug 1904584

Summary: Operator upgrades can delete existing CSV before completion
Product: OpenShift Container Platform
Reporter: OpenShift BugZilla Robot <openshift-bugzilla-robot>
Component: OLM
Assignee: Vu Dinh <vdinh>
OLM sub component: OLM
QA Contact: Salvatore Colangelo <scolange>
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
CC: alkazako, assingh, bandrade, dageoffr, ecordell, htariq, kaczynsk, krizza, nhale
Version: 4.4
Keywords: Triaged
Target Release: 4.5.z
Hardware: Unspecified
OS: Unspecified
Doc Type: If docs needed, set a value
Last Closed: 2021-03-03 04:40:30 UTC
Bug Depends On: 1904583
Bug Blocks: 1904585

Comment 3 Kevin Rizza 2021-02-08 19:47:03 UTC
These test flakes still need to be addressed before this PR can merge.

Comment 7 Salvatore Colangelo 2021-02-25 18:18:48 UTC
[scolange@scolange ~]$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.33    True        False         4h35m   Cluster version is 4.5.33


[scolange@scolange ~]$ oc -n openshift-operator-lifecycle-manager exec catalog-operator-cf7576f5b-7zbfg -- olm --version
OLM version: 0.15.1
git commit: 83b6bbad794dec0fc1b923f1dab7aa08d1874cd2


1, Consume this special CatalogSource image.
[scolange@scolange ~]$ cat cs-etcd.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: etcd-test
  namespace: openshift-marketplace
spec:
  displayName: Salvo Test
  publisher: Salvo
  sourceType: grpc
  image: quay.io/olmqe/etcd-index:0.9.4-sa
  updateStrategy:
    registryPoll:
      interval: 10m

[scolange@scolange ~]$ oc create -f cs-etcd.yaml 
catalogsource.operators.coreos.com/etcd-test created


[scolange@scolange ~]$ oc get catalogsource -n openshift-marketplace
NAME                  DISPLAY               TYPE   PUBLISHER   AGE
...
etcd-test             Salvo Test            grpc   Salvo        60s




2, Subscribe to the etcd operator with manual approval.

[scolange@scolange ~]$ cat og.yaml 
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: test-og
  namespace: default
spec:
  targetNamespaces:
  - default
[scolange@scolange ~]$ oc create -f og.yaml
operatorgroup.operators.coreos.com/test-og created

[scolange@scolange ~]$ cat sub-0.9.2.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: etcd-sub
  namespace: default
spec:
  installPlanApproval: Manual
  channel: alpha
  name: etcd
  source: etcd-test
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.2
[scolange@scolange ~]$ oc create -f sub-0.9.2.yaml 
subscription.operators.coreos.com/etcd-sub created



[scolange@scolange ~]$ oc get sub -n default
NAME       PACKAGE   SOURCE      CHANNEL
etcd-sub   etcd      etcd-test   alpha
[scolange@scolange ~]$ oc get ip -n default
NAME            CSV                   APPROVAL   APPROVED
install-672hf   etcdoperator.v0.9.2   Manual     false
[scolange@scolange ~]$ oc get csv -n default
No resources found in default namespace.





3, Approve etcdoperator.v0.9.2
[scolange@scolange ~]$ oc get csv
NAME                  DISPLAY   VERSION   REPLACES   PHASE
etcdoperator.v0.9.2   etcd      0.9.2                Succeeded
[scolange@scolange ~]$ oc get ip
NAME            CSV                   APPROVAL   APPROVED
install-672hf   etcdoperator.v0.9.2   Manual     true
install-mj5k3   etcdoperator.v0.9.4   Manual     false
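
The approval command itself is not captured in this transcript. A minimal sketch of one way to approve a Manual InstallPlan from the CLI, assuming approval is done by setting spec.approved to true on the InstallPlan. The helper name approve_installplan is hypothetical and not part of oc.

```shell
# Hypothetical helper: approve a Manual InstallPlan by setting spec.approved=true.
# $1 = InstallPlan name, $2 = namespace
approve_installplan() {
  oc patch installplan "$1" -n "$2" --type merge \
    --patch '{"spec":{"approved":true}}'
}

# Example, using the names from the output above:
#   approve_installplan install-672hf default
```

The same helper would apply to the later v0.9.4 plan (install-mj5k3).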

4, Approve etcdoperator.v0.9.4
[scolange@scolange ~]$ oc get ip
NAME            CSV                   APPROVAL   APPROVED
install-672hf   etcdoperator.v0.9.2   Manual     true
install-mj5k3   etcdoperator.v0.9.4   Manual     true
[scolange@scolange ~]$ oc get csv
NAME                  DISPLAY   VERSION   REPLACES              PHASE
etcdoperator.v0.9.2   etcd      0.9.2                           Replacing
etcdoperator.v0.9.4   etcd      0.9.4     etcdoperator.v0.9.2   Pending
[scolange@scolange ~]$ oc get sa
NAME            SECRETS   AGE
builder         2         71m
default         2         41m
deployer        2         71m
etcd-operator   2         2m11s


5, The sa still exists and its owner is the v0.9.2 CSV.
[scolange@scolange ~]$ oc get sa etcd-operator -o yaml
apiVersion: v1
imagePullSecrets:
  name: etcd-operator
  namespace: default
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: ClusterServiceVersion
    name: etcdoperator.v0.9.2
    uid: c99f5618-0f1c-449b-9066-ba79ca48d31b
  resourceVersion: "32632"
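
The owner can also be read directly instead of scanning the full YAML. A sketch, assuming the owner is recorded under metadata.ownerReferences as in the output above. The helper name sa_owner_names is hypothetical.

```shell
# Hypothetical helper: print the names of all owners of a ServiceAccount.
# $1 = ServiceAccount name, $2 = namespace
sa_owner_names() {
  oc get sa "$1" -n "$2" \
    -o jsonpath='{.metadata.ownerReferences[*].name}'
}

# Example:
#   sa_owner_names etcd-operator default
```

If the fix holds, this would be expected to print etcdoperator.v0.9.2 while the v0.9.4 CSV is still Pending.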

The error message is "Service account is not owned by this ClusterServiceVersion", which is the expected behavior. LGTM, marking this bug as verified.

[scolange@scolange ~]$ oc get sa etcd-operator -o yaml
apiVersion: v1
imagePullSecrets:
- name: etcd-operator-dockercfg-8f8fk
kind: ServiceAccount
...
  - group: ""
    kind: ServiceAccount
    message: Service account is not owned by this ClusterServiceVersion
    name: etcd-operator
    status: PresentNotSatisfied
    version: v1
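
The requirement entry above (group/kind/message/name/status/version) looks like part of a CSV's status.requirementStatus list rather than a field of the ServiceAccount itself. Assuming that is where it comes from, a sketch for pulling only that message from the pending CSV. The helper name csv_sa_requirement_message is hypothetical.

```shell
# Hypothetical helper: print the ServiceAccount requirement message(s) of a CSV.
# Assumes the entries live under .status.requirementStatus of the CSV.
# $1 = CSV name, $2 = namespace
csv_sa_requirement_message() {
  oc get csv "$1" -n "$2" \
    -o jsonpath='{.status.requirementStatus[?(@.kind=="ServiceAccount")].message}'
}

# Example:
#   csv_sa_requirement_message etcdoperator.v0.9.4 default
```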

Comment 9 errata-xmlrpc 2021-03-03 04:40:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform
4.5.33 bug fix and security update), and where to find the updated files,
follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0428