Description of problem:
When upgrading the SR-IOV operator from 4.8 to 4.9, the service account 'sriov-network-config-daemon' was not created in the operator namespace. Describing the install plan shows a step for the service account:

Resolving: sriov-network-operator.4.9.0-202108130204
Resource:
  Group:
  Kind: ServiceAccount
  Manifest: {"kind":"ConfigMap","name":"4d4332d303fc98b8e66a4cbc963b223d79db25315b4aa67148b4e4bc8039aed","namespace":"openshift-marketplace","catalogSourceName":"qe-app-registry","catalogSourceNamespace":"openshift-marketplace","replaces":"sriov-network-operator.4.8.0-202108130208","properties":"{\"properties\":[{\"type\":\"olm.gvk\",\"value\":{\"group\":\"sriovnetwork.openshift.io\",\"kind\":\"SriovNetworkPoolConfig\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"sriovnetwork.openshift.io\",\"kind\":\"SriovNetwork\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"sriovnetwork.openshift.io\",\"kind\":\"SriovOperatorConfig\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"sriovnetwork.openshift.io\",\"kind\":\"SriovIBNetwork\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"sriovnetwork.openshift.io\",\"kind\":\"SriovNetworkNodePolicy\",\"version\":\"v1\"}},{\"type\":\"olm.gvk\",\"value\":{\"group\":\"sriovnetwork.openshift.io\",\"kind\":\"SriovNetworkNodeState\",\"version\":\"v1\"}},{\"type\":\"olm.package\",\"value\":{\"packageName\":\"sriov-network-operator\",\"version\":\"4.9.0-202108130204\"}}]}"}
  Name: sriov-network-config-daemon
  Source Name: qe-app-registry
  Source Namespace: openshift-marketplace
  Version: v1
Status: Present

However, the service account 'sriov-network-config-daemon' does not actually exist.
# oc get sa -n openshift-sriov-network-operator
NAME                            SECRETS   AGE
builder                         2         79m
default                         2         79m
deployer                        2         79m
network-resources-injector-sa   2         69m
operator-webhook-sa             2         69m
sriov-network-operator          2         77m
#
# oc get csv
NAME                                        DISPLAY                   VERSION              REPLACES                                    PHASE
sriov-network-operator.4.9.0-202108130204   SR-IOV Network Operator   4.9.0-202108130204   sriov-network-operator.4.8.0-202108130208   Succeeded

Version-Release number of selected component (if applicable):

How reproducible: always

Steps to Reproduce:
1. Set up a 4.9 cluster.
2. Install the SR-IOV operator at version 4.8.
3. Update the channel to 4.9 to upgrade the operator.
4. Check for the service account 'sriov-network-config-daemon'.

Actual results: The service account 'sriov-network-config-daemon' was not created after the upgrade.

Expected results:

Additional info: I am not sure whether this is an OLM issue, so I am assigning it to the OLM team to check first. Please let me know if you need more logs. Thanks.
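The check in step 4 can be scripted. The following is a minimal Python sketch that only parses the text of `oc get sa` output pasted in as a string; it does not talk to the cluster. The helper names `service_account_names` and `missing_service_accounts` are hypothetical and chosen for illustration, and the sample text is the listing from this report.

```python
def service_account_names(oc_output):
    """Return the names found in the first column of `oc get sa` output."""
    names = set()
    for line in oc_output.splitlines():
        fields = line.split()
        # Skip blank lines, shell prompt lines, and the header row.
        if not fields or fields[0] in ("#", "NAME"):
            continue
        names.add(fields[0])
    return names


def missing_service_accounts(oc_output, expected):
    """Return the expected service accounts that are absent from the listing."""
    return set(expected) - service_account_names(oc_output)


# Listing copied from this report (taken after the upgrade to 4.9).
SAMPLE = """\
NAME                            SECRETS   AGE
builder                         2         79m
default                         2         79m
deployer                        2         79m
network-resources-injector-sa   2         69m
operator-webhook-sa             2         69m
sriov-network-operator          2         77m
"""

if __name__ == "__main__":
    expected = {"sriov-network-operator", "sriov-network-config-daemon"}
    print(sorted(missing_service_accounts(SAMPLE, expected)))
```

Run against the listing above, the only expected account reported as missing is 'sriov-network-config-daemon'.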
Is the 4.9 version of the operator bundle available anywhere for us to test this? The index image for 4.9 doesn't exist yet. Also, can you provide a must gather and an olm must gather `oc adm inspect --dest-dir=must-gather-olm -A olm`?
I poked around in the sriov manifest data, and I think this is just because this fix didn't remove enough of the service accounts:
https://github.com/openshift/sriov-network-operator/pull/550/files

OLM generates all the service accounts defined in the CSV at bundle install time. It looks like the sriov operator was already attempting, in the above pull request, to fix the pivoting problem you can run into on upgrade with the sriov-network-operator service account, but the manifest for the sriov-network-config-daemon service account wasn't removed:
https://github.com/openshift/sriov-network-operator/blob/master/bundle/manifests/sriov-network-config-daemon_v1_serviceaccount.yaml#L5

The operator manifests need to be updated to remove that service account as well, otherwise OLM is going to have upgrade problems. We recently added validation logic to the operator-framework to prevent adding service accounts like this to your operator bundle:
https://github.com/operator-framework/api/pull/144/files

Reassigning this to the sr-iov component.
@krizza the service account 'sriov-network-config-daemon' is not defined in the CSV file, as it is not used by the operator deployment. If we remove the sriov-network-config-daemon_v1_serviceaccount.yaml from the operator bundle, how can this service account be created by OLM?
BTW, in my environment, I can see the SA sriov-network-config-daemon was first created, then removed, when the operator upgrade starts.
must-gather logs: http://file.apac.redhat.com/~zzhao/must-gather.tar.gz
must-gather-olm: http://file.apac.redhat.com/~zzhao/must-gather-olm.tar.gz
*** Bug 1991493 has been marked as a duplicate of this bug. ***
There is a limitation in OLM: if a service account was defined in the CSV file in the previous release, it cannot be moved out of the CSV file. Otherwise, OLM removes the SA when the bundle is upgraded.
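One way to picture this limitation is as a set difference. The following Python sketch is a simplified model of the behaviour described in this comment, not OLM's actual implementation. The function name `service_accounts_after_upgrade` and the example contents of each CSV and bundle are assumptions made for illustration.

```python
def service_accounts_after_upgrade(prev_csv_sas, new_csv_sas, new_bundle_sas):
    """Simplified model of which service accounts remain after an upgrade.

    prev_csv_sas:   SAs defined in the previous release's CSV
    new_csv_sas:    SAs defined in the new release's CSV
    new_bundle_sas: SAs shipped as standalone manifests in the new bundle
    """
    # Everything the new bundle asks for is created first.
    created = set(new_csv_sas) | set(new_bundle_sas)
    # SAs that the previous CSV defined but the new CSV no longer defines are
    # removed during the upgrade, even if the new bundle ships them as
    # standalone manifests.
    removed = set(prev_csv_sas) - set(new_csv_sas)
    return created - removed


if __name__ == "__main__":
    # Assumed example: the daemon SA is moved out of the CSV in the new release.
    moved_out = service_accounts_after_upgrade(
        prev_csv_sas={"sriov-network-operator", "sriov-network-config-daemon"},
        new_csv_sas={"sriov-network-operator"},
        new_bundle_sas={"sriov-network-config-daemon"},
    )
    print(sorted(moved_out))

    # Assumed example: the daemon SA stays defined in the CSV.
    kept_in_csv = service_accounts_after_upgrade(
        prev_csv_sas={"sriov-network-operator", "sriov-network-config-daemon"},
        new_csv_sas={"sriov-network-operator", "sriov-network-config-daemon"},
        new_bundle_sas=set(),
    )
    print(sorted(kept_in_csv))
```

In this model, moving 'sriov-network-config-daemon' out of the CSV leaves it missing after the upgrade, which is consistent with the SA being created and then removed as observed earlier in this bug. Keeping it in the CSV leaves both accounts in place.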
Verified this bug on 4.9.0-202108191042.

# oc get csv
NAME                                        DISPLAY                   VERSION              REPLACES                                    PHASE
sriov-network-operator.4.9.0-202108191042   SR-IOV Network Operator   4.9.0-202108191042   sriov-network-operator.4.8.0-202108181331   Succeeded

# oc get sa
NAME                            SECRETS   AGE
builder                         2         8h
default                         2         8h
deployer                        2         8h
network-resources-injector-sa   2         8h
operator-webhook-sa             2         8h
sriov-device-plugin             2         7h58m
sriov-network-config-daemon     2         8h
sriov-network-operator          2         8h
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:3759
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days