Bug 1989456 - sriov operator cannot be upgraded to 4.9 from 4.8
Summary: sriov operator cannot be upgraded to 4.9 from 4.8
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.9.0
Assignee: Peng Liu
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-03 09:21 UTC by zhaozhanqi
Modified: 2021-10-18 17:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:44:09 UTC
Target Upstream Version:
Embargoed:




Links
System                  ID                                         Private  Priority  Status  Summary  Last Updated
Github                  openshift sriov-network-operator pull 550  0        None      None    None     2021-08-04 06:37:28 UTC
Red Hat Product Errata  RHSA-2021:3759                             0        None      None    None     2021-10-18 17:44:18 UTC

Description zhaozhanqi 2021-08-03 09:21:41 UTC
Description of problem:
When the SR-IOV operator is upgraded from 4.8 to 4.9, the new CSV stays in the Pending phase because one or more requirements could not be found.

# oc get csv
NAME                                        DISPLAY                   VERSION              REPLACES                                    PHASE
sriov-network-operator.4.8.0-202107291502   SR-IOV Network Operator   4.8.0-202107291502                                               Replacing
sriov-network-operator.4.9.0-202108021349   SR-IOV Network Operator   4.9.0-202108021349   sriov-network-operator.4.8.0-202107291502   Pending


# oc describe csv sriov-network-operator.4.9.0-202108021349


..

<snip>

  Requirement Status:
    Group:    apiextensions.k8s.io
    Kind:     CustomResourceDefinition
    Message:  CRD is present and Established condition is true
    Name:     sriovibnetworks.sriovnetwork.openshift.io
    Status:   Present
    Uuid:     c9848a53-2f7c-4a6a-9cee-5352168f3400
    Version:  v1
    Group:    apiextensions.k8s.io
    Kind:     CustomResourceDefinition
    Message:  CRD installed alongside other CSV(s): sriov-network-operator.4.8.0-202107291502
    Name:     sriovnetworknodepolicies.sriovnetwork.openshift.io
    Status:   PresentNotSatisfied
    Version:  v1
    Group:    apiextensions.k8s.io
    Kind:     CustomResourceDefinition
    Message:  CRD installed alongside other CSV(s): sriov-network-operator.4.8.0-202107291502
    Name:     sriovnetworknodestates.sriovnetwork.openshift.io
    Status:   PresentNotSatisfied
    Version:  v1
    Group:    apiextensions.k8s.io
    Kind:     CustomResourceDefinition
    Message:  CRD is not present
    Name:     sriovnetworkpoolconfigs.sriovnetwork.openshift.io
    Status:   NotPresent
    Version:  v1
    Group:    apiextensions.k8s.io
    Kind:     CustomResourceDefinition
    Message:  CRD is present and Established condition is true
    Name:     sriovnetworks.sriovnetwork.openshift.io
    Status:   Present
    Uuid:     ec093c6c-1def-43e5-bc1a-8c299c741a89
    Version:  v1
    Group:    apiextensions.k8s.io
    Kind:     CustomResourceDefinition
    Message:  CRD is present and Established condition is true
    Name:     sriovoperatorconfigs.sriovnetwork.openshift.io
    Status:   Present
    Uuid:     ebb9dda9-bf7a-4586-b868-a331d760f33f
    Version:  v1
    Group:    
    Kind:     ServiceAccount
    Message:  Service account is owned by another ClusterServiceVersion
    Name:     sriov-network-operator
    Status:   PresentNotSatisfied
    Version:  v1


<snip>


The output above reports that CRD 'sriovnetworkpoolconfigs.sriovnetwork.openshift.io' is not present, yet the CRD exists on the cluster:

# oc get crd sriovnetworkpoolconfigs.sriovnetwork.openshift.io -o yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.4.1
    operatorframework.io/installed-alongside-1ff2c668e58dcbc9: openshift-sriov-network-operator/sriov-network-operator.4.9.0-202108021349
  creationTimestamp: "2021-08-03T07:11:13Z"
  generation: 1
  labels:
    operators.coreos.com/sriov-network-operator.openshift-sriov-network-operator: ""
  name: sriovnetworkpoolconfigs.sriovnetwork.openshift.io
  resourceVersion: "78195"
  uid: 5843effc-1670-41d0-8986-5afcc5471d5e
spec:
  conversion:
    strategy: None
  group: sriovnetwork.openshift.io
  names:
    kind: SriovNetworkPoolConfig
    listKind: SriovNetworkPoolConfigList
    plural: sriovnetworkpoolconfigs
    singular: sriovnetworkpoolconfig
  scope: Namespaced
  versions:
  - name: v1
    schema:
      openAPIV3Schema:
        description: SriovNetworkPoolConfig is the Schema for the sriovnetworkpoolconfigs
          API
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: SriovNetworkPoolConfigSpec defines the desired state of SriovNetworkPoolConfig
            properties:
              ovsHardwareOffloadConfig:
                description: OvsHardwareOffloadConfig describes the OVS HWOL configuration
                  for selected Nodes
                properties:
                  name:
                    description: 'Name is mandatory and must be unique. On Kubernetes:
                      Name is the name of OvsHardwareOffloadConfig On OpenShift: Name
                      is the name of MachineConfigPool to be enabled with OVS hardware
                      offload'
                    type: string
                type: object
            type: object
          status:
            description: SriovNetworkPoolConfigStatus defines the observed state of
              SriovNetworkPoolConfig
            type: object
        type: object
    served: true
    storage: true
    subresources:
      status: {}
status:
  acceptedNames:
    kind: SriovNetworkPoolConfig
    listKind: SriovNetworkPoolConfigList
    plural: sriovnetworkpoolconfigs
    singular: sriovnetworkpoolconfig
  conditions:
  - lastTransitionTime: "2021-08-03T07:11:13Z"
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: "2021-08-03T07:11:13Z"
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  storedVersions:
  - v1


Version-Release number of selected component (if applicable):


How reproducible:
always

Steps to Reproduce:
1. Upgrade the SR-IOV operator from 4.8 to 4.9.

Actual results:


Expected results:


Additional info:

Comment 1 zhaozhanqi 2021-08-03 09:23:11 UTC
It works when a new SR-IOV operator is installed with 4.9 directly.

Comment 3 Jian Zhang 2021-08-03 10:26:03 UTC
Steps to reproduce:
1. Install OCP 4.9.
2. Install the QE CatalogSource.

[cloud-user@preserve-olm-env jian]$ cat cs-qe.yaml 
apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: qe-app-registry
  namespace: openshift-marketplace
spec:
  displayName: Production Operators
  image: quay.io/openshift-qe-optional-operators/ocp4-index:latest
  publisher: OpenShift QE
  sourceType: grpc
  updateStrategy:
    registryPoll:
      interval: 15m
[cloud-user@preserve-olm-env jian]$ oc create -f cs-qe.yaml 
catalogsource.operators.coreos.com/qe-app-registry created

3. Create an ICSP so that the operator images can be pulled successfully.

[cloud-user@preserve-olm-env jian]$ cat icsp-qe.yaml 
apiVersion: operator.openshift.io/v1alpha1
kind: ImageContentSourcePolicy
metadata:
  name: brew-registry
spec:
  repositoryDigestMirrors:
  - mirrors:
    - brew.registry.redhat.io
    source: registry.redhat.io
  - mirrors:
    - brew.registry.redhat.io
    source: registry.stage.redhat.io
  - mirrors:
    - brew.registry.redhat.io
    source: registry-proxy.engineering.redhat.com

4. Subscribe to the SR-IOV operator from the QE CatalogSource.

[cloud-user@preserve-olm-env jian]$ oc get sub
NAME                     PACKAGE                  SOURCE            CHANNEL
sriov-network-operator   sriov-network-operator   qe-app-registry   4.8
[cloud-user@preserve-olm-env jian]$ oc get ip
NAME            CSV                                         APPROVAL    APPROVED
install-knmvk   sriov-network-operator.4.8.0-202107291502   Automatic   true
[cloud-user@preserve-olm-env jian]$ oc get csv
NAME                                        DISPLAY                   VERSION              REPLACES   PHASE
sriov-network-operator.4.8.0-202107291502   SR-IOV Network Operator   4.8.0-202107291502              Succeeded

5. Change the operator's subscription channel to 4.9.

[cloud-user@preserve-olm-env jian]$ oc get csv
NAME                                        DISPLAY                   VERSION              REPLACES                                    PHASE
sriov-network-operator.4.8.0-202107291502   SR-IOV Network Operator   4.8.0-202107291502                                               Replacing
sriov-network-operator.4.9.0-202108021349   SR-IOV Network Operator   4.9.0-202108021349   sriov-network-operator.4.8.0-202107291502   Pending

Got the warning below:

  - message: 'constraints not satisfiable: subscription sriov-network-operator exists,
      clusterserviceversion sriov-network-operator.4.8.0-202107291502 exists and is
      not referenced by a subscription, @existing/openshift-sriov-network-operator//sriov-network-operator.4.9.0-202108021349,
      @existing/openshift-sriov-network-operator//sriov-network-operator.4.8.0-202107291502
      and qe-app-registry/openshift-marketplace/4.9/sriov-network-operator.4.9.0-202108021349
      originate from package sriov-network-operator, subscription sriov-network-operator
      requires at least one of qe-app-registry/openshift-marketplace/4.9/sriov-network-operator.4.9.0-202108021349
      or @existing/openshift-sriov-network-operator//sriov-network-operator.4.9.0-202108021349'
    reason: ConstraintsNotSatisfiable
    status: "True"
    type: ResolutionFailed
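
For scripted checks, the same condition can be picked out of a Subscription object. This is only a minimal sketch: it assumes the condition above is read from the Subscription's status.conditions (for example via `oc get sub <name> -o json`), and the sample data is hand-written with a shortened message, not real cluster output.

```python
def resolution_failures(subscription):
    """Return (reason, message) for every active ResolutionFailed condition."""
    conditions = subscription.get("status", {}).get("conditions") or []
    return [
        (c.get("reason"), c.get("message"))
        for c in conditions
        if c.get("type") == "ResolutionFailed" and c.get("status") == "True"
    ]


# Hypothetical sample; the message is shortened for this sketch.
sample_subscription = {
    "status": {
        "conditions": [
            {"type": "CatalogSourcesUnhealthy", "status": "False",
             "reason": "AllCatalogSourcesHealthy", "message": ""},
            {"type": "ResolutionFailed", "status": "True",
             "reason": "ConstraintsNotSatisfiable",
             "message": "constraints not satisfiable (shortened)"},
        ]
    }
}

for reason, message in resolution_failures(sample_subscription):
    print(f"{reason}: {message}")
```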

[cloud-user@preserve-olm-env jian]$ oc get csv
NAME                                        DISPLAY                   VERSION              REPLACES                                    PHASE
sriov-network-operator.4.8.0-202107291502   SR-IOV Network Operator   4.8.0-202107291502                                               Replacing
sriov-network-operator.4.9.0-202108021349   SR-IOV Network Operator   4.9.0-202108021349   sriov-network-operator.4.8.0-202107291502   Pending
[cloud-user@preserve-olm-env jian]$ oc get csv sriov-network-operator.4.9.0-202108021349 -o yaml
...
  - group: ""
    kind: ServiceAccount
    message: Service account is owned by another ClusterServiceVersion
    name: sriov-network-operator
    status: PresentNotSatisfied
    version: v1
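
The unmet entries can also be filtered out of the CSV status instead of reading the whole list. A minimal sketch, assuming the CSV was exported with `oc get csv <name> -o json` and loaded into a dict; the sample below is hand-written from a few entries in this report, and the function name is made up for illustration.

```python
def unmet_requirements(csv_obj):
    """List requirement entries whose status is anything other than 'Present'."""
    entries = csv_obj.get("status", {}).get("requirementStatus") or []
    return [
        (e.get("kind"), e.get("name"), e.get("status"))
        for e in entries
        if e.get("status") != "Present"
    ]


# Hand-written sample mirroring three entries from the report.
sample_csv = {
    "status": {
        "requirementStatus": [
            {"kind": "CustomResourceDefinition",
             "name": "sriovnetworks.sriovnetwork.openshift.io",
             "status": "Present"},
            {"kind": "CustomResourceDefinition",
             "name": "sriovnetworkpoolconfigs.sriovnetwork.openshift.io",
             "status": "NotPresent"},
            {"kind": "ServiceAccount",
             "name": "sriov-network-operator",
             "status": "PresentNotSatisfied"},
        ]
    }
}

for kind, name, status in unmet_requirements(sample_csv):
    print(f"{status:20} {kind}/{name}")
```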


[cloud-user@preserve-olm-env jian]$ oc get sa
NAME                            SECRETS   AGE
builder                         2         3h30m
default                         2         3h30m
deployer                        2         3h30m
network-resources-injector-sa   2         3h4m
operator-webhook-sa             2         3h4m
sriov-cni                       2         176m
sriov-device-plugin             2         176m
sriov-network-config-daemon     2         3h5m
sriov-network-operator          2         3h5m
[cloud-user@preserve-olm-env jian]$ oc get sa sriov-network-operator -o yaml
apiVersion: v1
imagePullSecrets:
- name: sriov-network-operator-dockercfg-7b9s9
kind: ServiceAccount
metadata:
  creationTimestamp: "2021-08-03T06:35:20Z"
  labels:
    operators.coreos.com/sriov-network-operator.openshift-sriov-network-operator: ""
  name: sriov-network-operator
  namespace: openshift-sriov-network-operator
  ownerReferences:
  - apiVersion: operators.coreos.com/v1alpha1
    blockOwnerDeletion: false
    controller: false
    kind: ClusterServiceVersion
    name: sriov-network-operator.4.8.0-202107291502
    uid: 9cae99fb-13b7-4445-b268-9f783b507176
  resourceVersion: "78176"
  uid: 8b8d22cd-773c-4423-bdfa-35a284ac3502
secrets:
- name: sriov-network-operator-token-z8sth
- name: sriov-network-operator-dockercfg-7b9s9
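
The ownerReferences above show why the 4.9 CSV reports the ServiceAccount as owned by another ClusterServiceVersion. The comparison can be sketched as follows; this is an illustration of the check, not OLM's actual implementation, and the helper names are hypothetical.

```python
def owning_csvs(obj):
    """Names of all ClusterServiceVersions listed in the object's ownerReferences."""
    refs = obj.get("metadata", {}).get("ownerReferences") or []
    return [r.get("name") for r in refs if r.get("kind") == "ClusterServiceVersion"]


def owned_by_other_csv(obj, csv_name):
    """True if the object has CSV owners and csv_name is not one of them."""
    owners = owning_csvs(obj)
    return bool(owners) and csv_name not in owners


# Trimmed copy of the ServiceAccount shown above.
service_account = {
    "kind": "ServiceAccount",
    "metadata": {
        "name": "sriov-network-operator",
        "namespace": "openshift-sriov-network-operator",
        "ownerReferences": [
            {"kind": "ClusterServiceVersion",
             "name": "sriov-network-operator.4.8.0-202107291502"},
        ],
    },
}

pending_csv = "sriov-network-operator.4.9.0-202108021349"
print(owned_by_other_csv(service_account, pending_csv))
```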

Comment 4 zhaozhanqi 2021-08-03 11:03:47 UTC
cc

Comment 5 tflannag 2021-08-03 15:23:17 UTC
It looks like resolution failed as the 4.9 sriov-network-operator bundle contains a reference to a ServiceAccount manifest where the metadata.Name of that manifest matches the generated name from the 4.9 CSV resource: https://github.com/openshift/sriov-network-operator/blob/release-4.9/manifests/4.9/sriov-network-operator_v1_serviceaccount.yaml.

You can compare that to the 4.8 bundle, which is missing that ServiceAccount manifest: https://github.com/openshift/sriov-network-operator/tree/release-4.8/manifests/4.8.

The solution here is to update the 4.9 bundle, remove the reference to the ServiceAccount manifest (or rename it so it doesn't conflict with the generated one referenced in the CSV resource), and rebuild the bundle/attach to the QE's index image.

Another quick note - it looks like that bundle had been generated using operator-sdk tooling - we may want to add a validation check in the future that ensures this behavior isn't easily reproducible.
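
A validation check along those lines could look roughly like the sketch below. It is not part of operator-sdk or OLM; it assumes the bundle manifests have already been loaded into dicts, and it only flags ServiceAccount manifests whose name is also referenced as a serviceAccountName in the CSV's install strategy. The function names and the sample bundle are hypothetical.

```python
def csv_service_accounts(csv_obj):
    """Collect serviceAccountName values referenced by a CSV's install strategy."""
    install = csv_obj.get("spec", {}).get("install", {}).get("spec", {})
    names = set()
    for key in ("permissions", "clusterPermissions"):
        for rule in install.get(key) or []:
            if rule.get("serviceAccountName"):
                names.add(rule["serviceAccountName"])
    for dep in install.get("deployments") or []:
        pod_spec = dep.get("spec", {}).get("template", {}).get("spec", {})
        if pod_spec.get("serviceAccountName"):
            names.add(pod_spec["serviceAccountName"])
    return names


def conflicting_service_accounts(manifests):
    """Names of ServiceAccount manifests that collide with CSV-referenced accounts."""
    referenced = set()
    for m in manifests:
        if m.get("kind") == "ClusterServiceVersion":
            referenced |= csv_service_accounts(m)
    return sorted(
        m["metadata"]["name"]
        for m in manifests
        if m.get("kind") == "ServiceAccount"
        and m.get("metadata", {}).get("name") in referenced
    )


# Hypothetical, heavily trimmed bundle for illustration only.
bundle = [
    {"kind": "ClusterServiceVersion",
     "spec": {"install": {"spec": {"permissions": [
         {"serviceAccountName": "sriov-network-operator", "rules": []}]}}}},
    {"kind": "ServiceAccount", "metadata": {"name": "sriov-network-operator"}},
]

print(conflicting_service_accounts(bundle))
```

An empty result would mean the bundle ships no ServiceAccount manifest that overlaps with an account referenced by the CSV.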

Closing this as a duplicate of 1974438 - See Kevin's comment in https://bugzilla.redhat.com/show_bug.cgi?id=1974438#c3 for more information.

*** This bug has been marked as a duplicate of bug 1974438 ***

Comment 6 Jian Zhang 2021-08-04 01:50:25 UTC
Hi Tim,

Thanks for your information! 

> The solution here is to update the 4.9 bundle, remove the reference to the ServiceAccount manifest (or rename it so it doesn't conflict with the generated one referenced in the CSV resource), and rebuild the bundle/attach to the QE's index image.

That makes sense. Since bug 1974438 has been closed as WONTFIX, I am reopening this bug and transferring it to the SR-IOV network operator team.

> Another quick note - it looks like that bundle had been generated using operator-sdk tooling - we may want to add a validation check in the future that ensures this behavior isn't easily reproducible.

Yes, do you mind opening a bug or an RFE to track this issue? Thanks!

Comment 8 zhaozhanqi 2021-08-09 06:20:37 UTC
Verified this bug on 4.9.0-202108060405

# oc get csv
NAME                                        DISPLAY                   VERSION              REPLACES                                    PHASE
sriov-network-operator.4.9.0-202108060405   SR-IOV Network Operator   4.9.0-202108060405   sriov-network-operator.4.8.0-202108041625   Succeeded

Comment 11 errata-xmlrpc 2021-10-18 17:44:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

