Bug 1904830 - CNV upgrade to 2.5.2 fails because KubevirtCommonTemplatesBundle cr fails to update to the target version
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: SSP
Version: 2.5.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 2.5.2
Assignee: Omer Yahud
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-06 18:36 UTC by Ruth Netser
Modified: 2020-12-17 21:45 UTC (History)

Fixed In Version: kubevirt-ssp-operator-container-v2.5.2-5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-16 00:16:58 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2020:5560 0 None None None 2020-12-16 00:17:01 UTC

Description Ruth Netser 2020-12-06 18:36:52 UTC
Description of problem:
Upgrade CNV 2.4.4 (from production) to CNV 2.5.2 (from osbs).
Upgrade fails.

Version-Release number of selected component (if applicable):
kubevirt-ssp-operator-container-v2.5.2-4

How reproducible:
100% (happened on 2 clusters)

Steps to Reproduce:
1. Install OCP 4.5 and CNV 2.4.4 from production
2. Update OCP to 4.6
3. Upgrade CNV to 2.5.2 from osbs

Actual results:
HCO upgrade fails:
      message: 'An unhandled exception occurred while running the lookup plugin ''k8s''.
        Error was a <class ''ansible.errors.AnsibleError''>, original message: Failed
        to find exact match for kubevirt.io/v1.KubevirtCommonTemplatesBundle by [kind,
        name, singularName, shortNames]'
      reason: Failed
      status: "True"
      type: Failure
    observedVersion: v2.4.4
    operatorVersion: v2.5.2
    targetVersion: v2.5.2


Expected results:
Upgrade should end successfully.

Additional info:
==============================================
$ oc get csv -n openshift-cnv
NAME                                      DISPLAY                    VERSION   REPLACES                                  PHASE
kubevirt-hyperconverged-operator.v2.5.2   OpenShift Virtualization   2.5.2     kubevirt-hyperconverged-operator.v2.4.4   Installing

==============================================

$ oc get KubevirtCommonTemplatesBundle -n openshift -oyaml
apiVersion: v1
items:
- apiVersion: ssp.kubevirt.io/v1
  kind: KubevirtCommonTemplatesBundle
  metadata:
    creationTimestamp: "2020-12-02T18:59:45Z"
    generation: 1
    labels:
      app: kubevirt-hyperconverged
    managedFields:
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            .: {}
            f:app: {}
        f:spec: {}
      manager: hyperconverged-cluster-operator
      operation: Update
      time: "2020-12-02T18:59:45Z"
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:observedVersion: {}
          f:operatorVersion: {}
          f:targetVersion: {}
      manager: OpenAPI-Generator
      operation: Update
      time: "2020-12-06T14:31:50Z"
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:conditions: {}
      manager: ansible-operator
      operation: Update
      time: "2020-12-06T18:00:26Z"
    name: common-templates-kubevirt-hyperconverged
    namespace: openshift
    resourceVersion: "4245877"
    selfLink: /apis/ssp.kubevirt.io/v1/namespaces/openshift/kubevirtcommontemplatesbundles/common-templates-kubevirt-hyperconverged
    uid: 00698e4b-1846-40b5-9174-e01567563608
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2020-12-06T14:31:51Z"
      message: 'Templates progressing (deployed templates: 52, desired deployed templated:
        58).'
      reason: progressing
      status: "False"
      type: Progressing
    - lastTransitionTime: "2020-12-06T14:31:51Z"
      message: 'Common templates available (deployed templates: 52, desired deployed
        templated: 58).'
      reason: available
      status: "True"
      type: Available
    - lastTransitionTime: "2020-12-06T14:31:51Z"
      message: 'Templates degraded (deployed templates: 52, desired deployed templated:
        58).'
      reason: degraded
      status: "False"
      type: Degraded
    - lastTransitionTime: "2020-12-06T17:59:49Z"
      message: Running reconciliation
      reason: Running
      status: "False"
      type: Running
    - lastTransitionTime: "2020-12-06T18:00:26Z"
      message: 'An unhandled exception occurred while running the lookup plugin ''k8s''.
        Error was a <class ''ansible.errors.AnsibleError''>, original message: Failed
        to find exact match for kubevirt.io/v1.KubevirtCommonTemplatesBundle by [kind,
        name, singularName, shortNames]'
      reason: Failed
      status: "True"
      type: Failure
    observedVersion: v2.4.4
    operatorVersion: v2.5.2
    targetVersion: v2.5.2
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
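
The status above can be read mechanically. The following is a minimal sketch, not part of the operator: it assumes an upgrade counts as finished once observedVersion equals targetVersion, and the helper name summarize_bundle_status is hypothetical. The sample values are copied from the status block above, with the failure message shortened.

```python
from typing import Any, Dict, List


def summarize_bundle_status(status: Dict[str, Any]) -> Dict[str, Any]:
    """Summarize upgrade state from a CR's .status mapping (hypothetical helper)."""
    observed = status.get("observedVersion")
    target = status.get("targetVersion")
    # Collect messages of conditions with type "Failure" and status "True".
    # Condition statuses are strings in the CR, not booleans.
    failures: List[str] = [
        cond.get("message", "")
        for cond in status.get("conditions", [])
        if cond.get("type") == "Failure" and cond.get("status") == "True"
    ]
    return {
        # Assumption: the upgrade is done when observed and target versions match.
        "upgraded": observed is not None and observed == target,
        "observed": observed,
        "target": target,
        "failures": failures,
    }


# Values copied from the status shown above (failure message shortened).
status = {
    "conditions": [
        {"type": "Available", "status": "True", "reason": "available"},
        {
            "type": "Failure",
            "status": "True",
            "reason": "Failed",
            "message": "An unhandled exception occurred while running the lookup plugin 'k8s'.",
        },
    ],
    "observedVersion": "v2.4.4",
    "operatorVersion": "v2.5.2",
    "targetVersion": "v2.5.2",
}

summary = summarize_bundle_status(status)
print(summary["upgraded"], summary["observed"], summary["target"])
```

In practice the mapping would come from parsing the `oc get ... -oyaml` output; that step is omitted to keep the sketch free of third-party dependencies.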

==============================================
$ oc describe hco -n openshift-cnv kubevirt-hyperconverged 
Name:         kubevirt-hyperconverged
Namespace:    openshift-cnv
Labels:       app=kubevirt-hyperconverged
Annotations:  <none>
API Version:  hco.kubevirt.io/v1beta1
Kind:         HyperConverged
Metadata:
  Creation Timestamp:  2020-12-02T18:59:02Z
  Finalizers:
    hyperconvergeds.hco.kubevirt.io
  Generation:        2
  Resource Version:  4267414
  Self Link:         /apis/hco.kubevirt.io/v1beta1/namespaces/openshift-cnv/hyperconvergeds/kubevirt-hyperconverged
  UID:               d9d86269-a51c-49b7-9cca-a31b68294f4d
Spec:
  Version:  v2.4.4
Status:
  Conditions:
    Last Heartbeat Time:   2020-12-06T18:24:19Z
    Last Transition Time:  2020-12-02T18:59:46Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  ReconcileComplete
    Last Heartbeat Time:   2020-12-06T18:24:19Z
    Last Transition Time:  2020-12-06T14:35:30Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  Available
    Last Heartbeat Time:   2020-12-06T18:24:19Z
    Last Transition Time:  2020-12-06T14:30:58Z
    Message:               HCO is now upgrading to version v2.5.2
    Reason:                HCOUpgrading
    Status:                True
    Type:                  Progressing
    Last Heartbeat Time:   2020-12-06T18:24:19Z
    Last Transition Time:  2020-12-06T14:35:30Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                False
    Type:                  Degraded
    Last Heartbeat Time:   2020-12-06T18:24:19Z
    Last Transition Time:  2020-12-06T14:35:30Z
    Message:               Reconcile completed successfully
    Reason:                ReconcileCompleted
    Status:                True
    Type:                  Upgradeable
  Related Objects:
    API Version:       scheduling.k8s.io/v1
    Kind:              PriorityClass
    Name:              kubevirt-cluster-critical
    Resource Version:  65583
    UID:               0bcf7a73-d1db-411c-a0de-d376c8b77ac2
    API Version:       v1
    Kind:              ConfigMap
    Name:              kubevirt-config
    Namespace:         openshift-cnv
    Resource Version:  4054797
    UID:               1506979d-54ff-4d3c-9411-184314233bd8
    API Version:       v1
    Kind:              ConfigMap
    Name:              kubevirt-storage-class-defaults
    Namespace:         openshift-cnv
    Resource Version:  65595
    UID:               b0cb3e73-6182-4583-a59c-d2ff9895410b
    API Version:       kubevirt.io/v1alpha3
    Kind:              KubeVirt
    Name:              kubevirt-kubevirt-hyperconverged
    Namespace:         openshift-cnv
    Resource Version:  4059896
    UID:               b0fe95d2-03b8-4dfe-8a17-6752ce69ea5d
    API Version:       cdi.kubevirt.io/v1alpha1
    Kind:              CDI
    Name:              cdi-kubevirt-hyperconverged
    Resource Version:  881251
    UID:               00cf5b9a-3f16-436f-b70e-5063cb90b4e6
    API Version:       networkaddonsoperator.network.kubevirt.io/v1alpha1
    Kind:              NetworkAddonsConfig
    Name:              cluster
    Resource Version:  4051423
    UID:               ffdc6a64-742b-484e-8ea2-cf13ae551f1d
    API Version:       ssp.kubevirt.io/v1
    Kind:              KubevirtCommonTemplatesBundle
    Name:              common-templates-kubevirt-hyperconverged
    Namespace:         openshift
    Resource Version:  4261519
    UID:               00698e4b-1846-40b5-9174-e01567563608
    API Version:       ssp.kubevirt.io/v1
    Kind:              KubevirtNodeLabellerBundle
    Name:              node-labeller-kubevirt-hyperconverged
    Namespace:         openshift-cnv
    Resource Version:  4153869
    UID:               db9bc012-a721-4ba5-a379-85ea85348942
    API Version:       ssp.kubevirt.io/v1
    Kind:              KubevirtTemplateValidator
    Name:              template-validator-kubevirt-hyperconverged
    Namespace:         openshift-cnv
    Resource Version:  4153848
    UID:               412f1864-1237-4d2c-8a65-27175bde826f
    API Version:       ssp.kubevirt.io/v1
    Kind:              KubevirtMetricsAggregation
    Name:              metrics-aggregation-kubevirt-hyperconverged
    Namespace:         openshift-cnv
    Resource Version:  4153663
    UID:               781abc7f-2d40-4bcf-b121-2ae60c95c35a
    API Version:       v1
    Kind:              ConfigMap
    Name:              v2v-vmware
    Namespace:         openshift-cnv
    Resource Version:  4054805
    UID:               2b81ea09-eb76-4d6c-931b-e4e077730f6b
    API Version:       v2v.kubevirt.io/v1alpha1
    Kind:              VMImportConfig
    Name:              vmimport-kubevirt-hyperconverged
    Resource Version:  880716
    UID:               c09361ba-daac-46ea-ac02-4310f9b7b211
    API Version:       console.openshift.io/v1
    Kind:              ConsoleCLIDownload
    Name:              virtctl-clidownloads-kubevirt-hyperconverged
    Resource Version:  4054787
    UID:               696c4a52-9bb4-472e-aaed-549047ce49f8
    API Version:       cdi.kubevirt.io/v1beta1
    Kind:              CDI
    Name:              cdi-kubevirt-hyperconverged
    Resource Version:  4054708
    UID:               00cf5b9a-3f16-436f-b70e-5063cb90b4e6
    API Version:       networkaddonsoperator.network.kubevirt.io/v1
    Kind:              NetworkAddonsConfig
    Name:              cluster
    Resource Version:  4267413
    UID:               ffdc6a64-742b-484e-8ea2-cf13ae551f1d
    API Version:       v2v.kubevirt.io/v1beta1
    Kind:              VMImportConfig
    Name:              vmimport-kubevirt-hyperconverged
    Resource Version:  4055884
    UID:               c09361ba-daac-46ea-ac02-4310f9b7b211
    API Version:       rbac.authorization.k8s.io/v1
    Kind:              Role
    Name:              hco.kubevirt.io:config-reader
    Namespace:         openshift-cnv
    Resource Version:  4054800
    UID:               3b962e6a-434d-480d-a622-4df36133c43a
    API Version:       rbac.authorization.k8s.io/v1
    Kind:              RoleBinding
    Name:              hco.kubevirt.io:config-reader
    Namespace:         openshift-cnv
    Resource Version:  4054804
    UID:               788dc625-b50e-415f-953f-61738c842728
  Versions:
    Name:     operator
    Version:  v2.4.4
Events:
  Type     Reason          Age                     From                     Message
  ----     ------          ----                    ----                     -------
  Normal   ReconcileHCO    148m (x620 over 3h54m)  kubevirt-hyperconverged  HCO Upgrade in progress
  Normal   UpgradeHCO      143m                    kubevirt-hyperconverged  Upgrading the HyperConverged to version v2.5.2
  Warning  HcoUpdateError  140m (x5 over 143m)     kubevirt-hyperconverged  Failed to update HCO Status
  Normal   ReconcileHCO    2m52s (x735 over 143m)  kubevirt-hyperconverged  HCO Upgrade in progress


==============================================


$ oc get template -n openshift |grep 'rhel\|fedora\|win' |wc -l
60

Comment 4 Omer Yahud 2020-12-07 12:21:53 UTC
I found the issue: the operator assumes that all deployed templates have the same version (template.kubevirt.io/version: "v0.12.4").
That is no longer the case after the last fix I submitted for 2.5.2 (https://bugzilla.redhat.com/show_bug.cgi?id=1899460), under which older templates do not get the latest version.

Working on a fix right now
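
As an illustration of the assumption described in this comment, here is a minimal sketch. It is not the operator's code: the helper names are hypothetical, and it assumes the version is stored under a "labels" mapping in each template's metadata. The three sample entries are taken from the template listing in comment 8.

```python
from collections import Counter
from typing import Any, Dict, List

VERSION_LABEL = "template.kubevirt.io/version"


def version_histogram(templates: List[Dict[str, Any]]) -> Counter:
    """Count templates per value of the version label."""
    return Counter(t["labels"].get(VERSION_LABEL, "<missing>") for t in templates)


def single_version_assumption_holds(templates: List[Dict[str, Any]], expected: str) -> bool:
    """Return True only if every template carries the expected version."""
    return all(t["labels"].get(VERSION_LABEL) == expected for t in templates)


# Sample entries copied from the listing in comment 8.
templates = [
    {"name": "windows-server-large-v0.11.3", "labels": {VERSION_LABEL: "v0.11.3"}},
    {"name": "windows-server-large-v0.12.3", "labels": {VERSION_LABEL: "v0.12.3"}},
    {"name": "windows10-desktop-large-v0.11.3", "labels": {VERSION_LABEL: "v0.12.4"}},
]

hist = version_histogram(templates)
print(dict(hist))
print(single_version_assumption_holds(templates, "v0.12.4"))
```

With mixed versions in the namespace, any check that expects a single version across all deployed templates cannot succeed, which matches the behavior described here.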

Comment 6 Omer Yahud 2020-12-07 13:56:57 UTC
Upstream PR: https://github.com/kubevirt/kubevirt-ssp-operator/pull/255

Comment 7 Omer Yahud 2020-12-08 15:13:10 UTC
gerrit patch: https://code.engineering.redhat.com/gerrit/#/c/220554/

Comment 8 Ruth Netser 2020-12-13 12:10:24 UTC
Tested cnv 2.4.4->cnv 2.5.2 upgrade (kubevirt-ssp-operator-container-v2.5.2-5):
Upgrade ends successfully and KubevirtCommonTemplatesBundle conditions are ok: Progressing=False, Available=True, Degraded=False.

There is, however, a message about the number of deployed templates versus the desired number:

      message: 'Templates progressing (deployed templates: 60, desired deployed templated:
        58).'

There are 60 deployed templates:
$ oc get template -n openshift  |grep 'win\|fed\|rhel'|wc -l
60

But the desired number should be lower than 58:
$ oc get template -n openshift  |grep win|wc -l
16

$ for i in `oc get template -n openshift -oname | cut -d'/' -f2 | grep 'win'`; do echo $i; oc get template -n openshift $i -oyaml |grep "    template.kubevirt.io/version\|depreca" | grep -v 'v1a\|f:'; done
win2k12r2-desktop-large-v0.11.3
    template.kubevirt.io/deprecated: "true"
    template.kubevirt.io/version: v0.11.3
win2k12r2-desktop-medium-v0.11.3
    template.kubevirt.io/deprecated: "true"
    template.kubevirt.io/version: v0.11.3
win2k12r2-server-large-v0.11.3
    template.kubevirt.io/deprecated: "true"
    template.kubevirt.io/version: v0.11.3
win2k12r2-server-medium-v0.11.3
    template.kubevirt.io/deprecated: "true"
    template.kubevirt.io/version: v0.11.3
windows-server-large-v0.11.3
    template.kubevirt.io/version: v0.11.3
windows-server-large-v0.12.3
    template.kubevirt.io/deprecated: "true"
    template.kubevirt.io/version: v0.12.3
windows-server-medium-v0.11.3
    template.kubevirt.io/version: v0.11.3
windows-server-medium-v0.12.3
    template.kubevirt.io/deprecated: "true"
    template.kubevirt.io/version: v0.12.3
windows10-desktop-large-v0.11.3
    template.kubevirt.io/version: v0.12.4
windows10-desktop-medium-v0.11.3
    template.kubevirt.io/version: v0.12.4
windows2k12r2-server-large-v0.12.3
    template.kubevirt.io/version: v0.12.4
windows2k12r2-server-medium-v0.12.3
    template.kubevirt.io/version: v0.12.4
windows2k16-server-large-v0.12.3
    template.kubevirt.io/version: v0.12.4
windows2k16-server-medium-v0.12.3
    template.kubevirt.io/version: v0.12.4
windows2k19-server-large-v0.12.3
    template.kubevirt.io/version: v0.12.4
windows2k19-server-medium-v0.12.3
    template.kubevirt.io/version: v0.12.4



$ oc get  -n openshift KubevirtCommonTemplatesBundle -oyaml
apiVersion: v1
items:
- apiVersion: ssp.kubevirt.io/v1
  kind: KubevirtCommonTemplatesBundle
  metadata:
    creationTimestamp: "2020-12-10T18:48:29Z"
    generation: 1
    labels:
      app: kubevirt-hyperconverged
    managedFields:
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            .: {}
            f:app: {}
        f:spec: {}
      manager: hyperconverged-cluster-operator
      operation: Update
      time: "2020-12-10T18:48:29Z"
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          f:observedVersion: {}
          f:operatorVersion: {}
          f:targetVersion: {}
      manager: OpenAPI-Generator
      operation: Update
      time: "2020-12-11T09:55:55Z"
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:status:
          .: {}
          f:conditions: {}
      manager: ansible-operator
      operation: Update
      time: "2020-12-13T11:53:44Z"
    name: common-templates-kubevirt-hyperconverged
    namespace: openshift
    resourceVersion: "3339566"
    selfLink: /apis/ssp.kubevirt.io/v1/namespaces/openshift/kubevirtcommontemplatesbundles/common-templates-kubevirt-hyperconverged
    uid: cc446c97-d197-40a1-af0a-5fcc5cbfd0cb
  spec: {}
  status:
    conditions:
    - lastTransitionTime: "2020-12-11T09:55:55Z"
      message: 'Templates progressing (deployed templates: 60, desired deployed templated:
        58).'
      reason: progressing
      status: "False"
      type: Progressing
    - lastTransitionTime: "2020-12-11T09:55:55Z"
      message: 'Common templates available (deployed templates: 60, desired deployed
        templated: 58).'
      reason: available
      status: "True"
      type: Available
    - lastTransitionTime: "2020-12-11T09:55:55Z"
      message: 'Templates degraded (deployed templates: 60, desired deployed templated:
        58).'
      reason: degraded
      status: "False"
      type: Degraded
    - lastTransitionTime: "2020-12-13T11:53:18Z"
      message: Running reconciliation
      reason: Running
      status: "False"
      type: Running
    - lastTransitionTime: "2020-12-13T11:53:44Z"
      message: 'An unhandled exception occurred while running the lookup plugin ''k8s''.
        Error was a <class ''ansible.errors.AnsibleError''>, original message: Failed
        to find exact match for kubevirt.io/v1.KubevirtCommonTemplatesBundle by [kind,
        name, singularName, shortNames]'
      reason: Failed
      status: "True"
      type: Failure
    observedVersion: v2.5.2
    operatorVersion: v2.5.2
    targetVersion: v2.5.2
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

Omer - can you please have a look at the deployed templates? I want to make sure we're not deploying something that will cause any issues

Comment 9 Omer Yahud 2020-12-14 09:39:32 UTC
Hi Ruth,

The message you see is expected.
When a template version is bumped, the number of templates has not changed from the operator's perspective, but after an upgrade both versions exist in the cluster, so deployed > expected.

The templates look fine; only the relevant templates are at the latest version, 0.12.4.
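
To compare the two counts without reading the message by eye, the condition text can be parsed. This is a minimal sketch under stated assumptions: the helper name parse_template_counts is hypothetical, and the pattern relies on the message wording seen in this bug, including the operator's spelling "templated".

```python
import re
from typing import Optional, Tuple

# Matches the wording seen in this bug; \s+ tolerates the YAML line wrap.
COUNTS_RE = re.compile(
    r"deployed templates:\s*(\d+),\s*desired deployed\s+templated:\s*(\d+)"
)


def parse_template_counts(message: str) -> Optional[Tuple[int, int]]:
    """Return (deployed, desired) from a condition message, or None if absent."""
    match = COUNTS_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Message copied from comment 8.
message = "Templates progressing (deployed templates: 60, desired deployed templated:\n 58)."
counts = parse_template_counts(message)
if counts is not None:
    deployed, desired = counts
    # A positive surplus after an upgrade is expected: old and new versions coexist.
    print(deployed, desired, deployed - desired)
```

A surplus alone says nothing about health; as explained above, it only reflects older template versions left in place after the upgrade.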

Comment 10 Ruth Netser 2020-12-14 11:21:50 UTC
Upgrade passed successfully (along with basic templates tests); moving to verified.

Comment 16 errata-xmlrpc 2020-12-16 00:16:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 2.5.2 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5560

Comment 17 Oren Cohen 2020-12-17 21:45:42 UTC
This bug is still observed when upgrading from 2.5.2 to 2.5.3.

bundle image: hco-bundle-registry-container-v2.5.3-7
ssp image in bundle: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-kubevirt-ssp-operator@sha256:3ba20e2c06ea0d9828c0b33e521c1417ab74b3c96d5e68bb64211ad40045e5c6
ssp version in image: https://access.redhat.com/containers/#/registry.access.redhat.com/container-native-virtualization/kubevirt-ssp-operator/images/v2.5.2-5
(which is specified in "fixed in version")
The other 3 SSP CRs reached observedVersion of 2.5.3

-----------
KubevirtCommonTemplatesBundle yaml:

apiVersion: ssp.kubevirt.io/v1
kind: KubevirtCommonTemplatesBundle
metadata:
  selfLink: >-
    /apis/ssp.kubevirt.io/v1/namespaces/openshift/kubevirtcommontemplatesbundles/common-templates-kubevirt-hyperconverged
  resourceVersion: '343393831'
  name: common-templates-kubevirt-hyperconverged
  uid: bf89cccb-bdc6-4393-8c18-303364ab82dd
  creationTimestamp: '2020-07-30T14:46:43Z'
  generation: 1
  managedFields:
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:labels':
            .: {}
            'f:app': {}
        'f:spec': {}
      manager: hyperconverged-cluster-operator
      operation: Update
      time: '2020-07-30T14:46:43Z'
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:status':
          'f:observedVersion': {}
          'f:operatorVersion': {}
          'f:targetVersion': {}
      manager: OpenAPI-Generator
      operation: Update
      time: '2020-12-17T21:00:30Z'
    - apiVersion: ssp.kubevirt.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:status':
          .: {}
          'f:conditions': {}
      manager: ansible-operator
      operation: Update
      time: '2020-12-17T21:33:29Z'
  namespace: openshift
  labels:
    app: kubevirt-hyperconverged
spec: {}
status:
  conditions:
    - lastTransitionTime: '2020-12-17T21:00:31Z'
      message: >-
        Templates progressing (deployed templates: 148, desired deployed
        templated: 58).
      reason: progressing
      status: 'False'
      type: Progressing
    - lastTransitionTime: '2020-12-17T21:00:31Z'
      message: >-
        Common templates available (deployed templates: 148, desired deployed
        templated: 58).
      reason: available
      status: 'True'
      type: Available
    - lastTransitionTime: '2020-12-17T21:00:31Z'
      message: >-
        Templates degraded (deployed templates: 148, desired deployed templated:
        58).
      reason: degraded
      status: 'False'
      type: Degraded
    - lastTransitionTime: '2020-12-17T21:32:52Z'
      message: Running reconciliation
      reason: Running
      status: 'False'
      type: Running
    - lastTransitionTime: '2020-12-17T21:33:29Z'
      message: >-
        An unhandled exception occurred while running the lookup plugin 'k8s'.
        Error was a <class 'ansible.errors.AnsibleError'>, original message:
        Failed to find exact match for
        kubevirt.io/v1.KubevirtCommonTemplatesBundle by [kind, name,
        singularName, shortNames]
      reason: Failed
      status: 'True'
      type: Failure
  observedVersion: v2.5.2
  operatorVersion: v2.5.3
  targetVersion: v2.5.3

