Bug 2151693

Summary: After upgrade from 4.10.7 to 4.11.0, the hco.spec.workloadUpdateStrategy value is overwritten
Product: Container Native Virtualization (CNV)
Component: Installation
Version: 4.11.0
Status: CLOSED ERRATA
Severity: high
Priority: high
Reporter: Debarati Basu-Nag <dbasunag>
Assignee: Simone Tiraboschi <stirabos>
QA Contact: Debarati Basu-Nag <dbasunag>
CC: jortialc, kmajcher, stirabos, ycui
Target Release: 4.11.2
Hardware: Unspecified
OS: Unspecified
Fixed In Version: hco-bundle-registry-v4.11.2-18
Clones: 2153849, 2235308
Last Closed: 2023-01-12 14:08:54 UTC
Type: Bug
Bug Blocks: 2153849, 2235308

Description Debarati Basu-Nag 2022-12-07 20:42:07 UTC
Description of problem:
For an EUS->EUS upgrade we are supposed to turn off hco.spec.workloadUpdateStrategy (by setting workloadUpdateMethods to an empty list), so that workload updates happen only after upgrading to 4.12. That is currently not happening: after the upgrade to 4.11.0, hco.spec.workloadUpdateStrategy.workloadUpdateMethods is set to LiveMigrate and workloads are live migrating.


Version-Release number of selected component (if applicable):
4.11.0

How reproducible:
100%

Steps to Reproduce:
1. Before the upgrade, set hco.spec.workloadUpdateStrategy.workloadUpdateMethods to [].
2. Upgrade CNV from 4.10.7 to 4.11.0.
3. After the upgrade, check hco.spec.workloadUpdateStrategy.workloadUpdateMethods; it is now set to ["LiveMigrate"].
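
Step 1 can be applied with a merge patch. The following is a minimal sketch, not the exact procedure used for this report: it only builds the patch body and prints a kubectl patch command in the same style as the patch commands in the transcript. The helper name build_patch is hypothetical.

```python
import json
import shlex


def build_patch(methods):
    """Return a JSON merge-patch body that sets workloadUpdateMethods."""
    return json.dumps(
        {"spec": {"workloadUpdateStrategy": {"workloadUpdateMethods": methods}}}
    )


# An empty list is meant to pause workload updates during the upgrade.
patch = build_patch([])
command = [
    "kubectl", "patch", "hco", "kubevirt-hyperconverged",
    "-n", "openshift-cnv", "--type", "merge", "-p", patch,
]

# Print a shell-quoted command line that can be reviewed before running it.
print(shlex.join(command))
```

The script does not contact the cluster; it only prints the command so that the patch body can be inspected first.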

Actual results:
Before upgrade:
==============
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$  kubectl get hco kubevirt-hyperconverged -n openshift-cnv -o json | jq ".status.versions"
[
  {
    "name": "operator",
    "version": "4.10.7"
  }
]
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get vm -A
NAMESPACE                NAME                                            AGE   STATUS    READY
test-upgrade-namespace   vm-for-product-upgrade-hos-1670437886-760028    15m   Running   True
test-upgrade-namespace   vm-for-product-upgrade-hos-1670437887-3594792   14m   Running   True
test-upgrade-namespace   vm-for-product-upgrade-nfs-1670437884-7562118   15m   Running   True
test-upgrade-namespace   vm-for-product-upgrade-ocs-1670437885-5164044   15m   Running   True
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ 
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get hco kubevirt-hyperconverged -n openshift-cnv -o json | jq ".spec.workloadUpdateStrategy"
{
  "batchEvictionInterval": "1m0s",
  "batchEvictionSize": 10,
  "workloadUpdateMethods": []
}
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ 
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get hco kubevirt-hyperconverged -n openshift-cnv -o json | jq ".status.conditions"
[
  {
    "lastTransitionTime": "2022-12-06T18:37:49Z",
    "message": "Reconcile completed successfully",
    "observedGeneration": 3,
    "reason": "ReconcileCompleted",
    "status": "True",
    "type": "ReconcileComplete"
  },
  {
    "lastTransitionTime": "2022-12-07T18:51:50Z",
    "message": "Reconcile completed successfully",
    "observedGeneration": 3,
    "reason": "ReconcileCompleted",
    "status": "True",
    "type": "Available"
  },
  {
    "lastTransitionTime": "2022-12-07T18:51:50Z",
    "message": "Reconcile completed successfully",
    "observedGeneration": 3,
    "reason": "ReconcileCompleted",
    "status": "False",
    "type": "Progressing"
  },
  {
    "lastTransitionTime": "2022-12-07T18:51:02Z",
    "message": "Reconcile completed successfully",
    "observedGeneration": 3,
    "reason": "ReconcileCompleted",
    "status": "False",
    "type": "Degraded"
  },
  {
    "lastTransitionTime": "2022-12-07T18:51:50Z",
    "message": "Reconcile completed successfully",
    "observedGeneration": 3,
    "reason": "ReconcileCompleted",
    "status": "True",
    "type": "Upgradeable"
  }
]
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$
Post OCP upgrade:
=================
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get catalogsource -n openshift-marketplace
NAME                  DISPLAY                                TYPE   PUBLISHER   AGE
certified-operators   Certified Operators                    grpc   Red Hat     26h
community-operators   Community Operators                    grpc   Red Hat     26h
hco-catalogsource     OpenShift Virtualization Index Image   grpc   Red Hat     25h
ocs-catalogsource     OpenShift Container Storage            grpc   Red Hat     25h
redhat-marketplace    Red Hat Marketplace                    grpc   Red Hat     26h
redhat-operators      Red Hat Operators                      grpc   Red Hat     26h
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl patch operatorhub cluster --type merge -p '{"spec": {"disableAllDefaultSources": true}}'
operatorhub.config.openshift.io/cluster patched
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get catalogsource -n openshift-marketplace
NAME                DISPLAY                                TYPE   PUBLISHER   AGE
hco-catalogsource   OpenShift Virtualization Index Image   grpc   Red Hat     25h
ocs-catalogsource   OpenShift Container Storage            grpc   Red Hat     25h
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get ip -A
NAMESPACE                          NAME            CSV                                          APPROVAL    APPROVED
openshift-cnv                      install-lqt5n   kubevirt-hyperconverged-operator.v4.11.0     Manual      false
openshift-cnv                      install-njk24   kubevirt-hyperconverged-operator.v4.10.7     Manual      true
openshift-local-storage            install-s8jm4   local-storage-operator.4.10.0-202211041323   Automatic   true
openshift-local-storage            install-v86lx   local-storage-operator.4.11.0-202211072116   Automatic   true
openshift-nfd                      install-t95wm   nfd.4.10.0-202211041323                      Automatic   true
openshift-nfd                      install-xcqkn   nfd.4.11.0-202211091549                      Automatic   true
openshift-operators                install-4xbvx   servicemeshoperator.v2.3.0                   Automatic   true
openshift-operators                install-lkc4w   jaeger-operator.v1.39.0-3                    Automatic   true
openshift-operators                install-pj9kw   jaeger-operator.v1.39.0-3                    Automatic   true
openshift-sriov-network-operator   install-42c9t   sriov-network-operator.4.10.0-202211180226   Automatic   true
openshift-sriov-network-operator   install-lw5p9   sriov-network-operator.4.11.0-202211211407   Automatic   true
openshift-storage                  install-s9qfr   ocs-operator.v4.10.9                         Automatic   true
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl patch installplan install-lqt5n --namespace='openshift-cnv'   --type='merge'   --patch='{"spec":{"approved":true}}'
installplan.operators.coreos.com/install-lqt5n patched
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get csv -n openshift-cnv
NAME                                       DISPLAY                                          VERSION    REPLACES                                   PHASE
jaeger-operator.v1.39.0-3                  Red Hat OpenShift distributed tracing platform   1.39.0-3   jaeger-operator.v1.34.1-5                  Succeeded
kiali-operator.v1.57.3                     Kiali Operator                                   1.57.3     kiali-operator.v1.48.3                     Succeeded
kubevirt-hyperconverged-operator.v4.10.7   OpenShift Virtualization                         4.10.7     kubevirt-hyperconverged-operator.v4.10.6   Replacing
kubevirt-hyperconverged-operator.v4.11.0   OpenShift Virtualization                         4.11.0     kubevirt-hyperconverged-operator.v4.10.7   Installing
servicemeshoperator.v2.3.0                 Red Hat OpenShift Service Mesh                   2.3.0-0    servicemeshoperator.v2.2.3                 Succeeded
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get csv -n openshift-cnv
NAME                                       DISPLAY                                          VERSION    REPLACES                                   PHASE
jaeger-operator.v1.39.0-3                  Red Hat OpenShift distributed tracing platform   1.39.0-3   jaeger-operator.v1.34.1-5                  Succeeded
kiali-operator.v1.57.3                     Kiali Operator                                   1.57.3     kiali-operator.v1.48.3                     Succeeded
kubevirt-hyperconverged-operator.v4.11.0   OpenShift Virtualization                         4.11.0     kubevirt-hyperconverged-operator.v4.10.7   Succeeded
servicemeshoperator.v2.3.0                 Red Hat OpenShift Service Mesh                   2.3.0-0    servicemeshoperator.v2.2.3                 Succeeded
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get hco kubevirt-hyperconverged -n openshift-cnv -o json | jq ".status.versions"
[
  {
    "name": "operator",
    "version": "4.11.0"
  }
]
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$ kubectl get hco kubevirt-hyperconverged -n openshift-cnv -o json | jq ".spec.workloadUpdateStrategy"
{
  "batchEvictionInterval": "1m0s",
  "batchEvictionSize": 10,
  "workloadUpdateMethods": [
    "LiveMigrate"
  ]
}
(cnv-tests) [cnv-qe-jenkins@cnv-qe-01 cnv-tests]$
Expected results:
hco.spec.workloadUpdateStrategy should stay unaltered. 
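
The expectation can be written as a small check. This is an illustrative sketch only: it assumes the two jq outputs from the transcript above were saved as strings, and the function name strategy_unchanged is hypothetical.

```python
import json

# Output of `jq ".spec.workloadUpdateStrategy"` captured before the upgrade.
BEFORE = """
{
  "batchEvictionInterval": "1m0s",
  "batchEvictionSize": 10,
  "workloadUpdateMethods": []
}
"""

# The same query after the upgrade to 4.11.0.
AFTER = """
{
  "batchEvictionInterval": "1m0s",
  "batchEvictionSize": 10,
  "workloadUpdateMethods": [
    "LiveMigrate"
  ]
}
"""


def strategy_unchanged(before_json, after_json):
    """Return True when the strategy is identical before and after."""
    return json.loads(before_json) == json.loads(after_json)


if not strategy_unchanged(BEFORE, AFTER):
    print("workloadUpdateStrategy was overwritten during the upgrade")
```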

Additional info:
This impacts EUS->EUS upgrades. Without the ability to pause workload updates until CNV is updated to 4.12, we can't achieve EUS->EUS upgradability.

Comment 1 Simone Tiraboschi 2022-12-09 14:00:08 UTC
This happens only with workloadUpdateMethods: [].
Setting something like workloadUpdateMethods: ["None"] sounds like a valid workaround.
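
Why only the empty list is affected is not spelled out in this report. One plausible mechanism, stated here purely as an assumption, is that an empty list is dropped when the object is serialized, so it becomes indistinguishable from an unset field and the default is applied again. The sketch below models that with hypothetical helpers (serialize_omit_empty, apply_defaults, round_trip); it is not the HCO code.

```python
DEFAULT_METHODS = ["LiveMigrate"]  # the value observed after the upgrade


def serialize_omit_empty(strategy):
    """Drop keys whose value is empty, as an 'omit empty' serializer would."""
    return {k: v for k, v in strategy.items() if v not in ([], "", None)}


def apply_defaults(strategy):
    """Fill in workloadUpdateMethods when the key is absent."""
    result = dict(strategy)
    result.setdefault("workloadUpdateMethods", list(DEFAULT_METHODS))
    return result


def round_trip(strategy):
    """Serialize, then default: the assumed path taken during the upgrade."""
    return apply_defaults(serialize_omit_empty(strategy))


print(round_trip({"workloadUpdateMethods": []}))
print(round_trip({"workloadUpdateMethods": ["None"]}))
```

Under this assumption the empty list comes back as LiveMigrate, while a non-empty value such as ["None"] survives the round trip, which would match the workaround suggested in this comment.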

Comment 3 Simone Tiraboschi 2022-12-15 16:36:33 UTC
Adding "olm.skipRange: '>=4.10.7 <4.11.0'" to 4.11.2 so that users coming from 4.10.7 or later upgrade directly to 4.11.2, without passing through 4.11.0 and 4.11.1 and hitting this.
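
The effect of the range can be illustrated with a simplified check. This sketch assumes plain X.Y.Z version strings and compares them as integer tuples; OLM itself evaluates skipRange as a semver range, so this is only an approximation, and in_skip_range is a hypothetical helper.

```python
def parse(version):
    """Turn 'X.Y.Z' into a tuple of integers for comparison."""
    return tuple(int(part) for part in version.split("."))


def in_skip_range(version, low="4.10.7", high="4.11.0"):
    """True when low <= version < high, mirroring '>=4.10.7 <4.11.0'."""
    return parse(low) <= parse(version) < parse(high)


for installed in ["4.10.6", "4.10.7", "4.10.8", "4.11.0"]:
    print(installed, in_skip_range(installed))
```

Installed versions for which the check is true fall inside the range and can be replaced by 4.11.2 directly.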

Comment 4 Debarati Basu-Nag 2022-12-20 17:23:09 UTC
Validated with 4.11.2-19.
============================
cnv-qe-jenkins@cnv-qe-infra-01:~/dbasunag$ kubectl get hco kubevirt-hyperconverged -n openshift-cnv -o json | jq ".spec.workloadUpdateStrategy"
{
  "batchEvictionInterval": "1m0s",
  "batchEvictionSize": 10,
  "workloadUpdateMethods": []
}
cnv-qe-jenkins@cnv-qe-infra-01:~/dbasunag$ kubectl get hco kubevirt-hyperconverged -n openshift-cnv -o json | jq ".status.versions"
[
  {
    "name": "operator",
    "version": "4.11.2"
  }
]
cnv-qe-jenkins@cnv-qe-infra-01:~/dbasunag$ kubectl get csv -n openshift-cnv
NAME                                       DISPLAY                                          VERSION    REPLACES                                   PHASE
jaeger-operator.v1.39.0-3                  Red Hat OpenShift distributed tracing platform   1.39.0-3   jaeger-operator.v1.34.1-5                  Succeeded
kiali-operator.v1.57.3                     Kiali Operator                                   1.57.3     kiali-operator.v1.48.3                     Succeeded
kubevirt-hyperconverged-operator.v4.11.2   OpenShift Virtualization                         4.11.2     kubevirt-hyperconverged-operator.v4.10.7   Succeeded
servicemeshoperator.v2.3.0                 Red Hat OpenShift Service Mesh                   2.3.0-0    servicemeshoperator.v2.2.3                 Succeeded
cnv-qe-jenkins@cnv-qe-infra-01:~/dbasunag$

Comment 13 errata-xmlrpc 2023-01-12 14:08:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Virtualization 4.11.2 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:0155