Description of problem:
Until CNV 4.10, knmstate was installed as part of CNV. Starting from CNV 4.11, knmstate is installed as a standalone operator, and CNV does not depend on knmstate anymore (unless the user explicitly wishes to install and use knmstate).
Upgrading CNV 4.10->4.11 should be blocked only if knmstate is currently in use (i.e. NodeNetworkConfigurationPolicies are applied) and standalone knmstate is not yet installed.
Currently, the upgrade is blocked even if no policy is applied.

Version-Release number of selected component (if applicable):
OCP 4.11.0
CNV 4.10.5
knmstate v4.10.5-1
HCO v4.10.5-1
cluster-network-addons-operator v4.10.5-1

How reproducible:
100%

Steps to Reproduce:
1. On a cluster with OCP 4.11 and CNV 4.10.z (z>=2), make sure there is no NodeNetworkConfigurationPolicy (NNCP):
$ oc get nncp
No resources found

2. Make sure standalone knmstate is not installed:
$ oc get ns | grep nmstate
$

3. Check the Upgradeable status of HCO:
$ oc get hco -n openshift-cnv kubevirt-hyperconverged -ojsonpath={.status.conditions} | jq
[
  ...
  {
    "lastTransitionTime": "2022-09-13T14:54:19Z",
    "message": "NetworkAddonsConfig is not upgradeable: NMState deployment is not supported by CNAO anymore, please install Kubernetes NMState Operator",
    "observedGeneration": 2,
    "reason": "NetworkAddonsConfigNotUpgradeable",
    "status": "False",
    "type": "Upgradeable"
  }
]

<BUG> Upgradeable should be "True"
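For reference, the expected blocking rule from the description can be written as a small predicate. This is only a sketch of the intended behavior as stated above, not the operator's actual code; the function name and parameters are hypothetical.

```python
def upgrade_should_be_blocked(nncp_count: int, standalone_nmstate_installed: bool) -> bool:
    """Expected rule from the bug description (hypothetical helper, not CNAO code).

    The 4.10 -> 4.11 upgrade should be blocked only when knmstate is in use
    (at least one NNCP exists) and standalone knmstate is not installed yet.
    """
    return nncp_count > 0 and not standalone_nmstate_installed


if __name__ == "__main__":
    # Enumerate the four cases; only "NNCPs present, no standalone knmstate" blocks.
    for nncp_count in (0, 1):
        for installed in (False, True):
            blocked = upgrade_should_be_blocked(nncp_count, installed)
            print(f"nncps={nncp_count} standalone_installed={installed} blocked={blocked}")
```

The bug reported here corresponds to the first case (no NNCPs, no standalone knmstate), which is blocked today although the rule says it should not be.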
Upstream fix https://github.com/kubevirt/cluster-network-addons-operator/pull/1415
The workaround is to remove kubernetes-nmstate before the upgrade with the following command:
$ oc annotate --overwrite -n openshift-cnv hco kubevirt-hyperconverged 'networkaddonsconfigs.kubevirt.io/jsonpatch=[{"op": "replace","path": "/spec/nmstate", "value": null}]'
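Because the annotation value is a JSON Patch document embedded in a shell argument, quoting mistakes are easy to make. The sketch below assembles the same command from structured data so that the JSON is generated rather than typed by hand. The helper name is hypothetical; the annotation key, namespace, resource name, and patch content are the ones from the workaround above.

```python
import json
import shlex

# Annotation key taken from the workaround command in this bug.
ANNOTATION_KEY = "networkaddonsconfigs.kubevirt.io/jsonpatch"


def build_annotate_command(namespace: str = "openshift-cnv",
                           name: str = "kubevirt-hyperconverged") -> str:
    """Return the workaround `oc annotate` command as a shell-quoted string.

    Hypothetical helper: it only builds the string and does not call `oc`.
    """
    # JSON Patch that sets /spec/nmstate to null in the NetworkAddonsConfig.
    patch = [{"op": "replace", "path": "/spec/nmstate", "value": None}]
    annotation = f"{ANNOTATION_KEY}={json.dumps(patch)}"
    args = ["oc", "annotate", "--overwrite", "-n", namespace, "hco", name, annotation]
    return shlex.join(args)


if __name__ == "__main__":
    print(build_annotate_command())
```

`json.dumps` renders Python `None` as JSON `null`, and `shlex.join` adds the single quotes the shell needs around the annotation argument.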
Yossi, would you please verify that the draft of a workaround in suggested release note is correct? We'd like to attach it to 4.11 release notes, so people can get over the upgrade blocker.
> Yossi, would you please verify that the draft of a workaround in suggested release note is correct? We'd like to attach it to 4.11 release notes, so people can get over the upgrade blocker.

I've tested, and adding the annotation does result in the Upgradeable condition being set to True.
However, it also adds a new status condition to the CNV HCO resource:
  {
    "lastTransitionTime": "2022-09-15T10:14:48Z",
    "message": "Unsupported feature was activated via an HCO annotation",
    "observedGeneration": 2,
    "reason": "UnsupportedFeatureAnnotation",
    "status": "True",
    "type": "TaintedConfiguration"
  }

Is this valid?
IMHO yes. This is the expected behavior. We should be very clear in documenting this workaround:
 - if you use NodeNetworkConfigurationPolicies, you must install the standalone kubernetes-nmstate before you upgrade to CNV-4.11
 - if you don't use NNCPs in your CNV-4.10 and don't plan to use them soon, apply the patch from https://bugzilla.redhat.com/show_bug.cgi?id=2126537#c2, upgrade CNV, and then undo the patch. While the patch is applied, HCO will show TaintedConfiguration.
Thanks Dan. I updated the suggested release note accordingly.
@ysegev I have added this known issue to the 4.11 release notes: https://github.com/openshift/openshift-docs/pull/50464. Please let me know if you have any comments from QE perspective. Thank you.
Reviewed and left a comment.
Waiting to get to errata
Verified on
OCP 4.11.0
CNV 4.10.6
knmstate v4.10.6-3
HCO v4.10.6-4
cluster-network-addons-operator v4.10.6-3

1. Make sure standalone knmstate is not installed:
$ oc get ns | grep nmstate
$

2. Make sure there is no NodeNetworkConfigurationPolicy (NNCP):
$ oc get nodenetworkconfigurationpolicy
No resources found

3. Check the Upgradeable status of HCO:
$ oc get hco -n openshift-cnv kubevirt-hyperconverged -ojsonpath={.status.conditions} | jq
[
  ...
  {
    "lastTransitionTime": "2022-10-19T13:15:22Z",
    "message": "Reconcile completed successfully",
    "observedGeneration": 2,
    "reason": "ReconcileCompleted",
    "status": "True",
    "type": "Upgradeable"
  }
]
Upgrade is possible.

4. Create a basic NNCP:
$ cat << EOF | oc apply -f -
> apiVersion: nmstate.io/v1
> kind: NodeNetworkConfigurationPolicy
> metadata:
>   name: eth-nncp
> spec:
>   desiredState:
>     interfaces:
>     - name: ens3f1
>       state: down
>       type: ethernet
>   nodeSelector:
>     node-role.kubernetes.io/worker: "cnvqe-11.lab.eng.tlv2.redhat.com"
> EOF
nodenetworkconfigurationpolicy.nmstate.io/eth-nncp created

5. Verify NNCP exists:
$ oc get nncp
NAME       STATUS
eth-nncp   Available

6. Verify upgrade is now blocked (due to the existence of the NNCP):
$ oc get hco -n openshift-cnv kubevirt-hyperconverged -ojsonpath={.status.conditions} | jq
[
  ...
  {
    "lastTransitionTime": "2022-10-19T13:28:47Z",
    "message": "NetworkAddonsConfig is not upgradeable: NMState deployment is not supported by CNAO anymore, please install Kubernetes NMState Operator",
    "observedGeneration": 2,
    "reason": "NetworkAddonsConfigNotUpgradeable",
    "status": "False",
    "type": "Upgradeable"
  }
]

7. Revert the setup of the NNCP, to avoid leaving a "dirty" cluster (in my case, setting the interface state back to UP):
$ cat << EOF | oc apply -f -
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: eth-nncp
spec:
  desiredState:
    interfaces:
    - name: ens3f1
      state: up
      type: ethernet
  nodeSelector:
    node-role.kubernetes.io/worker: "cnvqe-11.lab.eng.tlv2.redhat.com"
EOF
nodenetworkconfigurationpolicy.nmstate.io/eth-nncp configured

8. Delete the NNCP, and verify there are no NNCPs left in the cluster:
$ oc delete nncp eth-nncp
nodenetworkconfigurationpolicy.nmstate.io "eth-nncp" deleted
$ oc get nodenetworkconfigurationpolicy
No resources found

9. Verify the cluster is upgradeable again:
$ oc get hco -n openshift-cnv kubevirt-hyperconverged -ojsonpath={.status.conditions} | jq
[
  ...
  {
    "lastTransitionTime": "2022-10-19T13:32:04Z",
    "message": "Reconcile completed successfully",
    "observedGeneration": 2,
    "reason": "ReconcileCompleted",
    "status": "True",
    "type": "Upgradeable"
  }
]
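The repeated manual check of the Upgradeable condition in the steps above could be scripted. The following is a minimal sketch that assumes the JSON array printed by the `oc get hco ... -ojsonpath={.status.conditions}` command has been captured as a string. The function names are hypothetical, and the sample conditions are copied from the outputs in this bug.

```python
import json
from typing import Optional


def find_condition(conditions: list, cond_type: str) -> Optional[dict]:
    """Return the first condition whose "type" matches, or None if absent."""
    for cond in conditions:
        if cond.get("type") == cond_type:
            return cond
    return None


def is_upgradeable(conditions_json: str) -> bool:
    """True only if an Upgradeable condition exists and its status is "True"."""
    cond = find_condition(json.loads(conditions_json), "Upgradeable")
    return cond is not None and cond.get("status") == "True"


if __name__ == "__main__":
    # Sample condition taken from step 6 above (upgrade blocked by the NNCP).
    blocked = json.dumps([{
        "reason": "NetworkAddonsConfigNotUpgradeable",
        "status": "False",
        "type": "Upgradeable",
    }])
    # Sample condition taken from step 9 above (NNCP deleted).
    ok = json.dumps([{
        "reason": "ReconcileCompleted",
        "status": "True",
        "type": "Upgradeable",
    }])
    print(is_upgradeable(blocked), is_upgradeable(ok))
```

Note that the condition status is the string "True" or "False", not a JSON boolean, so the comparison is done against the string. The same `find_condition` helper can look up the TaintedConfiguration condition when the workaround annotation is applied.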
4.10.6 has been shipped live a while back. Cleaning up.