Description of problem:
When deleting the HCO CR, which in turn deletes the KubeVirt CR, some of its underlying components remain in the cluster:
* virt-controller Deployment (with two replicas)
* virt-api Deployment (with two replicas)
* virt-handler DaemonSet

Version-Release number of selected component (if applicable):
index-image bundle: registry-proxy.engineering.redhat.com/rh-osbs/iib:8258
HCO version: registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-hyperconverged-cluster-operator:v2.5.0-31
Kubevirt Version: D/S registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-virt-operator:v2.5.0-41
Based on upstream 0.33.0-rc.0-58-gbfe9df0

How reproducible:
100%

Steps to Reproduce:
1. Deploy CNV 2.5.0 using the above index image
2. Apply the HCO CR
3. Wait for CNV to stabilize and for the CSV phase to be Succeeded
4. Delete the HCO CR

Actual results:
The mentioned components remain in the cluster after OpenShift Virtualization removal (deleting the HCO CR and uninstalling the operator: CSV + subscription).

Expected results:
The openshift-cnv namespace should contain no pods, deployments, or daemonsets after CNV removal.
Additional info:
-----------------
#INITIAL CONDITION - CNV DEPLOYED:

$ oc get kv
NAME                               AGE     PHASE
kubevirt-kubevirt-hyperconverged   4h39m   Deployed

$ oc get deployments
NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
cdi-apiserver                        1/1     1            1           4h39m
cdi-deployment                       1/1     1            1           4h39m
cdi-operator                         1/1     1            1           4h42m
cdi-uploadproxy                      1/1     1            1           4h39m
cluster-network-addons-operator      1/1     1            1           4h42m
hco-operator                         1/1     1            1           4h42m
hostpath-provisioner-operator        1/1     1            1           4h42m
kubemacpool-mac-controller-manager   1/1     1            1           4h39m
kubevirt-ssp-operator                1/1     1            1           4h42m
nmstate-webhook                      2/2     2            2           4h39m
node-maintenance-operator            1/1     1            1           4h42m
virt-api                             2/2     2            2           4h53m
virt-controller                      2/2     2            2           4h52m
virt-operator                        2/2     2            2           4h42m
virt-template-validator              2/2     2            2           4h39m
vm-import-controller                 1/1     1            1           4h25m
vm-import-operator                   1/1     1            1           4h42m

$ oc get daemonsets
NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                   AGE
bridge-marker                  6         6         6       6            6           beta.kubernetes.io/arch=amd64   4h39m
kube-cni-linux-bridge-plugin   6         6         6       6            6           beta.kubernetes.io/arch=amd64   4h39m
kubevirt-node-labeller         3         3         3       3            3           <none>                          4h25m
nmstate-handler                6         6         6       6            6           beta.kubernetes.io/arch=amd64   4h39m
ovs-cni-amd64                  6         6         6       6            6           beta.kubernetes.io/arch=amd64   4h39m
virt-handler                   3         3         3       3            3           <none>                          4h52m

------------------------
REMOVING HCO CR:

$ oc delete hco kubevirt-hyperconverged
hyperconverged.hco.kubevirt.io "kubevirt-hyperconverged" deleted

$ oc get kv
No resources found in openshift-cnv namespace.

$ oc delete csv --all
clusterserviceversion.operators.coreos.com "kubevirt-hyperconverged-operator.v2.5.0" deleted

$ oc delete subs --all
subscription.operators.coreos.com "kubevirt-hyperconverged" deleted

-----------------------------------
RESIDUES:

$ oc get deployments
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
virt-api          2/2     2            2           4h56m
virt-controller   2/2     2            2           4h55m

$ oc get ds
NAME           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
virt-handler   3         3         3       3            3           <none>          4h55m

$ oc get pods
NAME                               READY   STATUS    RESTARTS   AGE
virt-api-597b44646f-cq8nf          1/1     Running   0          4h56m
virt-api-597b44646f-snh2p          1/1     Running   0          4h56m
virt-controller-7c4f8f8c69-24r6f   1/1     Running   4          4h56m
virt-controller-7c4f8f8c69-5n2wp   1/1     Running   5          4h56m
virt-handler-dv5p4                 1/1     Running   0          4h56m
virt-handler-ghdb9                 1/1     Running   0          4h56m
virt-handler-lbrlh                 1/1     Running   0          4h56m
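
The residue check above could be automated roughly as follows. This is only a sketch: the helper name leftover_virt_resources and the prefix list are hypothetical and not part of any CNV tooling, and the function just parses the text of an `oc get` table rather than talking to a cluster.

```python
# Hypothetical helper for illustration only; the prefixes are taken from the
# components this report lists as left behind.
VIRT_PREFIXES = ("virt-api", "virt-controller", "virt-handler")


def leftover_virt_resources(oc_get_output):
    """Return names from `oc get` table output that look like KubeVirt leftovers."""
    lines = [ln for ln in oc_get_output.splitlines() if ln.strip()]
    # "No resources found ..." (or empty output) means nothing is left.
    if not lines or lines[0].startswith("No resources found"):
        return []
    # Skip the header row; the first column of each remaining row is the name.
    names = [ln.split()[0] for ln in lines[1:]]
    return [n for n in names if n.startswith(VIRT_PREFIXES)]


residue = """\
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
virt-api          2/2     2            2           4h56m
virt-controller   2/2     2            2           4h55m
"""
print(leftover_virt_resources(residue))
```

An empty list for the deployments, daemonsets, and pods listings would correspond to the expected result in this report.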
David, do you have an idea why this could be?
Yes. This was a result of the change to use the status subresource. Our operator wasn't properly handling the finalizer after that subresource change, which caused the KubeVirt CR to disappear before uninstallation. This merged PR fixes it: https://github.com/kubevirt/kubevirt/pull/4113
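
To illustrate why a missing finalizer leaves workloads behind, here is a toy in-memory model of Kubernetes finalizer semantics. It is not KubeVirt or virt-operator code: ToyCluster, reconcile, uninstall, and the finalizer name are all hypothetical, and the model only assumes the general rule that an object with finalizers is marked for deletion instead of being removed right away.

```python
class ToyCluster:
    """In-memory stand-in for an API server; not real Kubernetes code."""

    def __init__(self):
        self.objects = {}       # name -> {"finalizers": [...], "deleting": bool}
        self.workloads = set()  # stand-ins for virt-api, virt-controller, virt-handler

    def create(self, name, finalizers=()):
        self.objects[name] = {"finalizers": list(finalizers), "deleting": False}

    def delete(self, name):
        obj = self.objects[name]
        if obj["finalizers"]:
            # A finalizer is present: the object is only marked for deletion.
            obj["deleting"] = True
        else:
            # No finalizer: the object vanishes before anyone can clean up.
            del self.objects[name]

    def remove_finalizer(self, name, finalizer):
        obj = self.objects[name]
        obj["finalizers"].remove(finalizer)
        if obj["deleting"] and not obj["finalizers"]:
            del self.objects[name]


FINALIZER = "example.com/cleanup"  # hypothetical finalizer name


def reconcile(cluster, name):
    """Toy operator loop: tear down workloads for a CR that is marked for deletion."""
    obj = cluster.objects.get(name)
    if obj is None:
        return  # the CR is already gone, so nothing triggers cleanup
    if obj["deleting"]:
        cluster.workloads.clear()
        cluster.remove_finalizer(name, FINALIZER)


def uninstall(with_finalizer):
    cluster = ToyCluster()
    cluster.workloads = {"virt-api", "virt-controller", "virt-handler"}
    cluster.create("kubevirt", finalizers=[FINALIZER] if with_finalizer else [])
    cluster.delete("kubevirt")
    reconcile(cluster, "kubevirt")
    # Remaining workloads, and whether the CR still exists.
    return sorted(cluster.workloads), "kubevirt" in cluster.objects


print(uninstall(with_finalizer=False))
print(uninstall(with_finalizer=True))
```

In this model the run without a finalizer ends with the CR gone and all three workloads still present, which matches the symptom in this report; the run with a finalizer ends with both the CR and the workloads removed.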
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc get kv
NAME                               AGE    PHASE
kubevirt-kubevirt-hyperconverged   6d7h   Deployed
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc get deployments
NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
cdi-operator                         0/1     1            0           6d7h
cluster-network-addons-operator      1/1     1            1           6d7h
hco-operator                         1/1     1            1           6d7h
hostpath-provisioner-operator        1/1     1            1           6d7h
kubemacpool-mac-controller-manager   1/1     1            1           6d7h
kubevirt-ssp-operator                1/1     1            1           6d7h
nmstate-webhook                      2/2     2            2           6d7h
node-maintenance-operator            1/1     1            1           6d7h
virt-api                             2/2     2            2           6d7h
virt-controller                      2/2     2            2           6d7h
virt-operator                        2/2     2            2           6d7h
vm-import-operator                   1/1     1            1           6d7h
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc get daemonsets
NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                   AGE
bridge-marker                  6         6         6       6            6           beta.kubernetes.io/arch=amd64   6d7h
hostpath-provisioner           3         3         3       3            3           <none>                          6d7h
kube-cni-linux-bridge-plugin   6         6         6       6            6           beta.kubernetes.io/arch=amd64   6d7h
kubevirt-node-labeller         3         3         3       3            3           <none>                          5d
nmstate-handler                6         6         6       6            6           beta.kubernetes.io/arch=amd64   6d7h
ovs-cni-amd64                  6         6         6       6            6           beta.kubernetes.io/arch=amd64   6d7h
virt-handler                   3         3         3       3            3           <none>                          6d7h
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc delete hco kubevirt-hyperconverged
hyperconverged.hco.kubevirt.io "kubevirt-hyperconverged" deleted
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc get kv
No resources found in openshift-cnv namespace.
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc delete csv --all
clusterserviceversion.operators.coreos.com "kubevirt-hyperconverged-operator.v2.5.0" deleted
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc delete subs --all
subscription.operators.coreos.com "hco-operatorhub" deleted
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc get deployments
No resources found in openshift-cnv namespace.
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc get ds
NAME                   DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
hostpath-provisioner   3         3         3       3            3           <none>          6d7h
(cnv-tests) [kbidarka@kbidarka-host cnv-tests]$ oc get pods
NAME                         READY   STATUS    RESTARTS   AGE
hostpath-provisioner-4f9c4   1/1     Running   0          6d7h
hostpath-provisioner-mj5kf   1/1     Running   0          6d7h
hostpath-provisioner-qvtkg   1/1     Running   0          6d7h

------------------------
Deleting the HCO CR was successful.
After CNV removal, the openshift-cnv namespace contains no virt pods, deployments, or daemonsets.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 2.5.0 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:5127
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days