Description of problem: Deleting the hyperconverged-cluster CR of kind HyperConverged causes cascade deletion of the kubevirt-hyperconverged-cluster CR and all meta-operators (virt-handler, virt-api, ...), and also deletes all currently running VMs.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce: 1. 2. 3.
Actual results:
Expected results: Should the VMs be kept alive and accessible? Or is this the action of removing the whole KubeVirt product and its by-products?
Additional info:
It's a design. But I wonder if we should add some safety measures here.
I.e. Only permit to delete HCO CR
- (or KubeVirt CR) if there are no VMs.
- if there are no DataVolumes.
Steve, thoughts?
Kubevirt cr is protected (with finalizer), when trying to delete it; but on the other hand, it's so easy to delete it by deleting its owner.
Regarding Comment #2, does "owner" refer to the account that created the KubeVirt CR, or to the operator (in this case HCO) that created it?
> Kubevirt cr is protected (with finalizer), when trying to delete it
The finalizer on KubeVirt's CR is (currently) only used for ensuring that all KubeVirt components are deleted before the CR is deleted. It does not prevent deletion.
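To illustrate why a finalizer delays deletion rather than preventing it, here is a minimal toy model in Python. It is only a sketch of the general Kubernetes finalizer behaviour, not KubeVirt code: `ToyApiServer`, `Resource`, and the finalizer name `example.io/cleanup` are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Resource:
    name: str
    finalizers: List[str] = field(default_factory=list)
    deletion_timestamp: Optional[str] = None


class ToyApiServer:
    """Hypothetical in-memory stand-in for the API server's delete handling."""

    def __init__(self) -> None:
        self.objects: Dict[str, Resource] = {}

    def create(self, obj: Resource) -> None:
        self.objects[obj.name] = obj

    def delete(self, name: str) -> None:
        obj = self.objects[name]
        if obj.finalizers:
            # The delete request is accepted, not rejected: the object is
            # only marked as terminating until its finalizers are cleared.
            obj.deletion_timestamp = "2020-01-01T00:00:00Z"  # placeholder value
        else:
            del self.objects[name]

    def remove_finalizer(self, name: str, finalizer: str) -> None:
        obj = self.objects[name]
        obj.finalizers.remove(finalizer)
        # Once a terminating object has no finalizers left, it disappears.
        if obj.deletion_timestamp is not None and not obj.finalizers:
            del self.objects[name]


api = ToyApiServer()
api.create(Resource("kubevirt", finalizers=["example.io/cleanup"]))

api.delete("kubevirt")
terminating = api.objects["kubevirt"].deletion_timestamp is not None

# A controller finishes its cleanup and then drops the finalizer.
api.remove_finalizer("kubevirt", "example.io/cleanup")
gone = "kubevirt" not in api.objects
```

In this model the delete call never fails; the object just stays around in a terminating state until the cleanup step removes the finalizer.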
(In reply to Fabian Deutsch from comment #1)
> It's a design.
> But I wonder if we should add some safety measures here.
>
> I.e. Only permit to delete HCO CR
> - (or KubeVirt CR) if there are no VMs.
> - if there are no DataVolumes.
>
> Steve, thoughts?
I do think we need to protect the user, but I also think this might be a bit painful in the case where they really want to remove it and now have to manually go and clean up all the objects first. Is there a way to basically make them do a --force or "Are you really sure?" in all cases?
Federico had a good point: Only an operator can uninstall components. And it's as easy to uninstall OCS as it is to uninstall CNV. To me these two points are strong enough to say that we can defer a pragmatic solution to 2.1.1. A pragmatic solution: KubeVirt cannot be uninstalled as long as any VM or VMI is defined.
The solution to this is likely a 2-part fix. virt-operator does not own the CR, so attempting to prevent deletion is tricky. However, KubeVirt can definitely check for the existence of VMs/DVs and create a condition. This would make for a cleaner API and separation of responsibility. Whatever entity created the KubeVirt CR can monitor for that condition and act accordingly (or optionally ignore it).
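A rough sketch of that split of responsibility, in Python for brevity. All names here are assumptions for illustration only: the condition type `WorkloadsPresent`, the `Condition` class, and both functions are hypothetical and do not come from KubeVirt or HCO.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical condition type; not an actual KubeVirt condition name.
CONDITION_TYPE = "WorkloadsPresent"


@dataclass
class Condition:
    type: str
    status: str  # "True" or "False", as in Kubernetes status conditions
    message: str


def compute_workload_condition(vm_count: int, dv_count: int) -> Condition:
    """KubeVirt side: report whether user workloads (VMs/DVs) still exist."""
    present = vm_count > 0 or dv_count > 0
    return Condition(
        type=CONDITION_TYPE,
        status="True" if present else "False",
        message=f"{vm_count} VM(s), {dv_count} DataVolume(s) found",
    )


def owner_may_delete(conditions: List[Condition], ignore_condition: bool = False) -> bool:
    """Owner side (e.g. whatever created the KubeVirt CR): act on the condition.

    The owner may also choose to ignore the condition entirely.
    """
    if ignore_condition:
        return True
    for cond in conditions:
        if cond.type == CONDITION_TYPE and cond.status == "True":
            return False
    # No blocking condition reported (this sketch treats "absent" as "safe").
    return True


busy = [compute_workload_condition(vm_count=2, dv_count=1)]
idle = [compute_workload_condition(vm_count=0, dv_count=0)]
```

With these inputs, `owner_may_delete(busy)` is False, `owner_may_delete(idle)` is True, and `owner_may_delete(busy, ignore_condition=True)` is True, which mirrors the "act accordingly (or optionally ignore it)" option.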
It's not the CR (kind KubeVirt). When this CR is deleted, all owned resources (virt-handler, virt-api, virt-controller) are cascade-deleted. Resources created by the user, such as VMs and VMIs, are not owned by this CR. virt-operator explicitly deletes all user-created resources when the KubeVirt-kind CR is deleted. Why? (I can still work with the VM once virt-handler, virt-api, etc. are recreated.)
We just recently merged https://github.com/kubevirt/kubevirt/pull/2976 in kubevirt/kubevirt. It allows setting a new field (spec.uninstallStrategy) in the KubeVirt CR to the value "BlockUninstallIfWorkloadsExist". When a user then tries to delete the KubeVirt CR while workloads (VM, VMI, VMIRS) still exist, the deletion is blocked at the webhook level (so no deletion timestamp is set). HCO will have to pick it up and set it explicitly, since we did not want to change the default behaviour upstream yet. Simone, I guess we want to handle the update in HCO in the context of this bugzilla?
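The decision the webhook makes can be sketched roughly as follows. This is a simplified Python illustration, not the actual KubeVirt implementation (which is written in Go): the function name `validate_uninstall`, the shape of `workload_counts`, and the wording of the returned message are assumptions. The two strategy values are the ones mentioned in this bug; treating an unset field as "RemoveWorkloads" is also an assumption, based on the default behaviour being unchanged.

```python
from typing import Dict, Tuple

BLOCK = "BlockUninstallIfWorkloadsExist"
REMOVE = "RemoveWorkloads"


def validate_uninstall(kubevirt_cr: dict, workload_counts: Dict[str, int]) -> Tuple[bool, str]:
    """Return (allowed, message) for a DELETE request on the KubeVirt CR."""
    # Assumption: an unset strategy keeps the old behaviour (remove workloads).
    strategy = kubevirt_cr.get("spec", {}).get("uninstallStrategy", REMOVE)
    if strategy != BLOCK:
        return True, ""

    # Kinds (e.g. VM, VMI, VMIRS) that still have at least one object.
    remaining = sorted(kind for kind, count in workload_counts.items() if count > 0)
    if remaining:
        # Denied at admission time, so no deletion timestamp is ever set.
        return False, "Rejecting the uninstall request, workloads still present: " + ", ".join(remaining)
    return True, ""


cr = {"spec": {"uninstallStrategy": BLOCK}}
allowed, message = validate_uninstall(cr, {"VM": 1, "VMI": 1, "VMIRS": 0})
```

Here `allowed` is False because VMs and VMIs still exist; with all counts at zero, or with the strategy set to "RemoveWorkloads", the same request would be allowed.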
Roman, does this require a PR in HCO? Is that already done?
Yes. For kubevirt the PR was done here: https://github.com/kubevirt/hyperconverged-cluster-operator/pull/454
Verified with build:
$ oc version
Client Version: 4.4.0-0.nightly-2020-02-17-022408
Server Version: 4.4.0-0.nightly-2020-03-06-170328
Kubernetes Version: v1.17.1
Steps:
1. Deploy CNV.
2. Create a DV and a VM.
3. Check the kv default value:
$ oc get kv kubevirt-kubevirt-hyperconverged -o yaml
.....
spec:
  uninstallStrategy: BlockUninstallIfWorkloadsExist
.....
4. Try to delete the kv:
Error from server: admission webhook "kubevirt-validator.kubevirt.io" denied the request: Rejecting the uninstall request, since there are still Virtual Machine Instances present. Either delete all KubeVirt related workloads or change the uninstall strategy before uninstalling KubeVirt.
5. Change the strategy to:
spec:
  uninstallStrategy: RemoveWorkloads
6. Delete the kv again:
$ oc delete kv kubevirt-kubevirt-hyperconverged
kubevirt.kubevirt.io "kubevirt-kubevirt-hyperconverged" deleted
7. Check that the VM is removed:
$ oc get vm
No resources found in openshift-cnv namespace.
Moving to verified.
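For anyone repeating the "change strategy" step non-interactively, one possible way is a JSON merge patch. The Python sketch below only builds the patch body and an `oc patch` argument list; it does not run anything. The helper name `build_patch_command` is made up for this example, and using `oc patch` at all is an assumption, since the verification above does not say how the field was changed.

```python
import json
import shlex
from typing import List


def build_patch_command(name: str, namespace: str, strategy: str) -> List[str]:
    """Build an `oc patch` argument list that sets spec.uninstallStrategy."""
    patch = json.dumps({"spec": {"uninstallStrategy": strategy}})
    return ["oc", "patch", "kv", name, "-n", namespace, "--type", "merge", "-p", patch]


cmd = build_patch_command(
    name="kubevirt-kubevirt-hyperconverged",  # CR name from the steps above
    namespace="openshift-cnv",                # namespace from the steps above
    strategy="RemoveWorkloads",
)

# Shell-quoted form, suitable for pasting into a terminal.
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` on a machine where `oc` is logged in to the cluster.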
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:2011