Bug 1745998
Summary: VMs and DVs (user-data) is getting deleted during HCO uninstall
Product: Container Native Virtualization (CNV)
Component: Virtualization
Version: 2.1.0
Status: CLOSED ERRATA
Severity: high
Priority: high
Target Release: 2.3.0
Hardware: Unspecified
OS: Unspecified
Reporter: Asher Shoshan <ashoshan>
Assignee: Roman Mohr <rmohr>
QA Contact: zhe peng <zpeng>
CC: cnv-qe-bugs, ipinto, msluiter, rmohr, sgordon, sgott, stirabos
Fixed In Version: hco-bundle-registry-container-v2.2.0-445 virt-operator-container-v2.3.0-33
Last Closed: 2020-05-04 19:10:36 UTC
Type: Bug

Description
Asher Shoshan
2019-08-27 12:34:45 UTC
It's a design.
But I wonder if we should add some safety measures here.

I.e. Only permit to delete HCO CR
- (or KubeVirt CR) if there are no VMs.
- if there are no DataVolumes.

Steve, thoughts?

Kubevirt cr is protected (with finalizer), when trying to delete it. But on the other hand, it's so easy to delete it by deleting its owner.

In regards to Comment #2, does "owner" refer to the account that created the KubeVirt CR? The operator (in this case HCO) that created it?

> Kubevirt cr is protected (with finalizer), when trying to delete it
The finalizer on KubeVirt's CR is (currently) only used for ensuring that all KubeVirt components are deleted before the CR is deleted. It does not prevent deletion.

(In reply to Fabian Deutsch from comment #1)
> It's a design.
> But I wonder if we should add some safety measures here.
>
> I.e. Only permit to delete HCO CR
> - (or KubeVirt CR) if there are no VMs.
> - if there are no DataVolumes.
>
> Steve, thoughts?

I do think we need to protect the user, but I also think this might be a bit painful for the case where they really want to remove it and now have to manually go and clean up all the objects first. Is there a way to basically make them do a --force or "Are you really sure?" in all cases?

Federico had a good point: Only an operator can uninstall components. And it's as easy to uninstall OCS as it is to uninstall CNV. To me these two points are strong enough to say that we can defer a pragmatic solution to 2.1.1.

A pragmatic solution: KubeVirt cannot be uninstalled as long as any VM or VMI is defined.

The solution to this is likely a 2-part fix. virt-operator does not own the CR, so attempting to prevent deletion is tricky. However, KubeVirt can definitely check for the existence of VMs/DVs and create a condition. This would make for a cleaner API and separation of responsibility. Whatever entity created the KubeVirt CR can monitor for that condition and act accordingly (or optionally ignore it).

It's not the CR (kind kubevirt). When this CR is deleted, all owned resources are cascade-deleted (virt-handler, virt-api, virt-controller). Resources created by the user, such as VMs and VMIs, are not owned by this CR. virt-operator explicitly deletes all user-created resources when the kubevirt-kind CR is deleted. Why? (I can still work with the VM once virt-handler, virt-api, etc. are recreated.)

We recently merged https://github.com/kubevirt/kubevirt/pull/2976 in kubevirt/kubevirt. It allows setting a new field (spec.uninstallStrategy) in the KubeVirt CR to the value "BlockUninstallIfWorkloadsExist".
When a user then tries to delete the KubeVirt CR while workloads (VM, VMI, VMIRS) still exist, the deletion is blocked at the webhook level (so no deletion timestamp is set). HCO will have to pick it up and set it explicitly, since we did not want to change the default behaviour upstream yet. Simone, I guess we want to handle the update in HCO in the context of this bugzilla?

Roman, does this require a PR in HCO? Is that already done?

Yes. For kubevirt the PR was done here: https://github.com/kubevirt/hyperconverged-cluster-operator/pull/454

Verified with build:
$ oc version
Client Version: 4.4.0-0.nightly-2020-02-17-022408
Server Version: 4.4.0-0.nightly-2020-03-06-170328
Kubernetes Version: v1.17.1

Steps:
1. Deploy CNV.
2. Create a DV and a VM.
3. Check the kv default value:
$ oc get kv kubevirt-kubevirt-hyperconverged -o yaml
.....
spec:
  uninstallStrategy: BlockUninstallIfWorkloadsExist
.....
4. Try to delete the kv:
Error from server: admission webhook "kubevirt-validator.kubevirt.io" denied the request: Rejecting the uninstall request, since there are still Virtual Machine Instances present. Either delete all KubeVirt related workloads or change the uninstall strategy before uninstalling KubeVirt.
5. Change the strategy to:
spec:
  uninstallStrategy: RemoveWorkloads
6. Delete the kv again:
$ oc delete kv kubevirt-kubevirt-hyperconverged
kubevirt.kubevirt.io "kubevirt-kubevirt-hyperconverged" deleted
7. Check that the VM is removed:
$ oc get vm
No resources found in openshift-cnv namespace.

Moving to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2020:2011
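
For readers who want the blocking behaviour from the comments above in one place, here is a minimal sketch of the decision as this report describes it. It is not KubeVirt's webhook code, and it is written in Python only for brevity. The two strategy values and the wording of the denial message come from the report. The function name `review_kubevirt_delete`, the `Workloads` type, the order in which workload kinds are checked, the message text for kinds other than Virtual Machine Instances, and the treatment of an unset strategy are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Strategy values as they appear in the report.
BLOCK = "BlockUninstallIfWorkloadsExist"
REMOVE = "RemoveWorkloads"


@dataclass
class Workloads:
    """Counts of user workloads in the cluster (hypothetical container type)."""
    vmis: int = 0
    vms: int = 0
    vmirs: int = 0


def review_kubevirt_delete(strategy: Optional[str], w: Workloads) -> Tuple[bool, str]:
    """Return (allowed, message) for a DELETE request on the KubeVirt CR."""
    # Assumption: any strategy other than BLOCK (including an unset one)
    # allows the delete, since the report says the upstream default
    # behaviour was left unchanged.
    if strategy != BLOCK:
        return True, ""
    # Assumption: the kinds are checked in this order; the report only
    # shows the message for Virtual Machine Instances.
    for kind, count in (
        ("Virtual Machine Instances", w.vmis),
        ("Virtual Machines", w.vms),
        ("Virtual Machine Instance Replica Sets", w.vmirs),
    ):
        if count > 0:
            return False, (
                f"Rejecting the uninstall request, since there are still {kind} present. "
                "Either delete all KubeVirt related workloads or change the uninstall "
                "strategy before uninstalling KubeVirt."
            )
    return True, ""


if __name__ == "__main__":
    # Mirrors the verification steps: blocked first, allowed after the
    # strategy is changed to RemoveWorkloads.
    print(review_kubevirt_delete(BLOCK, Workloads(vmis=1)))
    print(review_kubevirt_delete(REMOVE, Workloads(vmis=1)))
```

Because the check sits in admission rather than in a finalizer, a denied request leaves the CR untouched, which matches the "no deletion timestamp set" remark in the thread.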