Description of problem:

Two features are currently not available in Kubernetes:

* Disk hotplug
* Attaching container images as volumes

KubeVirt therefore implements these features on top of Kubernetes. To achieve that, virt-handler bind-mounts certain files or devices from the source container or pod into the VM container. This works well and reliably from a VM perspective. However, it introduces management challenges for us which, in the worst case, can cause issues for users and customers.

The problem becomes clear when looking at the shutdown flow of VMs. When a shutdown occurs, the following happens (step 3 is the critical step):

(1) The kubelet sends termination signals to all containers.
(2) virt-handler sends more specific shutdown signals to the VM itself (if virt-handler is still running).
(3) Finally, after the VM is down, virt-handler unmounts containerdisks and hotplugged disks.

If virt-handler is down in step (3), the unmount never happens. As a consequence, the kubelet is never able to fully clean up the containers.

Reasons why virt-handler may be gone:

* node selector changes on where to run VMs (likely to happen more often with customers)
* the kubelet decides to evict virt-handler repeatedly
* bugs in virt-handler

As a consequence, with containerdisks, the pods are successfully deleted, but the image layers can't be removed and stay mounted. This leads to increased disk usage and error messages in the kubelet, and it prevents image cleanups, e.g. to reclaim space.

For hotplug, in addition to blocking image layers, we also block the final unmount of PVC resources (block devices and filesystems). They are never unmounted on the system, which leads to immediate problems with RWO and RWOP PVCs, but can also lead to problems with RWX.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Start a VM with a containerdisk
2. Remove virt-handler from the node (e.g. by excluding the node as a target for VMs in the KubeVirt CR)
3. Delete the pod which contains the VM
4. Watch the kubelet logs, which will now endlessly report that it can't unmount the container layer

Actual results:

If virt-handler is not on the node, the pods are successfully removed for containerdisks, but the image layers are never unmounted and can pile up. For hotplug, PVCs are potentially blocked in addition.

Expected results:

* For as long as the VM is running, the kubelet must not be able to unmount the image layer and disks (this is currently the case, and this property needs to be kept)
* As soon as the VM dies, the kubelet should no longer be blocked from cleaning up

One possibility would be to use file descriptors instead of mounts. For as long as there is an open file descriptor on a resource, it blocks unmounts, but as soon as the last file descriptor is closed, the unmounts can continue.

Additional info:
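
The file-descriptor idea relies on a kernel property that can be illustrated without root privileges: a resource referenced by an open file descriptor stays alive until the last descriptor is closed, and the kernel closes all descriptors of a process when it exits, so no separate cleanup agent is needed. The sketch below is only an analogy under stated assumptions. It does not touch mounts or KubeVirt code; it shows the same lifetime rule with an unlinked temporary file on a POSIX system. The function name `read_after_unlink` is hypothetical and exists only for this illustration.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> tuple:
    """Write payload to a temp file, unlink it, then read it back via the fd.

    Returns (data_read_via_fd, path_still_exists).
    Hypothetical helper for illustration only; assumes POSIX semantics.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        # Remove the name. The open descriptor keeps the underlying inode alive,
        # much like an open fd keeps a filesystem busy and blocks its unmount.
        os.unlink(path)
        path_exists = os.path.exists(path)
        # The data is still reachable through the descriptor.
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))
    finally:
        # Closing the last descriptor releases the resource. If the process
        # died instead, the kernel would close the descriptor for it.
        os.close(fd)
    return data, path_exists


if __name__ == "__main__":
    data, exists = read_after_unlink(b"disk image bytes")
    print(data, exists)
```

Applied to this bug, the hope would be that the VM process itself holds the reference, so the resource is released when the VM dies, whether or not virt-handler is still running on the node. Whether containerdisks and hotplugged PVCs can actually be passed this way is an open design question and not something this sketch demonstrates.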