Description of problem:
On a cluster using the Trident CSI driver we encountered a failure during the smart-clone flow. A snapshot was created successfully but the restore is failing, which leaves the DV stuck in Pending state. When the DV is removed, the cdi-tmp PVC in the source namespace is not cleaned up.

Version-Release number of selected component (if applicable):
4.8.1

How reproducible:
100% when cloning fails

Steps to Reproduce:
1. Create a cross-namespace clone DV in an environment where snapshot restore is failing
2. Delete the DV

Actual results:
The cdi-tmp PVC in the source namespace is not cleaned up

Expected results:
The cdi-tmp PVC should also be removed.

Additional info:
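For illustration only, a cross-namespace clone DV of the shape that triggers this flow looks roughly like the following; the namespaces, PVC name, and storage class are placeholders, not taken from the affected cluster:

$ cat dv-cross-ns-clone.yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  name: target-dv
  namespace: target-ns
spec:
  source:
    pvc:
      namespace: source-ns
      name: source-pvc
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 30Gi
    storageClassName: <trident-backed-storage-class>

When the driver supports snapshots, CDI attempts a smart clone of source-pvc, which is where the temporary cdi-tmp artifacts in the source namespace come from.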
Hi Michael, do we have any update on this bug?
Sorry for the delay on this. The PVC in question should get deleted in this loop:
https://github.com/kubevirt/containerized-data-importer/blob/main/pkg/controller/datavolume-controller.go#L1196
My guess is that in this specific situation it was hanging around because of a finalizer added by the storage provisioner.
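If this reproduces again, one way to check the finalizer theory is to inspect the stuck cdi-tmp PVC directly; the namespace and PVC name below are placeholders:

$ oc get pvc -n <source-namespace>
$ oc get pvc <cdi-tmp-pvc-name> -n <source-namespace> -o jsonpath='{.metadata.finalizers}'
$ oc describe pvc <cdi-tmp-pvc-name> -n <source-namespace>

A provisioner-owned finalizer left on the PVC (in addition to the usual kubernetes.io/pvc-protection) would explain why it survives the DV deletion loop linked above.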
Based on the above comments, moving the bug to ON_QA.
Tried reproducing this with CNV v4.8.4-35 using OCS and everything works as expected.

$ oc create -f dv-src.yaml
datavolume.cdi.kubevirt.io/cirros-dv created

$ oc get dv -A
NAMESPACE                             NAME        PHASE       PROGRESS   RESTARTS   AGE
openshift-virtualization-os-images    cirros-dv   Succeeded   100.0%                87s

$ oc create -f vm-target.yaml
virtualmachine.kubevirt.io/vm-dv-clone created

$ oc delete volumesnapshot -n openshift-virtualization-os-images $(oc get volumesnapshot -A | grep cdi)
volumesnapshot.snapshot.storage.k8s.io "cdi-tmp-9acd9c18-112a-44de-b554-64966edeccca" deleted
Error from server (NotFound): volumesnapshots.snapshot.storage.k8s.io "openshift-virtualization-os-images" not found
Error from server (NotFound): volumesnapshots.snapshot.storage.k8s.io "false" not found
Error from server (NotFound): volumesnapshots.snapshot.storage.k8s.io "cirros-dv" not found
Error from server (NotFound): volumesnapshots.snapshot.storage.k8s.io "ocs-storagecluster-rbdplugin-snapclass" not found
Error from server (NotFound): volumesnapshots.snapshot.storage.k8s.io "snapcontent-8f2a3a7b-7a82-47c7-85a4-9e471d127dcd" not found
Error from server (NotFound): volumesnapshots.snapshot.storage.k8s.io "1s" not found

$ oc get volumesnapshot -A
No resources found

$ oc get dv -A
NAMESPACE                             NAME                 PHASE                             PROGRESS   RESTARTS   AGE
default                               target-dv-vm-clone   SnapshotForSmartCloneInProgress                         94s
openshift-virtualization-os-images    cirros-dv            Succeeded                         100.0%                 15m

No PVC associated with target VM:

$ oc get pvc -A
NAMESPACE                             NAME                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
openshift-storage                     db-noobaa-db-pg-0             Bound    pvc-f15d2b49-08fc-4dea-bac2-7787350ad56a   50Gi       RWO            ocs-storagecluster-ceph-rbd   5d23h
openshift-storage                     ocs-deviceset-0-data-09bnq2   Bound    local-pv-94284990                          70Gi       RWO            local-block                   5d23h
openshift-storage                     ocs-deviceset-1-data-0bp2nm   Bound    local-pv-b4007a1c                          70Gi       RWO            local-block                   5d23h
openshift-storage                     ocs-deviceset-2-data-0rkswd   Bound    local-pv-c1f38162                          70Gi       RWO            local-block                   5d23h
openshift-virtualization-os-images    cirros-dv                     Bound    pvc-94e8f4e8-5157-4219-a155-f0f33f25db85   30Gi       RWO            ocs-storagecluster-ceph-rbd   16m

$ oc get vmi -A
No resources found

$ oc get vm -A
NAMESPACE   NAME          AGE     VOLUME
default     vm-dv-clone   3m48s

$ virtctl start vm-dv-clone
Error starting VirtualMachine Operation cannot be fulfilled on virtualmachine.kubevirt.io "vm-dv-clone": Always does not support manual start requests

$ oc delete vm vm-dv-clone
virtualmachine.kubevirt.io "vm-dv-clone" deleted

$ oc get vm -A
No resources found

$ oc get dv -A
NAMESPACE                             NAME        PHASE       PROGRESS   RESTARTS   AGE
openshift-virtualization-os-images    cirros-dv   Succeeded   100.0%                21m

$ oc delete dv -n openshift-virtualization-os-images cirros-dv
datavolume.cdi.kubevirt.io "cirros-dv" deleted

No PVC leftovers:

$ oc get pvc -A
NAMESPACE           NAME                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
openshift-storage   db-noobaa-db-pg-0             Bound    pvc-f15d2b49-08fc-4dea-bac2-7787350ad56a   50Gi       RWO            ocs-storagecluster-ceph-rbd   6d
openshift-storage   ocs-deviceset-0-data-09bnq2   Bound    local-pv-94284990                          70Gi       RWO            local-block                   6d
openshift-storage   ocs-deviceset-1-data-0bp2nm   Bound    local-pv-b4007a1c                          70Gi       RWO            local-block                   6d
openshift-storage   ocs-deviceset-2-data-0rkswd   Bound    local-pv-c1f38162                          70Gi       RWO            local-block                   6d

Source DV:

$ cat dv-src.yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  namespace: "openshift-virtualization-os-images"
  name: cirros-dv
spec:
  source:
    http:
      url: "http://.../cirros-0.4.0-x86_64-disk.qcow2"
      secretRef: ""
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 30Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block

Target VM:

$ cat vm-target.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/vm: vm-dv-clone
  namespace: "default"
  name: vm-dv-clone
spec:
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-dv-clone
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: root-disk
        resources:
          requests:
            memory: 64M
      volumes:
      - dataVolume:
          name: target-dv-vm-clone
        name: root-disk
  dataVolumeTemplates:
  - metadata:
      name: target-dv-vm-clone
    spec:
      pvc:
        storageClassName: ocs-storagecluster-ceph-rbd
        volumeMode: Block
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 32Gi
      source:
        pvc:
          namespace: "openshift-virtualization-os-images"
          name: "cirros-dv"
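As an additional check on the original symptom, one can also look specifically for leftover cdi-tmp PVCs in the source namespace after deleting the DV; the namespace below matches the setup above and no output is expected once cleanup succeeds:

$ oc get pvc -n openshift-virtualization-os-images | grep cdi-tmp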
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 4.8.4 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:0213