Bug 1648232
| Summary: | CDI importer retry logic won't stop even when datavolume is deleted | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | shiyang.wang <shiywang> |
| Component: | Storage | Assignee: | John Griffith <jgriffith> |
| Status: | CLOSED ERRATA | QA Contact: | shiyang.wang <shiywang> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 1.3 | CC: | alitke, cnv-qe-bugs, ncredi, qixuan.wang, sgordon |
| Target Milestone: | --- | | |
| Target Release: | 1.4 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | v1.4.0-6 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-02-26 13:24:16 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Deleting the DV in this case ("oc delete datavolume datavolume2") does delete the datavolume object, and it also issues the delete call to the PVC as expected, but the PVC will be stuck in a terminating state because it is still attached to the pod that is in the crash-loop (CrashLoopBackOff) or error state. If you then delete the pod, the PVC deletion will complete as well. In the case of pod failures, we leave those objects present in an error state so that they can be debugged.

The pod continuing the retry loop after a DV is deleted should be easy enough to fix. I'll look at adding some logic to the DV controller so that a 'delete datavolume' call will go through and clean up any associated pods and/or PVCs. Let me know if there are any other details I'm missing here. Thanks!

I've submitted a PR that explicitly cleans up pods when a DataVolume is deleted; the upstream PR is here: https://github.com/kubevirt/containerized-data-importer/pull/526

The PR https://github.com/kubevirt/containerized-data-importer/pull/526 has been merged.

Tested with openshift v3.11.59 and CNV v1.4.0 (http://download-node-02.eng.bos.redhat.com/rhel-7/nightly/CNV/CNV-1.4-RHEL-7-20190128.n.0/containers.list); the bug has been fixed, so moving it to VERIFIED, thanks. Here are the verification results.

1. A request with an invalid URL is denied:

```
[root@cnv-executor-qwang-master1 ~]# oc create -f dv.yaml
Error from server: admission webhook "datavolume-create-validator.cdi.kubevirt.io" denied the request: spec.source Invalid source URL: 123aadsfasdsk.img
```

2. With a pod in the retry loop, delete the datavolume; the importer pod and PVC are cleaned up as well:

```
[root@cnv-executor-qwang-master1 ~]# oc get all
NAME                             READY   STATUS    RESTARTS   AGE
pod/importer-datavolume2-zl9g9   1/1     Running   2          6m

NAME                                                             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/glusterfs-dynamic-4ef2bc65-250a-11e9-878f-fa163e29a0b5   ClusterIP   172.30.37.239   <none>        1/TCP     6m

[root@cnv-executor-qwang-master1 ~]# oc logs pod/importer-datavolume2-zl9g9
I0131 03:44:10.042528       1 importer.go:45] Starting importer
I0131 03:44:10.042800       1 importer.go:58] begin import process
I0131 03:44:10.042823       1 importer.go:82] begin import process
I0131 03:44:10.042832       1 dataStream.go:293] copying "https://download.fedoraproject.org/pub/fedora/linux/releases/28/Cloud/x86_64/images/Fedora-Cloud-Base-28-1.1.x86_64.qcow2" to "/data/disk.img"...
I0131 03:44:10.845505       1 prlimit.go:107] ExecWithLimits qemu-img, [info --output=json https://download.fedoraproject.org/pub/fedora/linux/releases/28/Cloud/x86_64/images/Fedora-Cloud-Base-28-1.1.x86_64.qcow2]
I0131 03:44:11.790201       1 prlimit.go:107] ExecWithLimits qemu-img, [convert -p -f qcow2 -O raw json: {"file.driver": "https", "file.url": "https://download.fedoraproject.org/pub/fedora/linux/releases/28/Cloud/x86_64/images/Fedora-Cloud-Base-28-1.1.x86_64.qcow2", "file.timeout": 3600} /data/disk.img]
I0131 03:44:11.804786       1 qemu.go:189] 0.00

[root@cnv-executor-qwang-master1 ~]# oc get dv
NAME          AGE
datavolume2   9m

[root@cnv-executor-qwang-master1 ~]# oc delete dv datavolume2
datavolume.cdi.kubevirt.io "datavolume2" deleted

[root@cnv-executor-qwang-master1 ~]# oc get dv
No resources found.

[root@cnv-executor-qwang-master1 ~]# oc get pvc
No resources found.

[root@cnv-executor-qwang-master1 ~]# oc get all
No resources found.
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0417
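The cleanup approach described in the comments above (delete any importer pods associated with a deleted DataVolume so the PVC can finish terminating) can be illustrated with a minimal Go sketch using client-go. This is not the actual CDI controller code from PR 526; the label key `cdi.example/importPvcName`, the function names, and the use of a recent client-go version (where List/Delete take a context) are all assumptions made for this example.

```go
// Hypothetical sketch only: not the code from PR 526. It shows the general
// list-and-delete pattern for removing importer pods tied to a PVC once the
// owning DataVolume has been deleted.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// cleanupImporterPods deletes any pods in the namespace that were created to
// import data into the named PVC, so that a deleted DataVolume does not leave
// a crash-looping importer pod behind holding the PVC in Terminating.
func cleanupImporterPods(ctx context.Context, client kubernetes.Interface, namespace, pvcName string) error {
	// Assumed label selector; the real CDI controller identifies its importer
	// pods with its own labels/annotations.
	selector := fmt.Sprintf("cdi.example/importPvcName=%s", pvcName)

	pods, err := client.CoreV1().Pods(namespace).List(ctx, metav1.ListOptions{LabelSelector: selector})
	if err != nil {
		return err
	}
	for _, pod := range pods.Items {
		if err := client.CoreV1().Pods(namespace).Delete(ctx, pod.Name, metav1.DeleteOptions{}); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Out-of-cluster config for a quick test; a controller would normally use
	// rest.InClusterConfig() instead.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}
	if err := cleanupImporterPods(context.Background(), client, "default", "datavolume2"); err != nil {
		panic(err)
	}
}
```

Once the importer pods are gone, the PVC stuck in Terminating can be removed, which matches the verified behavior above where `oc get pvc` and `oc get all` return no resources after the DataVolume is deleted.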
Description of problem:

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Create a datavolume whose endpoint URL is invalid, for example:

```json
{
  "kind": "List",
  "apiVersion": "v1",
  "metadata": {},
  "items": [
    {
      "apiVersion": "cdi.kubevirt.io/v1alpha1",
      "kind": "DataVolume",
      "metadata": {
        "name": "datavolume2"
      },
      "spec": {
        "pvc": {
          "accessModes": [
            "ReadWriteOnce"
          ],
          "resources": {
            "requests": {
              "storage": "500Mi"
            }
          }
        },
        "source": {
          "http": {
            "url": "123aadsfasdsk.img"
          }
        }
      }
    }
  ]
}
```

2. The importer pod will crash and keep retrying.
3. Run "oc delete datavolume datavolume2".

Actual results:
The importer pod keeps retrying even though the datavolume no longer exists, and the PVC remains occupied because the importer pod keeps retrying.

Expected results:
The importer pod should be deleted even when the datavolume or PVC no longer exists.

Additional info:
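The verification output above shows the admission webhook `datavolume-create-validator.cdi.kubevirt.io` rejecting the URL used in step 1 ("123aadsfasdsk.img") with "Invalid source URL". Below is a minimal, hypothetical Go sketch of that class of check; the function name `validateSourceURL` and the exact rules are assumptions for illustration, not the CDI webhook's actual code.

```go
// Hypothetical sketch of the kind of source-URL validation reported by the
// webhook in the verification above; not the actual CDI webhook code.
package main

import (
	"fmt"
	"net/url"
)

// validateSourceURL rejects values such as "123aadsfasdsk.img" that are not
// absolute http/https URLs and therefore cannot be fetched by the importer.
func validateSourceURL(raw string) error {
	u, err := url.ParseRequestURI(raw)
	if err != nil {
		return fmt.Errorf("spec.source Invalid source URL: %s", raw)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("spec.source Invalid source URL scheme: %s", raw)
	}
	return nil
}

func main() {
	// Rejected: not an absolute URL.
	fmt.Println(validateSourceURL("123aadsfasdsk.img"))
	// Accepted: a well-formed https URL.
	fmt.Println(validateSourceURL("https://download.fedoraproject.org/pub/fedora/linux/releases/28/Cloud/x86_64/images/Fedora-Cloud-Base-28-1.1.x86_64.qcow2"))
}
```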