Bug 1893528

Summary: [v2v][VM import] Not possible to release target Ceph-RBD/Block space after "partial" disk copy.
Product: Container Native Virtualization (CNV)    Reporter: Ilanit Stein <istein>
Component: Storage    Assignee: Adam Litke <alitke>
Status: CLOSED DUPLICATE    QA Contact: Ying Cui <ycui>
Severity: high    Docs Contact:
Priority: unspecified
Version: 2.5.0    CC: alitke, cnv-qe-bugs, fdupont, ngavrilo
Last Closed: 2020-11-25 13:57:19 UTC Type: Bug

Description Ilanit Stein 2020-11-01 13:26:59 UTC
Description of problem:
When importing from RHV to CNV a VM with a 100GB disk that is larger than the available Ceph-RBD/Block space (50GB), the VM import hangs for some time at 39%, while the cdi importer pod keeps reporting 47%.
There is no notification to the user that there is a problem with the import.

After a couple of minutes, the importer pod failed and began a "New phase" that showed 0% progress.
See below the importer pod log.

After the cdi importer pod enters a "CrashLoopBackOff" state, it goes back to "Running" status.
Eventually, after a couple of "New phase" attempts, the cdi importer pod turns into a "Terminating" state.
The VM import is removed automatically.
 
The Ceph-RBD/Block storage that was populated cannot be released by any means.

Importer pod log:
================
Part of the cdi importer log that we managed to capture:
I1029 16:00:16.785152       1 importer.go:52] Starting importer
I1029 16:00:16.786785       1 importer.go:116] begin import process
I1029 16:00:18.495650       1 http-datasource.go:219] Attempting to get certs from /certs/ca.pem
I1029 16:00:18.546153       1 data-processor.go:302] Calculating available size
I1029 16:00:18.547157       1 data-processor.go:310] Checking out block volume size.
I1029 16:00:18.547170       1 data-processor.go:322] Request image size not empty.
I1029 16:00:18.547199       1 data-processor.go:327] Target size 100Gi.
I1029 16:00:18.547320       1 data-processor.go:224] New phase: TransferDataFile
I1029 16:00:18.548337       1 util.go:161] Writing data...
I1029 16:00:19.547661       1 prometheus.go:69] 0.00
...
I1029 16:02:57.603779       1 prometheus.go:69] 0.01
E1029 16:02:58.540140       1 util.go:163] Unable to write file from dataReader: unexpected EOF
E1029 16:02:58.540307       1 data-processor.go:221] unexpected EOF
unable to write to file
kubevirt.io/containerized-data-importer/pkg/util.StreamDataToFile
	/go/src/kubevirt.io/containerized-data-importer/pkg/util/util.go:165
kubevirt.io/containerized-data-importer/pkg/importer.(*ImageioDataSource).TransferFile
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/imageio-datasource.go:115
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessDataWithPause
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/data-processor.go:191
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessData
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/data-processor.go:153
main.main
	/go/src/kubevirt.io/containerized-data-importer/cmd/cdi-importer/importer.go:171
runtime.main
	/usr/lib/golang/src/runtime/proc.go:203
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1357
Unable to transfer source data to target file
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessDataWithPause
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/data-processor.go:193
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessData
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/data-processor.go:153
main.main
	/go/src/kubevirt.io/containerized-data-importer/cmd/cdi-importer/importer.go:171
runtime.main
	/usr/lib/golang/src/runtime/proc.go:203
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1357
E1029 16:02:58.540491       1 importer.go:173] unexpected EOF
unable to write to file
kubevirt.io/containerized-data-importer/pkg/util.StreamDataToFile
	/go/src/kubevirt.io/containerized-data-importer/pkg/util/util.go:165
kubevirt.io/containerized-data-importer/pkg/importer.(*ImageioDataSource).TransferFile
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/imageio-datasource.go:115
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessDataWithPause
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/data-processor.go:191
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessData
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/data-processor.go:153
main.main
	/go/src/kubevirt.io/containerized-data-importer/cmd/cdi-importer/importer.go:171
runtime.main
	/usr/lib/golang/src/runtime/proc.go:203
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1357
Unable to transfer source data to target file
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessDataWithPause
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/data-processor.go:193
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessData
	/go/src/kubevirt.io/containerized-data-importer/pkg/importer/data-processor.go:153
main.main
	/go/src/kubevirt.io/containerized-data-importer/cmd/cdi-importer/importer.go:171
runtime.main
	/usr/lib/golang/src/runtime/proc.go:203
runtime.goexit
	/usr/lib/golang/src/runtime/asm_amd64.s:1357
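
When scanning a long importer log like the one above, it can help to pick out the first error-severity entry automatically. The following is a minimal sketch, assuming the lines follow the standard klog header layout (severity letter, mmdd, time, thread id, file:line, message). The names `klogLine`, `parseKlog` and `firstError` are hypothetical and not part of CDI.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// klogLine holds the fields of one klog-style header line.
type klogLine struct {
	Severity byte   // 'I', 'W', 'E' or 'F'
	Time     string // hh:mm:ss.uuuuuu
	File     string // source file that emitted the entry
	Line     int    // line number inside File
	Message  string
}

// klogRe matches "Lmmdd hh:mm:ss.uuuuuu threadid file:line] msg".
var klogRe = regexp.MustCompile(
	`^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+) ([^:\s]+):(\d+)\] (.*)$`)

// parseKlog parses one line; it returns false for continuation lines
// such as stack frames or wrapped error text.
func parseKlog(s string) (klogLine, bool) {
	m := klogRe.FindStringSubmatch(s)
	if m == nil {
		return klogLine{}, false
	}
	n, err := strconv.Atoi(m[6])
	if err != nil {
		return klogLine{}, false
	}
	return klogLine{Severity: m[1][0], Time: m[3], File: m[5], Line: n, Message: m[7]}, true
}

// firstError returns the first entry logged with severity 'E', if any.
func firstError(lines []string) (klogLine, bool) {
	for _, s := range lines {
		if l, ok := parseKlog(s); ok && l.Severity == 'E' {
			return l, true
		}
	}
	return klogLine{}, false
}

func main() {
	lines := []string{
		"I1029 16:00:18.548337       1 util.go:161] Writing data...",
		"I1029 16:02:57.603779       1 prometheus.go:69] 0.01",
		"E1029 16:02:58.540140       1 util.go:163] Unable to write file from dataReader: unexpected EOF",
		"unable to write to file",
	}
	if e, ok := firstError(lines); ok {
		fmt.Printf("%s %s:%d %s\n", e.Time, e.File, e.Line, e.Message)
	}
}
```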
 
Version-Release number of selected component (if applicable):
CNV-2.5: iib-22696 hco-v2.5.0-396

How reproducible:
100%

Expected results:
There should be a way to release the Ceph-RBD/Block storage space that was filled by the "partial" VM disk copy described above.

Additional info:
This is also related to VM import from VMware to CNV via API (vmio).
For VMware, it is also not possible to release the Ceph-RBD resource,
and there is no suitable error that indicates the issue.

Comment 1 Sam Lucidi 2020-11-04 15:23:40 UTC
*** Bug 1891534 has been marked as a duplicate of this bug. ***

Comment 2 Natalie Gavrielov 2020-11-11 13:27:19 UTC
Ilanit, did you try removing the VM/DV/PVC?

Comment 3 Ilanit Stein 2020-11-11 13:53:13 UTC
The VM no longer exists.
I believe the PVC cannot be removed because the cdi importer pod remains in the "Terminating" state. Since this pod is bound to the PVC, the PVC cannot be removed.
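
The reasoning above matches how Kubernetes protects a PVC that is in use: its removal is deferred while any pod still references it, and a pod stuck in "Terminating" still counts. A minimal sketch of that check follows, using a hypothetical simplified `pod` type rather than the real Kubernetes API objects; the pod and claim names are made up for the example.

```go
package main

import "fmt"

// pod is a hypothetical, simplified stand-in for a Kubernetes Pod.
type pod struct {
	Name        string
	Terminating bool     // true when the pod has been marked for deletion
	PVCs        []string // claim names referenced by the pod's volumes
}

// blockingPods returns every pod that still references the given claim.
// A pod stuck in Terminating still exists, so it is included.
func blockingPods(pods []pod, claim string) []pod {
	var out []pod
	for _, p := range pods {
		for _, c := range p.PVCs {
			if c == claim {
				out = append(out, p)
				break
			}
		}
	}
	return out
}

func main() {
	// Hypothetical names, for illustration only.
	pods := []pod{
		{Name: "importer-example", Terminating: true, PVCs: []string{"example-disk"}},
		{Name: "unrelated", PVCs: []string{"other-disk"}},
	}
	for _, p := range blockingPods(pods, "example-disk") {
		fmt.Printf("PVC still in use by pod %s (terminating=%v)\n", p.Name, p.Terminating)
	}
}
```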

Comment 4 Fabien Dupont 2020-11-18 15:35:27 UTC
@alitke can you please triage this BZ?

Comment 5 Natalie Gavrielov 2020-11-25 13:57:19 UTC
The main issue here is already described in bug 1897351.
Closing as a duplicate.

*** This bug has been marked as a duplicate of bug 1897351 ***

Comment 6 Maya Rashish 2021-07-25 08:56:19 UTC
*** Bug 1893526 has been marked as a duplicate of this bug. ***