Created attachment 1846503 [details]
full-log-and-o-yamls-of-pvc-and-vm

Description of problem:
Cloning a Block-mode DV to a VM with a Filesystem-mode target that is not big enough results in an endless retry loop when using the 'pvc' API.

Version-Release number of selected component (if applicable):
4.10, 4.9, 4.8

How reproducible:
Always

Steps to Reproduce:
1. Create a DV with Block volume mode, size: 30Gi.
2. Create a VM with Filesystem volume mode from this DV, size: 30Gi.
3. See that the VM's DV import restarts over and over, with no clear error message (except in the log).

Actual results:
An endless loop of clone attempts.

Expected results:
If the target size is too small (considering filesystem overhead), we should either get a clear error message before the clone/import starts, or the requested PVC size should be adjusted upward.

Additional info:
This doesn't happen when using the 'storage' API for the target VM (see the sketch after the log below):

dataVolumeTemplates:
  spec:
    storage:

Source DV and target VM yamls:

$ cat dv-src.yaml
apiVersion: cdi.kubevirt.io/v1alpha1
kind: DataVolume
metadata:
  namespace: "openshift-virtualization-os-images"
  name: cirros-dv
spec:
  source:
    http:
      url: "http://.../rhel-images/rhel-83.qcow2"
      secretRef: ""
  pvc:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 30Gi
    storageClassName: ocs-storagecluster-ceph-rbd
    volumeMode: Block

$ cat vm-dv-clone-30.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  labels:
    kubevirt.io/vm: vm-dv-clone
  name: vm-dv-clone
spec:
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-dv-clone
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: root-disk
        resources:
          requests:
            memory: 64M
      volumes:
      - dataVolume:
          name: target-dv-vm-clone
        name: root-disk
  dataVolumeTemplates:
  - metadata:
      name: target-dv-vm-clone
    spec:
      pvc:
        storageClassName: hostpath-provisioner
        volumeMode: Filesystem
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 30Gi
      source:
        pvc:
          namespace: "openshift-virtualization-os-images"
          name: "cirros-dv"

$ oc get pvc -A
NAMESPACE                            NAME                          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                  AGE
default                              target-dv-vm-clone            Bound    pvc-243763c7-e0e4-45bc-903c-853525139eb3   68Gi       RWO            hostpath-provisioner          14m
openshift-storage                    ocs-deviceset-0-data-0cb4kz   Bound    local-pv-c63a60c0                          70Gi       RWO            local-block                   10h
openshift-storage                    ocs-deviceset-0-data-1cxt7d   Bound    local-pv-a82afe98                          70Gi       RWO            local-block                   10h
openshift-storage                    ocs-deviceset-0-data-2vvt2x   Bound    local-pv-f3ca87ab                          70Gi       RWO            local-block                   10h
openshift-virtualization-os-images   cirros-dv                     Bound    pvc-87536c65-b6dd-4898-a5dc-dca020b5d74c   30Gi       RWO            ocs-storagecluster-ceph-rbd   17m

$ oc logs cdi-upload-target-dv-vm-clone
I1216 07:01:31.857162 1 uploadserver.go:69] Running server on 0.0.0.0:8443
I1216 07:01:51.261542 1 uploadserver.go:374] Content type header is "blockdevice-clone"
I1216 07:01:51.261636 1 data-processor.go:340] Calculating available size
I1216 07:01:51.261767 1 data-processor.go:352] Checking out file system volume size.
I1216 07:01:51.261803 1 data-processor.go:360] Request image size not empty.
I1216 07:01:51.261828 1 data-processor.go:365] Target size 30Gi.
I1216 07:01:51.262002 1 data-processor.go:243] New phase: TransferDataFile
I1216 07:01:51.262161 1 util.go:174] Writing data...
I1216 07:04:57.516880 1 data-processor.go:243] New phase: Resize
W1216 07:04:57.692691 1 data-processor.go:322] Available space less than requested size, resizing image to available space 30440161280.
I1216 07:04:57.692708 1 data-processor.go:330] Calculated new size is < than current size, not resizing: requested size 30440161280, virtual size: 32212254720.
I1216 07:04:57.692719 1 data-processor.go:249] Validating image
E1216 07:04:57.895010 1 data-processor.go:240] Virtual image size 32212254720 is larger than available size 30440580710 (PVC size 32212254720, reserved overhead 0.055000%). A larger PVC is required.
Unable to resize disk image to requested size
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessDataWithPause
    /remote-source/app/pkg/importer/data-processor.go:229
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessData
    /remote-source/app/pkg/importer/data-processor.go:172
kubevirt.io/containerized-data-importer/pkg/uploadserver.newUploadStreamProcessor
    /remote-source/app/pkg/uploadserver/uploadserver.go:439
kubevirt.io/containerized-data-importer/pkg/uploadserver.(*uploadServerApp).processUpload
    /remote-source/app/pkg/uploadserver/uploadserver.go:381
kubevirt.io/containerized-data-importer/pkg/uploadserver.(*uploadServerApp).uploadHandler.func1
    /remote-source/app/pkg/uploadserver/uploadserver.go:407
net/http.HandlerFunc.ServeHTTP
    /usr/lib/golang/src/net/http/server.go:2049
net/http.(*ServeMux).ServeHTTP
    /usr/lib/golang/src/net/http/server.go:2428
kubevirt.io/containerized-data-importer/pkg/uploadserver.(*uploadServerApp).ServeHTTP
    /remote-source/app/pkg/uploadserver/uploadserver.go:256
net/http.serverHandler.ServeHTTP
    /usr/lib/golang/src/net/http/server.go:2867
net/http.(*conn).serve
    /usr/lib/golang/src/net/http/server.go:1932
runtime.goexit
    /usr/lib/golang/src/runtime/asm_amd64.s:1371
E1216 07:04:57.895161 1 uploadserver.go:387] Saving stream failed: Unable to resize disk image to requested size: Virtual image size 32212254720 is larger than available size 30440580710 (PVC size 32212254720, reserved overhead 0.055000%). A larger PVC is required.
I1216 07:04:59.676727 1 uploadserver.go:374] Content type header is "blockdevice-clone"
I1216 07:04:59.676820 1 data-processor.go:340] Calculating available size
E1216 07:04:59.678141 1 data-processor.go:344] exit status 1, blockdev: ioctl error on BLKGETSIZE64: Inappropriate ioctl for device
I1216 07:04:59.678160 1 data-processor.go:352] Checking out file system volume size.
I1216 07:04:59.678172 1 data-processor.go:360] Request image size not empty.
I1216 07:04:59.678181 1 data-processor.go:365] Target size 30Gi.
I1216 07:04:59.678275 1 util.go:39] deleting file: /data/disk.img
I1216 07:04:59.899644 1 data-processor.go:243] New phase: TransferDataFile
I1216 07:04:59.899811 1 util.go:174] Writing data...
I1216 07:07:36.388073 1 data-processor.go:243] New phase: Resize
W1216 07:07:36.574707 1 data-processor.go:322] Available space less than requested size, resizing image to available space 30440161280.
I1216 07:07:36.574733 1 data-processor.go:330] Calculated new size is < than current size, not resizing: requested size 30440161280, virtual size: 32212254720.
I1216 07:07:36.574743 1 data-processor.go:249] Validating image
E1216 07:07:36.767359 1 data-processor.go:240] Virtual image size 32212254720 is larger than available size 30440580710 (PVC size 32212254720, reserved overhead 0.055000%). A larger PVC is required.
Unable to resize disk image to requested size
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessDataWithPause
    /remote-source/app/pkg/importer/data-processor.go:229
kubevirt.io/containerized-data-importer/pkg/importer.(*DataProcessor).ProcessData
    /remote-source/app/pkg/importer/data-processor.go:172
kubevirt.io/containerized-data-importer/pkg/uploadserver.newUploadStreamProcessor
    /remote-source/app/pkg/uploadserver/uploadserver.go:439
kubevirt.io/containerized-data-importer/pkg/uploadserver.(*uploadServerApp).processUpload
    /remote-source/app/pkg/uploadserver/uploadserver.go:381
kubevirt.io/containerized-data-importer/pkg/uploadserver.(*uploadServerApp).uploadHandler.func1
    /remote-source/app/pkg/uploadserver/uploadserver.go:407
net/http.HandlerFunc.ServeHTTP
    /usr/lib/golang/src/net/http/server.go:2049
net/http.(*ServeMux).ServeHTTP
    /usr/lib/golang/src/net/http/server.go:2428
kubevirt.io/containerized-data-importer/pkg/uploadserver.(*uploadServerApp).ServeHTTP
    /remote-source/app/pkg/uploadserver/uploadserver.go:256
net/http.serverHandler.ServeHTTP
    /usr/lib/golang/src/net/http/server.go:2867
net/http.(*conn).serve
    /usr/lib/golang/src/net/http/server.go:1932
runtime.goexit
    /usr/lib/golang/src/runtime/asm_amd64.s:1371

P.S. It would be great to fix the "reserved overhead 0.055000%" wording in the log message, so that it reads either 5.5% or 0.055000 (without the % sign).
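For reference, the numbers in the log line up with the overhead being a 5.5% fraction rather than 0.055%:

    32212254720 × (1 − 0.055) = 30440580710.4 ≈ 30440580710 (the "available size" reported above)

And for completeness, a minimal sketch of the 'storage' API workaround mentioned under Additional info. The field values are copied from vm-dv-clone-30.yaml above; as far as I understand, with spec.storage CDI accounts for filesystem overhead when sizing the target PVC instead of failing later:

  dataVolumeTemplates:
  - metadata:
      name: target-dv-vm-clone
    spec:
      storage:
        storageClassName: hostpath-provisioner
        volumeMode: Filesystem
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 30Gi
      source:
        pvc:
          namespace: "openshift-virtualization-os-images"
          name: "cirros-dv"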
Created attachment 1846504 [details]
full-log-and-o-yamls-of-pvc-and-vm
Might be related to these bugs:
https://bugzilla.redhat.com/show_bug.cgi?id=2001047
https://bugzilla.redhat.com/show_bug.cgi?id=2001038
Hi Maya, could you please take a look? Maybe the DV conditions could carry a clear message, so the user knows how to fix this.
As @brybacki suggested, reproduced on CNV v4.10.0-641, but there are no relevant events yet:

$ oc get dv
NAME                 PHASE             PROGRESS   RESTARTS   AGE
target-dv-vm-clone   CloneInProgress   18.33%     2          12m

$ oc describe dv target-dv-vm-clone
.....
Phase:           CloneInProgress
Progress:        19.35%
Restart Count:   2
Events:
  Type    Reason           Age   From                   Message
  ----    ------           ----  ----                   -------
  Normal  Pending          12m   datavolume-controller  PVC target-dv-vm-clone Pending
  Normal  Bound            12m   datavolume-controller  PVC target-dv-vm-clone Bound
  Normal  CloneInProgress  12m   datavolume-controller  Cloning from openshift-virtualization-os-images/cirros-dv into default/target-dv-vm-clone in progress
Bartosz, prompted by the opposite problem (a clone failing fs->block because of inexact validation), has already merged a change for this into the 4.11 branch. Now, when cloning block->fs with the exact same size, the clone does not start (it stays in CloneScheduled) and a warning event is emitted: "target resources requests storage size is smaller than the source". However, while trying to reproduce this bug I uncovered a similar issue: if we clone to a slightly larger PVC, the scratch space PVC ends up too small, resulting in the same "endless loop" behavior.
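A quick way to surface that warning without trawling controller logs, assuming the event is recorded against the DataVolume (the field-selector syntax is standard oc/kubectl; the DV name is the one used throughout this report):

$ oc get events --field-selector involvedObject.name=target-dv-vm-clone
$ oc describe dv target-dv-vm-clone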
Will backport this once 4.11.1 backports are possible.
Verified on version 4.11.1-20.

Steps to verify:
1. Create a DV with Block volume mode, size: 30Gi.
2. Create a VM whose source is the DV from step 1, using Filesystem volume mode, specifying size 30Gi and storageClassName hostpath-csi-basic.
3. Check the status of the DV from step 2, and check the CDI log.

Results:

$ oc get dv
NAME                 PHASE            PROGRESS   RESTARTS   AGE
target-dv-vm-clone   CloneScheduled   N/A                   126m

$ oc get pod
No resources found in fuhui namespace.

$ oc describe dv
...
Phase:     CloneScheduled
Progress:  N/A
Events:
  Type    Reason   Age  From                   Message
  ----    ------   ---  ----                   -------
  Normal  Pending  66s  datavolume-controller  PVC target-dv-vm-clone Pending
  Normal  Bound    65s  datavolume-controller  PVC target-dv-vm-clone Bound

$ oc logs cdi-deployment
...
{"level":"error", ... ,"msg":"Error validating clone spec, ignoring","PVC":"fuhui/target-dv-vm-clone","error":"target resources requests storage size is smaller than the source","errorVerbose":"target resources requests storage size is smaller than the source\n ...}
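Note: to make a pvc-API clone of this kind actually proceed rather than stay in CloneScheduled, the target request has to cover the source's virtual size plus filesystem overhead. Going by the ~5.5% overhead reported in the logs earlier in this bug, 32212254720 / 0.945 ≈ 34.1 GB, so a target request of, say, 32Gi (a hypothetical value, not taken from this report) should be enough:

        resources:
          requests:
            storage: 32Gi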
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.11.1 security and bug fix update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8750