Bug 2086825

Summary: VM restore PVC uses exact source PVC request size
Product: Container Native Virtualization (CNV)
Reporter: Alex Kalenyuk <akalenyu>
Component: Storage
Assignee: skagan
Status: CLOSED ERRATA
QA Contact: Yan Du <yadu>
Severity: urgent
Docs Contact:
Priority: high
Version: 4.11.0
CC: alitke, cnv-qe-bugs, jcall, mrashish, pelauter, skagan, yadu
Target Milestone: ---
Target Release: 4.11.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: CNV v4.11.0-530
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2101168 (view as bug list)
Environment:
Last Closed: 2022-09-14 19:33:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Bug Depends On:    
Bug Blocks: 2101168    

Description Alex Kalenyuk 2022-05-16 15:28:11 UTC
Description of problem:
The PVC created by a VM restore uses the exact request size of the source PVC, ignoring that status.capacity.storage (and therefore volumesnapshot.status.restoreSize) may be larger.


Version-Release number of selected component (if applicable):
CNV 4.11.0


How reproducible:
100%

Steps to Reproduce:
1. Create a VM backed by a PVC whose spec.resources.requests.storage != status.capacity.storage, snapshot it, and attempt a restore (manifests below)

Actual results:
The Ceph CSI driver rejects a restore whose requested size is smaller than the source snapshot size, so the restored PVC stays Pending:
  Warning  ProvisioningFailed    116s (x11 over 9m24s)   openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-7d96f8b4d5-s6lzj_dc0077ff-a0f2-4034-ae60-c6e15766ee54  failed to provision volume with StorageClass "ocs-storagecluster-ceph-rbd": error getting handle for DataSource Type VolumeSnapshot by Name vmsnapshot-74690574-f6a4-457c-8682-c830f74d5e0b-volume-dv-disk: requested volume size 8053063680 is less than the size 8589934592 for the source snapshot vmsnapshot-74690574-f6a4-457c-8682-c830f74d5e0b-volume-dv-disk
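
The two sizes in the event are the source PVC's normalized request (7680Mi = 8053063680 bytes), which the restore copies verbatim, and the snapshot's rounded-up size (8Gi = 8589934592 bytes). If needed, the latter can be read directly off the VolumeSnapshot; illustrative command only, with the snapshot name taken from the event above:

  $ oc get volumesnapshot vmsnapshot-74690574-f6a4-457c-8682-c830f74d5e0b-volume-dv-disk -n akalenyu -o jsonpath='{.status.restoreSize}'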

Expected results:
Success

Additional info:
Reproduced on OCS with a 7.5Gi request (normalized to 7680Mi), which gets rounded up to 8Gi by the provisioner:
[akalenyu@localhost manifests]$ oc get pvc simple-dv -o yaml | grep storage:
      storage: 7680Mi
    storage: 8Gi
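
These are the same two quantities that appear, in bytes, in the provisioner event above; a quick sanity check of the arithmetic, assuming GNU coreutils numfmt is available:

  $ numfmt --from=iec-i 7680Mi 8Gi
  8053063680
  8589934592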

[akalenyu@localhost manifests]$ cat dv.yaml 
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: simple-dv
  namespace: akalenyu
spec:
  source:
      http:
         url: "http://.../Fedora-Cloud-Base-34-1.2.x86_64.qcow2"
  pvc:
    storageClassName: ocs-storagecluster-ceph-rbd
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 7.5Gi
[akalenyu@localhost manifests]$ cat vm.yaml 
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: simple-vm
spec:
  running: true
  template:
    metadata:
      labels: {kubevirt.io/domain: simple-vm,
        kubevirt.io/vm: simple-vm}
    spec:
      domain:
        devices:
          disks:
          - disk: {bus: virtio}
            name: dv-disk
          - disk: {bus: virtio}
            name: cloudinitdisk
        resources:
          requests: {memory: 2048M}
      volumes:
      - dataVolume: {name: simple-dv}
        name: dv-disk
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            password: fedora
            chpasswd: { expire: False }
        name: cloudinitdisk
[akalenyu@localhost manifests]$ cat snapshot.yaml 
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineSnapshot
metadata:
  name: snap-simple-vm
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: simple-vm
[akalenyu@localhost manifests]$ cat restore.yaml 
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineRestore
metadata:
  name: restore-simple-vm
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: simple-vm
  virtualMachineSnapshotName: snap-simple-vm
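
For illustration only (not part of the original reproducer): a standalone PVC restored from the same VolumeSnapshot is accepted by the Ceph CSI driver only when its request is at least the snapshot's restoreSize, which is exactly the constraint the VM restore runs into. A minimal sketch, with the snapshot name taken from the provisioner event above, a hypothetical PVC name, and the size set to the rounded-up 8Gi:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-dv-disk              # hypothetical name
  namespace: akalenyu
spec:
  storageClassName: ocs-storagecluster-ceph-rbd
  accessModes:
  - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: vmsnapshot-74690574-f6a4-457c-8682-c830f74d5e0b-volume-dv-disk
  resources:
    requests:
      storage: 8Gi                    # must be >= volumesnapshot.status.restoreSize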

Comment 1 Adam Litke 2022-05-18 20:16:46 UTC
Shelly, is this the same as https://bugzilla.redhat.com/show_bug.cgi?id=2021354 ?

Comment 2 skagan 2022-05-19 08:41:15 UTC
Adam, if you recall, I asked you what should be done with this issue. The bug you mentioned was caused by a bug in the smart clone. It uncovered a possible scenario: the user requests size X for a PVC, gets a larger actual size Y, and writes more than X to it. With certain provisioners, restoring such a PVC then hits this bug in the VM restore. Since the bug you mentioned only reached a similar scenario through the smart-clone error, Alex and I talked about opening a new bugzilla.

Comment 3 Adam Litke 2022-06-17 14:39:51 UTC
Increasing severity and requesting blocker?, because this issue will prevent VM restore in almost all cases when using a Filesystem mode PV: FS overhead causes the PVC size to be larger than requested. Peter, can you set blocker+?

Comment 4 Yan Du 2022-06-28 09:29:20 UTC
Tested on CNV-v4.11.0-536; the issue has been fixed.

  Normal  ProvisioningSucceeded  6m35s                  openshift-storage.rbd.csi.ceph.com_csi-rbdplugin-provisioner-6cdb46484c-2s7wt_dda22819-f384-4c9c-951f-60d2956063c9  Successfully provisioned volume pvc-0c37008f-3d93-4612-93d2-494bfa3a0ac3

  Type    Reason                         Age   From                Message
  ----    ------                         ----  ----                -------
  Normal  VirtualMachineRestoreComplete  3m1s  restore-controller  Successfully completed VirtualMachineRestore restore-simple-vm

Comment 7 errata-xmlrpc 2022-09-14 19:33:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Virtualization 4.11.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:6526

Comment 8 Red Hat Bugzilla 2023-09-15 01:54:53 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days