Bug 2102123 - Cloning virtual machines is failing while creating DV with error "target resources requests storage size is smaller than the source" if source PVC is extended
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: User Experience
Version: 4.10.2
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Aviv Turgeman
QA Contact: Guohua Ouyang
URL:
Whiteboard:
Depends On: 2100345
Blocks:
Reported: 2022-06-29 11:16 UTC by Tal Nisan
Modified: 2022-06-29 15:14 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 2100345
Environment:
Last Closed: 2022-06-29 15:00:13 UTC
Target Upstream Version:
Embargoed:



Description Tal Nisan 2022-06-29 11:16:14 UTC
+++ This bug was initially created as a clone of Bug #2100345 +++

Description of problem:

Created a virtual machine with the below DV spec (spec.pvc)

~~~
oc get dv rhel8-cold-landfowl -o yaml |yq -y '.spec'
pvc:   <<<<
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 30Gi <<<
  storageClassName: ocs-external-storagecluster-ceph-rbd
  volumeMode: Block
~~~

After provisioning, extended the backend PVC to 40 GB.

~~~
rhel8-cold-landfowl            Bound    pvc-634e5504-6090-42c9-8dda-8490153bd34f   40Gi       RWX            ocs-external-storagecluster-ceph-rbd   17m
~~~

Used the OpenShift console and tried to clone the virtual machine. The cloned VM gets created with 30 GB instead of 40 GB.

~~~
oc get vm rhel8-cold-landfowl-clone -o yaml |yq -y '.spec.dataVolumeTemplates'
- metadata:
    creationTimestamp: null
    name: rhel8-cold-landfowl-clone-rhel8-cold-landfowl-tksun
  spec:
    pvc:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 30Gi
      storageClassName: ocs-external-storagecluster-ceph-rbd
      volumeMode: Block
    source:
      pvc:
        name: rhel8-cold-landfowl
        namespace: nijin-cnv
~~~

Because of this, the creation of cloned DV is failing with the error below:

~~~
3s          Warning   FailedDataVolumeCreate            virtualmachine/rhel8-cold-landfowl-clone                                    Error creating DataVolume rhel8-cold-landfowl-clone-rhel8-cold-landfowl-tksun: admission webhook "datavolume-validate.cdi.kubevirt.io" denied the request:  target resources requests storage size is smaller than the source
~~~
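
For illustration, the rejection above comes down to comparing the clone target's storage request with the size of the source PVC. The TypeScript sketch below is not CDI's implementation: `toBytes` and `targetIsTooSmall` are hypothetical helpers, and only plain integers and the binary suffixes Ki, Mi, Gi and Ti are handled, whereas real Kubernetes quantities allow more forms.

```typescript
// Illustrative only, not CDI's code. Supports plain integers and the binary
// suffixes Ki, Mi, Gi, Ti; other Kubernetes quantity forms are rejected.
const BINARY_SUFFIXES: Record<string, number> = {
  Ki: 1024,
  Mi: 1024 * 1024,
  Gi: 1024 * 1024 * 1024,
  Ti: 1024 * 1024 * 1024 * 1024,
};

// Convert a quantity string such as "30Gi" to a number of bytes.
function toBytes(quantity: string): number {
  const match = /^(\d+)(Ki|Mi|Gi|Ti)?$/.exec(quantity);
  if (!match) {
    throw new Error(`unsupported quantity: ${quantity}`);
  }
  const multiplier = match[2] ? (BINARY_SUFFIXES[match[2]] ?? 1) : 1;
  return Number(match[1]) * multiplier;
}

// True when the clone target requests less storage than the source holds.
function targetIsTooSmall(targetRequest: string, sourceSize: string): boolean {
  return toBytes(targetRequest) < toBytes(sourceSize);
}

// The case in this report: target copied from the DV (30Gi), source PVC at 40Gi.
console.log(targetIsTooSmall('30Gi', '40Gi')); // true
console.log(targetIsTooSmall('40Gi', '40Gi')); // false
```

With the values from this report, the 30Gi request taken from the DV spec is smaller than the expanded 40Gi source, which matches the denied request.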

Version-Release number of selected component (if applicable):

OpenShift Virtualization   4.10.2
OpenShift 4.10.15

How reproducible:

100%

Steps to Reproduce:
1. Create a VM with DV template 'spec.pvc'. 
2. Extend the PVC. 
3. Clone the VM. The cloned VM takes the source DV size instead of the PVC size, so the creation of the cloned DV fails with the error "target resources requests storage size is smaller than the source".
  

Actual results:

Cloning virtual machines is failing while creating DV with error "target resources requests storage size is smaller than the source"

Expected results:

Cloning should work

Additional info:

--- Additional comment from Bartosz Rybacki on 2022-06-27 12:34:47 IDT ---

This is not a direct duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2084122, but the fix for 2084122 removes the validation hook that is clearly wrong in this situation. The fix for 2084122 moves the check to the controller, where the correct size can be evaluated and tested.

Anyway, a new test has to be written to make sure all edge cases are handled.

--- Additional comment from Bartosz Rybacki on 2022-06-27 14:53:55 IDT ---

How does cloning a VM work in UI? Is it creating a VM with DVTemplate? If yes, is it setting the Volume size?

If the answer to both is yes, then I suppose the problem is within the OpenShift UI. And it might be a duplicate of this: https://bugzilla.redhat.com/show_bug.cgi?id=2088220.

The expected behavior is to always get the size from the PVC. The DV is only the request to create the PVC. Once the PVC exists, the DV can only be used to find the PVC; the size, capacity, and all other values should be read from the actual PVC.

--- Additional comment from Bartosz Rybacki on 2022-06-27 15:29:39 IDT ---

@tnisan I suppose this might actually be an OpenShift console bug. Can you help me verify it, or point me to someone who can?

--- Additional comment from nijin ashok on 2022-06-28 09:10:12 IDT ---

(In reply to Bartosz Rybacki from comment #2)
> How does cloning a VM work in UI? Is it creating a VM with DVTemplate? If
> yes, is it setting the Volume size?
> 
> If the answer to both is yes then I suppose the problem is within the
> OpenShift UI.

I think yes, it's an issue with the console. In my test, cloning sets the DV template in the target VM YAML and sets the size from the DV, not from the PVC. Also, the issue is not reproducible with spec.storage. I think it gets the PVC size for spec.storage and the DV size for spec.pvc, as per https://github.com/openshift/console/blob/f1b7d2a37bcea66481f677d9ae03c75741295e8d/frontend/packages/kubevirt-plugin/src/k8s/helpers/vm-clone.ts#L156

--- Additional comment from Adam Litke on 2022-06-28 15:30:56 IDT ---

Moving to Console.  Tal, as the previous comment indicates, there is a discrepancy with how the clone size is determined based on how the DV Template looks in the source VM.  If the source VM uses the 'storage' API it fetches the size from the PVC.  This is the correct behavior.  However, if the source VM uses the 'pvc' API we fetch the size from the DataVolume spec.  This is not correct because the actual PVC size can be larger than the original DV request for various reasons.  The effect of this is that cloning VMs that use the 'pvc' API in the DVTemplate may not work correctly in the UI.
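
As a rough sketch of the size selection described above (not the console's actual code), the clone could prefer the bound PVC's size and fall back to the DataVolume request only when no PVC is available. The types and the `cloneRequestSize` helper below are hypothetical and heavily simplified.

```typescript
// Hypothetical, simplified shapes; the real console and Kubernetes types are richer.
type DataVolumeSpec = {
  pvc?: { resources: { requests: { storage: string } } };
  storage?: { resources?: { requests?: { storage?: string } } };
};

type PersistentVolumeClaim = {
  spec: { resources: { requests: { storage: string } } };
  status?: { capacity?: { storage?: string } };
};

// Pick the size for the cloned DataVolume. The bound PVC is preferred because
// it reflects any expansion made after the DataVolume was created; the
// DataVolume request is only a fallback.
function cloneRequestSize(
  dvSpec: DataVolumeSpec,
  pvc?: PersistentVolumeClaim,
): string | undefined {
  return (
    pvc?.status?.capacity?.storage ??
    pvc?.spec.resources.requests.storage ??
    dvSpec.pvc?.resources.requests.storage ??
    dvSpec.storage?.resources?.requests?.storage
  );
}

// Values from this report: the DV requested 30Gi, the PVC was expanded to 40Gi.
const dvSpec: DataVolumeSpec = {
  pvc: { resources: { requests: { storage: '30Gi' } } },
};
const expandedPvc: PersistentVolumeClaim = {
  spec: { resources: { requests: { storage: '40Gi' } } },
  status: { capacity: { storage: '40Gi' } },
};

console.log(cloneRequestSize(dvSpec, expandedPvc)); // 40Gi
console.log(cloneRequestSize(dvSpec)); // 30Gi
```

Under this approach the 'pvc' and 'storage' APIs would behave the same way, since both resolve the size from the PVC whenever it exists.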

Comment 1 Aviv Turgeman 2022-06-29 15:00:13 UTC
Can't reproduce this bug on the main branch.

Followed the steps to reproduce:
1. Created a VM with this YAML example:
```
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-example
  labels:
    app: vm-example
    os.template.kubevirt.io/fedora35: 'true'
    flavor.template.kubevirt.io/small: 'true'
    vm.kubevirt.io/template: fedora-server-small
    workload.template.kubevirt.io/server: 'true'
  annotations:
    name.os.template.kubevirt.io/fedora35: Fedora 35
    description: VM example
  namespace: aviv-test
spec:
  dataVolumeTemplates:
    - apiVersion: cdi.kubevirt.io/v1beta1
      kind: DataVolume
      metadata:
        creationTimestamp: null
        name: dv-example
      spec:
        source:
          registry:
            url: 'docker://quay.io/containerdisks/fedora:latest'
        pvc:
          accessModes:
            - ReadWriteMany
          resources:
            requests:
              storage: 30Gi
          storageClassName: ocs-storagecluster-ceph-rbd
          volumeMode: Block
  running: false
  template:
    metadata:
      annotations:
        vm.kubevirt.io/flavor: small
        vm.kubevirt.io/os: fedora
        vm.kubevirt.io/workload: server
      labels:
        flavor.template.kubevirt.io/small: 'true'
        kubevirt.io/domain: vm-example
        kubevirt.io/size: small
        vm.kubevirt.io/name: vm-example
        os.template.kubevirt.io/fedora35: 'true'
        workload.template.kubevirt.io/server: 'true'
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 1
          threads: 1
        devices:
          disks:
            - bootOrder: 1
              disk:
                bus: virtio
              name: containerdisk
            - disk:
                bus: virtio
              name: disk-example
            - disk:
                bus: virtio
              name: cloudinitdisk
          interfaces:
            - masquerade: {}
              name: default
              model: virtio
          networkInterfaceMultiqueue: true
          rng: {}
        resources:
          requests:
            memory: 1G
      hostname: vm-example
      networks:
        - name: default
          pod: {}
      terminationGracePeriodSeconds: 0
      volumes:
        - containerDisk:
            image: 'quay.io/containerdisks/fedora:35'
          name: containerdisk
        - cloudInitNoCloud:
            userData: |-
              #cloud-config
              password: fedora
              chpasswd: { expire: False }
          name: cloudinitdisk
        - dataVolume:
            name: dv-example
          name: disk-example
```

2. Expanded the PVC to 40Gi.
3. Cloned the VM. The VM was cloned successfully, and the 40Gi PVC was also created successfully.

Moving this bug to CLOSED as NOTABUG.

