Bug 2109407

Summary: [4.11] Cloned VM's snapshot restore fails if the source VM disk is deleted
Product: Container Native Virtualization (CNV) Reporter: Yan Du <yadu>
Component: Storage Assignee: skagan
Status: CLOSED ERRATA QA Contact: dalia <dafrank>
Severity: high Docs Contact:
Priority: high    
Version: 4.11.0 CC: alitke, cnv-qe-bugs, dafrank, nashok, skagan, yadu, ycui
Target Milestone: ---   
Target Release: 4.11.1   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: v4.11.1-29 Doc Type: Known Issue
Doc Text:
Cause: When a VM snapshot is taken, the DataVolumeTemplate in the VirtualMachineSnapshotContent mistakenly refers to the wrong PVC.
Consequence: If the VM being restored was originally cloned from a source VM and the source VM has been deleted, the restore fails.
Workaround (if any): Edit the VirtualMachineSnapshotContent object so that the DataVolumeTemplate refers to the correct PVC.
Result: The VirtualMachine can be restored successfully.
Story Points: ---
Clone Of: 2104479 Environment:
Last Closed: 2022-12-01 21:10:26 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2104479    
Bug Blocks: 2109406    

Description Yan Du 2022-07-21 07:23:03 UTC
+++ This bug was initially created as a clone of Bug #2104479 +++

Description of problem:

[1] Created a new VM, which has the PVC below:


rhel8-tory-koala                                                       Bound    pvc-6bafb6a3-e418-4860-91b2-f80af872a11f   30Gi          RWX            ocs-external-storagecluster-ceph-rbd   24s

[2] Created a clone from this VM:

rhel8-tory-koala-clone-rhel8-tory-koala-1m97p                          Bound    pvc-90e296b3-20ba-40d4-b0b7-5d9769ae2657   30Gi          RWX            ocs-external-storagecluster-ceph-rbd   4s

[3] Created a snapshot on the cloned VM:


NAME                                                                      READYTOUSE   SOURCEPVC                                       SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS                                     SNAPSHOTCONTENT                                    CREATIONTIME   AGE
vmsnapshot-70be6e17-b80d-46fb-a544-c52f6ee76a61-volume-rhel8-tory-koala   true         rhel8-tory-koala-clone-rhel8-tory-koala-1m97p                           30Gi          ocs-external-storagecluster-rbdplugin-snapclass   snapcontent-4ce6eab1-62b0-42f1-bd72-f4a0a2d63dae   19s            20s


[4] Deleted the VM and disk in [1].

[5] Tried to restore the snapshot. Restoration failed with the error below:

~~~
Error creating DataVolume restore-558e00db-857b-415d-8f37-bd7369194419-rhel8-tory-koala: admission webhook "datavolume-validate.cdi.kubevirt.io" denied the request: Source PVC default/rhel8-tory-koala not found
~~~

The restore looks for the PVC from [1] instead of the PVC of the cloned VM.
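
The rejection above can be anticipated before a restore is attempted. The following is a minimal sketch, not part of KubeVirt or CDI: given DataVolume templates as plain dictionaries (for example, parsed from YAML) and the set of PVCs that currently exist, it lists the clone sources that are missing. The function name `missing_source_pvcs`, the `namespace/name` key format, and the assumption that the cloned PVC lives in the `default` namespace are all illustrative; the PVC names are taken from the steps above.

```python
def missing_source_pvcs(data_volume_templates, existing_pvcs, default_namespace="default"):
    """Return 'namespace/name' keys of PVC clone sources that do not exist.

    data_volume_templates: list of dicts shaped like DataVolume templates.
    existing_pvcs: set of 'namespace/name' strings (hypothetical format).
    """
    missing = []
    for template in data_volume_templates:
        pvc = template.get("spec", {}).get("source", {}).get("pvc")
        if not pvc:
            continue  # template does not clone from a PVC
        namespace = pvc.get("namespace", default_namespace)
        key = f"{namespace}/{pvc['name']}"
        if key not in existing_pvcs:
            missing.append(key)
    return missing


# Template as it would look if it still pointed at the source VM's PVC from [1].
templates = [
    {"spec": {"source": {"pvc": {"name": "rhel8-tory-koala", "namespace": "default"}}}}
]

# After step [4], only the cloned PVC from [2] remains
# (its namespace is assumed to be "default" for this example).
existing = {"default/rhel8-tory-koala-clone-rhel8-tory-koala-1m97p"}

missing = missing_source_pvcs(templates, existing)
print(missing)
```

A non-empty result here corresponds to the "Source PVC ... not found" condition reported by the admission webhook.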

Version-Release number of selected component (if applicable):

kubevirt-hyperconverged-operator.v4.10.2   

How reproducible:
100%

Steps to Reproduce:

See steps [1] through [5] in the description above.

Actual results:

Cloned VM's snapshot restore fails if the source VM disk is deleted.

Expected results:

Snapshot restore should work.

Additional info:

--- Additional comment from  on 2022-07-14 14:17:39 UTC ---

Hi @nashok, I would really appreciate the YAMLs of the original VM and of the cloned VM, along with an explanation of how the VM clone was performed in this case. Thanks.

--- Additional comment from nijin ashok on 2022-07-18 12:58:59 UTC ---

Attaching the yamls of VMs.

The VM clone was done from the OpenShift console using the "clone" option.

It looks like the issue occurs because the VirtualMachineSnapshotContent of the cloned VM refers to the source VM's PVC instead of the cloned PVC.

~~~
yq -y '.spec.source.virtualMachine.spec.dataVolumeTemplates' /tmp/vmsnapshot-content-53237ca8-7ca8-4894-ab85-0ba132a968e0.yaml
- metadata:
    creationTimestamp: null
    name: rhel8-resident-heron-clone-rhel8-resident-heron-2vald
  spec:
    source:
      pvc:
        name: rhel8-resident-heron <<<<
        namespace: nijin-cnv
    storage:
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 30Gi
      storageClassName: ocs-external-storagecluster-ceph-rbd
      volumeMode: Block
~~~

The restore works if I manually edit this and change it to the cloned PVC.
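
The manual edit described above can be expressed as a small transformation on the parsed object. This is a sketch only, assuming the VirtualMachineSnapshotContent has already been loaded into a Python dictionary; `retarget_source_pvc` is a hypothetical helper, not a KubeVirt API. The assumption that the cloned PVC carries the same name as the DataVolumeTemplate is inferred from the output above and is not confirmed in this report.

```python
import copy


def retarget_source_pvc(snapshot_content, old_pvc, new_pvc):
    """Return (patched_copy, changed_count).

    Every DataVolumeTemplate whose spec.source.pvc.name equals old_pvc
    is pointed at new_pvc instead. The input dict is left untouched.
    """
    patched = copy.deepcopy(snapshot_content)
    templates = (
        patched.get("spec", {})
        .get("source", {})
        .get("virtualMachine", {})
        .get("spec", {})
        .get("dataVolumeTemplates")
        or []
    )
    changed = 0
    for template in templates:
        pvc = template.get("spec", {}).get("source", {}).get("pvc")
        if pvc and pvc.get("name") == old_pvc:
            pvc["name"] = new_pvc
            changed += 1
    return patched, changed


# Assumed: the cloned PVC has the same name as the DataVolumeTemplate.
cloned = "rhel8-resident-heron-clone-rhel8-resident-heron-2vald"

# Reduced to the fields relevant here; values copied from the yq output above.
content = {
    "spec": {"source": {"virtualMachine": {"spec": {"dataVolumeTemplates": [
        {
            "metadata": {"name": cloned},
            "spec": {"source": {"pvc": {
                "name": "rhel8-resident-heron",
                "namespace": "nijin-cnv",
            }}},
        }
    ]}}}}
}

patched, changed = retarget_source_pvc(content, "rhel8-resident-heron", cloned)
print(changed)
```

Working on a deep copy keeps the original object available for comparison before anything is written back to the cluster.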

--- Additional comment from nijin ashok on 2022-07-18 13:00:20 UTC ---

Comment 3 Yan Du 2022-11-03 12:14:39 UTC
Re-targeted to 4.11.2 according to comment #2.

Comment 7 dalia 2022-11-22 14:11:30 UTC
Verified on 4.11.1.

Comment 15 errata-xmlrpc 2022-12-01 21:10:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.11.1 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8750