Bug 2128872

Summary: [4.11]Can't restore cloned VM
Product: Container Native Virtualization (CNV) Reporter: dalia <dafrank>
Component: Storage Assignee: Álvaro Romero <alromero>
Status: CLOSED ERRATA QA Contact: dalia <dafrank>
Severity: unspecified Docs Contact:
Priority: high    
Version: 4.11.0 CC: akalenyu, alitke, alromero, cnv-qe-bugs, j.thadden, mrashish, yadu
Target Milestone: ---   
Target Release: 4.11.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: virt-controller v4.11.1-11, CNV v4.11.1-92 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-12-01 21:12:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description dalia 2022-09-21 18:38:10 UTC
Description of problem:
Restoring a cloned VM fails.

Version-Release number of selected component (if applicable):
4.12

How reproducible:
100%

Steps to Reproduce:
1. Create a VM

2. Clone its PVC to a new VM

3. Snapshot the cloned VM

4. Restore the VM


Actual results:
restore failed

Expected results:
VM restored

Additional info:

restore status:

reason: 'PersistentVolumeClaim "restore-3327a680-b539-4604-8358-69f3a0b25d69-dv-disk"
      is invalid: spec: Invalid value: field.Path{name:"dataSource", index:"", parent:(*field.Path)(0xc0955e66c0)}:
      must match dataSourceRef'
    status: "False"

-----------------------------

PVC spec:

  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: cloned-dv
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: cloned-dv
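
A minimal sketch of why a PVC like this can be rejected at restore time. It assumes the restore PVC is built from the backed-up spec above, with dataSource pointed at the snapshot being restored while the copied dataSourceRef still names cloned-dv. The function data_source_mismatch and the name vmsnapshot-disk are hypothetical; the real check is part of the Kubernetes API server validation and covers more cases.

```python
def data_source_mismatch(pvc_spec):
    """Return True when both fields are set and do not describe the same object.

    Simplified stand-in for the API server rule behind the
    "must match dataSourceRef" error; not the real implementation.
    """
    data_source = pvc_spec.get("dataSource")
    data_source_ref = pvc_spec.get("dataSourceRef")
    if data_source is None or data_source_ref is None:
        return False
    keys = ("apiGroup", "kind", "name")
    return any(data_source.get(k) != data_source_ref.get(k) for k in keys)


# Spec of the cloned VM's PVC, as shown in this report.
source_pvc_spec = {
    "dataSource": {
        "apiGroup": "snapshot.storage.k8s.io",
        "kind": "VolumeSnapshot",
        "name": "cloned-dv",
    },
    "dataSourceRef": {
        "apiGroup": "snapshot.storage.k8s.io",
        "kind": "VolumeSnapshot",
        "name": "cloned-dv",
    },
}

# Assumed shape of the restore PVC: dataSource is replaced,
# dataSourceRef is carried over unchanged ("vmsnapshot-disk" is made up).
restore_pvc_spec = dict(
    source_pvc_spec,
    dataSource={
        "apiGroup": "snapshot.storage.k8s.io",
        "kind": "VolumeSnapshot",
        "name": "vmsnapshot-disk",
    },
)

print(data_source_mismatch(source_pvc_spec))
print(data_source_mismatch(restore_pvc_spec))
```

Under these assumptions the source spec passes the check and the restore spec does not, which is consistent with the error in the restore status.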

Comment 2 Maya Rashish 2022-10-30 09:21:32 UTC
Note that VM clone isn't in 4.11 (the target release branch), but the fix might apply to this version anyway, what do you think alvaro?

Comment 3 Álvaro Romero 2022-11-02 08:50:12 UTC
(In reply to Maya Rashish from comment #2)
> Note that VM clone isn't in 4.11 (the target release branch), but the fix
> might apply to this version anyway, what do you think alvaro?

Thanks for noticing! Since the bug is in the restore controller and might still affect PVCs using dataSourceRef, I think it won't hurt to backport this.

Comment 4 Álvaro Romero 2022-11-02 08:58:28 UTC
(In reply to Álvaro Romero from comment #3)
> (In reply to Maya Rashish from comment #2)
> > Note that VM clone isn't in 4.11 (the target release branch), but the fix
> > might apply to this version anyway, what do you think alvaro?
> 
> Thanks for noticing! Since the bug is in the restore controller and might
> still affect PVCs using dataSourceRef, I think it won't hurt to backport
> this.

Now that I'm taking a closer look at the backport's code, I agree it might be unnecessary since we didn't create any restore PVC. I'll take a closer look but after seeing this I think it shouldn't be backported.

Comment 5 Álvaro Romero 2022-11-02 09:03:48 UTC
(In reply to Álvaro Romero from comment #4)
> (In reply to Álvaro Romero from comment #3)
> > (In reply to Maya Rashish from comment #2)
> > > Note that VM clone isn't in 4.11 (the target release branch), but the fix
> > > might apply to this version anyway, what do you think alvaro?
> > 
> > Thanks for noticing! Since the bug is in the restore controller and might
> > still affect PVCs using dataSourceRef, I think it won't hurt to backport
> > this.
> 
> Now that I'm taking a closer look at the backport's code, I agree it might
> be unnecessary since we didn't create any restore PVC. I'll take a closer
> look but after seeing this I think it shouldn't be backported.

Ignore my last comment, the PVC is created. I'll take a better look.

Comment 6 Maya Rashish 2022-11-02 11:44:03 UTC
Editing the target release was a mistake on my part. Reverting.
I thought about doing it, but instead wanted to ask Alvaro first, and forgot to undo it.

Comment 8 Joachim von Thadden 2022-11-17 15:57:36 UTC
This error also occurs with any template whose disks are derived from openshift-os-images in ODF.

Workaround to be able to restore snapshots:
- at any time BEFORE you restore the snapshot, edit the virtualmachinesnapshotcontent
  oc edit virtualmachinesnapshotcontent.snapshot.kubevirt.io/vmsnapshot-content-1f0fe2f2-324b-4104-af25-cd2ccf90cef7
- search for dataSourceRef: (spec.volumeBackups[persistentVolumeClaim].spec.dataSourceRef)
- remove it

Now you can restore the snapshot without any problem. This has to be done for each and every snapshot you take of such a machine.
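
The manual edit above could be scripted roughly as follows. This is only a sketch: it works on the object as a plain dictionary (for example parsed from "oc get ... -o json" output), the helper name strip_data_source_ref is hypothetical, and the path spec.volumeBackups[].persistentVolumeClaim.spec.dataSourceRef is assumed from the workaround description. Applying the cleaned object back to the cluster is left out.

```python
import copy


def strip_data_source_ref(snapshot_content):
    """Return a copy of a VirtualMachineSnapshotContent dict with
    dataSourceRef removed from every backed-up PVC spec.

    The input is left untouched; dataSource is kept as-is.
    """
    cleaned = copy.deepcopy(snapshot_content)
    for backup in cleaned.get("spec", {}).get("volumeBackups", []):
        pvc_spec = backup.get("persistentVolumeClaim", {}).get("spec", {})
        pvc_spec.pop("dataSourceRef", None)
    return cleaned


# Illustrative object; only the fields relevant to the workaround are shown.
example_content = {
    "kind": "VirtualMachineSnapshotContent",
    "spec": {
        "volumeBackups": [
            {
                "persistentVolumeClaim": {
                    "spec": {
                        "dataSource": {"kind": "VolumeSnapshot", "name": "cloned-dv"},
                        "dataSourceRef": {"kind": "VolumeSnapshot", "name": "cloned-dv"},
                    }
                }
            }
        ]
    },
}

cleaned_content = strip_data_source_ref(example_content)
```

Working on a copy keeps the original object available for comparison before anything is written back.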

Comment 9 Joachim von Thadden 2022-11-17 16:06:27 UTC
For anyone stuck restoring such a machine, here is a way out of this without removing the VM:
- search for the virtualmachinerestore.snapshot.kubevirt.io object that is stuck and shows "false" in the COMPLETE column
  oc get virtualmachinerestore.snapshot.kubevirt.io
- delete the object
  oc delete virtualmachinerestore.snapshot.kubevirt.io/restore-snapshot-<whatevername>
- extend oc/kubectl by installing krew (https://krew.sigs.k8s.io/docs/user-guide/setup/install/) and edit-status (https://github.com/ulucinar/kubectl-edit-status)
- then remove the status.restoreInProgress line
  oc edit-status virtualmachine.kubevirt.io/<yourMachine>
- now the VM can be started again
  virtctl start <yourMachine>
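
The first step above (finding the stuck restore) could be sketched like this, assuming the list comes from "oc get virtualmachinerestore.snapshot.kubevirt.io -o json" and that the COMPLETE column reflects a boolean status.complete field. Both the field name and the helper stuck_restores are assumptions for illustration, and the object names in the sample are made up.

```python
def stuck_restores(restore_list):
    """Return the names of restore objects that are not marked complete.

    Objects without a status, or without status.complete, count as stuck.
    """
    names = []
    for item in restore_list.get("items", []):
        complete = item.get("status", {}).get("complete", False)
        if not complete:
            names.append(item["metadata"]["name"])
    return names


# Hypothetical list output with one finished and one stuck restore.
sample_list = {
    "items": [
        {"metadata": {"name": "restore-a"}, "status": {"complete": True}},
        {"metadata": {"name": "restore-b"}, "status": {"complete": False}},
        {"metadata": {"name": "restore-c"}},
    ]
}

for name in stuck_restores(sample_list):
    print(name)
```

The names it returns are the candidates for the delete step; the remaining steps still have to be done by hand.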

Comment 10 dalia 2022-11-22 09:21:52 UTC
verified on CNV 4.11.1

Comment 19 errata-xmlrpc 2022-12-01 21:12:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.11.1 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8750