Description of problem:
VMRestore doesn't reach the Complete state: the restored DV stays in WaitForFirstConsumer, the restored PVC is Pending, and the restored VM is Stopped and not Ready.

Version-Release number of selected component (if applicable):
4.12

How reproducible:
Always, on an SNO cluster with snapshot-capable storage that uses the WaitForFirstConsumer volumeBindingMode (TopoLVM storage in our case - odf-lvm-vg1).

Steps to Reproduce:
1. Create a VM - the VM is Running
2. Create a VMSnapshot - the VMSnapshot is ReadyToUse
3. Create a VMRestore

Actual results:
VMRestore is not Complete:

$ oc get vmrestore
NAME            TARGETKIND       TARGETNAME    COMPLETE   RESTORETIME   ERROR
restore-my-vm   VirtualMachine   vm-restored   false

Expected results:
VMRestore is Complete (PVC Bound, DV Succeeded and garbage collected).

Workaround and ONE MORE ISSUE:
1. Start the restored VM
2. See that the VM is Ready and Running, the DV Succeeded, and the PVC is Bound
3. See that the VMRestore is still not Complete:

$ oc get vmrestore
NAME            TARGETKIND       TARGETNAME    COMPLETE   RESTORETIME   ERROR
restore-my-vm   VirtualMachine   vm-restored   false

$ oc describe vmrestore restore-my-vm | grep Events -A 10
Events:
  Type     Reason                      Age                    From                Message
  ----     ------                      ----                   ----                -------
  Warning  VirtualMachineRestoreError  4m4s (x23 over 4m21s)  restore-controller  VirtualMachineRestore encountered error invalid RunStrategy "Always"

4. See the restored VM runStrategy:

$ oc get vm vm-restored -oyaml | grep running
  running: true

*** PLEASE NOTE that the restored VM on OCS with the Immediate volumeBindingMode on a multi-node cluster gets "running: false", even though the source VM had it set to "true"; there we do not get the above error and the VMRestore becomes Complete:

$ oc get vm vm-restored-ocs -oyaml | grep running
  running: false
***

5. Stop the restored VM
6. See that the VMRestore is Complete:

$ oc get vmrestore
NAME            TARGETKIND       TARGETNAME    COMPLETE   RESTORETIME   ERROR
restore-my-vm   VirtualMachine   vm-restored   true       1s

(A CLI sketch of these start/stop steps follows the YAMLs below.)

Additional info:

VM yaml:
$ cat vm.yaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: vm-cirros-source
  labels:
    kubevirt.io/vm: vm-cirros-source
spec:
  dataVolumeTemplates:
  - metadata:
      name: cirros-dv-source
    spec:
      storage:
        resources:
          requests:
            storage: 1Gi
        storageClassName: odf-lvm-vg1
      source:
        http:
          url: <cirros-0.4.0-x86_64-disk.qcow2>
  running: true
  template:
    metadata:
      labels:
        kubevirt.io/vm: vm-cirros-source
    spec:
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: datavolumev
        machine:
          type: ""
        resources:
          requests:
            memory: 100M
      terminationGracePeriodSeconds: 0
      volumes:
      - dataVolume:
          name: cirros-dv-source
        name: datavolumev

VMSnapshot yaml:
$ cat snap.yaml
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineSnapshot
metadata:
  name: my-vmsnapshot
spec:
  source:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: vm-cirros-source

VMRestore yaml:
$ cat vmrestore.yaml
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineRestore
metadata:
  name: restore-my-vm
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: vm-restored
  virtualMachineSnapshotName: my-vmsnapshot
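For reference, a minimal CLI sketch of the workaround steps above (the VM and VMRestore names are taken from this report; it assumes virtctl is installed, and the start/stop could equivalently be done by patching spec.running):

$ virtctl start vm-restored              # workaround step 1: start the restored VM
$ oc get vmi vm-restored                 # wait until the VMI reports Running
$ oc get dv,pvc                          # the restored DV should reach Succeeded and the PVC become Bound
$ virtctl stop vm-restored               # step 5: stop the restored VM
$ oc get vmrestore restore-my-vm         # the VMRestore should now show COMPLETE=true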
Just to keep the info in this BZ: this bug was discussed at the KubeVirt SIG-Storage meeting, and the current approach to fixing it is to mark the VMRestore Complete when the DV is WFFC and the PVC is Pending.
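For anyone reproducing this, the state the fix keys on can be observed directly (commands are illustrative; the restored DV/PVC names depend on the restore):

$ oc get dv      # the restored DV phase stays WaitForFirstConsumer
$ oc get pvc     # the restored PVC stays Pending until a first consumer pod is scheduled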
Verified on an SNO cluster with TopoLVM and WFFC volumeBindingMode.
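A sketch of the checks this kind of verification relies on (the storage class and VMRestore names are from this report; the jsonpath expressions assume the default CRD fields):

$ oc get sc odf-lvm-vg1 -o jsonpath='{.volumeBindingMode}{"\n"}'          # expect: WaitForFirstConsumer
$ oc get vmrestore restore-my-vm -o jsonpath='{.status.complete}{"\n"}'   # expect: true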
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Virtualization 4.12.3 Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2023:3283