Bug 1894897

Summary: [v2v][VMIO] VMimport CR is not reported as failed when target VM is deleted during the import
Product: Container Native Virtualization (CNV)
Reporter: Maayan Hadasi <mguetta>
Component: V2V
Assignee: Sam Lucidi <slucidi>
Status: CLOSED ERRATA
QA Contact: Amos Mastbaum <amastbau>
Severity: medium
Priority: medium
Version: 2.5.0
CC: amastbau, cnv-qe-bugs, fdupont, istein, slucidi
Target Milestone: ---
Target Release: 2.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-03-10 11:18:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
- vm-import-controller yaml
- vmware-vmimport-1-describe
- vm-import-controller log
- importer-vmware-import-1-harddisk1 pod log

Description Maayan Hadasi 2020-11-05 11:51:11 UTC
Description of problem:
The VMware VMimport CR does not reach the expected "failed" status after the target VM is deleted during the disk copy/conversion stage.
In the UI, the VM is still shown with status "importing".


Version-Release number of selected component (if applicable):
CNV 2.5.0-413 (iib-24150)
OCP 4.6.1


How reproducible:
100%


Steps to Reproduce:
1. Have a running VM in VMware
2. Create VMimport CR via API
3. After the source VM is powered off and the disk copy has started, delete the target VM: oc delete vm <vm-name>
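
For reference, a minimal sketch of the VMimport CR from step 2 (a sketch only: field names follow the v2v.kubevirt.io/v1beta1 API, but the secret name, target VM name, and source VM placeholder are illustrative assumptions):

```yaml
apiVersion: v2v.kubevirt.io/v1beta1
kind: VirtualMachineImport
metadata:
  name: vmware-vmimport-1
  namespace: default
spec:
  providerCredentialsSecret:
    name: vmware-credentials    # assumed Secret holding vCenter URL/user/password
    namespace: default
  targetVmName: vmware-import-1
  startVm: false
  source:
    vmware:
      vm:
        name: <source-vm-name>  # the running VM in VMware from step 1
```

Step 3 then deletes the target VM while the importer pod is still copying the disk: oc delete vm vmware-import-1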


Actual results:
VMimport status is "Processing"
UI: the VMimport is displayed on the VM page with status "importing"
The VMimport stays in this state until the VMimport CR itself is deleted


Expected results:
The VMimport should report status "failed" because the target VM was deleted
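
In other words, a terminal condition along these lines would be expected on the CR (a sketch only; the exact condition type, reason, and message are assumptions, loosely matching the message later reported against the fixed build in comment 12):

```yaml
status:
  conditions:
  - type: Succeeded
    status: "False"
    reason: VMNotFound
    message: 'target VM vmware-import-1 not found'
```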


Additional info:
* Tested using an NFS storage class
* Regarding pods created during the import:
- Deleting the VM during the disk_copy stage -> the importer pod keeps running and is eventually removed (as if it had completed), but no vmimport.v2v.kubevirt conversion pod is created afterwards.
- Deleting the VM during the conversion stage -> the vmimport.v2v.kubevirt pod keeps running and completes.
* The source VM stays powered off until the VMimport CR itself is deleted


Attachments:
logs: vm-import-controller, importer pod
vm-import-controller yaml
vmimport CR describe

Comment 1 Maayan Hadasi 2020-11-05 11:52:19 UTC
Created attachment 1726847 [details]
vm-import-controller yaml

Comment 2 Maayan Hadasi 2020-11-05 11:53:01 UTC
Created attachment 1726848 [details]
vmware-vmimport-1-describe

Comment 3 Maayan Hadasi 2020-11-05 11:53:51 UTC
Created attachment 1726849 [details]
vm-import-controller log

Comment 4 Maayan Hadasi 2020-11-05 11:55:27 UTC
Created attachment 1726850 [details]
importer-vmware-import-1-harddisk1 pod log

Comment 5 Fabien Dupont 2020-11-06 08:04:29 UTC
In the vm-import-controller log, we can see the following message:

{"level":"error","ts":1604566865.2775548,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"virtualmachineimport-controller","name":"vmware-vmimport-1","namespace":"default","error":"VirtualMachine.kubevirt.io \"vmware-import-1\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:248\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:201\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

So, the vm-import-controller knows that the VM has been deleted. It could then delete the DataVolume and mark the import as failed with a meaningful message.
When the DataVolume is deleted, I guess that the importer pod is terminated too. Something to verify.
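
A minimal, self-contained Go sketch of that idea (this stubs the cluster client with a plain function and a sentinel error; the real controller would check apierrors.IsNotFound from apimachinery against a live client, so all names here are illustrative):

```go
package main

import (
	"errors"
	"fmt"
)

// errVMNotFound stands in for the NotFound error the API server returns,
// matching the message seen in the vm-import-controller log above.
var errVMNotFound = errors.New(`VirtualMachine.kubevirt.io "vmware-import-1" not found`)

// importStatus models the subset of the VMImport status the controller updates.
type importStatus struct {
	Phase   string
	Reason  string
	Message string
}

// reconcile sketches the proposed behavior: when fetching the target VM
// returns NotFound, mark the import failed instead of requeueing forever.
func reconcile(getVM func() error, status *importStatus) error {
	if err := getVM(); err != nil {
		if errors.Is(err, errVMNotFound) {
			// Terminal condition: record failure, do not requeue.
			// (The real fix would also clean up the DataVolume here.)
			status.Phase = "Failed"
			status.Reason = "VMNotFound"
			status.Message = err.Error()
			return nil
		}
		return err // transient error: let controller-runtime requeue
	}
	status.Phase = "Processing"
	return nil
}

func main() {
	st := &importStatus{}
	_ = reconcile(func() error { return errVMNotFound }, st)
	fmt.Println(st.Phase, st.Reason)
}
```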

Comment 6 Maayan Hadasi 2020-11-08 08:31:24 UTC
(In reply to Fabien Dupont from comment #5)
> In the vm-import-controller log, we can see the following message:
> 
> [quoted reconciler error log snipped; see comment #5]
> 
> So, the vm-import-controller knows that the VM has been deleted. It could
> then delete the DataVolume and mark the import as failed with a meaningful
> message.
> When the DataVolume is deleted, I guess that the importer pod is terminated
> too. Something to verify.

It seems the importer pod is terminated too. Please see "Additional info" in the bug description.

Comment 7 Fabien Dupont 2020-11-18 15:27:31 UTC
It is not related to a specific provider. Marking BZ#1894900 as duplicate to reduce admin work.

Comment 8 Fabien Dupont 2020-11-18 15:27:51 UTC
*** Bug 1894900 has been marked as a duplicate of this bug. ***

Comment 9 Fabien Dupont 2021-01-25 10:20:28 UTC
@slucidi do you think this could be fixed in CNV 2.6.0? If not, do you think it's worth fixing in CNV at all?

Comment 10 Sam Lucidi 2021-01-25 16:13:33 UTC
I think I'll have time to fix it for 2.6.

Comment 11 Fabien Dupont 2021-01-28 16:56:05 UTC
The fix should be in hco-bundle-registry-container-v2.6.0-521 and onwards. Moving to ON_QA.

Comment 12 Amos Mastbaum 2021-02-04 10:17:09 UTC
verified build: iib-42945 hco-v2.6.0-523
ovirt+vmware

VMNotFound: target VM XXX-for-tests not found


Comment 16 errata-xmlrpc 2021-03-10 11:18:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0799