Bug 1894897 - [v2v][VMIO] VMimport CR is not reported as failed when target VM is deleted during the import
Summary: [v2v][VMIO] VMimport CR is not reported as failed when target VM is deleted during the import
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: V2V
Version: 2.5.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 2.6.0
Assignee: Sam Lucidi
QA Contact: Amos Mastbaum
URL:
Whiteboard:
Duplicates: 1894900
Depends On:
Blocks:
 
Reported: 2020-11-05 11:51 UTC by Maayan Hadasi
Modified: 2021-03-10 11:19 UTC
CC: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-10 11:18:59 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
vm-import-controller yaml (7.63 KB, text/plain)
2020-11-05 11:52 UTC, Maayan Hadasi
vmware-vmimport-1-describe (3.42 KB, text/plain)
2020-11-05 11:53 UTC, Maayan Hadasi
vm-import-controller log (2.62 MB, text/plain)
2020-11-05 11:53 UTC, Maayan Hadasi
importer-vmware-import-1-harddisk1 pod log (5.01 KB, text/plain)
2020-11-05 11:55 UTC, Maayan Hadasi


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:0799 0 None None None 2021-03-10 11:19:51 UTC

Description Maayan Hadasi 2020-11-05 11:51:11 UTC
Description of problem:
The VMware VMImport CR does not reach the expected "Failed" status after the target VM is deleted during the disk copy/conversion stage.
In the UI it is still shown with status "Importing".


Version-Release number of selected component (if applicable):
CNV 2.5.0-413 (iib-24150)
OCP 4.6.1


How reproducible:
100%


Steps to Reproduce:
1. Have a running VM in VMware
2. Create VMimport CR via API
3. After the source VM is powered off and the disk copy has started, delete the target VM: oc delete vm <vm-name>


Actual results:
VMImport status is "Processing"
UI: the VMImport is displayed on the VM page with status "Importing"
The VMImport remains in this state until the VMImport CR is deleted


Expected results:
VMImport should have status "Failed" because the target VM was deleted


Additional info:
* Tested using an NFS storage class
* Regarding pods created during the import:
- When the VM is deleted during the disk copy stage, the importer pod keeps running and is eventually removed (as if it had completed), but no vmimport.v2v.kubevirt conversion pod is created afterwards.
- When the VM is deleted during the conversion stage, the vmimport.v2v.kubevirt pod keeps running and completes.
* The source VM stays powered off until the VMImport CR itself is deleted


Attachments:
logs: vm-import-controller, importer pod
vm-import-controller yaml
vmimport CR describe

Comment 1 Maayan Hadasi 2020-11-05 11:52:19 UTC
Created attachment 1726847 [details]
vm-import-controller yaml

Comment 2 Maayan Hadasi 2020-11-05 11:53:01 UTC
Created attachment 1726848 [details]
vmware-vmimport-1-describe

Comment 3 Maayan Hadasi 2020-11-05 11:53:51 UTC
Created attachment 1726849 [details]
vm-import-controller log

Comment 4 Maayan Hadasi 2020-11-05 11:55:27 UTC
Created attachment 1726850 [details]
importer-vmware-import-1-harddisk1 pod log

Comment 5 Fabien Dupont 2020-11-06 08:04:29 UTC
In the vm-import-controller log, we can see the following message:

{"level":"error","ts":1604566865.2775548,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"virtualmachineimport-controller","name":"vmware-vmimport-1","namespace":"default","error":"VirtualMachine.kubevirt.io \"vmware-import-1\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:248\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:222\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:201\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/github.com/kubevirt/vm-import-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88"}

So, the vm-import-controller knows that the VM has been deleted. It could then delete the DataVolume and mark the import as failed with a meaningful message.
When the DataVolume is deleted, I guess that the importer pod is terminated too. Something to verify.
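A minimal sketch of the behavior Fabien describes, in stdlib-only Go with illustrative names (this is not the actual vm-import-operator code): when the reconciler finds the target VM gone, it should clean up and mark the import "Failed" with a meaningful reason instead of returning the NotFound error, which only requeues the request and leaves the CR stuck in "Processing".

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for the apierrors.IsNotFound check made
// against the Kubernetes API; all names here are hypothetical.
var errNotFound = errors.New("VirtualMachine.kubevirt.io not found")

type importStatus struct {
	Phase   string // "Processing", "Failed", ...
	Reason  string
	Message string
}

// getTargetVM simulates fetching the target VM; deleted=true models
// the scenario in this bug (target VM removed mid-import).
func getTargetVM(deleted bool) error {
	if deleted {
		return errNotFound
	}
	return nil
}

// reconcile sketches the desired behavior: on NotFound, mark the
// import Failed (the real controller would also delete the
// DataVolume, terminating the importer pod) instead of propagating
// the error and retrying forever.
func reconcile(st *importStatus, vmDeleted bool) error {
	if err := getTargetVM(vmDeleted); err != nil {
		if errors.Is(err, errNotFound) {
			st.Phase = "Failed"
			st.Reason = "VMNotFound"
			st.Message = "target VM not found"
			return nil // a permanent failure is not requeued
		}
		return err // transient errors are still retried
	}
	st.Phase = "Processing"
	return nil
}

func main() {
	st := &importStatus{Phase: "Processing"}
	if err := reconcile(st, true); err != nil {
		panic(err)
	}
	fmt.Printf("%s: %s\n", st.Reason, st.Message)
}
```

The printed "VMNotFound: ..." reason mirrors the condition later reported in the verification comments below.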

Comment 6 Maayan Hadasi 2020-11-08 08:31:24 UTC
(In reply to Fabien Dupont from comment #5)
> In the vm-import-controller log, we can see the following message:
> 
> [snip: quoted reconciler error log]
> 
> So, the vm-import-controller knows that the VM has been deleted. It could
> then delete the DataVolume and mark the import as failed with a meaningful
> message.
> When the DataVolume is deleted, I guess that the importer pod is terminated
> too. Something to verify.

It seems the importer pod is terminated too. Please see "Additional info" in the bug description.

Comment 7 Fabien Dupont 2020-11-18 15:27:31 UTC
It is not related to a specific provider. Marking BZ#1894900 as duplicate to reduce admin work.

Comment 8 Fabien Dupont 2020-11-18 15:27:51 UTC
*** Bug 1894900 has been marked as a duplicate of this bug. ***

Comment 9 Fabien Dupont 2021-01-25 10:20:28 UTC
@slucidi do you think this could be fixed in CNV 2.6.0? If not, do you think it's worth fixing it in CNV at all?

Comment 10 Sam Lucidi 2021-01-25 16:13:33 UTC
I think I'll have time to fix it for 2.6.

Comment 11 Fabien Dupont 2021-01-28 16:56:05 UTC
The fix should be in hco-bundle-registry-container-v2.6.0-521 and onwards. Moving to ON_QA.

Comment 12 Amos Mastbaum 2021-02-04 10:17:09 UTC
verified build: iib-42945 hco-v2.6.0-523
ovirt+vmware

VMNotFound: target VM XXX-for-tests not found

Comment 13 Amos Mastbaum 2021-02-04 10:24:15 UTC
verified build: iib-42945 hco-v2.6.0-523
ovirt+vmware

VMNotFound: target VM XXX-for-tests not found

Comment 16 errata-xmlrpc 2021-03-10 11:18:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0799

