Bug 1910019
| Summary: | [v2v][VM import from RHV] Communication issue is not reflected in the VM import failure in CNV UI | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Ilanit Stein <istein> |
| Component: | V2V | Assignee: | Sam Lucidi <slucidi> |
| Status: | CLOSED WONTFIX | QA Contact: | Daniel Gur <dagur> |
| Severity: | medium | Priority: | medium |
| Version: | 2.5.1 | CC: | cnv-qe-bugs, fdupont, mrashish, slucidi |
| Target Release: | 2.6.2 | Flags: | istein: needinfo+, istein: needinfo- |
| Hardware: | Unspecified | OS: | Unspecified |
| Last Closed: | 2021-04-27 12:24:32 UTC | Type: | Bug |
| Bug Blocks: | 1954008 | | |

Description
Ilanit Stein
2020-12-22 10:42:12 UTC
@slucidi, could you please check whether CDI is reporting this error in its status? IIUC, whatever is reported in the status will bubble up in VMIO, but if it's only in the events it won't, right? If so, either CDI should report the error in the status, or VMIO should check the events. Leaving the BZ in NEW state, as we need to investigate more to know which component is "faulty".

It appears that the importer pod error is recorded in the termination log for the container, and in the event log. It looks like VMIO will retrieve the termination message and re-emit it, and retry until it hits the crash loop backoff limit. That means that the VirtualMachineImport should have the termination errors in its event log, but the status once the import fails completely will be "pod CrashLoopBackoff restart exceeded". Ilanit, do you have a reproducer environment, or can you check the VirtualMachineImport event log to see if the messages appear there?

Tested on OCP-4.7/CNV-2.6.0. VM import of a VM with a 100GB disk when there is only 65GB available on the Ceph storage on the OCP side. After a couple of hours, the VM import in the UI remains at 46%.

    $ oc describe vmimports/vm-import-v2vmigrationvm0-lvjt5

shows these events:

    Events:
      Type     Reason                Age                    From                             Message
      ----     ------                ----                   ----                             -------
      Normal   ImportScheduled       30m                    virtualmachineimport-controller  Import of Virtual Machine default/v2vmigrationvm0 started
      Normal   ImportInProgress      30m                    virtualmachineimport-controller  Import of Virtual Machine default/v2vmigrationvm0 disk v2vmigrationvm0-03072434-e45b-430c-8860-ff50b0c71a2c in progress
      Warning  EventPVCImportFailed  31s (x385 over 5m31s)  virtualmachineimport-controller  Unable to process data: unexpected EOF

I can provide the details of this "mgn04" cluster offline, if needed.

The fix should be part of hco-bundle-registry build v2.6.2-4 / iib:66925.

Tested on hco-v2.6.2-23 iib:68580. It is not possible to verify the bug, since the importer pod no longer fails but keeps retrying forever. The CDI importer behavior on this version is different.

VM import from RHV to Ceph-RBD/Block. The OCP Ceph size is 70GB, and the imported VM requires 106GB. The importer log endlessly shows progress: "I0422 08:40:02.495231 1 prometheus.go:69] 100.00"

When the bug was reported, the import used to fail with a crash loop backoff after a few minutes, but now it continues forever. This is a problem because the user gets no indication that there is not enough space for the import. @Maya, can you please confirm that this is indeed the expected behavior?

Adding that when the VM import is cancelled, the PVC and the importer pod remain in Terminating status, and the PV remains occupied. This is not new, though, and I think there is an OCS bug for it.

Based on the test result detailed in comment #7, this bug cannot be verified. It also cannot be fixed from the VM import side, since on CNV-2.6.2 the importer pod no longer fails when Ceph gets full. @Fabien, @Sam, based on the above, would it be OK to move this bug to won't fix? And for 2.6.2, would the following documentation for VM import from RHV be OK? Make sure there is enough space for the VM import. If the VM import remains at 75% with no progress for a long time, check the importer log; if it repeatedly shows progress 100, the Ceph storage needs to be expanded.

Regarding the inability to release the Ceph storage even though the VM import is cancelled, we already have this bug: Bug 1893528 - [v2v][VM import] Not possible to release target Ceph-RBD/Block space after "partial" disk copy. It was closed as a duplicate of this OCS bug: Bug 1897351 - [Tracking bug 1910288] Capacity limit usability concerns

I'm fine with closing this BZ as WONTFIX and only updating the docs.

Based on comment #10, closing this bug as won't fix. Cloning it to a doc bug to document comment #9.
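
The manual check proposed for the documentation (look at the importer log and see whether it keeps reporting progress 100) could be scripted roughly as follows. This is only a minimal sketch: it assumes every progress line has the same shape as the single sample quoted in this bug, and the names `parse_progress`, `looks_stalled` and the `min_repeats` threshold are hypothetical helpers, not part of CDI or VMIO. A run of 100.00 readings is a hint to go and check the Ceph capacity, not proof that the storage is full.

```python
import re
from typing import Iterable, List, Optional

# Matches the one sample line quoted in this bug, e.g.
#   I0422 08:40:02.495231 1 prometheus.go:69] 100.00
# Assumption: other importer versions may log progress in a different format.
PROGRESS_RE = re.compile(r"prometheus\.go:\d+\]\s+(\d+(?:\.\d+)?)\s*$")


def parse_progress(line: str) -> Optional[float]:
    """Return the progress value from an importer log line, or None."""
    match = PROGRESS_RE.search(line)
    return float(match.group(1)) if match else None


def looks_stalled(lines: Iterable[str], min_repeats: int = 10) -> bool:
    """True if the last `min_repeats` progress readings are all 100.00.

    `min_repeats` is an arbitrary, hypothetical threshold; tune it to how
    often the importer logs progress in your environment.
    """
    readings: List[float] = [
        p for p in (parse_progress(line) for line in lines) if p is not None
    ]
    if len(readings) < min_repeats:
        return False
    return all(p >= 100.0 for p in readings[-min_repeats:])


if __name__ == "__main__":
    sample = ["I0422 08:40:02.495231 1 prometheus.go:69] 100.00"] * 12
    if looks_stalled(sample):
        print("Importer keeps reporting 100.00; check the target storage capacity.")
```

The log text would typically be obtained with `oc logs` on the importer pod and passed in line by line; the pod name depends on the import and is left out here.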