Bug 1985827
**Summary:** Start or remove VM failure even though v2v has already finished

| Field | Value | Field | Value |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Xiaodai Wang <xiaodwan> |
| Component: | virt-v2v | Assignee: | Richard W.M. Jones <rjones> |
| Status: | CLOSED ERRATA | QA Contact: | Xiaodai Wang <xiaodwan> |
| Severity: | high | Priority: | high |
| Version: | 8.6 | Target Release: | 8.5 |
| Target Milestone: | rc | Keywords: | Automation, Triaged |
| Hardware: | Unspecified | OS: | Unspecified |
| Fixed In Version: | virt-v2v-1.42.0-19.el8 | Doc Type: | If docs needed, set a value |
| Clones: | 1985830 (view as bug list) | Type: | Bug |
| Last Closed: | 2022-11-08 09:18:32 UTC | CC: | ahadas, chhu, juzhou, kkiwi, lersek, mxie, nsoffer, rjones, tyan, tzheng, vwu |
Comments from Nir Soffer, thanks.

Looking at the code related to creating a VM, it is expected that the VM will not have finished the import when virt-v2v finishes, since virt-v2v does not wait for completion:

```python
vm = vms_service.add(
    types.Vm(
        cluster=cluster,
        initialization=types.Initialization(
            configuration=types.Configuration(
                type=types.ConfigurationType.OVA,
                data=ovf,
            )
        )
    )
)
```

This call completes before the VM is created. Waiting until an operation is completed is a great pain in RHV. The best way to do this for most operations is to use jobs, as used here:
https://gitlab.com/nirs/ovirt-stress/-/blob/bb2088f05b3e186733da99260f7b5ecb0a0d03a8/delete-snapshot/test.py#L148

The general idea is to add a correlation_id when calling the API, and then wait for the job with that id.

---

Nir, would you be able to provide a patch for this?

---

(In reply to Xiaodai Wang from comment #1)
> The best way to do this for most operations is to use jobs, as used here:
> https://gitlab.com/nirs/ovirt-stress/-/blob/bb2088f05b3e186733da99260f7b5ecb0a0d03a8/delete-snapshot/test.py#L148

This can lower the chances of hitting this failure, but it won't be bullet-proof, because it monitors VDSM tasks, and there are no VDSM tasks in this flow of import-from-configuration.

We were discussing a better way to monitor when an entity is created in https://gerrit.ovirt.org/#/c/ovirt-engine/+/115673/ - which AFAIK is a work in progress.

---

(In reply to Arik from comment #3)
> This can lower the chances of hitting this failure, but it won't be
> bullet-proof, because it monitors VDSM tasks, and there are no VDSM tasks
> in this flow of import-from-configuration.

Why would jobs not work for this case? Is the import code not integrated with jobs properly?
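The wait-for-jobs pattern suggested above can be sketched without a live engine. The helper below is a minimal, server-free illustration: after issuing the request with a correlation id, poll until the matching jobs report completion or a timeout expires. `fake_job_done` is a hypothetical stand-in for querying the engine's jobs by correlation id, not the real oVirt SDK API.

```python
import time

def wait_for_completion(poll, timeout=300.0, interval=2.0):
    """Poll until poll() reports the tracked operation finished, or time out.

    `poll` is any zero-argument callable returning True once all jobs
    matching the chosen correlation id have completed.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if poll():
            return True
        time.sleep(interval)
    raise TimeoutError("operation did not complete within %.0fs" % timeout)

# Simulated job query that reports completion on the third poll,
# standing in for a real jobs lookup by correlation id.
state = {"polls": 0}

def fake_job_done():
    state["polls"] += 1
    return state["polls"] >= 3

assert wait_for_completion(fake_job_done, timeout=10.0, interval=0.01)
```

With the real SDK, `poll` would be replaced by a query of the engine's jobs collection filtered by the correlation id passed to `vms_service.add`.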
> We were discussing a better way to monitor when an entity is created in
> https://gerrit.ovirt.org/#/c/ovirt-engine/+/115673/ - which AFAIK is a work
> in progress

Do you mean changing the status of the VM after the memory lock is released? Isn't this already solved by jobs?

---

Ah no, you're right - your code monitors the execution jobs that are correlated with the correlation id you provide, like OST does, so that should work.

But still, if we manage to get https://gerrit.ovirt.org/#/c/ovirt-engine/+/115838/ in and change the polling mechanism of ImportVmFromConfiguration to jobs (rather than VDSM tasks), then you won't need the extra code that polls the execution jobs on the client side. We could then just invoke this operation in a synchronous (blocking) mode.

---

(In reply to Richard W.M. Jones from comment #2)
> Nir, would you be able to provide a patch for this?

Any updates here?

---

Tomáš and I talked about making the call synchronous in virt-v2v. It won't change the situation much until https://gerrit.ovirt.org/#/c/ovirt-engine/+/116263/ gets in - so in the meantime there would need to be a workaround in the test (like a sleep of a few seconds after the call to vms_service.add that was mentioned in comment 1). The fix on ovirt-engine is expected to land in v4.5, and then the workaround could be dropped from the test. Tomáš will post a patch and we can discuss it further.

---

Bulk update: Move RHEL-AV bugs to RHEL9. If necessary to resolve in RHEL8, then clone to the current RHEL8 release.

---

I'm going to put this back to RHEL AV (i.e. 8.6) because RHV is basically RHEL AV only. Of course we will need to fix it in RHEL 9 too eventually. However, we have no fix upstream yet, so it's best to leave it assigned to Tomáš and ask for his help.

---

There is theoretically no RHEL-AV 8.6.0 - all fixes would need to be done in RHEL. However, I will leave this bug on AV for now and, once we know a fix is ready, request the 8.5.z-stream and move this to RHEL 8.6.0 to be processed.
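Until the engine-side fix lands, the workaround discussed above is client-side: either sleep a few seconds after `vms_service.add`, or retry the follow-up operation while the engine still reports the VM as being imported. Below is a generic retry sketch using only the standard library; `VmBeingImportedError` and `start_vm` are simulated stand-ins for the real SDK fault and the real start call, not oVirt API names.

```python
import time

class VmBeingImportedError(Exception):
    """Simulated stand-in for the engine's 'VM ... is being imported' fault."""

def retry_while_importing(op, attempts=10, delay=1.0):
    """Call op(), retrying while the VM is still locked by the import."""
    for attempt in range(attempts):
        try:
            return op()
        except VmBeingImportedError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(delay)

# Simulated start-VM call that fails twice while the import is still
# running, then succeeds once the VM lock is released.
calls = {"n": 0}

def start_vm():
    calls["n"] += 1
    if calls["n"] < 3:
        raise VmBeingImportedError("Cannot run VM. VM is being imported.")
    return "up"

assert retry_while_importing(start_vm, attempts=5, delay=0.01) == "up"
```

Once ImportVmFromConfiguration is driven by jobs and the operation can be invoked synchronously, this kind of client-side retry loop becomes unnecessary.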
RHV is planning to consume that. The clone and move is just a "workaround" because the tool doesn't clone RHEL-AV bugs from RHEL bugs. I added RHEV as a dependent product.

---

Moving to RHEL8 instead of RHEL-AV. If an AV resolution is required, then z-streams must be used. AIUI, RHV will move to RHEL 8.6 in the future.

---

I triggered one of our automation jobs which can always reproduce the issue and checked the latest result with the latest build. As you can see in the link below [1], almost all cases are PASS or FIXED. I didn't come across the issue any more, so I think this bug can be moved to VERIFIED.

[1] https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/v2v-RHEL-8.7-runtest-x86_64-function-specific_kvm-rhel/5/testReport/rhel/specific_kvm/

---

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Low: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7472
Created attachment 1805713 [details]
ovirt-engine log

Description of problem:
Start or remove VM failure even though v2v has already finished

Version-Release number of selected component (if applicable):
virt-v2v-1.42.0-14.module+el8.5.0+11846+77888a74.x86_64
Software Version: 4.4.7.6-0.11.el8ev

How reproducible:
30%

Steps to Reproduce:
1. Run virt-v2v to convert a VM to RHV by rhv-upload.
2. Start the VM or remove the VM once v2v finishes.

The issue is hard to reproduce manually; it's better to keep running the following test case by automation:
function_test_xen.positive_test.rhev.scsi_disk.output_mode.rhev.rhv_upload

Actual results:
2021-07-22 13:22:49,124+08 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-876) [] Operation Failed: [Cannot run VM. VM Auto-xen-rhel7.9-scsidisk is being imported.]

Expected results:
The VM operation should succeed when v2v finishes.

Additional info: