Bug 1668049
Summary: | [v2v][OSP] Migrations failure: Fails to create Instance after Disk conversion is finished | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Kedar Kulkarni <kkulkarn>
Component: | libguestfs | Assignee: | Tomáš Golembiovský <tgolembi>
Status: | CLOSED ERRATA | QA Contact: | Lukas Svaty <lsvaty>
Severity: | high | Docs Contact: |
Priority: | urgent | |
Version: | unspecified | CC: | apinnick, bthurber, fdupont, iovadia, maufart, ohochman, rjones, tgolembi, ytale
Target Milestone: | ovirt-4.2.8-1 | Keywords: | Regression, TestBlocker
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | v2v | |
Fixed In Version: | async in OSP 14 and RHV 4.2.8 | Doc Type: | Bug Fix
Doc Text: | A change in the virt-v2v output prevented virt-v2v-wrapper from detecting OpenStack Platform volumes, causing migrations to fail. The current release fixes this issue so that OpenStack Platform volume IDs are detected. | |
Story Points: | --- | |
Clone Of: | | |
: | 1668777, 1668791 (view as bug list) | Environment: |
Last Closed: | 2019-02-13 15:33:20 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1607368, 1651352, 1654861, 1668679, 1668791, 1668816, 1672591 | |
Attachments: | | |
Created attachment 1522237 [details]
virt-v2v logs
Created attachment 1522238 [details]
virt-v2v-wrapper-log
I've run a migration with the latest appliance and it failed as described in this BZ. The environment is composed of:

- OpenStack 13
- CloudForms 5.10.0.31.20190108221820_a0968c8
- rhos-v2v-appliance 14.0-20190116.1, with:
  - ovirt-ansible-v2v-conversion-host 1.9.0-3
  - virt-v2v-1.38.2-12.22.lp.el7ev.x86_64
  - nbdkit-1.2.6-1.el7_6.2.x86_64
  - nbdkit-plugin-vddk-1.2.6-1.el7_6.2.x86_64
  - nbdkit-plugin-python2-1.2.6-1.el7_6.2.x86_64
  - nbdkit-plugin-python-common-1.2.6-1.el7_6.2.x86_64

The migration of this specific VM worked earlier with a conversion host created manually from RHEL 7.6. The difference between the migrations is that virt-v2v-wrapper.py is unable to identify the volume id in the virt-v2v log. A successful identification produces the following kind of line in the wrapper log:

    2019-01-15 02:17:40,913:DEBUG: Volume at index 1 has id='310718f5-048e-49cb-a65c-513904e068cc' (virt-v2v-wrapper:912)

With the latest appliance, this line doesn't appear in the log, because the regex fails to match the corresponding line in the virt-v2v log. For the failed migration, the line is:

    openstack [...] volume set --description hridbxprd001 disk 1/1 converted by virt-v2v --property virt_v2v_version=1.38.2rhel=7,release=12.28.lp.el7ev,libvirt --property virt_v2v_conversion_date=2019/01/22 08:55:51 --property virt_v2v_guest_name=hridbxprd001 --property virt_v2v_disk_index=1/1 --property virt_v2v_guest_id=d5d9f7bf-ffc5-4f82-ac0e-e9d961fc569d ab42e303-4d31-4932-9589-9f3726ebdae7

In virt-v2v-wrapper.py, the regex is the following:

    OSP_VOLUME_PROPS = re.compile(
        br'openstack .*\'volume\' \'set\'.*'
        br'\'--property\''
        br' \'virt_v2v_disk_index=(?P<volume>[0-9]+)/[0-9]+\'.*'
        br'\'(?P<uuid>[a-fA-F0-9-]*)\'')

The other difference I noticed between a conversion host where the migration was successful and the one based on rhos-v2v-appliance is the version of nbdkit, which is 1.2.6-1.1 on the successful conversion host and 1.2.6 on the failing one.

I checked the virt-v2v log line on the machine that succeeded and the format is different:

    openstack '--os-username=admin' '--os-identity-api-version=3' '--os-user-domain-name=default' '--os-auth-url=http://controller.v2v.bos.redhat.com:5000/v3' '--os-project-name=admin' '--os-password=100Root-' 'volume' 'set' '--description' 'hriprdweb001 disk 1/1 converted by virt-v2v' '--property' 'virt_v2v_version=1.38.2rhel=7,release=12.22.lp.el7ev,libvirt' '--property' 'virt_v2v_conversion_date=2019/01/15 07:13:30' '--property' 'virt_v2v_guest_name=hriprdweb001' '--property' 'virt_v2v_disk_index=1/1' '--property' 'virt_v2v_guest_id=ff908ae6-385f-4edd-ba87-b0cd82189313' '310718f5-048e-49cb-a65c-513904e068cc'

You can see that the arguments are single-quoted here, while they are not quoted on the failing VM. (A minimal sketch reproducing this mismatch appears below.)

FYI: found volumes are not getting cleaned up in this case even though the migration failed. With the cancel-migration issue (BZ1666799), the assumption was that a failed migration should clean up its failed volumes (which it was also doing before this case).
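To make the mismatch concrete, here is a small, self-contained sketch (illustrative only, not part of the original comments). The pattern is the OSP_VOLUME_PROPS regex quoted above; the two log lines are shortened stand-ins for the quoted (succeeding) and unquoted (failing) virt-v2v output, not the full lines from the logs.

```python
# Minimal reproduction of the regex mismatch described in this bug.
import re

# Copied from virt-v2v-wrapper.py as quoted in the comment above.
OSP_VOLUME_PROPS = re.compile(
    br'openstack .*\'volume\' \'set\'.*'
    br'\'--property\''
    br' \'virt_v2v_disk_index=(?P<volume>[0-9]+)/[0-9]+\'.*'
    br'\'(?P<uuid>[a-fA-F0-9-]*)\'')

# Older virt-v2v: every argument is printed single-quoted (shortened stand-in).
quoted = (b"openstack 'volume' 'set' '--property' 'virt_v2v_disk_index=1/1' "
          b"'--property' 'virt_v2v_guest_id=ff908ae6' "
          b"'310718f5-048e-49cb-a65c-513904e068cc'")

# Newer virt-v2v (after the bug 1664310 change): same command, no quotes.
unquoted = (b"openstack volume set --property virt_v2v_disk_index=1/1 "
            b"--property virt_v2v_guest_id=d5d9f7bf "
            b"ab42e303-4d31-4932-9589-9f3726ebdae7")

for name, line in (("quoted", quoted), ("unquoted", unquoted)):
    m = OSP_VOLUME_PROPS.search(line)
    print(name, m.group('uuid') if m else None)
# quoted   b'310718f5-048e-49cb-a65c-513904e068cc'
# unquoted None -- the single quotes the pattern requires are no longer there
```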
(In reply to Fabien Dupont from comment #6)
> I checked the virt-v2v log line on the machine that succeeded and the format
> is different:
>
> openstack '--os-username=admin' '--os-identity-api-version=3'
> '--os-user-domain-name=default'
> '--os-auth-url=http://controller.v2v.bos.redhat.com:5000/v3'
> '--os-project-name=admin' '--os-password=100Root-' 'volume' 'set'
> '--description' 'hriprdweb001 disk 1/1 converted by virt-v2v' '--property'
> 'virt_v2v_version=1.38.2rhel=7,release=12.22.lp.el7ev,libvirt' '--property'
> 'virt_v2v_conversion_date=2019/01/15 07:13:30' '--property'
> 'virt_v2v_guest_name=hriprdweb001' '--property' 'virt_v2v_disk_index=1/1'
> '--property' 'virt_v2v_guest_id=ff908ae6-385f-4edd-ba87-b0cd82189313'
> '310718f5-048e-49cb-a65c-513904e068cc'
>
> You can see that there are single quotes when they are not there on failing
> VM.

I see - this is indeed a side effect of fixing bug 1664310:
https://github.com/libguestfs/libguestfs/commit/fc028bf57a3ff128d21b904583f9ea02f672ed5b

A different function is now used to print the command, and it doesn't quote. I think the real lesson from this is that we should write some kind of metadata file which the wrapper can pick up, rather than relying on parsing the debug logs, which are not intended for this and are just going to cause ongoing problems of this sort in the future.

(In reply to Yadnyawalk Tale from comment #7)
> FYI. Found volumes are not getting cleaned up in this case despite it is
> Failed. While cancel migration issue (BZ1666799) assumption was failed
> migration should cleanups failed volumes (which it was also doing earlier
> before this case).

This is another side effect of the current bug.

I have edited virt-v2v-wrapper.py with the following fix and the migration passed:
https://github.com/oVirt/ovirt-ansible-v2v-conversion-host/commit/29f5317f0b668bed413c360903940ef5a10cb297

Moving over to RHV, as the fix is in virt-v2v-wrapper. Cleared the blocker flag. Will ship the fix async for RHV 4.2.8 and the OSP 14 conversion host image.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0338
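The actual fix is the virt-v2v-wrapper change in the linked ovirt-ansible-v2v-conversion-host commit. Purely as an illustration of what a quote-tolerant pattern can look like (a hypothetical variant, not the upstream patch), the single quotes can be made optional and the volume id anchored to the end of the line, assuming the wrapper matches the log line by line:

```python
import re

# Hypothetical quote-tolerant variant of OSP_VOLUME_PROPS -- not the upstream
# patch.  The single quotes become optional and the volume id is anchored at
# the end of the line, so both the old (quoted) and new (unquoted) virt-v2v
# output match.  Assumes the pattern is applied line by line.
OSP_VOLUME_PROPS_RELAXED = re.compile(
    br'openstack .*\'?volume\'? \'?set\'?.*'
    br'\'?--property\'?'
    br' \'?virt_v2v_disk_index=(?P<volume>[0-9]+)/[0-9]+\'?.*'
    br' \'?(?P<uuid>[a-fA-F0-9-]+)\'?\s*$')

# Shortened stand-in for the unquoted (previously failing) log line.
line = (b"openstack volume set --property virt_v2v_disk_index=1/1 "
        b"--property virt_v2v_guest_id=d5d9f7bf "
        b"ab42e303-4d31-4932-9589-9f3726ebdae7")
m = OSP_VOLUME_PROPS_RELAXED.search(line)
print(m.group('volume'), m.group('uuid'))
# b'1' b'ab42e303-4d31-4932-9589-9f3726ebdae7'
```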
Created attachment 1522236 [details]
Screenshot

Description of problem:
VM migration with the latest CloudForms build fails after the disk conversion is finished. My observation is that the disk conversion finishes at 100% and virt-v2v finishes with return code 0, but the migration fails to create the instance and the migration plan fails. Virt-v2v and virt-v2v-wrapper logs are attached.

Version-Release number of selected component (if applicable):
- CloudForms 5.10.0.32
- virt-v2v-1.38.2-12.28.lp.el7ev.x86_64
- Wrapper version: VERSION = "11"

How reproducible:
100% (tried 4 different VM migrations, all failed the same way)

Steps to Reproduce:
1. Configure the CFME appliance and conversion instance for migration.
2. Create the infrastructure mapping and migration plans.
3. Execute the migration plan and wait for completion.

Actual results:
See the logs and attached screenshot; the migration fails after 100% conversion of the disk.

Expected results:
The migration should finish successfully and launch the instance.

Additional info:
Logs are attached only for 1 migration, but all 4 migrations that were run failed similarly.