Bug 1668049 - [v2v][OSP] Migrations failure: Fails to create Instance after Disk conversion is finished
Summary: [v2v][OSP] Migrations failure: Fails to create Instance after Disk conversion...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: libguestfs
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: high
Target Milestone: ovirt-4.2.8-1
Target Release: ---
Assignee: Tomáš Golembiovský
QA Contact: Lukas Svaty
URL:
Whiteboard: v2v
Depends On:
Blocks: 1668791 1607368 1651352 1654861 1668679 1668816 1672591
 
Reported: 2019-01-21 19:20 UTC by Kedar Kulkarni
Modified: 2019-02-13 15:33 UTC

Fixed In Version: async in OSP 14 and RHV 4.2.8
Doc Type: Bug Fix
Doc Text:
A change in the virt-v2v output prevented virt-v2v-wrapper from detecting OpenStack Platform volumes, causing failed migrations. The current release fixes this issue, so that OpenStack Platform volume IDs are detected.
Clone Of:
: 1668777 1668791 (view as bug list)
Environment:
Last Closed: 2019-02-13 15:33:20 UTC
oVirt Team: Virt
Target Upstream Version:


Attachments
Screenshot (79.51 KB, image/png)
2019-01-21 19:20 UTC, Kedar Kulkarni
virt-v2v logs (1.90 MB, text/plain)
2019-01-21 19:21 UTC, Kedar Kulkarni
virt-v2v-wrapper-log (11.77 KB, text/plain)
2019-01-21 19:22 UTC, Kedar Kulkarni


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0338 None None None 2019-02-13 15:33:26 UTC

Description Kedar Kulkarni 2019-01-21 19:20:30 UTC
Created attachment 1522236 [details]
Screenshot

Description of problem:
VM migration with the latest CloudForms build fails after the disk conversion is finished.

My observation is that the disk conversion reaches 100% and virt-v2v finishes with return code 0, but the migration then fails to create the instance and the migration plan fails. The virt-v2v and virt-v2v-wrapper logs are attached.

Version-Release number of selected component (if applicable):
5.10.0.32
virt-v2v-1.38.2-12.28.lp.el7ev.x86_64
# Wrapper version
VERSION = "11"

How reproducible:
100% (tried 4 different VM migrations; all failed the same way)

Steps to Reproduce:
1. Configure the CFME appliance and the conversion instance for migration
2. Create an infra mapping and migration plans
3. Execute a migration plan and wait for completion

Actual results:
See the logs and the attached screenshot: the migration fails after the disk conversion reaches 100%.

Expected results:
The migration should finish successfully and launch the instance.

Additional info:
Logs are attached for only 1 migration, but all 4 migrations that were run failed similarly.

Comment 3 Kedar Kulkarni 2019-01-21 19:21:44 UTC
Created attachment 1522237 [details]
virt-v2v logs

Comment 4 Kedar Kulkarni 2019-01-21 19:22:03 UTC
Created attachment 1522238 [details]
virt-v2v-wrapper-log

Comment 5 Fabien Dupont 2019-01-22 09:30:24 UTC
I've run a migration with the latest appliance and it failed as described in this BZ.
Environment is composed of:
  - OpenStack 13
  - CloudForms 5.10.0.31.20190108221820_a0968c8
  - rhos-v2v-appliance 14.0-20190116.1, with:
    - ovirt-ansible-v2v-conversion-host 1.9.0-3
    - virt-v2v-1.38.2-12.22.lp.el7ev.x86_64
    - nbdkit-1.2.6-1.el7_6.2.x86_64
    - nbdkit-plugin-vddk-1.2.6-1.el7_6.2.x86_64
    - nbdkit-plugin-python2-1.2.6-1.el7_6.2.x86_64
    - nbdkit-plugin-python-common-1.2.6-1.el7_6.2.x86_64

The migration of this specific VM worked earlier with a conversion host created manually from RHEL 7.6.
The difference between the migrations is that virt-v2v-wrapper.py is unable to identify the volume ID in the virt-v2v log.
A successful identification will produce the following kind of line in the wrapper log:

2019-01-15 02:17:40,913:DEBUG: Volume at index 1 has id='310718f5-048e-49cb-a65c-513904e068cc' (virt-v2v-wrapper:912)

With the latest appliance, that line doesn't appear in the wrapper log, because the regex fails to match the corresponding line in the virt-v2v log. For the failed migration, the line is:

openstack [...] volume set --description hridbxprd001 disk 1/1 converted by virt-v2v --property virt_v2v_version=1.38.2rhel=7,release=12.28.lp.el7ev,libvirt --property virt_v2v_conversion_date=2019/01/22 08:55:51 --property virt_v2v_guest_name=hridbxprd001 --property virt_v2v_disk_index=1/1 --property virt_v2v_guest_id=d5d9f7bf-ffc5-4f82-ac0e-e9d961fc569d ab42e303-4d31-4932-9589-9f3726ebdae7

In virt-v2v-wrapper.py, the regex is the following:

    OSP_VOLUME_PROPS = re.compile(
        br'openstack .*\'volume\' \'set\'.*'
        br'\'--property\''
        br' \'virt_v2v_disk_index=(?P<volume>[0-9]+)/[0-9]+\'.*'
        br'\'(?P<uuid>[a-fA-F0-9-]*)\'')

The other difference I noticed between a conversion host where it was successful and the one based on rhos-v2v-appliance is the version of nbdkit, which is 1.2.6-1.1 on the successful conversion host and 1.2.6 on the failing conversion host.

Comment 6 Fabien Dupont 2019-01-22 09:38:41 UTC
I checked the virt-v2v log line on the machine that succeeded and the format is different:

openstack '--os-username=admin' '--os-identity-api-version=3' '--os-user-domain-name=default' '--os-auth-url=http://controller.v2v.bos.redhat.com:5000/v3' '--os-project-name=admin' '--os-password=100Root-' 'volume' 'set' '--description' 'hriprdweb001 disk 1/1 converted by virt-v2v' '--property' 'virt_v2v_version=1.38.2rhel=7,release=12.22.lp.el7ev,libvirt' '--property' 'virt_v2v_conversion_date=2019/01/15 07:13:30' '--property' 'virt_v2v_guest_name=hriprdweb001' '--property' 'virt_v2v_disk_index=1/1' '--property' 'virt_v2v_guest_id=ff908ae6-385f-4edd-ba87-b0cd82189313' '310718f5-048e-49cb-a65c-513904e068cc'

You can see that the arguments are wrapped in single quotes here, whereas the quotes are missing in the log of the failing migration.
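
The mismatch between the two log formats can be reproduced in isolation. The following is a minimal sketch, not the fix that was eventually shipped: it runs the OSP_VOLUME_PROPS pattern quoted in comment 5 against shortened versions of the two log lines from this report, then tries a hypothetical pattern (TOLERANT_VOLUME_PROPS, a name invented here) that treats the single quotes as optional. The helper find_volume is also an assumption made for illustration.

```python
import re

# Pattern as quoted from virt-v2v-wrapper.py in comment 5.
OSP_VOLUME_PROPS = re.compile(
    br'openstack .*\'volume\' \'set\'.*'
    br'\'--property\''
    br' \'virt_v2v_disk_index=(?P<volume>[0-9]+)/[0-9]+\'.*'
    br'\'(?P<uuid>[a-fA-F0-9-]*)\'')

# Hypothetical variant (assumption, not the shipped fix): every single
# quote is optional and the volume ID must be the last word on the line.
TOLERANT_VOLUME_PROPS = re.compile(
    br"openstack .*'?volume'? '?set'?.*"
    br"'?--property'?"
    br" '?virt_v2v_disk_index=(?P<volume>[0-9]+)/[0-9]+'?.*"
    br" '?(?P<uuid>[a-fA-F0-9-]+)'?$")

# Abbreviated sample lines: quoted (older build) and unquoted (newer build).
# The "[...]" in the unquoted line is literal text, as in the report.
QUOTED = (
    b"openstack '--os-project-name=admin' 'volume' 'set' "
    b"'--property' 'virt_v2v_disk_index=1/1' "
    b"'--property' 'virt_v2v_guest_id=ff908ae6-385f-4edd-ba87-b0cd82189313' "
    b"'310718f5-048e-49cb-a65c-513904e068cc'")
UNQUOTED = (
    b"openstack [...] volume set "
    b"--property virt_v2v_disk_index=1/1 "
    b"--property virt_v2v_guest_id=d5d9f7bf-ffc5-4f82-ac0e-e9d961fc569d "
    b"ab42e303-4d31-4932-9589-9f3726ebdae7")


def find_volume(pattern, line):
    """Return (disk_index, volume_id), or None if the line does not match."""
    m = pattern.search(line)
    if m is None:
        return None
    return int(m.group('volume')), m.group('uuid').decode()


for name, line in (("quoted", QUOTED), ("unquoted", UNQUOTED)):
    print(name,
          find_volume(OSP_VOLUME_PROPS, line),
          find_volume(TOLERANT_VOLUME_PROPS, line))
```

With these samples the original pattern matches only the quoted line, while the tolerant pattern matches both. A pattern like this still depends on the layout of the debug output, so it would break again if that layout changed.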

Comment 7 Yadnyawalk Tale 2019-01-22 09:42:11 UTC
FYI: I found that volumes are not cleaned up in this case even though the migration failed. In the cancel-migration issue (BZ1666799), the assumption was that a failed migration cleans up its volumes (which it also did before this case).

Comment 8 Richard W.M. Jones 2019-01-22 09:55:45 UTC
(In reply to Fabien Dupont from comment #6)
> I checked the virt-v2v log line on the machine that succeeded and the format
> is different:
> 
> openstack '--os-username=admin' '--os-identity-api-version=3'
> '--os-user-domain-name=default'
> '--os-auth-url=http://controller.v2v.bos.redhat.com:5000/v3'
> '--os-project-name=admin' '--os-password=100Root-' 'volume' 'set'
> '--description' 'hriprdweb001 disk 1/1 converted by virt-v2v' '--property'
> 'virt_v2v_version=1.38.2rhel=7,release=12.22.lp.el7ev,libvirt' '--property'
> 'virt_v2v_conversion_date=2019/01/15 07:13:30' '--property'
> 'virt_v2v_guest_name=hriprdweb001' '--property' 'virt_v2v_disk_index=1/1'
> '--property' 'virt_v2v_guest_id=ff908ae6-385f-4edd-ba87-b0cd82189313'
> '310718f5-048e-49cb-a65c-513904e068cc'
> 
> You can see that there are single quotes when they are not there on failing
> VM.

I see - this is indeed a side effect of fixing bug 1664310:

https://github.com/libguestfs/libguestfs/commit/fc028bf57a3ff128d21b904583f9ea02f672ed5b

A different function is used to print the command, which doesn't quote.

I think the real lesson from this is that we should write some kind of metadata
file which the wrapper can pick up, rather than relying on parsing the debug
logs which are not intended for this and are just going to cause ongoing
problems of this sort in future.
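
One way to picture the metadata-file idea is sketched below. Everything in it is an assumption made for illustration: the JSON layout, the file name volumes.json and the helper names write_volume_metadata and read_volume_metadata are invented here and do not describe an existing virt-v2v or virt-v2v-wrapper interface. The point is only that the consumer reads a structured file instead of matching debug output.

```python
import json
import os
import tempfile


def write_volume_metadata(path, volumes):
    """Producer side (hypothetical): store {disk index: volume ID} as JSON."""
    payload = {"volumes": {str(index): vol_id
                           for index, vol_id in volumes.items()}}
    with open(path, "w") as handle:
        json.dump(payload, handle)


def read_volume_metadata(path):
    """Consumer side (hypothetical): load the mapping back with int keys."""
    with open(path) as handle:
        payload = json.load(handle)
    return {int(index): vol_id
            for index, vol_id in payload["volumes"].items()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "volumes.json")  # hypothetical file name
        write_volume_metadata(
            path, {1: "ab42e303-4d31-4932-9589-9f3726ebdae7"})
        print(read_volume_metadata(path))
```

Because the file has an explicit structure, a change in how commands are printed to the debug log would not affect the consumer.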

(In reply to Yadnyawalk Tale from comment #7)
> FYI. Found volumes are not getting cleaned up in this case despite it is
> Failed. While cancel migration issue (BZ1666799) assumption was failed
> migration should cleanups failed volumes (which it was also doing earlier
> before this case).

This is another side effect of the current bug.

Comment 9 Ido Ovadia 2019-01-22 14:44:44 UTC
I edited virt-v2v-wrapper.py with the following fix and the migration passed:

https://github.com/oVirt/ovirt-ansible-v2v-conversion-host/commit/29f5317f0b668bed413c360903940ef5a10cb297

Comment 10 Brett Thurber 2019-01-23 04:50:20 UTC
Moving over to RHV as the fix is in virt-v2v-wrapper. Cleared blocker flag. Will ship the fix async for RHV 4.2.8 and the OSP 14 conversion host image.

Comment 18 errata-xmlrpc 2019-02-13 15:33:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0338

