Description of problem:

A v2v VM migration fails. The virt-v2v-wrapper log shows the error:

2019-09-05 07:00:46,246:INFO: systemd-run returned: Failed to start transient service unit: Unit run-6566.service already exists. (virt-v2v-wrapper:1447)
2019-09-05 07:00:46,247:INFO: Waiting for PID... (virt-v2v-wrapper:1454)
2019-09-05 07:00:46,298:INFO: Running with PID: 6567 (virt-v2v-wrapper:1462)
2019-09-05 07:00:56,310:INFO: have exception: ('Error while monitoring virt-v2v',) {} (virt-v2v-wrapper:1078)
2019-09-05 07:00:56,310:ERROR: Error while monitoring virt-v2v (virt-v2v-wrapper:1079)
Traceback (most recent call last):
  File "/usr/bin/virt-v2v-wrapper.py", line 1892, in wrapper
    with log_parser(not data['daemonize']) as parser:
  File "/usr/lib64/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/bin/virt-v2v-wrapper.py", line 1726, in log_parser
    parser = OutputParser(duplicate)
  File "/usr/bin/virt-v2v-wrapper.py", line 1161, in __init__
    self._log = open(state.v2v_log, 'rbU')
IOError: [Errno 2] No such file or directory: '/var/log/vdsm/import/v2v-import-20190905T070045-6473.log'

Richard W.M. Jones:
==================
"Actually it looks as if we need to run ‘systemctl reset-failed [service]’ in the wrapper for units which fail (which is a bit rubbish to say the least). See also: https://bugzilla.redhat.com/show_bug.cgi?id=1255432"

Version-Release number of selected component (if applicable):
v2v-conversion-host-wrapper-1.14.2-1.el7ev.noarch

How reproducible:
So far it happened once, to one VM out of 20 migrated, using the ssh transport method.

Additional info:
The corresponding virt-v2v log was not created (the failure is in fact that this file is missing).

Both of the following:

ls -ltr /run/systemd/system/*.service
[root@f02-h21-000-r620 import]# systemctl list-units | grep run-

show the same output of 28 run-XXXX.service units.
Created attachment 1611942 [details] v2v-import-wrapper.log
What seems more worrying to me is the fact that auto-generated names can lead to duplicates. Still, it looks like we can fix the behavior by using the --collect argument.
But it requires EL8.
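For reference, systemd-run's --collect option makes systemd garbage-collect the transient unit even when it fails, so a failed conversion would not leave a stale run-NNNN.service behind. A minimal sketch of how an invocation could pass it, assuming systemd >= 236 (EL8); the function names and command line here are illustrative, not the wrapper's actual code:

```python
import subprocess

def build_runner_cmd(v2v_cmd, collect=True):
    """Build a systemd-run command line for launching virt-v2v.

    With collect=True (systemd >= 236, i.e. EL8 and newer), systemd
    unloads the transient unit even when it fails, so no failed
    run-NNNN.service lingers in the system.
    """
    cmd = ['systemd-run']
    if collect:
        cmd.append('--collect')
    return cmd + list(v2v_cmd)

def start_virt_v2v(v2v_cmd, collect=True):
    """Launch virt-v2v in a transient systemd unit (requires systemd)."""
    return subprocess.run(build_runner_cmd(v2v_cmd, collect),
                          capture_output=True, text=True)
```

On EL7 (systemd 219) the --collect option is not available, which is the limitation noted above.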
(In reply to Tomáš Golembiovský from comment #3)
> But it requires EL8.

That's annoying. Would another possibility (for RHEL 7 only) be to generate our own names, i.e. adding the --unit=... option? We can ensure they don't conflict.
(In reply to Richard W.M. Jones from comment #4)
> (In reply to Tomáš Golembiovský from comment #3)
> > But it requires EL8.
>
> That's annoying. Would another possibility (for RHEL 7 only) be to generate
> our own names, ie adding the --unit=... option? We can ensure they don't
> conflict.

It may be a good idea to add --unit with a unique name on both RHEL 7 and RHEL 8. But that solves only a part of the problem. I think we still need to always call `reset-failed` on the failed units to avoid leaving garbage around.
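A rough sketch combining both ideas (function names are hypothetical, not the wrapper's actual code): generate a unique --unit name so auto-generated names can never collide, and call `systemctl reset-failed` on the unit when the conversion fails so no garbage is left behind. The use of --wait assumes systemd >= 232 and is a simplification; the wrapper itself monitors the virt-v2v PID instead:

```python
import subprocess
import uuid

def unique_unit_name(prefix='v2v-import'):
    """Generate a collision-free transient unit name, e.g.
    v2v-import-3f2a9....service, instead of relying on systemd's
    auto-generated run-<PID>.service (which this bug shows can clash)."""
    return '%s-%s.service' % (prefix, uuid.uuid4().hex)

def run_in_transient_unit(v2v_cmd):
    """Run a command via systemd-run under a unique unit name and make
    sure a failed unit does not stay loaded in 'failed' state."""
    unit = unique_unit_name()
    cmd = ['systemd-run', '--unit=%s' % unit, '--wait'] + list(v2v_cmd)
    proc = subprocess.run(cmd)
    if proc.returncode != 0:
        # Clear the 'failed' state so systemd unloads the unit;
        # errors are ignored in case the unit is already gone.
        subprocess.run(['systemctl', 'reset-failed', unit])
    return proc.returncode
```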
Fixed upstream in: https://github.com/oVirt/v2v-conversion-host/commit/d8c9b24dedefe36a43f7aaf30e9c5545f125ee02
Tomas,

As this bug was not reproduced, can you provide specific validation instructions, or we will close it as not reproduced?
What do you mean by "not reproduced"? Which version did you use to reproduce the problem? It has been fixed in 1.16.0-3.
So, as we discussed by mail: this issue happened once and we don't actually know how to reproduce or validate it. Waiting for your input, Tomas, as per your mail.
I just realized that we don't have to reproduce the collision from the original report. We just have to check that there are no stale services after a failed conversion. This turns the verification/reproduction from super difficult to super easy.

So to verify:

1) start a conversion
2) wait for virt-v2v to start
3) kill the virt-v2v process to fail the conversion
4) from the wrapper log find out the name of the transient service; the line looks like this:

   2020-06-24 20:02:40,508:INFO: systemd-run returned: b'Running as unit: run-9574.service\n' (runners:135)

5) verify with systemctl status that no such service exists in the system:

   $ systemctl status run-9574.service
   Unit run-9574.service could not be found.

With the fix, systemctl in step 5) should report there is no such unit. Without the fix, systemctl will report information about the service and that the service failed.
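Steps 4) and 5) above can also be scripted. A small sketch (assuming the wrapper log format shown in the example line, and the documented systemctl behavior of exiting with code 4 when a unit cannot be found) that extracts the unit name from a log line and checks whether the unit still exists:

```python
import re
import subprocess

# Matches the unit name in the wrapper's "systemd-run returned" line.
UNIT_RE = re.compile(r"Running as unit: (\S+\.service)")

def unit_from_log_line(line):
    """Extract the transient unit name from a wrapper log line,
    or return None if the line carries no unit information."""
    m = UNIT_RE.search(line)
    return m.group(1) if m else None

def unit_exists(unit):
    """True if systemd still knows about the unit (i.e. the failed
    unit was left behind). systemctl status exits with code 4 when
    the unit cannot be found."""
    rc = subprocess.run(['systemctl', 'status', unit],
                        stdout=subprocess.DEVNULL,
                        stderr=subprocess.DEVNULL).returncode
    return rc != 4
```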
Verified as fixed.

I tested it on a RHV 4.3 host, once with v2v-conversion-host-1.16.0.3 (includes the fix) and once with v2v-conversion-host-1.15.0-1 (should not include the fix). Both runs passed, i.e. I got the expected behavior indicating the bug is fixed, with both versions.

Verification steps were:

1. start a migration plan (via the CFME UI)
2. wait for virt-v2v to start
3. kill the virt-v2v process to fail the conversion (either via the 'Cancel Migration' button in CFME or via the 'kill' command - both ways were tried)
4. from the wrapper log find out the name of the transient service; the line looks like this:

   2020-06-24 20:02:40,508:INFO: systemd-run returned: b'Running as unit: run-9574.service\n' (runners:135)

5. verify with systemctl status that no such service exists on the conversion host:

   $ systemctl status run-9574.service
   Unit run-9574.service could not be found.

Note: the bug was mentioned as fixed in RHV 4.4, so it was also verified on a RHV 4.4 host with v2v-conversion-host-1.16.2-3, with the same result. We cannot downgrade the v2v-conversion-host version in this case due to other bugs related to RHV 4.4.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHV RHEL Host (ovirt-host) 4.4), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:3246