Bug 1608348 - [downstream clone - 4.2.5] Live merge fails on the RHV-M Engine with "Invalid UUID string: payload" followed by exception.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.2.4
Hardware: Unspecified Unspecified
Priority: unspecified Severity: urgent
Target Milestone: ovirt-4.2.5
Target Release: ---
Assigned To: Eyal Shenitzky
QA Contact: Elad
Keywords: ZStream
Depends On: 1598594
Blocks:
Reported: 2018-07-25 06:57 EDT by RHV Bugzilla Automation and Verification Bot
Modified: 2018-07-31 13:50 EDT (History)

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1598594
Environment:
Last Closed: 2018-07-31 13:50:11 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3520881 None None None 2018-07-25 06:59 EDT
oVirt gerrit 93036 master MERGED core: fix volume lookup on figuring merge status 2018-07-25 06:59 EDT
oVirt gerrit 93052 ovirt-engine-4.2 MERGED core: fix volume lookup on figuring merge status 2018-07-25 06:59 EDT
Red Hat Product Errata RHBA-2018:2318 None None None 2018-07-31 13:50 EDT

Description RHV Bugzilla Automation and Verification Bot 2018-07-25 06:57:12 EDT
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1598594 +++
======================================================================

Description of problem:

Live merge fails on the RHV-M Engine with "Invalid UUID string: payload" followed by an exception.

~~~
2018-07-04 19:18:58,512+02 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-8) [83f5f303-af1a-4fd9-9a89-664a156559ce] Command 'org.ovirt.engine.core.bll.MergeStatusCommand' failed: Invalid UUID string: payload
2018-07-04 19:18:58,512+02 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-8) [83f5f303-af1a-4fd9-9a89-664a156559ce] Exception: java.lang.IllegalArgumentException: Invalid UUID string: payload
~~~
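
The message in the log has the same form as the one java.util.UUID.fromString produces for a string that is not a UUID, and the stack trace later in this report shows that call underneath Guid.<init>. The following is a minimal sketch of that behaviour only; the class InvalidUuidDemo and the method parseError are hypothetical names used for illustration and are not part of ovirt-engine.

```java
import java.util.UUID;

// Hypothetical demo class, not engine code.
final class InvalidUuidDemo {

    /** Returns the parser's error message, or null when the input is a valid UUID. */
    static String parseError(String candidate) {
        try {
            UUID.fromString(candidate);
            return null;
        } catch (IllegalArgumentException e) {
            // Same exception type as in the engine log above.
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // "payload" is a device label, not a UUID, so parsing is rejected.
        System.out.println(parseError("payload"));
        // A VM id taken from this report parses without error.
        System.out.println(parseError("4654683b-c315-4ef6-976a-6f504eb06a4e"));
    }
}
```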

The merge completes on the host VDSM side.  

Commvault backup is used to create and delete the snapshots.

Version-Release number of selected component (if applicable):

ovirt-engine-4.2.4.5-0.1.el7_3.noarch

vdsm-4.20.32-1.el7ev.x86_64
glusterfs-3.8.4-54.10.el7rhgs.x86_64

How reproducible:

100% on end user system.

Not able to reproduce it locally.

Steps to Reproduce:
1.
2.
3.

Actual results:

Snapshot deletion doesn't complete on the RHV-M Engine side but completes on the host VDSM side.  As a result, the images are not removed from the RHV-M DB or from the storage domain.

Expected results:

Snapshot deletion should complete without any failures.

Additional info:

(Originally by Bimal Chollera)
Comment 7 RHV Bugzilla Automation and Verification Bot 2018-07-25 06:57:50 EDT
Bimal, can you please add steps to reproduce this bug?

(Originally by Eyal Shenitzky)
Comment 10 RHV Bugzilla Automation and Verification Bot 2018-07-25 06:58:09 EDT
I was able to reproduce this issue. It happens when the cluster is upgraded from 4.1 to 4.2. Any VM that has a payload attached (started as run once with cloud-init) will hit this issue if it was UP during the cluster upgrade.

Steps to Reproduce:

1. In a 4.1 cluster, start a VM in RunOnce mode with cloud-init initialization. 

2. Confirm that the VM has "payload" attached using the command virsh -r domblklist vmname.

3. Upgrade the cluster compatibility to 4.2.

4. The VM will have a "Next run" configuration after the cluster is upgraded to 4.2.

5. Create a snapshot on the VM. This will work fine.

6. Delete the created snapshot online. It will fail with the error below.

2018-07-10 07:40:11,046-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [bc03c305-9c1e-4e84-8ad8-28656dadfa88] START, DumpXmlsVDSCommand(HostName = 10.74.130.120, Params:{hostId='9b2d8619-7f8a-45d3-820b-d3136804f2e1', vmIds='[4654683b-c315-4ef6-976a-6f504eb06a4e]'}), log id: fc8a7b
2018-07-10 07:40:11,062-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [bc03c305-9c1e-4e84-8ad8-28656dadfa88] FINISH, DumpXmlsVDSCommand, return: [{vmId=4654683b-c315-4ef6-976a-6f504eb06a4e, devices=[Ljava.util.Map;@2a36f34f, guestDiskMapping={}}], log id: fc8a7b
2018-07-10 07:40:11,062-04 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-2) [bc03c305-9c1e-4e84-8ad8-28656dadfa88] Exception: java.lang.IllegalArgumentException: Invalid UUID string: payload
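
To find affected merges in a larger engine.log, the failing lines can be picked out by logger and message, and the correlation ID in square brackets can then be used to collect the surrounding entries. The sketch below assumes the line layout seen in the excerpts in this report (timestamp, level, logger, thread, correlation ID, message); the class MergeStatusLogScan and its method are hypothetical helpers, not an existing oVirt tool.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning engine.log lines.
final class MergeStatusLogScan {

    // Layout inferred from the quoted log lines:
    // timestamp level [logger] (thread) [correlation id] message
    private static final Pattern LINE = Pattern.compile(
            "^(\\S+ \\S+) (\\w+)\\s+\\[([^\\]]+)\\] \\(([^)]+)\\) \\[([^\\]]+)\\] (.*)$");

    /** Returns the correlation ID when the line is a MergeStatusCommand UUID failure. */
    static Optional<String> failedMergeCorrelationId(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        boolean isError = "ERROR".equals(m.group(2));
        boolean isMergeStatus = m.group(3).endsWith(".MergeStatusCommand");
        boolean isUuidFailure = m.group(6).contains("Invalid UUID string");
        return (isError && isMergeStatus && isUuidFailure)
                ? Optional.of(m.group(5))
                : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "2018-07-10 07:40:11,062-04 ERROR "
                + "[org.ovirt.engine.core.bll.MergeStatusCommand] "
                + "(EE-ManagedThreadFactory-commandCoordinator-Thread-2) "
                + "[bc03c305-9c1e-4e84-8ad8-28656dadfa88] "
                + "Exception: java.lang.IllegalArgumentException: Invalid UUID string: payload";
        System.out.println(failedMergeCorrelationId(line).orElse("no match"));
    }
}
```

Searching the log for the returned correlation ID then shows the DumpXmlsVDSCommand calls that preceded the failure, as in the excerpt above.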

If I start a new VM as run once with cloud-init in 4.2 compatibility, I only see the warning below, and the snapshot delete completes successfully.

2018-07-10 07:52:13,766-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [263f9885-0248-4d28-be8a-2904da5a11dc] START, DumpXmlsVDSCommand(HostName = 10.74.130.120, Params:{hostId='9b2d8619-7f8a-45d3-820b-d3136804f2e1', vmIds='[d44e901a-6280-439a-b983-1212ff6a9b1f]'}), log id: 611e7c67
2018-07-10 07:52:13,785-04 WARN  [org.ovirt.engine.core.vdsbroker.libvirt.VmDevicesConverter] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [263f9885-0248-4d28-be8a-2904da5a11dc] unmanaged disk with path '/var/run/vdsm/payload/d44e901a-6280-439a-b983-1212ff6a9b1f.572cdbd5eb01e684f008fb9e8bdc8867.img' is ignored
2018-07-10 07:52:13,787-04 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [263f9885-0248-4d28-be8a-2904da5a11dc] FINISH, DumpXmlsVDSCommand, return: [{vmId=d44e901a-6280-439a-b983-1212ff6a9b1f, devices=[Ljava.util.Map;@e7c97d, guestDiskMapping={}}], log id: 611e7c67


So the issue is with live merge on VMs that have a RunOnce configuration and were UP during the cluster upgrade.
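
One defensive way to handle such a device list is to skip entries whose identifier is not a UUID instead of letting the parse failure abort the whole command, similar in spirit to the "unmanaged disk ... is ignored" warning seen on the 4.2 VM. The sketch below is only an illustration of that idea and is not the patch that was merged for this bug. The report does not say which field carried the string "payload", so the map key "imageID" and the class ManagedVolumeFilter are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch, not the actual ovirt-engine fix.
final class ManagedVolumeFilter {

    // Assumed key name for the identifier in a reported device map.
    static final String IMAGE_ID_KEY = "imageID";

    /** Collects only identifiers that parse as UUIDs; other devices are skipped. */
    static List<UUID> managedImageIds(List<Map<String, Object>> devices) {
        List<UUID> ids = new ArrayList<>();
        for (Map<String, Object> device : devices) {
            Object raw = device.get(IMAGE_ID_KEY);
            if (raw == null) {
                continue;
            }
            try {
                ids.add(UUID.fromString(raw.toString()));
            } catch (IllegalArgumentException e) {
                // e.g. "payload": not a managed image, so ignore it
                // rather than failing the whole lookup.
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, Object> disk = new HashMap<>();
        disk.put(IMAGE_ID_KEY, "4654683b-c315-4ef6-976a-6f504eb06a4e"); // sample UUID
        Map<String, Object> payload = new HashMap<>();
        payload.put(IMAGE_ID_KEY, "payload");

        List<Map<String, Object>> devices = new ArrayList<>();
        devices.add(disk);
        devices.add(payload);

        // Only the UUID-identified disk is kept.
        System.out.println(managedImageIds(devices));
    }
}
```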

(Originally by Nijin Ashok)
Comment 12 RHV Bugzilla Automation and Verification Bot 2018-07-25 06:58:22 EDT
Eyal,

Nijin++ added a reproducer procedure.

- Bimal.

(Originally by Bimal Chollera)
Comment 13 RHV Bugzilla Automation and Verification Bot 2018-07-25 06:58:27 EDT
(In reply to Bimal Chollera from comment #11)
> Eyal,
> 
> Nijin++ added a reproducer procedure.
> 
> - Bimal.

Thanks for the detailed information, I will investigate it.

(Originally by Eyal Shenitzky)
Comment 14 RHV Bugzilla Automation and Verification Bot 2018-07-25 06:58:33 EDT
Arik, can you please take a look?

I think that it may be related to - https://gerrit.ovirt.org/#/c/92599/

(Originally by Eyal Shenitzky)
Comment 15 RHV Bugzilla Automation and Verification Bot 2018-07-25 06:58:40 EDT
(In reply to Eyal Shenitzky from comment #13)
> Arik, can you please take a look?
> 
> I think that it may be related to - https://gerrit.ovirt.org/#/c/92599/

Nah, this bug occurred on 4.2.4, before that patch got in.
As a matter of fact, this patch might actually prevent this issue.
Anyway, posted a patch that fixes incorrect logic in MergeStatusCommand that would surely solve this.
Eyal, could you please verify that?

(Originally by Arik Hadas)
Comment 16 RHV Bugzilla Automation and Verification Bot 2018-07-25 06:58:45 EDT
Sure, thanks.

(Originally by Eyal Shenitzky)
Comment 17 RHV Bugzilla Automation and Verification Bot 2018-07-25 06:58:51 EDT
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.2.z': '?'}', ]

For more info please contact: rhv-devops@redhat.com

INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.2.z': '?'}', ]

For more info please contact: rhv-devops@redhat.com

(Originally by rhv-bugzilla-bot)
Comment 19 Elad 2018-07-25 10:28:49 EDT
Tested using 4.2.5.2-0.1 against 4.2.5.1-0.1. 

The following steps:
1) On a 4.1 DC and cluster, created a VM with a disk on iSCSI and cloud-init initialization, and started it as run once.
2) While the VM was running, upgraded the cluster and DC to 4.2; a 'next run' snapshot was created for the VM.
3) Created a live snapshot.
4) Live merged the snapshot.


On 4.2.5.1-0.1, live merge failed with:

2018-07-25 17:15:07,886+03 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-5) [41801c7e-1da1-4a55-bc92-369a3a1d12bc] Exception: java.lang.IllegalArgumentException: Invalid UUID string: payload
        at java.util.UUID.fromString(UUID.java:194) [rt.jar:1.8.0_171]
        at org.ovirt.engine.core.compat.Guid.<init>(Guid.java:67) [compat.jar:]

And on 4.2.5.2-0.1, live merge succeeded.


============================
Used:
rhvm-4.2.5.2-0.1.el7ev.noarch
vdsm-4.20.35-1.el7ev.x86_64
Comment 21 errata-xmlrpc 2018-07-31 13:50:11 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2318
