Bug 1779664 - MERGE_STATUS fails with 'Invalid UUID string: mapper' when Direct LUN that already exists is hot-plugged [RHV clone - 4.3.8]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.3.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.3.8
Target Release: 4.3.8
Assignee: shani
QA Contact: Shir Fishbain
URL:
Whiteboard:
Depends On: 1750212
Blocks:
 
Reported: 2019-12-04 13:23 UTC by RHV bug bot
Modified: 2022-07-09 16:05 UTC (History)

Fixed In Version: ovirt-engine-4.3.8.2
Doc Type: Bug Fix
Doc Text:
Previously, when you deleted a snapshot of a VM with a LUN disk, the disk's image ID was parsed incorrectly as "mapper", which caused a null pointer exception. The current release fixes this issue by skipping disks whose image ID parses as 'mapper', so deleting the VM snapshot succeeds.
Clone Of: 1750212
Environment:
Last Closed: 2020-02-13 15:24:42 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
lsvaty: testing_plan_complete-




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | RHV-47517 | 0 | None | None | None | 2022-07-09 16:05:13 UTC
Red Hat Knowledge Base (Solution) | 4399711 | 0 | None | None | None | 2019-12-04 13:24:02 UTC
Red Hat Product Errata | RHSA-2020:0498 | 0 | None | None | None | 2020-02-13 15:24:54 UTC
oVirt gerrit | 105321 | 0 | master | MERGED | core: Avoid wrong parsing of LUN ids | 2020-08-11 18:52:49 UTC
oVirt gerrit | 105376 | 0 | ovirt-engine-4.3 | MERGED | core: Avoid wrong parsing of LUN ids | 2020-08-11 18:52:48 UTC

Description RHV bug bot 2019-12-04 13:23:35 UTC
+++ This bug is a downstream clone. The original bug is: +++
+++   bug 1750212 +++
======================================================================

Description of problem:

MERGE_STATUS fails with "Invalid UUID string: mapper" if the VM has a Direct LUN that has been hot-plugged in a certain way.

1. Start with a VM with 1 disk from a Storage Domain.
2. Run the VM.
3. Go to Storage->Disks and create a Direct LUN.
4. Go to Compute->Virtual Machines->VM->Disks and hotplug the Direct LUN without changing any options (see [4] below).
5. Hotplug another disk from a Storage Domain.
6. Create a snapshot.
7. Delete the snapshot; this fails (see [7] below).

Some steps in more detail:

[4] Hotplug the LUN by first creating the Direct LUN in the Disks tab and then going to VM->Disks and attaching it. In this case the Direct LUN is added as "device=disk" instead of "device=lun":
2019-09-09 10:46:19,693+10 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugDiskVDSCommand] (EE-ManagedThreadFactory-engine-Thread-459996) [5ecae6ed-44f7-4ed0-aefd-71fc46352a52] Disk hot-plug: <?xml version="1.0" encoding="UTF-8"?><hotplug>
  <devices>
    <disk snapshot="no" type="block" device="disk">   <----- disk, not lun
      <target dev="sda" bus="scsi"/>
      <source dev="/dev/mapper/36001405156d88f1cc594c4a94ffe1418">
        <seclabel model="dac" type="none" relabel="no"/>
      </source>
      <driver name="qemu" io="native" type="raw" error_policy="stop" cache="none"/>
      <alias name="ua-0ba9f0fd-b3d3-4723-8959-1068b46550f8"/>
      <address bus="0" controller="0" unit="3" type="drive" target="0"/>
    </disk>
  </devices>
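
As a rough illustration of what the hot-plug XML above carries, the following sketch reads the `device` attribute and the `source dev` path from a trimmed copy of that XML using the JDK's DOM parser. The class name `HotplugXmlDemo` and the method `deviceAndSource` are hypothetical and are not part of ovirt-engine; the XML string is a shortened, well-formed version of the logged fragment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class HotplugXmlDemo {
    // Trimmed, well-formed copy of the hot-plug XML from the engine log.
    static final String XML =
        "<hotplug><devices>"
        + "<disk snapshot=\"no\" type=\"block\" device=\"disk\">"
        + "<target dev=\"sda\" bus=\"scsi\"/>"
        + "<source dev=\"/dev/mapper/36001405156d88f1cc594c4a94ffe1418\"/>"
        + "</disk></devices></hotplug>";

    // Hypothetical helper: returns {device attribute, source dev path}
    // of the first <disk> element in the given XML.
    static String[] deviceAndSource(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element disk = (Element) doc.getElementsByTagName("disk").item(0);
            Element source = (Element) disk.getElementsByTagName("source").item(0);
            return new String[] { disk.getAttribute("device"), source.getAttribute("dev") };
        } catch (Exception e) {
            throw new IllegalStateException("could not parse hot-plug XML", e);
        }
    }

    public static void main(String[] args) {
        String[] result = deviceAndSource(XML);
        System.out.println("device=" + result[0] + " source=" + result[1]);
    }
}
```

The point of the sketch is only that the `device` attribute alone does not distinguish this Direct LUN from an image disk; the `/dev/mapper/` source path is what gives it away.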

[7] Deleting the snapshot fails on MERGE_STATUS. The engine seems to try to get the volume chain of the Direct LUN, because its type is 'Disk' and not 'LUN' [A]:
2019-09-09 10:23:56,219+10 ERROR [org.ovirt.engine.core.bll.MergeStatusCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-3) [4a758799-48e7-4389-9eba-54fe8c51034e] Exception: java.lang.IllegalArgumentException: Invalid UUID string: mapper
        at java.util.UUID.fromString(UUID.java:194) [rt.jar:1.8.0_222]
        at org.ovirt.engine.core.compat.Guid.<init>(Guid.java:67) [compat.jar:]
        at org.ovirt.engine.core.compat.Guid.createGuidFromStringWithDefault(Guid.java:87) [compat.jar:]
        at org.ovirt.engine.core.compat.Guid.createGuidFromString(Guid.java:76) [compat.jar:]
        at org.ovirt.engine.core.bll.MergeStatusCommand.getVolumeChain(MergeStatusCommand.java:152) [bll.jar:]
        at org.ovirt.engine.core.bll.MergeStatusCommand.attemptResolution(MergeStatusCommand.java:75) [bll.jar:]
        at org.ovirt.engine.core.bll.MergeStatusCommand.executeCommand(MergeStatusCommand.java:59) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1157) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1315) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1964) [bll.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:]
        at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1375) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:419) [bll.jar:]
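
The error message can be reproduced in isolation. The sketch below assumes, without the report confirming it, that the image ID is taken from the second-to-last segment of the disk's source path; for the Direct LUN path in the hot-plug XML that segment is "mapper", which `java.util.UUID.fromString` rejects. The class `MapperUuidDemo` and the helper `parentSegment` are hypothetical names, not engine code.

```java
import java.util.UUID;

public class MapperUuidDemo {
    // Hypothetical helper (an assumption about how the ID is derived):
    // returns the second-to-last segment of a slash-separated path.
    static String parentSegment(String path) {
        String[] parts = path.split("/");
        return parts[parts.length - 2];
    }

    public static void main(String[] args) {
        // Source path of the Direct LUN, taken from the hot-plug XML in this report.
        String lunPath = "/dev/mapper/36001405156d88f1cc594c4a94ffe1418";
        String segment = parentSegment(lunPath);
        System.out.println("segment = " + segment);
        try {
            UUID.fromString(segment);
        } catch (IllegalArgumentException e) {
            // "mapper" is not a UUID, so parsing fails here.
            System.out.println(e.getMessage());
        }
    }
}
```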


Version-Release number of selected component (if applicable):
rhvm-4.3.5.4-0.1.el7.noarch

How reproducible:
Always

Steps to Reproduce:
As above.

Actual results:
- Snapshot delete fails

Expected results:
- Snapshot delete succeeds

Additional info:
[A] https://github.com/oVirt/ovirt-engine/blob/ovirt-engine-4.3.5.z/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/MergeStatusCommand.java#L151

(Originally by Diego Huertas Alvarez)

Comment 1 RHV bug bot 2019-12-04 13:23:37 UTC
Diego, could you please attach the engine logs from our labs?

(Originally by Germano Veit Michel)

Comment 3 RHV bug bot 2019-12-04 13:23:41 UTC
Benny, looks related to the change you did in live merge lately, we probably don't filter by disk type when we get the volume chain info, please have a look

(Originally by Tal Nisan)

Comment 4 RHV bug bot 2019-12-04 13:23:43 UTC
(In reply to Tal Nisan from comment #3)
> Benny, looks related to the change you did in live merge lately, we probably
> don't filter by disk type when we get the volume chain info, please have a
> look

My latest change was introduced only in 4.3.6

this looks like https://bugzilla.redhat.com/show_bug.cgi?id=1598594

(Originally by Benny Zlotnik)

Comment 5 RHV bug bot 2019-12-04 13:23:45 UTC
(In reply to Benny Zlotnik from comment #4)
> (In reply to Tal Nisan from comment #3)
> > Benny, looks related to the change you did in live merge lately, we probably
> > don't filter by disk type when we get the volume chain info, please have a
> > look
> 
> My latest change was introduced only in 4.3.6
> 
> this looks like https://bugzilla.redhat.com/show_bug.cgi?id=1598594

I agree, this looks like a virt issue (same as bug 1598594)

(Originally by Eyal Shenitzky)

Comment 6 RHV bug bot 2019-12-04 13:23:47 UTC
(In reply to Eyal Shenitzky from comment #5)
> (In reply to Benny Zlotnik from comment #4)
> > (In reply to Tal Nisan from comment #3)
> > > Benny, looks related to the change you did in live merge lately, we probably
> > > don't filter by disk type when we get the volume chain info, please have a
> > > look
> > 
> > My latest change was introduced only in 4.3.6
> > 
> > this looks like https://bugzilla.redhat.com/show_bug.cgi?id=1598594
> 
> I agree, this looks like a virt issue (same as bug 1598594)

On this bug the problem seems to be that the Direct LUN has VmDeviceType Disk instead of LUN, and then the engine tries to do volume lookup instead of ignoring it.

(Originally by Germano Veit Michel)

Comment 7 RHV bug bot 2019-12-04 13:23:49 UTC
Indeed seems like a Virt issue, most likely in the Domain XML part of hotplugging the disk which doesn't attach the device properly, Ryan can someone have a look?

(Originally by Tal Nisan)

Comment 8 RHV bug bot 2019-12-04 13:23:51 UTC
LUNs are always hotplugged as type=disk, and they have been for years. There haven't been any changes around that handling since 2017

Is there a reason why snapshot merging is even trying to touch unmanaged storage instead of filtering them out? It seems like a saner solution.

(Originally by Ryan Barry)
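
A minimal sketch of the filtering idea raised in the comment above, in line with the Doc Text ("avoiding disks whose image ID parses as 'mapper'"). This is not the merged patch, whose code is not shown in this report. The class `VolumeChainFilterDemo`, the method `imageIds`, and the example image path are all hypothetical, and taking the ID from the second-to-last path segment is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class VolumeChainFilterDemo {
    // Path segment that shows up in place of an image ID for /dev/mapper/<wwid> sources.
    static final String MAPPER = "mapper";

    // Hypothetical helper: collects image IDs from disk source paths,
    // skipping any path whose would-be image ID segment is "mapper".
    static List<UUID> imageIds(List<String> sourcePaths) {
        List<UUID> ids = new ArrayList<>();
        for (String path : sourcePaths) {
            String[] parts = path.split("/");
            if (parts.length < 2) {
                continue; // not enough segments to contain an image ID
            }
            String candidate = parts[parts.length - 2];
            if (MAPPER.equals(candidate)) {
                continue; // Direct LUN: no volume chain to look up
            }
            ids.add(UUID.fromString(candidate));
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> paths = new ArrayList<>();
        // Direct LUN path from this report.
        paths.add("/dev/mapper/36001405156d88f1cc594c4a94ffe1418");
        // Made-up image path, for illustration only.
        paths.add("/images/11111111-1111-1111-1111-111111111111/22222222-2222-2222-2222-222222222222");
        System.out.println(imageIds(paths));
    }
}
```

Skipping on the parsed segment mirrors the wording of the Doc Text; filtering on the disk's storage type before any path parsing would be another way to reach the same result.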

Comment 10 RHV bug bot 2019-12-05 17:50:09 UTC
WARN: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Found non-acked flags: '{'rhevm-4.3.z': '?'}', ]

For more info please contact: rhv-devops

Comment 13 Shir Fishbain 2019-12-15 09:24:48 UTC
Verified - The delete snapshot succeeds

2019-12-15 11:19:02,212+02 INFO  [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-9) [11345ff1-9bb2-4032-ba85-983cbdc07874] Successfully merged snapshot '5c9b489c-692a-43b1-b94a-8cff957863c1' images 'b9744599-23d6-4a95-837c-b9ea0db28ad0'..'ff90d51f-e098-47db-804a-293f49e5e999'
2019-12-15 11:19:02,232+02 INFO  [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-9) [11345ff1-9bb2-4032-ba85-983cbdc07874] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand' successfully.
2019-12-15 11:19:02,234+02 INFO  [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotSingleDiskLiveCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-9) [11345ff1-9bb2-4032-ba85-983cbdc07874] Lock freed to object 'EngineLock:{exclusiveLocks='', sharedLocks='[90c757ea-a9d7-4599-bc75-06dcc6a4fe60=TEMPLATE]'}'
2019-12-15 11:19:03,278+02 INFO  [org.ovirt.engine.core.bll.ConcurrentChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-1) [11345ff1-9bb2-4032-ba85-983cbdc07874] Command 'RemoveSnapshot' id: '539618a2-0b13-479d-8c6c-90376ec8f808' child commands '[ef244aaa-3ac9-4850-b82c-5e4a98324906, 4e28b7fc-dd60-4e8b-92d9-dba950f6562d]' executions were completed, status 'SUCCEEDED'
2019-12-15 11:19:04,317+02 INFO  [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-13) [11345ff1-9bb2-4032-ba85-983cbdc07874] Ending command 'org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand' successfully.

ovirt-engine-4.3.8.1-0.1.master.el7.noarch
vdsm-4.30.39-1.el7ev.x86_64

Comment 17 errata-xmlrpc 2020-02-13 15:24:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:0498

