Bug 1532464 - Teardown is called on snapshot volume if vm is paused
Summary: Teardown is called on snapshot volume if vm is paused
Keywords:
Status: CLOSED DUPLICATE of bug 1514901
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Ala Hino
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-01-09 02:28 UTC by Germano Veit Michel
Modified: 2021-03-11 19:37 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-09 15:55:48 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3314421 0 None None None 2018-01-09 03:19:51 UTC

Description Germano Veit Michel 2018-01-09 02:28:49 UTC
Description of problem:

1) User creates a live snapshot of a VM (with memory volume)

2) The VM is briefly paused (while memory is being saved, I assume)

2017-12-21 15:50:47,101-03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (DefaultQuartzScheduler8) [4c58e0f9] VM 'b51cc80c-1e07-43e0-865a-9fef89ea790c' moved from 'Up' --> 'Paused'                                                                                                           

3) GetQemuImageInfoVDSCommand is called

2017-12-21 15:51:00,196-03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.GetQemuImageInfoVDSCommand] (DefaultQuartzScheduler2) [2c60ab76-3474-4238-a89b-d7e1fb35ba04] START, GetQemuImageInfoVDSCommand(HostName = xxxx, GetVolumeInfoVDSCommandParameters:{runAsync='true', hostId='2cd98f1f-68e4-49e6-b54b-4bf2b67a0d47', storagePoolId='59bb079a-002c-01fc-02bf-000000000303', storageDomainId='4b1fde81-2fa8-4e1a-bc60-79fd33486d1d', imageGroupId='bbdfbf7a-a7f0-4eb3-a3db-546f4e18b89c', imageId='e221ddb1-239f-48dc-a823-e7ba05c03d1b'}), log id: 76d16784

4) Since the VM is paused, hostIdToExecuteQemuImageInfo is null:

    private void setQcowCompatByQemuImageInfo(Guid storagePoolId,
            Guid newImageGroupId,
            Guid newImageId,
            Guid newStorageDomainID) {

        // If the VM is running then the volume is already prepared in the guest's host so there
        // is no need for prepare and teardown.
        Guid hostIdToExecuteQemuImageInfo = null;
        List<Pair<VM, VmDevice>> attachedVmsInfo =
                vmDao.getVmsWithPlugInfo(getDestinationDiskImage().getId());
        for (Pair<VM, VmDevice> pair : attachedVmsInfo) {
            VM vm = pair.getFirst();
            if (Boolean.TRUE.equals(pair.getSecond().getIsPlugged())) {
                if (vm.isStartingOrUp()) {      
                    hostIdToExecuteQemuImageInfo = vm.getRunOnVds();
                    break;
                }
            }
        }

isStartingOrUp() is not true for Paused VMs, so hostIdToExecuteQemuImageInfo stays null (as if the VM were not running).

5) This results in setQcowCompat calling getQemuImageInfoFromVdsm with the last argument (shouldPrepareAndTeardown) set to true.
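To illustrate steps 4 and 5, here is a minimal, self-contained sketch of the decision. It is not the ovirt-engine code and not necessarily what the commits referenced further down do. The class HostSelectionSketch, the VmStatus enum, AttachedVm, holdsVolumeOpen and the treatPausedAsRunning flag are all hypothetical names made up for this example, and the status check is simplified compared to the real isStartingOrUp().

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins for the engine's VM types; NOT the real ovirt-engine classes.
final class HostSelectionSketch {

    // Simplified status set (assumption); the real engine has more states.
    enum VmStatus { DOWN, POWERING_UP, UP, PAUSED }

    static final class AttachedVm {
        final VmStatus status;
        final boolean diskPlugged;
        final String runOnHost; // null when the VM is not running anywhere

        AttachedVm(VmStatus status, boolean diskPlugged, String runOnHost) {
            this.status = status;
            this.diskPlugged = diskPlugged;
            this.runOnHost = runOnHost;
        }
    }

    // Simplified version of the check quoted above: Paused does not count.
    static boolean isStartingOrUp(VmStatus s) {
        return s == VmStatus.POWERING_UP || s == VmStatus.UP;
    }

    // Assumed alternative: a paused VM still has its volumes open on the host.
    static boolean holdsVolumeOpen(VmStatus s) {
        return isStartingOrUp(s) || s == VmStatus.PAUSED;
    }

    // Returns the host that already has the volume prepared, if any.
    static Optional<String> hostForImageInfo(List<AttachedVm> vms, boolean treatPausedAsRunning) {
        for (AttachedVm vm : vms) {
            if (!vm.diskPlugged) {
                continue;
            }
            boolean running = treatPausedAsRunning
                    ? holdsVolumeOpen(vm.status)
                    : isStartingOrUp(vm.status);
            if (running) {
                return Optional.ofNullable(vm.runOnHost);
            }
        }
        return Optional.empty();
    }

    // No host found means the caller prepares and tears down the volume itself.
    static boolean shouldPrepareAndTeardown(Optional<String> host) {
        return !host.isPresent();
    }

    public static void main(String[] args) {
        List<AttachedVm> vms = Arrays.asList(new AttachedVm(VmStatus.PAUSED, true, "host-1"));
        System.out.println(shouldPrepareAndTeardown(hostForImageInfo(vms, false)));
        System.out.println(shouldPrepareAndTeardown(hostForImageInfo(vms, true)));
    }
}
```

With a single plugged, Paused VM, the first call models the behavior described in this report (no host is selected, so prepare and teardown are requested), while the second call selects the VM's host and skips teardown.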

6) The host tries to deactivate an in-use LV (this is a live snapshot)

2017-12-21 15:50:18,914-0300 WARN  (tasks/3) [storage.ResourcesFactories] Failure deactivate LV 4b1fde81-2fa8-4e1a-bc60-79fd33486d1d/d136e0d7-aa17-4a40-b173-d7dc86817d24 (Cannot deactivate Logical Volume: ('General Storage Exception: ("5 [] [\'  Logical volume 4b1fde81-2fa8-4e1a-bc60-79fd33486d1d/d136e0d7-aa17-4a40-b173-d7dc86817d24 in use.\']\\n4b1fde81-2fa8-4e1a-bc60-79fd33486d1d/[u\'d136e0d7-aa17-4a40-b173-d7dc86817d24\']",)',)) (resourceFactories:58)

7) The General Storage Exception ERROR is propagated to the engine, and the user is worried.

Version-Release number of selected component (if applicable):
ovirt-engine-4.1.6.2-0.1.el7.noarch

Actual results:
Teardown

Expected results:
No teardown as volume is in use

Additional info:

It looks like this is already fixed in RHV 4.2 by the following two commits from Ala. I'm not sure whether covering the Paused case was intentional, but they seem to do so.

core: Cleanup BaseImagesCommand code
      commit da4c9bb6d9ff6c437f627492c1a626dcb55efd4d

core: Enhance BaseImagesCommand.setQcowCompatByQemuImageInfo code
      commit cf502720b4cf7bc09694ee0821f3e7f73d6e6e0f

Please confirm and evaluate this for 4.1.z, as there is a customer case involved.

Other than the error message being propagated to the user, I don't see any major harm.

Comment 3 Tal Nisan 2018-01-09 15:03:12 UTC
Ala, what's your take on it? Is this bug harmful in any way? How risky is it to backport the fix to 4.1.9?

Comment 4 Ala Hino 2018-01-09 15:23:29 UTC
Already backported to 4.1.8: https://gerrit.ovirt.org/#/c/84350/

Comment 5 Tal Nisan 2018-01-09 15:55:48 UTC

*** This bug has been marked as a duplicate of bug 1514901 ***

Comment 6 Elad 2018-08-02 08:19:52 UTC
Duplicate of bug 1514901, which is covered by QE testing.

Comment 7 Franta Kust 2019-05-16 12:54:56 UTC
BZ<2>Jira re-sync

