Bug 1826348 - [Incremental backup] Full backup during live disk migration should not be allowed
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.4.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.1
Target Release: 4.4.1
Assignee: Eyal Shenitzky
QA Contact: Ilan Zuckerman
URL:
Whiteboard:
Depends On: 1836627
Blocks:
Reported: 2020-04-21 13:30 UTC by Ilan Zuckerman
Modified: 2020-07-08 08:26 UTC (History)

Fixed In Version: ovirt-engine-4.4.1.5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-08 08:26:06 UTC
oVirt Team: Storage
Embargoed:
pm-rhel: ovirt-4.4+
aefrat: planning_ack?
aefrat: devel_ack?
aefrat: testing_ack+


Attachments
engine log (319.39 KB, text/plain)
2020-04-21 13:30 UTC, Ilan Zuckerman
vdsm log (5.04 MB, text/plain)
2020-04-21 13:31 UTC, Ilan Zuckerman


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 109121 0 master MERGED core: add validation for disks status before starting a VM backup 2020-08-05 08:08:31 UTC

Description Ilan Zuckerman 2020-04-21 13:30:43 UTC
Created attachment 1680557 [details]
engine log

Description of problem:

When a full backup is requested for a disk that is currently being migrated to another storage domain (SD), the backup is started even though the disk is locked.
This causes the engine to throw an exception:

2020-04-21 16:03:05,859+03 ERROR [org.ovirt.engine.core.bll.StartVmBackupCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-4) [e9340d04-5937-4dc0-a6c9-e9ed9ab2d09c] Failed to execute VM backup operation 'StartVmBackup': {}: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to StartVmBackupVDS, error = Backup Error: {'vm_id': 'fefd90c8-94a5-4c68-83ae-9c3462655ca9', 'backup': <vdsm.virt.backup.BackupConfig object at 0x7fb9b420d550>, 'reason': "Failed to find one of the backup disks: No such drive: '{'domainID': 'f3f88292-1287-4635-83ac-f8c2cf482a9e', 'imageID': 'c3d127e4-b6cf-4dda-a5fb-3064953e67a3', 'volumeID': '4cab4cf5-0cfc-4eac-8e15-42bbdb9a4e7c'}'"}, code = 1600 (Failed with error unexpected and code 16)


It also causes VDSM to log an error:

LookupError: No such drive: '{'domainID': 'f3f88292-1287-4635-83ac-f8c2cf482a9e', 'imageID': 'c3d127e4-b6cf-4dda-a5fb-3064953e67a3', 'volumeID': '4cab4cf5-0cfc-4eac-8e15-42bbdb9a4e7c'}'


In addition, the API response returns phase "starting" instead of an error message stating that the disk is locked and cannot be backed up:

POST {{engine}}vms/{{myvm_id}}/backups

Body:

<backup>
    <disks>
        <disk id="{{qcow_disk_id}}" />
    </disks>
</backup>

Response:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<backup href="/ovirt-engine/api/vms/38e0898b-b6a8-4238-b400-e17928f6a926/backups/8d8079fa-2d25-4faf-853e-856ba22f3889" id="8d8079fa-2d25-4faf-853e-856ba22f3889">
    <actions>
        <link href="/ovirt-engine/api/vms/38e0898b-b6a8-4238-b400-e17928f6a926/backups/8d8079fa-2d25-4faf-853e-856ba22f3889/finalize" rel="finalize"/>
    </actions>
    <link href="/ovirt-engine/api/vms/38e0898b-b6a8-4238-b400-e17928f6a926/backups/8d8079fa-2d25-4faf-853e-856ba22f3889/disks" rel="disks"/>
    <creation_date>2020-04-21T11:31:36.755+03:00</creation_date>
    <phase>starting</phase>
    <vm href="/ovirt-engine/api/vms/38e0898b-b6a8-4238-b400-e17928f6a926" id="38e0898b-b6a8-4238-b400-e17928f6a926"/>
</backup>
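
For anyone scripting this check, here is a minimal Python sketch (standard library only) that builds the request body shown above and reads the phase out of a response. The helper names build_backup_body and backup_phase are hypothetical, and the disk id in the usage note is a placeholder. Sending the POST and authenticating are left out.

```python
import xml.etree.ElementTree as ET


def build_backup_body(disk_ids):
    """Build a <backup> request body listing the disks to back up."""
    backup = ET.Element("backup")
    disks = ET.SubElement(backup, "disks")
    for disk_id in disk_ids:
        ET.SubElement(disks, "disk", id=disk_id)
    return ET.tostring(backup, encoding="unicode")


def backup_phase(response_xml):
    """Return the text of the <phase> element, or None if it is absent."""
    root = ET.fromstring(response_xml)
    return root.findtext("phase")
```

With the response from this report, backup_phase returns "starting", which is the behavior being reported: the caller gets no indication that the disk is locked.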


Version-Release number of selected component (if applicable):
vdsm-4.40.13-1.el8ev.x86_64
ovirt-engine-4.4.0-0.33.master.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a blank VM.
2. Create a QCOW disk with incremental backup enabled on an iSCSI domain and attach it to the VM as the OS disk.
3. Start the VM and wait until it is up.
4. Migrate the disk to another iSCSI domain.
5. As soon as the migration starts and the disk becomes locked, request a full backup of that disk through the API.


Actual results:

The full backup request is processed, causing exceptions in both VDSM and the engine.
The API returns the regular response indicating that the full backup has started. It should instead return an error,
for example: "the disk is locked and cannot be backed up currently"

Expected results:
The backup operations should be blocked during disk/VM migration (live or cold) and vice-versa.

Additional info:
Attaching Engine log and relevant vdsm log.

Comment 1 Ilan Zuckerman 2020-04-21 13:31:35 UTC
Created attachment 1680558 [details]
vdsm log

Comment 2 Sandro Bonazzola 2020-06-19 09:45:51 UTC
This bug is in modified state and targeting 4.4.2. Can this be re-targeted to 4.4.1?

Comment 3 Ilan Zuckerman 2020-06-29 12:56:35 UTC
Verified on rhv-release-4.4.1-5-001.noarch

1. Create a blank VM.
2. Create a QCOW disk with incremental backup enabled on an iSCSI domain and attach it to the VM as the OS disk.
3. Start the VM and wait until it is up.
4. Migrate the disk to another iSCSI domain.
5. As soon as the migration starts and the disk becomes locked, request a full backup of that disk through the API.

Expected:
The backup operations should be blocked during disk/VM migration (live or cold) and vice-versa.

Actual:
Backup operation is blocked with the following error message from engine:

"Cannot backup VM: The following disks are locked: 26780_qcow_incr_enabled. Please try again in a few minutes."

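The check behind this message can be illustrated with a short Python sketch. This is not the engine code (the engine is written in Java, and the merged gerrit change is not reproduced here). The Disk class, the validate_disks_not_locked helper and the "ok"/"locked" status strings are assumptions made for illustration. Only the message text comes from the verification output above.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    alias: str
    status: str  # assumed values for this sketch: "ok" or "locked"


def validate_disks_not_locked(disks):
    """Return an error message if any disk is locked, otherwise None."""
    locked = [d.alias for d in disks if d.status == "locked"]
    if locked:
        return (
            "Cannot backup VM: The following disks are locked: "
            f"{', '.join(locked)}. Please try again in a few minutes."
        )
    return None
```

Running a check like this before the backup command is dispatched means a locked disk is rejected up front, and the request never reaches VDSM.
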
Comment 4 Sandro Bonazzola 2020-07-08 08:26:06 UTC
This bugzilla is included in oVirt 4.4.1 release, published on July 8th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.1 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

