Bug 1981297 - [RFE] Add new backup phases and disable backup/image transfers DB instant cleanup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.4.0
Hardware: Unspecified
OS: Unspecified
unspecified
high
Target Milestone: ovirt-4.4.8
: ---
Assignee: Pavel Bar
QA Contact: Amit Sharir
URL:
Whiteboard:
Depends On: 1980428
Blocks:
Reported: 2021-07-12 10:26 UTC by Pavel Bar
Modified: 2021-09-08 14:12 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-08 14:12:04 UTC
oVirt Team: Storage
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2021:3460 0 None None None 2021-09-08 14:12:16 UTC
oVirt gerrit 115008 0 None MERGED core: disable 'vm_backups' DB table cleanup after backup is over 2021-07-12 12:18:49 UTC
oVirt gerrit 115036 0 None MERGED BackupPhase: add new backup finished phases 2021-07-12 12:21:35 UTC
oVirt gerrit 115185 0 None MERGED core: add 2 new phase statuses for backup flows 2021-07-12 12:21:35 UTC
oVirt gerrit 115207 0 None MERGED core: update engine validations in backup flows 2021-07-12 12:21:35 UTC
oVirt gerrit 115251 0 None MERGED core: add DB cleanup thread to clean backups and image transfers 2021-07-12 12:21:35 UTC
oVirt gerrit 115267 0 None MERGED backup: use newly added "SUCCEEDED/FAILED" backup phases 2021-07-12 12:21:35 UTC
oVirt gerrit 115350 0 None MERGED core: disable 'image_transfers' DB table cleanup after image transfer is over 2021-07-12 12:21:35 UTC

Description Pavel Bar 2021-07-12 10:26:54 UTC
Description of problem:
After a backup / image transfer operation finishes, all of its execution data disappears.
This means that the user cannot see the final state of the operation, even though that state was visible via the DB and the API while the backup / image transfer was still in progress.
For backup operations the situation was even worse: there was no indication of success or failure, and the last status the user might be able to see was 'FINALIZING'.

How reproducible:
Simply execute backup / image transfer flows.

Steps to Reproduce:
1. Run backup / image transfer.
2. Check the "vm_backups" & "vm_backup_disk_map" tables (for backup), or the "image_transfers" table (for an image download, or for a full backup, which includes a "download" step). Also check via the REST API.

Actual results:
While the process is ongoing, you see the relevant data there.
When the operation is finished, the relevant DB entry disappears (both from the DB and from the REST API).

Expected results:
We want the data to be kept for some time, so that the user can see the operation result, and then to be deleted automatically. On the other hand, we also don't want to pollute the database with stale data.

Additional info:
What should be implemented and then tested:
1. Add 2 new backup phases to show possible execution end statuses: "SUCCEEDED" & "FAILED".
2. Disable the instant cleanup of the "vm_backups", "vm_backup_disk_map" & "image_transfers" DB tables after the backup / image transfer operation is over, to allow the user to retrieve the status via DB & API.
3. Add a scheduled DB cleanup thread that automatically cleans backups and image transfers once in a while: the thread runs every 10 minutes and cleans all succeeded entries that are 15 minutes old and all failed entries that are 30 minutes old.
There are separate values for backup and for image transfer operations, plus an additional value for the cleanup thread rate (all 5 values are configurable):
DbEntitiesCleanupRateInMinutes 10
SucceededBackupCleanupTimeInMinutes 15
FailedBackupCleanupTimeInMinutes 30
SucceededImageTransferCleanupTimeInMinutes 15
FailedImageTransferCleanupTimeInMinutes 30
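
The cleanup rule described above can be sketched as follows. This is only an illustration of the intended behaviour, not the engine code: the function name, the "kind" strings and the use of ">=" for the age comparison are assumptions, and only the five config names and their default values come from this report.

```python
from datetime import datetime, timedelta

# Default values from the list above, all in minutes.
CONFIG = {
    "DbEntitiesCleanupRateInMinutes": 10,
    "SucceededBackupCleanupTimeInMinutes": 15,
    "FailedBackupCleanupTimeInMinutes": 30,
    "SucceededImageTransferCleanupTimeInMinutes": 15,
    "FailedImageTransferCleanupTimeInMinutes": 30,
}


def is_due_for_cleanup(kind, succeeded, finished_at, now, config=CONFIG):
    """Return True if a finished entry is old enough to be removed.

    kind is "Backup" or "ImageTransfer" (hypothetical labels used only
    to pick the matching config key).
    """
    outcome = "Succeeded" if succeeded else "Failed"
    key = f"{outcome}{kind}CleanupTimeInMinutes"
    # Assumption: an entry is removed once its age reaches the threshold.
    return now - finished_at >= timedelta(minutes=config[key])


now = datetime(2021, 7, 12, 12, 0)
finished = now - timedelta(minutes=16)
print(is_due_for_cleanup("Backup", True, finished, now))
print(is_due_for_cleanup("Backup", False, finished, now))
```

Since the thread only runs every DbEntitiesCleanupRateInMinutes, an entry would presumably stay visible somewhat longer than its threshold, until the next cleanup run.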

Comment 3 Amit Sharir 2021-07-26 13:39:06 UTC
Version: 
vdsm-4.40.80.2-1.el8ev.x86_64
ovirt-engine-4.4.8.1-0.9.el8ev.noarch


Verification steps:
I split my verification into 2 main flows - "succeeded" and "failed" flows.


1. Created a VM with multiple disks via UI. 
2. Took multiple snapshots of the VM. 
3. Started a full backup (for double validation I did the full-backup scenario via API and SDK)
3a. API call <{{engine}}vms/35ae3ad2-f4cb-4849-9308-83c012e840ae/backups> 
3b. SDK script <python3 backup_vm.py -c engine full <vm-id>>
4. The full backup in step 3 created image transfer and backup objects.
5. Checked the "vm_backups", "vm_backup_disk_map" and "image_transfers" DB tables to check the phases.
6. Checked via the API the "image-transfer" and the "backup" phase - (API calls I used: https://<engine>/ovirt-engine/api/imagetransfers, https://<engine>/ovirt-engine/api/vms/<vm-uuid>/backups)  
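
For reference, the two API URLs from step 6 can be built like this. It is a minimal sketch: the helper name and the dictionary keys are made up for illustration, the host name is a placeholder, and authentication and the actual HTTP request are left out.

```python
from urllib.parse import quote


def status_urls(engine_host, vm_id):
    """Build the REST API URLs used in step 6 (hypothetical helper)."""
    base = f"https://{engine_host}/ovirt-engine/api"
    return {
        # Lists all image transfers known to the engine.
        "image_transfers": f"{base}/imagetransfers",
        # Lists the backups of one VM.
        "backups": f"{base}/vms/{quote(vm_id, safe='')}/backups",
    }


urls = status_urls("engine.example.com", "35ae3ad2-f4cb-4849-9308-83c012e840ae")
for name, url in urls.items():
    print(name, url)
```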

Succeeded flow: 
1. I checked that initially the phase for the backup was "succeeded" both via API and via DB. 
2. I checked that initially, the phase for the image-transfer was "9" in the DB and "finished_succeded" via API.
3. After ~15 minutes the backup and image-transfer objects vanished from both DB and API - as expected.
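
Step 3 of the succeeded flow could be automated with a small polling loop such as the one below. This is a sketch, not what was actually run during verification: "list_ids" stands for any caller-supplied function that returns the IDs currently visible (for example from the API or a DB query), and the clock and sleep parameters exist only so the loop can be exercised without waiting.

```python
import time


def wait_until_gone(list_ids, object_id, timeout_s, interval_s=30,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until object_id is no longer listed.

    Returns the elapsed seconds when the object disappeared,
    or None if it was still listed when the timeout was reached.
    """
    start = clock()
    while True:
        elapsed = clock() - start
        if object_id not in list_ids():
            return elapsed
        if elapsed >= timeout_s:
            return None
        sleep(interval_s)
```

With a real "list_ids" function, a timeout somewhat above the configured cleanup time plus the cleanup thread rate would presumably be a reasonable choice.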

Failed flow:
1. Since it is hard to reproduce a failed phase in a normal user flow, I changed the values in the tables with direct SQL updates.
2. For the vm_backups table I used SQL call <update vm_backups set phase = 'Failed' where phase = 'Succeeded';>
3. For the image_transfers table I used SQL call <update image_transfers set phase = 10 where phase = 9;>
4. Steps 2 and 3 changed the phase both in the DB and in the API.
4a. For the backup API: phase "succeeded" -> "failed".
4b. For the image-transfer API: phase "finished_succeded" -> "finished_failed".

Verification conclusions:
The expected output matched the actual output.
The whole flow described above completed with no errors.
The backup and image-transfer objects vanished after ~15/30 minutes, in accordance with the expected behaviour for their phase.

Comment 9 errata-xmlrpc 2021-09-08 14:12:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV Manager (ovirt-engine) [ovirt-4.4.8]), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3460

