Bug 1984308

Summary: [CBT] Full backup fails when trying to make another backup right after the previous one is reported as done
Product: [oVirt] ovirt-engine Reporter: sshmulev
Component: BLL.StorageAssignee: Pavel Bar <pbar>
Status: CLOSED CURRENTRELEASE QA Contact: sshmulev
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.4.7CC: aefrat, ahadas, bugs, dfodor, eshenitz
Target Milestone: ovirt-4.4.9Keywords: ZStream
Target Release: ---Flags: pm-rhel: ovirt-4.4+
eshenitz: blocker+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-10-21 07:27:14 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1923178, 2014017    
Bug Blocks:    
Attachments:
Description Flags
logs none

Description sshmulev 2021-07-21 08:19:35 UTC
Created attachment 1803984 [details]
logs

Description of problem:
When full Backup is reported that is done successfully, right after the second try it reports that is still in the middle of a backup operation:

Version-Release number of selected component (if applicable):
vdsm-4.40.70.6-1.el8ev.x86_64
ovirt-engine-4.4.7.7-0.1.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:
1.Perform full backup:
python3 ./backup_vm.py -c engine full <vm-uuid>
2.run the script above right after every message: "Full backup completed successfully" a couple of times until it fails.

Actual results:
ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[Cannot backup VM. The VM is during a backup operation.]". HTTP response code is 409.

Expected results:
Every new full backup should pass successfully after the previous is reported as done.

Additional info:

First backup:
# python3 ./backup_vm.py -c engine full ed2f0fbf-7250-497c-9b1c-46b33f0efc74
[   0.0 ] Starting full backup for VM 'ed2f0fbf-7250-497c-9b1c-46b33f0efc74'
[   1.6 ] Waiting until backup 'd35cc42b-5844-4476-b9a8-a79e37925a55' is ready
[  25.0 ] Created checkpoint '84bc4c4f-5178-46ac-989b-3ae9bd873b51' (to use in --from-checkpoint-uuid for the next incremental backup)
[  25.1 ] Creating image transfer for disk '3464c919-e88e-440d-b7f3-a39d7debe9e8'
[  26.4 ] Image transfer '77abcfd0-fab7-4471-9d07-5f6ec7a105a0' is ready
[ 100.00% ] 10.00 GiB, 11.00 seconds, 930.92 MiB/s                             
[  37.4 ] Finalizing image transfer
[  40.5 ] Finalizing backup
[  40.6 ] Waiting until backup is being finalized
[  40.7 ] Full backup completed successfully



second backup:
# python3 ./backup_vm.py -c engine full ed2f0fbf-7250-497c-9b1c-46b33f0efc74
[   0.0 ] Starting full backup for VM 'ed2f0fbf-7250-497c-9b1c-46b33f0efc74'
Traceback (most recent call last):
  File "./backup_vm.py", line 493, in <module>
    main()
  File "./backup_vm.py", line 176, in main
    args.command(args)
  File "./backup_vm.py", line 190, in cmd_full
    backup = start_backup(connection, args)
  File "./backup_vm.py", line 319, in start_backup
    from_checkpoint_id=args.from_checkpoint_uuid
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py", line 34139, in add
    return self._internal_add(backup, headers, query, wait)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 232, in _internal_add
    return future.wait() if wait else future
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 55, in wait
    return self._code(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 229, in callback
    self._check_fault(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 132, in _check_fault
    self._raise_error(response, body)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 118, in _raise_error
    raise error
ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[Cannot backup VM. The VM is during a backup operation.]". HTTP response code is 409.



Third backup:
# python3 ./backup_vm.py -c engine full ed2f0fbf-7250-497c-9b1c-46b33f0efc74
[   0.0 ] Starting full backup for VM 'ed2f0fbf-7250-497c-9b1c-46b33f0efc74'
[   1.3 ] Waiting until backup '6892f2f6-7f53-42e8-862d-d69b43f7c0e3' is ready
[  19.5 ] Created checkpoint '4a2c97ec-15b2-48f7-b4b3-5befd98df61a' (to use in --from-checkpoint-uuid for the next incremental backup)
[  19.6 ] Creating image transfer for disk '3464c919-e88e-440d-b7f3-a39d7debe9e8'
[  20.8 ] Image transfer 'e5d10a8e-4d3c-410f-a358-c9e9d0bada8e' is ready
[ 100.00% ] 10.00 GiB, 12.57 seconds, 814.94 MiB/s                             
[  33.3 ] Finalizing image transfer
[  35.4 ] Finalizing backup
[  35.6 ] Waiting until backup is being finalized
[  35.7 ] Full backup completed successfully

Comment 1 sshmulev 2021-10-17 09:08:59 UTC
Verified.

Versions:
ovirt-engine-4.4.9.2-0.6.el8ev.noarch
vdsm-4.40.90.2-1.el8ev.x86_64

Steps to Reproduce:
1. Perform full backup:
python3 ./backup_vm.py -c engine full <vm-uuid>
2. run the script above right after every message: "Full backup completed successfully" a couple of times until it fails.

Actual results:
The full backup one right after one works ok without errors as expected.

Comment 2 Sandro Bonazzola 2021-10-21 07:27:14 UTC
This bugzilla is included in oVirt 4.4.9 release, published on October 20th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.9 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.