Bug 1938750 - Cold migration from block sd to file sd fails - GenerationMismatch: The provided generation does not match the actual generation: 'requested=0, actual=3'
Summary: Cold migration from block sd to file sd fails - GenerationMismatch: The provi...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.4.5.9
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.4.5
Target Release: ---
Assignee: Benny Zlotnik
QA Contact: Evelina Shames
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-03-15 07:44 UTC by Evelina Shames
Modified: 2021-03-23 17:11 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2021-03-23 17:11:24 UTC
oVirt Team: Storage
Embargoed:
aoconnor: blocker+


Attachments
Logs (294.40 KB, application/zip)
2021-03-15 07:44 UTC, Evelina Shames
test_copy_disk.TestCaseCopyAttachedDisk(fcp SD) (1.35 MB, application/zip)
2021-03-15 08:53 UTC, Shir Fishbain
test_copy_disk.TestCaseCopyAttachedDisk(iscsi SD) (1.94 MB, application/zip)
2021-03-15 08:54 UTC, Shir Fishbain
Manual rep logs (1.20 MB, application/gzip)
2021-03-15 10:22 UTC, Avihai


Links
System | ID | Private | Priority | Status | Summary | Last Updated
oVirt gerrit | 113864 | 0 | None | MERGED | core: set vdsId for parameters | 2021-03-15 10:29:47 UTC
oVirt gerrit | 113891 | 0 | ovirt-engine-4.4.5.z | MERGED | core: set vdsId for parameters | 2021-03-15 15:45:27 UTC

Description Evelina Shames 2021-03-15 07:44:21 UTC
Created attachment 1763322 [details]
Logs

Description of problem:
As part of our automation tests, we saw that cold migration from iscsi/fc to nfs/gluster fails with the following errors:

2021-03-15 09:22:28,071+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-28) [disks_syncAction_d92398b6-b483-476b] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure.
2021-03-15 09:22:30,102+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-25) [disks_syncAction_d92398b6-b483-476b] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupVolumesDataCommand' with failure.
2021-03-15 09:22:31,109+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-35) [disks_syncAction_d92398b6-b483-476b] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyImageGroupWithDataCommand' with failure.
2021-03-15 09:22:34,144+02 ERROR [org.ovirt.engine.core.bll.storage.disk.MoveOrCopyDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-45) [disks_syncAction_d92398b6-b483-476b] Ending command 'org.ovirt.engine.core.bll.storage.disk.MoveOrCopyDiskCommand' with failure.
2021-03-15 09:22:34,159+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.MoveImageGroupCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-45) [disks_syncAction_d92398b6-b483-476b] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.MoveImageGroupCommand' with failure.
2021-03-15 09:22:34,317+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-45) [] EVENT_ID: USER_MOVED_DISK_FINISHED_FAILURE(2,011), User admin@internal-authz have failed to move disk disk_virtio_scsiraw_1509202149 to domain nfs_2.

vdsm:
2021-03-15 09:22:24,809+0200 DEBUG (tasks/7) [storage.TaskManager.Task] (Task='a8da8d7d-cc42-4682-a83f-6d72c460be0b') Job.run: running copy_data: <bound method Job.run of <Job id=9cdc8862-3a4a-4821-99c0-481bfcd52a1e status=pending at 0x140565881955496>> (args: () kwargs: {}) callback None (task:347)
2021-03-15 09:22:25,609+0200 ERROR (tasks/7) [root] Job '9cdc8862-3a4a-4821-99c0-481bfcd52a1e' failed (jobs:223)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/jobs.py", line 159, in run
    self._run()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdm/api/copy_data.py", line 91, in _run
    with self._dest.volume_operation():
  File "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 687, in operation
    raise se.GenerationMismatch(requested_gen, actual_gen)
vdsm.storage.exception.GenerationMismatch: The provided generation does not match the actual generation: 'requested=0, actual=3'
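
For readers unfamiliar with this error: the traceback above shows a context manager that compares the generation supplied by the caller with the generation recorded for the volume, and refuses to start when they differ. The following is only a minimal sketch of that kind of check, not vdsm's actual code. `ToyVolume`, its `operation()` signature, and the local `GenerationMismatch` class are hypothetical stand-ins, and bumping the generation after a successful operation is an assumption made for illustration.

```python
from contextlib import contextmanager


class GenerationMismatch(Exception):
    """Hypothetical stand-in for vdsm.storage.exception.GenerationMismatch."""

    def __init__(self, requested, actual):
        super().__init__(
            "The provided generation does not match the actual generation: "
            "'requested=%s, actual=%s'" % (requested, actual))
        self.requested = requested
        self.actual = actual


class ToyVolume:
    """Hypothetical in-memory volume that tracks only a generation counter."""

    def __init__(self, generation=0):
        self.generation = generation

    @contextmanager
    def operation(self, requested_gen):
        # Refuse to start when the caller's view of the volume is stale.
        if requested_gen != self.generation:
            raise GenerationMismatch(requested_gen, self.generation)
        yield
        # Assumption for this sketch: a successful operation advances the
        # generation, so a caller that keeps sending the old value fails.
        self.generation += 1


if __name__ == "__main__":
    vol = ToyVolume(generation=3)
    try:
        with vol.operation(requested_gen=0):
            pass  # never reached: the check fails on entry
    except GenerationMismatch as e:
        print(e)
```

In this sketch, a caller that still sends generation 0 for a volume whose generation is already 3 fails on entry with the same message text as in the log above.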


Version-Release number of selected component (if applicable):
ovirt-engine-4.4.5.9-0.1.el8ev.noarch
vdsm-4.40.50.8-1.el8ev.x86_64

How reproducible:
100% with automation; I could not reproduce it manually.

Steps to Reproduce:
1. Create VM from rhel8.3 template.
2. Add 4 disks on block sd (iscsi/fc):
   - disk_virtio_scsicow
   - disk_virtio_scsiraw
   - disk_virtiocow
   - disk_virtioraw
3. Move disks to file sd (nfs/gluster)

Actual results:
Migration fails

Expected results:
Operation should succeed.

Additional info:
Attaching engine+vdsm logs.

Comment 1 Shir Fishbain 2021-03-15 08:48:04 UTC
The same vdsm and engine errors also appear in test_copy_disk.TestCaseCopyAttachedDisk.test_same_domain_same_alias in our automation, when trying to copy a bootable disk to a block SD:

2021-03-11 20:57:31,068+0200 ERROR (tasks/1) [root] Job 'e773e798-0fec-4f0c-baf8-f8b2b4114160' failed (jobs:223)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/jobs.py", line 159, in run
    self._run()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sdm/api/update_volume.py", line 41, in _run
    self._vol_attr)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/volume.py", line 436, in update_attributes
    raise se.GenerationMismatch(generation, meta[sc.GENERATION])
vdsm.storage.exception.GenerationMismatch: The provided generation does not match the actual generation: 'requested=0, actual=1'

2021-03-13 01:04:26,369+02 ERROR [org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-77) [disks_syncAction_0218c270-12b9-4664] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.CopyDataCommand' with failure.

2021-03-13 01:04:32,550+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-41) [] EVENT_ID: USER_COPIED_DISK_FINISHED_FAILURE(2,007), User admin@internal-authz finished with error copying disk disk_virtio_scsicow_1300381397_copy_disk to domain iscsi_0.

2021-03-11 20:58:28,003+02 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-75) [] EVENT_ID: USER_COPIED_DISK_FINISHED_SUCCESS(2,006), User admin@internal-authz finished copying disk bootable_copy_disk to domain iscsi_0.

Logs are attached for the iscsi and fcp SDs.

Comment 2 Shir Fishbain 2021-03-15 08:53:19 UTC
Created attachment 1763332 [details]
test_copy_disk.TestCaseCopyAttachedDisk(fcp SD)

Comment 3 Shir Fishbain 2021-03-15 08:54:15 UTC
Created attachment 1763333 [details]
test_copy_disk.TestCaseCopyAttachedDisk(iscsi SD)

Comment 5 Avihai 2021-03-15 10:22:08 UTC
Created attachment 1763347 [details]
Manual rep logs

Comment 7 Evelina Shames 2021-03-18 18:56:57 UTC
Verified on rhv-4.4.5-10.

Comment 8 Sandro Bonazzola 2021-03-23 17:11:24 UTC
This bugzilla is included in the oVirt 4.4.5 release, published on March 18th 2021.

Since the problem described in this bug report should be resolved in the oVirt 4.4.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

