Description of problem:

Virtual Machine is stuck in Migrating status in the engine, but the migration has already completed.

engine=# select vm_name, status, migrating_to_vds from vms;
 vm_name | status | migrating_to_vds
---------+--------+------------------
 Ubuntu  |      5 |

From the engine side:

1. Migration starts:

2020-09-10 12:06:49,708+10 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-11) [8c0d1349-c17f-4d97-ad7c-ddbbf3f89643] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: Ubuntu, Source: devel.kvm, Destination: host1.kvm, User: admin@internal-authz).

2. Destination VDSM restarted:

2020-09-10 12:07:11,987+10 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1708) [] EVENT_ID: VDS_HOST_NOT_RESPONDING(9,027), Host host1.kvm is not responding. Host cannot be fenced automatically because power management for the host is disabled.

3. Migration finished, source reports the VM as Down:

2020-09-10 12:07:13,215+10 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-25) [25e0f57e] VM '45c63b1d-64e2-4e93-b84c-be6d343594d2' was reported as Down on VDS 'c306685d-2fd9-4bb1-b9d2-32c51d37b500'(devel.kvm)

4. VM handover:

2020-09-10 12:07:13,532+10 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-25) [25e0f57e] Handing over VM '45c63b1d-64e2-4e93-b84c-be6d343594d2'(Ubuntu) to Host '4815fb17-70fa-4c29-82b6-c3c6c9043237'. Setting VM to status 'MigratingTo'

5. Destroy on source:

2020-09-10 12:07:13,215+10 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-25) [25e0f57e] START, DestroyVDSCommand(HostName = devel.kvm, DestroyVmVDSCommandParameters:{hostId='c306685d-2fd9-4bb1-b9d2-32c51d37b500', vmId='45c63b1d-64e2-4e93-b84c-be6d343594d2', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 37cb9588

6. Destination VDSM restarted, host is now Up:

2020-09-10 12:07:16,492+10 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-8) [16554103] EVENT_ID: VDS_DETECTED(13), Status of host host1.kvm was set to Up.

7. Migration finished:

2020-09-10 12:07:26,999+10 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [] START, MigrateStatusVDSCommand(HostName = host1.kvm, MigrateStatusVDSCommandParameters:{hostId='4815fb17-70fa-4c29-82b6-c3c6c9043237', vmId='45c63b1d-64e2-4e93-b84c-be6d343594d2'}), log id: 6b956136
2020-09-10 12:07:27,002+10 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateStatusVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [] FINISH, MigrateStatusVDSCommand, return: , log id: 6b956136
2020-09-10 12:07:27,015+10 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [] EVENT_ID: VM_MIGRATION_DONE(63), Migration completed (VM: Ubuntu, Source: devel.kvm, Destination: host1.kvm, Duration: 26 seconds, Total: 37 seconds, Actual downtime: (N/A))

8. So far so good. But... the destination host is now reporting the VM as "Migration Source" instead of "Migration Destination" or "Up":

2020-09-10 12:07:26,986+10 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [] VM '45c63b1d-64e2-4e93-b84c-be6d343594d2'(Ubuntu) moved from 'MigratingTo' --> 'MigratingFrom'
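For reference, the stuck state can also be spotted programmatically in the engine database. A minimal sketch, assuming local access to the 'engine' DB, and assuming status codes 5 and 6 map to MigratingFrom and MigratingTo in the engine's VMStatus enum (consistent with the 'MigratingTo' --> 'MigratingFrom' transition in the log above):

    # Sketch: list VMs the engine still considers migrating.
    # Credentials/host are setup-specific; 5 = MigratingFrom and
    # 6 = MigratingTo are assumptions based on the logs above.
    import psycopg2

    conn = psycopg2.connect(dbname="engine", user="engine", host="localhost")
    with conn, conn.cursor() as cur:
        cur.execute("SELECT vm_name, status, migrating_to_vds FROM vms"
                    " WHERE status IN (5, 6)")
        for vm_name, status, dest in cur.fetchall():
            print(vm_name, status, dest)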
-------

Let's take a look at the VDSM side (destination):

1. Migration create, status is "Migration Destination":

2020-09-10 12:06:48,966+1000 INFO (jsonrpc/2) [api.virt] FINISH migrationCreate return={'status': {'code': 0, 'message': 'Done'}, 'migrationPort': 0, 'params': {'vmId': '45c63b1d-64e2-4e93-b84c-be6d343594d2', 'status': 'Migration Destination'

2. VDSM restarted:

2020-09-10 12:07:06,683+1000 INFO (MainThread) [vds] (PID: 5499) I am the actual vdsm 4.40.22.1 host1.kvm (4.18.0-193.13.2.el8_2.x86_64) (vdsmd:155)

3. Recovery of the VM:

2020-09-10 12:07:07,043+1000 INFO (vmrecovery) [vds] recovery [1:1/1]: recovered domain 45c63b1d-64e2-4e93-b84c-be6d343594d2 (recovery:123)

4. The VM is running fine on the destination host, but VDSM reports "Migration Source" forever:

2020-09-10 12:19:43,042+1000 INFO (jsonrpc/6) [throttled] Current getAllVmStats: {'45c63b1d-64e2-4e93-b84c-be6d343594d2': 'Migration Source'} (throttledlog:104)
2020-09-10 12:32:14,755+1000 INFO (jsonrpc/4) [throttled] Current getAllVmStats: {'45c63b1d-64e2-4e93-b84c-be6d343594d2': 'Migration Source'} (throttledlog:104)
2020-09-10 12:44:46,547+1000 INFO (jsonrpc/6) [throttled] Current getAllVmStats: {'45c63b1d-64e2-4e93-b84c-be6d343594d2': 'Migration Source'} (throttledlog:104)
2020-09-10 12:57:18,347+1000 INFO (jsonrpc/4) [throttled] Current getAllVmStats: {'45c63b1d-64e2-4e93-b84c-be6d343594d2': 'Migration Source'} (throttledlog:104)
2020-09-10 13:09:50,085+1000 INFO (jsonrpc/6) [throttled] Current getAllVmStats: {'45c63b1d-64e2-4e93-b84c-be6d343594d2': 'Migration Source'} (throttledlog:104)

So the VDSM restart made the VM status switch from "Migration Destination" to "Migration Source", and it never reports the VM as 'Up'. The VM is up and the underlying migration finished successfully.

Version-Release number of selected component (if applicable):
vdsm-4.40.22-1.el8ev.x86_64
ovirt-engine-4.4.1.10-0.1.el8ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. Start a migration
2. Restart VDSM on the destination host
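After reproducing, the wrong status can be confirmed directly on the destination host. A small sketch, assuming the vdsm.client Python binding shipped with VDSM (it returns the same data as 'vdsm-client Host getAllVmStats'):

    # Sketch: read the status VDSM reports for the recovered VM
    # (run on the destination host).
    from vdsm import client

    VM_ID = "45c63b1d-64e2-4e93-b84c-be6d343594d2"

    cli = client.connect("localhost", 54321)
    for vm in cli.Host.getAllVmStats():
        if vm["vmId"] == VM_ID:
            # On an affected host this prints 'Migration Source' even
            # though the guest is running here as the destination.
            print(vm["status"])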
Actual results:
- VM stuck in migrating state on the engine
- Destination VDSM reporting the wrong status

Expected results:
- VM in Up status

Additional info:

See this sequence:

2020-09-10 12:07:07,042+1000 INFO (vm/45c63b1d) [virt.vm] (vmId='45c63b1d-64e2-4e93-b84c-be6d343594d2') VM wrapper has started (vm:2497)
2020-09-10 12:07:07,043+1000 INFO (vmrecovery) [vds] VM recovered: 45c63b1d-64e2-4e93-b84c-be6d343594d2 (recovery:97)
2020-09-10 12:07:07,043+1000 INFO (vmrecovery) [vds] recovery [1:1/1]: recovered domain 45c63b1d-64e2-4e93-b84c-be6d343594d2 (recovery:123)
2020-09-10 12:07:07,276+1000 INFO (vm/45c63b1d) [virt.vm] (vmId='45c63b1d-64e2-4e93-b84c-be6d343594d2') Starting connection (guestagent:246)
2020-09-10 12:07:07,278+1000 INFO (vm/45c63b1d) [virt.vm] (vmId='45c63b1d-64e2-4e93-b84c-be6d343594d2') CPU stopped: domain initialization (vm:5551)
2020-09-10 12:07:07,289+1000 WARN (vm/45c63b1d) [vds] Not ready yet, ignoring event '|virt|VM_status|45c63b1d-64e2-4e93-b84c-be6d343594d2' args={'45c63b1d-64e2-4e93-b84c-be6d343594d2': {'status': 'Migration Source', 'elapsedTime': '1026', 'monitorResponse': '0', 'clientIp': '', 'timeOffset': '0', 'pauseCode': 'NOERR', 'cpuUser': '0.00', 'cpuSys': '0.00', 'cpuUsage': '0.00', 'displayInfo': [{'type': 'spice', 'port': '5900', 'tlsPort': '5901', 'ipAddress': '192.168.100.1'}, {'type': 'vnc', 'port': '5902', 'tlsPort': '', 'ipAddress': '192.168.100.1'}], 'hash': '-6523322131148256054', 'vcpuQuota': '-1', 'vcpuPeriod': 100000}} (clientIF:215)
2020-09-10 12:07:10,593+1000 INFO (libvirt/events) [virt.vm] (vmId='45c63b1d-64e2-4e93-b84c-be6d343594d2') CPU running: onResume (vm:5551)
2020-09-10 12:07:11,641+1000 INFO (jsonrpc/1) [throttled] Current getAllVmStats: {'45c63b1d-64e2-4e93-b84c-be6d343594d2': 'Migration Source'} (throttledlog:104)
2020-09-10 12:07:18,058+1000 INFO (vmrecovery) [vds] recovery [1/1]: preparing paths for domain 45c63b1d-64e2-4e93-b84c-be6d343594d2 (clientIF:738)
2020-09-10 12:07:18,088+1000 INFO (vmrecovery) [virt.vm] (vmId='45c63b1d-64e2-4e93-b84c-be6d343594d2') Enabling drive monitoring (drivemonitor:56)
2020-09-10 12:07:26,079+1000 INFO (jsonrpc/4) [api.virt] START getMigrationStatus() from=::ffff:192.168.100.253,48716, vmId=45c63b1d-64e2-4e93-b84c-be6d343594d2 (api:48)
2020-09-10 12:07:26,079+1000 INFO (jsonrpc/4) [api.virt] FINISH getMigrationStatus return={'status': {'code': 0, 'message': 'Done'}, 'migrationStats': {'status': {'code': 0, 'message': 'Migration in progress'}, 'progress': 0}} from=::ffff:192.168.100.253,48716, vmId=45c63b1d-64e2-4e93-b84c-be6d343594d2 (api:54)
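Note the getMigrationStatus answer above ('Migration in progress', progress 0): it looks like a freshly initialized, empty progress record rather than the outcome of the real migration. An illustrative sketch (not the actual VDSM code) of how recovery could produce such a default answer:

    # Illustrative only: after a restart, recovery rebuilds the VM
    # object's in-memory state from scratch, so a migration that
    # completed before the restart leaves no trace and the query
    # returns defaults.
    class MigrationProgress:
        def __init__(self):
            self.progress = 0
            self.message = "Migration in progress"

    class RecoveredVm:
        def __init__(self):
            self._migration = MigrationProgress()  # brand new, empty record

        def get_migration_status(self):
            return {"status": {"code": 0, "message": self._migration.message},
                    "progress": self._migration.progress}

    print(RecoveredVm().get_migration_status())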
(In reply to Germano Veit Michel from comment #0)
> Version-Release number of selected component (if applicable):
> vdsm-4.40.22-1.el8ev.x86_64
> ovirt-engine-4.4.1.10-0.1.el8ev.noarch

The above are the versions with which I reproduced the bug. But the customer hit it on:
ovirt-engine-4.3.9.4-11.el7.noarch
vdsm-4.30.44-1.el7ev.x86_64
What is the logic in setting the VM as Migration Source here if it's not post-copy?

https://github.com/oVirt/vdsm/blob/master/lib/vdsm/virt/vm.py#L915
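For illustration, here is a simplified, speculative reconstruction of what that recovery branch appears to do. The names and the 'incoming_migration' flag are mine, not VDSM's; the status strings match what the logs show. The point is that a domain found paused for migration during recovery seems to be labelled Migration Source without checking the migration direction:

    import libvirt  # libvirt-python, already a VDSM dependency

    MIGRATION_SOURCE = "Migration Source"
    MIGRATION_DESTINATION = "Migration Destination"
    UP = "Up"

    def classify_recovered_domain(dom, post_copy_running, incoming_migration):
        """Illustrative only: guess the VM status from the libvirt state."""
        state, reason = dom.state(0)
        if (state == libvirt.VIR_DOMAIN_PAUSED
                and reason == libvirt.VIR_DOMAIN_PAUSED_MIGRATION):
            if post_copy_running:
                # Post-copy: the guest already runs on the destination.
                return UP
            if incoming_migration:
                # Hypothetical direction check that would avoid this bug:
                # we are the destination, not the source.
                return MIGRATION_DESTINATION
            # Suspected problem: without a direction check, a domain
            # paused for migration is always treated as the *source*.
            return MIGRATION_SOURCE
        return UP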
A second VDSM restart fixes the status of the VM to Up.
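That would also fit the sketch above: by the time of the second restart the domain is plainly running, with no paused-for-migration reason, so recovery classifies it as Up (my assumption, following the illustrative classify_recovered_domain() above):

    class FakeDom:
        # Stand-in for a libvirt domain as seen on the second restart.
        def state(self, flags):
            return (libvirt.VIR_DOMAIN_RUNNING,
                    libvirt.VIR_DOMAIN_RUNNING_UNPAUSED)

    print(classify_recovered_domain(FakeDom(), post_copy_running=False,
                                    incoming_migration=False))  # -> 'Up'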
Works fine on vdsm-4.40.33-1.el8ev.x86_64.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Low: Red Hat Virtualization security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5179