Bug 1983694 - Migration hangs if vm is shutdown during live migration [rhel-8.4.0.z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.5
Assignee: Jiri Denemark
QA Contact: Fangge Jin
URL:
Whiteboard:
Depends On: 1949869
Blocks: 1966121
Reported: 2021-07-19 14:06 UTC by RHEL Program Management Team
Modified: 2021-08-31 08:15 UTC

Fixed In Version: libvirt-7.0.0-14.3.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1949869
Environment:
Last Closed: 2021-08-31 08:07:47 UTC
Type: ---
Target Upstream Version:
Embargoed:



Links
System                   ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata   RHBA-2021:3340  0        None      None    None     2021-08-31 08:08:00 UTC

Comment 5 Fangge Jin 2021-07-21 09:52:20 UTC
Test with libvirt-7.0.0-14.3

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Scenario 1: poweroff inside vm during live migration.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

I got three different kinds of results with virsh:
1) It returned "Input/output error"
[root@rhel8-4 ~]# virsh migrate avocado-vt-vm1 qemu+ssh://10.0.151.215/system --live --verbose --p2p --persistent --migrateuri tcp://10.0.151.215 --bandwidth 2
Migration: [  1 %]error: End of file while reading data: : Input/output error

2) It returned nothing
[root@rhel8-4 ~]# virsh migrate avocado-vt-vm1 qemu+ssh://10.0.151.215/system --live --verbose --p2p --persistent --migrateuri tcp://10.0.151.215 --bandwidth 2
Migration: [  1 %]
[root@rhel8-4 ~]# 

3) It returned "domain is not running", which is expected
[root@rhel8-4 ~]# virsh migrate avocado-vt-vm1 qemu+ssh://10.0.151.215/system --live --verbose --p2p --persistent --migrateuri tcp://10.0.151.215 --bandwidth 2
Migration: [  3 %]error: operation failed: domain is not running


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Scenario 2: destroy src vm during live migration.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[root@rhel8-4 ~]#  virsh migrate avocado-vt-vm1 qemu+ssh://10.0.151.215/system --live --verbose --p2p --persistent --migrateuri tcp://10.0.151.215 --bandwidth 2
Migration: [  0 %]error: operation failed: domain is not running


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Scenario 3: kill src qemu-kvm process during live migration.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#  virsh migrate avocado-vt-vm1 qemu+ssh://10.0.151.215/system --live --verbose --p2p --persistent --migrateuri tcp://10.0.151.215 --bandwidth 2
Migration: [  0 %]error: operation failed: domain is not running


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Scenario 4: destroy dest vm during live migration.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#  virsh migrate avocado-vt-vm1 qemu+ssh://10.0.151.215/system --live --verbose --p2p --persistent --migrateuri tcp://10.0.151.215 --bandwidth 2
Migration: [  0 %]error: End of file while reading data: : Input/output error

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Scenario 5: kill dest qemu-kvm process during live migration.~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#  virsh migrate avocado-vt-vm1 qemu+ssh://10.0.151.215/system --live --verbose --p2p --persistent --migrateuri tcp://10.0.151.215 --bandwidth 2
Migration: [  0 %]error: operation failed: domain is no longer running
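
The checks above all come down to two questions: did virsh return at all, and which message did it print? A minimal Python sketch of that check follows. It is an illustration only, not the harness used for this verification. The helper names (classify_output, run_with_hang_check), the outcome labels, and the timeout value are hypothetical; the matched substrings are copied from the virsh output above.

```python
import subprocess

# Substrings copied from the virsh output recorded in this comment.
NOT_RUNNING = ("domain is not running", "domain is no longer running")
EOF_IO = ("End of file while reading data",)


def classify_output(text):
    """Map captured virsh output to a (hypothetical) outcome label."""
    if any(s in text for s in NOT_RUNNING):
        return "domain-not-running"
    if any(s in text for s in EOF_IO):
        return "eof-io-error"
    if "error:" in text:
        return "other-error"
    # Scenario 1, case 2: progress was printed but no error followed.
    return "no-error-reported"


def run_with_hang_check(cmd, timeout_s):
    """Run cmd and return "hung" if it does not exit within timeout_s seconds.

    subprocess.run() kills the child when the timeout expires, so a stuck
    command does not block the caller.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return classify_output(proc.stdout + proc.stderr)


# Example call (assumes virsh and the hosts from this comment are available;
# the 600 second limit is an arbitrary choice):
# run_with_hang_check(
#     ["virsh", "migrate", "avocado-vt-vm1", "qemu+ssh://10.0.151.215/system",
#      "--live", "--verbose", "--p2p", "--persistent",
#      "--migrateuri", "tcp://10.0.151.215", "--bandwidth", "2"],
#     timeout_s=600,
# )
```

With such a helper, a "hung" result would correspond to the original problem in this bug, while any of the other labels means virsh returned.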

Comment 6 Jiri Denemark 2021-07-21 12:57:02 UTC
The different behavior can be partially caused by --verbose, because it results
in virDomainGetJobInfo calls (which issue the query-migrate QMP command) that
change the interactions and timing between the EOF callback and the thread
controlling the migration. Another reason (when the progress says 0%) might be
that QEMU was killed too early, before the actual migration started.

Anyway, scenario 1, case 2 is strange and deserves some investigation (as a
separate bug, because migration did not get stuck and thus this BZ is fixed)
in case you are able to reproduce it and provide debug logs from both the
source libvirtd and virsh.

Comment 7 Fangge Jin 2021-07-22 05:49:04 UTC
I can't reproduce scenario 1, case 2 now. I will file a separate bug if I manage to reproduce it in the future.

Comment 9 errata-xmlrpc 2021-08-31 08:07:47 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3340

