Created attachment 1684886 [details]
Example domain XML

Description of problem:
When a VM is started as paused from libvirt, migrated to another host and then migrated back to the original host, QEMU crashes with the following error:

qemu-kvm: Failed to load virtio_pci/modern_queue_state:avail
qemu-kvm: Failed to load virtio_pci/modern_state:vqs
qemu-kvm: Failed to load virtio/extra_state:extra_state
qemu-kvm: Failed to load virtio-rng:virtio
qemu-kvm: error while loading state for instance 0x0 of device '0000:00:01.2:00.0/virtio-rng'
qemu-kvm: load of migration failed: Input/output error
shutting down, reason=crashed

Version-Release number of selected component (if applicable):
qemu-kvm-4.2.0-19.module+el8.2.0+6296+6b821950.x86_64
kernel-4.18.0-193.el8.x86_64
libvirt-daemon-6.0.0-17.module+el8.2.0+6257+0d066c28.x86_64

I experience the bug also on el7 with qemu-kvm-ev-2.12.0-44.1.el7_8.1.x86_64.

How reproducible:
100% (on RHV hosts)

Steps to Reproduce:
1. Take the attached domain XML and start the corresponding VM as paused using virsh:
   # virsh create domain.xml --paused
2. Migrate the VM to another host:
   # virsh migrate test qemu+tls://ANOTHER-HOST/system --live
3. Migrate the VM back to the original host from ANOTHER-HOST:
   # virsh migrate test qemu+tls://ORIGINAL-HOST/system --live

Actual results:
The second migration fails with the error:
  operation failed: domain is not running

Expected results:
The migration succeeds.
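For convenience, the three virsh steps can be wrapped in a small script. This is only a sketch: ORIGINAL-HOST, ANOTHER-HOST and the domain name "test" are placeholders taken from the steps above, it is assumed to run on the original host, and TLS migration is assumed to be already configured between the two hosts.

  #!/bin/sh
  # Sketch of the ping-pong migration of a paused VM (run on ORIGINAL-HOST).
  set -e

  DOMAIN=test
  XML=domain.xml                 # the attached domain XML
  SRC=ORIGINAL-HOST              # placeholder hostnames
  DST=ANOTHER-HOST

  # 1. Start the VM paused on the original host.
  virsh create "$XML" --paused

  # 2. Migrate it to the other host.
  virsh migrate "$DOMAIN" "qemu+tls://$DST/system" --live

  # 3. Migrate it back; on the affected builds this step fails with
  #    "operation failed: domain is not running" and QEMU crashes.
  virsh -c "qemu+tls://$DST/system" migrate "$DOMAIN" "qemu+tls://$SRC/system" --live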
I could reproduce this bug via libvirt; I am still trying to reproduce it from the QEMU side.
I found two points needed to reproduce this bug:
1. It only reproduces when the firmware is OVMF; SeaBIOS is OK.
2. Ping-pong migration succeeds with OVMF + a running VM, but OVMF + a paused VM reproduces the bug.

I also reproduced it on the QEMU side; the steps are as follows:

1. Boot a VM with the command line [1] on the src host.
2. Boot a VM with the same command line, but with "-incoming defer" appended, on the dst host.
3. Execute QMP commands on the src & dst hosts:
   (1) src qmp:
   {"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]}}
   (2) dst qmp:
   {"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"late-block-activate","state":true}]}}
   {"execute":"migrate-incoming","arguments":{"uri":"tcp:[::]:49152"}}
   (3) src qmp:
   {"execute": "migrate","arguments":{"uri": "tcp:10.73.130.69:49152"}}
   {"execute":"query-migrate"}
   {"execute":"migrate-continue","arguments":{"state":"pre-switchover"}}
4. After migrating the VM from the src to the dst host, quit QEMU on the src host and start a VM there with the same command line [1] but with "-incoming defer" appended.
5. Execute QMP commands on the src & dst hosts to migrate the VM back to the src host:
   (1) dst qmp:
   {"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]}}
   {"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"late-block-activate","state":false}]}}
   (2) src qmp:
   {"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"late-block-activate","state":true}]}}
   {"execute":"migrate-incoming","arguments":{"uri":"tcp:[::]:49152"}}
   (3) dst qmp:
   {"execute": "migrate","arguments":{"uri": "tcp:10.73.130.67:49152"}}
   {"execute":"query-migrate"}
   {"execute":"migrate-continue","arguments":{"state":"pre-switchover"}}

Actual Result:
After step 5-(3), QEMU on the src host quits with the following errors, and QEMU on the dst host crashes (I will provide the core dump file later):

[root@hp-dl385g10-13 home]# sh libvirt2.sh
QEMU 4.2.0 monitor - type 'help' for more information
(qemu) 2020-05-07T08:53:21.238754Z qemu-kvm: Failed to load virtio_pci/modern_queue_state:avail
2020-05-07T08:53:21.238843Z qemu-kvm: Failed to load virtio_pci/modern_state:vqs
2020-05-07T08:53:21.238862Z qemu-kvm: Failed to load virtio/extra_state:extra_state
2020-05-07T08:53:21.238894Z qemu-kvm: Failed to load virtio-rng:virtio
2020-05-07T08:53:21.238920Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:01.2:00.0/virtio-rng'
2020-05-07T08:53:21.239074Z qemu-kvm: load of migration failed: Input/output error

[1] Command line:
/usr/libexec/qemu-kvm \
-name guest=test,debug-threads=on \
-S \
-blockdev node-name=libvirt-pflash0-storage,driver=file,auto-read-only=on,discard=unmap,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd \
-blockdev node-name=libvirt-pflash0-format,read-only=on,driver=raw,file=libvirt-pflash0-storage \
-blockdev node-name=libvirt-pflash1-storage,driver=file,auto-read-only=on,discard=unmap,filename=/tmp/OVMF_VARS.fd \
-blockdev node-name=libvirt-pflash1-format,read-only=off,driver=raw,file=libvirt-pflash1-storage \
-machine pc-q35-rhel8.1.0,accel=kvm,usb=off,dump-guest-core=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format \
-cpu qemu64 \
-m 1024 \
-overcommit mem-lock=off \
-smp 1,sockets=1,cores=1,threads=1 \
-uuid be5615db-b7fe-44f7-aacb-ce7ac05367ed \
-smbios type=1,manufacturer=oVirt,product=RHEL,version=8.2-1.0.el8,serial=5b34bd5b-2f90-4609-87c0-6f74ef4de39f,uuid=be5615db-b7fe-44f7-aacb-ce7ac05367ed,family=oVirt \
-display none \
-no-user-config -nodefaults \
-device sga \
-chardev socket,id=charmonitor,path=/home/hello1,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc \
-no-shutdown \
-boot menu=on,splash-time=30000,strict=on \
-device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
-device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
-device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
-device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \
-device pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \
-device qemu-xhci,id=usb,bus=pci.1,addr=0x0 \
-device virtio-serial-pci,id=ua-9c2cb3cc-b24f-4155-9eca-fc523fdee5d1,max_ports=16,bus=pci.5,addr=0x0 \
-chardev socket,id=charua-41db8caa-b9e3-461c-b7d2-dc343a26a5b2,path=/home/hello2,server,nowait \
-device isa-serial,chardev=charua-41db8caa-b9e3-461c-b7d2-dc343a26a5b2,id=ua-41db8caa-b9e3-461c-b7d2-dc343a26a5b2 \
-device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x0 \
-object rng-random,id=objua-40c09ebf-e493-4209-a7ff-860dd23b6081,filename=/dev/urandom \
-device virtio-rng-pci,rng=objua-40c09ebf-e493-4209-a7ff-860dd23b6081,id=ua-40c09ebf-e493-4209-a7ff-860dd23b6081,bus=pci.3,addr=0x0 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on \
-monitor stdio
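Note for anyone repeating steps 3 and 5 by hand: the QMP commands go to the control monitor socket (path=/home/hello1 in [1]), not to the stdio HMP monitor. A rough sketch of sending a single command from the shell follows; it assumes an nc with UNIX-socket support (e.g. nmap-ncat), and a real QMP client such as qmp-shell from the QEMU source tree is more convenient. The qmp_send helper below is hypothetical and not part of the original reproducer.

  # Hypothetical helper: send one QMP command to a monitor socket and print
  # whatever QEMU answers. The short sleep keeps the connection open long
  # enough for the reply to arrive.
  qmp_send() {
      socket=$1
      command=$2
      # Every QMP session must negotiate capabilities before other commands.
      { printf '%s\n%s\n' '{"execute":"qmp_capabilities"}' "$command"; sleep 1; } | nc -U "$socket"
  }

  # Example: step 3-(1) on the src host.
  qmp_send /home/hello1 '{"execute":"migrate-set-capabilities","arguments":{"capabilities":[{"capability":"pause-before-switchover","state":true}]}}'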
You say:

> Actual Result:
> After step 5-(3), qemu on src host will quit with following errors, and qemu on dst host will crash (will provide the core dump file later):
> [root@hp-dl385g10-13 home]# sh libvirt2.sh
> QEMU 4.2.0 monitor - type 'help' for more information
> (qemu) 2020-05-07T08:53:21.238754Z qemu-kvm: Failed to load virtio_pci/modern_queue_state:avail
> 2020-05-07T08:53:21.238843Z qemu-kvm: Failed to load virtio_pci/modern_state:vqs
> 2020-05-07T08:53:21.238862Z qemu-kvm: Failed to load virtio/extra_state:extra_state
> 2020-05-07T08:53:21.238894Z qemu-kvm: Failed to load virtio-rng:virtio
> 2020-05-07T08:53:21.238920Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:01.2:00.0/virtio-rng'
> 2020-05-07T08:53:21.239074Z qemu-kvm: load of migration failed: Input/output error

Those are the errors from the destination - what about the source side - what errors did it print?

(My guess is that this is a block assert; we've got a few relating to already paused migrations.)
(In reply to Dr. David Alan Gilbert from comment #3)
> You say:
>
> > Actual Result:
> > After step 5-(3), qemu on src host will quit with following errors, and qemu on dst host will crash (will provide the core dump file later):
> > [root@hp-dl385g10-13 home]# sh libvirt2.sh
> > QEMU 4.2.0 monitor - type 'help' for more information
> > (qemu) 2020-05-07T08:53:21.238754Z qemu-kvm: Failed to load virtio_pci/modern_queue_state:avail
> > 2020-05-07T08:53:21.238843Z qemu-kvm: Failed to load virtio_pci/modern_state:vqs
> > 2020-05-07T08:53:21.238862Z qemu-kvm: Failed to load virtio/extra_state:extra_state
> > 2020-05-07T08:53:21.238894Z qemu-kvm: Failed to load virtio-rng:virtio
> > 2020-05-07T08:53:21.238920Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:01.2:00.0/virtio-rng'
> > 2020-05-07T08:53:21.239074Z qemu-kvm: load of migration failed: Input/output error
>
> Those are the errors from the destination - what about the source side - what errors did it print?

The errors on the source side are:

(qemu) qemu-kvm: block.c:5659: bdrv_inactivate_recurse: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
libvirt.sh: line 39: 367426 Aborted (core dumped) /usr/libexec/qemu-kvm -name guest=test,debug-threads=on -S -blockdev node-name=libvirt-pflash0-storage,driver=file,auto-read-only=on,discard=unmap,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd -blockdev node-name=libvirt-pflash0-format,read-only=on,driver=raw,file=libvirt-pflash0-storage -blockdev node-name=libvirt-pflash1-storage,driver=file,auto-read-only=on,discard=unmap,filename=/tmp/OVMF_VARS.fd -blockdev node-name=libvirt-pflash1-format,read-only=off,driver=raw,file=libvirt-pflash1-storage -machine pc-q35-rhel8.1.0,accel=kvm,usb=off,dump-guest-core=off,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format -cpu qemu64 -m 1024 -overcommit mem-lock=off -smp 1,sockets=1,cores=1,threads=1 -uuid be5615db-b7fe-44f7-aacb-ce7ac05367ed -smbios type=1,manufacturer=oVirt,product=RHEL,version=8.2-1.0.el8,serial=5b34bd5b-2f90-4609-87c0-6f74ef4de39f,uuid=be5615db-b7fe-44f7-aacb-ce7ac05367ed,family=oVirt -display none -no-user-config -nodefaults -device sga -chardev socket,id=charmonitor,path=/home/hello1,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot menu=on,splash-time=30000,strict=on -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 -device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 -device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 -device pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 -device qemu-xhci,id=usb,bus=pci.1,addr=0x0 -device virtio-serial-pci,id=ua-9c2cb3cc-b24f-4155-9eca-fc523fdee5d1,max_ports=16,bus=pci.5,addr=0x0 -chardev socket,id=charua-41db8caa-b9e3-461c-b7d2-dc343a26a5b2,path=/home/hello2,server,nowait -device isa-serial,chardev=charua-41db8caa-b9e3-461c-b7d2-dc343a26a5b2,id=ua-41db8caa-b9e3-461c-b7d2-dc343a26a5b2 -device virtio-balloon-pci,id=balloon0,bus=pci.2,addr=0x0 -object rng-random,id=objua-40c09ebf-e493-4209-a7ff-860dd23b6081,filename=/dev/urandom -device virtio-rng-pci,rng=objua-40c09ebf-e493-4209-a7ff-860dd23b6081,id=ua-40c09ebf-e493-4209-a7ff-860dd23b6081,bus=pci.3,addr=0x0 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on -monitor stdio -incoming defer

Please see the attachment for the qemu core dump log from the source side.

>
> (My guess is that this is a block assert; we've got a few relating to already paused migrations.)
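(As an aside, for anyone who wants to pull a backtrace out of such a crash themselves, a rough sketch follows. The core file name core.367426 is a placeholder derived from the PID in the shell output above, and gdb needs the matching qemu-kvm debuginfo packages installed.)

  # If systemd-coredump caught the abort, open the most recent qemu-kvm dump:
  coredumpctl gdb /usr/libexec/qemu-kvm

  # Or, with a raw core file on disk, dump all thread backtraces to a file:
  gdb /usr/libexec/qemu-kvm core.367426 \
      -batch -ex 'thread apply all bt full' > qemu-core-dump-log.txt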
Created attachment 1686213 [details]
qemu-core-dump-log
Yeah, that's the assert I thought it would be; this is a dupe.

*** This bug has been marked as a duplicate of bug 1713009 ***