Bug 1733203 - guest sometimes couldn't boot when do migration during the early stage of boot
Summary: guest sometimes couldn't boot when do migration during the early stage of boot
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.1
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Dr. David Alan Gilbert
QA Contact: Li Xiaohui
URL:
Whiteboard:
Depends On: 1732846
Blocks:
Reported: 2019-07-25 12:17 UTC by Li Xiaohui
Modified: 2020-02-05 23:01 UTC (History)
CC List: 10 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1732846
Environment:
Last Closed:
Type: Bug
Target Upstream Version:



Description Li Xiaohui 2019-07-25 12:17:00 UTC
+++ This bug was initially created as a clone of Bug #1732846 +++

Description of problem:
The guest sometimes fails to boot when migration is performed during the early stage of boot.


Version-Release number of selected component (if applicable):
src&dst host info: kernel-4.18.0-118.el8.x86_64 & qemu-img-2.12.0-82.module+el8.1.0+3738+0d8c0249.x86_64
guest info: win10 guest with virtio-win-prewhql-0.1-172


How reproducible:
2/3


Steps to Reproduce:
1. Boot the guest with "-S" and "-monitor tcp.." on the src host:
/usr/libexec/qemu-kvm \
-S \
-enable-kvm \
-nodefaults \
-machine q35 \
-m 8G  \
-smp 8  \
-cpu EPYC \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
-device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-2 \
-drive file=/mnt/glusterfs/win10-64-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,media=disk,cache=none,werror=stop,rerror=stop \
-device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,queues=4 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=d0:67:26:cc:07:3c,bus=pcie.0-root-port-3,vectors=10,mq=on \
-qmp tcp:0:3333,server,nowait \
-vnc :1 \
-device VGA \
-monitor tcp:0:5555,server,nowait
2. Boot the guest on the dst host without "-S" and with "-monitor tcp..." and "-incoming tcp..":
/usr/libexec/qemu-kvm \
-enable-kvm \
-nodefaults \
-machine q35 \
-m 8G  \
-smp 8  \
-cpu EPYC \
-device pcie-root-port,id=pcie.0-root-port-2,slot=2,chassis=2,addr=0x2,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-3,slot=3,chassis=3,addr=0x3,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-4,slot=4,chassis=4,addr=0x4,bus=pcie.0 \
-device pcie-root-port,id=pcie.0-root-port-5,slot=5,chassis=5,addr=0x5,bus=pcie.0 \
-device virtio-scsi-pci,id=scsi0,bus=pcie.0-root-port-2 \
-drive file=/mnt/glusterfs/win10-64-virtio-scsi.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,media=disk,cache=none,werror=stop,rerror=stop \
-device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown,queues=4 \
-device virtio-net-pci,netdev=hostnet0,id=net0,mac=d0:67:26:cc:07:3c,bus=pcie.0-root-port-3,vectors=10,mq=on \
-qmp tcp:0:3333,server,nowait \
-vnc :1 \
-device VGA \
-monitor tcp:0:5555,server,nowait \
-incoming tcp:0:5800
3. Start live migration by running the script below on the src host:
[root@hp-dl385g10-13 qemu-sh]# echo c | nc localhost 5555; sleep 0.6; echo migrate tcp:10.73.130.69:5800 | nc localhost 5555
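The one-liner in step 3 can be wrapped in a small script so that the delay between resuming the guest and starting migration is easy to vary. This is only a sketch: the helper names `migrate_cmd` and `hmp_send` and the `MON_PORT`, `DST`, `PORT`, `DELAY` and `RUN` variables are illustrative and not part of the original report. It assumes `nc` is installed and the HMP monitor listens on localhost:5555 as in step 1.

```shell
#!/bin/sh
# Sketch only: the helper and variable names are hypothetical, not from the report.
MON_PORT="${MON_PORT:-5555}"   # HMP monitor port from "-monitor tcp:0:5555"
DST="${DST:-10.73.130.69}"     # destination host used in step 3
PORT="${PORT:-5800}"           # port from "-incoming tcp:0:5800"
DELAY="${DELAY:-0.6}"          # seconds between "c" and "migrate"

# Build the HMP migrate command for a TCP destination.
migrate_cmd() {
    printf 'migrate tcp:%s:%s\n' "$1" "$2"
}

# Send one HMP command to the monitor on localhost.
hmp_send() {
    printf '%s\n' "$1" | nc localhost "$MON_PORT"
}

# Resume the guest started with -S, wait, then start migration.
# Runs only when RUN=1, so sourcing the file has no side effects.
if [ "${RUN:-0}" = "1" ]; then
    hmp_send c
    sleep "$DELAY"
    hmp_send "$(migrate_cmd "$DST" "$PORT")"
fi
```

Running it as `RUN=1 DELAY=0.6 sh script.sh` on the src host would mirror the original one-liner; changing `DELAY` is one way to probe how early in boot the migration has to start for the hang to appear.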


Actual results:
After step 3, migration completes successfully in the src host QEMU and the status in the dst host QEMU is running, but the guest display stays at "Booting from Hard Disk" (still in this state after waiting 20 minutes), as shown in attachment picture1.
(1) On src host:
[root@hp-dl385g10-13 qemu-sh]# nc localhost 5555
QEMU 2.12.0 monitor - type 'help' for more information
(qemu) info status
info status
VM status: paused (postmigrate)
(qemu) info migrate
info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off postcopy-ram: off x-colo: off release-ram: off return-path: off pause-before-switchover: off x-multifd: off dirty-bitmaps: off late-block-activate: off 
Migration status: completed
total time: 581 milliseconds
downtime: 13 milliseconds
setup: 36 milliseconds
transferred ram: 19806 kbytes
throughput: 283.03 mbps
remaining ram: 0 kbytes
total ram: 8405832 kbytes
duplicate: 2101192 pages
skipped: 0 pages
normal: 334 pages
normal bytes: 1336 kbytes
dirty sync count: 3
page size: 4 kbytes
(2) On dst host:
[root@hp-dl385g10-14 ~]# nc localhost 5555
QEMU 2.12.0 monitor - type 'help' for more information
(qemu) info status
info status
VM status: running
(qemu) info migrate	
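The fields in the src host "info migrate" output can be pulled out with a few lines of shell, for example to check the migration status automatically when repeating the test. This is a sketch under stated assumptions: `migrate_field` is a hypothetical helper, and the sample text is copied from the src host output above rather than read from a live monitor.

```shell
#!/bin/sh
# Sketch only: migrate_field is a hypothetical helper, not from the report.
# Print the value of one "name: value" line from "info migrate" output on stdin.
migrate_field() {
    sed -n "s/^$1: *//p"
}

# Sample lines copied from the src host output in this report.
sample='Migration status: completed
total time: 581 milliseconds
normal: 334 pages
normal bytes: 1336 kbytes
page size: 4 kbytes'

status=$(printf '%s\n' "$sample" | migrate_field 'Migration status')
normal=$(printf '%s\n' "$sample" | migrate_field 'normal' | cut -d' ' -f1)
psize=$(printf '%s\n' "$sample" | migrate_field 'page size' | cut -d' ' -f1)

# normal pages x page size should match the reported "normal bytes" value.
echo "status=$status normal_kbytes=$((normal * psize))"
```

With the sample above, 334 pages at 4 kbytes each gives 1336 kbytes, which agrees with the "normal bytes" line in the report.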


Expected results:
guest starts successfully after migration


Additional info:
1. A rhel8.1.0 guest also hits this issue in the same environment (1/4).
2. On fast train (qemu-img-4.0.0-5.module+el8.1.0+3622+5812d9bf.x86_64), the win10 guest hits this issue (reproduced 1/5); will clone a bz after trying on qemu-kvm 4.1.

--- Additional comment from Li Xiaohui on 2019-07-24 13:57 UTC ---

Comment 1 Li Xiaohui 2019-07-25 12:21:52 UTC
Reproduced this bz on a rhel8.1.0 host (upstream qemu-4.1.0-rc1 and kernel-4.18.0-119.el8.x86_64) using the same commands as in comment 0 with a rhel8.1.0 guest; reproduced 1/10 times.

Comment 5 Ademar Reis 2020-02-05 23:01:33 UTC
QEMU has been recently split into sub-components and as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks

