Red Hat Bugzilla – Bug 623552
scp of an image from host to guest fails with vhost=on during migration
Last modified: 2013-01-09 18:00:20 EST
Description of problem:
Boot a guest on host A, then scp a large qcow2 image (rhel6.0_2.59_64.qcow2) from the host to the guest. While the scp is in progress, migrate the guest from host A to host B. As soon as migration completes, the scp fails.

Version-Release number of selected component (if applicable):
Host kernel and qemu-kvm:
# uname -r
2.6.32-62.el6.x86_64
# rpm -q qemu-kvm
qemu-kvm-0.12.1.2-2.109.el6.x86_64
Guest kernel:
# uname -r
2.6.32-59.1.el6.x86_64

How reproducible:

Steps to Reproduce:
1. Boot the guest on host A:
# /usr/libexec/qemu-kvm -m 4G -smp 2 -drive file=/home/rhel6.0_2.59_64.qcow2,if=none,id=test,boot=on,cache=none,format=qcow2,werror=stop,rerror=stop -device virtio-blk-pci,drive=test -cpu qemu64,+sse2,+x2apic,-kvmclock -monitor stdio -drive file=/root/zhangjunyi/boot.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,drive=drive-ide0-1-0 -boot order=cdn,menu=on -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:11:22:45:66:94 -vnc :10 -qmp tcp:0:4445,server,nowait
2. Start the destination in listening mode on remote host B:
# <commandLine> -incoming tcp:0:5555
3. After the guest has booted, scp rhel6.0_2.59_64.qcow2 from host A to the guest:
# scp rhel6.0_2.59_64.qcow2 10.66.91.95:/root
4. While the scp is in progress, migrate from host A to host B:
{"execute": "migrate", "arguments": {"uri": "tcp:10.66.91.104:5555"}}

Actual results:
Once migration completes, the scp fails with the following messages:
# scp rhel6.0_2.59_64.qcow2 10.66.91.186:/root/
root@10.66.91.186's password:
rhel6.0_2.59_64.qcow2    29%  541MB   2.5MB/s   08:49 ETA
Received disconnect from 10.66.91.186: 2: Packet corrupt
lost connection

We can reconnect to the guest after the scp fails, and rerunning the command works:
# scp rhel6.0_2.59_64.qcow2 10.66.91.186:/root/
root@10.66.91.186's password:
rhel6.0_2.59_64.qcow2    16%  304MB  44.2MB/s   00:35 ETA

Expected results:
scp of a large file from host to guest succeeds while migration is in progress.
Additional info:
1. With vhost=off, "scp rhel6.0_2.59_64.qcow2 10.66.91.186:/root/" completes successfully; the issue is hit only with vhost=on.
2. Without migration, the same scp completes successfully with either vhost=on or vhost=off.
I suspect this problem occurs because the vhost device is not stopped until the virtio-net save routine is called. At that point a final dirty bitmap sync is done with the kernel, but qemu has already done its final round of syncing dirty pages, so these changes are lost. Reassigning to Michael to resolve.
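The suspected ordering can be illustrated with a toy shell sketch (not qemu code; the two variables stand in for one guest page on the source and destination):

```shell
# Toy model of the suspected race, not qemu code: qemu's final dirty-page
# sync runs while the vhost device is still writing, because vhost is only
# stopped later, in the virtio-net save routine.
src_page="old"            # contents of one guest page on the source
dest_page="$src_page"     # qemu's final sync copies what it sees now
src_page="new"            # vhost dirties the page after that final sync
# vhost is stopped only here (virtio-net save); its dirty bit was never
# merged into qemu's final round, so the write above never reaches the
# destination and the two hosts disagree:
echo "src=$src_page dest=$dest_page"
```

Running this prints `src=new dest=old`: the page the guest last wrote is not the page the destination received, which is consistent with the corrupt scp stream seen above.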
If the issue happens after migration, try commit ae8894c00b560bde4cbbc2115f532df997e15d14 from git://git.kernel.org/pub/scm/virt/kvm/kvm.git
Try this kernel build: https://brewweb.devel.redhat.com/taskinfo?taskID=2861817 If it fixes the issue, this is a kernel bug, the same as bug 647367.
Additionally, please try running with the sndbuf=0 option for tap, to check whether the issue is packets sent after migration confusing the bridge.
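For that test, the tap netdev from the command line in comment 0 would change as sketched below ("..." stands for the reporter's unchanged drive/cpu/display/monitor arguments; keep them as in comment 0):

```shell
# Comment 0's invocation with sndbuf=0 added to the tap netdev, which
# removes the tap send-buffer limit; all other options are unchanged and
# elided here with "...".
/usr/libexec/qemu-kvm ... \
    -netdev tap,id=hostnet0,vhost=on,sndbuf=0 \
    -device virtio-net-pci,netdev=hostnet0,id=net0,mac=22:11:22:45:66:94 \
    ...
```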
Verified on qemu-kvm-0.12.1.2-2.129.el6.x86_64: tested twice using the same steps as comment 0 and did not hit this issue. The issue has been fixed.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Cause: a bug in syncing vhost's dirty bits in qemu-kvm.
Consequence: some dirty pages may not be transferred to the destination host, so an scp in progress may fail during migration.
Fix: fix the dirty-page bit sync code in vhost.
Result: scp during migration no longer fails.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2011-0534.html