Description of problem:
When a VM is migrated while a stress test is running, qemu-kvm hangs during the migration.

Version-Release number of selected component (if applicable):
Host OS version:
Linux amd-8750-4-2 2.6.18-157.el5 #1 SMP Mon Jul 6 18:12:07 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
Also reproducible on rhev-hypervisor-5.4-2.0.99.10.3.el5rhev

Host KVM version:
etherboot-zroms-kvm-5.4.4-10.el5
kvm-debuginfo-83-87.el5
kmod-kvm-83-87.el5
kvm-83-87.el5
kvm-qemu-img-83-87.el5
kvm-tools-83-87.el5

Guest OS version:
RHEL-5.3-Server x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the VM.
2. Run the stress test with the command line:
   stress -c N -i N -d N -m N
   where N is 2 * the number of vcpus.
3. Do the migration.

Actual results:
1. qemu-kvm hangs during migration.

Expected results:
1. Migration should finish successfully.

Additional info:
1. qemu-kvm cmdline:
src:
qemu-kvm -drive file=RHEL-Server-5.3-64.0.qcow2,if=ide,cache=off,index=0 -net nic,vlan=0,model=e1000,macaddr=00:33:44:55:11:22 -net tap,vlan=0 -vnc :10 -m 2048 -smp 2 -no-hpet -rtc-td-hack -cpu qemu64,+sse2 -vnc :10 -monitor stdio
dst:
qemu-kvm -drive file=RHEL-Server-5.3-64.0.qcow2,if=ide,cache=off,index=0 -net nic,vlan=0,model=e1000,macaddr=00:33:44:55:11:22 -net tap,vlan=0 -vnc :10 -m 2048 -smp 2 -no-hpet -rtc-td-hack -cpu qemu64,+sse2 -vnc :10 -monitor stdio -incoming tcp:0:4444
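The stress invocation in step 2 depends on the guest's vcpu count, so it may help to spell out the arithmetic. The following is only a sketch: the helper name stress_cmd is hypothetical (not part of the report), and it prints the command line rather than running it, so it works without stress installed.

```shell
#!/bin/sh
# stress_cmd is a hypothetical helper name used for illustration only.
# It takes the guest's vcpu count and prints the stress command line
# with N = 2 * vcpus, as described in the steps to reproduce.
stress_cmd() {
    vcpus=$1
    n=$((vcpus * 2))
    echo "stress -c $n -i $n -d $n -m $n"
}

# The guest in this report was started with -smp 2, so N is 4.
stress_cmd 2
```

Inside the guest, something like `stress_cmd "$(nproc)"` would pick up the vcpu count automatically, assuming nproc from coreutils is available there.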
Created attachment 351462 [details] strace result on the src host
Created attachment 351463 [details] strace result on the src host
Created attachment 351464 [details] strace result on the dst host
Can also be reproduced on 83-81el5 and 83-71el5.
It's probably a duplicate of bug 511199, given the number of EAGAINs on the source and the stall on recvfrom on the destination. This report is, however, much more complete. I'll try to reproduce it. Meanwhile, can you try it with the patch Dor posted on that BZ? Thanks!
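To put a number on "the number of EAGAINs" in an strace log, a count of failed syscalls by errno is usually enough. This is a rough sketch, not the method used in this bug: the helper name count_errno and the log file name are hypothetical, and it assumes the log is in strace's default text format, where a failed call ends in "= -1 ERRNO (description)".

```shell
#!/bin/sh
# count_errno is a hypothetical helper name used for illustration only.
# $1: errno name to look for, e.g. EAGAIN
# $2: path to an strace log in strace's default text format
# Prints the number of lines where a syscall returned -1 with that errno.
count_errno() {
    grep -c -- "= -1 $1 " "$2"
}

# Example usage (src.strace is a placeholder file name):
#   count_errno EAGAIN src.strace
```

A high count on the source side only suggests that non-blocking writes were repeatedly retried; it does not by itself confirm the duplicate.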
btw, can you point me to this "stress" thing?
Created attachment 351618 [details] Stress test
I've tested this case on 83-93el5; it could not be reproduced.
Tested on kvm-83-94.el5.

Tested with 2 vcpus and 4 vcpus (the host has 4 CPUs).

Commands used:
source:
/usr/libexec/qemu-kvm -no-hpet -rtc-td-hack -smp 2 -m 2G -name vm1 -drive file=/mnt/RHEL-Server-5.4-32.qcow2,if=ide,cache=off,index=0 -uuid d073cee9-8836-47da-b4c1-f5583f2dc747 -net nic,macaddr=00:26:9B:DE:C8:58,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup-switch -usbdevice tablet -vnc :5 -boot c -monitor stdio

Run the stress test on the VM:
#stress -c N -i N -d N -m N

destination:
/usr/libexec/qemu-kvm -no-hpet -rtc-td-hack -smp 2 -m 2G -name vm1 -drive file=/mnt/RHEL-Server-5.4-32.qcow2,if=ide,cache=off,index=0 -uuid d073cee9-8836-47da-b4c1-f5583f2dc747 -net nic,macaddr=00:26:9B:DE:C8:58,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup-switch -usbdevice tablet -vnc :5 -boot c -monitor stdio -incoming tcp:0:6000

Tried five times; could not reproduce.

Changing the status to *VERIFIED*; please reopen it if the issue is reproduced.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-1272.html