Bug 511031 - qemu-kvm hang during migration when stress test is running in the guest
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.4
Hardware: All Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Glauber Costa
QA Contact: Lawrence Lim
Depends On:
Blocks: LiveMigration
Reported: 2009-07-13 07:42 EDT by jason wang
Modified: 2014-03-25 20:58 EDT (History)

See Also:
Fixed In Version: kvm-83-93.el5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2009-09-02 05:34:04 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
strace result on the src host (115 bytes, application/x-bzip-compressed-tar)
2009-07-13 07:48 EDT, jason wang
strace result on the src host (1.38 MB, application/x-bzip-compressed-tar)
2009-07-13 07:51 EDT, jason wang
strace result on the dst host (1.98 MB, application/x-bzip-compressed-tar)
2009-07-13 07:57 EDT, jason wang
Stress test (159.88 KB, application/x-gzip)
2009-07-14 11:47 EDT, jason wang

Description jason wang 2009-07-13 07:42:34 EDT
Description of problem:
When a VM is migrated while a stress test is running in the guest, qemu-kvm hangs during the migration.

Version-Release number of selected component (if applicable):
Host OS version:
Linux amd-8750-4-2 2.6.18-157.el5 #1 SMP Mon Jul 6 18:12:07 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
Also reproducible on rhev-hypervisor-5.4-2.0.99.10.3.el5rhev
Host KVM version:
etherboot-zroms-kvm-5.4.4-10.el5
kvm-debuginfo-83-87.el5
kmod-kvm-83-87.el5
kvm-83-87.el5
kvm-qemu-img-83-87.el5
kvm-tools-83-87.el5
Guest OS version:
RHEL-5.3-Server x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot the VM.
2. Run the stress test in the guest with the command line: stress -c N -i N -d N -m N, where N is 2 * the number of vCPUs.
3. Migrate the VM.
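
As an illustration of step 2, the following is a minimal sketch of how the stress command line could be built from the vCPU count. The helper name stress_cmdline is hypothetical and not part of the original report; it assumes the vCPU count is passed as the first argument and otherwise falls back to the number of online processors reported by getconf inside the guest.

```shell
#!/bin/sh
# Hypothetical helper: print the stress command line for a given vCPU count.
# N is twice the number of vCPUs, as described in step 2.
stress_cmdline() {
    # Assumption: use the first argument if given, else the online CPU count.
    vcpus="${1:-$(getconf _NPROCESSORS_ONLN)}"
    n=$((vcpus * 2))
    echo "stress -c $n -i $n -d $n -m $n"
}

# For the 2-vCPU guest in this report (-smp 2):
stress_cmdline 2
```

For the guest started with -smp 2, this prints stress -c 4 -i 4 -d 4 -m 4, which can then be run inside the guest before starting the migration.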
  
Actual results:
1. qemu-kvm hangs during migration.

Expected results:
1. migration should finish successfully.

Additional info:
1. qemu-kvm cmdline:
src:
qemu-kvm -drive file=RHEL-Server-5.3-64.0.qcow2,if=ide,cache=off,index=0 -net nic,vlan=0,model=e1000,macaddr=00:33:44:55:11:22 -net tap,vlan=0 -vnc :10 -m 2048 -smp 2 -no-hpet -rtc-td-hack -cpu qemu64,+sse2 -vnc :10 -monitor stdio
dst:
qemu-kvm -drive file=RHEL-Server-5.3-64.0.qcow2,if=ide,cache=off,index=0 -net nic,vlan=0,model=e1000,macaddr=00:33:44:55:11:22 -net tap,vlan=0 -vnc :10 -m 2048 -smp 2 -no-hpet -rtc-td-hack -cpu qemu64,+sse2 -vnc :10 -monitor stdio -incoming tcp:0:4444
Comment 1 jason wang 2009-07-13 07:48:04 EDT
Created attachment 351462
strace result on the src host
Comment 2 jason wang 2009-07-13 07:51:45 EDT
Created attachment 351463
strace result on the src host
Comment 3 jason wang 2009-07-13 07:57:52 EDT
Created attachment 351464
strace result on the dst host
Comment 4 jason wang 2009-07-14 02:14:59 EDT
Could also be reproduced with 83-81el5 and 83-71el5.
Comment 5 Glauber Costa 2009-07-14 11:04:46 EDT
It's probably a duplicate of bug 511199, given the number of EAGAINs in the source strace and the stall on recvfrom in the destination.

This report is, however, much more complete. I'll try to reproduce it.
Meanwhile, can you try it with the patch Dor posted on that BZ?

thanks!
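
The EAGAIN observation above can be checked against a saved trace with a sketch like the following. The helper name count_eagain and the file name src-strace.log are hypothetical; the sketch assumes the attached strace output has been unpacked to a plain-text file with one syscall per line.

```shell
#!/bin/sh
# Hypothetical helper: count the strace lines that report EAGAIN.
# Assumption: "$1" is a plain-text strace log, one syscall per line.
count_eagain() {
    grep -c 'EAGAIN' "$1"
}

# Example (file name is an assumption, not taken from the report):
#   count_eagain src-strace.log
```

A large count on the source side alongside a destination trace that stops in recvfrom would be consistent with the pattern described in this comment, though it does not by itself confirm the duplicate.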
Comment 6 Glauber Costa 2009-07-14 11:09:31 EDT
btw, can you point me to this "stress" thing?
Comment 7 jason wang 2009-07-14 11:47:51 EDT
Created attachment 351618
Stress test
Comment 14 jason wang 2009-07-22 04:16:09 EDT
I've tested this case with 83-93el5; it could not be reproduced.
Comment 18 Suqin Huang 2009-07-23 05:51:17 EDT
Tested on kvm-83-94.el5:

Tested with 2 vCPUs and 4 vCPUs (the host has 4 CPUs).
Commands used:

source:
/usr/libexec/qemu-kvm -no-hpet -rtc-td-hack -smp 2 -m 2G -name vm1 -drive file=/mnt/RHEL-Server-5.4-32.qcow2,if=ide,cache=off,index=0 -uuid d073cee9-8836-47da-b4c1-f5583f2dc747 -net nic,macaddr=00:26:9B:DE:C8:58,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup-switch -usbdevice tablet -vnc :5 -boot c -monitor stdio

Run the stress test in the VM:
#stress -c N -i N -d N -m N


destination:
/usr/libexec/qemu-kvm -no-hpet -rtc-td-hack -smp 2 -m 2G -name vm1 -drive file=/mnt/RHEL-Server-5.4-32.qcow2,if=ide,cache=off,index=0 -uuid d073cee9-8836-47da-b4c1-f5583f2dc747 -net nic,macaddr=00:26:9B:DE:C8:58,model=e1000 -net tap,vlan=0,script=/etc/qemu-ifup-switch -usbdevice tablet -vnc :5 -boot c -monitor stdio -incoming tcp:0:6000


Tried five times; could not reproduce.
Changing the status to *VERIFIED*; please reopen it if the issue is reproduced.
Comment 20 errata-xmlrpc 2009-09-02 05:34:04 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1272.html