Bug 601619

Summary: migration never ends with default migrate_downtime under high network stress
Product: Red Hat Enterprise Linux 6 Reporter: Keqin Hong <khong>
Component: qemu-kvm Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED NOTABUG QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: low    
Version: 6.0 CC: michen, mjenner, mkenneth, mst, syeghiay, tburke, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-06-08 11:16:27 UTC Type: ---

Description Keqin Hong 2010-06-08 10:34:30 UTC
Description of problem:
When using virtio with vhost-net, a migration started while the guest is under heavy incoming and outgoing network traffic never ends.

Version-Release number of selected component (if applicable):
qemu-kvm 0.12.1.2-2.71
kernel-2.6.32-33.el6.x86_64

How reproducible:
almost always

CLI:
# /usr/libexec/qemu-kvm -m 2G -smp 2 -drive file=RHEL6.0-64-virtio-0603.1.qcow2,if=none,id=drive-virtio0,boot=on -device virtio-blk-pci,drive=drive-virtio0,id=virtio-blk-pci0,addr=0x3 -netdev tap,id=hostnet0,vhost=on -device virtio-net-pci,netdev=hostnet0,mac=02:00:40:3F:20:20,bus=pci.0,addr=0x4 -boot order=c,menu=on -uuid 11144ecc-d3a1-4d3c-a386-12daf50015f2 -rtc base=utc -rtc-td-hack -no-kvm-pit-reinjection -monitor stdio -cpu qemu64,+sse2 -balloon none -vnc :1

Steps to Reproduce:
1. start guest with vhost=on
2. start netserver on both host and guest
host$ ./netserver -p 5912
guest$ ./netserver -p 5920
3. run netperf from host to guest
host$ while true; do ./netperf -H $guestip -p 5920; done
4. at the same time run netperf from guest to host
guest$ while true; do ./netperf -H $hostip -p 5912; done
5. migrate guest
  
Actual results:
Migration starts but never ends.
1. (qemu) info migrate shows the remaining RAM fluctuating around a fixed value while the transferred RAM keeps growing, for example:
(qemu) info migrate
Migration status: active
transferred ram: 3187272 kbytes
remaining ram: 39644 kbytes
total ram: 2113920 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 11353476 kbytes
remaining ram: 42604 kbytes
total ram: 2113920 kbytes
(qemu) info migrate
Migration status: active
transferred ram: 27299628 kbytes
remaining ram: 39564 kbytes
total ram: 2113920 kbytes
...
2. Stopping the netperf process on either the host side or the guest side reduces the network stress, and migration then completes shortly afterwards.
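
The symptom above can be checked mechanically. The following is a minimal sketch, not part of the original report: it parses the three "info migrate" samples quoted in this description and flags a migration as stalled when the transferred RAM already exceeds the total RAM while the remaining RAM stays flat. The function names and the 25% tolerance are assumptions chosen for illustration.

```python
import re

# The three samples quoted in the "Actual results" section of this report.
SAMPLE = """\
Migration status: active
transferred ram: 3187272 kbytes
remaining ram: 39644 kbytes
total ram: 2113920 kbytes
Migration status: active
transferred ram: 11353476 kbytes
remaining ram: 42604 kbytes
total ram: 2113920 kbytes
Migration status: active
transferred ram: 27299628 kbytes
remaining ram: 39564 kbytes
total ram: 2113920 kbytes
"""

FIELD = re.compile(r"^(transferred|remaining|total) ram: (\d+) kbytes$", re.M)


def parse_samples(text):
    """Group the transferred/remaining/total fields into one dict per sample."""
    samples, current = [], {}
    for name, value in FIELD.findall(text):
        current[name] = int(value)
        if name == "total":  # "total ram" is the last field of each sample
            samples.append(current)
            current = {}
    return samples


def looks_stalled(samples, tolerance=0.25):
    """Heuristic (hypothetical): RAM has been resent at least once in every
    sample, yet remaining RAM stays within `tolerance` of its mean."""
    if len(samples) < 2:
        return False
    remaining = [s["remaining"] for s in samples]
    mean = sum(remaining) / len(remaining)
    flat = all(abs(r - mean) <= tolerance * mean for r in remaining)
    resent = all(s["transferred"] > s["total"] for s in samples)
    return flat and resent


samples = parse_samples(SAMPLE)
stalled = looks_stalled(samples)
# Whole multiples of guest RAM already sent by the time of the last sample.
full_passes = samples[-1]["transferred"] // samples[-1]["total"]
```

With the samples from this report, the guest's RAM has been sent many times over while the remaining amount has barely moved, which is the non-converging pattern described in item 1.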

Expected results:
Migration should not loop endlessly; it should either complete or fail within a bounded amount of time.

Comment 2 RHEL Program Management 2010-06-08 10:52:57 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 3 Dor Laor 2010-06-08 11:16:27 UTC
It's not a bug; this is the expected behaviour.
In this case libvirt should increase the migration bandwidth, or even pause the guest and end the live phase of the migration.
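
To illustrate why this is expected, here is a rough sketch of the convergence condition: the live phase can only end once the remaining dirty RAM can be sent within the allowed downtime. The figures of 32 MiB/s for bandwidth and 30 ms for migrate_downtime are assumptions based on upstream QEMU defaults of that period; the RHEL build in this report may use different values, and QEMU estimates bandwidth from measured throughput rather than from the configured cap. The helper names are hypothetical.

```python
# Assumed values, not taken from this report (see the note above).
ASSUMED_BANDWIDTH = 32 * 1024 * 1024   # bytes per second
ASSUMED_MAX_DOWNTIME = 0.030           # seconds


def downtime_budget_kbytes(bandwidth_bytes_per_s, max_downtime_s):
    """Largest amount of remaining RAM that fits into the allowed downtime."""
    return bandwidth_bytes_per_s * max_downtime_s / 1024


def required_downtime_s(remaining_kbytes, bandwidth_bytes_per_s):
    """Downtime needed to send the remaining RAM with the guest paused."""
    return remaining_kbytes * 1024 / bandwidth_bytes_per_s


# Last "remaining ram" value reported in the description of this bug.
remaining = 39564  # kbytes

budget = downtime_budget_kbytes(ASSUMED_BANDWIDTH, ASSUMED_MAX_DOWNTIME)
needed = required_downtime_s(remaining, ASSUMED_BANDWIDTH)
can_finish = remaining <= budget

print(f"budget: {budget:.0f} kbytes, needed downtime: {needed:.2f} s, "
      f"can finish: {can_finish}")
```

Under these assumptions the remaining RAM is far larger than the downtime budget, so the migration keeps iterating for as long as the guest dirties memory at that rate. Raising the bandwidth (migrate_set_speed in the monitor), raising the allowed downtime (migrate_set_downtime), or pausing the guest with stop are the ways to let it converge.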