Bug 890636

Summary: Migration speed is very slow (about 1M/s) with xbzrle enabled after the xbzrle pages are transferred
Product: Red Hat Enterprise Linux 7 Reporter: Qunfang Zhang <qzhang>
Component: qemu-kvm    Assignee: Hai Huang <hhuang>
Status: CLOSED CURRENTRELEASE QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: medium    
Version: 7.0    CC: hhuang, juzhang, knoel, michen, quintela, qzhang, virt-maint
Target Milestone: rc    Keywords: Reopened
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: QEMU 1.5.3 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-06-13 12:11:26 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Qunfang Zhang 2012-12-28 10:11:36 UTC
Description of problem:
Boot a guest and run a memory stress test inside it, then start live migration with xbzrle enabled, the migration speed limit set to 100M, and the xbzrle cache size set to 4G. At the beginning of migration, while no xbzrle pages have been transferred yet, the speed is about 100M/s (or slightly below). Once xbzrle pages start being transferred, the migration speed drops to about 1M/s.

Version-Release number of selected component (if applicable):
Install tree: RHEL-7.0-20121217.0
kernel-3.6.0-0.29.el7.x86_64
qemu-kvm-1.2.0-21.el7.x86_64

Guest: RHEL6.4

How reproducible:
Always

Steps to Reproduce:
1. Boot a guest on source host:

 /usr/libexec/qemu-kvm -cpu SandyBridge -enable-kvm -m 2048 -smp 2,sockets=2,cores=1,threads=1 -enable-kvm -name t2-rhel6.4-32 -uuid 61b6c504-5a8b-4fe1-8347-6c929b750dde -k en-us -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device usb-tablet,id=input0 -drive file=/root/rhel6.4-64-virtio.qcow2,if=none,id=disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device ide-drive,bus=ide.0,unit=1,drive=disk0,id=disk0  -drive file=/root/boot.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,drive=drive-ide0-1-0,bus=ide.1,unit=0,id=cdrom -netdev tap,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=44:37:E6:5E:91:85,bus=pci.0,addr=0x5 -monitor stdio -qmp tcp:0:6666,server,nowait -chardev socket,path=/tmp/isa-serial,server,nowait,id=isa1 -device isa-serial,chardev=isa1,id=isa-serial1 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x8 -chardev socket,id=charchannel0,path=/tmp/serial-socket,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -chardev socket,path=/tmp/foo,server,nowait,id=foo -device virtconsole,chardev=foo,id=console0  -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x9 -vnc :10 -k en-us -boot c -chardev socket,path=/tmp/qga.sock,server,nowait,id=qga0 -device virtserialport,bus=virtio-serial0.0,chardev=qga0,name=org.qemu.guest_agent.0  -global  PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0

2. Boot the guest in listening (incoming) mode on the destination host.
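(For illustration only; the exact destination command line is not recorded in this report.) The destination guest would typically be started with the same command line as in step 1, plus an -incoming option matching the port used for migration in step 5, e.g.:

 /usr/libexec/qemu-kvm <same options as step 1> -incoming tcp:0:5800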

3. Run stress inside the guest: # stress -m 2 (two workers that continuously allocate and dirty memory)

4. (qemu) migrate_set_capability xbzrle on (on both src and dst)
(qemu) info migrate_capabilities 
capabilities: xbzrle: on 

(qemu) migrate_set_cache_size 4G
(qemu)
(qemu) migrate_set_speed 100m

5. Start migration
(qemu) migrate -d tcp:t2:5800

6. (qemu) info migrate (issue this command every 1 second)
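A minimal way to sample the status once per second, assuming the QMP socket opened in step 1 (-qmp tcp:0:6666,server,nowait) is reachable on the source host, is to poll query-migrate over QMP; it reports the same data as "info migrate" in the HMP monitor:

 # illustrative sketch only; "localhost 6666" matches the -qmp option from step 1
 { echo '{"execute":"qmp_capabilities"}'
   while sleep 1; do echo '{"execute":"query-migrate"}'; done; } | nc localhost 6666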

  
Actual results:
After xbzrle pages start to be transferred, the migration speed is about 1M/s

Expected results:
Migration speed should stay around 100M/s (the configured limit) throughout the migration

Additional info:

(qemu) info migrate
capabilities: xbzrle: on 
Migration status: active
total time: 1811764 milliseconds
transferred ram: 2977037 kbytes
remaining ram: 144208 kbytes
total ram: 2113920 kbytes
duplicate: 6873272 pages
normal: 485325 pages
normal bytes: 1941300 kbytes
cache size: 4294967296 bytes
xbzrle transferred: 1029025 kbytes
xbzrle pages: 8039876 pages
xbzrle cache miss: 485213
xbzrle overflow : 112

1 sec later, check it again:

(qemu) info migrate
capabilities: xbzrle: on 
Migration status: active
total time: 1812644 milliseconds
transferred ram: 2977410 kbytes
remaining ram: 233488 kbytes
total ram: 2113920 kbytes
duplicate: 6876453 pages
normal: 485325 pages
normal bytes: 1941300 kbytes
cache size: 4294967296 bytes
xbzrle transferred: 1029395 kbytes
xbzrle pages: 8042926 pages
xbzrle cache miss: 485213
xbzrle overflow : 112
(qemu)
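(Taking the difference between the two samples above: transferred ram grows by 2977410 - 2977037 = 373 kbytes while total time grows by 1812644 - 1811764 = 880 milliseconds, i.e. roughly 373 KB / 0.88 s ≈ 0.4 MB/s, consistent with the ~1M/s observation.)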

Comment 2 Orit Wasserman 2012-12-31 09:18:30 UTC
What is the migration speed with XBZRLE ?
What is the cpu usage during migration ?

Comment 3 Qunfang Zhang 2013-01-04 02:56:48 UTC
(In reply to comment #2)
> What is the migration speed with XBZRLE ?
As mentioned in step 4, the migration speed limit is set to 100M/s. I noticed that when the migration has just started and no xbzrle pages have been transferred yet (probably during the first iteration), the real migration speed is about 80M/s. After xbzrle pages start being transferred, the real migration speed drops to about 1M/s.

> What is the cpu usage during migration ?
(The stress test is running inside the guest: # stress -m 2)

Before migration:

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND                                           
29540 root      20   0 2447m 1.1g 5960 S 198.3 15.1   4:26.91 qemu-kvm              


During migration:

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND                                           
29540 root      20   0 3534m 2.1g 5948 R 258.2 28.9   3:32.38 qemu-kvm

Comment 4 Orit Wasserman 2013-12-18 12:23:50 UTC
Could you try to reproduce it on the latest QEMU version (1.5.3)?

Comment 5 Qunfang Zhang 2013-12-19 05:13:58 UTC
(In reply to Orit Wasserman from comment #4)
> Could you try to reproduce it on the latest QEMU version (1.5.3)?

Re-tested the bug with the following versions, and the issue no longer exists. With the same steps as in comment 0, the migration speed stays around 100M/s until the migration finishes.

Host version:
kernel-3.10.0-63.el7.x86_64
qemu-kvm-1.5.3-24.el7.x86_64

Comment 6 Qunfang Zhang 2013-12-19 05:19:31 UTC
Orit, so should we close this as CURRENTRELEASE?

Comment 7 Orit Wasserman 2013-12-19 06:41:40 UTC
yes

Comment 8 Qunfang Zhang 2013-12-20 07:41:44 UTC
Hi, Orit

Is that to say there is some patch that may have fixed this bug?

Comment 14 Ludek Smid 2014-06-13 12:11:26 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.