Bug 690521
| Summary: | Enlarging migrate_set_speed does not raise migration network transfer rates to the real network bandwidth | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Dan Yasny <dyasny> |
| Component: | kvm | Assignee: | Glauber Costa <gcosta> |
| Status: | CLOSED WONTFIX | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 5.6 | CC: | acathrow, bcao, cww, ehabkost, gcosta, iheim, juzhang, knoel, llim, lyarwood, michen, mkalinin, mkenneth, mshao, quintela, tburke, virt-maint, vromanov, xfu |
| Target Milestone: | rc | Keywords: | Regression, ZStream |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | kvm-83-230.el5 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 713392 (view as bug list) | Environment: | |
| Last Closed: | 2011-07-11 06:36:07 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 580949, 696155, 707606, 713392 | | |
Description
Dan Yasny
2011-03-24 15:01:06 UTC
Interesting issue. I wonder how RHEL 6 behaves, since we have RCU for the dirty bits. Maybe additional tweaking is needed for the TCP connection, like setting Nagle on, or maybe other means. Dan, does their hardware have nested pages?

This is the hardware I reproduced this on:

```
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 23
model name      : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
stepping        : 6
cpu MHz         : 2493.751
cache size      : 6144 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips        : 4987.50
clflush size    : 64
cache_alignment : 64
address sizes   : 38 bits physical, 48 bits virtual
power management:
```

Bugzilla changed the bug component by itself. Reverting to 'kvm'. Bugzilla bug reported: https://bugzilla.redhat.com/show_bug.cgi?id=693396

Verified this issue on:

```
# uname -r
2.6.18-260.el5
# rpm -q kvm
kvm-83-232.el5
```

Steps to Reproduce:
1. Start a VM migration and make sure the migration never ends, e.g. by running stress, iozone, and dd in the guest at the same time.
2. (qemu) migrate_set_speed 1G
3. Watch the info migrate output and count the transferred RAM per second for an approximate transfer speed.

Actual Results:
Max speed could reach 500 MB/s; on kvm-83-224.el5, max speed could only reach ~137 MB/s.

Based on the above, the migration speed improved a lot; this issue has been fixed.

(In reply to comment #41)
Hi Dan,
What you described sounds like a regression bug. Could you tell me which application is running in the guest so I can reproduce it?
Best Regards,
Mike

Hi Glommer,
Could you review comment #41 and comment #43? According to comment #43, the speed improved a lot. According to comment #41, the patch caused a regression. I tried, but I have not reproduced it yet.
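The per-second transfer-rate approximation from step 3 of the verification above can be sketched in Python. This is a minimal sketch: the `transferred ram: N kbytes` line format is assumed from later QEMU `info migrate` output and may differ slightly in the kvm-83 builds discussed here.

```python
import re

def transferred_kbytes(info_migrate_output):
    """Parse the 'transferred ram: N kbytes' line from 'info migrate'
    monitor output (format assumed; older builds may differ)."""
    m = re.search(r"transferred ram:\s*(\d+)\s*kbytes", info_migrate_output)
    if m is None:
        raise ValueError("no 'transferred ram' line found")
    return int(m.group(1))

def approx_speed_mb_s(sample_t0, sample_t1, interval_s):
    """Approximate transfer speed in MB/s from two monitor samples
    taken interval_s seconds apart."""
    delta_kb = transferred_kbytes(sample_t1) - transferred_kbytes(sample_t0)
    return delta_kb / 1024.0 / interval_s

# Example with made-up monitor output, sampled one second apart:
t0 = "Migration status: active\ntransferred ram: 102400 kbytes\n"
t1 = "Migration status: active\ntransferred ram: 614400 kbytes\n"
print(approx_speed_mb_s(t0, t1, 1.0))  # (614400 - 102400) / 1024 = 500.0 MB/s
```

In practice one would issue `info migrate` twice from the monitor at a known interval and feed the captured text to these helpers.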
Based on the above, how should we deal with this issue?
Best Regards,
Mike

(In reply to comment #59)
> From the above dmesg I can see that kvmclock is being used by the guest

In RHEL 5, so we cannot get the correct clocksource from:

```
# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
```

Btw, does this bug mean that enlarging the migration speed causes the network to stop working? Is Bug 703112 a duplicate of this one?

Opened new Bug 713392 ("Increase migration max_downtime/or speed cause guest stalls") for more investigation.

According to comment #64 and comment #43, changing status to VERIFIED.

Tried on kvm-83-239.el5 and found this issue came back; re-assigning this issue. BTW, this issue blocks me from verifying Bug 713392.

Steps:
1. Start the guest with -m 1G -smp 4, e.g.:

```
/usr/libexec/qemu-kvm -m 1G -smp 4,sockets=4,cores=1,threads=1 -name RHEL5u7 \
    -uuid 13bd47ff-7458-a214-9c43-d311ed5ca5a3 -monitor stdio \
    -no-kvm-pit-reinjection -boot c \
    -drive file=/mnt/RHEL5.7-virtio.qcow2,if=virtio,format=qcow2,cache=none,boot=on \
    -net nic,macaddr=54:52:00:52:ed:61,vlan=0,model=virtio \
    -net tap,script=/etc/qemu-ifup,downscript=no,vlan=0 \
    -serial pty -parallel none -usb -vnc :1 -k en-us -vga cirrus \
    -balloon none -M rhel5.6.0 -usbdevice tablet
```

2. In the guest:

```
# ping 8.8.8.8 -i 0.1
# stress -c 1 -m 1
```

3. (qemu) migrate_set_speed 1G
4. (qemu) migrate -d tcp:<hostB>:5888

Actual Results:
Waited for more than 30 minutes; the migration never finished.

Additional info:
On kvm-83-238.el5, migration can be finished with the steps above.

When doing local migration, the default migration transfer speed is about 35 MB/s; after changing the migrate_set_speed value to 1G, the transfer speed is about 160 MB/s.

I think we need another bug here: it is visible that enlarging the migration speed does help, but the migration convergence rate is too low. Bug 713392 fixed an issue that seems to make migration convergence fast, but eventually it left lots of pages that still needed to be transferred to the destination in the last stage of the migration.
So in fact, the result is the same: you can increase the max downtime to 1-2 seconds (from 100 ms) and see that the migration converges fast enough.

(In reply to comment #70)
> I think we need another bug here - it is visible that enlarging the migration
> speed does help but the migration convergence rate is too low.

Referring to comment #0, this seems to be the reason the bug was reported: "Migration speed is at its default ~300 Mbps. When updating the migrate_set_speed to 900000000 for example, we expect to see 900 Mbps, but we only see a *slight* increase."

> But 713392 fixed an issue that seems to make migration convergence fast but
> eventually it left lots of pages that need to be transferred to the destination
> on the last stage of the migration. So in fact, the result is the same - you
> can increase the max downtime to 1-2 seconds (from 100ms) and see the migration
> converges fast enough.

Since Bug 713392 reverted the patch for this bug, no patch remains as this bug's fix. Referring to comment #27 and comment #69, from the results I can see that the speed did not increase much after (qemu) migrate_set_speed 1G on either kvm-83-224.el5 (unfixed) or kvm-83-239.el5 (fixed). It is clear that this bug is not fixed. Based on the above, may I re-assign this issue?
Mike

Referring to comment #71 and comment #72: since no patch was added as this bug's fix, I will close it as WONTFIX.
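One note on the 900000000 figure quoted from comment #0: in QEMU's monitor, migrate_set_speed takes a value in bytes per second (with binary suffixes such as K/M/G), not bits per second. Assuming those semantics also apply to this kvm-83 qemu-kvm (an assumption, not verified against the RHEL 5 source), the arithmetic can be sketched as:

```python
def bytes_per_sec(value):
    """Convert a migrate_set_speed argument to bytes/sec.
    Suffixes are assumed to be binary multiples (K/M/G = 2**10/20/30),
    matching later QEMU monitor semantics."""
    suffixes = {"K": 2**10, "M": 2**20, "G": 2**30}
    value = value.strip()
    if value[-1].upper() in suffixes:
        return int(value[:-1]) * suffixes[value[-1].upper()]
    return int(value)

def to_mbit_per_sec(bps):
    """Express a byte rate as megabits per second (decimal Mb)."""
    return bps * 8 / 1e6

print(to_mbit_per_sec(bytes_per_sec("1G")))        # 1 GiB/s ~= 8589.9 Mbps
print(to_mbit_per_sec(bytes_per_sec("900000000"))) # 900 MB/s = 7200.0 Mbps
```

Under this reading, a plain 900000000 requests 900 MB/s (7200 Mbps), far above a 1 GbE link, so the expectation of "900 Mbps" in comment #0 may stem from reading the argument as bits per second.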