+++ This bug was initially created as a clone of Bug #1026136 +++

(Comments not related to this bug were removed; only the needinfo is kept.)

--- Additional comment from time.su on 2013-10-25 03:46:34 EDT ---

Tested with:
libvirt-1.1.1-10.el7.x86_64
qemu-kvm-1.5.3-10.el7.x86_64

The steps in comment 10 work for me; I also tested with a raw image.
However, vol-download becomes very slow once the image is large enough
(1G in my environment):

# time virsh vol-download --vol vol-test.img --pool default --file /home/test1/a
real    1m56.311s
user    1m24.267s
sys     0m2.743s

For comparison, a 500M volume takes:
real    0m19.033s

virsh # vol-dumpxml vol-test.img --pool default
<volume>
  <name>vol-test.img</name>
  <key>/var/lib/libvirt/images/vol-test.img</key>
  <source>
  </source>
  <capacity unit='bytes'>1048576000</capacity>
  <allocation unit='bytes'>1048580096</allocation>
  <target>
    <path>/var/lib/libvirt/images/vol-test.img</path>
    <format type='raw'/>
    <permissions>
      <mode>0600</mode>
      <owner>0</owner>
      <group>0</group>
      <label>system_u:object_r:virt_image_t:s0</label>
    </permissions>
    <timestamps>
      <atime>1382683904.221608140</atime>
      <mtime>1382683904.144609566</mtime>
      <ctime>1382683904.145609548</ctime>
    </timestamps>
  </target>
</volume>

Watching the target file grow:

# ll /home/test1/
total 63484
-rw-r--r--. 1 root root  65005760 Oct 25 14:53 a
# ll /home/test1/a
-rw-r--r--. 1 root root 133943320 Oct 25 14:53 /home/test1/a
# ll /home/test1/a
-rw-r--r--. 1 root root 152029600 Oct 25 14:53 /home/test1/a
.........
-rw-r--r--. 1 root root 853136008 Oct 25 14:54 /home/test1/a
# ll /home/test1/a
-rw-r--r--. 1 root root 853987976 Oct 25 14:54 /home/test1/a
# ll /home/test1/a
-rw-r--r--. 1 root root 854643336 Oct 25 14:54 /home/test1/a

Is this related to libvirt?

--- Additional comment from time.su on 2013-10-30 04:25:22 EDT ---

BTW, this also reproduces on RHEL 6.

1. On a second run, the speed is acceptable if the target file is the same:
# time virsh vol-download --vol test.img --pool default --file /home/a
real    3m38.059s
user    3m32.044s
sys     0m2.489s
# time virvirsh vol-download --vol test.img --pool default --file /home/a
real    0m10.085s
user    0m1.719s
sys     0m2.323s
# time virsh vol-download --vol test.img --pool default --file /home/b    (cancelled with ^C)
real    1m49.476s
user    1m45.474s
sys     0m1.921s

2. During the download, virsh uses almost 100% CPU (watched via top):

KiB Mem: 7364840 total, 4982532 used, 2382308 free, 126120 buffers
  PID USER PR NI   VIRT    RES  SHR S %CPU %MEM   TIME+ COMMAND
28663 root 20  0 628020 312468 4784 R 98.3  4.2 0:48.99 virsh    /* HERE */

--- Additional comment from Osier Yang on 2013-10-30 09:58:56 EDT ---

(In reply to time.su from comment #19)
[quoted test results from comment 19 trimmed]
> Is it related libvirt ?

I don't think it's related to libvirt, but I also have no idea why it becomes
slow near the end of the transfer (when we talked face to face, time said that
for a 1G volume it slows down at around the 800M mark; before that it looks
fine). Dan, I think you are the more appropriate person for this question?
The speed is also slow on RHEL 6, with:
libvirt-0.10.2-29.el6
qemu-kvm-rhev-0.12.1.2-2.415.el6
kernel-2.6.32-429.el6

So cloned here.
@mkletzan: The original bug 1026136 was closed as NOTABUG, so can we close this bug as well?
(In reply to yangyang from comment #3) That's true, thanks. Closing as such.
I think we have the exact same problem described in this issue, and we looked into the libvirt source. We can reproduce this with CentOS 6 libvirt as well as with git HEAD.

vol-download uses two functions from src/rpc/virnetclientstream.c: virNetClientStreamQueuePacket() and virNetClientStreamRecvPacket(). QueuePacket() (I shorten the names a bit) is the source of the data (the VM image in the vol-download case) and RecvPacket() is the sink. On a fast system, QueuePacket() can produce data faster than RecvPacket() can consume it, and RecvPacket() starts doing huge amounts of memmove(). This makes vol-download really slow. For example, in our case vol-download starts at ~80MB/s (as seen from iotop) and after a while drops to below 1MB/s because the buffer starts to fill up. virsh's CPU utilisation sits at 100% once this happens.

The source of RecvPacket():

358 int virNetClientStreamRecvPacket(virNetClientStreamPtr st,
359                                  virNetClientPtr client,
360                                  char *data,
361                                  size_t nbytes,
362                                  bool nonblock)
363 {
...
399     if (st->incomingOffset) {
400         int want = st->incomingOffset;
401         if (want > nbytes)
402             want = nbytes;
403         memcpy(data, st->incoming, want);
404         if (want < st->incomingOffset) {
405             memmove(st->incoming, st->incoming + want, st->incomingOffset - want);
406             st->incomingOffset -= want;
407         } else {
408             VIR_FREE(st->incoming);
409             st->incomingOffset = st->incomingLength = 0;
410         }
411         rv = want;
412     } else {
413         rv = 0;
414     }

st->incomingOffset is usually (always?) bigger than nbytes (the comparison on line 401). nbytes is defined as 64kB (src/libvirt-stream.c: virStreamRecvAll()). So want equals nbytes (64kB) and is thus smaller than st->incomingOffset (the comparison on line 404), and execution enters lines 405-406. On line 405, memmove() removes the already-memcpy()ed data from st->incoming.

Consider a case where st->incoming contains, say, 512MB of data: every 64kB handled by RecvPacket() then causes a 512MB-64kB memmove(). Draining half a gigabyte of data requires 8192 such memmove()s, hence the poor performance. And all the while, QueuePacket() just keeps pushing more data into st->incoming.

Increasing nbytes from 64kB to 1MB made vol-download perform better in our environment, but it doesn't remove the possibility of running into the same issue in the future.

I hope this helps :)

Yours,
Ossi Herrala
Codenomicon Oy
Forgot to say: Please, consider reopening this issue.
Thanks for that, I think your analysis is sound. I'm re-opening this bug so we can consider what, if anything, we can do to improve this situation in general. For example, rather than storing one giant 'st->incoming' array, it might be better if we used an iovec, so that when reading more data off the wire we just add entries to the iovec. Then virNetClientStreamRecvPacket could read data from the iovecs, and if it did need to memmove, it would only be moving a small amount of data within one iovec - most of the others would be unchanged.
Created attachment 1035176 [details]
Vector I/O version

Use an I/O vector (iovec) instead of one huge memory buffer, as suggested in https://bugzilla.redhat.com/show_bug.cgi?id=1026137#c7. This avoids memmove() on big buffers, and performance no longer degrades when the source (virNetClientStreamQueuePacket()) is faster than the sink (virNetClientStreamRecvPacket()).
Thank you for posting the patch. Would you mind sending it to the upstream list in order to speed up the inclusion in libvirt?
Patch sent to list: https://www.redhat.com/archives/libvir-list/2015-June/msg00284.html
This will be handled in the RHEL 7 releases, bug 1026136. As RHEL 6 is at the end of its Production 1 phase, we would need a valid business justification. Thank you.