This bug has been copied from bug #869557 and has been proposed to be backported to 6.3 z-stream (EUS).
If we backported just the patch that sizes up the RPC limits, libvirt would consume much more memory: before my patchset, libvirt allocated the whole buffer (= the maximum message size) even for small messages, and that buffer stayed allocated for the whole API execution. That's why I think we should backport the earlier patch as well:

commit eb635de1fed3257c5c62b552d1ec981c9545c1d7
Author:     Michal Privoznik <mprivozn>
AuthorDate: Fri Apr 27 14:49:48 2012 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Tue Jun 5 17:48:40 2012 +0200

    rpc: Size up RPC limits

    Since we are allocating RPC buffer dynamically, we can increase
    limits for max. size of RPC message and RPC string. This is needed
    to cover some corner cases where libvirt is run on such huge
    machines that their capabilities XML is 4 times bigger than our
    current limit. This leaves users with inability to even connect.

commit a2c304f6872f15c13c1cd642b74008009f7e115b
Author:     Michal Privoznik <mprivozn>
AuthorDate: Thu Apr 26 17:21:24 2012 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Tue Jun 5 17:48:40 2012 +0200

    rpc: Switch to dynamically allocated message buffer

    Currently, we are allocating buffer for RPC messages statically.
    This is not such pain when RPC limits are small. However, if we
    want ever to increase those limits, we need to allocate buffer
    dynamically, based on RPC message len (= the first 4 bytes).
    Therefore we will decrease our mem usage in most cases and still
    be flexible enough in corner cases.
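To make the motivation concrete, here is a minimal, purely illustrative sketch of the length-prefixed framing the second commit describes (plain shell, not libvirt's actual C code): the receiver first pulls the 4-byte length off the wire, then sizes the read to it, instead of pre-allocating the maximum message size. The file name msg.bin and the payload are made up for the demo.

```shell
#!/bin/sh
# Illustrative only -- not libvirt code. A message is framed as a
# 4-byte big-endian length followed by the payload; the receiver can
# therefore allocate exactly as much as the message needs.
payload='hello, rpc'
len=${#payload}

# Sender: emit the 4-byte length prefix, then the payload.
b0=$(( (len >> 24) & 255 )); b1=$(( (len >> 16) & 255 ))
b2=$(( (len >>  8) & 255 )); b3=$((  len        & 255 ))
printf "$(printf '\\%03o\\%03o\\%03o\\%03o' "$b0" "$b1" "$b2" "$b3")" > msg.bin
printf '%s' "$payload" >> msg.bin

# Receiver: read the first 4 bytes to learn the message size...
hex=$(od -An -tx1 -N4 msg.bin | tr -d ' \n')
read_len=$(( 0x$hex ))
# ...then read exactly that many payload bytes (a real reader would
# also reject lengths above its configured maximum before allocating).
tail -c +5 msg.bin | head -c "$read_len"
```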
These patches have been around since 0.9.13 (June 25 2012) and only one bug has been found in them so far (probably worth backporting as well) - bug 845521 - fixed in this commit:

commit f8ef393ee3a67a61a4c991f50d62652ed81c2ebd
Author:     Peter Krempa <pkrempa>
AuthorDate: Fri Aug 3 16:50:16 2012 +0200
Commit:     Peter Krempa <pkrempa>
CommitDate: Fri Aug 3 23:30:01 2012 +0200

    client: Free message when freeing client

    The last message of the client was not freed leaking 4 bytes of
    memory in the client when the remote daemon crashed while
    processing a message.

So I am okay with backporting these three patches, as from my POV they are safe. BTW: the RPC code is exercised by *every* libvirt user, so if there were any bugs, they would have been discovered already.
Moving to POST: http://post-office.corp.redhat.com/archives/rhvirt-patches/2012-October/msg01191.html
Verified this bug with libvirt-0.9.10-21.el6_3.6.x86_64.

Steps:
1. Prepare a template XML file to define networks:

# cat templ.xml
<network>
  <name>NET-#NIC#</name>
  <forward mode='nat'/>
  <bridge name='virbr-#NIC#' stp='on' delay='0' />
  <ip address='192.168.221.#NIC#' netmask='255.255.255.255'>
  </ip>
</network>

2. Prepare a script to define and start 250 networks automatically:

# cat vnet.sh
#!/bin/sh
for i in {1..250}; do
    sed "s/#NIC#/$i/g" templ.xml > net-$i.xml
    virsh net-define net-$i.xml
    virsh net-start NET-$i
    rm -f net-$i.xml
    sleep 1
done

3. Repeat steps 1-2 to add another 250 networks, then check the result:

# virsh net-list --all | wc -l
504
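For cleanup after the test, a small helper along these lines can tear the 500 networks down again. This is a hypothetical convenience script, not part of the original verification steps; the function name cleanup_nets is made up, and the command list is printed first so the batch can be reviewed (or piped to sh) before touching the host.

```shell
#!/bin/sh
# Hypothetical cleanup helper -- not part of the original steps.
# Prints one destroy+undefine command line per test network so the
# batch can be reviewed before it is executed.
cleanup_nets() {
    n=${1:-500}
    i=1
    while [ "$i" -le "$n" ]; do
        echo "virsh net-destroy NET-$i && virsh net-undefine NET-$i"
        i=$((i + 1))
    done
}

cleanup_nets 500    # review the output, then: cleanup_nets 500 | sh
```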
Hi Michal, I have a question that needs your confirmation. When I define and start more than 500 networks and then restart the libvirtd service, any virsh command takes several minutes to complete (only the first time). Is this normal?

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# time virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     test                           running

real    3m16.719s
user    0m0.010s
sys     0m0.010s

# time virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     test                           running

real    0m0.063s
user    0m0.009s
sys     0m0.022s
Yes, when libvirt starts up it autostarts some objects - domains, networks, storage pools. For each network, multiple commands are spawned (iptables, usually 13 times for the default network; then dnsmasq to act as the DHCP server for domains). I believe this is the source of the delay; you can see it yourself - if your system is under heavy load during startup, this is the case. However, I agree it should not take so long. If you think the same, we should open a new bug and leave this one VERIFIED.
(In reply to comment #9)
> However, I agree it should not last so long. If you think the same, we
> should open a new bug and leave this one VERIFIED.

I agree as well - can you open a BZ with the data about how long these operations are taking?
(In reply to comment #12)
> I agree as well - can you open a BZ with the data about how long these
> operations are taking?

Yes, I have already opened a new bug: bug 877244.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-1484.html