Bug 869650 - Can't add more than 256 logical networks
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.4
Hardware: x86_64 Linux
Priority: urgent  Severity: urgent
: rc
: ---
Assigned To: Michal Privoznik
Virtualization Bugs
: ZStream
Depends On: 869557
Blocks:
Reported: 2012-10-24 09:36 EDT by Chris Pelland
Modified: 2012-11-22 04:40 EST (History)
23 users

See Also:
Fixed In Version: libvirt-0.9.10-21.el6_3.6
Doc Type: Bug Fix
Doc Text:
Cause: The libvirt client communicates with the libvirt daemon via an RPC system. Messages have a maximum size limit in order to prevent memory exhaustion. Whenever the daemon was about to receive a message, it had to allocate memory up to that limit, so blindly raising the limit would have made libvirtd consume more memory. Consequence: The limit was 65536 bytes (including libvirt headers). This was not enough for some big XML documents, so big messages were dropped, leaving the client unable to fetch useful data. Fix: The buffer for incoming messages was made dynamic. Messages are usually small, so there is no need to allocate a 64KB buffer for each of them. This allows the limit to be raised (up to 1MB) without making libvirtd use more memory than is really needed. Result: libvirt is able to send bigger messages and hence fetch much more data.
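The behavior described in the doc text amounts to length-prefixed framing: read the declared length first, then allocate a buffer sized to that message rather than to the cap. The following is a minimal, hypothetical Python sketch of that idea (the real implementation is in libvirt's C RPC code; `MAX_MESSAGE_LEN` and `read_message` are illustrative names, not libvirt's API):

```python
import io
import struct

# Illustrative cap, mirroring the raised 1 MB limit described above.
MAX_MESSAGE_LEN = 1 << 20  # 1 MB

def read_message(stream):
    """Read one length-prefixed message, allocating only what it needs."""
    header = stream.read(4)
    if len(header) < 4:
        raise EOFError("truncated header")
    (length,) = struct.unpack(">I", header)  # 4-byte big-endian length
    if length > MAX_MESSAGE_LEN:
        raise ValueError("message exceeds limit")
    body = stream.read(length)  # buffer sized to the message, not the cap
    if len(body) < length:
        raise EOFError("truncated body")
    return body

# A small message only ever allocates a small buffer:
wire = struct.pack(">I", 5) + b"hello"
print(read_message(io.BytesIO(wire)))  # b'hello'
```

Since the cap is only checked, never pre-allocated, raising it costs nothing for the common case of small messages.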
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-11-22 04:40:08 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Chris Pelland 2012-10-24 09:36:20 EDT
This bug has been copied from bug #869557 and has been proposed
to be backported to 6.3 z-stream (EUS).
Comment 4 Michal Privoznik 2012-10-24 09:55:08 EDT
If we backported just the patch that sizes up the RPC limits, it would make libvirt consume much more memory. That's because, before my patchset, libvirt allocated the whole buffer (= the maximum message size) even for small messages, and this buffer persisted through the whole API execution. That's why I think we should backport the preceding patch as well:

commit eb635de1fed3257c5c62b552d1ec981c9545c1d7
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: Fri Apr 27 14:49:48 2012 +0200
Commit:     Michal Privoznik <mprivozn@redhat.com>
CommitDate: Tue Jun 5 17:48:40 2012 +0200

    rpc: Size up RPC limits
    
    Since we are allocating RPC buffer dynamically, we can increase limits
    for max. size of RPC message and RPC string. This is needed to cover
    some corner cases where libvirt is run on such huge machines that their
    capabilities XML is 4 times bigger than our current limit. This leaves
    users with inability to even connect.

commit a2c304f6872f15c13c1cd642b74008009f7e115b
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: Thu Apr 26 17:21:24 2012 +0200
Commit:     Michal Privoznik <mprivozn@redhat.com>
CommitDate: Tue Jun 5 17:48:40 2012 +0200

    rpc: Switch to dynamically allocated message buffer
    
    Currently, we are allocating buffer for RPC messages statically.
    This is not such pain when RPC limits are small. However, if we want
    ever to increase those limits, we need to allocate buffer dynamically,
    based on RPC message len (= the first 4 bytes). Therefore we will
    decrease our mem usage in most cases and still be flexible enough in
    corner cases.
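The trade-off this commit message describes can be made concrete with some back-of-the-envelope arithmetic (the figures below are illustrative, assuming one receive buffer per connected client): static buffers cost limit × clients regardless of traffic, while dynamic buffers cost only the actual message sizes.

```python
# Illustrative figures: receive-buffer cost for 100 connected clients
# whose typical message is 1 KB, under a 1 MB per-message limit.
clients = 100
limit = 1 << 20        # 1 MB raised limit
typical_msg = 1 << 10  # 1 KB typical message

static_cost = clients * limit         # every client holds a limit-sized buffer
dynamic_cost = clients * typical_msg  # buffers sized to the messages themselves

print(static_cost // (1 << 20), "MB vs", dynamic_cost // (1 << 10), "KB")
# 100 MB vs 100 KB
```

This is why the two patches go together: dynamic allocation first, then the limit increase it makes affordable.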


These patches are around since 0.9.13 (June 25 2012) and there has been just one bug found so far (probably worth backporting as well) - bug 845521 - fixed in this commit:

commit f8ef393ee3a67a61a4c991f50d62652ed81c2ebd
Author:     Peter Krempa <pkrempa@redhat.com>
AuthorDate: Fri Aug 3 16:50:16 2012 +0200
Commit:     Peter Krempa <pkrempa@redhat.com>
CommitDate: Fri Aug 3 23:30:01 2012 +0200

    client: Free message when freeing client
    
    The last message of the client was not freed leaking 4 bytes of memory
    in the client when the remote daemon crashed while processing a message.


So I am okay with backporting these three patches, as I consider them safe. BTW: the RPC code is used by *every* libvirt user, so if there were any bugs, they would have been discovered already.
Comment 7 yanbing du 2012-11-14 22:40:49 EST
Verified this bug with libvirt-0.9.10-21.el6_3.6.x86_64.
Steps:
1. Prepare a template XML file to define networks.
# cat templ.xml 
<network>
  <name>NET-#NIC#</name>
  <forward mode='nat'/>
  <bridge name='virbr-#NIC#' stp='on' delay='0' />
  <ip address='192.168.221.#NIC#' netmask='255.255.255.255'>
  </ip>
</network>
2. Prepare a script to define and start 250 networks automatically.
# cat vnet.sh 
#!/bin/sh
for i in {1..250}; do
    sed "s/#NIC#/$i/g" templ.xml > net-$i.xml
    virsh net-define net-$i.xml
    virsh net-start NET-$i
    rm -f net-$i.xml
    sleep 1
done
3. Redo steps 1-2 to add another 250 networks, then check the result:

# virsh net-list --all |wc -l
504
Comment 8 yanbing du 2012-11-15 01:38:21 EST
Hi Michal,
I have a question that needs your confirmation.
When I define and start more than 500 networks and then restart the libvirtd service, executing any virsh command takes several minutes (only the first time).
Is this normal?

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# time virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     test                           running


real	3m16.719s
user	0m0.010s
sys	0m0.010s

# time virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     test                           running


real	0m0.063s
user	0m0.009s
sys	0m0.022s
Comment 9 Michal Privoznik 2012-11-15 03:39:37 EST
Yes, when libvirt is starting up, it autostarts some objects, such as domains, networks, and storage pools. For a network, multiple commands are spawned (iptables - usually 13 times for the default network; then dnsmasq to act as the DHCP server for domains). I believe this is the source of the delay; you can check this yourself if your system is under heavy load. However, I agree it should not take so long. If you think the same, we should open a new bug and leave this one VERIFIED.
Comment 12 Dave Allan 2012-11-15 08:53:06 EST
(In reply to comment #9)
> However, I agree it should not last so long. If you think the same, we
> should open a new bug and leave this one VERIFIED.

I agree also, can you open a BZ with the data about how long these operations are taking?
Comment 13 yanbing du 2012-11-15 21:55:25 EST
(In reply to comment #12)
> (In reply to comment #9)
> > However, I agree it should not last so long. If you think the same, we
> > should open a new bug and leave this one VERIFIED.
> 
> I agree also, can you open a BZ with the data about how long these
> operations are taking?

Yes, I've already opened a new bug 877244.
Comment 15 errata-xmlrpc 2012-11-22 04:40:08 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-1484.html
