869650 – Can't add more than 256 logical networks


Bug 869650 - Can't add more than 256 logical networks

Summary: Can't add more than 256 logical networks

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	libvirt
Sub Component:
Version:	6.4
Hardware:	x86_64
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	rc
Target Release:	---
Assignee:	Michal Privoznik
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:	869557
Blocks:

Reported:	2012-10-24 13:36 UTC by Chris Pelland
Modified:	2012-11-22 09:40 UTC
CC List:	23 users
Fixed In Version:	libvirt-0.9.10-21.el6_3.6
Doc Type:	Bug Fix
Doc Text:	Cause: The libvirt client communicates with the libvirt daemon through libvirt's RPC system. Messages have a maximum size limit in order to prevent memory exhaustion. Whenever the daemon was about to receive a message, it allocated a buffer of the full maximum size, so simply raising the limit would have made libvirtd consume more memory. Consequence: The limit was 65536 bytes (including libvirt headers). This was not enough for some large XML documents, so large messages were dropped, leaving the client unable to fetch the data. Fix: The buffer for incoming messages is now allocated dynamically. Most messages are small, so there is no need to allocate a 64 KB buffer for them. This allows the limit to be raised (up to 1 MB) without making libvirtd use more memory than it actually needs. Result: Libvirt can send larger messages and therefore fetch much more data.
Clone Of:
Environment:
Last Closed:	2012-11-22 09:40:08 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2012:1484	0	normal	SHIPPED_LIVE	libvirt bug fix update	2012-11-22 14:39:12 UTC

Description Chris Pelland 2012-10-24 13:36:20 UTC

This bug has been copied from bug #869557 and has been proposed
to be backported to 6.3 z-stream (EUS).

Comment 4 Michal Privoznik 2012-10-24 13:55:08 UTC

Although we could backport just the patch that sizes up the RPC limits, doing so would make libvirt consume much more memory. That is because, before my patchset, libvirt allocated the whole buffer (= the maximum message size) even for small messages, and kept this buffer for the whole API execution. That's why I think we should backport the preceding patch as well:

commit eb635de1fed3257c5c62b552d1ec981c9545c1d7
Author: Michal Privoznik <mprivozn>
AuthorDate: Fri Apr 27 14:49:48 2012 +0200
Commit: Michal Privoznik <mprivozn>
CommitDate: Tue Jun 5 17:48:40 2012 +0200

rpc: Size up RPC limits

Since we are allocating RPC buffer dynamically, we can increase limits
for max. size of RPC message and RPC string. This is needed to cover
some corner cases where libvirt is run on such huge machines that their
capabilities XML is 4 times bigger than our current limit. This leaves
users with inability to even connect.

commit a2c304f6872f15c13c1cd642b74008009f7e115b
Author: Michal Privoznik <mprivozn>
AuthorDate: Thu Apr 26 17:21:24 2012 +0200
Commit: Michal Privoznik <mprivozn>
CommitDate: Tue Jun 5 17:48:40 2012 +0200

rpc: Switch to dynamically allocated message buffer

Currently, we are allocating buffer for RPC messages statically.
This is not such pain when RPC limits are small. However, if we want
ever to increase those limits, we need to allocate buffer dynamically,
based on RPC message len (= the first 4 bytes). Therefore we will
decrease our mem usage in most cases and still be flexible enough in
corner cases.

These patches have been upstream since 0.9.13 (June 25 2012), and just one bug has been found so far (probably worth backporting as well): bug 845521, fixed in this commit:

commit f8ef393ee3a67a61a4c991f50d62652ed81c2ebd
Author: Peter Krempa <pkrempa>
AuthorDate: Fri Aug 3 16:50:16 2012 +0200
Commit: Peter Krempa <pkrempa>
CommitDate: Fri Aug 3 23:30:01 2012 +0200

client: Free message when freeing client

The last message of the client was not freed leaking 4 bytes of memory
in the client when the remote daemon crashed while processing a message.

So I am okay with backporting these three patches; from my POV they are safe. BTW: the RPC code is used by *every* libvirt user, so if there were any other bugs, they would have been discovered already.
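
The idea behind the "dynamically allocated message buffer" commit above can be sketched as follows. This is an illustrative Python model, not libvirt's C implementation: the names `read_message` and `frame` are hypothetical, the 1 MB cap is taken from the Doc Text of this bug, and the assumption that the 4-byte length word is big-endian and counts itself is made here only for the example (the commit message says only that the length is "the first 4 bytes").

```python
import io
import struct

LEN_FIELD_SIZE = 4                 # the length word: "the first 4 bytes"
MAX_MESSAGE_SIZE = 1024 * 1024     # assumed cap, per "up to 1MB" in the Doc Text


def frame(payload):
    """Prefix a payload with its total length (assumed to include the length word)."""
    return struct.pack(">I", len(payload) + LEN_FIELD_SIZE) + payload


def read_message(stream, max_size=MAX_MESSAGE_SIZE):
    """Read one length-prefixed message, allocating only as much as it needs."""
    header = stream.read(LEN_FIELD_SIZE)
    if len(header) < LEN_FIELD_SIZE:
        raise EOFError("stream closed before the length word arrived")
    (total_len,) = struct.unpack(">I", header)
    # Reject lengths outside the allowed range before allocating anything.
    if total_len < LEN_FIELD_SIZE or total_len > max_size:
        raise ValueError(f"message length {total_len} outside allowed range")
    # The buffer is sized to this message, not to the maximum.
    payload = bytearray(total_len - LEN_FIELD_SIZE)
    view = memoryview(payload)
    received = 0
    while received < len(payload):
        n = stream.readinto(view[received:])
        if not n:
            raise EOFError("stream closed in the middle of a message")
        received += n
    return bytes(payload)


if __name__ == "__main__":
    stream = io.BytesIO(frame(b"small") + frame(b"x" * 200000))
    print(len(read_message(stream)), len(read_message(stream)))
```

Because the allocation follows the announced length, raising the cap costs extra memory only for the messages that are actually large, which is the reason given above for backporting both patches together.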

Comment 5 Michal Privoznik 2012-10-29 10:52:09 UTC

Moving to POST:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2012-October/msg01191.html

Comment 7 yanbing du 2012-11-15 03:40:49 UTC

Verified this bug with libvirt-0.9.10-21.el6_3.6.x86_64
Steps:
1. Prepare a template XML file to define networks.
# cat templ.xml 
<network>
  <name>NET-#NIC#</name>
  <forward mode='nat'/>
  <bridge name='virbr-#NIC#' stp='on' delay='0' />
  <ip address='192.168.221.#NIC#' netmask='255.255.255.255'>
  </ip>
</network>
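2. Prepare a script to define and start 250 networks automatically:
# cat vnet.sh 
#!/bin/bash
# Brace expansion ({1..250}) is a bash feature, so run this under bash.
for i in {1..250}; do
    # Fill in the network index, then define and start the network.
    sed "s/#NIC#/$i/g" templ.xml > "net-$i.xml"
    virsh net-define "net-$i.xml"
    virsh net-start "NET-$i"
    rm -f "net-$i.xml"
    sleep 1
done
3. Repeat steps 1-2 to add another 250 networks, then check the result: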

# virsh net-list --all |wc -l
504
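
Note that `wc -l` counts lines, not networks. A minimal sketch of counting only the network rows follows, assuming `virsh net-list --all` prints a column header, a dashed separator line, one row per network, and a trailing blank line; that layout is an assumption and is not shown in this report. The helper name `count_networks` and the sample text are hypothetical.

```python
def count_networks(net_list_output):
    """Count network rows in `virsh net-list --all` style output.

    Everything up to and including the dashed separator is treated as the
    header; blank lines are ignored. If no separator is found, every
    non-blank line is counted.
    """
    rows = [line for line in net_list_output.splitlines() if line.strip()]
    for index, line in enumerate(rows):
        if set(line.strip()) == {"-"}:
            return len(rows) - (index + 1)
    return len(rows)


if __name__ == "__main__":
    # Hypothetical sample in the assumed layout.
    sample = (
        "Name                 State      Autostart\n"
        "-----------------------------------------\n"
        "default              active     yes\n"
        "NET-1                active     no\n"
        "\n"
    )
    print(count_networks(sample))
```

Under that assumed layout, three of the 504 lines would be header, separator, and trailing blank line, leaving 501 network rows.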

Comment 8 yanbing du 2012-11-15 06:38:21 UTC

Hi Michal,
I have a question that needs your confirmation.
When I define and start more than 500 networks and then restart the libvirtd service, the first virsh command I run takes several minutes (later commands are fast).
Is this normal?

# service libvirtd restart
Stopping libvirtd daemon:                                  [  OK  ]
Starting libvirtd daemon:                                  [  OK  ]

# time virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     test                           running


real	3m16.719s
user	0m0.010s
sys	0m0.010s

# time virsh list --all
 Id    Name                           State
----------------------------------------------------
 4     test                           running


real	0m0.063s
user	0m0.009s
sys	0m0.022s

Comment 9 Michal Privoznik 2012-11-15 08:39:37 UTC

Yes. When libvirt starts up, it autostarts some objects, such as domains, networks, and storage pools. For each network, multiple commands are spawned (iptables, usually 13 times for the default network, and then dnsmasq to act as the DHCP server for domains). I believe this is the source of the delay. You can check it yourself: if your system is under heavy load during startup, this is the cause. However, I agree it should not last so long. If you think the same, we should open a new bug and leave this one VERIFIED.

Comment 12 Dave Allan 2012-11-15 13:53:06 UTC

(In reply to comment #9)
> However, I agree it should not last so long. If you think the same, we
> should open a new bug and leave this one VERIFIED.

I agree also, can you open a BZ with the data about how long these operations are taking?

Comment 13 yanbing du 2012-11-16 02:55:25 UTC

(In reply to comment #12)
> (In reply to comment #9)
> > However, I agree it should not last so long. If you think the same, we
> > should open a new bug and leave this one VERIFIED.
> 
> I agree also, can you open a BZ with the data about how long these
> operations are taking?

Yes, I have already opened a new bug: bug 877244.

Comment 15 errata-xmlrpc 2012-11-22 09:40:08 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-1484.html

