Bug 981729 - Improve handling of "max_clients" setting
Summary: Improve handling of "max_clients" setting
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: TRACKER-bugs-affecting-libguestfs 992980 1058606 1086175
 
Reported: 2013-07-05 15:32 UTC by Daniel Berrangé
Modified: 2014-06-18 00:52 UTC
CC List: 11 users

Fixed In Version: libvirt-1.1.1-3.el7
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 992980 1058606 1070221 (view as bug list)
Environment:
Last Closed: 2014-06-13 10:00:59 UTC
Target Upstream Version:
Embargoed:



Description Daniel Berrangé 2013-07-05 15:32:09 UTC
Description of problem:
The 'max_clients' setting controls how many clients are allowed to connect to libvirt, as protection against DoS attacks from unauthenticated users.

This is problematic when dealing with the concurrent start of large numbers of containers, because it can trigger a very large number of connections in a short period of time.

With the default max_clients=20, this easily causes container startup failure.
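
For reference, this limit lives in /etc/libvirt/libvirtd.conf (the same file shown in the comments below); with the default in effect, the relevant line is simply:

max_clients = 20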

We can't simply raise the limit, since it is DoS protection, but we can improve the behaviour somewhat.

Currently libvirt will unconditionally accept() any incoming socket, regardless of the current number of clients. Thus, if the max limit is hit, libvirt will accept and immediately close client connections.

It would be better if libvirt simply did not accept() the connection. This would let pending connections wait for a previous connection to close before continuing. The maximum number of queued connections can be controlled via the listen() syscall and could be fairly large, since queued connections consume minimal resources. This would be a new "max_queued_clients" setting in libvirtd.conf, perhaps set as high as 1000.

A new 'max_anonymous_clients' setting could limit only those connections which have been accept()ed but not yet authenticated. This could be fairly low (perhaps the current 20).

The existing 'max_clients' setting could then be used as a limit on the total number of connections, and set to a far higher value (perhaps several hundred or more).
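
Putting the proposal together, a tuned /etc/libvirt/libvirtd.conf might look roughly like the sketch below. The values are illustrative only, and max_queued_clients and max_anonymous_clients are the settings being proposed here, not options that already exist:

# proposed: limit on the total number of connections, raised well above 20
max_clients = 500
# proposed: listen() backlog for connections not yet accept()ed
max_queued_clients = 1000
# proposed: limit on connections accept()ed but not yet authenticated
max_anonymous_clients = 20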

Version-Release number of selected component (if applicable):
1.1.0-1.el6.

How reproducible:
Always

Steps to Reproduce:
1. Attempt to open 30 concurrent connections to libvirt (one way to do this is sketched below)
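
One minimal way to produce the 30 concurrent connections is to background 30 virsh clients at once (a sketch; the lxc:/// URI is assumed to match the container use case described above):

# for i in {1..30}; do virsh -c lxc:/// list --all & done; wait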

Actual results:
Only the first 20 succeed; the rest are dropped

Expected results:
10 connections are queued, pending close of 10 earlier connections

Additional info:

Comment 2 Alex Jia 2013-07-08 10:23:37 UTC
# tail -2 /etc/libvirt/libvirtd.conf
max_clients = 20
max_workers = 20


# for i in {1..30}; do virt-sandbox-service create -C -u httpd.service -N dhcp myapache$i;done

# for i in {1..30}; do virt-sandbox-service start myapache$i & done

XXX

# Unable to open connection: Unable to open lxc:///: Cannot recv data: Connection reset by peer
Unable to open connection: Unable to open lxc:///: Cannot recv data: Connection reset by peer
Unable to open connection: Unable to open lxc:///: Cannot write data: Broken pipe
Unable to open connection: Unable to open lxc:///: Cannot write data: Broken pipe
Unable to open connection: Unable to open lxc:///: Cannot write data: Broken pipe
Unable to open connection: Unable to open lxc:///: Cannot write data: Broken pipe
Unable to open connection: Unable to open lxc:///: Cannot write data: Broken pipe
Unable to open connection: Unable to open lxc:///: Cannot write data: Broken pipe
Unable to open connection: Unable to open lxc:///: Cannot write data: Broken pipe
Unable to open connection: Unable to open lxc:///: Cannot write data: Broken pipe

And checking the libvirtd log:

2013-07-08 10:13:58.933+0000: 8034: error : virNetServerAddClient:262 : Too many active clients (20), dropping connection from 127.0.0.1;0
2013-07-08 10:13:58.941+0000: 8034: error : virNetServerAddClient:262 : Too many active clients (20), dropping connection from 127.0.0.1;0
2013-07-08 10:13:58.943+0000: 8034: error : virNetServerAddClient:262 : Too many active clients (20), dropping connection from 127.0.0.1;0

Comment 3 Michal Privoznik 2013-07-25 14:24:16 UTC
I've just proposed patches upstream:

https://www.redhat.com/archives/libvir-list/2013-July/msg01646.html

Comment 5 Daniel Berrangé 2013-08-05 09:58:33 UTC
NB the upstream patches only implement half of this bug. There's still no separation of the limits for anonymous vs authenticated clients.

Comment 6 Michal Privoznik 2013-08-05 10:27:36 UTC
Ah, then I shouldn't have moved this to POST. Sorry.

Comment 7 Michal Privoznik 2013-08-05 10:50:47 UTC
After an IRC discussion with Dan, we agreed to split this bug into two. The first part, which is done (introducing the "max_queued_clients" setting), stays in this bug. For the issue Dan mentions in comment #5, I've cloned this bug into bug 992980. Hence moving to POST again.

Comment 9 Alex Jia 2013-12-02 09:58:09 UTC
(In reply to Alex Jia from comment #2)
> # tail -2 /etc/libvirt/libvirtd.conf
> max_clients = 20
> max_workers = 20 

# tail -3 /etc/libvirt/libvirtd.conf

max_clients = 20
max_workers = 20
max_queued_clients = 20

> # for i in {1..30}; do virt-sandbox-service create -C -u httpd.service -N
> dhcp myapache$i;done
> 
> # for i in {1..30}; do virt-sandbox-service start myapache$i & done
> 

# rpm -q libvirt-sandbox libvirt kernel
libvirt-sandbox-0.5.0-6.el7.x86_64
libvirt-1.1.1-13.el7.x86_64
kernel-3.10.0-0.rc7.64.el7.x86_64

Using the new 'virsh' method to start containers in parallel:

# for i in {1..30}; do virsh -c lxc:/// start myapache$i & done

And I can't hit the following issues any more. Michal, is this an expected result, or do I need to run many more containers to reproduce it?

> 
> Unable to open connection: Unable to open lxc:///: Cannot write data: Broken
> pipe
> 
> And check libvirtd log:
> 
> 2013-07-08 10:13:58.933+0000: 8034: error : virNetServerAddClient:262 : Too
> many active clients (20), dropping connection from 127.0.0.1;0

<slice>

2013-12-02 08:52:16.173+0000: 12384: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 51
2013-12-02 08:52:16.178+0000: 12383: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 56
2013-12-02 08:52:16.183+0000: 12021: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 54
2013-12-02 08:52:16.191+0000: 12380: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 69
2013-12-02 08:52:16.198+0000: 15201: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 75
2013-12-02 08:52:16.202+0000: 15202: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 62
2013-12-02 08:52:16.203+0000: 12387: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 83
2013-12-02 08:52:16.212+0000: 15359: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 81
2013-12-02 08:52:16.227+0000: 12378: debug : lxcContainerWaitForContinue:392 : Wait continue on fd 67
2013-12-02 08:52:16.291+0000: 12384: debug : lxcContainerWaitForContinue:394 : Got continue on fd 51 1
2013-12-02 08:52:16.292+0000: 12018: debug : virLXCMonitorHandleEventInit:107 : Event init 19730
2013-12-02 08:52:16.292+0000: 12384: debug : virDomainFree:2428 : dom=0x7f9258014df0, (VM: name=myapache24, uuid=b295c4f7-7921-46e7-8142-ed795724671e)
2013-12-02 08:52:16.293+0000: 12024: debug : virDomainLookupByUUID:2186 : conn=0x7f929c004560, uuid=b295c4f7-7921-46e7-8142-ed795724671e
2013-12-02 08:52:16.293+0000: 12024: debug : virDomainFree:2428 : dom=0x7f9284008db0, (VM: name=myapache24, uuid=b295c4f7-7921-46e7-8142-ed795724671e)
2013-12-02 08:52:16.294+0000: 12018: debug : virConnectClose:1523 : conn=0x7f929c004560
2013-12-02 08:52:16.308+0000: 12383: debug : lxcContainerWaitForContinue:394 : Got continue on fd 56 1
2013-12-02 08:52:16.308+0000: 12018: debug : virLXCMonitorHandleEventInit:107 : Event init 19776
2013-12-02 08:52:16.309+0000: 12383: debug : virDomainFree:2428 : dom=0x7f926400d3b0, (VM: name=myapache27, uuid=0d1096c3-792c-4be8-a701-3e0067d12e0a)
2013-12-02 08:52:16.310+0000: 12385: debug : virDomainLookupByUUID:2186 : conn=0x7f9298011110, uuid=0d1096c3-792c-4be8-a701-3e0067d12e0a
2013-12-02 08:52:16.310+0000: 12385: debug : virDomainFree:2428 : dom=0x7f925c010e60, (VM: name=myapache27, uuid=0d1096c3-792c-4be8-a701-3e0067d12e0a)
2013-12-02 08:52:16.312+0000: 12018: debug : virConnectClose:1523 : conn=0x7f9298011110

</slice>

Comment 10 Alex Jia 2014-02-25 10:18:06 UTC
Daniel, I can now successfully start 41 containers, not 40. Is this an expected result?

# tail -3 /etc/libvirt/libvirtd.conf 
max_clients = 20
max_workers = 20
max_queued_clients = 20

# for i in {1..50}; do virt-sandbox-service create -C -u httpd.service -N dhcp myapache$i;done

# for i in {1..50}; do virsh -c lxc:/// start myapache$i & done

# virsh -c lxc:/// -q list |wc -l
41

# rpm -q libvirt-daemon libvirt-sandbox kernel
libvirt-daemon-1.1.1-23.el7.x86_64
libvirt-sandbox-0.5.0-9.el7.x86_64
kernel-3.10.0-86.el7.x86_64

Additional info:

error: Failed to start domain myapache36
error: internal error: Failed to allocate free veth pair after 10 attempts

error: Failed to start domain myapache29
error: internal error: Failed to allocate free veth pair after 10 attempts

NOTE: 10 attempts may be too few for some users, and they may want to change this, so I think it would be better to have a configuration item for it; otherwise, we should document the 10-attempt limit in libvirtd.conf or the relevant guide.

Comment 11 Michal Privoznik 2014-02-25 15:51:28 UTC
(In reply to Alex Jia from comment #10)
> Daniel, I can successfully start 41 containers not 40 now, is it an expected
> result? 
> 
> # tail -3 /etc/libvirt/libvirtd.conf 
> max_clients = 20
> max_workers = 20
> max_queued_clients = 20
> 
> # for i in {1..50}; do virt-sandbox-service create -C -u httpd.service -N
> dhcp myapache$i;done
> 
> # for i in {1..50}; do virsh -c lxc:/// start myapache$i & done
> 
> # virsh -c lxc:/// -q list |wc -l
> 41

Yes and no. The kernel does some caching on sockets, and partially opens connections even when the server is not currently responsive, so you may end up with more than 40 guests running. Hence I think anything greater than or equal to 40 is okay.

> 
> # rpm -q libvirt-daemon libvirt-sandbox kernel
> libvirt-daemon-1.1.1-23.el7.x86_64
> libvirt-sandbox-0.5.0-9.el7.x86_64
> kernel-3.10.0-86.el7.x86_64
> 
> Additional info:
> 
> error: Failed to start domain myapache36
> error: internal error: Failed to allocate free veth pair after 10 attempts
> 
> error: Failed to start domain myapache29
> error: internal error: Failed to allocate free veth pair after 10 attempts
> 

This is due to a buggy internal implementation. Let me see if I can fix it.

Comment 12 Michal Privoznik 2014-02-25 16:08:22 UTC
Patch proposed upstream:

https://www.redhat.com/archives/libvir-list/2014-February/msg01548.html

Comment 14 Michal Privoznik 2014-02-26 12:32:26 UTC
So, after discussion of my backport, the bug raised in comment 10 is a separate issue and deserves its own bug. I'm moving this back to MODIFIED, as the request here is complete, and creating a new bug for the veth issue: bug 1070221.

Comment 15 dyuan 2014-03-12 03:40:50 UTC
Moving to VERIFIED, since the separate bug has been filed and is already verified.

Comment 16 Ludek Smid 2014-06-13 10:00:59 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.

