Bug 1372589 - Please bump up qemu.conf the max_files to 131072 and max_processes to 65536
Summary: Please bump up qemu.conf the max_files to 131072 and max_processes to 65536
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: Upstream M3
Target Release: 11.0 (Ocata)
Assignee: Giulio Fidente
QA Contact: Yogev Rabl
Docs Contact: Don Domingo
URL:
Whiteboard:
Duplicates: 1389503
Depends On:
Blocks: 1386905 1387431 1414466 1414467 1430002
Reported: 2016-09-02 07:22 UTC by Robin Cernin
Modified: 2022-08-16 14:00 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-6.0.0-0.20170127041112.ce54697.el7ost.1.noarch
Doc Type: Enhancement
Doc Text:
It is now possible to use puppet hieradata to set max_files and max_processes for QEMU instances spawned by libvirtd. This can be done through an environment file containing the appropriate puppet classes. For example, to set max_files and max_processes to 32768 and 131072 respectively, use:

  parameter_defaults:
    ExtraConfig:
      nova::compute::libvirt::qemu::max_files: 32768
      nova::compute::libvirt::qemu::max_processes: 131072

This update also sets these values as the default, since QEMU instances launched by libvirtd might consume a large number of file descriptors or threads. The actual consumption depends on the number of Compute guests hosted on each compute node and on the number of Ceph RBD images each instance attaches to, so these limits need to be configurable in large clusters.

With these new default values, the Compute service should be able to use more than 700 OSDs. This was previously identified as the limit imposed by the low number of max_files (originally 1024).
Clone Of:
Clones: 1430002
Environment:
Last Closed: 2017-05-17 19:32:55 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 411983 0 None MERGED Add nova::compute::libvirt::qemu class to configure limits 2020-06-08 12:58:26 UTC
OpenStack gerrit 411984 0 None MERGED Include nova::compute::libvirt::qemu from the libvirt profile 2020-06-08 12:58:26 UTC
OpenStack gerrit 411987 0 None MERGED Increase libvirt/qemu.conf max_files and max_processes 2020-06-08 12:58:26 UTC
Red Hat Issue Tracker OSP-4565 0 None None None 2022-08-16 14:00:07 UTC
Red Hat Knowledge Base (Solution) 1602683 0 None None None 2016-09-05 05:29:39 UTC
Red Hat Product Errata RHEA-2017:1245 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 11.0 Bug Fix and Enhancement Advisory 2017-05-17 23:01:50 UTC

Description Robin Cernin 2016-09-02 07:22:45 UTC
Request to add a deployment option to director to raise max_files and max_processes in /etc/libvirt/qemu.conf.

We are seeing that the current default limit deployed in RHEL7 is 1024, which is not enough for a deployment with a Ceph cluster.

When we have a Ceph cluster, each librbd needs 1 fd and 2 threads (read and write), and in the worst case each RBD image talks to every OSD in the cluster.

Take a medium-sized cluster with 200 OSDs:

  Each OSD needs 1 fd and 2 threads:
  - This means we will need 200 fds and 400 threads for 1 RBD image.

  max_files = fds (open fds)
  max_processes = threads

 - Let's assume the user has no more than 500 RBD images on these 200 OSDs:
 
  max_files = 500(RBD images) * 200(fds) = 100000
  max_processes = 400(threads) * 200(fds) =  80000

This is the worst-case scenario, but we think it would make sense for OSP director to raise the values to max_files = 131072 and max_processes = 65536:

/etc/libvirt/qemu.conf
  max_files = 131072
  max_processes = 65536

Thank you,
Robin Cernin

Comment 2 Vikhyat Umrao 2016-09-02 13:49:26 UTC
little correction:

Each connection from an RBD image to an OSD needs 1 fd and 2 threads. For example, if you have 200 OSDs:
  - This means we will need 200 fds and 400 threads for 1 RBD image.

  max_files = fds (open fds)
  max_processes = threads

 - Let's assume the user has no more than 500 RBD images on these 200 OSDs:
 
  max_files = 500(RBD images) * 200(fds) = 100000
  max_processes = 500(RBD images) * 400 (threads) =  200000
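
As a rough illustration of the arithmetic above, the worst-case figures can be reproduced with a few lines of Python. This is only a sketch: the function name and its parameters are hypothetical, and it assumes exactly 1 fd and 2 threads per image-to-OSD connection, with every image connected to every OSD.

```python
def rbd_limit_estimate(osds, rbd_images, fds_per_conn=1, threads_per_conn=2):
    """Worst case: every RBD image holds one connection to every OSD."""
    connections = osds * rbd_images
    return {
        "fds": connections * fds_per_conn,
        "threads": connections * threads_per_conn,
    }


# Figures from this comment: 200 OSDs and at most 500 RBD images.
estimate = rbd_limit_estimate(osds=200, rbd_images=500)
print(estimate)  # {'fds': 100000, 'threads': 200000}
```

With rbd_images=1 the same function gives the 200 fds and 400 threads quoted for a single image.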

However, the open-files limit applies per process, while the process limit applies to the user system-wide.

So we mostly hit the FD limit, not the max_processes limit.


- The getrlimit man page says the same: the open-files limit is per process, and the process limit is per user across the system.
   # man 2 getrlimit

   RLIMIT_NOFILE
          Specifies a value one greater than the maximum file descriptor number that can be opened by this process. Attempts (open(2), pipe(2), dup(2), etc.) to exceed this limit yield the error EMFILE. (Historically, this limit was named RLIMIT_OFILE on BSD.)

   RLIMIT_NPROC
          The maximum number of processes (or, more precisely on Linux, threads) that can be created for the real user ID of the calling process. Upon encountering this limit, fork(2) fails with the error EAGAIN. This limit is not enforced for processes that have either the CAP_SYS_ADMIN or the CAP_SYS_RESOURCE capability.
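
For anyone who wants to see these two limits in practice, here is a minimal sketch using Python's standard resource module, which wraps getrlimit(2). It reports the limits of the process that runs it, not those of a QEMU instance, so it would need to run as the relevant user to be meaningful. The helper names are illustrative only.

```python
import resource


def show_limits():
    """Return (soft, hard) for the two limits discussed in this bug."""
    limits = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        limits[name] = resource.getrlimit(getattr(resource, name))
    return limits


def fmt(value):
    """Render RLIM_INFINITY the way ulimit does."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


for name, (soft, hard) in show_limits().items():
    print(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")
```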

- For now we can start from here:

/etc/libvirt/qemu.conf
  max_files = 131072
  max_processes = 65536

Comment 3 Vikhyat Umrao 2016-09-02 13:51:39 UTC
For more information, please see the following article:

Ceph - VM hangs when transfering large amounts of data to RBD disk
https://access.redhat.com/solutions/1602683

Comment 4 Jaromir Coufal 2016-10-18 14:30:10 UTC
This seems to be a Ceph requirement and a related issue in QEMU. Please re-assign if the evaluation of the group assignment is wrong.

Comment 5 Giulio Fidente 2016-11-23 10:57:09 UTC
*** Bug 1389503 has been marked as a duplicate of this bug. ***

Comment 9 Ben England 2016-12-16 22:29:09 UTC
Thanks for adjusting this!  max_files seems fine.

I'm a little unclear about what max_processes means, exactly.  Is it the maximum number of threads per user?  Sorry to be pedantic, I may be overthinking this, just trying to get a picture of what is being tuned.  How does this max_processes relate to kernel.pid-max change here:

https://bugzilla.redhat.com/show_bug.cgi?id=1389502#c8

Comment 10 Giulio Fidente 2016-12-19 10:32:49 UTC
(In reply to Ben England from comment #9)
> Thanks for adjusting this!  max_files seems fine.
> 
> I'm a little unclear about what max_processes means, exactly.  Is it the
> maximum number of threads per user?  Sorry to be pedantic, I may be
> overthinking this, just trying to get a picture of what is being tuned.  How
> does this max_processes relate to kernel.pid-max change here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1389502#c8

Hi Ben, yes, max_processes in qemu.conf seems to set the maximum number of processes (counting threads too) for the user that libvirtd uses to launch qemu instances.

The kernel.pid_max sysctl is global, not per-user. I've basically multiplied all three (max_files too) by a factor of 32, though I guess the real goal of these bugs is to make them customizable.

Comment 18 Yogev Rabl 2017-04-13 13:30:38 UTC
Verified on openstack-tripleo-heat-templates-6.0.0-3.el7ost.noarch

1) the heat templates have been merged to tripleo-heat-templates
2) the configuration in /etc/libvirt/qemu.conf has been set properly
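
A check like step 2 could be scripted roughly as follows. This is a sketch, not the procedure used for this verification: the function name is hypothetical, the sample text is made up for illustration using the values requested in the description, and the parser only understands plain `key = integer` lines.

```python
import re


def read_qemu_limits(conf_text):
    """Return max_files / max_processes found in qemu.conf-style text."""
    # Commented-out lines start with '#' and therefore never match.
    pattern = re.compile(
        r"^\s*(max_files|max_processes)\s*=\s*(\d+)\s*(?:#.*)?$"
    )
    found = {}
    for line in conf_text.splitlines():
        match = pattern.match(line)
        if match:
            found[match.group(1)] = int(match.group(2))
    return found


# Illustrative sample only; not copied from a real compute node.
sample = """
#max_files = 1024
max_files = 131072
max_processes = 65536
"""
print(read_qemu_limits(sample))  # {'max_files': 131072, 'max_processes': 65536}
```

On a compute node the input would be the contents of /etc/libvirt/qemu.conf instead of the sample string.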

Comment 19 errata-xmlrpc 2017-05-17 19:32:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1245

Comment 20 Ben England 2017-06-13 19:33:59 UTC
The problem reoccurred on RHOSP 11, see comment https://bugzilla.redhat.com/show_bug.cgi?id=1430002#c10.  Does /etc/security/limits.d/20-nproc.conf need to bump up the process limit from 4096 to a higher value?  If so I will re-open bz.

Comment 21 Ben England 2017-06-13 20:52:14 UTC
I verified that the qemu user does not have privs to create enough threads for librados with 1000 OSDs.   This problem is going away with RHCS 3 with async messenger, which requires way fewer threads.

The program here:

http://perf1.perf.lab.eng.bos.redhat.com/bengland/public/openstack/
thread-create.c

tests thread creation limits, and indeed it fails because of limits on the qemu account placed by /etc/security/limits.d/20-nproc.conf, but if you raise this limit or change it with ulimit, then the problem goes away.  So I guess that's the workaround.  I had to change qemu account in /etc/passwd to allow a shell to do this.

# su - qemu
Last login: Tue Jun 13 20:43:25 UTC 2017 on pts/0
-bash-4.2$ /tmp/thread-create 4096
thread count: 4096
cat /proc/358010/limits
Limit                     Soft Limit           Hard Limit           Units  
...   
Max processes             4096                 1030485              processes 
...
x: 0
fatal: Error creating thread
errno 11: Resource temporarily unavailable

-bash-4.2$ tail /etc/security/limits.d/20-nproc.conf 
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.

*          soft    nproc     4096
root       soft    nproc     unlimited
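
The thread-create.c source is not reproduced in this report. As a rough stand-in, the same kind of probe can be sketched in Python: start idle threads until creation fails, then report how many were started. The function name is hypothetical, and the result depends on everything else the user is running, because RLIMIT_NPROC counts all of that user's processes and threads.

```python
import threading


def count_creatable_threads(target):
    """Try to start `target` idle threads; return how many actually started."""
    release = threading.Event()
    started = []
    try:
        for _ in range(target):
            worker = threading.Thread(target=release.wait)
            worker.start()  # raises RuntimeError if the thread cannot be created
            started.append(worker)
    except RuntimeError:
        pass  # limit reached; report what was achieved so far
    finally:
        release.set()  # let every idle thread exit
        for worker in started:
            worker.join()
    return len(started)


print(count_creatable_threads(16))
```

Running it as the qemu user with a target near the nproc soft limit should show a count below the target when the limit is the constraint.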

