Bug 1313568 - [MariaDB] Increase file descriptors for mariadb
Status: CLOSED CURRENTRELEASE
Product: Red Hat OpenStack
Classification: Red Hat
Component: rhosp-director
Version: 7.0 (Kilo)
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ga
Target Release: 8.0 (Liberty)
Assigned To: James Slagle
QA Contact: Amit Ugol
Keywords: TestOnly
Depends On:
Blocks:
Reported: 2016-03-01 18:21 EST by Joe Talerico
Modified: 2016-04-28 09:51 EDT
CC List: 8 users

Fixed In Version: instack-undercloud-2.2.7-1.el7ost
Doc Type: Bug Fix
Last Closed: 2016-04-28 09:51:54 EDT
Type: Bug


External Trackers
Tracker           ID      Priority  Status  Summary  Last Updated
OpenStack gerrit  293675  None      None    None     2016-03-16 15:38 EDT
OpenStack gerrit  294204  None      None    None     2016-03-21 08:54 EDT

Description Joe Talerico 2016-03-01 18:21:20 EST
Description of problem:
nova list fails with an HTTP 500 error (full output under Actual results below).

The nova-api.log shows:
https://gist.github.com/jtaleric/a5f9036da5fc7a65c865


Version-Release number of selected component (if applicable):


How reproducible:
100% (at scale)

Steps to Reproduce:
1. Deploy with ospd and run multiple scale operations; eventually you will hit this.

Actual results:
[stack@rackspace ~]$ nova list
ERROR (ClientException): The server has either erred or is incapable of performing the requested operation. (HTTP 500) (Request-ID: req-f0c15265-c3a4-46fa-b1a8-ba2f7a0cd232)


mariadb log
160301 17:48:09 [ERROR] Error in accept: Too many open files

[stack@rackspace ~]$ sysctl fs.file-max 
fs.file-max = 26225232
[stack@rackspace ~]$  cat /proc/$(pgrep mysqld$)/limits
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             1030571              1030571              processes 
Max open files            1024                 4096                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       1030571              1030571              signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us 
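
The "Max open files" row is the one that matters here. As a minimal sketch (not part of the original report), the same check can be scripted by parsing /proc/<pid>/limits. The function names below are made up for illustration, and the parsing assumes the columns are separated by at least two spaces, as in the output above.

```python
import re


def parse_limits(text):
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard)}.

    Values are kept as strings because they may be "unlimited".
    """
    limits = {}
    for line in text.splitlines():
        # Columns are padded with runs of spaces; names contain single spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 3 or fields[0] == "Limit":
            continue  # skip blank lines and the header row
        name, soft, hard = fields[:3]
        limits[name] = (soft, hard)
    return limits


def open_files_limits(pid):
    """Return the (soft, hard) "Max open files" limits of a running process."""
    with open("/proc/%s/limits" % pid) as fh:
        return parse_limits(fh.read())["Max open files"]
```

Calling open_files_limits() with the mysqld PID would return the soft and hard values as strings; for the process shown above that would be the 1024 and 4096 pair.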

Expected results:
MariaDB should not run out of file descriptors.
Comment 2 Michael Bayer 2016-03-02 13:15:58 EST
The 1024 soft limit is too low. mysqld_safe must be run as root so that it can raise the file handle limit on startup. When it is not run as root, we see this warning in the logs: "160216 15:13:29 [Warning] Could not increase number of max_open_files to more than 1024 (request: 4907)".
Comment 3 Michael Bayer 2016-03-02 13:26:15 EST
Alternatively, when mysqld is started by systemd, LimitNOFILE needs to be set to a number larger than what mysqld requests, in this case 4907.
Comment 5 Jaromir Coufal 2016-03-10 06:22:33 EST
This is a very simple fix for OSP8, and it unblocks our scale deployments, which are failing due to this issue. Although this is not a blocker for the release, the fix has been requested for OSP8. I am requesting the blocker flag mainly because, if we don't manage to land the fix in time, we will need doc_text describing a workaround for OSP8.
Comment 6 Michael Bayer 2016-03-10 09:11:21 EST
I would note that I think LimitNOFILE needs to be set to a number, and not the special "Infinity" value, but I have not confirmed this.  Joe T's experiments seemed to suggest this was the case, however.
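
For the systemd route from comment 3, a drop-in setting LimitNOFILE could be generated along these lines. This is only a sketch: the helper name, the 16384 value, and the drop-in path in the comment are assumptions for illustration, and the integer-only check reflects the unconfirmed suspicion above that "infinity" does not work.

```python
def render_nofile_dropin(limit):
    """Return the text of a systemd drop-in that sets LimitNOFILE."""
    # Accept only a positive integer; the "infinity" keyword is rejected on
    # purpose because it is reported (unconfirmed) not to take effect.
    if isinstance(limit, bool) or not isinstance(limit, int) or limit <= 0:
        raise ValueError("LimitNOFILE must be a positive integer, got %r" % (limit,))
    return "[Service]\nLimitNOFILE=%d\n" % limit


# Hypothetical destination for the rendered text (unit name is an assumption):
#   /etc/systemd/system/mariadb.service.d/limits.conf
dropin_text = render_nofile_dropin(16384)
```

After writing such a file, systemd would need a daemon-reload and the service a restart before /proc/<pid>/limits reflects the new value.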
Comment 7 James Slagle 2016-03-15 14:57:07 EDT
since we're not using systemd, i think we'd need to set this in /etc/security/limits.conf per:
https://ma.ttias.be/increase-open-files-limit-in-mariadb-on-centos-7-with-systemd/

can we get confirmation on the exact value that we should be using?
Comment 8 James Slagle 2016-03-15 14:58:08 EDT
(In reply to James Slagle from comment #7)
> since we're not using systemd, i think we'd need to set this in
> /etc/security/limits.conf per:
> https://ma.ttias.be/increase-open-files-limit-in-mariadb-on-centos-7-with-
> systemd/

sorry wrong link. should be:
https://mariadb.com/kb/en/mariadb/server-system-variables/#open_files_limit

> 
> can we get confirmation on the exact value that we should be using?
Comment 9 Michael Bayer 2016-03-15 15:07:23 EDT
if you're not using systemd, then mysqld_safe should be run as root.  That will allow it to set the file limits that are optimal for its configuration, before it spawns off the mysqld service itself which it will run as the mysql user.
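
To illustrate why root matters: an unprivileged process can raise its soft RLIMIT_NOFILE only up to the hard limit, and raising the hard limit itself requires privilege. A minimal sketch using Python's standard resource module (the helper name is made up, and this is not how mysqld_safe itself is implemented):

```python
import resource


def raise_nofile_soft_limit(target):
    """Raise the soft open-files limit toward target, capped at the hard limit.

    Returns the soft limit in effect afterwards. Going beyond the hard limit
    would need root, so this sketch never attempts it.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed hard
    if soft != resource.RLIM_INFINITY and soft < target:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

With the 1024/4096 limits from the description, a non-root process calling raise_nofile_soft_limit(16384) would top out at 4096, which is why the wrapper has to start as root to get anything higher.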
Comment 10 James Slagle 2016-03-15 17:42:38 EDT
ok, i think we can call this one fixed then. using latest osp-d 8 packages, I see:

[root@overcloud-controller-0 ~]# ps aux | grep mysqld
root     14065  0.0  0.0 112644   924 pts/0    R+   21:39   0:00 grep --color=auto mysqld
root     14607  0.0  0.0  11768  1624 ?        S    21:33   0:00 /bin/sh /usr/bin/mysqld_safe --defaults-file=/etc/my.cnf --pid-file=/var/run/mysql/mysqld.pid --socket=/var/lib/mysql/mysql.sock --datadir=/var/lib/mysql --log-error=/var/log/mysqld.log --user=mysql --open-files-limit=16384 --wsrep-cluster-address=gcomm://overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
mysql    15529  2.2  4.6 1506672 181804 ?      Sl   21:33   0:08 /usr/libexec/mysqld --defaults-file=/etc/my.cnf --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --wsrep-provider=/usr/lib64/galera/libgalera_smm.so --wsrep-cluster-address=gcomm://overcloud-controller-0,overcloud-controller-1,overcloud-controller-2 --log-error=/var/log/mysqld.log --open-files-limit=16384 --pid-file=/var/run/mysql/mysqld.pid --socket=/var/lib/mysql/mysql.sock --port=3306 --wsrep_start_position=00000000-0000-0000-0000-000000000000:-1


mysqld_safe started as the root user and we actually pass --open-files-limit=16384

and the limit is set to that value:

[root@overcloud-controller-0 ~]# cat /proc/14607/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        unlimited            unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             15011                15011                processes 
Max open files            16384                16384                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       15011                15011                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        


I'm not entirely sure where the 16384 value is coming from. It could be either the galera resource agent or puppet module, but it seems like a reasonable default.
Comment 11 Michael Bayer 2016-03-15 17:51:38 EDT
OK take a look at /proc/15529/limits, because that's the actual mysqld process. mysqld_safe is just a wrapper script.
Comment 12 James Slagle 2016-03-16 11:29:32 EDT
i've actually rebooted the environment, so the pids are different:

[root@overcloud-controller-0 ~]# ps aux | grep  mysql
root     14608  0.0  0.0  11768  1624 ?        S    14:33   0:00 /bin/sh /usr/bin/mysqld_safe --defaults-file=/etc/my.cnf --pid-file=/var/run/mysql/mysqld.pid --socket=/var/lib/mysql/mysql.sock --datadir=/var/lib/mysql --log-error=/var/log/mysqld.log --user=mysql --open-files-limit=16384 --wsrep-cluster-address=gcomm://overcloud-controller-0,overcloud-controller-1,overcloud-controller-2
mysql    15517  1.2  1.7 1720380 143364 ?      Sl   14:33   0:41 /usr/libexec/mysqld --defaults-file=/etc/my.cnf --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib64/mysql/plugin --user=mysql --wsrep-provider=/usr/lib64/galera/libgalera_smm.so --wsrep-cluster-address=gcomm://overcloud-controller-0,overcloud-controller-1,overcloud-controller-2 --log-error=/var/log/mysqld.log --open-files-limit=16384 --pid-file=/var/run/mysql/mysqld.pid --socket=/var/lib/mysql/mysql.sock --port=3306 --wsrep_start_position=82c57d9f-eaf5-11e5-8ea4-365f2dfd153b:60857
root     17570  0.0  0.0 112648   924 pts/0    R+   15:28   0:00 grep --color=auto mysql
[root@overcloud-controller-0 ~]# cat /proc/14608/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        unlimited            unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             31139                31139                processes 
Max open files            16384                16384                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       31139                31139                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        
[root@overcloud-controller-0 ~]# cat /proc/15517/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             31139                31139                processes 
Max open files            20485                20485                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       31139                31139                signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us        


looks like the limit is 20485 for the actual mysql process.
Comment 13 James Slagle 2016-03-16 12:30:15 EDT
mike, does the last comment make it look like this is resolved?
Comment 14 Michael Bayer 2016-03-16 15:29:19 EDT
yes, you can see that mysqld_safe allowed mysqld to set its open files limit to a value custom-tailored to its own configuration.
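
A rough sketch of where 20485 and the earlier "request: 4907" could come from. This is my understanding of how the server sizes its descriptor request at startup, not something confirmed in this bug: it takes the largest of a per-table estimate, five descriptors per connection, and the configured open_files_limit. The settings used below (max_connections=4096, extra_max_connections=1, table cache of 400) are assumptions, chosen only because they reproduce the numbers seen in this thread.

```python
def estimate_open_files(max_connections, extra_max_connections,
                        table_cache, open_files_limit):
    """Return (wanted_files, max_open_files) under the assumed sizing rule."""
    connections = max_connections + extra_max_connections
    # Assumed rule: 10 spare descriptors, one per connection, two per cached table.
    wanted_files = 10 + connections + 2 * table_cache
    # Assumed rule: try for at least five descriptors per connection.
    per_connection = 5 * connections
    max_open_files = max(wanted_files, per_connection, open_files_limit)
    return wanted_files, max_open_files


# Assumed settings, not taken from the bug report.
wanted, effective = estimate_open_files(4096, 1, 400, 16384)
```

Under those assumptions the first value matches the 4907 in the warning from comment 2 and the second matches the 20485 in /proc/15517/limits, which would explain why mysqld ends up above the 16384 passed on the command line.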
Comment 15 James Slagle 2016-03-16 15:38:10 EDT
alright, so the limit is already 16384 for the overcloud, but this bug is actually for the undercloud (sorry I missed that).

I submitted a patch to address that: https://review.openstack.org/293675
