Bug 1267883 - Unable to control the file_descriptors limit for rabbitmq-server via the director.
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware: x86_64 Linux
Priority: high
Severity: urgent
Target Milestone: y1
Target Release: 7.0 (Kilo)
Assigned To: Giulio Fidente
QA Contact: Marius Cornea
Keywords: Triaged, ZStream
Depends On: 1240587
Blocks:
Reported: 2015-10-01 05:47 EDT by Mike Burns
Modified: 2015-10-08 08:19 EDT (History)

See Also:
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-71.el7ost
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1240587
Environment:
Last Closed: 2015-10-08 08:19:55 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers:
Tracker           ID      Priority  Status  Summary  Last Updated
OpenStack gerrit  201796  None      None    None     Never

Description Mike Burns 2015-10-01 05:47:10 EDT
This is the tripleo-heat-templates (THT) part of the fix.

+++ This bug was initially created as a clone of Bug #1240587 +++

Description of problem:
Unable to control the file_descriptors limit for rabbitmq-server via the director.

Version-Release number of selected component (if applicable):
Current.

How reproducible:
Always.

Steps to Reproduce:
1. Deploy an overcloud using OSP-d.

Actual results:
rabbitmq-server is launched via the rabbitmq resource agent with the following file_descriptors limit:

# pcs resource show rabbitmq
 Resource: rabbitmq (class=ocf provider=heartbeat type=rabbitmq-cluster)
  Attributes: set_policy="ha-all ^(?!amq\.).* {"ha-mode":"all"}" 
  Operations: start interval=0s timeout=100 (rabbitmq-start-timeout-100)
              stop interval=0s timeout=90 (rabbitmq-stop-timeout-90)
              monitor interval=10 timeout=40 (rabbitmq-monitor-interval-10)

# rabbitmqctl report | grep -A3 file_descriptors
 {file_descriptors,[{total_limit,924},
                    {total_used,3},
                    {sockets_limit,829},
                    {sockets_used,1}]},
--
 {file_descriptors,[{total_limit,924},
                    {total_used,3},
                    {sockets_limit,829},
                    {sockets_used,1}]},
--
 {file_descriptors,[{total_limit,924},
                    {total_used,152},
                    {sockets_limit,829},
                    {sockets_used,150}]},

Expected results:
rabbitmq-server is launched via the rabbitmq resource agent, and its file_descriptors limit can be controlled via a parameter, as requested in BZ#1240561.

Additional info:


--- Additional comment from David Vossel on 2015-07-07 11:31:00 EDT ---

(In reply to Rafael Rosa from comment #3)
> Can the file_descriptors limit be increased manually as a work around? I'm
> trying to assess the urgency and if needs to be a blocker for GA. I would
> rather deal with it on A1 or A2, given our time constraints.

Yes. The fix is trivial:

echo "ulimit -S -n 4096" >> /etc/rabbitmq/rabbitmq-env.conf

After restarting rabbitmq, we can see that this setting is used:

rabbitmqctl status | grep -C 3 total_limit
 {vm_memory_limit,667051622},
 {disk_free_limit,50000000},
 {disk_free,11562962944},
 {file_descriptors,[{total_limit,3996},
                    {total_used,3},
                    {sockets_limit,3594},
                    {sockets_used,1}]},
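
The relationship between the ulimit value and the limits that rabbitmqctl prints can be checked with a little arithmetic. The following is a minimal sketch, not RabbitMQ's actual code: the two constants (100 reserved descriptors, and a socket share of 90% minus 2) are assumptions inferred from the numbers quoted in this report, and the function names are hypothetical. The 1024 input is likewise an assumed default soft limit that would explain the 924 seen in the description.

```python
# Assumption: RabbitMQ keeps 100 descriptors back for its own use.
# Inferred from 4096 -> 3996 in the output above, not stated in this report.
RESERVED_FOR_OTHERS = 100


def total_limit(nofile_soft_limit: int) -> int:
    """Descriptors RabbitMQ reports as total_limit for a given 'ulimit -S -n'."""
    return nofile_soft_limit - RESERVED_FOR_OTHERS


def sockets_limit(total: int) -> int:
    """Share of total_limit usable for sockets (assumed: 90% minus 2, truncated)."""
    return int(total * 0.9 - 2)


if __name__ == "__main__":
    # 1024 is an assumed default; 4096 and 7192 are values used in this report.
    for nofile in (1024, 4096, 7192):
        total = total_limit(nofile)
        print(f"ulimit={nofile} total_limit={total} sockets_limit={sockets_limit(total)}")
```

Under these assumptions the sketch reproduces every total_limit/sockets_limit pair quoted in this report, which is a quick way to confirm that a new ulimit value was actually picked up after a restart.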

--- Additional comment from Emilien Macchi on 2015-07-14 18:00:25 EDT ---

If it helps, you can also use this parameter [1] to customize the limit.

[1] https://github.com/puppetlabs/puppetlabs-rabbitmq/blob/95498e4174915dbce81c81bd42d6ec3b87df14b3/manifests/init.pp#L63

--- Additional comment from John Eckersberg on 2015-08-19 17:05:39 EDT ---

Reposting from https://bugzilla.redhat.com/show_bug.cgi?id=1255091#c3 so that it doesn't get lost:

Upstream PR: https://github.com/puppetlabs/puppetlabs-rabbitmq/pull/381

--- Additional comment from Emilien Macchi on 2015-08-19 17:27:27 EDT ---

I merged it.

--- Additional comment from James Slagle on 2015-08-27 12:18:26 EDT ---

We still need the tripleo-heat-templates update to set this parameter.

--- Additional comment from  on 2015-09-24 10:02:10 EDT ---

Failed QA. Tested on:
openstack-puppet-modules-2015.1.8-17.el7ost.noarch

Steps:
 - Changed the values from 16384 to 8192 in /etc/security/limits.d/rabbitmq-server.conf on instack
 - Deployed the overcloud (tried with and without tuskar)
 - Overcloud controller rabbitmq FD limit was still 16384

Also tried editing /usr/share/openstack-puppet/modules/rabbitmq/manifests/params.pp:
 - Deployed the overcloud (tried with and without tuskar)
 - Overcloud controller rabbitmq FD limit was still 16384

--- Additional comment from Martin Magr on 2015-09-24 10:49:21 EDT ---

According to comment #15, THT does not support this yet. If you ran the puppet manifest manually, without RHOSd, the parameter would be changed.

--- Additional comment from Marius Cornea on 2015-09-30 17:55:23 EDT ---

The tripleo-heat-templates patch attached to the bug doesn't seem to be present in the 09-25 puddle. Was it updated so that it can take the fd limit as a parameter? I'm trying to verify this, but we should be able to pass the fd limit as a parameter, not run puppet manually.

Comment 3 Marius Cornea 2015-10-02 09:29:52 EDT
stack@instack:~>>> cat templates/rabbit.yaml 
parameters:
    RabbitFDLimit: 7192

stack@instack:~>>> openstack overcloud deploy --templates ~/templates/my-overcloud -e ~/templates/my-overcloud/environments/network-isolation.yaml -e ~/templates/network-environment.yaml --control-scale 3 --compute-scale 1 --ceph-storage-scale 0 --ntp-server clock.redhat.com --libvirt-type qemu -e ~/templates/rabbit.yaml 

stack@instack:~>>> for ip in $(nova list  | grep controller |  awk '{print $12;}' | cut -d "=" -f2); do ssh heat-admin@$ip 'sudo rabbitmqctl report | grep -A3 file_descriptors';done
 {file_descriptors,[{total_limit,7092},
                    {total_used,3},
                    {sockets_limit,6380},
                    {sockets_used,1}]},
--
 {file_descriptors,[{total_limit,7092},
                    {total_used,4},
                    {sockets_limit,6380},
                    {sockets_used,2}]},
--
 {file_descriptors,[{total_limit,7092},
                    {total_used,197},
                    {sockets_limit,6380},
                    {sockets_used,195}]},
 {file_descriptors,[{total_limit,7092},
                    {total_used,3},
                    {sockets_limit,6380},
                    {sockets_used,1}]},
--
 {file_descriptors,[{total_limit,7092},
                    {total_used,197},
                    {sockets_limit,6380},
                    {sockets_used,195}]},
--
 {file_descriptors,[{total_limit,7092},
                    {total_used,4},
                    {sockets_limit,6380},
                    {sockets_used,2}]},
 {file_descriptors,[{total_limit,7092},
                    {total_used,197},
                    {sockets_limit,6380},
                    {sockets_used,195}]},
--
 {file_descriptors,[{total_limit,7092},
                    {total_used,4},
                    {sockets_limit,6380},
                    {sockets_used,2}]},
--
 {file_descriptors,[{total_limit,7092},
                    {total_used,3},
                    {sockets_limit,6380},
                    {sockets_used,1}]},
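
Checking this output by eye across several controllers is error-prone, so the verification can be scripted. The following is a minimal sketch that parses the `file_descriptors` blocks from captured `rabbitmqctl report` text and confirms that every block shows the expected total_limit. The function names and the `SAMPLE` text (an excerpt of the output above) are illustrative assumptions; capturing the output over ssh is left out.

```python
import re

# Matches one file_descriptors block as printed by rabbitmqctl in this report.
FD_BLOCK = re.compile(
    r"\{file_descriptors,\[\{total_limit,(\d+)\},\s*"
    r"\{total_used,(\d+)\},\s*"
    r"\{sockets_limit,(\d+)\},\s*"
    r"\{sockets_used,(\d+)\}\]\}"
)
KEYS = ("total_limit", "total_used", "sockets_limit", "sockets_used")


def parse_fd_blocks(report_text: str) -> list:
    """Return one dict of integer counters per file_descriptors block found."""
    return [
        dict(zip(KEYS, (int(value) for value in match.groups())))
        for match in FD_BLOCK.finditer(report_text)
    ]


def all_nodes_use_limit(report_text: str, expected_total: int) -> bool:
    """True only if at least one block exists and all report expected_total."""
    blocks = parse_fd_blocks(report_text)
    return bool(blocks) and all(b["total_limit"] == expected_total for b in blocks)


# Excerpt of the output above, used here as sample input.
SAMPLE = """ {file_descriptors,[{total_limit,7092},
                    {total_used,3},
                    {sockets_limit,6380},
                    {sockets_used,1}]},
--
 {file_descriptors,[{total_limit,7092},
                    {total_used,197},
                    {sockets_limit,6380},
                    {sockets_used,195}]},
"""

if __name__ == "__main__":
    print(parse_fd_blocks(SAMPLE))
    print(all_nodes_use_limit(SAMPLE, 7092))
```

Returning False for empty input is deliberate: a failed ssh or an empty report should not be mistaken for a successful verification.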

Comment 5 errata-xmlrpc 2015-10-08 08:19:55 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:1862
