Description of problem: Tracking bug for https://bugs.launchpad.net/oslo.rootwrap/+bug/1760471
The issue is specific to Docker containers that are spawned with a NOFILE ulimit of 1048576. Using close_fds=False in oslo.rootwrap can impact security (it introduces a risk of leaking a sensitive file descriptor). Backporting the Python 3 code that optimizes close_fds=True to Python 2.7 is non-trivial, so I suggest not doing that. As I wrote in the Launchpad issue, the bug is not specific to Python: slapd and rpm are also affected by the high NOFILE ulimit value. I suggest instead adjusting the Docker configuration, at least for specific containers. Example: "sudo docker run --ulimit nofile=1024:1024 ...".
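A rough way to see why the NOFILE value matters: as far as I understand it, Python 2.7's subprocess with close_fds=True closes every possible descriptor from 3 up to the limit, so the cost grows with the ulimit rather than with the number of files actually open. The sketch below is only an illustration, not the oslo.rootwrap code. The helper names nofile_limits and time_closerange are made up for this example, and on recent kernels and Python versions os.closerange may be much faster than what Python 2.7 did, so the timings are indicative at best.

```python
import os
import resource
import subprocess
import sys


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE values of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def time_closerange(max_fd):
    """Time os.closerange(3, max_fd) in a throwaway child process.

    The child is used so that no descriptor of the calling process is closed.
    This imitates the descriptor sweep done for close_fds=True; it is an
    approximation, not the exact Python 2.7 code path.
    """
    code = (
        "import os, time\n"
        "start = time.time()\n"
        "os.closerange(3, %d)\n"
        "print(time.time() - start)\n" % max_fd
    )
    out = subprocess.check_output([sys.executable, "-c", code])
    return float(out.decode().strip())


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("RLIMIT_NOFILE soft=%s hard=%s" % (soft, hard))
    # Compare the limit suggested above (1024) with the Docker value (1048576).
    for max_fd in (1024, 1048576):
        print("closerange(3, %d): %.4f s" % (max_fd, time_closerange(max_fd)))
```

Running this inside and outside an affected container should show whether the container really has the high limit. The sweep is paid once per spawned command, which is why agents that call rootwrap often are hit hardest.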
Just for the record: the performance hit is also seen in the phase 2 job. Some operations logged by the l3 agent, with the time taken:

1) http://cougar11.scl.lab.tlv.redhat.com/phase2-13_director-rhel-7.5-virthost-3cont_1comp_3ceph-ipv4-vxlan-ceph-containers/14/controller-0.tar.gz?controller-0/var/log/containers/neutron/l3-agent.log

----------------25 seconds--------------------
2018-04-02 12:36:10.818 131470 DEBUG neutron.agent.l3.agent [req-55ab23f4-0385-4da7-9b8a-0d50e64fc2ca 76e23c896d5d4bdba0ae71ab562859f2 d6d063764f424df58818a4b11bd15b3b - - -] Got routers updated notification :[u'7974c7f9-b57a-4395-b12e-e520693fbc0b'] routers_updated /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:424
2018-04-02 12:36:35.355 131470 DEBUG neutron.agent.l3.agent [-] Finished a router update for 7974c7f9-b57a-4395-b12e-e520693fbc0b _process_router_update /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:574

2) http://cougar11.scl.lab.tlv.redhat.com/phase2-13_director-rhel-7.5-virthost-3cont_1comp_3ceph-ipv4-vxlan-ceph-containers/14/controller-1.tar.gz?controller-1/var/log/containers/neutron/l3-agent.log

--------------------20 seconds-------------------------
2018-04-02 12:48:03.290 109676 DEBUG neutron.agent.l3.agent [req-09271469-1c18-4fb3-9682-486a600de182 76e23c896d5d4bdba0ae71ab562859f2 d6d063764f424df58818a4b11bd15b3b - - -] Got routers updated notification :[u'35036160-53f3-4592-822c-aab5155a6c05'] routers_updated /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:424
2018-04-02 12:48:23.809 109676 DEBUG neutron.agent.l3.agent [-] Finished a router update for 35036160-53f3-4592-822c-aab5155a6c05 _process_router_update /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:574

Yes, as Victor said, the best approach would be to change the ulimit on a per-container basis.
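The durations quoted for the two router updates can be checked directly from the log timestamps. A small sketch, assuming the timestamp format seen in the l3-agent log lines (the helper name elapsed_seconds is just for illustration):

```python
from datetime import datetime

# Timestamp layout used in the l3-agent log lines quoted in this comment.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    t_start = datetime.strptime(start, LOG_TIME_FORMAT)
    t_end = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t_end - t_start).total_seconds()


if __name__ == "__main__":
    # controller-0: "Got routers updated notification" to "Finished a router update"
    print(elapsed_seconds("2018-04-02 12:36:10.818", "2018-04-02 12:36:35.355"))
    # controller-1: same pair of events
    print(elapsed_seconds("2018-04-02 12:48:03.290", "2018-04-02 12:48:23.809"))
```

This gives roughly 24.5 and 20.5 seconds, in line with the rounded figures in the log excerpts.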
Both https://review.openstack.org/#/c/559631 and https://review.openstack.org/#/c/559462/ have been merged upstream and downstream for OSP13.
VERIFIED

python-paunch-2.5.0-1.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-0.20180414062830.5f869f2.el7ost.noarch

[heat-admin@compute-0 ~]$ sudo docker inspect neutron_ovs_agent |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]

[heat-admin@controller-0 ~]$ sudo docker inspect neutron_l3_agent |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]

[heat-admin@controller-0 ~]$ sudo docker inspect neutron_dhcp |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]

[heat-admin@controller-0 ~]$ sudo docker inspect neutron_l3_agent |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]

[heat-admin@controller-0 ~]$ sudo docker inspect neutron_ovs_agent |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2086