Bug 1564138 - oslo-rootwrap-daemon performing badly in docker containers
Summary: oslo-rootwrap-daemon performing badly in docker containers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-paunch
Version: 13.0 (Queens)
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: beta
Target Release: 13.0 (Queens)
Assignee: Emilien Macchi
QA Contact: Artem Hrechanychenko
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-04-05 12:54 UTC by wes hayutin
Modified: 2018-06-27 13:51 UTC (History)

Fixed In Version: python-paunch-2.5.0-1.el7ost openstack-tripleo-heat-templates-8.0.2-0.20180410061339.b937f35.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-06-27 13:50:52 UTC
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1760471 0 None None None 2018-04-05 12:55:19 UTC
OpenStack gerrit 559462 0 None stable/queens: MERGED paunch: Add ulimit option for run action (I0cfcf4e3e3e13578ec42e12f459732992fb3a760) 2018-04-11 03:04:35 UTC
OpenStack gerrit 559631 0 None stable/queens: MERGED tripleo-heat-templates: Set ulimit for neutron agent containers (Iec722cdfd7642ff3149f50d940d8079b9e1b7147) 2018-04-11 03:04:29 UTC
Red Hat Product Errata RHEA-2018:2086 0 None None None 2018-06-27 13:51:34 UTC

Description wes hayutin 2018-04-05 12:54:46 UTC
Description of problem:

Tracking bug for https://bugs.launchpad.net/oslo.rootwrap/+bug/1760471

Comment 2 Victor Stinner 2018-04-05 13:21:58 UTC
The issue is specific to Docker containers, which are spawned with a NOFILE ulimit of 1048576. Using close_fds=False in oslo.rootwrap could weaken security, since it risks leaking a sensitive file descriptor. Backporting the Python 3 code that optimizes close_fds=True to Python 2.7 is non-trivial, so I suggest not doing that.

As I wrote in the Launchpad issue, the bug is not specific to Python: slapd and rpm are also affected by the high NOFILE ulimit value. I suggest instead adjusting the Docker configuration, at least for specific containers. Example: "sudo docker run --ulimit nofile=1024:1024 ...".
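
To illustrate why the limit matters: on Python 2.7, subprocess with close_fds=True tries to close every descriptor from 3 up to the maximum, so the work per spawned command grows with the NOFILE limit. A minimal sketch, assuming Linux; the helper name close_calls_per_spawn is hypothetical, and it only counts the close() attempts rather than measuring real latency:

```python
import resource


def close_calls_per_spawn(nofile_limit, first_fd=3):
    """Estimate close() attempts when every fd in [first_fd, nofile_limit) is closed."""
    return max(0, nofile_limit - first_fd)


# Show the limit of the current process, then compare the two values from this bug.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("current soft NOFILE limit:", soft)
for limit in (1024, 1048576):
    print(limit, "->", close_calls_per_spawn(limit), "close() attempts per spawn")
```

With nofile=1024 the loop is roughly a thousand times shorter than with 1048576, which is the motivation for capping the limit per container.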

Comment 3 Yatin Karel 2018-04-05 16:00:22 UTC
For the record, the performance hit is seen in the phase 2 job as well.

Some l3 agent operations from the logs, with the time taken:

1) http://cougar11.scl.lab.tlv.redhat.com/phase2-13_director-rhel-7.5-virthost-3cont_1comp_3ceph-ipv4-vxlan-ceph-containers/14/controller-0.tar.gz?controller-0/var/log/containers/neutron/l3-agent.log

----------------25 seconds--------------------
2018-04-02 12:36:10.818 131470 DEBUG neutron.agent.l3.agent [req-55ab23f4-0385-4da7-9b8a-0d50e64fc2ca 76e23c896d5d4bdba0ae71ab562859f2 d6d063764f424df58818a4b11bd15b3b - - -] Got routers updated notification :[u'7974c7f9-b57a-4395-b12e-e520693fbc0b'] routers_updated /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:424
2018-04-02 12:36:35.355 131470 DEBUG neutron.agent.l3.agent [-] Finished a router update for 7974c7f9-b57a-4395-b12e-e520693fbc0b _process_router_update /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:574


2) http://cougar11.scl.lab.tlv.redhat.com/phase2-13_director-rhel-7.5-virthost-3cont_1comp_3ceph-ipv4-vxlan-ceph-containers/14/controller-1.tar.gz?controller-1/var/log/containers/neutron/l3-agent.log

--------------------20 seconds-------------------------
2018-04-02 12:48:03.290 109676 DEBUG neutron.agent.l3.agent [req-09271469-1c18-4fb3-9682-486a600de182 76e23c896d5d4bdba0ae71ab562859f2 d6d063764f424df58818a4b11bd15b3b - - -] Got routers updated notification :[u'35036160-53f3-4592-822c-aab5155a6c05'] routers_updated /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:424

2018-04-02 12:48:23.809 109676 DEBUG neutron.agent.l3.agent [-] Finished a router update for 35036160-53f3-4592-822c-aab5155a6c05 _process_router_update /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:574
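
The durations quoted above can be checked from the log timestamps. A small sketch, assuming the timestamp format shown in these log lines; the helper name elapsed_seconds is hypothetical:

```python
from datetime import datetime

# Timestamp layout used by the l3-agent log lines above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start, end):
    """Return the number of seconds between two log timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


# Router update on controller-0, then on controller-1.
print(elapsed_seconds("2018-04-02 12:36:10.818", "2018-04-02 12:36:35.355"))
print(elapsed_seconds("2018-04-02 12:48:03.290", "2018-04-02 12:48:23.809"))
```

This gives about 24.5 and 20.5 seconds, matching the rounded figures above.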


Yes, as Victor said, the best option would be to change the ulimit on a per-container basis.

Comment 5 Emilien Macchi 2018-04-11 02:08:37 UTC
Both https://review.openstack.org/#/c/559631 and https://review.openstack.org/#/c/559462/ have been merged upstream and downstream for OSP13.

Comment 13 Artem Hrechanychenko 2018-04-19 17:09:00 UTC
VERIFIED

python-paunch-2.5.0-1.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-0.20180414062830.5f869f2.el7ost.noarch


[heat-admin@compute-0 ~]$ sudo docker inspect neutron_ovs_agent |grep ulimit
                "config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]

[heat-admin@controller-0 ~]$ sudo docker inspect neutron_l3_agent |grep ulimit
                "config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]

[heat-admin@controller-0 ~]$ sudo docker inspect neutron_dhcp |grep ulimit
                "config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]


[heat-admin@controller-0 ~]$ sudo docker inspect neutron_ovs_agent |grep ulimit
                "config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]
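
The same check can be scripted instead of grepping. A minimal sketch, assuming the config_data label holds a JSON object as the output above suggests; the helper name nofile_from_config_data is hypothetical, and the sample string contains only the two keys visible in the truncated output, not the full label:

```python
import json


def nofile_from_config_data(config_data):
    """Return the nofile value from a config_data JSON string, or None if unset."""
    config = json.loads(config_data)
    for entry in config.get("ulimit", []):
        name, _, value = entry.partition("=")
        if name == "nofile":
            return value
    return None


# Hypothetical, shortened sample; a real config_data label has more keys.
sample = '{"start_order": 10, "ulimit": ["nofile=1024"]}'
print(nofile_from_config_data(sample))
```

The value is kept as a string because Docker also accepts the soft:hard form, such as 1024:1024.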

Comment 16 errata-xmlrpc 2018-06-27 13:50:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:2086

