Bug 1564138
| Summary: | oslo-rootwrap-daemon performing badly in docker containers | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | wes hayutin <whayutin> |
| Component: | python-paunch | Assignee: | Emilien Macchi <emacchi> |
| Status: | CLOSED ERRATA | QA Contact: | Artem Hrechanychenko <ahrechan> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 13.0 (Queens) | CC: | abeekhof, ahrechan, apevec, bhaley, dbecker, emacchi, jcoufal, jschluet, lhh, mburns, morazi, rhel-osp-director-maint, sclewis, srevivo, ykarel |
| Target Milestone: | beta | Keywords: | Triaged |
| Target Release: | 13.0 (Queens) | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | python-paunch-2.5.0-1.el7ost openstack-tripleo-heat-templates-8.0.2-0.20180410061339.b937f35.el7ost | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-06-27 13:50:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
wes hayutin
2018-04-05 12:54:46 UTC
The issue is specific to Docker containers that are spawned with the NOFILE ulimit equal to 1048576.

Using close_fds=False in oslo.rootwrap can impact security (it introduces a risk of leaking a sensitive file descriptor). Backporting the Python 3 code that optimizes close_fds=True to Python 2.7 is non-trivial; I suggest not doing that. As I wrote in the Launchpad issue, the bug is not specific to Python: slapd and rpm are impacted as well by the high NOFILE ulimit value. I suggest instead adjusting the Docker configuration, at least for specific containers. Example: "sudo docker run --ulimit nofile=1024:1024 ...".

Just for the record: the performance hit is seen in the phase 2 job as well. Some operations logged by the l3 agent, with the time taken:

1) http://cougar11.scl.lab.tlv.redhat.com/phase2-13_director-rhel-7.5-virthost-3cont_1comp_3ceph-ipv4-vxlan-ceph-containers/14/controller-0.tar.gz?controller-0/var/log/containers/neutron/l3-agent.log
---------------- 25 seconds ----------------
2018-04-02 12:36:10.818 131470 DEBUG neutron.agent.l3.agent [req-55ab23f4-0385-4da7-9b8a-0d50e64fc2ca 76e23c896d5d4bdba0ae71ab562859f2 d6d063764f424df58818a4b11bd15b3b - - -] Got routers updated notification :[u'7974c7f9-b57a-4395-b12e-e520693fbc0b'] routers_updated /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:424
2018-04-02 12:36:35.355 131470 DEBUG neutron.agent.l3.agent [-] Finished a router update for 7974c7f9-b57a-4395-b12e-e520693fbc0b _process_router_update /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:574

2) http://cougar11.scl.lab.tlv.redhat.com/phase2-13_director-rhel-7.5-virthost-3cont_1comp_3ceph-ipv4-vxlan-ceph-containers/14/controller-1.tar.gz?controller-1/var/log/containers/neutron/l3-agent.log
---------------- 20 seconds ----------------
2018-04-02 12:48:03.290 109676 DEBUG neutron.agent.l3.agent [req-09271469-1c18-4fb3-9682-486a600de182 76e23c896d5d4bdba0ae71ab562859f2 d6d063764f424df58818a4b11bd15b3b - - -] Got routers updated notification :[u'35036160-53f3-4592-822c-aab5155a6c05'] routers_updated /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:424
2018-04-02 12:48:23.809 109676 DEBUG neutron.agent.l3.agent [-] Finished a router update for 35036160-53f3-4592-822c-aab5155a6c05 _process_router_update /usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py:574

Yes, as Victor said, the best would be to change the ulimit on a per-container basis.

Both https://review.openstack.org/#/c/559631 and https://review.openstack.org/#/c/559462/ have been merged upstream and downstream for OSP13.

VERIFIED
python-paunch-2.5.0-1.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-0.20180414062830.5f869f2.el7ost.noarch
[heat-admin@compute-0 ~]$ sudo docker inspect neutron_ovs_agent |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]
[heat-admin@controller-0 ~]$ sudo docker inspect neutron_l3_agent |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]
[heat-admin@controller-0 ~]$ sudo docker inspect neutron_dhcp |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]
[heat-admin@controller-0 ~]$ sudo docker inspect neutron_l3_agent |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]
[heat-admin@controller-0 ~]$ sudo docker inspect neutron_ovs_agent |grep ulimit
"config_data": "{\"start_order\": 10, \"ulimit\": [\"nofile=1024\"]
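The same check can be scripted rather than grepped. The following is a minimal sketch, assuming the config_data value is a JSON string whose "ulimit" key holds Docker-style "name=soft[:hard]" entries, as the output above suggests. The helper name parse_ulimits and the sample string are hypothetical; the real label value is truncated in the grep output and is not reproduced here.

```python
import json


def parse_ulimits(config_data):
    """Return {name: (soft, hard)} from a config_data JSON string.

    Hypothetical helper: assumes entries look like "nofile=1024" or
    "nofile=1024:2048", the same syntax `docker run --ulimit` accepts.
    """
    limits = {}
    for entry in json.loads(config_data).get("ulimit", []):
        name, _, value = entry.partition("=")
        soft, _, hard = value.partition(":")
        # A single value is used as both the soft and the hard limit.
        limits[name] = (int(soft), int(hard or soft))
    return limits


# Hypothetical, complete sample; the inspect output above is truncated.
sample = '{"start_order": 10, "ulimit": ["nofile=1024"]}'
print(parse_ulimits(sample))
```

A container whose result lacks a "nofile" entry would still be running with the daemon-wide default limit.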
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2018:2086
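For readers wondering why the NOFILE value matters so much: the description implies that, without the Python 3 optimization, close_fds=True costs roughly one close() attempt per possible descriptor on every spawned command. The sketch below only does that arithmetic for the two limits mentioned in this report. It assumes descriptors 0 to 2 are kept open, and the function name close_attempts is hypothetical; it is not a measurement of oslo.rootwrap itself.

```python
import resource


def close_attempts(nofile_limit, first_fd=3):
    """Estimate close() attempts when every fd from first_fd up to the limit is tried.

    Assumption: stdin, stdout and stderr (0-2) stay open, so the scan starts at 3.
    """
    return max(nofile_limit - first_fd, 0)


# The limit of the current process, for comparison (Unix only).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("current soft NOFILE limit:", soft)

# The two values from this report: the Docker default and the fixed setting.
for limit in (1048576, 1024):
    print(limit, "->", close_attempts(limit), "close() attempts per spawned command")
```

Under these assumptions the per-command work drops by about three orders of magnitude once the container is started with nofile=1024.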