Bug 1397418
Summary: | neutron-keepalived-state-change leaves behind big processes | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Attila Fazekas <afazekas> |
Component: | openstack-neutron | Assignee: | Daniel Alvarez Sanchez <dalvarez> |
Status: | CLOSED ERRATA | QA Contact: | GenadiC <gcheresh> |
Severity: | unspecified | Docs Contact: | |
Priority: | high | ||
Version: | 10.0 (Newton) | CC: | amuller, bperkins, chrisw, dalvarez, jschluet, mlopes, nyechiel, srevivo |
Target Milestone: | z2 | Keywords: | Triaged, ZStream |
Target Release: | 10.0 (Newton) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | openstack-neutron-9.1.1-6.el7ost | Doc Type: | Bug Fix |
Doc Text: |
Previously, every time neutron-keepalived-state-change was killed, the IP monitor process it spawned remained in an orphaned state. This resulted in leaked memory over time and required manual actions from administrators.
With this update, the process is terminated gracefully and its child IP monitor process is terminated as well, which avoids this memory leak.
|
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2017-02-23 16:34:02 UTC | Type: | Bug |
Description
Attila Fazekas
2016-11-22 13:40:21 UTC
Hi, I have done some tests and processes are indeed leaked. In my particular case I observed these new processes after running the following two tests:

    neutron.tests.functional.agent.l3.test_ha_router.L3HATestCase.test_ha_router_lifecycle
    neutron.tests.functional.agent.l3.test_ha_router.LinuxBridgeL3HATestCase.test_ha_router_lifecycle

    246a246,250
    > root 21276 1 0 11:51 ? 00:00:00 ip -o monitor address
    > root 21680 1 0 11:52 ? 00:00:00 ip -o monitor address
    > root 21825 1 0 11:52 ? 00:00:00 sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
    > root 21826 21825 1 11:52 ? 00:00:00 /usr/bin/python /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

    [centos@devstack bug1397418]$ ps -o rss,sz,vsz 21276
      RSS    SZ    VSZ
      788  1666   6664
    [centos@devstack bug1397418]$ ps -o rss,sz,vsz 21680
      RSS    SZ    VSZ
      788  1666   6664
    [centos@devstack bug1397418]$ ps -o rss,sz,vsz 21825
      RSS    SZ    VSZ
     2796 48593 194372
    [centos@devstack bug1397418]$ ps -o rss,sz,vsz 21826
      RSS    SZ    VSZ
    20936 76311 305244

Regarding the 'ip -o monitor' process: that is because the keepalived-state-change process spawns it, and when an HA router is deleted, keepalived-state-change is stopped with a SIGKILL, leaving 'ip -o monitor' orphaned.

@Attila: does this match what you have observed?

@Assaf: In my opinion, the patch to fix this could capture the kill signal within keepalived_state_change and get rid of the child processes. Also, I need to investigate further why the rootwrap-daemon processes are being leaked.

Sent a patch to upstream gerrit: https://review.openstack.org/#/c/411968/
In my setup, at least, no 'ip -o monitor' processes are orphaned anymore.
Daniel

Looks like the patches have landed on stable/newton.

The build with the fix has been released in openstack-neutron-9.1.1-6.el7ost.

Steps to test:
1. 3 controller default setup
2. install tempest
3. run any tempest scenario test which creates, uses, and deletes a router, for example: ostestr -r 'minimum'

To verify, I ran the test_minimum_basic.TestMinimumBasicScenario.test_minimum_basic_scenario test and made sure that ps aux | grep "monitor address" did not return anything, so the process no longer existed.

Verified in openstack-neutron-ml2-9.2.0-2.el7ost.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2017-0314.html
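
The orphaning discussed in this report comes down to a parent process dying without cleaning up the child it spawned. The following is a minimal sketch of the general pattern (a parent that terminates its child when it receives SIGTERM), not the actual neutron patch. The class name MonitorDaemon and its methods are hypothetical and chosen only for illustration. Note that SIGKILL cannot be caught by a process, so a handler like this only helps if the parent is stopped with a catchable signal such as SIGTERM.

```python
import signal
import subprocess
import sys


class MonitorDaemon:
    """Hypothetical stand-in for a daemon that spawns a long-running child.

    In the scenario from this report the child would be 'ip -o monitor address';
    here the command is passed in so the sketch stays self-contained.
    """

    def __init__(self, child_cmd):
        # Start the child process that must not outlive this daemon.
        self.child = subprocess.Popen(child_cmd)

    def install_handler(self):
        # Register the cleanup handler for SIGTERM.
        # SIGKILL cannot be handled, so it is deliberately not listed here.
        signal.signal(signal.SIGTERM, self.handle_sigterm)

    def handle_sigterm(self, signum, frame):
        # Stop the child first, then exit the parent.
        self.stop_child()
        sys.exit(0)

    def stop_child(self, timeout=5):
        # Nothing to do if the child has already exited.
        if self.child.poll() is not None:
            return
        # Ask the child to terminate, and escalate if it does not comply.
        self.child.terminate()
        try:
            self.child.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            self.child.kill()
            self.child.wait()
```

In a real daemon, install_handler() would be called once at startup, and the process would then be stopped with a SIGTERM (for example kill -TERM <pid>) rather than a SIGKILL, so that the handler gets a chance to run.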