+++ This bug was initially created as a clone of Bug #1021908 +++

Description of problem:
During longevity testing, we found that memory usage of the node service mcollectived increased by about 9%.

Server Environment:
On BJ OpenStack. Broker and Node OS config:
Broker: 2 CPUs / 4G memory / 10G disk
Node: 1 CPU / 8G memory / 200G disk

How reproducible:
See "OpenShift Enterprise Performance Test Plan" 10.3.1. Our script (longevity_app_create.sh) loops over one unit; the unit's logic is:
sshkey remove/add --> add cartridge with every app type --> select all apps --> remove cartridges/apps
We ran the longevity test script for about 14 days and recorded 191 cycles. The first app created was selected as the comparison point for every cycle. From the memory-consumption results we can easily tell whether a memory leak exists on the broker or node.

Steps to Reproduce:
1. Run the longevity script for about 2 weeks.

Actual results:
From the selected point we get 191 memory readings, and the node's system memory consumption clearly increases. Checking the details in the server monitor log shows a memory leak in the mcollectived service. mcollectived status at 3 selected cycles:

CYCLE   USER  PID    %CPU  %MEM  COMMAND
1st     root  27658  1.3    3.0  mcollectived
100th   root  27658  3.5    9.0  mcollectived
191st   root  27658  4.5   12.1  mcollectived

Expected results:
Memory consumption growth should not exceed 10%.

--- Additional comment from Anping Li on 2014-07-03 02:00:30 EDT ---

The memory leak in the mcollectived process still exists on OSE 2.1.

Version-Release number of selected component (if applicable):
2.1/2014-5-29.3
[root@node ~]# rpm -qa | grep mcollective
ruby193-mcollective-2.4.1-5.el6op.noarch
ruby193-mcollective-common-2.4.1-5.el6op.noarch
openshift-origin-msg-node-mcollective-1.22.2-1.el6op.noarch

Broker: KVM (2 VCPU | 4G RAM | 10G Disk)
Node: KVM (1 VCPU | 8G RAM | 200G Disk)

Test Result:
In this test, the test matrix was divided into cycles; all supported cartridges are used in sequence within each cycle.
One application using these cartridges runs the following actions as a unit:
sshkey remove --> sshkey add --> add each cartridge to an application --> access the application --> run "rhc domain show" --> delete the application
The cycle is looped, and the memory and CPU consumed by the core services are recorded after every unit is executed.

mcollective usage at the start:
USER  PID   %CPU  %MEM  VSZ      RSS      TTY  STAT  START  TIME    COMMAND
root  1156  8.8   32.3  3071504  2610812  ?    Sl    Jun18  731:49  ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg

mcollective usage after 10 days (about 10936 units/apps executed so far):
USER  PID   %CPU  %MEM  VSZ      RSS      TTY  STAT  START  TIME     COMMAND
root  1156  16.2  78.3  7396880  6313108  ?    Sl    Jun18  3492:53  ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg

--- Additional comment from Luke Meyer on 2014-07-03 10:11:11 EDT ---

Good information, thanks. Given that this problem persists, I have to assume it is present upstream as well. It is just not such an urgent problem, since it requires a higher level of activity than we are likely to see on a single node normally, and the memory is reclaimed any time mcollective is restarted. Still, it is a bug, and we want to reduce the reasons to have to restart mcollective.
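The per-cycle resource sampling described in the test procedure above (recording %MEM/RSS for mcollectived after each unit) could be sketched roughly as below. This is a minimal illustration, not the actual longevity script: the helper name is made up, and the current shell's PID ($$) stands in for the mcollectived PID (which the real run would obtain with something like `pgrep -f mcollectived`) so the snippet runs anywhere:

```shell
#!/bin/sh
# Hypothetical sketch of per-cycle memory sampling. In the real test,
# the PID would be mcollectived's; here $$ (this shell) is used so the
# snippet is self-contained.
sample_rss() {
    # Resident set size in kB for the given PID, as reported by ps
    ps -o rss= -p "$1" | tr -d ' '
}

rss_kb=$(sample_rss $$)
echo "cycle=1 pid=$$ rss_kb=${rss_kb}"
```

Appending one such line to a monitor log after every unit yields the per-cycle series that the 1st/100th/191st-cycle comparison above was drawn from.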
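The pass/fail criterion from the expected results can be expressed as a simple check over the first and last samples. Note this reads the 10% limit as relative growth over the run, which is an assumption about the test plan's intent; the %MEM values are the cycle-1 and cycle-191 figures from the table above, scaled by 10 to keep the arithmetic in integers:

```shell
#!/bin/sh
# Sketch of the "growth should not exceed 10%" check (relative-growth
# interpretation is an assumption). Values are %MEM * 10: 3.0% at
# cycle 1 and 12.1% at cycle 191, from the report above.
first=30
last=121
growth=$(( (last - first) * 100 / first ))  # relative growth, in percent
echo "relative growth: ${growth}%"
if [ "$growth" -gt 10 ]; then
    echo "FAIL: memory growth exceeds 10% limit"
fi
```

For the reported run this prints a relative growth of 303% and the FAIL line, matching the conclusion that mcollectived is leaking.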
We apologize; however, we do not plan to address this report at this time. The majority of our active development is focused on the v3 version of OpenShift. If you would like Red Hat to reconsider this decision, please reach out to your support representative. We are very sorry for any inconvenience this may cause.