| Summary: | [performance] mcollectived memory leak exists on node | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | nsun <nsun> | |
| Component: | Containers | Assignee: | Brenton Leanhardt <bleanhar> | |
| Status: | CLOSED EOL | QA Contact: | libra bugs <libra-bugs> | |
| Severity: | high | Docs Contact: | ||
| Priority: | low | |||
| Version: | 2.2.0 | CC: | anli, gpei, jeder, jialiu, libra-onpremise-devel, lmeyer, rthrashe, xtian | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1116034 (view as bug list) | Environment: | ||
| Last Closed: | 2017-01-13 22:35:56 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Bug Depends On: | 1116034 | |||
| Bug Blocks: | ||||
Description
nsun
2013-10-22 10:12:39 UTC
I would really be interested in seeing the results of this test against 2.0. :)

QE will run a round of performance testing after the ose-2.0 code freeze (~ Nov 7th). Let us wait to see the results of the 2.0 performance testing.

The memory leak in the mcollectived process still exists on ose-2.0.

Version-Release number of selected component (if applicable): 2.0/2013-11-22.1

```
[root@node ~]# rpm -qa | grep mcollective
ruby193-mcollective-2.2.3-4.el6op.noarch
ruby193-mcollective-common-2.2.3-4.el6op.noarch
openshift-origin-msg-node-mcollective-1.17.2-2.el6op.noarch
```

Broker: KVM (2 VCPU | 4G RAM | 10G Disk)
Node: KVM (1 VCPU | 8G RAM | 200G Disk)

We have a script that keeps creating and deleting all kinds of cartridges. After one week of running, the mcollectived process had taken 3.4% more memory.

mcollective usage at the start:

```
USER PID  %CPU %MEM VSZ    RSS    TTY STAT START TIME  COMMAND
root 5775 2.5  2.8  713884 232504 ?   Sl   Nov26 35:36 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
```

mcollective usage after one week:

```
USER PID  %CPU %MEM VSZ     RSS    TTY STAT START TIME    COMMAND
root 5775 10.2 6.2  2943768 501500 ?   Sl   Nov26 1302:55 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
```

The memory leak in the mcollectived process still exists on ose-2.1.

Version-Release number of selected component (if applicable): 2.1/2014-5-29.3

```
[root@node ~]# rpm -qa | grep mcollective
ruby193-mcollective-2.4.1-5.el6op.noarch
ruby193-mcollective-common-2.4.1-5.el6op.noarch
openshift-origin-msg-node-mcollective-1.22.2-1.el6op.noarch
```

Broker: KVM (2 VCPU | 4G RAM | 10G Disk)
Node: KVM (1 VCPU | 8G RAM | 200G Disk)

Test result: In this test, the test matrix was divided into test cycles; all supported cartridges are used in sequence within each cycle. One application using these cartridges runs the following actions as a unit:

sshkey remove --> sshkey add --> add each cartridge to an application --> access this application --> run "rhc domain show" --> delete this application

The cycle is looped, and the memory and CPU consumed by the core services are recorded after every unit is executed.

mcollective usage at the start:

```
USER PID  %CPU %MEM VSZ     RSS     TTY STAT START TIME   COMMAND
root 1156 8.8  32.3 3071504 2610812 ?   Sl   Jun18 731:49 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
```

mcollective usage after 10 days, by which point about 10,936 units (apps) had been executed:

```
USER PID  %CPU %MEM VSZ     RSS     TTY STAT START TIME    COMMAND
root 1156 16.2 78.3 7396880 6313108 ?   Sl   Jun18 3492:53 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
```

Good information, thanks. Given that this problem persists, I would have to assume it is present upstream as well. It is just not such an urgent problem, since it requires a higher level of activity than we are likely to see on a single node normally, and the memory is reclaimed any time mcollective is restarted. Still, it is a bug, and we want to reduce the reasons to have to restart mcollective.

OpenShift Enterprise v2 has officially reached EOL. This product is no longer supported and bugs will be closed. Please look into the replacement enterprise-grade container option, OpenShift Container Platform v3. https://www.openshift.com/container-platform/ More information can be found here: https://access.redhat.com/support/policy/updates/openshift/
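For reference, the per-unit action sequence the QE test describes could be driven by a small script along these lines. This is a hypothetical reconstruction, not the actual QE script: the app name, cartridge name, SSH key name, and URL are placeholders, and exact `rhc` flags vary between rhc versions.

```python
"""Sketch of one QE test unit as a list of CLI invocations (dry run).

All names below (app, cartridge, key) are illustrative placeholders.
"""


def unit_commands(app, cartridge, key="perfkey"):
    """Return the command sequence for one test unit as argv lists."""
    return [
        ["rhc", "sshkey", "remove", key],                       # sshkey remove
        ["rhc", "sshkey", "add", key, "~/.ssh/id_rsa.pub"],     # sshkey add
        ["rhc", "app", "create", app, "php-5.3"],               # create the app
        ["rhc", "cartridge", "add", cartridge, "-a", app],      # add a cartridge
        ["curl", "-s", "http://%s-perf.example.com/" % app],    # access the app
        ["rhc", "domain", "show"],                              # rhc domain show
        ["rhc", "app", "delete", app, "--confirm"],             # delete the app
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for argv in unit_commands("testapp", "mysql-5.1"):
        print(" ".join(argv))
```

In the actual test, this unit would be looped for each supported cartridge while recording `%MEM`/RSS of the core services after every iteration.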
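The RSS growth reported above can be quantified by parsing two `ps aux`-style samples of the same PID. A minimal sketch, using the ose-2.1 figures from this report and assuming standard `ps aux` column order (RSS in KiB is the sixth field):

```python
def rss_kib(ps_line):
    """Extract the RSS column (KiB) from a `ps aux`-style line."""
    fields = ps_line.split()
    return int(fields[5])  # USER PID %CPU %MEM VSZ RSS ...


# Samples taken from the bug report (command line truncated for brevity).
before = ("root 1156 8.8 32.3 3071504 2610812 ? Sl Jun18 731:49 "
          "ruby /opt/rh/ruby193/root/usr/sbin/mcollectived")
after = ("root 1156 16.2 78.3 7396880 6313108 ? Sl Jun18 3492:53 "
         "ruby /opt/rh/ruby193/root/usr/sbin/mcollectived")

growth = rss_kib(after) - rss_kib(before)
print("RSS grew by %d KiB (~%.1f GiB)" % (growth, growth / 1048576.0))
# → RSS grew by 3702296 KiB (~3.5 GiB)
```

Sampling this periodically (e.g. via cron) and plotting the trend is one way to confirm whether a fix actually flattens the growth curve.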